IS YOUR CROWDFUNDING CAMPAIGN GOING TO SUCCEED?


If you have lived in Kenya as long as I have, the term 'Harambee' should, at the very least, ring a bell. 'Harambee' is a Swahili term that literally means 'all pull together'. If you need help, be it money-wise, labour-wise… it doesn't matter, all you have to do 'ni kuita Harambee' (is to call a Harambee) and your community's got you!! Harambee is such a revered activity in Kenya that it's our country's official motto! (I bet you didn't know that) *To be clear, I do realise pulling together is not exclusively Kenyan*

So, with Harambee's prevalence, someone must have decided to bring it into the digital world. Yes, they did. So, what do you get when you cross our spirit of Harambee with technology? You build massive platforms and call it crowdfunding! Kickstarter, Indiegogo, Crowd Supply, GoFundMe etc. There's a tonne of these crowdfunding platforms on the web. If you have an idea that you'd like to execute but do not have the capital to do so, you have a simple formula to follow:

     i) Sign up on any of the platforms. The HowTos depend on which platform you choose. Remember: These platforms usually come at a cost to use.

     ii) Prepare an attractive 'pitch' for your idea. This usually includes a video explaining what the product is, graphics etc. Just sell your product.

     iii) State the amount of money you'd like to raise (your goal) and the deadline for raising it.

     iv) Users of said platform, if attracted to your idea, decide to back you. Basically, 'wanakuchangia' (they chip in for you)

      v) Depending on the platform, you may need to provide incentives for backing e.g. the product, equity etc. More details here.

     vi) On most of the platforms, it's 'all or nothing'. You either reach (or surpass) your goal, or you don't get any money AT ALL.

It goes without saying that it'd be quite neat if you met your goal. So, what factors determine whether or not you reach your goal? A yuuuuuge number of articles on how to run a successful crowdfunding campaign have been written, but most of them take the format of 'How to manipulate people psychologically to like your product and throw money at it' rather than taking an objective look at what has worked before and drawing insights from it.

Those who do not know history's mistakes are doomed to repeat them – George Santayana

How do you take an objective look? You collect the data about any and all crowdfunding campaigns that have happened in the past, analyse it and draw insights. For fun, I'll predict whether or not 24 new campaigns (as of June 30th 2018) will be successful. An update to this post shall follow, to check how accurate the predictions are. An accuracy of >=80% shall grant me permission to change my name to Kelvin 'The Prophet' Gakuo.

Updates (Aug 28th 2018)

I have checked back on the projects. The actual outcome is appended to each prediction below (1 = successful, 0 = unsuccessful):

Project Name (link): prediction / actual outcome

Gafas de Sol Ecológicas | Carpris Sunglasses (https://kck.st/2tOXFNn): predicted 0, outcome 0
Project for soyamax join to JAPAN EXPO 2019 in Paris (https://kck.st/2Kuxkyn): predicted 0, outcome 1
The Witch's Cupboard: Potions & Ingredients Hard Enamel Pins (https://kck.st/2Mys2Q0): predicted 1, outcome 1
Ultra Entertainment : A Business (https://kck.st/2KkQiby): predicted 0, outcome 0
Karama Yemen Human Rights Film Festival (https://kck.st/2Ky8wlM): predicted 0, outcome 0
The Witches Compendium: A guide to all things Wicca (https://kck.st/2KCoVWn): predicted 0, outcome 0
Quickstarter: CELIE & COUCH (https://kck.st/2NdDn9l): predicted 0, outcome 0
Tiger Friday (https://kck.st/2KzOUBg): predicted 0, outcome 1
Vainas De Vainilla - Vanilla Bean (https://kck.st/2tT2dSG): predicted 0, outcome 0
"Simone de Beauvoir" Libro Ilustrado/Picture Book (https://kck.st/2yTg4yl): predicted 0, outcome 0
Artist Lost/Heiress Denied - A True Story (https://kck.st/2yWJPhG): predicted 0, outcome 0
Stolen Weekend's debut EP (https://kck.st/2MDbHJx): predicted 0, outcome 0
Maidens of the dragon, 28mm quality pewter miniatures. (https://kck.st/2NbJGtW): predicted 0, outcome 1
Little Tumble Romp Indoor Sensory Center (https://kck.st/2KAtaBO): predicted 0, outcome 0
Watercolour Style - Nebula Washi Stickers (https://kck.st/2KAZogy): predicted 0, outcome 0
HOWPACKED.COM New nightlife monitoring website version 1 (https://kck.st/2yWJMT2): predicted 0, outcome 0
Boob Planter Pins (https://kck.st/2lNmup7): predicted 0, outcome 1
Operation: Boom - Issue 3 (https://kck.st/2Mvu576): predicted 0, outcome 1
Rapture VR (https://kck.st/2N793gt): predicted 0, outcome 0
Goos'd - Bring the party and Get Goos'd! (https://kck.st/2KyVt3P): predicted 0, outcome 0
Smuggles n' Snuggles (https://kck.st/2Kxb4Rb): predicted 1, outcome 1
Makeshift (https://kck.st/2tTddj0): predicted 0, outcome 0
Creative Community Project (https://kck.st/2Ku6LcG): predicted 0, outcome 0
Opening a fabric store in Downtown Kent! (https://kck.st/2yVEC9P): predicted 0, outcome 0

Of the 24 projects, 5 were predicted incorrectly. So, that gives us an accuracy of 79.17%. So close!!!!

Join me, and let’s find out how insightful this data can get…

Note: All the technical details (the code, techniques) are explained in depth here

The data

To answer our question, data was extracted from two of the most popular crowdfunding platforms: kickstarter.com and indiegogo.com. The data comprised a couple of thousand campaigns on each platform, from June 2017 to June 2018, each adequately described by a good number of features

Kickstarter vs Indiegogo

To understand how the crowdfunding world behaves, it's worth comparing the two most popular platforms on some metrics

  • Most populated categories (have the most number of campaigns)

[Charts: Indiegogo top categories | Kickstarter top categories]

* Indiegogo: Home, Travel & Outdoors, Phones & Accessories

* Kickstarter: Product Design, Tabletop Games, Art

  • Most successful categories
[Charts: Indiegogo most successful categories | Kickstarter most successful categories]

* Indiegogo: Photography, Comics, Dance & Theater

* Kickstarter: Product Design, Tabletop Games, Accessories

  • Most popular words used in campaign titles for the top categories

[Word clouds: Indiegogo Photography | Kickstarter Tabletop Games; Indiegogo Comics | Kickstarter Product Design; Indiegogo Dance | Kickstarter Accessories]

Looks like every day there's a new Ultimate product seeking funding on Kickstarter. And people seem to loooove their enamel pins!

  • Successful vs failed campaigns

[Chart: successful vs failed campaigns]

You can clearly see the choice of platform doesn't matter. The chances of succeeding on either are dismal.

Kickstarter users and locations

Kickstarter provides a comprehensive definition of the campaigners, i.e. users who have started a campaign, and where the campaign is located. I decided to dig in

  • Most active users (By number of campaigns)

[Chart: most active users]

For Freedoms is seriously on the grind!

  • Most successful users (By most number of successful campaigns)

[Chart: most successful users]

For Freedoms' grind seems to be paying off. However, notice that only 3.846% of their attempts were successful

  • Location of campaign

psst… You can click on the countries

How are there no projects hosted from Kenya?!?! I thought we went digital and stuff.

  • The most successful project

The most successful project, with a goal of $35 and a whopping $5429 in funding, is by Joe Magic Games. Read about the campaign here

So, what determines success?

Excluding the most obvious indicators of success, such as the number of backers (people willing to fund your project) and the difference between the goal and the pledged amount, the other indicators were put through the XGBoost algorithm, and their importance computed and plotted.

[Chart: feature importance]

Clearly, the category of your campaign, how long the campaign runs for, and the text describing what the campaign is about, the creator and the name matter the most.

Shakespeare, there definitely is something in a name!!

Surprisingly, the country of origin has very little impact!! Take your shot, you never know.

Let's do prophecy!!!

I decided to collect 24 of the newest projects as of 30th June and predict whether they'll be successful (1) or unsuccessful (0). The results were as follows:

Project Name (link): prediction

Gafas de Sol Ecológicas | Carpris Sunglasses (https://kck.st/2tOXFNn): predicted 0
Project for soyamax join to JAPAN EXPO 2019 in Paris (https://kck.st/2Kuxkyn): predicted 0
The Witch's Cupboard: Potions & Ingredients Hard Enamel Pins (https://kck.st/2Mys2Q0): predicted 1
Ultra Entertainment : A Business (https://kck.st/2KkQiby): predicted 0
Karama Yemen Human Rights Film Festival (https://kck.st/2Ky8wlM): predicted 0
The Witches Compendium: A guide to all things Wicca (https://kck.st/2KCoVWn): predicted 0
Quickstarter: CELIE & COUCH (https://kck.st/2NdDn9l): predicted 0
Tiger Friday (https://kck.st/2KzOUBg): predicted 0
Vainas De Vainilla - Vanilla Bean (https://kck.st/2tT2dSG): predicted 0
"Simone de Beauvoir" Libro Ilustrado/Picture Book (https://kck.st/2yTg4yl): predicted 0
Artist Lost/Heiress Denied - A True Story (https://kck.st/2yWJPhG): predicted 0
Stolen Weekend's debut EP (https://kck.st/2MDbHJx): predicted 0
Maidens of the dragon, 28mm quality pewter miniatures. (https://kck.st/2NbJGtW): predicted 0
Little Tumble Romp Indoor Sensory Center (https://kck.st/2KAtaBO): predicted 0
Watercolour Style - Nebula Washi Stickers (https://kck.st/2KAZogy): predicted 0
HOWPACKED.COM New nightlife monitoring website version 1 (https://kck.st/2yWJMT2): predicted 0
Boob Planter Pins (https://kck.st/2lNmup7): predicted 0
Operation: Boom - Issue 3 (https://kck.st/2Mvu576): predicted 0
Rapture VR (https://kck.st/2N793gt): predicted 0
Goos'd - Bring the party and Get Goos'd! (https://kck.st/2KyVt3P): predicted 0
Smuggles n' Snuggles (https://kck.st/2Kxb4Rb): predicted 1
Makeshift (https://kck.st/2tTddj0): predicted 0
Creative Community Project (https://kck.st/2Ku6LcG): predicted 0
Opening a fabric store in Downtown Kent! (https://kck.st/2yVEC9P): predicted 0

Pretty bleak, ey? See you on September 1st 2018 as I check how accurate the predictions are.

Note: The following section provides an in-depth explanation of the process, tools and techniques applied to answer today’s question. Feel free to skip to the comment section, here

THE TUTORIAL

In this section, I provide a full tutorial on how to recreate exactly what I did.

Prerequisites:

    • Python >2.7
    • The code repository off GitHub, here
    • A background in web development
    • How AJAX works (start here) and workings of infinite scrolling (here)
    • Basics of web crawling using Scrapy (start here)
    • Fundamentals of machine learning (start here)
    • Machine learning in Python (start here)

1. DATA COLLECTION AND CLEAN-UP

    A. SETUP

- For this project, Scrapy was the primary data collection tool. To set up:

pip install scrapy

- To start a Scrapy project and move into it:

scrapy startproject ScraperName
cd ScraperName

    B. INDIEGOGO

Refer to indiegogoWrangler.py

- Due to how indiegogo.com is built, it was not possible to perform efficient crawling (as you'll witness on kickstarter). However, raw datasets exist on the internet, and one was obtained as a CSV file at goo.gl/6pPCJb

- Our code file takes the CSV dump, extracts our desired features and dumps them to a JSON file, using Pandas

    • READ CSV FILE AND EXTRACT RELEVANT COLS
      import pandas as pd
      dump = pd.read_csv('data/'+inputFile)
      
      #A few of the relevant features
      ids = dump['project_id'].tolist()
      names = dump['title'].tolist()
      


    • WRITE TO JSON
      import json
      obj = {}
      obj['projects'] = []
      
      for j in range(len(ids)):
      	item = {} #Empty project object
      	#A few relevant attrs
      	item['projectId'] = ids[j]
      	item['projectName'] = names[j]
      	
      	#Append populated project object to objects list
      	obj['projects'].append(item)
      
      #Write objects to JSON file
      with open('data/'+outputFile, 'a') as dp:
      	json.dump(obj, dp)
      
      

    C. KICKSTARTER

Refer to KickstarterWrangler/

- There are two ways to scrape a website:

  • Screen scraping: Visit the URL, and extract information from the displayed HTML
  • Crawling: Figure out how data is moved around i.e. how and where data is fetched and displayed as HTML, then write a spider to mimic this behaviour. (Websites that use infinite scrolling are just begging to be crawled!!)

- In a nutshell, here's how most websites implement infinite scrolling:

  • Initiate an AJAX call based on a click or scroll trigger
  • The AJAX call passes parameters (usually page number) to the backend
  • The backend returns a JSON file containing the data based on said parameters
  • The JSON file is parsed appropriately and the contents displayed as HTML

Word to the wise: Whenever you see a website utilises infinite scrolling, $100 says a scrolling event is triggering an AJAX call that returns a JSON file with the data.
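That bet can be sketched end-to-end. In the snippet below, `fetch_page` is a hypothetical stand-in for the AJAX endpoint (not Kickstarter's real API); it returns the same JSON shape so the client loop can be shown in isolation:

```python
import json

def fetch_page(page):
    """Hypothetical mock of an infinite-scroll backend: returns a JSON
    string with the page's items and a 'has_more' flag."""
    total_pages = 3
    payload = {
        'projects': [{'id': page * 10 + i} for i in range(2)],
        'has_more': page < total_pages,
    }
    return json.dumps(payload)

# The client keeps requesting the next page until 'has_more' is false
items, page = [], 1
while True:
    data = json.loads(fetch_page(page))
    items.extend(data['projects'])
    if not data['has_more']:
        break
    page += 1

print(len(items))  # 6: two items from each of the three pages
```

Swap the mock for a real HTTP request and that loop is the whole crawler.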

- kickstarter.com utilises the same logic, which I used to my advantage as follows:

  • Trigger an AJAX call passing the appropriate parameters

- I did some digging and found a couple of things:

    • The AJAX call is passed using a URL in the form https://www.kickstarter.com/discover/advanced?google_chrome_workaround&woe_id=0&sort=magic&seed={a 7-digit seed}&page={an integer}

Refer to KickstarterWrangler/KickstarterWrangler/spiders/kickstarteritems.py

import json
import scrapy

class GetKickstarterItems(scrapy.Spider):
	name = 'kickstarteritems'
	baseURL = "https://www.kickstarter.com/discover/advanced?google_chrome_workaround&woe_id=0&sort=magic&seed=%07d&page=%05d" % (2547000, 1) #Note: a literal like 00001 is invalid in Python 3
	start_urls = [baseURL]

	def parse(self, response): #Scrapy passes 'response', an object containing the data returned by the server
		data = json.loads(response.body)

    • For the AJAX call to work properly, request headers need to be set. Under KickstarterWrangler/KickstarterWrangler/settings.py, set:
DEFAULT_REQUEST_HEADERS = {	
	'Accept': 'application/json, text/javascript, */*; q=0.01',
	'Accept-Encoding': 'gzip, deflate, br',
	'Accept-Language': 'en-US,en;q=0.5',
	'Connection': 'keep-alive',
	'DNT': '1',
	'Host': 'www.kickstarter.com',
	'X-Requested-With': 'XMLHttpRequest'
}
  • Parse the returned JSON file to get the relevant data points

for project in data.get('projects', []): #Parse each project JSON object
	item = dict()
	#A few relevant attrs
	#Project details
	item['projectId'] = project.get('id')
	item['projectName'] = project.get('name')
	#Creator details
	item['creatorId'] = project.get('creator', {}).get('id')

	yield item #Scrapy command to return the populated object (note: must sit inside the loop to yield every project)
  • The logic above will only return data from the first page, i.e. only the first 12 items. To overcome this and get as many pages as possible:
    • Each JSON file returned has an attribute 'has_more'; if it's set to 'true', there are more retrievable JSONs. Therefore, increment the page number by one, then parse the new response
import time

currentPage = response.url #The current URL being crawled
pageNumber = int(currentPage[-5:]) #Extract current page number as the last 5 chars of the URL
currentSeed = int(currentPage[-18:-11]) #The 7-char seed sits just before '&page='

if (data['has_more']):
	nxt = pageNumber + 1
	seed = currentSeed
	getNxtJSON = "https://www.kickstarter.com/discover/advanced?google_chrome_workaround&woe_id=0&sort=newest&seed=%07d&page=%05d"%(seed, nxt)
	time.sleep(5) #Sleep for five seconds then go to the next page
	yield scrapy.Request(url=getNxtJSON, callback=self.parse)
  • The code above will only return 200 pages. Why? Kickstarter has set a restriction whereby a single seed value can only return 200 pages. To circumvent this:
    • For a starting seed value e.g. 2547000, extract all 200 pages (2400 items)
    • At page number 200, increment the seed by a random int, reset page number to 1 then crawl for the data
    • When the seed value gets to a predefined value e.g. 2548000, it's time to stop!!!
    • This method will produce A LOT of redundant data that will be cleaned later
      import random
      from scrapy.exceptions import CloseSpider

      if(pageNumber==200):
      	nxt = 1
      	var = random.randint(1, 100)
      	seed = currentSeed + var #Increment by a random value
      	if(seed <= 2548000):
      		getNxtJSON = "https://www.kickstarter.com/discover/advanced?google_chrome_workaround&woe_id=0&sort=newest&seed=%07d&page=%05d"%(seed, nxt)
      		time.sleep(600) #Sleep for 10 minutes before using the next seed from page 1
      		yield scrapy.Request(url=getNxtJSON, callback=self.parse)
      	else: #Enough seeds, stop!!
      		raise CloseSpider('ENOUGH DATA. TIME FOR CLEANING!!!') #Close the spider
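The string slicing used in these snippets ([-5:] for the page number, [-18:-11] for the seed) is easy to get wrong, so here's a quick sanity check on a sample URL:

```python
# Build a URL in the spider's format, then recover the seed and page from it
url = ("https://www.kickstarter.com/discover/advanced"
       "?google_chrome_workaround&woe_id=0&sort=newest"
       "&seed=%07d&page=%05d" % (2547000, 1))

pageNumber = int(url[-5:])   # last 5 chars: the zero-padded page number
seed = int(url[-18:-11])     # the 7-char seed sits just before '&page=00001'

print(pageNumber, seed)  # 1 2547000
```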
      
      

The spider's output was:

[Screenshot: spider log]

LABELLING

A campaign where the ratio of the pledged amount to the goal is >= 1.0 is considered successful. This can easily be computed on a dataframe as follows:

import pandas as pd
import numpy as np

df['success'] = df['pledged'] / df['goal']

The data is then labelled:

df['label'] = np.where(df['success'] >= 1, 1, 0) #1: Successful, 0: Unsuccessful

2. PREPROCESSING

Refer to preprocessor.py

ML models only work with numerical data. We therefore need to preprocess any categorical data into usable features. For our data, the following needed to be done:

  • Compute length of campaign

The length is computed as the time between the deadline and the launch, duh!

df['projectLength'] = df['deadline'] - df['launch']

  • Assign countries and currencies to unique IDs

This is done using Pandas assign method and cat.codes. Read more here

df = df.assign(countryID=(df['country']).astype('category').cat.codes)
df = df.assign(currencyID=(df['currency']).astype('category').cat.codes)

#Two new columns, 'countryID' and 'currencyID', are created

  • Remove stop words from blurb and name

Commonly used English words that don't add a lot of meaning need to be removed.

import nltk
from nltk.corpus import stopwords
nltk.download('stopwords') #One-time download of the stop word list
stopWords = stopwords.words('english')
df['processedName'] = df['projectName'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stopWords)]))
df['processedBlurb'] = df['blurb'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stopWords)]))

  • Engineer some new features

The blurb and name lengths were computed to have more features per sample:

df['nameLen'] = df['processedName'].apply(lambda x: len(x))
df['blurbLen'] = df['processedBlurb'].apply(lambda y: len(y))

* The Pandas 'apply' method takes a function that is run on every item in a row or column. Read more here

  • Remove unwanted columns

After converting the categorical features into numerical ones, you want to remove the unwanted features (columns):

df.drop(['state','projectId', 'categoryName', 'creatorName', 'success'], axis=1, inplace=True)
#Add as many columns as you want to the list
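Pulled together on a toy dataframe, the preprocessing steps above look as follows (the column names follow the article, but the epoch timestamps are made up; note that the raw deadline minus launch difference comes out in seconds, hence the divide by 86400 to get days):

```python
import pandas as pd

df = pd.DataFrame({
    'launch':   [1500000000, 1500086400],  # hypothetical Unix epochs (seconds)
    'deadline': [1502592000, 1502678400],
    'country':  ['US', 'GB'],
    'currency': ['USD', 'GBP'],
})

# Campaign length in days (the epoch difference alone is in seconds)
df['projectLength'] = (df['deadline'] - df['launch']) / 86400

# Categorical -> integer IDs via category codes
df = df.assign(countryID=df['country'].astype('category').cat.codes)
df = df.assign(currencyID=df['currency'].astype('category').cat.codes)

print(df[['projectLength', 'countryID', 'currencyID']])
```

The codes are assigned per sorted category, so they're stable for a given dataset but will shift if new categories appear.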

3. EXPLORATORY DATA ANALYSIS

  • KICKSTARTER EDA

Refer to eda.py

    • USER ACTIVITY

- Create a new column with the count of each creator

tf['count'] = tf.groupby('creatorName')['creatorName'].transform('size')

- Uniquefy the dataframe by creator

tf.drop_duplicates('creatorName', keep='first', inplace=True)

- Sort the dataframe according to the count

sf = tf.sort_values('count', ascending=False)

- Plot the top three entries

lbls = sf[0:3]['creatorName'].tolist()
y = sf[0:3]['count'].tolist()	
x = np.arange(len(y))
plt.bar(x, y, color=['Green', 'Purple', 'Black'])
plt.xticks(x, lbls)
plt.xlabel('USER', fontsize=10)
plt.ylabel('CAMPAIGNS', fontsize=10)
plt.title('MOST ACTIVE USERS', fontsize=20)
sns.set()
plt.show()
    • USER SUCCESS

- Compute success levels of each user

kf['success'] = kf['pledged'] / kf['goal']

- Extract only the successful rows i.e. success >=1.0

sf = kf[kf['success'] >= 1].copy() #.copy() avoids pandas' SettingWithCopyWarning when adding the 'count' column below

- Count number of times a creator appears

sf['count'] = sf.groupby('creatorName')['creatorName'].transform('size')

- Uniquefy the dataframe by creator

sf.drop_duplicates('creatorName', keep='first', inplace=True)

- Sort according to count

gf = sf.sort_values('count', ascending=False)

- Plot the top three entries

lbls = gf[0:3]['creatorName'].tolist()
y = gf[0:3]['count'].tolist()	
x = np.arange(len(y))
#plot the barplot here:
    • CAMPAIGN BY LOCATION

- Count the number of times a country appears

fd['count'] = fd.groupby('country')['country'].transform('size')

- Uniquefy the dataframe by country

fd.drop_duplicates('country', keep='first', inplace=True)

- Sort according to count

fd = fd.sort_values('count', ascending=False)

- Write to CSV file

fd.to_csv('data/countries.csv')

- Import the CSV file in a Tableau viz

- Add 'country' to rows and 'count' to the size mark

- Select the world heat map viz

- Publish to your Tableau server

- Share to extract the embed code

  • INDIEGOGO VS KICKSTARTER

Refer to indieVSkick.py

The EDA in this section was done using raw Python i.e. splitting a DataFrame into lists and manually doing comparisons 'n shit. This is a very tedious, inefficient process that doesn't utilise the power of Pandas as a data analytics library.

I therefore will not go through how things were executed here, but advise that you do your own comparisons using Pandas only, as I have done in 'Kickstarter EDA' above
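As a starting point, here's a hedged sketch of what such a Pandas-only comparison could look like. The 'platform', 'category' and 'label' columns are my assumptions about how you'd stack the two datasets into one frame:

```python
import pandas as pd

# Toy combined frame: one row per campaign, label 1 = successful, 0 = not
df = pd.DataFrame({
    'platform': ['kickstarter', 'kickstarter', 'indiegogo', 'indiegogo', 'indiegogo'],
    'category': ['Games', 'Art', 'Home', 'Home', 'Comics'],
    'label':    [1, 0, 0, 1, 0],
})

# Success rate per platform in a single groupby, no manual list-splitting
rates = df.groupby('platform')['label'].mean()
print(rates)

# Most populated category per platform
top = df.groupby('platform')['category'].agg(lambda s: s.mode().iloc[0])
print(top)
```

Two groupbys replace what would otherwise be a pile of list comprehensions and manual counting.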

4. MODELLING

Refer to model.py

Couple of things first:

    i) Since I needed to compute the importance of features in determining success, some form of decision trees needed to be used. I chose Extreme Gradient Boosted Trees (XGBoost)

   ii) The data is very biased towards unsuccessful campaigns. Warning: the model may overfit

  • TRAINING
    • VECTORISING THE TEXT

First things first. We need to convert the blurbs and project names into usable numeric data. A TF-IDF vectoriser is perfect for the job!

from sklearn.feature_extraction.text import TfidfVectorizer
vec = TfidfVectorizer(ngram_range=(1,1), analyzer='word', max_df=1.0, binary=False, sublinear_tf=False)
df['nameToVec'] = list(vec.fit_transform(df['processedName']).toarray())
df['blurbToVec'] = list(vec.fit_transform(df['processedBlurb']).toarray())

You will notice that XGBoost only works with scalars (ints, floats etc.) but the vectoriser produces whole vectors. The decision was made to use the sum of each vector as the feature instead.

df['projectName'] = df['nameToVec'].apply(lambda x: x.sum())
df['blurb'] = df['blurbToVec'].apply(lambda y: y.sum())
df.drop(['processedName', 'processedBlurb', 'nameToVec', 'blurbToVec'], axis=1, inplace=True)
#'projectName' and 'blurb' will now contain the sum of the vector form of each name and blurb
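On a toy corpus, the collapse from TF-IDF vector to a single scalar looks like this (the three blurbs are made up):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ['solar powered lamp', 'enamel pin set', 'solar pin']  # toy blurbs
vec = TfidfVectorizer(ngram_range=(1, 1), analyzer='word')
matrix = vec.fit_transform(docs).toarray()  # shape: (3 docs, vocabulary size)

# One scalar per document: the sum of its TF-IDF weights
sums = matrix.sum(axis=1)
print(sums.shape)  # (3,)
```

Summing obviously discards which words carried the weight; it's a crude scalarisation, but it leaves XGBoost with one numeric column per text field.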
    • THE MODEL

The features were chosen to be all other columns except 'goal', 'pledged' and the number of backers, since these are veeeery obvious indicators of success. Obviously, the label is not a feature

df.drop(['pledged', 'goal', 'backers'], axis=1, inplace=True) #Exclude the obvious markers of success
x = df[df.columns[df.columns != 'label']]

The target was the 'label' feature

y = df['label']

Split the data

from sklearn.model_selection import train_test_split
XTrain, XTest, yTrain, yTest = train_test_split(x, y, test_size=0.33, random_state=42)

Initialise and fit the classifier

from xgboost import XGBClassifier
sgd = XGBClassifier()
sgd.fit(XTrain, yTrain)

Note that XGBoost doesn't come prepackaged with SKLearn hence you'll need to:

pip install xgboost
    • METRICS

Compute the cross-validation accuracy of the model:



from sklearn.model_selection import cross_val_score
from sklearn.metrics import classification_report, confusion_matrix

print('Accuracy for 10 folds: {}'.format(cross_val_score(sgd, XTrain, yTrain, cv=10)))

Output:

Accuracy for 10 folds: [0.7755102  0.77868852 0.77459016 0.76229508 0.7704918  0.76131687 0.781893   0.7654321  0.77366255 0.77777778]


Plot feature importance

from xgboost import plot_importance
plot_importance(sgd)
plt.show()
  • TESTING

On the test set, predict

y_pred = sgd.predict(XTest)

Compute the confusion matrix

print('Confusion Matrix:\n {}'.format(confusion_matrix(yTest, y_pred)))

Output:
Confusion Matrix:
 [[909  15]
 [256  21]]

Compute the classification Report

print('Report:\n {}'.format(classification_report(yTest, y_pred)))

Output:
Report:
              precision    recall  f1-score   support

Unsuccessful       0.78      0.98      0.87       924
  Successful       0.58      0.08      0.13       277

 avg / total       0.73      0.77      0.70      1201
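As a sanity check, the 'Successful' row of the report follows directly from the confusion matrix above (rows are the actual class, columns the predicted class):

```python
# From the confusion matrix: [[909, 15], [256, 21]]
tn, fp = 909, 15   # actual unsuccessful: correctly rejected vs false alarms
fn, tp = 256, 21   # actual successful: missed vs correctly flagged

precision = tp / (tp + fp)   # 21 / 36
recall = tp / (tp + fn)      # 21 / 277
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.58 0.08 0.13
```

The abysmal recall on the successful class is the class-imbalance warning from earlier coming home to roost: the model mostly just predicts 'unsuccessful'.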


  • PREDICTING

On a bunch of new, unlabelled data, predict the labels and write them to file:

#Label unseen data
preds = sgd.predict(dfUnlabeled)
dfUnlabeled['label'] = preds
dfUnlabeled.to_json('data/'+outFile)





THE END
