Analyzing Turkish entrepreneurship-focused YouTube channels
This blog post is an analysis of Turkish entrepreneurship-focused YouTube channels.
To scrape the data, I wrote a Python script to collect all the video links of each channel (there are about 15). Then I found, via Google, a Google Apps Script built on the YouTube API and customized it for my needs. I ran the Apps Script to get the like count, view count, and title of each video. After that, I wrote another Python script that reads the data from Excel (xlsx), scrapes the date of each video with Selenium, and writes the dates back to Excel using xlswriter.
Then I cleaned the data by replacing Turkish characters, filling NA values, and converting all dates to the same format. For some reason Excel did not change the date format, so I found this video on converting the date text with text to columns.
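The cleaning steps above can be sketched in plain Python. This is only an illustration, not the script I actually ran: the function names, the character mapping, and the list of input date formats are assumptions made for the example.

```python
from datetime import datetime

# Map Turkish-specific letters to their closest ASCII equivalents.
# (Assumed mapping for this sketch.)
TURKISH_TO_ASCII = str.maketrans("çğıöşüÇĞİÖŞÜ", "cgiosuCGIOSU")


def clean_title(title):
    """Replace Turkish characters; a missing (NA) title becomes an empty string."""
    if title is None:
        return ""
    return title.translate(TURKISH_TO_ASCII).strip()


def normalize_date(text, formats=("%d.%m.%Y", "%Y-%m-%d", "%d/%m/%Y")):
    """Try each assumed input format and return an ISO date string, or None."""
    for fmt in formats:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    return None


print(clean_title("Kolay Değil"))
print(normalize_date("05.03.2019"))
```

Returning None for a date that matches no format makes the leftover rows easy to spot and fix by hand.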
Finally, the data was ready to visualize in a Jupyter notebook. This blog is best read on a PC.
Here is the visualization:
Channels:
Number of videos per channel
Webrazzi has the most videos by far among all channels. There are 6,659 videos in total, and Webrazzi accounts for most of them.
Average views per channel
Average likes per channel
Although Kolay Değil ranks second in average views per channel, it ranks first in average likes per channel.
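As a rough sketch of how the per-channel averages can be computed and ranked, here is a small standard-library example. The rows and the field names ("channel", "views", "likes") are made up for illustration and are not the real data; they only show that the ranking by views and the ranking by likes can differ.

```python
from collections import defaultdict


def average_per_channel(rows, field):
    """Return {channel: mean of `field`} over all rows of that channel."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["channel"]] += row[field]
        counts[row["channel"]] += 1
    return {channel: totals[channel] / counts[channel] for channel in totals}


# Hypothetical sample rows, one per video.
rows = [
    {"channel": "A", "views": 100, "likes": 10},
    {"channel": "A", "views": 300, "likes": 20},
    {"channel": "B", "views": 150, "likes": 30},
]

avg_views = average_per_channel(rows, "views")
avg_likes = average_per_channel(rows, "likes")

# Channels sorted from highest to lowest average.
ranking_by_views = sorted(avg_views, key=avg_views.get, reverse=True)
ranking_by_likes = sorted(avg_likes, key=avg_likes.get, reverse=True)
print(ranking_by_views, ranking_by_likes)
```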
Most used 10 words in titles
Startup, Istanbul, Webrazzi, 2015, Day, Demo, Dijital, 2019, 2018, 2017
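A word count like this can be produced with collections.Counter. The snippet below is a minimal sketch with made-up titles; the function name top_words is hypothetical, and it splits on whitespace and keeps the original casing, which may differ from what my notebook did.

```python
from collections import Counter


def top_words(titles, n=10):
    """Count whitespace-separated words across all titles, return the n most common."""
    counts = Counter()
    for title in titles:
        counts.update(title.split())
    return counts.most_common(n)


# Hypothetical sample titles.
sample_titles = [
    "Startup Demo Day",
    "Startup Istanbul 2019",
    "Startup Weekend",
]
print(top_words(sample_titles, n=3))
```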
Wordcloud of the titles
First, I removed stop words using the nltk.corpus library with Turkish as the language. Then I realized there were still some words and characters that needed to be removed, such as “ile” and “?”, so I removed them manually. After cleaning the titles, I created another column called clean title.
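The title cleaning described above looks roughly like the sketch below. To keep the example self-contained, it uses a tiny hand-written stop word set as a stand-in for nltk's Turkish list (which can be loaded with stopwords.words("turkish") after downloading the corpus). The set contents and the function name are assumptions for illustration.

```python
import string

# Stand-in for nltk's Turkish stop word list (assumed, far from complete).
BASE_STOP_WORDS = {"ve", "bir", "bu", "için", "de", "da"}
# Words added by hand after inspecting the first wordcloud.
MANUAL_EXTRAS = {"ile"}
STOP_WORDS = BASE_STOP_WORDS | MANUAL_EXTRAS

# ASCII punctuation plus the curly quotes that show up in titles.
PUNCTUATION = string.punctuation + "“”’"


def remove_stop_words(title, stop_words=STOP_WORDS):
    """Strip punctuation from each word and drop stop words."""
    kept = []
    for word in title.split():
        word = word.strip(PUNCTUATION)
        # Skip tokens that were only punctuation (e.g. "?") and stop words.
        if word and word.lower() not in stop_words:
            kept.append(word)
    return " ".join(kept)


print(remove_stop_words("Girişim ile yatırım ?"))
```

The result of this function is what would go into the clean title column.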
Wordcloud with custom font and background
I think it looks better with the Coolvetica font and a dark theme. Looks cool! (Also, Andrew recommends always implementing dark mode.)
User mask wordcloud
Egirişim wordcloud
Wordcloud of Egirişim’s most used words in the title
Kolay Değil wordcloud
Wordcloud of Kolay Değil’s most used words in the title
Bir ihracat hikayesi wordcloud
Wordcloud of Bir Ihracat Hikayesi’s most used words in the title
Thank you for reading. If you have any feedback, I would like to hear it. Please contact me via LinkedIn.
Extra resources
Medium blog about YouTube data scraping, Towards Data Science, YouTube API GitHub, Stack Overflow (getting all the video links of a YouTube channel), YouTube analyzer GitHub