The dust has settled, somewhat, on the whole #WhoIsKing debate featuring Rabbit King Kaka, Juliani, and Khaligraph. I am not sure how it ended, but I was curious to find out how it should have ended. Data never lies, so I started by collecting some. The idea was to measure whose songs were most popular based on how often people viewed them on YouTube over a period of about a month. This data can be gleaned readily from YouTube.
But first, I needed a list to start with. I did not know how many songs these artists had released, but I wanted to track all of their songs that were available on YouTube. Luckily, I found a really comprehensive list at kenyans.co.ke. I ended up with a list of over 5800 YouTube videos from many more artists than just the three. I tracked these videos, recording how many views, likes, and dislikes each one had twice a day for a little over a month. I ended up with 600,000+ rows of data looking like this.
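The counters YouTube reports are cumulative, so "how often people viewed a song" has to be derived by differencing snapshots. A minimal sketch of that step in Python, assuming each row holds a video id, a timestamp, and a cumulative view count. The row layout, the sample values, and the function name daily_view_gains are assumptions for illustration, not the actual pipeline.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical snapshot rows: (video_id, timestamp, cumulative_views).
# The ids, dates, and counts are made up purely for illustration.
snapshots = [
    ("vid_a", "2016-05-01 08:00", 1000),
    ("vid_a", "2016-05-01 20:00", 1040),
    ("vid_a", "2016-05-02 08:00", 1100),
    ("vid_a", "2016-05-02 20:00", 1130),
    ("vid_b", "2016-05-01 08:00", 500),
    ("vid_b", "2016-05-02 08:00", 520),
]


def daily_view_gains(rows):
    """Return {(video_id, date): views gained since the previous day's last snapshot}."""
    # Keep the last cumulative count seen for each video on each day.
    last_per_day = defaultdict(dict)
    for video_id, stamp, views in sorted(rows, key=lambda r: (r[0], r[1])):
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M").date()
        last_per_day[video_id][day] = views

    gains = {}
    for video_id, per_day in last_per_day.items():
        days = sorted(per_day)
        # The first day has no earlier reference point, so it is skipped.
        for prev, cur in zip(days, days[1:]):
            gains[(video_id, cur.isoformat())] = per_day[cur] - per_day[prev]
    return gains


gains = daily_view_gains(snapshots)
for key, value in sorted(gains.items()):
    print(key, value)
```

Taking the last snapshot of each day keeps the calculation stable even when one of the two daily collections is missing.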
The next task was to get all songs by each artist, whether they were collaborations or not. This was easy where there was only one artist. Where there was a collaboration, the task was quite tricky. Essentially, I had to process the artist string into a list of distinct artists. For instance, “Kanjii Aaron Rimbui” should be processed to “Kanjii” and “Aaron Rimbui”. (You can find the algorithm and the data on my GitHub.) The final data looked something like this.
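As a rough illustration of that splitting step (not the actual algorithm, which is on GitHub), one possible approach is a greedy longest match against a list of known artist names. The function name split_artists and the known_artists list are assumptions made for this sketch.

```python
def split_artists(credit, known_artists):
    """Split a credit string into distinct artist names via greedy longest match."""
    words = credit.split()
    known = {name.lower(): name for name in known_artists}
    max_len = max((len(name.split()) for name in known_artists), default=1)

    found = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so "Aaron Rimbui" beats a shorter match.
        for size in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + size]).lower()
            if candidate in known:
                name = known[candidate]
                if name not in found:  # keep the list distinct
                    found.append(name)
                i += size
                break
        else:
            # Word is not part of any known name (e.g. "ft"): skip it.
            i += 1
    return found


# Hypothetical lookup list; a real one would come from the scraped song list.
known_artists = ["Kanjii", "Aaron Rimbui", "King Kaka", "Juliani"]
print(split_artists("Kanjii Aaron Rimbui", known_artists))
```

The weakness of this approach is that it only recognises names already in the lookup list, so the list has to be built from the single-artist entries first.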
So who is King?
I calculated a popularity score for each video for each day. I wrote about how the score is calculated here.
Based on the score, I fitted a linear model, controlling for the age of the videos and for collaborations. The chart below shows the results of #whoisking.
It looks like, over time, King Kaka's videos perform relatively better than the rest.
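For readers who want to see the shape of such a model, here is a minimal sketch in plain Python. The exact specification is not spelled out in this post, so the sketch assumes score = intercept + artist effect + age effect + collaboration effect, fitted by ordinary least squares. The artist labels, the coefficients, and the scores are synthetic: the scores are generated from known coefficients only so that the fit can be checked. In practice a statistics package would do this in one line.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic videos: (artist, age_in_days, is_collaboration). Not real data.
videos = [
    ("artist_a", 100, 0), ("artist_a", 400, 1), ("artist_a", 250, 0),
    ("artist_b", 50, 1), ("artist_b", 300, 0), ("artist_b", 120, 1),
    ("artist_c", 200, 0), ("artist_c", 30, 1), ("artist_c", 500, 0),
]

# Columns: intercept, artist_a dummy, artist_b dummy, age, collaboration.
# artist_c is the baseline, so its effect is absorbed by the intercept.
X = [[1.0, float(a == "artist_a"), float(a == "artist_b"), float(age), float(col)]
     for a, age, col in videos]

# Made-up "true" coefficients used only to generate noise-free scores.
TRUE_COEFS = [2.0, 0.5, 0.2, -0.002, 0.3]
y = [sum(c * v for c, v in zip(TRUE_COEFS, row)) for row in X]

coefs = fit_ols(X, y)
names = ["intercept", "artist_a", "artist_b", "age_days", "is_collab"]
for name, value in zip(names, coefs):
    print(f"{name:>10}: {value:.4f}")
```

The artist coefficients are the quantity of interest: each one is that artist's score relative to the baseline artist once age and collaboration status are held fixed.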
But just to be sure that it's not the age of the videos that brings about this effect, I looked at the average age of the videos from the three artists. The boxplot shows that King Kaka has slightly older videos compared to the other two (see the black line).
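The black line in a boxplot is the median, and the box spans the first and third quartiles. A small sketch of how those numbers could be computed per artist, using made-up ages and placeholder artist labels (the function name age_summary is also just for illustration):

```python
from collections import defaultdict
from statistics import median, quantiles

# Made-up (artist, video_age_in_days) pairs, for illustration only.
ages = [
    ("artist_a", 100), ("artist_a", 300), ("artist_a", 900),
    ("artist_a", 1200), ("artist_a", 1500),
    ("artist_b", 50), ("artist_b", 200), ("artist_b", 400),
]


def age_summary(rows):
    """Return {artist: {"q1", "median", "q3"}} for video ages."""
    by_artist = defaultdict(list)
    for artist, age in rows:
        by_artist[artist].append(age)

    summary = {}
    for artist, values in by_artist.items():
        # quantiles() needs at least two values per artist and uses the
        # "exclusive" method by default, which may differ slightly from
        # the quartile rule used by a plotting library.
        q1, _, q3 = quantiles(values, n=4)
        summary[artist] = {"q1": q1, "median": median(values), "q3": q3}
    return summary


summary = age_summary(ages)
for artist, stats in summary.items():
    print(artist, stats)
```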
Well, now you know #whoisking
You can download the full analysis and the script from my GitHub account.