Making sense of my YouTube Watch History

For a long time, I have heard the phrase 'You are what you eat'. A variation I have heard less often is 'You are what you read'. And although I haven't heard anyone say it, I have at least pondered, 'Am I what I see and read online?' The question mark is there because I wonder about freedom of choice, or at least the illusion of it.

Brushing those heavy topics aside, I wanted to at least be conscious of how I spend my time online. I don't use popular social media sites. I do use content aggregators like Reddit and YouTube. Reddit is easy to quantify and make sense of because of how it is structured. Also, Reddit does its own version of Spotify Wrapped each year. That leaves me with just YouTube.

The Hard Part - Getting the Data

YouTube does have a stats page that shows how much time you spent watching in a particular week. You can also check your watch history, but it is not in machine-readable form. There isn't even a straightforward way to find out which channels I watch most often.

After some googling, I decided to use Google Takeout. This is a way to download all the data Google has on you. Here, I chose to download just my YouTube watch history, got a link and voilà, I had my entire watch history. Neat! Now all I had to do was to fire up my IDE and analyze the data! Or that's what I thought until I opened the file.

{
  "header": "YouTube",
  "title": "Watched https://www.youtube.com/watch?v=0BzGlfm1wFo",
  "titleUrl": "https://www.youtube.com/watch?v=0BzGlfm1wFo",
  "time": "2022-04-10T18:13:36.349Z",
  "products": ["YouTube"],
  "activityControls": ["YouTube watch history"]
}

The above is an example of the data that was available. I have the URL of each video watched, but no data about the channel or any of the video's tags.
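As a rough sketch of working with records shaped like the one above, the Python below pulls the video ID out of `titleUrl` and keeps only entries from a recent window. The function names, the 90-day window, and the `watch-history.json` file name in the comment are assumptions for illustration, not details from my actual scripts.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def video_id(title_url):
    """Return the `v` query parameter of a watch URL, or None if absent."""
    query = parse_qs(urlparse(title_url).query)
    return query.get("v", [None])[0]


def recent_video_ids(records, now, days=90):
    """Collect video IDs for records watched within `days` before `now`."""
    cutoff = now - timedelta(days=days)
    ids = []
    for record in records:
        url = record.get("titleUrl")
        if not url:
            continue  # assumption: some entries may have no URL at all
        # Older Python versions do not parse a trailing "Z", so swap it out.
        watched = datetime.fromisoformat(record["time"].replace("Z", "+00:00"))
        vid = video_id(url)
        if vid and watched >= cutoff:
            ids.append(vid)
    return ids


# In practice the records would come from the Takeout export, e.g.
# records = json.load(open("watch-history.json"))  # assumed file name
sample = json.loads("""
[
  {
    "header": "YouTube",
    "title": "Watched https://www.youtube.com/watch?v=0BzGlfm1wFo",
    "titleUrl": "https://www.youtube.com/watch?v=0BzGlfm1wFo",
    "time": "2022-04-10T18:13:36.349Z",
    "products": ["YouTube"],
    "activityControls": ["YouTube watch history"]
  }
]
""")

ids = recent_video_ids(sample, now=datetime(2022, 5, 1, tzinfo=timezone.utc))
print(ids)
```

Repeated watches of the same video stay in the list on purpose, so later counts reflect how often something was watched rather than how many distinct videos there were.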

Googling more, I found that the YouTube API exists. While its endpoints were useless for getting my own data, they could return publicly available data, such as which channel a video belongs to and its tags. After a couple of hours of setting everything up, I was able to get channel details for my last three months of watch history. Just three months, as I didn't want to use up my free API calls.
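One way to keep the number of API calls down is to ask for several videos per request. The sketch below assumes the YouTube Data API v3 `videos.list` endpoint with `part=snippet`, which, as far as I know, accepts a comma-separated list of up to 50 IDs. The helper names are hypothetical, the API key is a placeholder you would supply yourself, and this is not necessarily how my own setup worked.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/videos"
BATCH_SIZE = 50  # assumed per-request limit on comma-separated video IDs


def batches(ids, size=BATCH_SIZE):
    """Yield de-duplicated IDs in lists of at most `size`, first-seen order."""
    unique = list(dict.fromkeys(ids))
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


def videos_url(ids, api_key):
    """Build the request URL for one batch of video IDs."""
    query = urlencode({"part": "snippet", "id": ",".join(ids), "key": api_key})
    return f"{API_URL}?{query}"


def parse_snippets(payload):
    """Map video ID -> (channel title, tags) from one decoded response."""
    result = {}
    for item in payload.get("items", []):
        snippet = item["snippet"]
        # Videos without tags simply omit the "tags" field.
        result[item["id"]] = (snippet["channelTitle"], snippet.get("tags", []))
    return result


def fetch_metadata(ids, api_key):
    """Fetch channel and tags for every ID; makes real network calls."""
    metadata = {}
    for batch in batches(ids):
        with urlopen(videos_url(batch, api_key)) as response:
            metadata.update(parse_snippets(json.load(response)))
    return metadata
```

Calling `fetch_metadata(ids, "YOUR_API_KEY")` would return a dictionary keyed by video ID. Videos that have since been deleted or made private may simply be missing from the result.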

The Easy Part - Inference

Now that the data was available, I was able to write small scripts to parse it. I tried inferring what type of videos I spent my time watching by counting the unique tags. While there was what I consider a lot of unwanted noise, I was still able to draw meaningful conclusions from the data without much fuss.

Tag Count
elden ring 205
Gameplay 92
Classics 63
Literature 63
T20 62
Literary Analysis 63
Myths 62
Cricbuzz 61
linus 56
rust programming language 41

The above is not reflective of my entire data, as I cleaned up the duplicates and noise. But overall it gives a better idea of how I spent my time on YouTube over the past three months. This is still not indicative of the complete picture, because a video can have multiple tags and creators tend to spam tags for SEO purposes.
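A minimal sketch of this kind of tag counting is below. It assumes the tags arrive as one list per watched video, and it treats tags that differ only in case or surrounding whitespace as the same tag, which is just one possible way to clean up duplicates. The function name and sample data are made up for illustration.

```python
from collections import Counter


def count_tags(tags_per_video):
    """Count how many watched videos carry each tag, ignoring case."""
    counts = Counter()
    display = {}  # normalized tag -> first spelling seen
    for tags in tags_per_video:
        seen = set()  # count a tag at most once per video
        for tag in tags:
            key = tag.strip().casefold()
            if not key or key in seen:
                continue
            seen.add(key)
            display.setdefault(key, tag.strip())
            counts[key] += 1
    return [(display[key], n) for key, n in counts.most_common()]


sample = [
    ["Elden Ring", "elden ring", "Gameplay"],
    ["elden ring "],
    ["T20"],
]
for tag, n in count_tags(sample):
    print(tag, n)
```

Counting a tag once per video softens the effect of tag spam within a single video, but it does nothing about creators who attach the same long tag list to every upload.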

The next step was to find out which channels I spend the most time watching. The list below has my top 24 channels.

Channel Name Count
PlayStation Access 100
gameranx 86
Linus Tech Tips 78
Overly Sarcastic Productions 70
Cricbuzz 62
IGN 58
LMG Clips 54
WatchMojo.com 50
Let's Get Rusty 39
Jarrod Kimber 38
The Grade Cricketer 36
Fextralife 35
Ethan Chlebowski 35
Saturday Night Live 32
The Graham Norton Show 32
Ashwin 31
The Late Show with Stephen Colbert 29
Dan Murrell 25
Jeremy Jahns 24
videogamedunkey 24
Marques Brownlee 19
Chris Stuckmann 18
MrMobile [Michael Fisher] 16
Eurogamer 15
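A ranking like the one above can be sketched with a `Counter`, assuming a list of watched video IDs (repeats included) and a lookup from video ID to channel name. Both inputs and the function name are hypothetical, and counting one hit per watch event is an assumption about what "Count" means here.

```python
from collections import Counter


def top_channels(watched_ids, channel_by_id, limit=24):
    """Return the `limit` most-watched channels as (name, count) pairs."""
    counts = Counter(
        channel_by_id[vid] for vid in watched_ids if vid in channel_by_id
    )
    return counts.most_common(limit)


watched = ["a", "b", "a", "c", "missing"]
channels = {"a": "Channel X", "b": "Channel Y", "c": "Channel X"}
print(top_channels(watched, channels))
```

IDs with no known channel, such as videos the API returned nothing for, are skipped rather than counted.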

This list is reflective of how I spent the three months in question. Broadly, these channels fall into one of the following categories.

  • Gaming
  • Technology
  • Entertainment & Stories (includes movies)
  • Cricket
  • Cooking (There is a lone cooking channel)

While I didn't have to go through the entire exercise of getting hard data and parsing it, it was satisfying to back up my anecdotal understanding of myself with hard data. Or this may have been an exercise in vanity.

Footnote of sorts

My original intention was to highlight a few channels which I think are great and enjoy immensely. And I wanted to back it up with data. While I can pick a few from this list, the reality is that some smaller ones can't keep pace with the bigger ones. I will probably do a separate post on it.
