Published on 23rd June 2018

Visualizing My Facebook Messenger Data


Recently, I discovered that Facebook lets you download the data it has saved about you, including every single message you’ve ever sent or received. So naturally, I downloaded my messages and got to work. The data go as far back as March 16th, 2010 (at 9:55pm, to be exact), the day I sent my first Facebook message.

What’s in the data?

  • 1,242 unique individuals I’ve messaged with
  • 231,665 total messages, 101,990 of which I sent
  • Thousands of cringeworthy messages from my adolescence I wish I hadn’t been reminded of

Messaging Trends Over Time

Even though I started using Facebook Messenger in 2010, my activity didn’t really pick up until a few years in.

Tracking the number of messages I’ve sent each month is pretty cool, but it only scratches the surface of what’s possible with these data. I can also take a look at my messaging rates with any individual, assuming they’ve sent me at least one message. Here’s an example from my good friend Alina, whom I met in August 2016:
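For anyone curious how monthly counts like these could be computed, here is a minimal sketch. It is not necessarily the code behind the plots in this post: the `(sender, timestamp)` record shape, the sample messages, and the `monthly_counts` helper are all assumptions made for illustration, since the layout of the Facebook export isn't covered here.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records: (sender, timestamp).
# The real export's layout is an assumption, not something shown in this post.
messages = [
    ("Me", datetime(2016, 8, 3, 21, 55)),
    ("Alina", datetime(2016, 8, 4, 9, 12)),
    ("Alina", datetime(2016, 9, 1, 18, 30)),
    ("Me", datetime(2016, 9, 2, 8, 5)),
    ("Alina", datetime(2016, 9, 15, 23, 41)),
]


def monthly_counts(messages, sender=None):
    """Count messages per (year, month), optionally for one sender only."""
    counts = Counter()
    for who, when in messages:
        if sender is None or who == sender:
            counts[(when.year, when.month)] += 1
    # Sort by (year, month) so the result is in chronological order.
    return dict(sorted(counts.items()))


all_by_month = monthly_counts(messages)
alina_by_month = monthly_counts(messages, sender="Alina")
```

Passing a `sender` gives the per-friend series; leaving it out gives the overall monthly totals.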

Evolving Relationships

Life is full of change; relationships evolve constantly, people come and go, and best friends often turn into ‘old friends’ as we move from one life chapter to another. This became abundantly clear to me when I looked up the ten friends who have sent me the most messages on Facebook and plotted each individual’s messaging patterns together.

The two individuals taking the #1 and #2 spots are outliers by any definition: they’ve sent me 27,593 and 22,904 messages respectively, each outranking the next eight friends combined (friends ranked #3-10 have sent me a total of 22,424 messages). Because my top two friends dominated the field, let’s omit them to get a better look at the rest of the bunch.
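The ranking and the "outranks everyone else combined" check can be sketched in a few lines. The sender names and counts below are made up for illustration (they are not my real numbers), and `top_senders` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical senders and received-message counts, for illustration only.
received = Counter({"A": 50, "B": 40, "C": 10, "D": 8, "E": 7, "F": 5})


def top_senders(received, n):
    """Return the n senders with the most messages, largest first."""
    return received.most_common(n)


ranked = top_senders(received, 6)
top_two = ranked[:2]

# Total sent by everyone in the ranking below the top two.
rest_total = sum(count for _, count in ranked[2:])

# True only if each of the top two individually exceeds the rest combined.
each_outranks_rest = all(count > rest_total for _, count in top_two)
```

Dropping the top two from a plot is then just a matter of slicing `ranked[2:]`.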

Here we can much more clearly see the messaging patterns of the other eight friends. So what all is going on here?

  • Even though I’ve been using Messenger since 2010, many friends from the top 10 didn’t emerge until 2012. I moved to a new city in January 2012, changing up nearly every aspect of my life. My social circle definitely changed in tandem, as it was in this new city that I met several best friends. This is also likely a product of not using Facebook Messenger much until after the move.
  • You may notice the yellow, black, and blue lines all gradually fade out in 2014, while many new relationships emerged shortly after and throughout 2015. I graduated high school in May 2014 and began college the following fall, a change which catalyzed the beginning of several new friendships, as well as, unfortunately, losing touch with old friends.
  • You may notice the green, turquoise, and pink lines all move together, rising and falling in unison. In fact, it’s almost shocking how correlated they all are, but there’s a clear explanation: these three individuals and I were in a group chat together. And upon closer look, another fun tidbit is revealed: the friend represented by the pink line wasn’t added to the group chat until a few months after it was created.
  • So out of these ten individuals, any guesses as to the one I’m the closest with? You may be surprised to learn that it’s the friend represented by the purple line! This is a good place to note one limitation of this study – I use a variety of means to communicate with friends, including phone texts and face-to-face conversations. So while the ‘purple’ friend doesn’t capture the top spots in terms of Facebook messaging, that’s likely because we tend to text and hang out in person more, an important reminder that these plots don’t capture the whole picture.

“Huh, I wonder what we were talking about.”

Upon learning that she had sent me the most messages in January 2017, Alina openly wondered what we had been talking about. A great question, and thankfully, one I can answer!

To determine the stand-out topics of a month’s worth of messages, I used tf-idf (which stands for term frequency-inverse document frequency, but we’ll stick with the abbreviation). For each month, I grouped all of Alina’s messages into one long text document and looked for words that appeared often. But here’s the kicker: tf-idf compares each document against the other documents in the collection, so if a word was used often in one month but was also common in other months, it wouldn’t stand out as a relevant conversation topic. Therefore, for a word to score highly on the tf-idf measure, it needs to have been used a lot in one month and not often in other months.

To recap:

  • If a word wasn’t used often, it’d have a low tf-idf score.
  • If a word was used often in several different months, it’d have a low tf-idf score. This would include words that we use all the time, like “Denison” and “food” (as in, “will you please buy me food?”).
  • If a word was used often in one month but not others, it’d have a high tf-idf score. What might fall into this category? Look below!
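The three cases above can be reproduced with a small, self-contained sketch. Two caveats: tf-idf comes in several variants, and this one (term frequency as a share of the month's words, times log of total months over months containing the word) is just a common textbook form, not necessarily the exact one used for the plots here. The monthly texts, the `tokenize` helper, and the `tf_idf` function are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical "one document per month" texts, for illustration only.
docs = {
    "2017-01": "food food denison trivia trivia trivia",
    "2017-02": "food denison consent consent",
    "2017-03": "food denison uber",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(docs):
    """Return {month: {word: score}} using tf * log(N / df)."""
    tokens = {month: tokenize(text) for month, text in docs.items()}
    n_docs = len(tokens)

    # Document frequency: in how many months does each word appear?
    df = Counter()
    for words in tokens.values():
        df.update(set(words))

    scores = {}
    for month, words in tokens.items():
        counts = Counter(words)
        total = len(words)
        scores[month] = {
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        }
    return scores


scores = tf_idf(docs)
# Highest-scoring word for each month.
top_word = {month: max(s, key=s.get) for month, s in scores.items()}
```

In this toy data, "food" and "denison" appear in every month, so their idf term is log(1) = 0 and they never stand out, while a word concentrated in a single month rises to the top.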

Calculating tf-idf on myself is particularly interesting because it includes every message I sent, regardless of who the recipient was. This therefore gives us a decent idea of what was going on in my life at the time or what I found to be important. Let’s take a look at what I was up to in the most recent full year, 2017.

This seems like an odd assortment at first glance, but every seemingly random word has a solid explanation…for the most part.

In February, much of my free time was devoted to my work in the student government (otherwise known as ‘DCGA’) and to promoting sexual consent.

April was a busy month! I hosted a trivia game, ate a lot of tacos, and advertised a very fun race to my friends.

In May I moved to LA for a summer internship and spent all of my hard-earned money on Uber.

Throughout June and July, I used Facebook Messenger to collect data for an article about social networks. Because I was sending the same long script to nearly 500 students, my Messenger usage skyrocketed during those months.

In August, I was frustrated by a major plot hole in the season 1 finale of The Flash (SPOILERS BELOW):

Throughout September, my senior seminar class hosted a public deliberation. I used Facebook to advertise the event and ask for potential discussion topics.

And in October, I did some research on the Teletubbies and shared my findings with anyone who would listen. No joke.

Cumulative Frequency Plot

Rather than plotting monthly counts, I can keep a running tally of total messages and use it to create a cumulative frequency plot. This shows how many messages had been sent in total by a given month, and the slope of the plotted line reveals the rate at which the message total is growing: steep stretches indicate more messaging than average, while flat stretches indicate less. For example, look at how the rate of messaging from a close friend of mine increased once I moved back to the same city as him and declined after we graduated high school and left for college.
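Here is a minimal sketch of the running tally, using made-up monthly counts (the `monthly` list is hypothetical, not my real data). It also shows why steepness matters: the rise of the cumulative line between two consecutive months is exactly the later month's message count.

```python
from itertools import accumulate

# Hypothetical monthly counts in chronological order, for illustration only.
monthly = [("2012-01", 5), ("2012-02", 20), ("2012-03", 40), ("2012-04", 10)]

months = [month for month, _ in monthly]
counts = [count for _, count in monthly]

# Running total of messages up to and including each month.
cumulative = list(accumulate(counts))

# Rise of the cumulative line between consecutive months.
# slopes[i] is the increase going into months[i + 1].
slopes = [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]

# Month in which the cumulative line climbed the most.
steepest_month = months[1 + slopes.index(max(slopes))]
```

Plotting `cumulative` against `months` gives the cumulative frequency plot; the steepest segment ends at `steepest_month`.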

Now that you’ve seen that example, check out how my cumulative message count has grown over time. Since the big bump in Facebook usage in 2012, it appears to have grown fairly consistently.


