Quinnipiac Assignment 01 – ICM501 – Ratings and Recommender Systems

Ratings and Recommender Systems

Unlike the offline world, the Internet is chock full of ratings and recommender systems. Why? Because metrics are everywhere. On YouTube, you know how many people watched a video, and who made it to the end. On WordPress, you know where your readers came from. On Facebook, you are constantly being served the likings, joinings, and friendings of your friends. Plus you have likely provided any number of cues about your preferences, your age, your gender, your marital status and/or sexual preference, and your location, among dozens if not hundreds of other data points. This includes everything from indicating you are a Red Sox fan to watching a cooking demonstration video to its end, to listing your college graduation year on LinkedIn, to joining a group devoted to German Shepherd Dog rescue, to reviewing a book on GoodReads about bi-curiosity. With all of this data, attempts are made to get a clear(er) picture of a person. With the picture comes an effort at predictability.

Amusingly, as I’ve been writing this blog post, to prove the point, the WordPress Zemanta plugin is currently serving me pictures of German Shepherd Dogs and a number of articles about them, allegedly related to this post. But, surprise! This post isn’t really about German Shepherds at all, this image notwithstanding.

German Shepherd Dog from 1915 (Photo credit: Wikipedia)

Close But No Cigar. Not Even a Kewpie Doll

Just as Zemanta screws up, so do plenty of other sites with recommender systems. Spotify, Pandora, and other music-matching sites fairly routinely seem to miss the mark. In September of 2013, Forbes reporter Amadou Diallo wrote about a search for a perfect playlist. In his article, Diallo compared iTunes Radio, Spotify, and Pandora, by using various seed artists to create playlists. The matching algorithms were given Stevie Wonder, Herbie Hancock, and The Alabama Shakes. Diallo concluded that Pandora had the best matching algorithm, but there were definite flaws with all three.

To my mind, a pure computer-driven search is a misplaced notion. One of the issues is categorization. For musical, film, book, and other recommendations, it’s all only as good as how it’s categorized, and often goods are poorly organized. Consider Johnny Cash. A country artist? Sure. Male artist? Of course. He came from a particular time period and his work was generally guitar-heavy. And then, late in his career, he threw a curve and recorded a cover of Nine Inch Nails’ Hurt. If recommender systems had existed when he released it, the song would have dented the algorithms, perhaps even fatally.

A further issue with recommender systems is that they seem to treat people’s preferences like computer problems. E.g., if you like, say, movies that involve the American South, history, and a strong male lead, you might be served, under a movie recommender system, both Gone With The Wind and Midnight in the Garden of Good and Evil. Yet one is a classic romance, whereas the other is a nonfiction work. Even if perfect granularity is achieved, and all of the seemingly relevant data points are hit, recommender systems still aren’t necessarily truly up to the task.

As J. Ellenberg says in “This psychologist might outsmart the math brains competing for the Netflix Prize,” Wired (2008, February 25): “Of course, this system breaks down when applied to people who like both of those movies. You can address this problem by adding more dimensions — rating movies on a “chick flick” to “jock movie” scale or a “horror” to “romantic comedy” scale. You might imagine that if you kept track of enough of these coordinates, you could use them to profile users’ likes and dislikes pretty well. The problem is, how do you know the attributes you’ve selected are the right ones? Maybe you’re analyzing a lot of data that’s not really helping you make good predictions, and maybe there are variables that do drive people’s ratings that you’ve completely missed.” (Page 3)
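
To make the idea of tracking "coordinates" a little more concrete, here is a minimal sketch in Python. It is not how Netflix or any real service does it. The attribute names, the scores, and the third film are all invented for illustration. Each movie gets a score on a few hand-picked scales, a user gets a weight for each scale, and the recommender simply multiplies and adds.

```python
# Toy attribute-based recommender. Every name and number below is a
# hypothetical value chosen for illustration, not real rating data.
MOVIES = {
    "Gone With The Wind": {
        "american_south": 1.0, "history": 1.0, "strong_male_lead": 1.0,
    },
    "Midnight in the Garden of Good and Evil": {
        "american_south": 1.0, "history": 1.0, "strong_male_lead": 1.0,
    },
    "Hypothetical Space Comedy": {
        "american_south": 0.0, "history": 0.0, "strong_male_lead": 1.0,
    },
}


def score(user_profile, movie_attributes):
    """Weighted sum of the movie's attribute scores, using the user's weights."""
    return sum(
        weight * movie_attributes.get(name, 0.0)
        for name, weight in user_profile.items()
    )


def recommend(user_profile, movies):
    """Return movie titles ordered from best to worst predicted match."""
    return sorted(
        movies,
        key=lambda title: score(user_profile, movies[title]),
        reverse=True,
    )


# A made-up user who likes the South, history, and strong male leads.
user = {"american_south": 0.9, "history": 0.8, "strong_male_lead": 0.5}
scores = {title: score(user, attrs) for title, attrs in MOVIES.items()}

for title in recommend(user, MOVIES):
    print(title, round(scores[title], 2))
```

In this sketch the two Southern films come out with exactly the same score, because "romance versus nonfiction" is not one of the tracked scales. That is the problem in miniature: the system can only tell apart what its chosen attributes can see.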

There are any number of thoroughly out there reasons why people like or dislike something or other. Some are far from quantifiable, predictable, or replicable. They can’t be scaled to the entire population, or even one of its segments. Do we prefer a particular song because it reminds us of a point in our life that is no more? Do we avoid a film because it’s where we took our lost love on our first date?

Going Along to Get Along

Another issue with recommender systems is that people can often be persuaded one way or another. The Salganik and Watts study is rather interesting in this regard. These two researchers presented subjects with a number of unreleased songs and asked them to rate the songs and also download whatever they liked. Certain songs rose to the top of the charts (just like we normally see on Billboard, the Hot 100 and the like) whereas others were clunkers that fell swiftly. When the researchers switched the presented numbers, showing higher ratings for the stinkers and lower ratings for euphony, test subjects changed their minds. All Salganik and Watts had to do was convince their test subjects that this was the right outcome.

Salganik, M. J., & Watts, D. J. (2008). Leading the herd astray: An experimental study of self-fulfilling prophecies in an artificial cultural market. Social Psychology Quarterly, 71(4), 338–355. “…over a wide range of scales and domains, the belief in a particular outcome may indeed cause that outcome to be realized, even if the belief itself was initially unfounded or even false.” (Page 2)

Are these instances of undue influence? Self-fulfilling prophecies? Test subjects wanting to appear ‘cool’ or go along with the majority in order to increase personal social capital? And where are ratings and recommender systems in all of this? Are they measuring data? Or is it, as is the case with the observer effect, that the very acts of observation and measurement are skewing the numbers and generating false outcomes?

Or is it perhaps still the case that there’s no accounting for taste?

Enjoy Johnny Cash (but only if you want to).


Of Ice Buckets and Virality


You have to be on another planet (or, offline) to not know about this. It has, most likely, been covered in the mainstream offline media by now (I don’t watch a lot of television, and I have not yet seen it in newspapers, but that does not mean it is not coming).

I am, of course, talking about the terrifically inventive charitable idea du jour – the ALS Ice Bucket Challenge.


Amyotrophic lateral sclerosis MRI (parasagittal FLAIR) demonstrates increased T2 signal within the posterior part of the internal capsule that can be tracked to the subcortical white matter of the motor cortex (outlining the corticospinal tract), consistent with the clinical diagnosis of ALS. (Photo credit: Wikipedia)

On July 31st of 2014, Pete Frates, who has ALS, challenged some celebrities, including New England Patriots quarterback Tom Brady, to what would become known as the ice bucket challenge. The challenge already existed, per the linked article, but the concept did not begin to gain viral social media traction until July 31st.

The premise is simple – either donate $100 to ALS research or douse yourself with ice water. You’ve got 24 hours and, once you’ve done either (or both), challenge at least three more people. The idea spread virally. A lot of people were pleased to see something other than Gaza and Ferguson in their news feeds.


ALS (amyotrophic lateral sclerosis) is a serious medical condition for which there is no known cure. It is a disease of the nerve cells in the brain and spinal cord that control voluntary muscle movement. ALS is also known as Lou Gehrig’s disease. Complicating matters is the fact that only about 30,000 people in the United States are known to have it. Hence big drug manufacturers aren’t pursuing cures, as such research is not profitable enough. It is essentially an orphan disease.

But let’s get back to the challenge.

Purpose of the Challenge

I can come up with five purposes, and probably three (maybe four) of them were not the original ones.

  1. Raise money for ALS research
  2. Boost awareness of the disease
  3. Lean on drug companies to work on developing a cure
  4. Lean on politicians to pressure drug companies or perhaps pass laws subsidizing or otherwise encouraging research into orphan diseases
  5. Hire a celebrity spokesperson (or more than one) to advocate for the victims of the disease

The challenge performs the first two tasks perfectly. Either you pony up the funds or you get soaked – and that’s all done on camera and is uploaded to social media. Most people reveal their choice on Facebook, which has over 1.4 billion users. The last three purposes are the kinds of things that this sort of attention can be used for. I do hope the big folks in ALS research don’t squander this opportunity, and that they try for all three.

Why is it Viral?

The challenge hits about every mark when it comes to virality. Here are some reasons.

  1. It has a strong visual and auditory appeal. The dousing, the screaming, or even smug people signing checks and getting doused anyway – don’t underestimate how much humor there is in seeing someone getting their (allegedly deserved) comeuppance.
  2. Humor is one component of virality, or at least it is one of the elements that is somewhat more likely to be present when any piece of media goes viral. By definition, the flipping of super-chilled water onto anyone’s head is going to be funny.

More Reasons

  1. Another component that is often present in viral media is uplifting images, text, and actions. This is why Upworthy does as well as it does. When people write a check instead of dump water, they hit this mark instead. Either way, if they speak a bit about ALS, they also hit this mark.
  2. Timing – initiating the challenge in January would not have worked as well. While people have been answering polar bear-style challenges for years, if you want to go viral, you want the majority to participate, or at least believe that they might want to participate. Selecting the month of August was brilliant, as this is either the warmest or second-warmest month in most years in the Northern Hemisphere. That is, the hemisphere where people are, in general, more likely to be wealthy and more likely to be online. In short, these were the people most likely to either participate in the challenge or at least watch videos of it and read articles (like this one!) about it.

Yet More Reasons

  1. The second piece of timing is how it came when so many people had seen a lot about Gaza and Ferguson, as I stated previously. For many, the challenge was a welcome bit of good news in an otherwise dreary Dog Days of Summer media landscape.
  2. There is an element of daring in it but, except for the elderly (Sir Ian McKellen notwithstanding) and the gravely ill, it’s not really dangerous. But do watch out for slippery floors. Yes, there are already blooper videos out there (many of them are NSFW; Google is your friend if you’re interested in such things).
  3. Virality is baked right into it. One aspect of the challenge is to call out at least three other people and challenge them. Furthermore, they have 24 hours to respond either way. Hence each participant adds three more names. However, the names begin to repeat after a while. So the trebling of participants slows down, eventually shrinking a lot closer to a doubling of participants. The duplication of names also happens because most people run in somewhat small circles and share neighbors, friends, family members, etc. The 24-hour time limit plays very well into most people’s demands for more and more and different entertainment to consume, on a daily, hourly, or even minute-by-minute basis.
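
The arithmetic behind that last point, trebling that slows down as names repeat, can be sketched with a toy model in Python. This is only an illustration under loud assumptions. The population size is invented, everyone who is freshly challenged takes part, and the chance that a nominee has already participated is simply the share of the population that has already participated. Real social circles overlap far more unevenly than that.

```python
def simulate(population, nominations=3, rounds=10):
    """Toy model of challenge spread; returns new participants per round.

    Assumptions (not from any real data): each new participant nominates
    `nominations` people, and the share of nominees who are fresh equals
    the share of the population that has not yet taken part.
    """
    participated = 1.0   # the person who starts the chain
    new = 1.0            # new participants in the current round
    history = [new]
    for _ in range(rounds):
        fresh_fraction = 1.0 - participated / population
        # Cap at the number of people who are still left to challenge.
        new = min(new * nominations * fresh_fraction,
                  population - participated)
        participated += new
        history.append(new)
    return history


# Hypothetical circle of 1,000 people.
history = simulate(population=1000)
for round_number, count in enumerate(history):
    print(round_number, round(count, 1))
```

Early on, each round is very nearly three times the size of the one before. As more of the circle has already been doused, the multiplier drops round after round, and the chain eventually runs out of fresh names altogether.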

Still More Reasons

  1. Social media has shaped a lot of our behaviors, and the ice bucket challenge plays nicely into how ordinary people are finding out that they are now entertainers online. They have followers, and they are beginning to understand that they need to provide content for their followers. That is why so many people are obsessed with taking selfies, or Instagramming everything they eat or wear: they know they need to provide content, but they’re stumped as to what to provide! The challenge provides a perfect prompt.
  2. The involvement of celebrities added a cool factor. The involvement of all sorts of politicians from across the political spectrum gave political rivals a way to talk to each other. After all, who could be against trying to defeat a horrible disease?

The challenge has even spawned parodies and copycats. There’s the rice bucket challenge in India, the rubble bucket challenge in Gaza, and the Orlando Jones bullet bucket challenge. It’s a helluva clever campaign, and deserves all the props it’s been getting.


Amusing and upbeat, the challenge has, as of the writing of this blog post, raised over $50 million for research. Please give generously and, in the meantime, enjoy Bill Gates getting drenched.


Quinnipiac Assignment #12 ICM524 – Final Project (Journalism)

Should Journalism Be Data Driven?

For my final project for Quinnipiac University’s Social Media Analytics class, I created a short presentation about journalism and data. This video is available on YouTube.

My essential question was whether data and story popularity should be drivers for journalistic choices. Those choices are everything from what to put on a ‘front page’ to what to bold or italicize, to where to send scarce (and expensive) reporter resources, to what to cover at all.

Popularity Breeds Contempt

For news organizations looking to save some money, it can be mighty appealing to only cover the most popular story lines. News can very quickly turn into all-Kardashian, all the time, if an organization is not careful. For a news corporation searching for an easier path to profitability, hitching its metaphoric wagon to the popularity star might feel right. After all, and to borrow from last semester’s Social Media Platforms class, it has buyer personae to satisfy. If all of its readers or viewers or listeners want is to know the latest about Justin Bieber or Queen Elizabeth II, then why shouldn’t a news organization satisfy that demand?

But there is a corollary to all of this.

Journalism is going to survive. I just don’t see how the businesses that have provided it will survive – Clay Shirky @cshirky #openjournalism #quotes (Photo credit: planeta)

News organizations often have dissimilar foci. If I am reading, say, the Jewish Daily Forward, I am looking for news, most likely, about either the Jewish people or Israel, or at least for stories which are relevant to either of these two not-identical (albeit somewhat similar) entities. Hence a story about the Kardashians, for example, is not going to fly unless it can be related somehow.


Dovetailing into all of this is journalistic ethics. Shouldn’t journalists be telling the stories of the downtrodden, the oppressed, and the forgotten? I well recall the coverage of Watergate as it was happening (even though I was a tween at the time). I’m not so sure that many people today appreciate the sort of courage that that really took.

What is the future of journalism? I feel it has got to be both: driven by the data and driven by ethics. There must be a combination. News organizations need to show profits just as much as all other businesses. But that should not come at the expense of their responsibilities.

This was a great class, and I learned a lot. My next semester starts on August 25th.