My interview with Aleksandr Kogan: what Cambridge Analytica were trying to do and why their algorithm doesn’t work.

David Sumpter
11 min read · Apr 22, 2018


Alex Kogan is not an easy person to get an interview with. The man who set up the thisisyourdigitallife app, which collected data from 30 million Facebook users and was allegedly used by Cambridge Analytica during the U.S. Presidential campaign, has until now been reluctant to talk about his relationship with the political consulting firm. And after signing a non-disclosure agreement, he risks legal action if he talks openly to journalists about the events surrounding Cambridge Analytica.

I first contacted Alex in summer 2017. I was researching my book Outnumbered and I had read the Guardian’s article on Cambridge Analytica’s use of data in the Ted Cruz campaign. Alex had created an app that paid people a few dollars to take a personality test; it then asked their permission to access their own and their friends’ Facebook profile data. Alex had experienced first-hand, through his previous academic research, just how willing people were to hand over their personal data. In a study he conducted on international friendships, 80% of participants provided access to their profile and their friends’ location data in exchange for just $1. The same principle now worked on a larger scale, and Alex quickly collected data on millions of people.

[Photo: Alex Kogan]

At first Alex didn’t want to talk to me. When I emailed him a list of questions he replied, “Given where I think the book is going, I’d like to decline to make any statements. Not that there’s anything to hide here. More that I’ve made it a point to not make public comments about anything related to politics and Facebook because it is an issue that folks have run with to scandalize and mislead the general public.”

I found a post Alex had made on, ironically, Facebook, where he went into more detail. He was upset by sensationalist news sites and blogs discussing his name change, from Kogan to Spectre, which they speculated was linked to some form of James Bond-like spying activity. In fact, he explained on Facebook, it was a name chosen by him and his wife to celebrate their union. While I could sympathize with many of the bloggers, who were documenting the activities of Wall Street magnate Robert Mercer, the financier of Cruz and then Trump, I felt sorry for how Alex’s personal life had been intruded upon to provide fuel for the story.

When it comes to writing about science, I find that there are certain advantages to being a researcher first and an author second. Academics have an obsession with understanding things. That is why we become scientists. We want to dissect the world and find out the truth, and we want to talk to each other about what we find out. When I ask for interviews, I often find other researchers are happy to talk to me, as a researcher.

So I tried another tack when I next emailed Alex. Instead of asking questions, I told him about the research I had done on personality algorithms. I had been intrigued by a video of (then) Cambridge Analytica CEO Alexander Nix speaking at the 2016 Concordia Summit. He said his company could ‘predict the personality of every single adult in the United States of America’. Highly neurotic and conscientious voters were targeted with the message that the ‘second amendment was an insurance policy’. Traditional, agreeable voters were told about how ‘the right to bear arms was important to hand down from father to son’. Nix claimed that he could use ‘hundreds and thousands of individual data points on audiences to understand exactly which messages are going to appeal to which audiences’.

[Photo: Michal Kosinski]

To test these claims, I had been analysing a data set similar to the one collected by Alex. Between 2007 and 2012, the MyPersonality project tested millions of people’s personalities and collected their Facebook ‘likes’ in order to find a link between the two. The project was run by Michal Kosinski, who I had interviewed earlier in the year. Michal was convinced of the power of this data. He had used the ‘likes’ people made on Facebook to accurately predict whether they were Democrats or Republicans, gay or straight, as well as their ethnicity. When I spoke to him, Michal repeatedly emphasized that algorithms are better than humans at classifying other humans. Like Nix, he was describing a world where algorithms could be used to manipulate our personalities. Unlike Nix, Michal wasn’t proposing to use this knowledge to tailor political messages and get populist Republicans elected.

I had reanalysed part of the MyPersonality data set and reached a different conclusion. While it was clear that Facebook ‘likes’ could reveal hardcore Democrats (they like Barack and Michelle Obama, National Public Radio, TED Talks, Harry Potter and I Fucking Love Science) and Republicans (they like George W. Bush, The Bible, country music and camping), I found that they couldn’t convincingly determine whether a person was likely to be neurotic or not. This was central to Cambridge Analytica’s business idea: in 2016 they aimed to identify and target our personalities.

When I told Alex about my research results in a follow-up email, I got a very different response. Not only did he reply at length, but he sent me a draft article he had been working on. He told me that the way Michal Kosinski presented his work was misleading, because the errors in the predictions were so large that Facebook ‘likes’ gave very little insight into individuals’ personalities. It just wasn’t possible to reliably determine whether a person was neurotic from their ‘likes’.

Alex and I started to correspond via email, and a few weeks later we set up a time to talk. He was open with me about his interactions with Cambridge Analytica, insofar as he could be given the non-disclosure agreement he had signed. Given the company’s willingness to take legal action against the newspapers writing about the affair, I am not going to repeat anything about what happened to the data Alex collected (the text of Outnumbered has been scrutinized by Bloomsbury’s lawyers), but nothing Alex said substantially contradicted Chris Wylie’s account in the Guardian.

What I can say about Alex is that, while he emphasized that he hadn’t broken any laws or professional guidelines, he was also very open about his failings. He admitted to me that he had been naïve. He had never worked with a private company before, having been in academia throughout his undergraduate degree at Berkeley, his PhD in Hong Kong and now his research position at Cambridge. ‘I didn’t really appreciate how business is done,’ he told me.

He also admitted that he hadn’t considered other people’s feelings and perceptions when they heard about the Facebook data collection. ‘It is pretty ironic, if you think about it,’ he said. ‘A lot of what I study is emotions and I think if we had thought about whether people might feel weird or icky about us making personality predictions about them, then we would have made a different decision.’

[Photo: Alexander Nix]

As an academic, Alex could see the advantages of working with a company like Cambridge Analytica. At the time, vast quantities of social-media and consumer data were being traded between companies, and access to this data was interesting from a research perspective. Cambridge Analytica’s aim, as Nix had outlined in his talk, was to link this data together with a model. Alex found this challenge interesting. “Lots of academics consult for companies”, he told me, “and I was reassured by the fact that they had lots of people with Cambridge PhDs working for them”.

But Alex soon realized that the plan was impractical. Like me, he found that the correlations between ‘likes’ and personality are very weak (the correlation coefficient was around 0.3). And in order to target people in a political campaign, this data needed to be further correlated with a consumer data set and electoral records. Alex laughed as he summarized how the plan amounted to “taking a noisy signal and making it even noisier. You’ve got one 0.3 correlated data set and you are going to correlate it with another 0.3 correlated data set. This would give correlations of less than 0.1 of the original.”
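Alex’s back-of-the-envelope argument is easy to check numerically. Below is a minimal sketch (my own illustration, not code from Alex or from Outnumbered) assuming the simplest linear picture: ‘likes’ correlate with personality at 0.3, and the consumer data correlates with the ‘likes’-based scores at 0.3, so the end-to-end correlation collapses to roughly 0.3 × 0.3 = 0.09.

```python
# Hypothetical simulation: chaining two weakly correlated signals.
# The variable names and the linear model are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, r = 100_000, 0.3

personality = rng.standard_normal(n)

# A 'likes'-based score that correlates with personality at r = 0.3
likes_score = r * personality + np.sqrt(1 - r**2) * rng.standard_normal(n)

# A consumer-record score that correlates with the likes score at r = 0.3
consumer_score = r * likes_score + np.sqrt(1 - r**2) * rng.standard_normal(n)

print(np.corrcoef(personality, likes_score)[0, 1])     # ~0.30
print(np.corrcoef(personality, consumer_score)[0, 1])  # ~0.09
```

The second correlation comes out at about 0.09, in line with Alex’s “less than 0.1” figure: each noisy link in the chain multiplies the signal down.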

Alex was blunt when he talked about how Alexander Nix had presented Cambridge Analytica’s work. “Nix has very little comprehension of what he is talking about,” he told me. “He is trying to promote [the personality algorithm] because he has a strong financial incentive to tell a story about how Cambridge Analytica have a secret weapon.”

The impression Nix gave back in 2016, that Cambridge Analytica had a secret weapon, would later come back to haunt the company.

Alex Kogan often sounded genuinely disappointed with himself, while at other times he laughed in exasperation at the ridiculousness of the situation he now finds himself in. One Guardian article made a link between his having accepted a research grant from the University of St. Petersburg and Cambridge Analytica’s work for Trump during the presidential election. Such an arrangement is not unusual in the international world of academia, and hardly implies that he was paid by Russia to get Trump elected.

I have continued to correspond regularly with Alex via email, and he told me he was getting phone calls from journalists asking him in all seriousness if he was a Russian spy. He remains somewhat perplexed by the attention his app has received and feels that the real story of the limitations of personality algorithms has been lost in the noise around Facebook and Cambridge Analytica.

I can’t help but feel sympathy for his situation. At the time Alex did his work, creating apps to download Facebook users’ data for research was seen in some academic circles as clever, rather than wrong. Many researchers felt that Facebook should be more open, that they should allow us to look at their data (with identities removed) to help us answer questions about friendship networks, about the spread of ideas online and about personalities. In 2012, I would not have reflected on the ethics if I had heard either Alex or Michal Kosinski present their work at an academic conference, although I think I would have been uncomfortable had I known about a collaboration with Cambridge Analytica.

[Photo: Chris Wylie]

Chris Wylie described his work with Alex Kogan as creating ‘Steve Bannon’s psychological warfare tool’. In the Guardian article that broke the Cambridge Analytica story, Wylie implies that he succeeded. He claims that he ‘broke’ Facebook.

While it is important that Wylie’s story is told, these particular claims are vastly overstated. Recently, when I asked him about Chris, Alex told me, “He is speaking outside of his expertise. He’s not a data scientist. At SCL, he dealt with business development and data law. He had no role I know of in handling data and certainly no role in modeling.”

My own research supports Alex’s view on the accuracy of the method. When I put the profiles of pairs of Facebook users into a personality algorithm based on MyPersonality data, it identified the more neurotic user only about 60 per cent of the time, not much better than the baseline 50 per cent accuracy of guessing at random. Similar results are found for predicting people’s extroversion, conscientiousness and agreeableness. The model is a bit better at classifying people in terms of their openness, getting it correct about two-thirds of the time. But it is still a weak signal in all that social media noise.
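That 60 per cent figure is, in fact, exactly what a correlation of around 0.3 predicts. For jointly normal scores with correlation r, the probability of correctly ranking a random pair is 1/2 + arcsin(r)/π, which for r = 0.3 gives about 0.597. Here is a short sketch (again my own illustration, not the analysis code behind Outnumbered) that checks this by simulation:

```python
# Hypothetical check: the pairwise accuracy implied by a 0.3 correlation
# between predicted and true neuroticism. Names are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, r = 100_000, 0.3

true_score = rng.standard_normal(n)
predicted = r * true_score + np.sqrt(1 - r**2) * rng.standard_normal(n)

# Split into random pairs and ask: does the higher prediction
# pick out the person with the higher true score?
a, b = true_score[: n // 2], true_score[n // 2 :]
pa, pb = predicted[: n // 2], predicted[n // 2 :]
accuracy = np.mean((a > b) == (pa > pb))

print(accuracy)                    # ~0.60 by simulation
print(0.5 + np.arcsin(r) / np.pi)  # ~0.597, the closed-form value
```

In other words, a weak correlation and a barely-better-than-chance pairwise classifier are two descriptions of the same underlying signal.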

Michal Kosinski takes a more cautious view than I do. When I interviewed him for Outnumbered, he emphasized the dangers of computers building up a hundred-dimensional representation of us. “We are better than computers at doing insignificant things that we, for some reason, think are very important, like walking around,” Michal told me. “But computers can do other intelligent tasks that we can never do.”

In Michal’s view, the personality algorithms he has created are the first step towards a computerised, high-dimensional understanding of human personality that outperforms the understanding we currently have of ourselves.

Nor was Michal convinced by my analysis. When I showed him my results, he accepted the numbers I had got for the accuracy of the algorithms, but disagreed with my interpretation. He argued that it isn’t predicting personality per se that is the problem, claiming “we should worry about algorithms predicting future behavior. Human psychologists need to reduce the complexity of human behavior to five numbers; computers don’t.” He sees the correlation between Facebook likes and psychologists’ personality scores as a side-effect of algorithms understanding us more deeply than we understand ourselves.

Other academics have come to similar conclusions to mine. The cognitive anthropologist Chris Kavanagh argued convincingly on Medium why he thought that (almost) everything reported about Cambridge Analytica is wrong. I have interviewed several psychologists who told me that the claims around personality and social media are overblown. Cathy O’Neil’s book Weapons of Math Destruction, written before the Cambridge Analytica story came out, showed that the most common problem with algorithms is not their pinpoint accuracy, but their tendency to get everything wrong.

When I think back to my interviews with Michal Kosinski and Alex Kogan, former colleagues and collaborators in Cambridge whose careers have taken very different directions and who no longer speak to each other, I can’t help thinking about the challenge I use to test algorithms: to see how often they correctly identify the more neurotic of two individuals. I’m not accusing either Michal or Alex of being neurotic, far from it, but when it comes to algorithms they have two very different concerns. Michal’s concern is consistent with the dystopian vision painted of Cambridge Analytica; Alex’s concern is that the hype about personality algorithms is overblown and that he is one victim of the hyperbole.

I have, both in this article and in Outnumbered, come down on Alex’s side on this particular point. But, just as we do when we try to identify which of two people is the more neurotic, the question I should really answer is ‘what is the probability that I have made the correct classification?’

I can’t, I’m afraid, go any higher than 80% on that one. There is still a 20% chance that a dystopian future of social-media mind control is on its way. Either way, what Carole Cadwalladr’s investigation of Cambridge Analytica and Facebook has revealed is that we cannot leave it to ambitious young men, like the four protagonists of this story, to decide our digital future on their own.

Read more about Cambridge Analytica and the many other algorithms that impact our lives in Outnumbered. Out Now.
