This morning, the Samaritans announced SamaritansRadar, a tool that flags social media content within your network that might indicate someone is vulnerable. The tool was created by a digital agency – SpreadingJam – and claims to use a ‘specially designed algorithm that looks for specific keywords and phrases’ (which technically doesn’t sound like an algorithm).
The intention is probably noble. It’s a good thing that groups like the Samaritans think about how to get involved in social media: we know it’s somewhere that people increasingly go when they are feeling low. Of course, there are moral implications around an app that flags the vulnerable when they are most vulnerable. There are issues around how those looking to use social media to communicate would respond knowing their tweets were being flagged up. Some of these ethical questions are covered by the journalist Adrian Short here, and have been responded to (in part) by the Samaritans here.
My issue is with how it has been delivered. As social media users we leave a rich trail of data that can reveal an unprecedented amount about who we are and how we feel. There are countless examples of using data and analytics to understand and categorise us, from Walmart’s receipt analytics to work done by Google to predict flu trends. However, turning this mass of data into something meaningful is extraordinarily difficult. Trying to cash in on ‘big data’ and ‘analytics’, companies are selling tools to businesses (and charities) that promise much but deliver little.
The work we do at the Centre for the Analysis of Social Media relies on tools and technologies that have been developed over years with some of the leading figures in text analytics, natural language processing and machine learning. Nevertheless it’s very difficult indeed to determine accurately the genuine sentiment, feeling or emotions behind a Facebook post or tweet, especially when you start using algorithms.
Automating the real-time identification of a person’s vulnerability, on a subject as sensitive and nuanced as suicide, is harder still. Text analytics of social media platforms requires technology and tools far beyond simple word recognition. This is, of course, pretty obvious. Because the app appears to rely on matching keywords and phrases, it seems to be flagging a lot of people who are merely talking about the ‘suicide app’.
Take a look on Twitter yourself. A single search for ‘suicide’ finds references to news, lectures, opinions, ISIS and alternative glamour models. This could be a case – and it’s a fairly common problem – of being blinded by the promise of data, and of a well-intentioned charity trying to reach out to vulnerable social media users with a tool which simply cannot capture the nuances necessary to make it really effective.
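To make the point concrete, here is a minimal sketch in Python of what simple word recognition does. It is not the app’s actual code, which we have not seen: the keyword list, the `keyword_flag` function and the three example posts are all invented for illustration.

```python
import re

# Hypothetical keyword list; the real tool's list is not public in this post.
KEYWORDS = ["suicide"]

def keyword_flag(text, keywords=KEYWORDS):
    """Return True if any keyword appears as a whole word in the text."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(k) + r"\b", lowered) is not None
        for k in keywords
    )

# Invented example posts, not real tweets.
posts = [
    "Lecture tonight on suicide prevention policy",
    "News: charity launches new suicide app",
    "Feeling really flat today, can't see the point",
]

flags = [keyword_flag(p) for p in posts]
for post, flagged in zip(posts, flags):
    print(flagged, "|", post)
```

The first two posts are flagged although neither author is expressing distress, while the third, which never uses the keyword, is not. Adding more keywords changes which posts are caught, but not the underlying weakness: the matcher knows nothing about context.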
We don’t know how much testing has been done of this app, but from our own experience, we know the correct way to go about this would be to build a model. It would take weeks. You would need informed consent from hundreds of people who admit feeling very depressed or suicidal and are posting online. You would collect that data and train an algorithm to look for statistical patterns in their language use. Once it’s performing at a reasonably accurate level, you would then use that model to search for other examples where similar language is being used by others.
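The shape of that workflow (collect labelled posts, train, check accuracy on posts held back from training, then apply to new text) can be sketched with a toy Naive Bayes classifier using only the Python standard library. This is an illustration of the general approach, not the tooling we use, and Naive Bayes is simply one convenient stand-in for ‘an algorithm that looks for statistical patterns’. The labels, the function names and the handful of example posts are invented; a real model would need hundreds of consented examples and far more careful evaluation.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on whitespace; a real system would do much more."""
    return text.lower().split()

def train(examples):
    """Count words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}

def predict(model, text):
    """Return the most probable label, using add-one (Laplace) smoothing."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, count in model["labels"].items():
        score = math.log(count / total)
        denom = sum(model["words"][label].values()) + vocab_size
        for word in tokenize(text):
            if word in model["vocab"]:  # ignore words never seen in training
                score += math.log((model["words"][label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, examples):
    """Fraction of held-out examples whose label is predicted correctly."""
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)

# Invented toy data standing in for consented, hand-labelled posts.
train_set = [
    ("cannot sleep again and everything feels pointless", "concern"),
    ("so tired of it all nobody would notice", "concern"),
    ("news report on the new suicide app", "other"),
    ("lecture on suicide statistics tonight", "other"),
]
held_out = [
    ("everything feels pointless tonight", "concern"),
    ("report on the new app", "other"),
]

model = train(train_set)
print("held-out accuracy:", accuracy(model, held_out))
```

Note that in this toy data the word ‘suicide’ only ever appears in the ‘other’ posts, so the model learns that the keyword alone is not evidence of distress. That is the kind of pattern a fixed keyword list cannot pick up, though six made-up sentences prove nothing about how such a model would perform on real posts.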
The truth is, at the moment, no one really knows how suicidal individuals talk on Twitter – but I’m fairly certain it’s often not a case of saying it explicitly. Natural language processing can help spot and understand those underlying patterns, and using it to better understand how people are expressing themselves on social media could make a big difference. But it’s long and difficult work.