Making the Most of Online Health Data


The internet is changing people’s approach to healthcare. The last 10 years have witnessed a dramatic increase in the number of people going online to seek out information and advice, to research health issues and connect with those sharing their experiences. NHS Choices, for example, has seen a steady increase in usage every year since 2011, with over 31 million unique visitors to the site in January 2015 alone.

In many ways, this change should come as no surprise. The internet offers a fast, free method to seek medical advice without having to make and keep an appointment with a GP. It also provides a degree of anonymity; the disinhibitory effect of posting online, often under a pseudonym, has long been documented, and people often feel able to disclose facts on web forums which they would be unlikely to discuss face-to-face. This latter aspect is of particular significance for areas such as mental and sexual health, where patients may feel unable or unwilling to talk to a professional.

At the same time, however, these are often unregulated spaces, and the advice and information they offer can be inaccurate, misleading, or even dangerous. Equally, online communities can sometimes become small echo-chambers of peers who normalise or even encourage dangerous behaviour.

All of this activity has generated an extremely large body of publicly accessible data on health, from discussions on the unexpected side-effects of drugs, to detailed opinions on hospitals and forms of treatment. This information is accessed and offered through a broad variety of sites, ranging from official, accredited sources of medical advice to small, community-run forums.

At present, this potentially valuable store of information remains broadly untapped by the healthcare profession. Indeed, the process of gaining any benefit from these data for health professionals, commissioners, researchers and patients themselves is extremely difficult. While sophisticated tools have been developed to enable meaningful analysis of the huge quantities of information involved, they have been most eagerly taken up by those with commercial or political interests, in an attempt to sell products or win elections; little research has yet been conducted into the potential use of these technologies to make sense of the mass of unstructured public health data. There is also some uncertainty as to how exactly an understanding of these data could be used to benefit patients and organisations.

There are also serious ethical considerations concerning the collection and analysis of these data. Crucial here is the issue of online privacy. In many places on the web – on Twitter for example, or in the comment section of a YouTube video – it is widely understood that posting or commenting is a public act. On a citizen-run forum, however, with a close-knit community of subscribers and contributors, this assumption is not so easily made; although the data may be publicly accessible, serious research needs to be done to determine what can and should be analysed, and how this can be achieved without causing harm to the author.

This week the Centre for the Analysis of Social Media (CASM) at Demos, in association with the King’s Fund, has launched a research project that aims to shed light on some of these topics. This year-long study will provide an opportunity to bring powerful web-crawling and natural language processing technologies to bear on the questions:

1. How might we harness the growing body of patient-generated content online, and
2. What might British healthcare look like if we can?