Mar 05 2014 | Posted by SSBibek

‘Big Data’ research at the University of Leeds

Leeds’ Pro-Vice-Chancellor for Research and Innovation, Professor David Hogg, says the University is poised to be a leader in Big Data research:

We have heard quite a lot about ‘Big Data’ recently, but it is not clear how successful the media has been in explaining to the public what it actually means – beyond a general impression that it involves ‘data’ and is, well, rather ‘big’. 

In fact, developing the ability to deal with the exponential growth in the availability of massive datasets is one of the key challenges facing our society, and is critical to the future of a major research university like Leeds. The infrastructure of the modern university, built up over centuries (research libraries, laboratories, lecture theatres and journals), brought us to where we are today. Big Data analysis will be essential to our future.

This goes to the heart of what we do as researchers. Where we have traditionally worked with relatively small samples of tens, hundreds or thousands of research participants, the accessibility of online data allows us to interrogate huge datasets that describe what is actually going on around us.

In an area such as cancer care, for example, the benefits could be enormous. If we can process the anonymised data in medical records better – cross-referencing the clinical characteristics of patients who have agreed to the use of their data for this purpose with the molecular features of their cancer, their treatments and outcomes – we could tailor individuals’ treatments more closely, based on the past experiences of patients with similar characteristics.

The government’s Big Data funding announcement at the start of February was very good news for the University. We were given funding from four of the research councils, including two multimillion-pound grants from the Medical Research Council (MRC) and the Economic and Social Research Council (ESRC).

Leeds is already a recognised centre for Big Data, with key pillars of strength in areas including health informatics, geo-informatics, environmental data analytics, machine learning, behavioural analysis, artificial intelligence and visualisation. The next steps will be to bring this activity together, build on it and apply these capabilities across the entire range of the University’s research. The new projects will involve every University faculty.

There are understandable public concerns about the potential threat to privacy from these powerful new ways of processing information, and we are taking the ethical dimension of this work extremely seriously. Leeds is very well equipped for the practical task of ensuring that datasets are anonymised and held securely, and that only trained researchers can access them for ethically approved projects. Perhaps the biggest challenge, though, will be one of communication: explaining to the public not only the procedures in place to ensure privacy, but also the huge potential benefits of the next stage of the information revolution.

(Taken from the University of Leeds Reporter, issue 580, March 2014)