Working together on big data

Every day more than 2.5 quintillion bytes of data are created (a quintillion is 1 with 18 zeros after it!), and that figure keeps growing. What possibilities could be unleashed by analysing and applying that data to solve some of the globe’s most pressing challenges?

Could analysing anonymised health data help to cure disease? Could we use traffic data to ensure a smoother run for commuters and freight? Or could we even track Twitter feeds to pinpoint the next outbreak of the flu?

And what role for academia in this data science revolution? 

In August, WUN hosted a workshop at the University of Rochester to explore the possibilities in the burgeoning field of data science. Twenty-nine participants representing eleven WUN institutions discussed issues across three topical areas:

  • Computing Resources, IT infrastructure, and Data Management/Preservation;
  • Social Networks and Data Mining;
  • Biomedical Informatics and Health Prediction.

The workshop opened with a welcome from Dr. Peter Lennie, Provost of the University of Rochester, who said he hoped it would allow participants to explore the role of data science across WUN and to forge broader partnerships in the field.

The workshop identified the main issues currently driving data science:

  • data science educational and training requirements;
  • data integrity;
  • distributed data and data integration from various sources;
  • size and scale of data sets and rapidity of their emergence;
  • information and governance related to large-scale data, especially concerning healthcare records, and the need for public awareness and engagement;
  • data storage, including the need for long-term curation and preservation, and the associated costs.

Speakers also addressed the academic challenges of working in this new field: traditional academic metrics of publications and grants are generally unsuitable for measuring contributions in this area, and the problem needs further consideration.

The workshop identified three key avenues through which WUN, as a network of leading universities, can have a meaningful impact in the field of data science:

  • Student Mobility: participants discussed a range of options that would create international training opportunities for students earning an MS in data science (or similar) at WUN institutions. They were also interested in creating or adapting models for doctoral and undergraduate students, which could include summer programs or workshops in addition to longer training programs.
  • Distributed Data Sharing & Analysis: WUN could play a particularly important role in facilitating the process for data sharing and analysis, perhaps by linking and leveraging the libraries of member institutions. In addition, linked data and collaborative repositories could play a role in addressing the needs of data storage and curation. Data science could also provide solutions to help WUN better map and leverage the resources and infrastructure within its network of members.
  • Joint Data Science Research Projects & Events: Network collaboration can result in innovative research programs at the cutting edge of data science.

These options will be explored more fully over the coming months, with funding from the WUN Research Development Fund available to catalyze outcomes.