Knowledge 4 innovation

Today's discussion is focused on innovation, and I would like to talk to you about yet another aspect to consider: the implementation and application of data protection rules and their impact on innovation.

  • I believe this is a very important moment for us. With the data protection package adopted, we can now focus on practical implementation and on the practical solutions needed for research and for the development of science and innovation.
  • One of the aims of the GDPR was to provide harmonization (one regime for Europe, not 28) and to create a new data protection framework that builds on the trust of all involved parties.
  • The other aim was to create a legal basis that is technologically neutral.
  • Now, after the adoption of the GDPR, we are entering a two-year implementation or adjustment period. This time ahead of us is very important. We have to make sure it is used wisely.
  • Now that we are starting the Pact for Innovation, it is the right time to discuss the issue of sensitive data. We all agree that such data need special protection, but it is important to remember that sensitive data might also be a fuel of scientific progress.
  • Firstly, there are many new possibilities to collect data and to process, re-use, transfer and share them, under clear conditions, with data protection and security. Many companies already do this, and it is the real expression of the data-driven economy. It establishes a new relationship between data owners and users at the first stage, and data operators at the next stages, in many senses.
  • Secondly, personal data donated for research (for example, as part of citizen science projects) might soon become a significant part of scientific resources.
  • We need to look for solutions that make both possibilities achievable. This creates a new challenge: how to understand, how to define, and how to use open data, in the specific meaning of data accessible for research under clear rules.
  • There is an intrinsic value in open resources in the sciences. We have worked with open scientific databases for half a century now. We need to continue the efforts to set harmonized rules in Europe for the development of text and data mining.
  • The balance between open and closed data can be reached, and we have to make sure that legal solutions catch up with technological development. In this respect, we should discuss how to use data, including sensitive data, and open them for research, while at the same time providing sufficient protection of these data.
  • More often than not, access to the raw data is not really needed. What is important, the key, is access to computation on the data: data that are already aggregated, run through an algorithm, even encrypted. This computation brings the answers that researchers are looking for.
  • I would like to give you a specific example of processing sensitive data. One of the most special cases of sensitive data is the human genome. The genome of a person is a unique, static and valuable resource that can be abused way beyond the life of a single person, as much of its data is inherited and heritable.
  • Current research on genomic privacy clearly shows the failure of anonymization methods, the temporary nature of security measures and, most importantly, an imperfect legal framework for protecting the donors of genetic information from the consequences of sharing such data. On the other hand, large-scale human genome sequencing is needed to advance the fight against, and the prevention of, major diseases such as cancers or diabetes.
  • What does this mean for life science research? Can it still use big data analysis, or is it too dangerous?
  • Not necessarily, as the field of genomic privacy has already generated several solutions that make it possible to analyze genomic data without releasing the data and without compromising the privacy of data donors. The same solutions, at a larger scale, can be applied to provide researchers with access to computations on sensitive data of many different kinds.
  • To benefit fully from computational access to sensitive data, it is important to remember that no resource holds all available data of a particular kind. Even for open data, researchers routinely consult different databases, as the process of adding new records and maintaining the quality of the resources differs from institution to institution. For sensitive data it is even harder to aggregate different data because of legal requirements.
  • Therefore, the crucial element of providing computational access to sensitive data for research purposes is the development of a unified standard of communication between these resources.

The recommendations in view of the implementation of the General Data Protection Regulation were as follows:

  1. It is important to develop a maximally unified legal framework for research data across Member States. Too many options for closing research data will make Europe-wide research projects impossible because of legal incompatibilities.
  2. It is important to start systematic research on the privacy and security of sensitive data. It will not only allow appropriate measures to be developed, but will also let data owners be less restrictive about the security of some types of data.
  3. It is important to develop a parallel process of creating a technological counterpart to the GDPR. The same process of unification that occurs during work on the GDPR should be mirrored by a unification of standards of security and computations on sensitive data across Member States. The European Data Protection Board could have a role in this process.
  4. It is important to extend the spectrum of stakeholders in discussions on the implementation of the GDPR to include research organisations and academic societies directly involved with open data and open access to scientific literature. These communities have the most up-to-date knowledge about the innovations and opportunities stemming from unrestricted access to scientific resources.
  5. And the last point. These efforts are important for innovation in all areas, although I have presented just one example today. In the end, they will be key to the development of the data-driven economy, which means they will have a real impact on innovative business. Finally, they will serve clients, consumers and users, by making it possible to offer them much more personalized services and products.

Michal Boni, MEP

Brussels, 25th April 2016