Rafael Jimenez, ELIXIR Chief Technical Officer, ELIXIR Hub, Wellcome Genome Campus, Hinxton, Cambridge, UK
How did you get into bioinformatics?
I am a biologist and a computer scientist specialised in the coordination and management of bioinformatics services. I am interested in topics related to data infrastructure, data integration, data federation, data visualisation and software development best practices. I have experience coordinating technical projects, developing technical strategies, delivering services, leading software development and managing community groups. Since 2014 I have been employed by ELIXIR as part of the Hub team, which supports coordination with more than 140 organisations, helping to build a sustainable European infrastructure for life-science information.
What are the challenges you see for life scientists in the data-driven science era?
It would be difficult to think about how to do any efficient research without data today, as all research, and every step in the scientific method, is data-driven. So research requires not only scientific knowledge but also a minimum understanding of information technology. This is a challenge for life scientists since both disciplines are evolving rapidly, so keeping up with the latest developments is indeed one of the most important challenges. The second most important challenge is triggered by the data deluge. The volume of new data being generated is overwhelming our capacity to manage it, share it and use it.
Would you say this is different for actual bioinformaticians? Do they face different challenges?
I think it is the same challenge, especially for those who are dedicated to bioinformatics research, in contrast to those dedicated to bioinformatics services.
What is open data, and what does it mean to you?
Open data means data that can be freely used, re-used and redistributed by anyone, and it is important since it has a clear impact on facilitating research and creating new knowledge. Though open data is important, we should make sure we pay attention to HOW scientists and organisations engage with open data. There is data that is easily accessible in a recognised public repository and there is data that is deposited somewhere but hard to find. There is data annotated with quality metadata, including detailed information about the experiment, and there is data without any metadata. Agreeing on how we make data open is going to be crucial to making data truly open. Organisations and researchers should ideally commit to adopting an open data policy based on open data principles.
What is Bioschemas and why is this timely and relevant to Bioinformatics?
Bioschemas is an open community effort contributing to the improvement of data interoperability in life sciences. It does this by encouraging data service providers in our domain to adopt schema.org mark-up. Schema.org mark-up helps to expose the existing structure of data available within web pages without changing their look and feel. This makes the information more accessible to third-party software as well as search engines, and in an indirect manner this benefits many users. Bioschemas started by testing how to make it easier to share bioinformatics events and training materials using schema.org, as well as the minimum information and vocabularies that are important in our community. At the moment schema.org does not define biological types, for example for genes, proteins or pathways, so we are looking at how to engage with existing community efforts in this field to enhance the biological side of schema.org.
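To make the idea of mark-up more concrete, here is a minimal sketch of how a training event might be described with the schema.org Event type. It assumes the JSON-LD syntax, which is one of several ways to embed schema.org mark-up in a page and is not named in the interview. The event details, the example.org URL and the helper names event_jsonld and as_script_tag are hypothetical and only for illustration.

```python
import json


def event_jsonld(name, start_date, end_date, place_name, url):
    """Build a schema.org Event description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,
        "endDate": end_date,
        "location": {"@type": "Place", "name": place_name},
        "url": url,
    }


def as_script_tag(data):
    """Wrap JSON-LD in the script element used to embed it in a web page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical training event, invented for illustration only.
example = event_jsonld(
    name="Introductory bioinformatics workshop",
    start_date="2017-03-01",
    end_date="2017-03-02",
    place_name="Example Training Centre",
    url="https://example.org/events/intro-workshop",
)
print(as_script_tag(example))
```

Because the description sits in a script element, the visible page stays the same, while search engines and other software can read the event's name, dates and location directly.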
What is currently missing in the field of bioinformatics AND life sciences?
We are missing more sustainable and higher-quality services to better support science, and a framework to support and improve the services that really matter to scientists. Unfortunately, most of the services developed in bioinformatics are supported by research grants. Once a research grant is over, it is challenging to maintain new useful services due to the lack of funds. Also, services developed in a scientific environment are normally regarded as a means to an end, where not the service but the knowledge is what counts. These two factors are major handicaps to quality and sustainability. A solution to this problem requires not just a cultural change but also some work on exposing the usefulness and quality of services and finding ways to fund their ongoing support. More and better training is also a key factor, not just to prepare scientists for new challenges but to provide better quality services.
It is early days yet, but what would you like to see EMBL-ABR become and achieve?
An organisation dedicated to supporting life scientists by providing quality services and on-demand training that reaches most Australian life scientists and bioinformaticians.
Biosketch: Rafael Jimenez is one of those special and in-demand scientists who is both a biologist and a computer scientist – ORCID ID: 0000-0001-5404-7670.