EMBL-ABR Network: an interview with Kate LeMay

Kate LeMay is a Senior Research Data Specialist at the Australian National Data Service (ANDS), focusing on health and medical data. She was in Melbourne recently to attend our Data Life Cycle workshops.


Kate LeMay, ANDS

November 2016


What is bioinformatics for you and why does it matter?

Bioinformatics is the use of computer science, statistical and mathematical techniques to interrogate large biological datasets. It matters because research is producing larger and more complex data than ever before, and that data needs powerful techniques for analysis. Bioinformatics allows intelligent and innovative analysis, and can lead to important discoveries, e.g. ones that improve health, medical outcomes and quality of life for people.

What are the challenges you see for life scientists / medical researchers in the data driven science era?

  • Keeping up to date with new innovations, techniques, analysis, and findings
  • Managing increasingly large data sets
  • Maintaining participant privacy in research
  • Meeting requirements from government, institutions, funders and publishers.

What is open data, and what does it mean to you?

Open data is a subset of FAIR data (findable, accessible, interoperable and reusable). FAIR doesn’t necessarily mean making data available for anyone to download; data can be publicly described (findable) but accessed under controlled conditions (e.g. available only to other legitimate researchers who apply to the ethics committee of the data owner’s institution). Open data is part of an ongoing conversation between researchers to enable better research outcomes and more reproducible research, and to avoid unnecessary duplication.

It is imperative that data outputs are effectively managed and shared. Better data – better described, more connected, more integrated and organised, more accessible, more easily used for new purposes – allows new questions to be answered, larger issues to be investigated, and data landscapes to be explored.

How do you see EMBL-ABR and ANDS working together?

The Australian National Data Service (ANDS) makes Australia’s research data assets more valuable for researchers, research institutions and the nation. We are doing this through:

  1. Trusted partnerships: working with partners and communities on research data projects and collaborations
  2. Reliable services: delivering national services to support data discovery, connection, publishing, sharing, use and reuse
  3. Enhanced capability: building the data skills and capacity of Australia’s research system.

ANDS is here to help by leading the creation of a cohesive national collection of research resources and a richer data environment that will:

  • make better use of Australia’s research outputs
  • enable Australian researchers to easily publish, discover, access and use data
  • enable new and more efficient research.

ANDS and EMBL-ABR are both working towards a goal of increasing Australia’s capacity to manage and share research data. In particular we think we could partner on the following:

  • coordinated training/skill development opportunities
  • closer integration between EMBL-ABR information systems and ANDS services
  • publication of EMBL-ABR information resources in Research Data Australia.


What is the data life cycle?

The data life cycle is a model that follows data throughout a research project and considers at each stage how it can be managed, and how planning to share that data can be incorporated into every stage. It shows that data can have an application beyond the original research project in which it was produced, and how that can be achieved.

Does annotation/curation matter?

Yes! It is very important to provide enough information about the dataset (metadata) to secondary users, to enable its reuse and to enhance the ease with which it can be discovered. Curation is important because it ensures that the data remains usable into the future.

Why does it matter now?

The data life cycle matters because it provides a framework for researchers to work within when planning how to manage data in their research projects. It also matters now because data being collected or generated today will remain useful, and will be used, well into the future.

Who should care about it?

Anyone who creates data and intends for it to be available for reuse, or wants to reuse someone else’s biological data.

How is it relevant to Bioinformatics in Australia?

Using a framework like the data life cycle ensures that Australian bioinformaticians are keeping up with best practice standards being used and developed internationally.

Where can I get more information?

EMBL-EBI, Bioplatforms Australia, the Open Data Institute, ANDS.


Biosketch: Kate LeMay has worked as a Pharmacist in both community and hospital pharmacies. She has also worked for several years as a Project Manager at the University of Sydney and the Woolcock Institute of Medical Research, in community pharmacy-based programs designed to assist patients with chronic disease management. Kate now works at the Australian National Data Service as a Senior Research Data Specialist, focusing on health and medical data.

kate.lemay@ands.org.au | W: ands.org.au | Twitter: @katelemayands | ORCID: orcid.org/0000-0002-2405-7365