The Antibiotic Resistant Pathogens project, led by Prof. Mark Walker (UQ) and funded by Bioplatforms Australia (BPA; NCRIS), is generating a rich multi-omics dataset on multiple bacterial taxa isolated from patients with bacterial sepsis. Five bacterial species commonly associated with sepsis, with five strains of each species, are being cultured in the laboratory in two different growth media. Each sample is then subjected to DNA and RNA sequencing and to proteomic and metabolomic profiling, to capture variation among and within strains at each molecular level. This will improve our understanding of antibiotic resistance in sepsis and produce an exemplar multi-omics dataset of great value to Australian and international systems biology and life science researchers.
Closely connected is the OMICS Data Services Flagship, a Research Data Services (RDS) project, also NCRIS-funded, which is developing and implementing a flexible, stable, cloud-based data management platform. Within this platform, multi-omics data can be stored, organised and searched, then linked with downstream bioinformatics analyses set up as reproducible workflows. Finally, data can be submitted to international repositories. The platform will initially be developed to operate on the data generated by the BPA-funded initiative.
EMBL-ABR: Melbourne Bioinformatics Node members are leading the genomics data processing and developing a flexible, user-friendly data analysis platform, the Microbial Genomics Virtual Lab (mGVL). This platform hosts pre-installed, bacterial-specific analysis tools including SPAdes, prokka, and snippy. This group is also developing a set of online tutorials to train researchers in microbial genome assembly, annotation and variant calling. The EMBL-ABR: AGRF Node is performing the transcriptomics sequencing and data processing. The EMBL-ABR: QCIF Node, the EMBL-ABR: Melbourne Bioinformatics Node and other project members are contributing to the Data Management Platform for the OMICS Data Services Flagship project and leading the data chaperoning needed to submit the multi-omics data to international repositories.
EMBL-ABR Hub involvement in these projects centres on metadata collection and management. The EMBL-ABR Hub is working with Jeff Christiansen (Intersect NSW), RDS staff, Anne Kunert and other EMBL-ABR: QCIF Node staff experienced in data management and repository submission to capture the comprehensive metadata that should accompany the data. This metadata records the contextual information necessary for full analysis, discoverability and reproducibility of this work. We are following European Bioinformatics Institute standards to collect metadata:
- describing the bacterial samples themselves, with taxonomic and environmental context
  - The sample metadata will be submitted to the BioSamples database, where all raw and derived data for each BioSample (each bacterial strain) and the overall Group (all strains and species involved in this project) will be linked and collated.
- describing each experiment that is run
  - The experimental metadata describes the context in which the raw genomic, transcriptomic, proteomic and metabolomic data were generated in the BPA project. These datasets and their accompanying experimental metadata will be submitted to the ENA, ArrayExpress, PRIDE and MetaboLights repositories, respectively.
- describing downstream analysis performed to produce derived results
  - The analysis metadata will allow replication of these analyses from the raw data. It will be associated with the derived data products, for example the bacterial genome and transcriptome assemblies, which will also be submitted to ENA.
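
The three metadata layers and their target repositories can be summarised in a small lookup. The following is only an illustrative sketch, not the project's actual tooling: the repository assignments come from the list above, while the `Submission` record, the `plan_submissions` helper and the strain name `strain_A` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass

# Raw data types and target repositories, as listed in the text above.
RAW_DATA_REPOSITORY = {
    "genomic": "ENA",
    "transcriptomic": "ArrayExpress",
    "proteomic": "PRIDE",
    "metabolomic": "MetaboLights",
}
SAMPLE_REPOSITORY = "BioSamples"   # sample metadata, one BioSample per strain
DERIVED_DATA_REPOSITORY = "ENA"    # e.g. genome and transcriptome assemblies


@dataclass(frozen=True)
class Submission:
    """Hypothetical record pairing one item with its target repository."""
    strain: str
    layer: str        # "sample", "experiment" or "analysis"
    item: str         # what is being submitted
    repository: str


def plan_submissions(strain, data_types, derived_products=()):
    """List where each metadata layer for one strain would be submitted."""
    plan = [Submission(strain, "sample", "sample metadata", SAMPLE_REPOSITORY)]
    for data_type in data_types:
        # Raises KeyError for a data type the text does not mention.
        repository = RAW_DATA_REPOSITORY[data_type]
        plan.append(Submission(strain, "experiment", data_type, repository))
    for product in derived_products:
        plan.append(Submission(strain, "analysis", product, DERIVED_DATA_REPOSITORY))
    return plan


if __name__ == "__main__":
    # "strain_A" is a placeholder, not a real strain from the project.
    for s in plan_submissions("strain_A", ["genomic", "proteomic"], ["genome assembly"]):
        print(f"{s.strain}: {s.layer} / {s.item} -> {s.repository}")
```

Keeping the mapping in one dictionary means an unrecognised data type fails immediately instead of being routed to the wrong repository.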