EMBL-ABR: UA Node

The University of Adelaide’s School of Agriculture, Food and Wine is the largest agricultural research centre in the Southern Hemisphere. The School is co-located with CSIRO and the South Australian Research and Development Institute (SARDI) on the Waite Campus. The School has world-class expertise in the areas of plant genomics, crop improvement, sustainable agriculture, horticulture, viticulture and oenology. Crop Bioinformatics Adelaide (previously ACPFG Bioinformatics) provides extensive capabilities in the development and application of bioinformatics to biological problems, particularly in the field of cereal genomics. The strength of this Node lies in its ability to interface with the biological domain, addressing problems often only seen in plants and/or non-model organisms.

The Node is also known for close and successful collaborations with national and international industry partners.

KEY AREA: Tools and Platforms

The UA Node has access to a large-memory compute cluster for crop research, managed with the Slurm workload manager. The cluster connects to hundreds of terabytes of clustered filesystem storage. The Node has expertise in managing and using such a system for the analysis of large-scale genomics datasets. Access to this infrastructure is offered to those involved in crop research collaborations.

The Node also provides tools for aggregating genomic information, so that biologists can reach novel biological insights faster. It has developed tools that aggregate various genomic datasets and present them in a way that helps biologists interpret their own data in the context of existing datasets. Some of these tools contain large amounts of processed data, which we plan to make more readily available to other Nodes. In addition, some of these tools have been “dockerised” so that they can be easily deployed within cloud and HPC environments for others to use with their own data and infrastructure. There is also the possibility of developing or providing a mechanism for users to launch Docker containers on demand within the compute environment provided by a Node, for larger-scale custom analyses.
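To make the idea of launching a dockerised tool inside a Slurm-managed environment more concrete, the following is a minimal sketch rather than a description of the Node's actual mechanism. It only renders the text of a batch script. The image name, data paths, resource values and the function `render_batch_script` are all hypothetical, and it assumes that Docker itself is available on the compute nodes, which many HPC sites do not allow.

```python
import shlex


def render_batch_script(image, command, data_dir, mem_gb=64, cpus=8, hours=12):
    """Return the text of a Slurm batch script that runs one Docker container.

    All arguments are illustrative; nothing here reflects the Node's real setup.
    """
    # Mount the host data directory at /data inside the container (assumed layout).
    docker_cmd = ["docker", "run", "--rm", "-v", f"{data_dir}:/data", image, *command]
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=container-analysis",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={hours:02d}:00:00",
        # shlex.join quotes each argument so the command survives the shell intact.
        shlex.join(docker_cmd),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical image and paths, used only to show the shape of the output.
script = render_batch_script(
    image="example/genome-tool:latest",
    command=["analyse", "/data/sample.fasta"],
    data_dir="/scratch/project",
)
print(script)
```

The rendered text could then be written to a file and submitted with `sbatch`. Keeping the rendering separate from submission makes the script easy to inspect before anything runs.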

The Node hosts a PacBio Sequel System, which generates sequence data using Single Molecule, Real-Time (SMRT) technology. It can deliver 5-8 Gbp of data per SMRT Cell with average read lengths of 10-15 kbp. The random error profile and even coverage make the Sequel ideal for generating high-quality genome assemblies. Beyond genome assemblies, the Sequel System can also sequence full-length transcripts, which removes the need for transcriptome assembly and directly uncovers splice forms. In addition, the Sequel System inherently captures the kinetics of nucleotide incorporation events during sequencing runs, thereby enabling the analysis of base modifications.

The Node intends to give biologists access to the Sequel System on a collaborative basis, subject to available capacity.
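The quoted yield of 5-8 Gbp per SMRT Cell can be turned into a rough estimate of how many cells a sequencing project might need. The sketch below is a back-of-the-envelope calculation only. The 5 Gbp genome size and 30x target coverage are illustrative assumptions, not figures from the Node, and `smrt_cells_needed` is a hypothetical helper.

```python
import math


def smrt_cells_needed(genome_size_gbp, target_coverage, yield_per_cell_gbp):
    """Estimate SMRT Cells required: total bases needed divided by yield per cell."""
    if genome_size_gbp <= 0 or target_coverage <= 0 or yield_per_cell_gbp <= 0:
        raise ValueError("all inputs must be positive")
    total_gbp = genome_size_gbp * target_coverage
    # Round up, because a partial cell cannot be run.
    return math.ceil(total_gbp / yield_per_cell_gbp)


# Assumed project: a 5 Gbp genome sequenced to 30x coverage.
best_case = smrt_cells_needed(5.0, 30, yield_per_cell_gbp=8.0)   # upper end of quoted yield
worst_case = smrt_cells_needed(5.0, 30, yield_per_cell_gbp=5.0)  # lower end of quoted yield
print(f"Estimated SMRT Cells: {best_case} to {worst_case}")
```

Under these assumptions the estimate spans 19 to 30 cells, which shows how strongly per-cell yield affects planning for large genomes.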

As part of the EMBL-ABR network, these valuable resources will be extended to include the current in-house methods for the analysis of non-model organisms. Further platform and tool development will make unique Australian datasets more easily accessible and visible internationally.

KEY AREA: Training

The UA Node has expertise in developing and delivering hands-on bioinformatics training workshops using the Nectar Research Cloud (the Cloud). In addition, the Node holds a large allocation (400 vCPUs and 1.6 TB RAM) on the Cloud for the purpose of delivering hands-on bioinformatics training. Biologists will benefit from better access to introductory hands-on bioinformatics training, while bioinformaticians will benefit from better access to advanced training materials and a larger network of collaborators. This will be achieved by deploying workshops developed through a collaboration between Bioplatforms Australia and CSIRO, and by contributing to the development of new training resources through EMBL-ABR. The Cloud allocation held by the Node for training purposes will be made more widely available to the network of trainers across EMBL-ABR.
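As a rough illustration of what the 400 vCPU and 1.6 TB RAM allocation could support, the sketch below computes how many identical training virtual machines fit within it. The per-VM size (4 vCPUs, 16 GB RAM) is an assumed example rather than the Node's actual workshop configuration, 1.6 TB is treated as 1600 GB, and `max_training_vms` is a hypothetical helper.

```python
def max_training_vms(total_vcpus, total_ram_gb, vcpus_per_vm, ram_gb_per_vm):
    """Return how many identical VMs fit in an allocation, limited by the scarcer resource."""
    if vcpus_per_vm <= 0 or ram_gb_per_vm <= 0:
        raise ValueError("per-VM resources must be positive")
    return min(total_vcpus // vcpus_per_vm, total_ram_gb // ram_gb_per_vm)


ALLOCATION_VCPUS = 400    # from the allocation quoted above
ALLOCATION_RAM_GB = 1600  # 1.6 TB, assuming decimal units

# Assumed per-trainee VM size; real workshops may size VMs differently.
vms = max_training_vms(ALLOCATION_VCPUS, ALLOCATION_RAM_GB, vcpus_per_vm=4, ram_gb_per_vm=16)
print(f"Concurrent training VMs: {vms}")
```

With this assumed VM size, CPU and memory run out together at 100 machines. A larger memory footprint per VM would make RAM the limiting resource.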

NODE HEAD:

Dr Ute Baumann

Bioinformatics Group Leader

University of Adelaide

ute.baumann@adelaide.edu.au

+61 8 8313 7388