Researcher Training Day for Life Scientists

The Galaxy Australasia Meeting 2017 kicks off on Friday 3rd February with the Researcher Training Day.

This event targets life scientists working with Next-Generation Sequencing data, in particular RNA-Seq, RAD-Seq, and microbial genomes.

You can register to attend the Researcher Training Day without attending the rest of GAMe2017. It will take place in LAB-14, 700 Swanston St, University of Melbourne.

Time | Concurrent Session 1 | Concurrent Session 2
08:00 | Registration | Registration
08:45 | Introduction to Galaxy, Genome Assembly and Annotation | Introduction to Galaxy and RAD-Seq
09:15 | Bacterial Genome Assembly and Annotation in Galaxy | RAD-Seq with Stacks in Galaxy
10:15 | Break | Break
10:45 | Bacterial Genome Assembly and Annotation in Galaxy | RAD-Seq with Stacks in Galaxy
12:45 | Catered Lunch | Catered Lunch
13:45 | Introduction to RNA-Seq | Introduction to Variant Calling
14:15 | RNA-Seq in Galaxy | Variant Calling in Galaxy
15:15 | Break | Break
15:30 | RNA-Seq in Galaxy | Variant Calling in Galaxy
17:30 | Done | Done

Hardware

For this workshop, participants will need a Wi-Fi-enabled laptop.

Software

Sessions require a web browser such as Chrome, Firefox, or Safari. The latest version of Internet Explorer should also work.

Bacterial Genome Assembly and Annotation in Galaxy

Convenor: Simon Gladman

The workshop will cover the basics of de novo genome assembly using a small genome example. This includes project planning steps, selecting fragment sizes, initial assembly of reads into fully covered contigs, and then assembling those contigs into larger scaffolds that may include gaps. The end result will be a set of contigs and scaffolds with sufficient average length for further analysis, including genome annotation. This workshop will use tools and methods targeted at small genomes. The basics of assembly and scaffolding presented here will be useful for building larger genomes, but the specific tools and much of the project planning will be different.

This workshop will also introduce genome annotation in the context of small genomes. We’ll begin with genome annotation concepts, and then introduce resources and tools for automatically annotating small genomes. The workshop will finish with a review of options for further automatic and manual tuning of the annotation, and for maintaining it as new assemblies or information become available.

This session will include an introduction to the Galaxy platform.
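The description above refers to contigs and scaffolds of "sufficient average length". As a minimal sketch of how such a summary might be computed, the Python snippet below reports the number of contigs, their mean length, and the N50 (a commonly used assembly statistic that is not named in the session description). The function name and the example lengths are hypothetical and are not taken from the workshop materials.

```python
def contig_stats(lengths):
    """Return (count, mean length, N50) for a list of contig lengths in bp."""
    if not lengths:
        raise ValueError("no contigs given")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    # N50: the length of the contig at which the running total of the
    # longest-first contigs first reaches half of the assembly size.
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return len(ordered), total / len(ordered), n50


# Hypothetical contig lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
count, mean_length, n50 = contig_stats(example_lengths)
print(count, mean_length, n50)  # 5 300.0 400
```

The mean alone can be misleading when an assembly contains many very short contigs, which is one reason N50 is often reported alongside it.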

RAD-Seq with Stacks in Galaxy

Convenors: Pip Griffin, Sonika Tyagi

This session demonstrated how to use the Stacks tools wrapped in Galaxy to process RAD-seq (Restriction-site Associated DNA sequencing) data for population genomics. We started with raw sequencing reads and took them through quality control and de novo assembly. We then demonstrated the use of Stacks tools in Galaxy to call variants, build a variant catalogue, and output filtered variant calls for downstream analysis.

We also touched on some of the molecular approaches to RAD-seq library design and explained the important considerations for different research questions and species of interest.

This session included an introduction to the Galaxy platform.

Instructions for the RAD-Seq with Stacks in Galaxy training

Slides:

RAD-Seq molecular approach and library design

Introduction to Stacks in Galaxy, Illumina sequencing, quality control and initial data processing

Stacks workflow without a reference genome and downstream analysis options

 

RNA-Seq in Galaxy

Convenor: Jessica Chung

This session will cover standard, advanced, and alternative RNA-seq analysis pipelines, all using workflows and highlighting their advanced features. Two general pipelines will be addressed:

  • A standard RNA-seq analysis pipeline using the Tuxedo suite (TopHat and Cuffdiff) for standard transcript quantification with a reference transcriptome.
  • An alternative RNA-seq analysis pipeline using count-based quantification methods (DESeq2, edgeR, or limma) to generate abundance measurements.

These pipelines will be used as examples to highlight usage of workflows and their advanced features.

This session will include an introduction to RNA-seq.

Variant Calling in Galaxy

Instructor: Clare Sloggett

The session introduces the tools, datatypes, and workflow for detecting variation in human genomic DNA, using a small set of sequencing reads from chromosome 20. In this session we will:

  • Evaluate the quality of the short-read data. If the quality is poor, adjustments can be made, for example by trimming the short reads or by adjusting your expectations of the final outcome.
  • Map each of the individual reads in the sample FASTQ readsets to a reference genome, so that we can then identify the sequence changes with respect to the reference genome. Some variant callers need extra information about the source of the reads in order to select the correct error profiles for their statistical variant detection model, so we add this information at the alignment step so that the generated BAM file contains the metadata the variant caller expects.
  • Call variants using the GATK Unified Genotyper. The GATK Unified Genotyper is a Bayesian variant caller and genotyper from the Broad Institute. Many users consider the GATK to be best practice in human variant calling.
  • Try an alternative caller: FreeBayes.
  • Evaluate known variations. We know a lot about variation in humans from many empirical studies, including the 1000 Genomes Project, so we have some expectations of what we should see when we detect variants in a new sample.
  • Annotate the detected variants against the Ensembl database and interpret the annotation output.

This session will include an introduction to variant calling.
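To make the first step above (evaluating read quality and trimming) more concrete, here is a minimal Python sketch. It assumes FASTQ quality strings in the common Phred+33 encoding and trims low-quality bases from the 3' end of a read. The function names, the threshold of 20, and the example read are hypothetical. This is an illustration of the idea only, not the tools used in the session.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20):
    """Remove trailing bases whose Phred score is below min_quality."""
    scores = phred_scores(quality_string)
    end = len(scores)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


# Hypothetical read, for illustration only.
seq = "ACGTACGT"
qual = "IIIIII#!"  # 'I' decodes to 40, '#' to 2, '!' to 0

scores = phred_scores(qual)
print(sum(scores) / len(scores))  # 30.25
print(trim_3prime(seq, qual))     # ('ACGTAC', 'IIIIII')
```

A Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so a threshold of 20 keeps bases with an estimated error rate of at most 1 in 100.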

 

