Specialised Skills
Metagenomics with High Performance Computing
Metagenomics explores all the genetic material in an environmental sample. It can be used to characterise the taxonomic composition of microbial communities.
This two-week online module will be taught from Tuesday 11 until Friday 21 April 2023.
Metagenomics analyses involve a lot of data and can take hours to days to complete! But don't worry! The format of the workshop takes account of this. For longer analyses there will be scheduled online Zoom sessions to cover concepts and get started, followed by offline time for the analysis to run and for you to complete some exercises. These will be supported by online drop-ins and a Slack channel for trouble-shooting.
Target audience
UK-based Environmental Science researchers at any career stage, or anyone with an interest in Metagenomics. We assume no prior experience of the command line or high performance computing.
Study mode
You have the option of either attending the online workshops or completing the materials independently via the self-study mode. Both of these options use the same educational materials and will be supported by online drop-ins and a Slack channel for trouble-shooting. The instances will be active from 11 - 28 April 2023 which should allow you enough time to cover all of the educational materials. Usage of the instances will be monitored; instances which are inactive will be closed down as a daily charge is incurred.
Registration
There are 30 places available for the online course and 30 places available on the self-study mode. Please register using the relevant button below. This course is free of charge.
Programme
During the first week we start by teaching the essential tools used in high performance computing, such as file systems and the command line, which you will use to connect to cloud computing resources, navigate files and write scripts. We then introduce metagenomics and building a metagenomic assembly. In the second week, you will learn how to improve your assembly by ‘polishing’, separate your assembled metagenome into individual genomes (MAGs), and carry out taxonomic assignment and analysis.
If you can't get something to work, we have a Slack channel and a weekly drop-in to help.
Win £25 in Amazon vouchers for showing commitment to your learning!
At Cloud-SPAN we know it's difficult to find time to train so we've come up with a little incentive for learners signed up for the online course! For each week of the Metagenomics course there will be a prize draw and two people will win a £25 Amazon voucher! All you have to do to enter the prize draw is go through the exercises - we will use your command history to verify your efforts. It doesn't matter if you make mistakes or can't get something to work, it is your commitment that matters!
Happy learning!
Week 1
Tuesday, 11 April | 10:00 - 12:00 Introduction. Command-line programming: File systems, files and directories.
Tuesday, 11 April | 14:00 - 16:00 Command-line programming: Using the command line.
Thursday, 13 April | 10:00 - 13:30 An introduction to metagenomics, Quality Control and Assembly. Offline time for the assembly to complete.
Friday, 14 April | 10:00 - 11:00 Online trouble-shooting drop-in session (optional).
Week 2
Monday, 17 April | 13:00 - 14:00 Polishing your metagenome assembly. Offline time for the polishing to complete.
Tuesday, 18 April | 13:00 - 14:00 Online trouble-shooting drop-in session (optional).
Wednesday, 19 April | 13:00 - 14:00 Binning a metagenome assembly into individual genomes (MAGs). Offline time for the binning to complete.
Thursday, 20 April | 13:00 - 14:00 Online trouble-shooting drop-in session (optional).
Friday, 21 April | 10:00 - 13:00 Taxonomic Assignment and Analysis.
If you are unable to attend an online session, recordings will be distributed. You will have access to your instance until Friday 28 April (one week after the course is completed) to give you time to consolidate your learning.
We recommend that workshop participants have a second monitor and a headset. Dual monitors allow you to engage with the trainers and view shared materials and other participants on one monitor while doing the activities on the other. To support your learning, we have funding to provide you with a headset and a monitor if you require them. To request this additional funding, please indicate this in the registration form. Funds for the monitors and headsets will be distributed at the end of the course to those participants who regularly attended the online sessions.
Pre-requisites
You will need familiarity with biological concepts, including the concept of a microbiome. We assume no prior experience of the command line or high performance computing. Windows users will need to install Git Bash.
You don't need to worry about installing metagenomics software or putting the data on your own computer! You will have access to an Amazon Web Services instance with all the data and software and will only need to log in to it.
Learning outcomes
Following completion of this course, learners will be able to:
explain the hierarchical structure of a file system and describe the files and file structure used in the course
explain what is meant by a working directory, a path and a relative path and write down paths that they will need for the course
start a Terminal (Mac) or Git Bash Terminal (Windows)
navigate a file system using the command line
log in to and exit their AWS instance (the cloud)
use common commands such as ls, pwd and cd on the command line
know the difference between genomics and metagenomics
describe the steps in a metagenomic workflow
perform quality control on reads and assemble them into a metagenome
perform polishing to improve an assembly
use binning to separate the metagenome into different species or MAGs (Metagenome-Assembled Genomes)
use Kraken 2 to assign taxonomy to reads and contigs and phyloseq in R to analyse taxonomic diversity
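As a rough illustration of the working directory, absolute path and relative path ideas in the outcomes above, here is a minimal Python sketch. It is not part of the course materials (the course itself uses the command line), and the directory and file names (`/home/learner`, `course_data`, `reads/sample.fastq`) are hypothetical, chosen only for the example.

```python
from pathlib import PurePosixPath

# Hypothetical home directory and course folder; the real names on the
# course instance may differ.
home = PurePosixPath("/home/learner")
working_dir = home / "course_data"


def resolve(working_dir, path):
    """Return the absolute path that `path` refers to from `working_dir`."""
    path = PurePosixPath(path)
    # An absolute path starts at the root (/) and ignores the working directory.
    if path.is_absolute():
        return path
    # A relative path is interpreted starting from the working directory.
    return working_dir / path


absolute = resolve(working_dir, "/home/learner/course_data/reads/sample.fastq")
relative = resolve(working_dir, "reads/sample.fastq")

print(absolute)                     # the absolute path, unchanged
print(relative)                     # the relative path, made absolute
print(absolute == relative)         # both name the same file
print(relative.relative_to(home))   # the same file, written relative to home
```

The same file can be named in more than one way: which relative path is correct depends on the directory you are working in, whereas the absolute path never changes.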
Scholarships and additional support
We offer Diversity and Hardship Scholarships to enable members of underrepresented groups and those with financial difficulties to participate in our training courses. To apply for a scholarship, please complete the relevant section in the online application form before 12 noon on 13 March 2023. Submissions after this date will not be considered. All applicants will be notified of the results by 20 March 2023.
Join us for a Code Retreat!
The next Code Retreat will take place at 10:30-15:00 on 31 May 2023 at the University of York. Our instructors will be on hand to help you problem-solve the issues that arise as you work. Lunch will be provided and funding is available to support travel to the Code Retreat.
Genomics alumni: contact cloud-span-project@york.ac.uk to register. Invitations will be distributed shortly.
What could you achieve?
Working with your peers and with help from our instructors, you could:
Revise our Metagenomics course
Get help organising and documenting your own analysis
Apply tools taught in Metagenomics to your own data
Get help with Creating your own Amazon Web Services instance for Genomics
Network with other genomics researchers
Pre-requisites
Attendees should have completed a Cloud-SPAN course
Attendees will need to bring their own laptop to this event
This course teaches data management and analytical skills for genomic research.
Registration
This course is free of charge; however, we ask that you register for the Self-Study learning mode using the registration button below so that we have your details and can provide you with additional information.
Pre-requisites
Knowledge: learners should have completed the Prenomics course or be able to successfully complete the self-assessment quiz. Learners are also expected to have some familiarity with biological concepts, including the concept of genomic variation within a population. Please note, if you are completing the course on a self-study learning path you will need to complete the Creating your own instance course beforehand.
Software: view the required software set-up.
Genomics self-study mode
📢 If you require any support or help, we have a weekly online drop-in session every Thursday at 3pm. We will be there to answer any questions that you may have. You will receive the link once you have registered.
Step 1 - Create your own AWS instance module
Starting with the ‘Create your own instance’ module, you are given a step-by-step guide to creating the instance that you will use during the subsequent Prenomics and Genomics courses.
Step 2 - Prenomics Module
If you are new to navigating file systems and using the command line, we recommend you complete the ‘Prenomics course.’ We have designed this course to allow more time for those with less experience to cover some foundation concepts. If you aren’t sure how to gauge your skills, take the self-assessment quiz to help you decide.
Step 3 - Genomics Module
The Genomics course allows you to move on to the more fun stuff as you develop your skills in managing data. You will tackle tasks such as assessing read quality, trimming and filtering, and variant calling.
Step 4 - Community
After completing the three courses (or two), we hope that you will be able to attend one of our regular code retreats, where our course instructors will be on hand to help solve any issues that arise with your datasets. We also strongly encourage you to take advantage of our welcoming Cloud-SPAN community: don’t be afraid to lean on your peers for help or discussions.
Overview & programme for in-person workshop
Topics: Session 1 - Project management for cloud genomics
Learn how to structure your data and metadata
Plan for an NGS project
Learn about the benefits of cloud computing
Session 2 - Data preparation and organisation
Learn how to structure your data and metadata
Plan for an NGS project
Learn about the benefits of cloud computing
Session 3 - Assessing read quality; trimming and filtering reads
Trimming and filtering: learn how to filter out poor-quality data
Assessing read quality
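To give a feel for what assessing read quality and trimming involve, here is a minimal Python sketch. It is not the tool used in the course; it assumes the common Phred+33 encoding of FASTQ quality strings, and the function names, the example read and the quality threshold of 20 are all made up for illustration.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]


def error_probability(q):
    """Convert a Phred score Q into the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def trim_low_quality_tail(sequence, quality_string, min_quality=20):
    """Remove bases from the 3' end until one meets `min_quality`."""
    scores = phred_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


# Hypothetical read: eight high-quality bases followed by two poor ones.
sequence = "ACGTACGTAC"
quality = "IIIIIIII#!"

scores = phred_scores(quality)
trimmed_sequence, trimmed_quality = trim_low_quality_tail(sequence, quality)

print(scores)                  # one Phred score per base
print(error_probability(20))   # chance of error for a base with Q20
print(trimmed_sequence)        # the read with its low-quality tail removed
```

Dedicated quality-control tools apply more sophisticated rules (for example sliding windows and adapter removal), but the underlying idea is the same: each base carries a score, and low-scoring bases are removed before further analysis.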
Session 4 - Finding sequence variants
Understand the steps involved in variant calling
Describe the types of data formats encountered during variant calling
Use command line tools to perform variant calling
Instructors will give a demonstration on how to use the Integrative Genomics Viewer (IGV), an interactive tool for the visual exploration of genomic data.
Target audience
Learners who have completed the Prenomics Course
PhD Students
Researchers
This course would be appropriate for learners with experience using the command line, who are expecting to generate a dataset in the future or those who already have a dataset and would like guidance on how to analyse it.
Learning outcomes
Following completion of this course, learners will be able to:
structure their data and metadata and plan for an NGS project
organise and document genomics data and bioinformatics workflows
understand what information is needed by a sequencing facility
navigate file systems and create, copy, move and remove files and directories
use command-line tools to assess read quality and perform quality control
align reads to a reference genome, and identify and visualise sequence variants
work with Amazon Web Services (AWS) cloud computing and transfer data between a local computer and cloud resources
Statistically useful experimental design
Experimental design is critical for 'omics experiments, both to generate data capable of addressing your research questions and to control your reagent costs. There are choices to be made about sample preparation and storage, sequencing technologies, the numbers of technical and biological replicates, and sequencing depth. The most appropriate choices depend on the type of research question you have, the strengths and weaknesses of the platform, and the biological variability.
The next course will be held 11:30 am-3:00 pm on Friday, 14 April at the University of York.
Registration
This course is free of charge. Please register by completing the online form using the registration button below.
Programme
In this half-day workshop we consider case-studies in detail to discuss some of the most important aspects that need to be taken into account to design and perform experiments that generate the reproducible, high-quality data you need. There will also be an opportunity to discuss your own experimental designs.
Pre-requisites
This lesson assumes no experience with designing 'omics experiments, but some previous experience of simple experimental design and statistical testing will be useful.
Learning outcomes
Following completion of this course, learners will be able to:
Understand why statistical design is important
Incorporate good statistical design into their experiments
Scholarships
We offer Diversity and Hardship Scholarships to enable members of underrepresented groups and those with financial difficulties to participate in our training courses. To apply for a scholarship, please complete the relevant section in the online application form before 12 noon on Friday 31 March 2023. Submissions after this date will not be considered. All applicants will be notified of the results by Friday 7 April 2023.