SARS-COV-2_B4B

This is a repository for the SARS-CoV-2 Bioinformatics for Beginners Course

May 2023 update note: Access to file download links may change in the first two weeks of May 2023, which would affect the input data for the example commands. Please expect errors during this period.

SARS-CoV-2 variant lineage identification is key to pandemic tracking and enabling public health response. This course is an introduction to bioinformatics by applying skills used in SARS-CoV-2 genomic data analysis. It will be a distributed-classroom-style course run across Africa, Asia, and Latin America and the Caribbean. This model was developed by H3ABioNet; see this publication for more info.

Bioinformatics skills are fundamental to the management and assessment of viral sequences. This course will introduce you to processing data programmatically, the data formats used in viral sequencing, how to determine the variant lineage (Delta, Omicron, etc.), and how to share data so that others around the world can benefit. These skills are the building blocks for scaling up analysis to pandemic response levels.
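As a small illustration of what "processing data programmatically" can look like, the sketch below reads FASTA-formatted text (a common format for consensus sequences) and reports each sequence's length and its fraction of ambiguous N bases. It is not taken from the course notebooks: the function names `read_fasta` and `summarise` and the toy record are hypothetical, chosen only for this example.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def summarise(sequence):
    """Return the length and the fraction of 'N' bases in a sequence."""
    length = len(sequence)
    n_fraction = sequence.count("N") / length if length else 0.0
    return {"length": length, "n_fraction": n_fraction}


# Hypothetical toy record; real input would be a consensus FASTA file.
example = StringIO(">sample_1\nACGTNNAC\nGTAC\n")
for name, seq in read_fasta(example):
    print(name, summarise(seq))
```

In a Colab notebook the same functions could be applied to an opened file (for example `open("consensus.fasta")`, a hypothetical filename) instead of the in-memory `StringIO` object used here.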

Course website
Glossary

Course structure

This course uses Google Colab (https://colab.research.google.com/), a free-to-use service.

Colab is accessed through a Google Account, which can be created for free.

Time commitment

Contact sessions will run twice a week and last 4 hours each. The course will run from 31 October to 2 December 2022. There will be sessions in two time zones. Note that within each grouping (Oceania and Asia; Latin America and Africa) sessions will run in the same block of time, with local start times varying by region.

Target audience

The course is aimed at postgraduate scientists, postdoctoral scientists, junior faculty members, and clinicians or healthcare professionals based in Africa, Asia, and Latin America and the Caribbean. No prior bioinformatics skills are required.

Programme

The programme will cover the following core topics:

Learning Outcomes

Instructor Team

Course manual

Introduction Week

Introduction Notebook - Begin here

Video Playlist - Introduction Week

Introduction Day 2 Dayplan

Module 1: Introduction to Notebooks & Unix command line
Module 1 Video Playlist (Parts 1 and 2)

Module 1 Part 1 Day Plan

Module 1 Part 2 Day Plan

Module 1 Part 1 and Part 2 Notebook Instructions

Bonus Videos for NGS technologies

Module 2: Data QC and Consensus sequences

Module 2 Video Playlist (Parts 1 and 2)

Module 2 Part 1 Day Plan

Module 2 Part 2 Day Plan

Module 2 Data QC and Consensus Notebook Instructions Parts 1,2,3

Module 3: Variant Lineage Identification
Module 3 Video - Variant Lineage Identification

Module 3 Part 1 Day Plan

Module 3 Variant Lineage Identification Notebook Instructions

Module 3 Part 2 Day Plan - Exercise

Module 4: Data sharing and interpretation

Module 4 Video Playlist (Please watch Sections 1-2 for Day 1, and Sections 3-7 for Day 2)

Module 4 Part 1 Day Plan

Module 4 Data Sharing and Interpretation Notebook Instructions

Module 4 Part 2 Day Plan (Exercises for Day 2 are in the videos for Sections 5-7)

Module 4 Slide deck pdf

Additional information

WCS LMS
COG-Train Online courses
Your digital mentor podcast
WCS courses and conferences

Any reuse of the course materials, data or code is encouraged with due acknowledgement.


License

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).