The manual and programme for Wellcome Connecting Science's Genome Academy
This tutorial has been modified from a tutorial delivered at Scifest Africa by the Student Council of the South African Society for Bioinformatics - SASBI-sc
You will explore taste receptor genes across different species!
Task 1: Retrieve sequences:
Obtain the protein sequences for TAS1R3, TAS1R2 and TAS1R1 for the organisms:
Procedure:
TAS1R1
TAS1R2
TAS1R3
in the textbox provided and select WiKi-Gene Name(s) on the dropdown menu above the textbox.
Repeat steps 4 - 14 (changing the species name under the -CHOOSE DATASET- dropdown menu) for each species to get the required protein sequences for the three species listed below.
For the following species, you can retrieve pre-exported files with wget or by right-clicking the links, as follows:
wget https://raw.githubusercontent.com/WCSCourses/genomeacademy/main/course_data/chicken_mart_export.txt
Or - right-click here for Chicken and open in a new tab, then right-click on the page, choose “save as” and name the file: chicken_mart_export.txt
wget https://raw.githubusercontent.com/WCSCourses/genomeacademy/main/course_data/japan_medaka_mart_export.txt
Or - right-click here for Japanese Medaka and open in a new tab, then right-click on the page, choose “save as” and name the file: japan_medaka_mart_export.txt
wget https://raw.githubusercontent.com/WCSCourses/genomeacademy/main/course_data/pufferfish_mart_export.txt
Or - right-click here for Pufferfish and open in a new tab, then right-click on the page, choose “save as” and name the file: pufferfish_mart_export.txt
Copy and paste all the sequences into a single file and call it all_sequences.fasta.
If you keep the mart_export suffix in the file names, you can use the command:
cat *mart_export.txt > all_sequences.fasta
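One way to check the combined file is to count FASTA header lines (lines starting with ">") before and after concatenation. If an export file does not end with a newline, cat joins the next header onto the end of the previous sequence, and the two counts will differ. The sketch below uses small made-up files with a demo_ prefix (hypothetical names and sequences, not course data) so that it does not overwrite your real files.

```shell
# Hypothetical stand-ins for the downloaded BioMart exports (made-up sequences).
printf '>ENSGALG00000000001\nMKLV\n' > demo_chicken_mart_export.txt
printf '>ENSORLG00000000002\nMPGA\n' > demo_japan_medaka_mart_export.txt
printf '>ENSTRUG00000000003\nMSTR\n' > demo_pufferfish_mart_export.txt

# Same concatenation as in the tutorial, applied to the demo files.
cat demo_*mart_export.txt > demo_all_sequences.fasta

# Sum the header counts of the inputs, then count headers in the combined file.
expected=0
for f in demo_*mart_export.txt; do
    n=$(grep -c '^>' "$f")
    expected=$((expected + n))
done
actual=$(grep -c '^>' demo_all_sequences.fasta)

echo "headers in exports: $expected, headers in combined file: $actual"
```

For the real data, replace the demo_ names with your own files and all_sequences.fasta.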
Task 2: Data Cleaning
The Ensembl gene identifier prefix can be used to work out which organism the gene came from:
ENSTRU Takifugu rubripes (Fugu)
ENSMOD Monodelphis domestica (Opossum)
ENSCAF Canis lupus familiaris (Dog)
ENSGAL Gallus gallus (Chicken)
ENSORL Oryzias latipes (Medaka)
ENSTNI Tetraodon nigroviridis (Tetraodon)
Do last: ENSG0 Homo sapiens (Human)
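Before replacing anything, it can help to see which prefixes actually occur in your headers. The following is a minimal sketch that assumes each header starts with the Ensembl identifier directly after ">"; your BioMart export may list other fields first, depending on the header attributes you selected. The file demo_prefix.fasta is a made-up example.

```shell
# Hypothetical demo file; use all_sequences.fasta for the real data.
printf '>ENSGALG00000000001\nMKLV\n>ENSGALG00000000004\nMAAA\n>ENSTRUG00000000003\nMSTR\n' > demo_prefix.fasta

# Keep header lines, take the six characters after '>', and count each prefix.
grep '^>' demo_prefix.fasta | cut -c2-7 | sort | uniq -c
```

Human identifiers would show up here as ENSG00, since the human prefix in the table is shorter than the others.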
Open your all_sequences.fasta file with gedit.
Press Ctrl+H to open the find and replace menu, or press Ctrl+F and then select replace.
In the search box, enter the prefix, for example “ENSTRU”, and in the replace box enter the matching species name, for example Takifugu rubripes (Fugu). Repeat this for each prefix and species in the list above.
Finally, the web aligner doesn't like spaces in the names of FASTA headers, so use the replace tool to replace “ ” (a single space, just press the space bar) with _
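If you prefer the command line, the same find and replace steps can be sketched with sed. This is an illustration, not part of the original procedure. It makes one deliberate change: it only edits header lines (those starting with ">"), because short prefixes such as ENSGAL are made of valid amino acid letters and could in principle occur inside a sequence. The input file demo_sequences.fasta and its headers are made up; real BioMart headers may contain different fields.

```shell
# Hypothetical demo input (made-up identifiers and sequences).
cat > demo_sequences.fasta <<'EOF'
>ENSGALG00000000001|TAS1R1
MENSGALK
>ENSTRUG00000000003|TAS1R2
MSTRV
>ENSG00000000002|TAS1R3
MLLCTA
EOF

# On header lines only: swap each prefix for its species name,
# do the human prefix last, then turn spaces into underscores.
sed '/^>/ {
  s/ENSTRU/Takifugu rubripes (Fugu)/g
  s/ENSMOD/Monodelphis domestica (Opossum)/g
  s/ENSCAF/Canis lupus familiaris (Dog)/g
  s/ENSGAL/Gallus gallus (Chicken)/g
  s/ENSORL/Oryzias latipes (Medaka)/g
  s/ENSTNI/Tetraodon nigroviridis (Tetraodon)/g
  s/ENSG0/Homo sapiens (Human)/g
  s/ /_/g
}' demo_sequences.fasta > demo_clean.fasta

cat demo_clean.fasta
```

The sequence line MENSGALK is left unchanged even though it contains ENSGAL, which is the reason for restricting the edits to headers. To run this on your own data, replace demo_sequences.fasta with all_sequences.fasta and choose an output name you like.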
Sequence Alignment
Task 3: Perform a multiple sequence alignment using the sequences you retrieved.
Procedure:
Extra task - use Ensembl BioMart to find your favourite organism in its selection of species and retrieve the taste receptor genes. How do these compare with the other species we have seen today?