Microbial comparative genomics

– phylogenetics, genome sequencing, genome annotation, antimicrobial resistance, comparative genomics

 

Time: April 8-12, 2024

Place: University of Bergen, Computational Biology Unit

Course responsible: Anagha Joshi and Tom Michoel  

Invited lecturers: Dr. Grigorios Amoutzias, Marios Nikolaidis

Suggested amount of ECTS: 5

Number of participants: Max. 20

Registration form: Here

Registration deadline: March 8th. 

 

Course description  

The course runs for one week (five days, Monday to Friday), with lectures (34 hours including discussions) in the mornings and practical hands-on sessions in the afternoons. On the last day there will be a summing-up session where students can provide feedback. Students will receive a reading list before the course and are expected to prepare thoroughly. After the course, students will complete a project and deliver a report within two weeks. Students must bring a laptop running a UNIX-like operating system (any flavor of Linux, UNIX or macOS) and will receive software installation instructions along with the reading list.

PLEASE NOTE: Students are expected to bring their own laptop with VirtualBox installed. During the course, all analyses will be performed in a virtual machine that contains the necessary software; students will download it and run it on their own machines. Detailed instructions will be sent shortly before the start of the course.

 

Course program 

Day 1: General Introduction into Bacterial Genomics and the Linux environment.

  • The Notable Achievements and the Prospects of Bacterial Pathogen Genomics.
  • Familiarization with the Linux environment and command syntax for running genomic pipelines (all day at the computer lab).

 

Day 2: Homology search, multiple alignment and phylogenetics/phylogenomics.

  • Basic evolutionary concepts.
  • Homology search with BLAST and DIAMOND.
  • Multiple alignment with MUSCLE and MAFFT.
  • Alignment filtering with Gblocks. Alignment editing with SeaView, Jalview, MEGA.
  • Phylogenetic trees (Neighbor Joining and Maximum Likelihood) with SeaView and PhyML.
  • Bootstrapping for assessing the confidence of the various tree branches.
  • Tree annotation with the Interactive Tree Of Life (iTOL).
  • Phylogenomics.  

 

Day 3: Bacterial genome sequencing with short and long reads

  • The most popular sequencing technologies (Illumina/Pacific Biosciences/Nanopore).
  • The FASTQ format of sequence reads.
  • The Sequence Read Archive (SRA) for storing and obtaining publicly available raw sequencing data.
  • Quality control (FastQC) and filtering (Trimmomatic) of sequence reads. Bacterial genome de novo assembly (SKESA, SPAdes, Canu, hybrid assembly).
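
As a small illustration of the FASTQ format listed above, the sketch below reads four-line FASTQ records and converts the quality string to Phred scores. It is a minimal example and not part of the course material: the function names `parse_fastq` and `phred_scores`, the example read, and the Phred+33 offset (the encoding used by current Illumina data) are assumptions made for illustration. Real data should be handled with established tools such as FastQC and Trimmomatic.

```python
from statistics import mean


def parse_fastq(text):
    """Yield (read_id, sequence, quality_string) from four-line FASTQ records.

    Hypothetical helper for illustration; assumes unwrapped (four-line) records.
    """
    lines = [line for line in text.splitlines() if line]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per record")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality strings differ in length")
        yield header[1:], seq, qual


def phred_scores(qual, offset=33):
    """Decode a quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in qual]


# Made-up example read: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
EXAMPLE = "@read1\nACGT\n+\nIII5\n"

for read_id, seq, qual in parse_fastq(EXAMPLE):
    scores = phred_scores(qual)
    print(read_id, seq, scores, "mean quality:", mean(scores))
```

A Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so Q20 means roughly one error in 100 bases and Q40 one in 10,000.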

 

Day 4: Genome annotation – antimicrobial resistance – virulence factors.

  • Gene prediction/annotation with NCBI RAPT/PGAP, Prokka.
  • Functional annotation with eggNOG mapper and the eggNOG database.
  • Detection of antimicrobial resistance genes with AMRFinderPlus/staramr.
  • The Comprehensive Antibiotic Resistance Database (CARD). The Virulence Factor DataBase (VFDB).
  • Other popular bacterial genome databases (NCBI Assembly/Datasets, PATRIC, Pathogenwatch).

 

Day 5: Comparative genomics, species demarcation with FastANI (average nucleotide identity).

  • Taxonomic classification with multilocus sequence typing (MLST) and the PubMLST web-based bioinformatics resource.
  • The core and accessory genome. Species-specific fingerprints.
  • The pyPGCF pipeline for comparative genomics.
  • Examples of comparative genomics with the Pseudomonas, Bacillus, Streptomyces and Staphylococcus genera.
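
The idea of a core and accessory genome listed above can be sketched with plain set operations. This is a simplified illustration and not the method used by pyPGCF: the function name `core_and_accessory`, the strain names, and the gene sets are hypothetical, and the sketch assumes that genes have already been grouped into orthologous families shared across genomes, which real pipelines do through sequence clustering.

```python
def core_and_accessory(gene_presence):
    """Split a pan-genome into core and accessory gene families.

    gene_presence maps a genome name to the set of gene families it carries.
    Core: families present in every genome. Accessory: all other families.
    """
    genomes = list(gene_presence.values())
    if not genomes:
        return set(), set()
    pan_genome = set().union(*genomes)    # every family seen in any genome
    core = set.intersection(*genomes)     # families shared by all genomes
    return core, pan_genome - core


# Hypothetical presence/absence data for three strains.
strains = {
    "strainA": {"gyrB", "rpoD", "mecA"},
    "strainB": {"gyrB", "rpoD"},
    "strainC": {"gyrB", "rpoD", "blaZ"},
}

core, accessory = core_and_accessory(strains)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

In practice a strict 100% presence threshold is often relaxed (for example, to families found in nearly all genomes) to tolerate assembly and annotation gaps; the strict intersection is used here only to keep the example short.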

 

Learning outcomes and competence 

Students will learn the principles underlying sequence comparison, including an evolutionary understanding of sequence alignments, genome analysis, genome annotation and comparative genomics. The course bridges basic concepts, often new to biologists, with state-of-the-art issues raised by large-scale data analysis. Students will gain a solid understanding of diverse aspects of bacterial comparative genomics, including antibiotic resistance and virulence factors. At the end of the course, students will have a practical understanding of homology search, multiple alignment and phylogenetics/phylogenomics, bacterial genome sequencing with short and long reads, genome annotation, antimicrobial resistance, virulence factors and comparative genomics.

 

Prerequisites  

There are no prerequisites for entry to the course.

 

Evaluation

Students will work individually or in small groups on a project based on the course content, using any dataset of their interest. The written report must be submitted within two weeks of finishing the course and must be approved. Grades: pass / no-pass (based on the report).