Large genetic studies in biobanks: from registries screening, to interpretation of GWAS and beyond

Dates: October 23-27, 2017

Location: University of Oslo, Oslo University Hospital at Ullevål. Room: Aud kir avd 2et at Kvinnesenteret. Enter Building 10 (Hotel) and then follow the signs to the lecture room, which is on the second floor of Building 8 (Kvinnesenteret). It should take about 4 minutes to get there from the Hotel entrance. See map here.

Lecturers: Wes Thompson (UCSD/UCPH, San Diego/Copenhagen); Stephanie Le Hellard, Tatiana Polushina (UiB); Ole A. Andreassen, Marissa LeBlanc, Ted Reichborn-Kjennerud (UiO)

Recommended credits: 5 ECTS. Note that this course runs without a UiO course code. After the course we will provide a diploma describing the workload, to be approved at your local institution.

Schedule: A preliminary overview and references for reading can be viewed here.

Preparations: Please prepare well for the course by reading the material found in this folder, which will be updated with the latest material as soon as we receive it from the lecturers.

Software installation: please visit this page.

Exam: please find the assignment here. The report is due 25 November.

Presentations: can be found in this folder.

Description:

To have enough statistical power, genetic studies need to be carried out at very large scales and usually require sample sizes in the tens of thousands. This poses biostatistical and bioinformatic challenges at different levels. The course will cover the challenges of using biobanks and registries for large-scale genetic analyses: from the design of the studies, to their execution, the analysis of the data (beyond single p-value analyses), and the functional interpretation of the results.

Details of topics to be covered:
Scandinavian countries are particularly well placed for large-scale genetic studies thanks to the availability of valuable registries that can be mined for this purpose. We will first focus on using registries to obtain large-scale samples and phenotypes. Genotyping large samples requires very robust pipelines covering DNA extraction, genotyping, QC, and the generation of genotypes, haplotypes and imputations, and we will present state-of-the-art pipelines for these purposes.

Primary analysis of a single phenotype can yield genetic findings for that phenotype, but making full use of GWAS data usually means working in consortia and performing analyses that go beyond single p-value studies. In our groups, we have implemented tools covering polygenic risk scores, multivariate methods, FDR-based methods and pleiotropy methods. Students will be introduced to a series of state-of-the-art statistical analyses of GWAS.

Finally, moving from a p-value to the functional meaning of a GWAS requires in silico analyses of the hits. We will present several possible analyses, ranging from the analysis of single variants to pathway analyses of GWAS.

Course program:
This is an intensive one-week course with lectures and hands-on exercises. Before the course, students will be given a reading list (pensum) that they need to work through in order to be ready for the lectures. On the last day of the course we will prepare a study project that students must complete for the report, which is required to pass the course and obtain the 5 ECTS.

Day 1: Use of registries to collect phenotype information

Day 2: Performing large-scale genotyping, imputation and long-range haplotype phasing

Day 3: Analysis of GWAS, single phenotype, polygenic risk scores

Day 4: Multivariate and Bayesian analysis of GWAS

Day 5: Towards function: annotating hits of GWAS in silico

Learning outcomes:
Following this course the students will:

be able to design a study using registries
know about the different Norwegian registries and biobank requirements
be able to describe the different steps of large-scale genotyping
be able to perform basic analyses of GWAS
know about additional tools for analyses of GWAS such as FDR, condFDR, conjFDR and polygenic risk scores
be familiar with a catalog of bioinformatic tools that can be used to annotate GWAS hits and to perform pathway and gene set analyses for GWAS

Prerequisites:
A reading list (pensum) of lectures will be provided in advance, and students are expected to work through it before the course starts. We expect students to have basic knowledge of genetic principles, with a special focus on polygenic traits. Students must be familiar with basic statistics such as regression and FDR, have some knowledge of Bayesian statistics, and have some experience with R.