OSIRIS - Course offerings BMB508218 2017

Course module: BMB508218

BMB508218

Analytics and Algorithms for Omics Data

Course info

Course code		BMB508218
EC		1.5

Course goals

At the end of the course the student:

can read and understand a paper from the current computational and systems biology literature,
can identify the parts of a paper that deal with data generation and with the algorithms used to analyse these data, and can criticise the computational approaches taken,
can list and describe several high-throughput data types and computer algorithms to analyse these data, and can motivate why a certain algorithm is suitable for the analysis of a certain data type,
can apply the algorithms discussed in this course to toy problems, and can derive and design adaptations of these algorithms for new data types,
can draw biologically meaningful conclusions from results obtained with an analysis algorithm,
understands and can explain the basics of unsupervised machine learning (ML) and the specifics of k-means, hierarchical and spectral clustering,
understands and can explain the basics of supervised machine learning (ML), including concepts such as cross-validation and overtraining, and the specifics of probabilistic, k-NN and random forest classifiers,
understands and can explain the basics of dimension reduction and the specifics of PCA, NMF and t-SNE,
understands and can explain the basics of Hidden Markov Models and their application to (epi)genomic data,
understands and can explain the basics of sequence analysis and alignment and the specifics of dynamic programming, variant calling and modern next-generation sequencing analysis.

Content

Period (from-till): 18 June 2018 - 22 June 2018

Lecturer(s):
Name, faculty/department, participation (%) in course
Dr. Jeroen de Ridder, UMC Utrecht, 60%
Dr. Alexander Schoenhuth, Utrecht University, 40%

Extended course description (for Osiris):
Bioinformatics is at the heart of much modern genomics research and encompasses the application of statistics and computer science to (large-scale) biomolecular datasets. In essence, bioinformatics is about smart ways of extracting knowledge from the enormous amounts of data that can be generated using modern measurement techniques. For instance, it plays an important role in finding the genetic origins of various diseases, such as cancer, diabetes and Alzheimer's disease.

In this course we will study some key examples of bioinformatics analyses, i.e. data analytics and computational algorithms, by reading a set of selected papers that present significant biological conclusions. Rather than having the teachers lecture on the methodologies, students are encouraged to read, study and comprehend the available course material themselves. Some lectures will be provided to ensure the basic concepts are clear.

Schedule: The course runs for five days, from 9.00 till approximately 17.00. Each day starts with a lecture, followed by two rounds of paper discussions that go into depth on the computational approaches taken.

Content:

Unsupervised learning, hierarchical and k-means clustering, spectral clustering
Supervised learning, cross-validation, overtraining, Bayes classifier, random forest classifier
Dimension reduction, PCA, NMF, t-SNE
Hidden Markov Models, forward-backward algorithm, Viterbi algorithm
Sequence alignment, dynamic programming
Read mapping techniques
Sequence data indexes, such as the Burrows-Wheeler Transform
Genome assembly basics, de Bruijn graphs, overlap graphs
Hash-based techniques, for example for overlap detection

Literature/study material used:
Provided course materials (slides) will be made available through our online learning platform: elearning.ubc.uu.nl

Registration:
Please register online on the CS&D website: www.CSnD.nl/courses.
Bioinformatics Profile students have priority when they follow this course as part of their profile.
Thereafter, registration is on a first-come, first-served basis until the maximum of 20 participants is reached.

Mandatory for students in own Master’s programme:
No

Optional for students in other GSLS Master’s programmes:
Yes

Prerequisite knowledge:
Basic knowledge of Linear Algebra and Statistics.
