Course module: BMB502114
Advanced Bioinformatics: Data Mining and Data Integration for Life Sciences
Course info
Course code: BMB502114
ECTS Credits: 1.5
Category / Level: M (Master)
Course type: Course
Language of instruction: English
Offered by: Faculty of Medicine; Graduate School of Life Sciences; Cancer, Stem Cells and Development Biology
Contact person: dr. J. de Ligt
Teaching period: MASTER (20/08/2018 to 17/08/2019)
Time slot: Not in use
Study mode
Remark: 8-12 April 2019. Register via the CS&D website; CS&D students have priority until 3 weeks before the start. Maximum 30 students.
Course application process: educational institute administration
Enrolling through OSIRIS: No
Enrolment open to students taking subsidiary courses: No
Waiting list: No
Course placement process: educational institute administration
At the end of the course, the student is able to:
  1. Integrate biological data
  2. Think critically about data storage and sharing
  3. Use alternatives to tabular data
  4. See the advantages and limitations of different data storage and sharing solutions
  5. Plot data in an interactive way
Period: 8 April - 12 April 2019, see
Joep de Ligt, Biomedical genetics/Genetics,
Pjotr Prins, Biomedical genetics/Genetics,
Edwin Cuppen, Biomedical genetics/Genetics,
Berend Snel, Theoretical Biology and Bioinformatics, 
Invited speakers (different each year), 2016 participants listed below:
Jayne Hehir-Kwa, Radboud UMC,
Ruben van Boxtel, UMC Utrecht,
Victor Guryev, Groningen University/UMC,
Mark Wilkinson, Center for Biotechnology, Madrid.
Description of content
Effectively mining and integrating data is one of the major challenges in biomedical research. Decades of research have led to an accumulation of databases worldwide, including important resources such as NCBI, KEGG, ENCODE and SWISS-PROT. More recently, new data acquisition technologies, especially next generation sequencing (NGS), have been rapidly increasing the amount of information available online. This ranges from data published with papers to large-scale collaborations such as The Cancer Genome Atlas (TCGA), which involves a wide range of hospitals and research groups and offers information on patients, diagnostics and treatments together with data on sequenced tumors, gene expression, methylation, etc. For an inspiring example see
The challenge is to mine resources such as the TCGA effectively after performing an experiment or obtaining clinical results. For example, if you are sequencing tumors from cancer patients, the question is how to mine this public data and compare it against your own data and results. TCGA alone comprises over 50,000 files, so there is no way to mine this data by hand. Likewise, we have access to 1,000 public genomes and the Genome of the Netherlands (GoNL). What are feasible strategies for using this data?
In this course, the morning starts with a lecture by a leading biomedical scientist. The topic can be in cancer research, for example diagnostics or personalised medicine. The presenter will tell us about their research and the short-term data mining and data integration issues they are facing. The lecture is followed by a discussion of possible approaches to solving one or more of these issues. Topics covered will include parsing tabular data, SQL databases, web services and the semantic web. For the rest of the day, the students will be tasked with finding a solution to a particular problem. Such problems can only be solved by writing (small) computer programs. This course is suitable for students who take an interest in informatics and its biomedical applications. The course builds on the skills acquired in introductory programming courses; having completed one of these is a hard prerequisite. The introduction to bioinformatics course is not a prerequisite but is highly recommended.
The goal of this course is to outline current data integration challenges in biology and biomedical research and discuss state-of-the-art approaches for tackling these challenges. Students from other disciplines and other universities are invited to attend this course. The topic is suitable for all students in the life sciences dealing with NGS data.
Literature/study material used:
Lectures, Scientific articles, Course laptop (students can bring their own), Online resources and documentation, Online tutorials, Unix operating system, Online discussion and Q&A platform.
Please register online on the CS&D website. CS&D students have priority in registration until 3 weeks before the start of the course. Thereafter, registration is on a first-come, first-served basis until the maximum of 30 participants is reached.
Mandatory for students in own Master’s programme 
Optional for students in other GSLS Master’s programme:

Prerequisite knowledge:
Basic programming knowledge
Entry requirements
You must meet the following requirements
  • Enrolled in a degree programme of the Faculty of Medicine
Required materials
Instructional formats
Basic lecture

Final result
Test weight: 100
Minimum grade: -
