Course module: 201600038
Data analysis and visualisation
Course code: 201600038
ECTS Credits: 7.5
Category / Level: M (Master)
Course type: Course
Language of instruction: English
Offered by: Faculty of Social Sciences; M&S for Behavioural, Biomedical & Social Scien;
Contact person: dr. D.L. Oberski
A. Bagheri
dr. M.J.L.F. Cruyff
prof. dr. P.G.M. van der Heijden
E. van Kesteren
dr. D.L. Oberski
Teaching period: 2 (11/11/2019 to 31/01/2020)
Teaching period in which the course begins
Time slot: D (WED-afternoon, Friday)
Study mode
Remark: Please note that this course is 7.5 ECTS instead of 5 ECTS.
Enrolling through OSIRIS: No
Enrolment open to students taking subsidiary courses: Yes
Waiting list: No
After successfully completing this course, you will be able to:
  • Understand and explain the different approaches to data analysis;
  • Given a practical data science problem, select appropriate techniques to tackle this problem;
  • Apply various data analysis techniques in R, including regression, trees, clustering, PCA, and correspondence analysis;
  • Implement generic data science tools such as train/validation/test sets, cross-validation, bagging, boosting, and error evaluation in R;
  • Interpret and evaluate the results of such analyses;
  • Explain these evaluations in layman's terms;
  • Understand and explain the basic principles of data visualization and the grammar of graphics;
  • Construct appropriate visualizations in connection with each of the data analysis techniques in R.
What puts former criminals on the right track? How can we prevent heart disease? Can Twitter predict election outcomes? What does a violent brain look like? How many social classes does 21st century society have? Are hospitals spending too much on health care, or too little? When is a series of spikes in hundreds of website logfiles an operational problem?

Data analysis is the art and science of tackling questions like these by looking at data. Just as cartographers make maps to see what a country looks like, data analysts explore the hidden structures of data by creating informative pictures and summarizing relationships among variables. And just as doctors diagnose sick patients and advise healthy ones on how to stay healthy, data analysts predict important events and variables so we can act on this knowledge. Methods from statistics, machine learning, and data mining play an important part in this process, as well as visualizations that allow the analyst and other humans to better understand what we can conclude from the available facts.

During this course, participants will actively learn how to apply the main statistical methods in data analysis and how to use machine learning algorithms and visualization techniques. The course has a strongly practical focus: rather than concentrating on the mathematics and background of the techniques discussed, you will gain hands-on experience in applying them to real data and interpreting the results.
This course covers both classical and modern topics in data analysis and visualization:
  1. Exploratory data analysis (EDA);
  2. Supervised machine learning and statistical learning;
  3. Unsupervised learning and data mining techniques;
  4. Visualization (throughout the course).
This course is essential as a basis for each track of the Master of Applied Data Science. If you want to register for this course, please also register for the Applied Data Science profile. Students for whom this course is a mandatory part of the ADS profile need to enroll before the end of September (information on how and where will be provided within the profile). Other interested students can enroll during the FSW elective enrollment at the beginning of October, subject to the places still available in the course. Further information on this procedure can be found on the website of one of the academic masters of the Faculty of Social and Behavioural Sciences.
Note also that if you are not an FSW student, the registration period may differ from your usual one.

Please note: 7.5 EC instead of 5 EC.
Entry requirements
Prerequisite knowledge
If you want to take this course, you should be familiar with the basic principles of applied statistics (up to regression) and be able to use and understand basic statistical analysis in R or Python. There are many good free online courses to bring you up to speed, including at Coursera (JHU Data Science specialization) or through Swirl. The book R for Data Science contains exercises for self-study and is free.
Required materials
Excerpt from the freely available text: James, Witten, Hastie & Tibshirani (2015). An introduction to statistical learning with applications in R. New York: Springer.
Excerpt from the freely available text: Wickham (2016). R for Data Science. O’Reilly.
All software used (RStudio, R) is open source and freely available online, as is the mandatory literature.
Recommended materials
Zumel & Mount (2014). Practical data science with R. Shelter Island: Manning.
Additional literature and references are provided during the course.
Instructional formats
Computer practical

General remarks
There are two computer practicals every week. The exact programme is outlined in the course manual.

Class session preparation
Assigned literature must be read before the lectures, and assignments must be completed before the meetings.

Contribution to group work
Collaboration by students on homework assignments is allowed and encouraged. Copying or simply dividing up assignments among collaborating students is strongly discouraged.

Lecture

Small-group session

Final result
Test weight: 100
Minimum grade: 5.5

Will be announced in the course manual.

Aspects of student academic development
Academic thinking, working and acting
Material / data analysis and processing
