Course module: 201900027
ADS: Applied data analysis and visualization
Course code: 201900027
ECTS Credits: 7.5
Category / Level: 3 (Bachelor Advanced)
Course type: Course
Language of instruction: English
Offered by: Faculty of Social Sciences; Methods and Statistics
Contact person: dr. E. Aarts
E-mail: e.aarts@uu.nl
Lecturers
Contact person for the course
dr. E. Aarts
Teaching period
4  (26/04/2021 to 02/07/2021)
Teaching period in which the course begins
4
Time slot: Not in use
Study mode
Full-time
Enrolment period: from 02/11/2020 up to and including 29/11/2020
Enrolling through OSIRIS: Yes
Enrolment open to students taking subsidiary courses: Yes
Pre-enrolment: No
Waiting list: No
Aims
This course builds on the course Fundamental techniques in data science with R (course code: 201900026).
After successfully completing this course, you will be able to:
  • Understand and explain the different approaches to data analysis that go beyond regression analysis;
  • Given a practical data science problem, select appropriate techniques to tackle this problem;
  • Apply various (mainly supervised) data analysis techniques, including regression, trees, classification, and clustering, in R;
  • Implement generic data science tools such as train/validation/test sets, cross-validation, and error evaluation in R;
  • Interpret and evaluate the results of such analyses;
  • Explain these evaluations in layman's terms;
  • Understand and explain the basic principles of data visualization and the grammar of graphics;
  • Construct appropriate visualizations in connection with each of the data analysis techniques in R.
 
Content
What puts former criminals on the right track? How can we prevent heart disease? Can Twitter predict election outcomes? What does a violent brain look like? How many social classes does 21st century society have? Are hospitals spending too much on health care, or too little?
Data analysis is the art and science of tackling questions like these by looking at data. Just as cartographers make maps to see what a country looks like, data analysts explore the hidden structures of data by creating informative pictures and summarizing relationships among variables. And just as doctors diagnose sick patients and advise healthy ones on how to stay healthy, data analysts predict important events and variables so we can act on this knowledge. Methods from statistics, machine learning, and data mining play an important part in this process, as well as visualizations that allow the analyst and other humans to better understand what we can conclude from the available facts.
 
During this course, you will actively learn how to apply the main statistical methods in data analysis and how to use machine learning algorithms and visualization techniques. The course goes beyond linear and logistic regression, and thus continues where “Fundamental techniques in data science with R” ended. The course has a strongly practical, hands-on focus: rather than concentrating on the mathematics and background of the techniques discussed, you will gain hands-on experience in applying them to real data and interpreting the results.
 
This course covers both classical and modern topics in data analysis and visualization:
  1. Exploratory data analysis (EDA);
  2. Supervised machine learning and statistical learning;
  3. Basic unsupervised learning techniques;
  4. Visualization (throughout the course).
Note that you need to register for this course during the Faculty of Social Sciences (FSW) registration periods (page is in Dutch). If you are not an FSS student, this registration period may differ from your usual one. This course is part of the minor Applied Data Science. If you also want to register for this minor, you can do so via OSIRIS Student.

Also note that this course builds on the course Fundamental techniques in data science with R (course code: 201900026).


Students who do not meet the general entry requirements (see below) are advised to take the pre-course for the ADS minor, ADS: Basis van Onderzoeksmethoden en Statistiek (code 201900025, taught in Dutch). Students who do not meet the entry requirements but believe they have the necessary background and skills are asked to provide further information on their eligibility. The course coordinator will decide on their eligibility.
 

Entry requirements
Students should have completed at least an introductory statistics course of 7.5 EC and be familiar with correlation and regression, comparing means, and cross-tabulations of categorical variables. We also expect you to have hands-on experience in carrying out these analyses with, for example, SPSS, Stata, R, or SAS.

Prerequisite knowledge
You should be familiar with the basic principles of applied statistics (up to regression). Familiarity with the high-level programming language R is highly desirable; it will be very hard to complete the course without knowledge of R. This knowledge can be obtained through the course Fundamental techniques in data science with R (course code: 201900026), also part of the minor Applied data science.
Required materials
Literature
Excerpt from the freely available text: James, Witten, Hastie & Tibshirani (2015). An introduction to statistical learning with applications in R. New York: Springer. http://www-bcf.usc.edu/~gareth/ISL/
Literature
Excerpt from the freely available text: Wickham & Grolemund (2016). R for Data Science. O’Reilly. http://r4ds.had.co.nz/
Software
All software used (RStudio, R) is open source and freely available online, as is the mandatory literature.
Recommended materials
Book
Zumel & Mount (2014). Practical data science with R. Shelter Island: Manning.
Literature
Additional literature and references are provided during the course.
Instructional formats
Exam inspection

Lecture

General remarks
Note: the lectures will be taught in parallel with the course 'Applied data analysis and visualization for economists', part of the minor 'Applied data science for economists'.

Class session preparation
Assigned literature must be read before the lectures.

Small-group session

Contribution to group work
Collaboration by students on homework assignments is allowed and encouraged.
Copying or simply dividing up assignments among collaborating students is strongly discouraged.

Tests
Digital exam
Test weight: 60
Minimum grade: 5

Assignment(s) 1
Test weight: 40
Minimum grade: 5
