Applied data analytics is a multidisciplinary field in which you will gain the insights needed to make sense of data, research, and observations from everyday life. You will learn how to apply a data-driven approach to problem solving. Beyond tools, methods, techniques, and the latest trends, you will also gain more general insights: why certain approaches work, why the field is so popular, what mistakes are commonly made, and so on. You will also learn that data analytics is part science and part “art”, since applying methods and searching for findings involves a creative component.
Throughout the workshops you will work on several individual data analytics (DA) assignments on predefined problems and datasets, using R tools. Many of these assignments nevertheless leave room for your own approach. Most assignments involve relevant real-world datasets, often connected to active research.
The lectures will provide the theoretical background on how a DA process should be performed according to industry standards. Furthermore, we give an overview of popular DA techniques to help match techniques with information needs, including applications of text mining and data enrichment.
The course will be taught in English.
Course form
The course consists of lectures and individual (weekly) assignments. The answers to the assignments are to be submitted to the appropriate section of Blackboard.
Literature
The main text for this course is Peng and Matsui (2016), which is available as a PDF, e-book, or paperback; you can also read the latest version online at https://bookdown.org/rdpeng/artofdatascience/.
Peng, Roger D., and Elizabeth Matsui. 2016. The Art of Data Science: A Guide for Anyone Who Works with Data. Lulu.com. https://leanpub.com/artofdatascience.
The second text we use is a technical report by Chapman et al. (2000).
Chapman, Pete, Julian Clinton, Randy Kerber, Thomas Khabaza, Thomas Reinartz, Colin Shearer, and Rudiger Wirth. 2000. “CRISP-DM 1.0: Step-by-Step Data Mining Guide.” Technical report. The CRISP-DM consortium. ftp://ftp.software.ibm.com/software/analytics/spss/support/Modeler/Documentation/14/UserManual/CRISP-DM.pdf.
Lastly, we will use a (small) part of the work by James et al. (2013), which is a standard work in (advanced) data analysis courses. Note that all three texts are available for free online; you can also buy copies.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. Springer Texts in Statistics. Springer. https://doi.org/10.1007/978-1-4614-7138-7.