At the end of this course, you will be able to:
- Understand the role of data science and its societal impact
- Recognise the knowledge discovery processes in applied data science
- Identify trends and developments in big data technologies
- Apply selected big data technologies to solve real-world problems
This is the introductory course for the Applied Data Science profile and the Applied Data Science postgraduate MSc programme. As such, its primary objective is to inspire you and introduce you to the emerging domain of Applied Data Science. The following assignments are among the key parts of the course:
- Book review: Explore data science and its societal impact
- Mid-term data analysis assignment
- Final data analysis assignment
The graded deliverables determine the final course grade as follows:
- [A] Book review
- [B] Mid-term assignment
- [C] Final assignment
- [D] Written exam, mostly multiple choice
- [E] Optional bonus for extraordinary participation/performance
Grade = [A]*0.10 + [B]*0.25 + [C]*0.30 + [D]*0.35 + [E]
NB: To qualify for the second chance exam, all grading components need to be at least 4.0, and components A-C need to have been submitted within the allotted time.
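As a rough illustration of the grade formula and the second-chance rule, the following Python sketch computes both. The function names, the dictionary layout, and the example marks are hypothetical. The sketch also assumes that "all grading components" means A to D only (the optional bonus E is excluded from the 4.0 check) and that no rounding or capping is applied; the course rules may differ on both points.

```python
# Weights taken from the formula Grade = [A]*0.10 + [B]*0.25 + [C]*0.30 + [D]*0.35 + [E].
WEIGHTS = {"A": 0.10, "B": 0.25, "C": 0.30, "D": 0.35}


def final_grade(components, bonus=0.0):
    """Weighted sum of components A-D plus the optional bonus [E].

    Assumption: no rounding or capping is applied to the result.
    """
    weighted = sum(WEIGHTS[key] * components[key] for key in WEIGHTS)
    return weighted + bonus


def qualifies_for_second_chance(components, submitted_on_time):
    """Check the NB rule: every component at least 4.0, and A-C handed in on time.

    Assumption: only components A-D count for the 4.0 threshold.
    """
    all_at_least_4 = all(components[key] >= 4.0 for key in WEIGHTS)
    abc_on_time = all(submitted_on_time[key] for key in ("A", "B", "C"))
    return all_at_least_4 and abc_on_time


# Hypothetical marks for one student.
marks = {"A": 7.0, "B": 6.0, "C": 8.0, "D": 5.0}
on_time = {"A": True, "B": True, "C": True}

print(round(final_grade(marks), 2))
print(qualifies_for_second_chance(marks, on_time))
```

With these example marks the weighted sum is 0.70 + 1.50 + 2.40 + 1.75, and the student meets both conditions of the second-chance rule.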
- Pritzker, P., & May, W. (2015). NIST Big Data Interoperability Framework (NBDIF): Volume 1: Definitions. NIST Special Publication 1500-1. Final Version 1. National Institute of Standards and Technology.
- Shenoy, A. (2014). Hadoop Explained: An introduction to the most popular Big Data platform in the world. Packt Publishing.
- Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113.
- Ghemawat, S., Gobioff, H., & Leung, S. (2003). The Google file system. SIGOPS Operating Systems Review, 37(5), 29-43.
- Spruit, M., & Jagesar, R. (2016). Power to the People! Meta-algorithmic modelling in applied data science. In Fred, A. et al. (Eds.), Proc. 8th Int. Conf. on Knowledge Discovery (pp. 400-406). KDIR 2016, November 11-13, 2016, Porto, Portugal: SciTePress.
- Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The parable of Google Flu: traps in big data analysis. Science, 343(6176), 1203-1205.
- Davenport, T. H., & Patil, D. J. (2012). Data scientist: The Sexiest Job of the 21st Century. Harvard Business Review, 90(5), 70-76.
- Stair, R., & Reynolds, G. (2012). Fundamentals of Information Systems. Sixth Edition. Cengage: Boston, MA. NOTE: Chapters 1 and 3 ONLY, on Information Systems in Perspective & Database Systems, Data Centers, and Business Intelligence.
- Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., & Wirth, R. (2000). CRISP-DM 1.0 Step-by-step Data Mining Guide.
Teaching formats
There will be 6 contact hours per week. Regular lectures are given on Tuesdays and Thursdays.
In the first weeks, the lectures focus on the fundamentals of applied data science, whereas in the second half we will be introduced to the current applied data science research of various UU/UMCU researchers.
The Thursday lectures are followed by workshop sessions in which we practice with big data tools (especially Hadoop) and collaboratively investigate their societal impact.