The course will run from 9.11.2020 to 23.11.2020.
Materials from the previous Chemometrics course.
A very useful book for getting comfortable with coding in R: R for Data Science
Monday 9.11. Getting started
9:30 to 12:00 Seminar: What is Data Analysis?
13:00 to 16:00 Computer lab: Getting started with machine learning in R, using retention time prediction as an example. Plotting with ggplot() and building interactive pages with shiny. Data.
Code from the computer lab 2020-11-09.
Tuesday 10.11
9:30 to 12:00 Seminar: Similarity-based learning for clustering. A case study.
13:00 to 16:00 Computer lab: Identification of compounds and evaluating the similarity of compounds from MS/MS spectra. Functions in R. Data.
Code from the computer lab 2020-11-10.
Wednesday 11.11
9:30 to 12:00 Seminar: Revision of methods: kNN and some more on kNN, Decision Trees, Random Forest, Support Vector Machines. A case study.
13:00 to 16:00 Computer lab: ROC curve, hyperparameter tuning, parameter importance for complex models, smart splitting of data. Data.
Code from the computer lab 2020-11-11.
Thursday 12.11
9:30 to 12:00 Seminar: Ensemble learning. A case study.
13:00 to 16:00 Computer lab: Coding our own ensemble for regression tasks. Data.
Code from the computer lab 2020-11-12.
Friday 13.11
9:30 to 12:00 Seminar: Presentation of papers
13:00 to 16:00 Individual discussions for projects
16. – 19. November
Mornings: individual work on your data
Afternoons: consultations; select a suitable time in Doodle
Friday 20.11
13:00 Project presentations
Exam
The exam is online, and you can complete it between 20.11 and 23.11.