The course will run from 9.11.2020 to 23.11.2020.

Materials from the previous Chemometrics course.

Introduction to the course

A very useful book for getting comfortable with coding in R: R for Data Science

Monday 9.11. Getting started

9:30 to 12:00 Seminar: What is Data Analysis?

13:00 to 16:00 Computer lab: Getting started with machine learning in R, using retention time prediction as an example. Plotting with ggplot() and building interactive pages with shiny. Data.

Code from the computer lab 2020-11-09.

Tuesday 10.11

9:30 to 12:00 Seminar: Similarity-based learning for clustering. A case study.

13:00 to 16:00 Computer lab: Identification of compounds and evaluating the similarity of compounds from MS/MS spectra. Functions in R. Data.

One possible solution.

Code from the computer lab 2020-11-10.

Wednesday 11.11

9:30 to 12:00 Seminar: Revision of methods: kNN and some more on kNN, Decision Trees, Random Forest, Support Vector Machines. A case study.

13:00 to 16:00 Computer lab: ROC curve, hyperparameter tuning, parameter importance for complex models, smart splitting of data. Data.

Code from the computer lab 2020-11-11.

Thursday 12.11

9:30 to 12:00 Seminar: Ensemble learning. A case study.

13:00 to 16:00 Computer lab: Coding our own ensemble for regression tasks. Data.

Code from the computer lab 2020-11-12.

Friday 13.11

9:30 to 12:00 Seminar: Presentation of papers

13:00 to 16:00 Individual discussions for projects

16. – 19. November

Mornings: individual work on your data

Afternoons: consultations; select a suitable time in Doodle

Friday 20.11

13:00 Project presentations

Exam

The exam is online and can be taken from 20.11 until 23.11.