Professur Numerische Mathematik

Introduction to Data Science (4V, 2Ü: 4 lecture hours and 2 exercise hours per week), Prof. Ernst, winter semester (WS) 2018/19

Content

Tentative list of course topics:
  • Introduction: What is Data Science
  • Learning Theory
  • Regression
  • Neural Networks
  • Classification
  • Clustering and Tree-Based Methods
  • Support Vectors
  • Unsupervised Learning

Notices

Extra lab session: In place of the lecture on Thursday, December 20, there will be an extra lab session in the computer pool during the regular class time.
Class cancellation: There will be no class on Monday, December 17.
Rescheduled lecture: To make up for the lecture on December 10, which was lost to the railway strike, a lecture will take place instead of the lab session on Tuesday, December 11, 13:35-15:05, in the computer pool.
Temporary lecture room change: On Thursday, November 1, and Monday, November 5, the lecture will take place in Room 2/N101.
New time for lab exercises: Beginning Tuesday, October 23, 2018, the labs will start 10 minutes earlier, i.e., at 13:35.
Cancellation: There will be no lab exercise session on Tuesday, October 16.
Note: To participate in the lab exercises, all students need an account with the MRZ (Mathematics Computing Center). Those who do not already have one should apply for one here and collect their login credentials from Ms. Margit Matt (Rh39, Room 704).
First lab exercises: Tuesday, October 9, 2018.
First class: Monday, October 8, 2018.

Listing of this course in the electronic Vorlesungsverzeichnis (course directory):

Number, name, time, room, details:
  • 220000-C80 [Lecture]: Wednesday (weekly), 11:30-13:00, Room 2/39/633 (new: C46.633). Start: 08.04. Detailed information is available in the OPAL course.
  • 220000-C80A [Lecture]: Thursday (biweekly, odd calendar weeks), 13:45-15:15, Room 2/39/633 (new: C46.633).
  • 220000-C81 Optimaler Transport und Data Science / Optimal Transport and Data Science [Exercise]: Thursday (biweekly, even calendar weeks), 13:45-15:15, Room 2/39/633 (new: C46.633).

Lecture

Literature
  • James, Witten, Hastie & Tibshirani. An Introduction to Statistical Learning – with Applications in R. Springer 2013. Available online here.
  • Here's a continually updated annotated reading list for the course (16.01.2019).
Slides

Exercises

Installation of Programming Environment under Linux (64 bit)

If you want to do the homework on your own computer, you can replicate the programming environment used in the labs. Get Miniconda from here and follow the steps in the installation dialogue. Next, download the specification file spec-file.txt used in the labs and create a conda environment (under Linux):

conda create --name DS2018 --file spec-file.txt

Installation of Programming Environment under Windows and MacOS

Download Miniconda for your operating system here and follow the installation instructions. Next, download the YAML file DS2018.yml containing the packages used in the labs and create a conda environment in a Miniconda/Anaconda shell:

conda env create -f DS2018.yml

If your plots are not displayed in the browser, a missing package may be the cause. After activating the correct environment, the following command may help in some cases:

python -m ipykernel install --user

Please refer to Conda (Installation under Windows, Linux and MacOS) and Conda (Managing environments) for further information.
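To narrow down which package might be missing, a short check like the following can be run inside the environment. This is only a sketch: the helper name missing_packages and the package list are assumptions for illustration and are not taken from spec-file.txt or DS2018.yml, so replace them with the packages your notebooks actually import.

```python
import importlib.util


def missing_packages(names):
    """Return those top-level package names that cannot be found
    by the Python interpreter of the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list of packages; adjust it to the contents of
    # the environment file used in the labs.
    candidates = ["ipykernel", "matplotlib", "numpy"]
    missing = missing_packages(candidates)
    if missing:
        print("Not found in this environment:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

Run it with the python of the environment you created; a package reported as not found can then be installed into that environment with conda.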

Material

To start the Jupyter notebooks, open a terminal and source our conda environment DS2018 via

source /LOCAL/Software/DataScience2018/setup_env

Next, change to your exercise folder and download the Jupyter notebook (right-click and "Save link as") into this folder. Finally, start the notebook with the following command (make sure you see (DS2018) in front of your username in the prompt):

jupyter notebook Problem_Sheet_XX.ipynb
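
If you are unsure whether the environment is active, the prompt check above can also be done from Python. The sketch below assumes that the setup script activates a conda environment named DS2018 in the usual way, in which case conda sets the environment variable CONDA_DEFAULT_ENV; the function names are made up for this example.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if
    CONDA_DEFAULT_ENV is unset or empty."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV") or None


def is_expected_env(expected="DS2018", environ=None):
    """Check whether the active conda environment has the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    print("Python executable:", sys.executable)
    print("Active conda environment:", active_conda_env())
    print("DS2018 active:", is_expected_env())
```

If the last line reports False, source the environment again in the same terminal before starting the notebook.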