1. Course title
Introduction to Machine Learning
2. Course code
-Code in Oodi/OTM (upcoming academic administration information system) and other
systems
DATA11002
3. Course status: compulsory or optional
-Which degree programme is responsible for the course?
Data Science Master's programme
-Which module does the course belong to?
Data Science Methods
-Is the course available to students from other degree programmes?
yes
4. Course level (first-, second-, third-cycle/EQF levels 6, 7 and 8)
-Bachelor’s level = first-cycle degree/EQF level 6
-Master’s level, degree programmes in medicine, dentistry and veterinary medicine = second-cycle
degree/EQF level 7
7 (and 8)
-Doctoral level = third-cycle (doctoral) degree/EQF level 8
-Does the course belong to basic, intermediate or advanced studies (cf. Government Decree
on University Degrees)?
advanced studies
5. Recommended time/stage of studies for completion
-The recommended time for completion may be, e.g., after certain relevant courses have
been completed.
First semester (Autumn)
6. Term/teaching period when the course will be offered
-The course may be offered in the autumn or spring term or both.
Autumn
-If the course is not offered every year, this must be indicated here.
-Specification of the teaching period when the course will be offered
Typically 2nd period
7. Scope of the course in credits
5 cr
8. Teacher coordinating the course
Kai Puolamäki
9. Course learning outcomes
-Description of the learning outcomes provided to students by the course
- See the competence map (https://flamma.helsinki.fi/content/res/pri/HY350274).
Machine learning is the core technology behind recent developments in artificial intelligence (AI), and it is applied widely in many domains. This course will provide you with the theoretical background needed to understand the fundamental machine learning concepts and to use the basic methods of supervised and unsupervised learning properly to solve real-life problems. The course will prepare you for further studies in machine learning and introduce you to the methods and tools that are used to solve such problems in practice.
More specifically:
- You will have the necessary theoretical background to understand and explain the fundamental machine learning principles and concepts (e.g., training data, feature, model selection, loss function, training error, test error, overfitting). You recognise the various ingredients of a machine learning task (task, computational problems, models, algorithms, etc.).
- You are able to map a practical data analysis problem into a machine learning task, take the correct steps to solve the task, and know how to interpret and evaluate the outcomes. You understand the underlying assumptions and limitations of the machine learning solution.
- You are familiar with the basic tools and programming environments suitable for solving machine learning problems, and you are able to carry out basic data analysis tasks independently in such environments.
- You understand the concept of generalisation, can use validation set methods, and you are able to evaluate the performance of machine learning methods and to do model selection.
- You know the principles of the following techniques and are able to apply them to real-world problems:
  - supervised learning: basic regression methods (linear etc.) and classification methods (at least one example of each: linear, distance based, generative, discriminative, and algorithmic).
  - unsupervised learning: the most important clustering formalisms (k-means, hierarchical clustering) and the most important dimensionality reduction approaches (PCA, at least one distance-based method, and at least one manifold method).
- You can read machine learning literature (textbooks, scientific articles etc.) and you are prepared for further studies in machine learning or in other disciplines which need machine learning methods.
- You can explain and report your machine learning approaches and solutions to your peers and to your future colleagues in an understandable and coherent manner.
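As an illustration of the concepts named in the outcomes above (training error, overfitting, validation set methods, model selection), the following is a minimal sketch and not part of the course material. It assumes synthetic one-dimensional data and a k-nearest-neighbour regressor as a stand-in for a distance-based method; the function names `knn_predict` and `mse`, the data, and the candidate values of k are hypothetical choices made for the example.

```python
import math
import random


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared error of the k-NN regressor (fitted on train) on data."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is reproducible

# Synthetic data (an assumption for illustration): y = sin(x) + Gaussian noise
points = []
for _ in range(200):
    x = rng.uniform(0.0, 6.0)
    points.append((x, math.sin(x) + rng.gauss(0.0, 0.3)))

# Hold out part of the data as a validation set
train, valid = points[:120], points[120:]

candidates = [1, 3, 5, 10, 20, 40]
train_err = {k: mse(train, train, k) for k in candidates}
valid_err = {k: mse(train, valid, k) for k in candidates}

# Model selection: pick the k with the smallest validation error
best_k = min(candidates, key=lambda k: valid_err[k])

for k in candidates:
    print(f"k={k:2d}  training MSE={train_err[k]:.3f}  validation MSE={valid_err[k]:.3f}")
print("k chosen on the validation set:", best_k)
```

With k = 1 the training error is exactly zero, because each training point is its own nearest neighbour, while the validation error is not. That gap is the overfitting the outcomes refer to. Because the validation error was used to choose k, a final performance estimate would need a separate test set.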
10. Course completion methods
-Will the course be offered in the form of contact teaching, or can it be taken as a distance
learning course?
Contact teaching
-Description of attendance requirements (e.g., X% attendance during the entire course or
during parts of it)
Any attendance requirements are specified each year on the course web page.
-Methods of completion
Completion is based on exercises and a term project. Any other methods of completion will be announced on the course web page.
11. Prerequisites
-Description of the courses or modules that must be completed before taking this course or
what other prior learning is required
The students should have the following prerequisite knowledge, with examples of courses providing the necessary skills:
- Generic skills learned during BSc studies, including writing of academic reports.
- High school mathematics and university mathematics, including basics of optimization with differentiation. Courses: MAT11001 or FYS1010.
- Linear algebra, including basic matrix and vector operations, eigenvalues, and eigenvectors. Courses: MAT11002 or MAT11009 or FYS1012.
- Probability and statistics, including random variables, expectation, and rules of probability. Courses: MAT12003 or MAT11015 or FYS1014.
- Programming skills, some programming experience, and ability to quickly acquire the basics of a new environment such as R or Python (courses: TKT10002 or FYS1013). Additionally, it is useful to know the basic ideas of pseudocode and the analysis of time and space complexity with big O notation.
The course has a short prerequisite knowledge test, available on the course web site, which contains a more detailed description of the required prerequisites and pointers to self-study materials. The courses Introduction to Data Science and Introduction to Artificial Intelligence are recommended but not required.
12. Recommended optional studies
-What other courses are recommended to be taken in addition to this course?
-Which other courses support the further development of the competence provided by this
course?
Courses in the Machine Learning module. Courses in other degree programmes in which machine learning methods are applied.
13. Course content
-Description of the course content
The course includes the following content:
- Ingredients of machine learning: components (tasks, computational problems, algorithms etc.) and necessary tools.
- Introduction to statistical learning and probabilistic modelling.
- Supervised learning: basic definition, basic regression and classification algorithms (linear, probabilistic, distance based models).
- Statistics and evaluation: estimating parameters and resampling methods (including validation set methods).
- Unsupervised learning: clustering methods (k-means, agglomerative clustering) and basics of dimensionality reduction (PCA and variants).
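To make the clustering item above concrete, here is a minimal sketch of k-means using Lloyd's algorithm in plain Python. It is an illustration only and not the course's reference implementation: the function name `kmeans`, the initialisation from randomly sampled data points, and the two synthetic Gaussian blobs are all assumptions made for the example.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its cluster
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Synthetic data (an assumption for illustration): two well-separated blobs
rng = random.Random(1)
blob_a = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(50)]
blob_b = [(rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)) for _ in range(50)]
data = blob_a + blob_b

centroids, labels = kmeans(data, k=2)
print("centroids:", centroids)
```

The result depends on the initial centroids. On data whose clusters are less clearly separated, it is common to run the algorithm from several initialisations and keep the solution with the smallest within-cluster sum of squares.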
14. Recommended and required literature
-What kind of literature and other materials are read during the course (reading list)?
- Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani: An Introduction to Statistical Learning with Applications in R, 2nd edition. Springer, 2021.
- Additional readings are announced during the course.
-Which works are set reading and which are recommended as supplementary reading?
Parts of the textbook that are required are specified on the course web page.
15. Activities and teaching methods in support of learning
-See the competence map (https://flamma.helsinki.fi/content/res/pri/HY350274).
-Student activities
The course includes lectures, solving exercises, and doing the term project.
-Description of how the teacher’s activities are documented
16. Assessment practices and criteria, grading scale
-See the competence map (https://flamma.helsinki.fi/content/res/pri/HY350274).
-The assessment practices used are directly linked to the learning outcomes and teaching
methods of the course.
Assessment and grading are based on completed exercises and the term project. Any other criteria will be specified on the course web page.