Basic machine learning concepts
- supervised learning, unsupervised learning, reinforcement learning
- fitting, generalization, overfitting, regularization
- data generating distribution, train/development/test set
Linear regression
- analytical solution
- solution based on stochastic gradient descent (SGD)
Classification
- binary classification via perceptron
- binary classification using logistic regression
- multiclass classification using logistic regression
- deriving sigmoid and softmax functions from the maximum entropy principle
- classification with a multilayer perceptron (MLP)
- naive Bayes classifier
- maximum margin binary classifiers
Kernel methods
- kernelized linear regression
- support vector machines (SVM) and their training with the sequential minimal optimization (SMO) algorithm
Decision trees
- classification and regression trees (CART)
- random forests
- gradient boosting decision trees (GBDT)
Clustering
- K-Means algorithm
- Gaussian mixture model
Dimensionality reduction
- principal component analysis (PCA)
Training
- dataset preparation, design of classification features
- constructing loss functions according to the maximum likelihood estimation principle
- first-order gradient methods (SGD) and second-order methods
- regularization
Statistical testing
- Student's t-test
- chi-squared test
- correlation coefficients
- paired bootstrap test
Used Python libraries
- numpy (representation and manipulation of n-dimensional arrays)
- scikit-learn (construction of machine learning pipelines)
- matplotlib (visualization)
Machine learning has achieved notable success in solving complex tasks in many fields. This course serves as an introduction to basic machine learning concepts and techniques, focusing both on the theoretical foundations and on the implementation and use of machine learning algorithms in the Python programming language.
Considerable attention is paid to applying machine learning techniques to practical tasks, in which students try to devise the best-performing solution.
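To give a flavor of the implementation side, the two linear regression approaches named in the outline (analytical solution and SGD) might be sketched with numpy roughly as follows. This is only an illustrative sketch, not course material: the function names, the synthetic data, and the hyperparameters (learning rate, number of epochs, batch size) are all assumptions chosen for the example.

```python
import numpy as np


def fit_analytical(X, t):
    """Solve the normal equations (X^T X) w = X^T t for the weights w."""
    return np.linalg.solve(X.T @ X, X.T @ t)


def fit_sgd(X, t, learning_rate=0.05, epochs=200, batch_size=10, seed=42):
    """Minimize the mean squared error with minibatch SGD (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Visit the examples in a fresh random order in every epoch.
        permutation = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            batch = permutation[start:start + batch_size]
            # Gradient of one half of the mean squared error on the minibatch.
            gradient = X[batch].T @ (X[batch] @ w - t[batch]) / len(batch)
            w -= learning_rate * gradient
    return w


# Synthetic data (an assumption): t = 2 * x + 1 plus small Gaussian noise.
# The column of ones lets the last weight act as the bias.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
X = np.column_stack([x, np.ones_like(x)])
t = 2 * x + 1 + rng.normal(scale=0.1, size=100)

w_analytical = fit_analytical(X, t)
w_sgd = fit_sgd(X, t)
print("analytical:", w_analytical)
print("SGD:       ", w_sgd)
```

Both estimates should end up close to the true weights (2, 1). The analytical solution is exact for the given data, while SGD only approaches it, at a rate that depends on the chosen learning rate and number of epochs.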