As our data generation capabilities fast outpace our ability to analyze and make sense of the data, statistical inference and machine learning methods are becoming invaluable across the sciences. This is a course on the fundamentals of statistical inference, machine learning, and data science. The emphasis will be on learning the methods from first principles rather than using them as black boxes. We will use real biological examples wherever possible. Note that this is not a course on classical statistics (e.g., hypothesis testing is not covered).

**Syllabus:**

- Basics of probability: axioms, conditional probability, random variables, expectations, standard probability distributions
- Methods for sampling from a distribution
- Introduction to Markov chains
- Monte Carlo sampling
- Least squares and linear regression
- Constrained optimization
- Bias-variance tradeoff and variable selection (ridge regression and lasso)
- Linear mixed effects models
- Maximum likelihood and expectation maximization
- Bayesian inference
- MCMC and Gibbs sampling
- Bayesian model selection
- Dimensionality reduction and clustering
- Supervised learning: SVMs, neural networks, deep networks
- Genetic algorithms
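To give a flavor of the first-principles approach, many of these topics can be implemented in a few lines of code. The sketch below (illustrative only, not actual course material) uses Monte Carlo sampling to estimate π from uniform random points in the unit square:

```python
import random

def estimate_pi(n_samples=100_000, seed=0):
    """Estimate pi by drawing points uniformly in the unit square and
    counting the fraction that land inside the quarter circle x^2 + y^2 <= 1.
    That fraction approximates pi/4."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4 * inside / n_samples

print(estimate_pi())  # close to 3.14159 for large n_samples
```

The estimate fluctuates from run to run; its standard error shrinks as 1/√n, which is exactly the kind of behavior the course examines from first principles.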

**Course structure:** Lectures + weekly programming/maths assignments. A term project, in which students implement one of the methods on a biological example, is due in the last month of the course.

**Prerequisites:** Basic programming skills in Python or R; mathematics up to class 12.

**Evaluation:** 6 homework assignments + 1 end-term project.

**Course outcome:** Familiarity with the basic concepts of statistical inference, and the ability to choose and implement methods of the above kinds in one's own research.

**Venue**: SAFEDA

**Instructors**: Shruthi Viswanath, Shaon Chakrabarti
