Data Science Machine Learning Training Syllabus

Introduction to Data Science

Introduction to Data Science – Introduction to Machine Learning – What is Python – Role of Python in Machine Learning – Installing Python – Python IDEs – Jupyter Notebook installation and usage.

Python Packages

What is a package – Advantages of using packages over core Python – pandas – NumPy – scikit-learn – Matplotlib.

Manipulating data

Selecting rows / observations – Rounding numbers – Selecting columns and fields – Merging data – Data aggregation – Data munging techniques.
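As a rough illustration of the operations listed above, here is a minimal pandas sketch. The tables, column names, and values are hypothetical and invented purely for the example; they are not part of the course material.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
sales = pd.DataFrame({
    "region_id": [1, 1, 2, 2],
    "product": ["A", "B", "A", "B"],
    "revenue": [10.456, 20.123, 30.789, 40.555],
})
regions = pd.DataFrame({"region_id": [1, 2], "region": ["North", "South"]})

# Selecting rows / observations with a boolean mask.
high = sales[sales["revenue"] > 15]

# Selecting columns / fields by name.
subset = sales[["product", "revenue"]]

# Rounding numbers to one decimal place.
rounded = sales["revenue"].round(1)

# Merging data on a shared key column.
merged = sales.merge(regions, on="region_id", how="left")

# Data aggregation: total revenue per region.
totals = merged.groupby("region")["revenue"].sum()

print(totals)
```

A left merge is used here so that every sales row is kept even if a region were missing from the lookup table; other join types may suit other data.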


Probability Basics – What is meant by probability, Types, Odds ratio – Standard Deviation – Data deviation and distribution, Variance. Bias-Variance Tradeoff – Underfitting, Overfitting. Distance metrics – Euclidean Distance, Manhattan Distance. Outlier analysis – What is an Outlier?, Interquartile Range, Box & whisker plot, Upper Whisker, Lower Whisker, Scatter plot, Cook’s Distance. Missing Value treatment – What is NA?, Central Imputation, KNN imputation, Dummification. Correlation – Pearson correlation, positive & negative correlation.
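Two of the topics above, distance metrics and interquartile-range outlier detection, can be sketched in a few lines of NumPy. The points and sample values are hypothetical, and the 1.5 × IQR whisker limits are the common box-plot convention, assumed here rather than taken from the syllabus.

```python
import numpy as np

# Two hypothetical points in 2-D space.
a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])

# Euclidean distance: square root of the summed squared differences.
euclidean = np.sqrt(np.sum((a - b) ** 2))

# Manhattan distance: sum of the absolute differences.
manhattan = np.sum(np.abs(a - b))

# Hypothetical sample with one obviously extreme value.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100])

# Interquartile range from the 25th and 75th percentiles.
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Whisker limits under the assumed 1.5 * IQR convention.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Anything outside the limits is flagged as an outlier.
outliers = values[(values < lower_limit) | (values > upper_limit)]

print(euclidean, manhattan, iqr, outliers)
```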

Machine Learning

Supervised Learning – Linear Regression, Linear Equation, Slope, Intercept, R-squared value, Logistic Regression, Odds ratio, Probability of success, Probability of failure, Bias-Variance Tradeoff, ROC curve.
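To make the slope, intercept, and R-squared terms above concrete, the following is a minimal sketch of simple linear regression using the ordinary least-squares formulas in NumPy. The data points are hypothetical, and a course would more likely use a library such as scikit-learn for this.

```python
import numpy as np

# Hypothetical data, invented purely for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Least-squares slope: covariance of x and y divided by variance of x.
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)

# The intercept makes the fitted line pass through the point of means.
intercept = y_mean - slope * x_mean

# Predictions from the linear equation y = slope * x + intercept.
y_pred = slope * x + intercept

# R-squared: share of the variance in y explained by the fitted line.
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y_mean) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(slope, intercept, r_squared)
```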

Unsupervised Learning – K-Means, K-Means++, Hierarchical Clustering. SVM – Support Vectors, Hyperplanes, 2-D Case, Linear Hyperplane. SVM Kernel – Linear, Radial, Polynomial.

SQL Introduction

Select query – Subquery – Regular Expression Functions and Conditions in SQL – Date and Time in SQL – SQL Privileges