## Data Science with Python

#### Introduction to Data Science

• Data Science Introduction
• Data Science Toolkit
• Job Outlook
• Prerequisites, Target Audience
• Data Science Project Lifecycle – CRISP-DM Model

#### Basics of Statistics

Statistics Concepts
Random variable
Types of random variables
Measurement scales of variables – Nominal, Ordinal, Interval, Ratio
Central Tendencies – Mean, Median, Mode
Dispersion – Variance, Standard Deviation
Probability, Probability Distribution of random variables – PMF, PDF, CDF
Normal Distribution, Standard Normal Distribution
Binomial Distribution
Poisson Distribution

Sampling
Inferential Statistics
Sampling Distribution
Central Limit Theorem
Simulation
Null and Alternative Hypothesis
Hypothesis Testing
One-tailed and two-tailed tests, Type I and Type II errors
z-test & t-test
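
The Central Limit Theorem and Simulation topics can be illustrated with a short sketch. It uses only the Python standard library; the Uniform(0, 1) population, the sample size of 30, the 2000 repetitions, and the helper name `sample_means` are assumptions chosen for illustration, not part of the course material.

```python
import random
import statistics


def sample_means(n_samples, sample_size, seed=0):
    """Draw repeated samples from a Uniform(0, 1) population and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.random() for _ in range(sample_size))
        for _ in range(n_samples)
    ]


means = sample_means(n_samples=2000, sample_size=30)

# Uniform(0, 1) has mean 0.5 and variance 1/12, so the CLT predicts that the
# sample means cluster around 0.5 with standard error sqrt(1/12) / sqrt(30).
expected_se = (1 / 12) ** 0.5 / 30 ** 0.5

print(f"mean of sample means:         {statistics.fmean(means):.3f}")
print(f"observed std of sample means: {statistics.stdev(means):.3f}")
print(f"CLT-predicted standard error: {expected_se:.3f}")
```

Plotting a histogram of `means` would show the roughly bell-shaped sampling distribution even though the underlying population is flat.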

#### Python Programming for Data Science (Lab)

Introduction to Python, Anaconda & Spyder, Installation & Configuration
Data Structures in Python
– Lists
– Tuples
– Arrays in NumPy
– Matrices
– DataFrames in pandas

Control Structures & Functions – if-else, for loop, while loop
Slicing, dicing & filtering operations
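
As a small taste of the lab topics above, the following sketch shows list and tuple basics, slicing, and a loop-based filter using only built-in types. The sample values and the helper name `filter_at_least` are made up for illustration; the NumPy and pandas equivalents are not shown here.

```python
# Hypothetical sample data for illustration only.
scores = [72, 88, 95, 61, 79, 84]   # list: ordered and mutable
point = (3, 4)                      # tuple: ordered and immutable

first_three = scores[:3]            # slicing: elements at index 0, 1, 2
every_other = scores[::2]           # slicing with a step of 2
last_two = scores[-2:]              # negative index counts from the end


def filter_at_least(values, threshold):
    """Return the values that are >= threshold, using a for loop and an if test."""
    kept = []
    for v in values:
        if v >= threshold:
            kept.append(v)
    return kept


passed = filter_at_least(scores, 80)
x, y = point                        # tuple unpacking

print(first_three, every_other, last_two, passed, x + y)
```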

#### Graphics and Data Visualization, Exploratory Data Analysis in Python (Lab)

Graphics and Data Visualization libraries in Python
– Plotly
– Matplotlib
– Seaborn
– other useful packages/functions in Python
Exploratory Data Analysis Exercise in Python

#### Machine Learning Concepts

Introduction to Machine Learning
Supervised and Unsupervised ML, Parametric/Non-parametric Machine Learning Algorithms
Machine Learning Models
– Linear Regression
– Logistic Regression
– Classification & KNN
– Decision trees
– Random Forest
– Clustering – K-Means & Hierarchical Clustering
– Time Series Analysis – ARIMA Models
– Support Vector Machine
Model Validation/Cross-validation techniques, Parameter tuning
Model evaluation metrics – MSE, RMSE, R-squared, Adjusted R-squared
Confusion Matrix
Bias and Variance
Underfitting, Overfitting
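
The regression metrics listed above can be computed by hand in a few lines. This is a minimal sketch using the standard definitions; the function name `regression_metrics`, the `n_features` parameter, and the sample values are assumptions for illustration, and in practice a library such as scikit-learn would typically be used instead.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, RMSE, R-squared, and adjusted R-squared for paired values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    r2 = 1 - ss_res / ss_tot
    # Adjusted R-squared penalizes R-squared for the number of predictors used.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2, "adj_r2": adj_r2}


# Hypothetical observed and predicted values from a one-feature model.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.5, 5.5, 7.0, 8.5, 11.5]

metrics = regression_metrics(y_true, y_pred, n_features=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```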

#### Real World Data Science & Machine Learning Case Studies in Python (Lab)

ML Case Studies on
– Regression
– Classification
– Decision Tree
– Random Forest
– Clustering
– Time Series Analysis