Data Science with Python

  • Data Science Introduction
  • Data Science Toolkit
  • Job outlook
  • Prerequisite, Target Audience
  • Data Science Project Lifecycle – CRISP-DM Model

Statistics Concepts
Random Variable
Types of Random Variables
Central Tendencies – Mean, Mode, Median
Probability, Probability Distribution of Random Variables, PMF, PDF, CDF
Types of RV – Nominal, Ordinal, Interval, Ratio
Variance, Standard Deviation
Normal Distribution, Standard Normal Distribution
Binomial Distribution
Poisson Distribution
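
As a rough illustration of the PMF topic for the Binomial and Poisson distributions, the sketch below evaluates both using only the Python standard library. The function names binomial_pmf and poisson_pmf and the example parameters are assumptions made for this illustration, not part of the course material.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): lam^k * exp(-lam) / k!."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Illustrative values: 4 fair coin flips, and an event rate of 2 per interval.
print(binomial_pmf(2, 4, 0.5))  # probability of exactly 2 heads in 4 flips
print(poisson_pmf(0, 2.0))      # probability of zero events when the mean is 2
```

A PMF sums to 1 over all possible outcomes, which is a quick sanity check for either function.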

Sampling
Inferential Statistics
Sampling Distribution
Central Limit Theorem
Simulation
Null and Alternative Hypothesis
Hypothesis Testing
One-tailed and two-tailed tests, Type I and Type II errors
z-test & t-test
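
The Sampling Distribution, Central Limit Theorem and Simulation topics can be tied together with a small experiment. The sketch below is one possible setup: it assumes an exponential population, a sample size of 50, and 2,000 repeated samples, all chosen only for illustration. The helper name sample_means is hypothetical.

```python
import math
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

# A skewed (exponential) population, assumed here purely for demonstration.
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

means = sample_means(population, sample_size=50, n_samples=2_000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# The CLT predicts the sample means centre on the population mean,
# with spread close to pop_sd / sqrt(sample_size).
print("population mean:      ", round(pop_mean, 3))
print("mean of sample means: ", round(statistics.fmean(means), 3))
print("predicted std error:  ", round(pop_sd / math.sqrt(50), 3))
print("observed std of means:", round(statistics.stdev(means), 3))
```

Even though the population is strongly skewed, a histogram of the sample means should look roughly bell-shaped.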

Introduction to Python, Anaconda & Spyder, Installation & Configuration
Data Structures in Python
List
Tuples
Array in NumPy
Matrices
DataFrame in Pandas

Control Structure & Functions – If-Else, For loop, While loop
Slicing, dicing & filter operations
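
A minimal sketch of the control structures and slicing/filter operations listed above, using a plain Python list. The scores data and the pass mark of 70 are made up for illustration. The same ideas carry over to NumPy arrays and Pandas data frames, though their syntax differs.

```python
scores = [72, 85, 90, 64, 58, 99]  # made-up example data

# Slicing: list[start:stop:step]
first_three = scores[:3]    # [72, 85, 90]
every_other = scores[::2]   # [72, 90, 58]
last_two = scores[-2:]      # [58, 99]

# Filtering with a list comprehension
passed = [s for s in scores if s >= 70]  # [72, 85, 90, 99]

# For loop with if-else
labels = []
for s in scores:
    if s >= 70:
        labels.append("pass")
    else:
        labels.append("fail")

# While loop: count leading scores until the first one below 70
count = 0
while count < len(scores) and scores[count] >= 70:
    count += 1

print(first_three, every_other, last_two)
print(passed, labels, count)
```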

Graphics and Data Visualization libraries in Python
– Plotly
– Matplotlib
– Seaborn
– other useful packages/functions in Python
Exploratory Data Analysis Exercise in Python

Introduction to Machine Learning
Supervised and Unsupervised ML
Parametric/Non-parametric Machine Learning Algorithms
Machine Learning Models
– Linear Regression
– Logistic Regression
– Classification & KNN
– Decision trees
– Random Forest
– Clustering – K-Means & Hierarchical Clustering
– Time Series Analysis
– ARIMA Models
– Support Vector Machine
Model Validation/Cross-validation techniques, Parameter tuning
Model evaluation metrics – MSE, RMSE, R-squared, Adjusted R-squared
Confusion Matrix
Bias and Variance
Underfitting, Overfitting
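
For the regression metrics named above, a small standard-library sketch may help show how they relate to one another. The function name regression_metrics and the sample values are assumptions for illustration. In practice a library such as scikit-learn would typically be used instead.

```python
import math

def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, RMSE, R-squared and Adjusted R-squared for a regression fit."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    mse = ss_res / n
    r2 = 1 - ss_res / ss_tot
    # Adjusted R-squared penalises extra predictors; requires n > n_features + 1.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2, "adj_r2": adj_r2}

# Made-up observed and predicted values from a one-feature model.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.5, 5.5, 7.0, 8.5, 11.5]

metrics = regression_metrics(y_true, y_pred, n_features=1)
print(metrics)
```

RMSE is in the same units as the target, while R-squared and Adjusted R-squared are unitless, with Adjusted R-squared never exceeding R-squared.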

ML Case Studies on
– Regression
– Classification
– Decision Tree
– Random Forest
– Clustering
– Time Series Analysis