Data Science With Python Online Workshop
Duration : 3 Days (9:30 AM to 5:30 PM)
Date : 3, 4, 5 February 2022
Overview
This hands-on training program on Data Science takes participants through the basics of Python and Statistics before exploring Data Science in depth. It covers exploratory as well as real-world scenarios in Data Science and includes an introduction to Machine Learning.
Prerequisites
Participants should have a basic understanding of Python programming. Some knowledge of Mathematics and Statistics will be helpful.
Course Contents
Introduction to Data Science
- Data Science Introduction
- Data Science Toolkit
- Job outlook
- Prerequisite, Target Audience
- Data Science Project Lifecycle – CRISP-DM Model
Basics of Statistics
- Statistics Concepts
- Random variable
- Types of Random Variables
- Central Tendencies – Mean, Mode, Median
- Probability Distribution of Random Variables – PMF, PDF, CDF
- Types of RV by measurement scale – Nominal, Ordinal, Interval, Ratio
- Variance, Standard Deviation
- Normal Distribution, Standard Normal Distribution
- Binomial Distribution
- Poisson Distribution
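As a rough illustration of the topics above, the following sketch computes the central tendencies, variance and standard deviation of a small sample, and evaluates the Binomial and Poisson PMFs. It uses only the Python standard library; the data values and the helper names binomial_pmf and poisson_pmf are invented for this example and are not part of the course material.

```python
import math
import statistics

# Hypothetical sample, invented purely for illustration.
scores = [2, 3, 3, 5, 7, 8, 8, 8, 10]

mean = statistics.mean(scores)            # arithmetic mean
median = statistics.median(scores)        # middle value of the sorted data
mode = statistics.mode(scores)            # most frequent value
variance = statistics.pvariance(scores)   # population variance
std_dev = statistics.pstdev(scores)       # population standard deviation


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# A CDF is the running sum of the PMF: P(X <= 2) for Binomial(4, 0.5).
binomial_cdf_2 = sum(binomial_pmf(k, 4, 0.5) for k in range(3))

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "std dev:", std_dev)
print("P(X = 2), Binomial(4, 0.5):", binomial_pmf(2, 4, 0.5))
print("P(X <= 2), Binomial(4, 0.5):", binomial_cdf_2)
print("P(X = 0), Poisson(2):", poisson_pmf(0, 2.0))
```

The population versions (pvariance, pstdev) divide by n; statistics.variance and statistics.stdev divide by n - 1 and would be the usual choice when the data is a sample from a larger population.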
Advanced Statistics
- Sampling
- Inferential Statistics
- Sampling Distribution
- Central Limit Theorem
- Simulation
- Null and Alternative Hypothesis
- Hypothesis Testing
- One-tailed and two-tailed tests, Type I and Type II errors
- z-test & t-test
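To make the hypothesis-testing topics above concrete, here is a minimal sketch of a two-tailed one-sample z-test. It assumes the population standard deviation is known (when it is not, a t-test would normally be used instead). The sample values, the hypothesised mean, sigma, the significance level and the function name one_sample_z_test are all assumptions made up for this illustration.

```python
import math
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Return (z statistic, two-tailed p-value) for H0: population mean == mu0.

    Assumes the population standard deviation `sigma` is known.
    """
    n = len(sample)
    standard_error = sigma / math.sqrt(n)
    z = (mean(sample) - mu0) / standard_error
    # Two-tailed: probability of a value at least as extreme in either direction.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical measurements, invented for illustration.
sample = [52, 48, 55, 51, 49, 53, 54, 50, 56, 52]
z, p_value = one_sample_z_test(sample, mu0=50, sigma=3)

alpha = 0.05  # significance level = accepted probability of a Type I error
reject_null = p_value < alpha

print("z:", round(z, 3), "p-value:", round(p_value, 4))
print("reject H0 at alpha = 0.05:", reject_null)
```

For a one-tailed test, the p-value would instead be the area in a single tail, for example 1 - NormalDist().cdf(z) when the alternative hypothesis is that the mean is greater than mu0.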
Python Programming for Data Science (Lab)
- Introduction to Python, Anaconda & Spyder, Installation & Configuration
- Data Structures in Python
- List
- Tuples
- Array in NumPy
- Matrices
- Data frame in Pandas
- Control Structures & Functions – If-Else, For loop, While loop
- Slicing, dicing & filter operations
Graphics and Data Visualization, Exploratory Data Analysis in Python (Lab)
- Graphics and Data Visualization libraries in Python
- Plotly
- Matplotlib
- Seaborn
- Other useful packages/functions in Python
- Exploratory Data Analysis Exercise in Python
Machine Learning Concepts
- Introduction to Machine Learning
- Supervised and Unsupervised ML
- Parametric/Non-parametric Machine Learning Algorithms
- Machine Learning Models
– Linear Regression
– Logistic Regression
– Classification & KNN
– Decision trees
– Random Forest
– Clustering – K-Means & Hierarchical Clustering
– Time Series Analysis
– ARIMA Models
– Support Vector Machine
- Model Validation/Cross-validation techniques, Parameter tuning
- Model evaluation metrics – MSE, RMSE, R-squared, Adjusted R-squared
- Confusion Matrix
- Bias and Variance
- Underfitting, Overfitting
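The regression metrics named above can be sketched in a few lines of plain Python. This is an illustrative example only: the actual and predicted values are invented, the function name regression_metrics is hypothetical, and in practice a library such as scikit-learn would typically be used for these calculations.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Compute MSE, RMSE, R-squared and Adjusted R-squared.

    n_features is the number of predictors used by the model.
    """
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    y_bar = sum(y_true) / n
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)              # total sum of squares

    mse = ss_res / n
    rmse = math.sqrt(mse)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R-squared penalises extra predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {"mse": mse, "rmse": rmse, "r2": r2, "adj_r2": adj_r2}


# Hypothetical actual vs. predicted values from a one-feature model.
y_true = [3, 5, 7, 9, 11]
y_pred = [2.8, 5.1, 7.2, 8.9, 11.3]

metrics = regression_metrics(y_true, y_pred, n_features=1)
for name, value in metrics.items():
    print(name, round(value, 5))
```

Adjusted R-squared is never larger than R-squared when at least one predictor is used, which is why it is preferred when comparing models with different numbers of features.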
Real World Data Science & Machine Learning Case Studies in Python (Lab)
- ML Case Studies on
– Regression
– Classification
– Decision Tree
– Random Forest
– Clustering
– Time Series Analysis