Data Science with Machine Learning (using Python)
(Professional Course) Time: 50 hrs
Module 1 2 hrs
Introduction to Data Science
Data Science Era
Data Science involvement in Industries
Business Intelligence vs Data Science
Data Science Life Cycle
Tools of Data Science
Introduction to Python
Introduction to Machine Learning
Module 2 2 hrs
Introduction to Python Programming
Introduction to Python
Basic Operations in Python
Variable Assignment
Functions: built-in functions, user-defined functions
Condition: if, if-else, nested if-else, else-if
Module 3 4 hrs
Data Structure – Introduction
List: Different Data Types in a List, List in a List
Operations on a List: Slicing, Splicing, Sub-setting
Condition (true/false) on a List
Applying functions on a List
Dictionary: Index, Value
Operations on a Dictionary: Slicing, Splicing, Sub-setting
Condition (true/false) on a Dictionary
Applying functions on a Dictionary
Modules and Packages
NumPy Array: Data Types in an Array, Dimensions of an Array
Operations on an Array: Indexing, Slicing, Splicing, Sub-setting
Condition (true/false) on an Array
Loops: For, While
Shorthand for For
Conditions in shorthand for For
Control statements
Shape Manipulation
Linear Algebra
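As an illustration of the list, dictionary, and shorthand-for topics in this module, a minimal sketch follows. The variable names and sample values are assumptions made for the example and are not taken from the course material.

```python
# Illustrative values only; not taken from the course material.
marks = [72, 85, 90, 64, 58]

# Slicing: the first three elements of the list.
first_three = marks[:3]

# Shorthand for (list comprehension) with a condition:
# keep only the marks that are 70 or higher.
passed = [m for m in marks if m >= 70]

# Dictionary: look up a value by key, then filter entries by a condition.
scores = {"asha": 72, "ravi": 58, "meena": 90}
ravi_score = scores["ravi"]
above_60 = {name: s for name, s in scores.items() if s > 60}

print(first_three)
print(passed)
print(ravi_score)
print(above_60)
```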
Module 4 6 hrs
Python Pandas – Home
Python Pandas – Introduction
Python Pandas – Environment Setup
Introduction to Data Structures
Python Pandas – Series
Python Pandas – DataFrame
Python Pandas – Panel
Python Pandas – Basic Functionality
Function Application
Python Pandas – Reindexing
Python Pandas – Iteration
Python Pandas – Sorting
Working with Text Data
Options & Customization
Indexing & Selecting Data
Python Pandas – Window Functions
Python Pandas – Aggregations
Python Pandas – Missing Data
Python Pandas – GroupBy
Python Pandas – Merging/Joining
Python Pandas – Concatenation
Python Pandas – Date Functionality
Python Pandas – Categorical Data
Python Pandas – Visualization
Module 5 2 hrs
Intro to Statistics
Statistical Inference
Terminologies of Statistics
Descriptive Statistics
Statistical Functions
Measures of Center
Mean
Median
Mode
Measures of Spread
Variance
Standard Deviation
Histogram
Probability
Normal Distribution
Binomial Distribution
Poisson Distribution
Skewness
Bell Curve
Hypothesis Building and Testing
Chi-Square Test
Correlation Matrix
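The measures of center and spread listed in this module can be sketched with Python's standard `statistics` module. The sample data below is an assumption chosen for the example, and the population forms of variance and standard deviation are used; the course may use other libraries or the sample forms instead.

```python
import statistics

# Illustrative sample; not taken from the course material.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of center
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of spread (population variance and standard deviation)
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

print(mean, median, mode, variance, std_dev)
```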
Module 6 2 hrs
Scientific computing with Python
SciPy and its Characteristics
SciPy sub-packages
SciPy sub-packages – Integration
SciPy sub-packages – Optimize
SciPy sub-packages – Linear Algebra
SciPy sub-packages – Statistics
SciPy sub-packages – Weave
SciPy sub-packages – I/O
Module 7 2 hrs
Data Analysis Pipeline
What is Data Extraction
Types of Data
Raw and Processed Data
Data Wrangling
Exploratory Data Analysis
Visualization of Data
Matplotlib
Bar Plot
Histogram Plot
Box Plot
Area Plot
Scatter Plot
Pie Plot
Seaborn
Module 8 2 hrs
Introduction to Machine Learning
Machine Learning Use-Cases
Machine Learning Process Flow
Machine Learning Categories
Module 9 1 hr
Data Preprocessing
Data preparation
Intro to scikit-learn
Module 10 2 hrs
Regression Types Algorithms
Linear Regression
Logistic Regression
Importance of Dimensions
Introduction to Dimensionality
Why Dimensionality Reduction
PCA
Factor Analysis
Scaling dimensional model
Implementation with Case study
Intro to Kaggle and UCI repository
Module 11 8 hrs
Classification
K-nearest neighbors
Metrics
Confusion Matrix
Classification report
Support Vector Machines
Working of SVM
Naive Bayes
Hyperparameter Optimization
Decision Tree Classifier
Entropy
Gini Index
ROC
AUC
Random Forest classifier
Linear Discriminant Analysis
Ensemble Techniques and SVM tuning (Self Paced)
Underfitting & Overfitting
Implementation with Case study
Cross-validation
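As an illustration of the confusion matrix and the metrics derived from it, a minimal pure-Python sketch follows. The labels and predictions are made-up values for the example; in practice a library such as scikit-learn would typically compute these.

```python
# Illustrative binary labels (1 = positive, 0 = negative); not course data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))

# The four cells of the confusion matrix.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

# Metrics that a classification report is built from.
accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, tn, fp, fn)
print(accuracy, precision, recall)
```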
Module 12 2 hrs
Unsupervised learning
Clustering Algorithms
K-Means Clustering
Hierarchical Clustering
Implementation with Case study
Unsupervised Learning (Self Paced) 1 hr
Module 13 7 hrs
NLTK Installation
Tokenize words
Tokenize sentences
Stop words in NLTK
Stemming words with NLTK
Part-of-speech tagging
ChatterBot (Self Paced)
Web Scraping
urllib
BeautifulSoup
TF-IDF Vectorizer
Sentiment analysis
Twitter Sentiment Analysis (Self Paced)
Implementation with Case Studies
Pipelines (Self Paced)
Implementation with Case study
Module 14 2 hrs
Association rule
Association Analysis
Association Rule Parameters
Apriori Algorithm
Market Basket Analysis
Implementation with 1 Case study
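The association rule parameters covered in this module (support, confidence, and lift) can be sketched in plain Python as below. The transactions and the helper function names are hypothetical and chosen only for illustration; this is not the Apriori algorithm itself, only the measures it relies on.

```python
# Illustrative market-basket transactions; not taken from the course material.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`."""
    return support(antecedent | consequent) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to the baseline support of `consequent`."""
    return confidence(antecedent, consequent) / support(consequent)


# Rule under test: {bread} -> {milk}
rule_support = support({"bread", "milk"})
rule_confidence = confidence({"bread"}, {"milk"})
rule_lift = lift({"bread"}, {"milk"})

print(rule_support, rule_confidence, rule_lift)
```

A lift below 1, as in this made-up data, suggests the two items appear together less often than would be expected if they were independent.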
Module 15 5 hrs
Recommendation Engine
Collaborative filtering
Content Based Filtering
Implementation with Case Study
Surprise Library (Self Paced)
Introduction to Artificial Intelligence
TensorFlow library
Keras library
The course covers real-time industrial case studies.
At the end of the course, one of the capstone projects will be assigned.
Projects: 4 hrs
- Customer churn prediction
- Bank fraud Loan Prediction
- Wine Type Prediction
- Titanic dataset
- Marketing channel sales prediction