Machine Learning Certification 120 hours
Data Science is one of the highest-paying jobs. Give your career a much-needed analytics boost by learning from industry experts. We provide hands-on experiential learning and capstone projects.
Pick and Choose: You can pick the modules you wish to learn. The price of each individual module is listed in its module details.
Next Batch starts: Dec 12, 2021
Limited no. of seats available
Duration: 120 hours
3-4 hours/week
Mode: Instructor Led Online
Class Format
MACHINE LEARNING CERTIFICATION SYLLABUS
R PROGRAMMING | 10 hours
COURSE DETAILS
Introduction to R and R Studio
Understanding R Data Structures
Vector, List, Matrix, Dataframe
Data Import – Export in R (.CSV, .XLSX, Fixed Width Format File)
Data Manipulation
- Selecting Rows / Observations
- Selecting Columns / Fields
- Merging Data
- Relabeling the Column Names
- Converting Variable Types
- Data Sorting
- Data Aggregation
Apply Family of Functions
Functions and Programming Structures
Charts and Graphs in R
COURSE DURATION
10 HOURS
COURSE FEES
₹ 5000 ( + GST )
PYTHON PROGRAMMING | 10 hours
COURSE DETAILS
Introduction to Python and Anaconda
Spyder and Jupyter Notebook
Understanding Python Data Structures
- List, Tuple, Dictionary, Sets
- Mutable and immutable Objects
Numpy and Pandas Packages in Python
- 1D, 2D, 3D Array
- Series and Dataframe
Data Import – Export using PANDAS
Data Manipulation
- Selecting Rows / Observations
- Selecting Columns / Fields
- Merging Data
- Relabeling the Column Names
- Converting Variable Types
- Data Sorting
- Data Aggregation
Matplotlib and Seaborn packages
- Charts & Graphs
COURSE DURATION
10 HOURS
COURSE FEES
₹ 7000 ( + GST )
MS EXCEL | 04 hours
COURSE DETAILS
Introduction to MS Excel Spreadsheet
Cell Referencing in Excel
Formatting Text
Autofill and Format Painter
Cell Merging
Insert Columns and Rows
if, sumif, countif, sumifs, countifs
vlookup, index, match, offset
Data Validation in Excel
Conditional Formatting
Pivot tables
Freeze Panes
Top 10 short-cuts in Excel
EXCEL WEBINAR LINK
Basic Excel Webinar
Advanced Excel Webinar
COURSE DURATION
04 HOURS
COURSE FEES
₹ 3000 ( + GST )
SQL | 04 hours
COURSE DETAILS
Introduction to SQL
Understanding the concept of Data
Applications – OLTP and OLAP
DDL, DML and DCL
CRUD Operation
CREATE, INSERT, UPDATE, DELETE SQL Queries
SELECT Query
Concept of Normalization and Denormalization
COURSE DURATION
04 HOURS
COURSE FEES
₹ 3000 ( + GST )
STATISTICS | 20 hours
COURSE DETAILS
Introduction to Statistics for Data Science
Types of Variables
Descriptive Statistics – Numerical Methods
Measures of Central Tendency
- Mean, Median, Mode
Measures of Dispersion
- Range, Interquartile Range, Standard Deviation, Variance
Descriptive Statistics – Tabular & Graphical Methods
- Histogram, Line Plot, Bar Plot, Pie Chart
- Box Plot, Scatter Plot
- Frequency Table, Crosstab
Probability Concepts
Distributions
- Normal Distribution
- Binomial Distribution
Central Limit Theorem
Hypothesis Testing
COURSE DURATION
20 HOURS
COURSE FEES
₹ 14000 ( + GST )
UN-SUPERVISED MACHINE LEARNING | 10 hours
COURSE DETAILS
Clustering
- Why Clustering? What is Clustering?
- Measure of Similarity, Distance Measures
- Hierarchical Clustering
- K Means Clustering
- Finding Optimal No. of Clusters
Principal Component Analysis (PCA) & Factor Analysis
- Why PCA? – Dimensionality Reduction
- Factor Analysis
- PCA vs FA
- Eigen Vector and Eigen Value
- Loading Factor
- Principal Components (PC) and PC Score
PROJECTS
- Clustering of Retail Customers
- PCA & FA on Data Scientist student’s data
BLOG
Hierarchical Clustering
K Means Clustering
COURSE DURATION
10 HOURS
COURSE FEES
₹ 7000 ( + GST )
SUPERVISED MACHINE LEARNING - LINEAR REGRESSION | 08 hours
COURSE DETAILS
Introduction to Linear Regression
Assumptions of Linear Regression
Simple Linear Regression
Multiple Linear Regression
Line of Best Fit
Residual Error, SSE
R-Squared & Adj. R-Squared
Correlation & Multi-Collinearity
Variance Inflation Factor
Homoscedasticity & Heteroscedasticity
Variable Transformation and its Importance
PROJECTS
- Build a Linear Regression Model to Estimate Monthly Household Expense
COURSE DURATION
08 HOURS
COURSE FEES
₹ 6000 ( + GST )
SUPERVISED MACHINE LEARNING - LOGISTIC REGRESSION | 10 hours
COURSE DETAILS
Introduction to Logistic Regression
Log Odds Concept and Logistic Function
Development, Validation and Hold-out
Hypothesis Testing
Outlier Treatment & Missing Value Imputation
Information Value
Pattern Detection and Visualization
Variable Transformation
Weight of Evidence
Multi-Collinearity & Variance Inflation Factor (VIF)
Model Development & Validation
Model Performance Measurement
- KS, Rank Order, Lift Chart, AUC-ROC, Gini, Concordance, Hosmer-Lemeshow Goodness of Fit Test
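For a feel of three of these measures, here is a minimal sketch in plain Python. It assumes binary labels (1 = event) and a model score where higher means more likely to be an event. The function name `auc_gini_ks` and the toy data are hypothetical and not taken from the course material.

```python
def auc_gini_ks(labels, scores):
    """Return (AUC, Gini, KS) for binary labels (1 = event) and model scores."""
    events = [s for y, s in zip(labels, scores) if y == 1]
    non_events = [s for y, s in zip(labels, scores) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    # AUC as concordance: share of (event, non-event) pairs in which the
    # event has the higher score; tied pairs count as half.
    concordant = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in non_events
    )
    auc = concordant / (len(events) * len(non_events))
    gini = 2 * auc - 1

    # KS: largest gap between the cumulative share of events and the
    # cumulative share of non-events, walking from the highest score down.
    ranked = sorted(zip(scores, labels), reverse=True)
    ks = 0.0
    cum_events = 0
    cum_non_events = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            cum_events += 1
        else:
            cum_non_events += 1
        # Compare only at the end of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            gap = abs(cum_events / len(events) - cum_non_events / len(non_events))
            ks = max(ks, gap)
    return auc, gini, ks


# Hypothetical toy data: 4 events and 4 non-events.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
auc, gini, ks = auc_gini_ks(toy_labels, toy_scores)
print(auc, gini, ks)
```

The pairwise loop is quadratic in the number of records, so it suits small illustrations only. On real data a library routine would normally be used.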
PROJECTS
- Personal Loans Cross-Sell Model using Logistic Regression Technique
BLOG
Logistic Regression blog series
COURSE DURATION
10 HOURS
COURSE FEES
₹ 10000 ( + GST )
SUPERVISED MACHINE LEARNING - K NEAREST NEIGHBOURS & NAIVE BAYES | 04 hours
COURSE DETAILS
K Nearest Neighbours
- What is KNN?
- KNN Concept and Distance Measures
- Lazy Learning
- KNN Optimization Algorithms
- Ball Tree and KD Tree
- Advantages and Disadvantages
Naive Bayes
- Bayes Theorem
- Naïve Bayes Derivation
- Naïve Bayes Algorithms
- Bernoulli, Multinomial and Gaussian Naïve Bayes
- Advantages and Disadvantages
PROJECTS
- Missing Value Imputation using KNN technique
- Predictive Model Development using Naïve Bayes
COURSE DURATION
04 HOURS
COURSE FEES
₹ 3000 ( + GST )
SUPERVISED MACHINE LEARNING - CLASSIFICATION TREE | 10 hours
COURSE DETAILS
Introduction to Classification Tree
CHAID, CART, C4.5
Greedy Algorithm
Balanced & Unbalanced Data
CART – Gini Gain Calculation
Binary / Multi-way Split
Pruning
Cross-Validation
Overfitting
Model Development & Evaluation
Pros & Cons of Classification Tree Technique
PROJECTS
- Case-Study – Dormant Account Win-back Model
- Classification Tree Model Development on Balanced Dataset
COURSE DURATION
10 HOURS
COURSE FEES
₹ 7000 ( + GST )
SUPERVISED MACHINE LEARNING - BAGGING & BOOSTING | 12 hours
COURSE DETAILS
Bagging – Random Forest
- Concept of Ensemble Modeling
- What is Bootstrapping
- Random Forest Algorithm
- Out of Bag Error
- Tuning the Random Forest Model
- Variable Importance
- Model Evaluation and Performance Measure
Boosting
- What is Boosting
- AdaBoosting Algorithm Explained
- Boosting Model Development
- Hypergrid Tuning
- Model Evaluation and Performance Measure
PROJECTS
- Model Development on Banking Dataset
- Comparing the Model Performance of Boosting and Bagging Model
COURSE DURATION
12 HOURS
COURSE FEES
₹ 8500 ( + GST )
SUPERVISED MACHINE LEARNING - ARTIFICIAL NEURAL NETWORK | 06 hours
COURSE DETAILS
Artificial Neural Network Overview
Artificial NN vs Biological NN
Single / Multi-Layer NN
Neurons & Activation Functions
Cost Function
Backpropagation with Gradient Descent
Delta Rule, Learning Rate
Building an Artificial Neural Network
Model Performance Measures
Model Implementation Strategy
PROJECTS
- Credit Default Model using Keras with Tensorflow
COURSE DURATION
06 HOURS
COURSE FEES
₹ 800 ( + GST )
COMPUTER VISION - WEB SCRAPING, NLP, IMAGE PROCESSING | 20 hours
COURSE DETAILS
Web Scraping
- What is Web Scraping?
- Why Web Scraping?
- Web Scraping Process
- Web Scraping using Selenium, BeautifulSoup, lxml packages
Natural Language Processing
- Python and NLP Text Basics
- Text Mining using Regular Expressions
Image Processing
- Concept of Image as Signal
- Image Processing Basics
- Zooming, Blurring, Smoothing, Gray Scaling, Thresholding, Edge Detection
- Image Processing using Python OpenCV package
PROJECTS
- Web Scraping Google Search results
- Applying Regular Expressions to extract information from Search Results
- Number Plate Recognition using OpenCV and Web Scraping vehicle information from Vahan Database
COURSE DURATION
20 HOURS
COURSE FEES
₹ 15000 ( + GST )
Talk to us
+91 89396 94874
Rajesh Jakhotia
Instructor
Rajesh is an Analytics Professional with 20+ years of experience. He started his analytics career with Fractal Analytics in 2003. He is an Adjunct Faculty at Great Learning.
His past work experience includes working with Fractal Analytics, Sutherland Global Services, Hansa Customer Equity and Positive Integers providing Analytics Consultancy for some of the marquee Indian Banks & NBFCs like HDFC Bank, Axis Bank, Kotak Mahindra Bank, India Infoline.
His expertise includes building Machine Learning Models for Risk Management and Marketing. He has worked on tools like Python, R, SAS, SQL.
He successfully completed the Senior Management Program from IIM-C. He is an Engineering Graduate from V.J.T.I, Mumbai University. He is also an Oracle Certified Associate and holds a Project Management Program certification from PMI.
Capstone Projects
An IT company has more than 100000 employees and a very high attrition rate. The business environment is very competitive, and the cost of replacing an employee is much higher than the cost of retaining an existing one. When a skilled employee resigns, the replacement involves hiring and training costs. There is also some loss of efficiency until the new employee comes up to speed.
Sample data from the IT company has been provided. The company's HR Department is looking for an attrition model that can help identify the employees who are likely to resign. Based on the model, HR will build an employee retention strategy, and they have estimated that they can save more than Rs. 100 Million if they reduce the attrition rate by 0.5%.
As a Data Scientist, our goal is to build an Employee Attrition Model.
A bank in the Middle East would like to build a Credit Default Model for its Home Loans portfolio. The model is an Application Scorecard for Home Loans, and it will be used to evaluate the creditworthiness of future customers applying for home loans.
Data on about 20000 loan customers, with their default status, has been provided. The data is a mix of expats and locals. Demographic details, income details and loan-related parameters have been provided.
You have been assigned the task of building an Application Scorecard for Home Loans using a Logistic Regression model. The probability of default predicted by the model has to be converted into a credit score such that a total score of 600 points corresponds to good/bad odds of 50 to 1, and every 20-point increase in score corresponds to a doubling of the good/bad odds.
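The score scaling described above can be sketched as follows. This is a minimal illustration using the common "points to double the odds" scaling, score = offset + factor × ln(odds). The constants 600, 50 and 20 come from the brief. The function and variable names are hypothetical, and taking good/bad odds as (1 − PD) / PD is an assumption.

```python
import math

BASE_SCORE = 600   # score at the anchor odds (from the brief)
BASE_ODDS = 50     # good/bad odds at the anchor score (from the brief)
PDO = 20           # points needed to double the odds (from the brief)

# Doubling the odds adds FACTOR * ln(2) points, which must equal PDO.
FACTOR = PDO / math.log(2)
# OFFSET is chosen so that odds of 50:1 map to exactly 600 points.
OFFSET = BASE_SCORE - FACTOR * math.log(BASE_ODDS)


def score_from_odds(good_bad_odds):
    """Map good/bad odds to a credit score."""
    if good_bad_odds <= 0:
        raise ValueError("odds must be positive")
    return OFFSET + FACTOR * math.log(good_bad_odds)


def score_from_pd(prob_default):
    """Map a predicted probability of default to a credit score.

    Assumes good/bad odds = (1 - PD) / PD.
    """
    if not 0.0 < prob_default < 1.0:
        raise ValueError("probability of default must be strictly between 0 and 1")
    return score_from_odds((1.0 - prob_default) / prob_default)


print(round(score_from_odds(50), 2))    # anchor point
print(round(score_from_odds(100), 2))   # odds doubled
print(round(score_from_pd(0.05), 2))    # 5% predicted default probability
```

With these constants, odds of 50 to 1 map to 600 points and odds of 100 to 1 map to 620 points, matching the two conditions in the brief.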
MyBank wishes to develop a Direct Marketing channel by cross-selling various banking products and services to its existing customer base. The bank executed a pilot campaign to sell personal loans to its deposit account holders. The campaign offer was communicated to customers through email, SMS, and direct mailers.
Customers were incentivized to respond with an interest rate 1% lower than the market rate, along with a processing fee waiver if the customer availed of the loan within 15 days.
The demographic and behavioural variables, along with the responder / non-responder status for the campaign, have been provided. You have been assigned the task of building a Predictive Model to find profitable segments for cross-selling personal loans. Along with the model, you must provide the model implementation and deployment strategy for future campaigns.
Sample Certificate
Testimonials
We had invited Mr Rajesh from K2 Analytics to conduct a workshop on Machine Learning and R Programming. The workshop was very well appreciated by all the participants. We are thankful for your time and the knowledge shared with us. I would like to rate the training 5 out of 5 for the training quality, content and the case-study way of explaining the topic which struck the right chord with the audience who were from the First Year and Second Year of Engineering. Thanks, k2analytics.
Within 2 months of Machine Learning course commencement, my perspective of looking at data had changed drastically. It helped me to present my existing reports and dashboards with insightful information. It is truly said "If you don't know the business, data can teach you." Complex terms were explained in a very elegant and simpler way to make it very easy to understand. The industry experience regularly shared by the trainer helps a lot. Many thanks to K2 Analytics!
I think joining K2 has been one of the best decisions I have made in my career. Rajesh sir is a very passionate instructor with immense knowledge of the most in-demand domain of this era, and he has great teaching skills; he keeps it simple for us to understand any complex concept. I joined with level-0 analytics skills, but now, after the Machine Learning with R sessions, I think I am ready to move into the analytics domain. I would highly recommend K2 Analytics to those who aspire to make a career in Analytics.
3000+ STUDENTS POSITIVELY IMPACTED
120+ HOURS OF TRAINING CONTENT
FAQs
What is the median salary of a Data Scientist in India?
According to the report, the median salary being offered for analytics jobs in India is INR 11.5 lakhs/annum.
What is Data Science?
Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data.
Is Data Science a good career option?
A big yes: Data Science is a good career option. The U.S. Bureau of Labor Statistics reports that growing demand for data science will create 11.5M job openings by 2026. According to IBM, demand for Data Scientists was projected to increase by up to 28% by 2020.
What is the best way to learn Data Science as a beginner?
Make sure you are guided by experienced professional faculty in Data Science.