**PG Certification in Data Science:-** This postgraduate certificate programme in data science accepts both newcomers and learners with varying levels of expertise, preparing them for a wide range of interesting and sought-after jobs.

Gain in-depth knowledge of the most widely used coding languages and acquire essential skills in core programming, mathematics, statistics, data analysis, machine learning, and Python-based data science. The certificate of completion is awarded by The Scholar, and the course is delivered through the NuLearn platform.

## PG Certification in Data Science

Experts in the field developed the curriculum for the PG Certification in Data Science, which gives students a thorough understanding of real-world scenarios. Students gain practical knowledge that they can apply to their careers once they start working.

Industry professionals with over 8 years of experience in the field teach this PG Certification in Data Science course. The instructors for this online certification course on the NuLearn website are Mr. Rahul Tiwary, Mr. Rakesh Sharma, Mr. Rajesh Kumar, Mr. Deepak Arora, Miss Ritu Shukla, and Mr. Muzzammi.

### PG Certification in Data Science Details

| Article Name | PG Certification in Data Science by The Scholar via NuLearn |
| --- | --- |
| Year | 2024 |
| Category | Courses |
| Official site | www.nulearn.in |

### The Highlights

- 100% job assistance
- 10 months course duration
- Course completion certificate
- Coverage of the most in-demand tools
- One-to-one industry mentors
- Building project portfolio
- AI-based resume building
- 24/7 LMS access
- 150 hours of in-depth live online learning
- Industry mentorship

### Program Offerings

- LinkedIn profile
- Assignments
- Projects
- Mock interviews by experts
- Resume building by experts
- Dedicated customer support

## Course And Certificate Fees

### Fees Information

The fee for the online PG Certification in Data Science is Rs. 1,40,000 + GST. The payment is divided into five parts.

| Description | Amount |
| --- | --- |
| Course Exam Fee | Rs. 1,40,000 |

### Certificate Availability

- Yes

### Certificate Providing Authority

- The Scholar

## Eligibility Criteria

Enrollment in this postgraduate certification in data science course is open to any graduate with a minimum score of 60%; prior programming experience or expertise is not required.

### Certification Qualifying Details

The certificate is issued on completion of the postgraduate certification programme in data science offered by NuLearn and The Scholar. Only enrolled students are eligible; to earn the certificate, they must pass the admission exam and finish the course of study.

## What you will learn

Students of all experience levels, including beginners, can benefit from this PG certification in data science curriculum, which also equips them for a variety of interesting and in-demand occupations. Acquire essential expertise in mainstream data science programming with Python, mathematics, statistics, data analysis, and machine learning, along with specialised, employable skills in the most widely used coding languages.

Professionals in the field created this curriculum to give students a thorough grasp of real-world situations, so that they can absorb as much as possible and apply it in the workplace.

## Who it is for

Students interested in the field of data science are welcome to enroll in this course, as are professionals and executives at middle and senior levels from for-profit and non-profit organisations operating in all industries and trades. People in the following roles are also eligible:

- Data scientist
- Lecturer
- Software developer
- IT Engineer

## Steps to Apply for Admission

Applicants can take the following steps to gain admission to the PG Certification in Data Science course:

- Follow the official URL: https://www.nulearn.in/courses/pg-certification-in-data-science
- Participants are required to register on the NuLearn website.
- Admission is confirmed only after the participant has registered and logged in.

## The syllabus

## Module 1: SQL

### Introduction to Data Warehouse

- What is data warehouse
- Data warehouse architecture
- Top down approach
- Bottom up approach
- Data lake architecture
- Data lake and warehouse architecture
- Data lake vs. Data warehouse
- OLTP vs. OLAP

### Facts and Dimensions

- Additive Fact
- Factless Fact
- Non Additive Fact
- Semi Additive Fact
- Role playing dimension
- Junk dimension
- Degenerate Dimension
- Conformed Dimension
- Slowly Changing Dimension

### Normal Forms

- Normal Forms 1NF
- Normal Forms 2NF
- Normal Forms 3NF
- Normal Forms 4NF

### Introduction to Constraints – Part 1

- Naming Rules
- Data Types
- Default Constraint
- Primary Key Constraints
- Foreign Key Constraints

### Introduction to Constraints – Part 2

- Not Null Constraint
- Unique Constraint

### Types of SQL Commands

- Data Definition Language
- Data Query Language
- Data Manipulation Language
- Data Control Language
- Transaction Control Language

### Operators and Clauses

- Concatenation
- Where Clause
- Like Clause
- Between and
- Is null
- Order by

### Single Row Function – Part 1

- Character
- Number
- Date

### Single Row Function – Part 2

- Conversion
- General Functions
- NVL
- NVL2
- COALESCE
- Case When
- Decode

### Multi Row Function

- Avg
- Count
- Min
- Max
- Stddev
- Sum
- Variance

### Joins in SQL – Part 1

- Different types of joins
- Inner
- Left Outer Join
- Right Outer Join
- Full Outer Join

### Joins in SQL – Part 2

- Self-Join
- Non Equi Join
- Cross Joins

### Joins in SQL – Part 3

- Self-Join

### Single Row SubQuery

- Detailed understanding of single row subquery

### Multiple Rows SubQuery

- Detailed understanding of multiple row subquery
- In clause
- Any Clause
- All Clause

### Multiple Column Sub Queries

- Non Pairwise
- Pairwise
- Correlated Sub Queries

### Exists and With Clauses

- Detailed understanding of Exists and With Clauses

### Set Operators in SQL

- Union
- Union All
- Intersection
- Minus

### Data Manipulation

- Insert Statements
- Update Statements
- Delete Statements
- Truncate

### Views in SQL

- Simple and Complex views
- Update through views

### Sequences, Synonyms and Index

- Sequences and their uses
- Synonyms and their uses
- Indexes and their benefits and drawbacks

### Alter Statements in SQL

- How to add the constraints
- Alter statements
- System tables

### Multi Table Inserts

- Unconditional Inserts
- Conditional Inserts
- Condition First Insert
- Pivoting Insert

### Merge in SQL

- Update using Merge
- Insert using Merge

### Hierarchical Data Fetching

- Connect by Prior

### Regular Expressions

- REGEXP_LIKE
- REGEXP_INSTR
- REGEXP_SUBSTR
- REGEXP_REPLACE
- REGEXP_COUNT

### Analytics Functions – Part 1

- Row Number
- Rank
- Dense Rank

### Analytics Functions – Part 2

- Row Number
- Rank
- Dense Rank

### Materialized Views

- Importance of Materialized Views
- Complete on Demand
- Complete on Commit
- Fast on Demand
- Explain Plan

### SQL Interview Prep – Advanced Select

- Understanding of advanced Select

### SQL Interview Prep – Joins

- Understanding of Joins and advanced joins

## Module 2: Python

### Basics of Python (Python History and Features)

- Python History and Features

### Basics of Python (Data Types Part 1)

- Data Types in Python
- Integer
- Float
- String
- Boolean
- Complex Numbers

### Basics of Python (Data Types Part 2)

- Memory Allocation in Python
- Lists
- Tuples
- Range
- Sets
- Frozen sets
- Dictionaries

### Basics of Python (Lists in Python)

- Detailed explanation of Lists and their functions

### Basics of Python (Tuples, Dictionaries and Sets)

- Detailed explanation of Tuples, Dictionaries and Sets

### Operators in Python – Part 1

- Arithmetic Operators
- Relational Operators
- Equality Operators
- Logical Operators
- Ternary Operators

### Operators in Python – Part 2

- Bitwise Operators
- Compound Assignment Operators
- Special Operators
- Identity Operators
- Membership Operators
- Precedence in Python
- Math Modules

### Input ways in Python

- Input statement
- Eval
- Command Line Arguments

### Output ways in Python

- Print and its formats

### Flow Control – Part 1

- if, elif and else
- for loops

### Flow Control – Part 2

- for loops
- while loops
- break
- continue
- pass

### String Operations in Python

- Various string-related functions and their applications

### String Operations in Python – More Examples

- More examples on String operations

### Functions in Python – Part 1

- Functions and their benefits
- Types of arguments
- Positional arguments
- Keyword arguments
- Default arguments
- Variable length arguments

### Functions in Python – Part 2

- Types of variables
- Recursive functions
- Lambda functions
- filter

### Functions in Python – Part 3

- map
- reduce
- Memory allocation in functions
- Function Aliasing
- Nested functions

### Exception Handling – Part 1

- Exception Handling and its various scenarios

### Exception Handling – Part 2

- Various syntaxes of exception handling

### Exception Handling – Example

- Custom exception handling example

### File Handling – Part 1

- Modes to open the files
- Various properties of file object
- How to write and read data from files
- With statement
- seek and tell commands

### File Handling – Part 2

- seek and tell commands
- os module and its functions

### Binary files

- Write data in csv files
- How to handle the files and folders in Python using os modules
- pickling and unpickling

### Regular Expressions – Part 1

- re module in Python and its functions

### Object Oriented Programming – Part 1

- Introduction to OOPs
- Constructor in OOPs

### Object Oriented Programming – Part 2

- Self
- Instance variables
- Static variables

### Object Oriented Programming – Part 3

- Static variables
- Local variables

### Object Oriented Programming – Part 5

- Passing members from one class to another class
- Inner classes

### Object Oriented Programming – Part 6

- Garbage Collector
- Destructor

### Object Oriented Programming – Part 7

- Inheritance
- Single Inheritance
- Multilevel Inheritance
- Hierarchical Inheritance
- Multiple Inheritance

### Object Oriented Programming – Part 8

- Composition and Aggregation
- Hybrid Inheritance

### Object Oriented Programming – Part 9

- super in OOPs

### Object Oriented Programming – Part 10

- Polymorphism in OOPs

### Object Oriented Programming – Part 11

- Method Overloading
- Constructor Overloading
- Method Overriding

### NumPy – Part 1

- Understanding of NumPy and its various functions
- Understanding of arrays and matrices

### NumPy – Part 2

- Discussion on various NumPy functions

### NumPy – Part 3

- Discussion on various NumPy functions

### Pandas – Part 1

- Pandas – An introduction
- Understanding of Data Frames
- Joins between Data Frames
- Concat

### Pandas – Part 2

- Slicing and working with CSV files, and understanding various pandas functions

### Pandas – Part 3

- Understanding of various functions in Pandas

### Pandas – Part 4

- Understanding of various functions in Pandas

### Pandas – Part 5

- Understanding of various functions in Pandas

### dfply understanding

- Detailed understanding of dfply

### Data Visualization of Matplotlib

- Creating various plots using Matplotlib

### Data Visualization of Seaborn

- Creating various plots using Seaborn

### Data Preprocessing using SKLearn – Missing Value Imputation

- Traditional methods of imputation using mean, median and mode
- KNN Imputation

### Data Preprocessing using SKLearn – Outlier Treatment

- Outlier Treatment

### Data Preprocessing using SKLearn – Feature Scaling – Theory

- Standardization
- MinMax Scaler
- Robust Scaler
- MaxAbs Scaler
- Power Transformer
- Quantile Transformer
- Normalization
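
To make two of the techniques above concrete, here is a minimal sketch of standardization, z = (x − mean) / std, and min-max scaling, x' = (x − min) / (max − min), using only the Python standard library. The function names and the sample `heights_cm` values are assumptions made for illustration. The course itself uses scikit-learn, which is not used here.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical sample data for illustration
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]

z_scores = standardize(heights_cm)
scaled = min_max_scale(heights_cm)

print(z_scores)
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Both helpers assume the input has at least two distinct values; a constant column would cause a division by zero and needs separate handling.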

### Data Preprocessing using SKLearn – Feature Scaling – Hands-on

- Hands-on practice of all the techniques using Python

### Python connection with Oracle – Part 1

- Create and Insert using Python

### Python connection with Oracle – Part 2

- Alter statement, Reading Data from csv and upload in Oracle

### Exploratory Data Analysis – Part 1

- End to end understanding of how to explore the data before applying Machine Learning

### Exploratory Data Analysis – Part 2

- End to end understanding of how to explore the data before applying Machine Learning

## Module 3: Tableau

### Tableau – An introduction

- Understanding Tableau and its products
- Tableau Public
- Tableau Desktop
- Tableau Reader
- Tableau Online
- Tableau Server
- Tableau Prep
- Tableau Mobile

### Tableau – Creating first chart

- Understanding the Calculated Field concept
- If and Else statements
- Left Function
- Making data ready for visualization using Tableau Prep

### Tableau – Mode of connection

- Understanding the concept of Live and Extract connections
- Data Source filters
- Union and its usage

### Tableau – Data from Heterogeneous data sources

- Dive deep into the concept of Joins and how to achieve them in Tableau
- Applying the data joining using Tableau Prep
- Dive deep into the concept of Cross database joins
- Applying Cross database joins using Tableau Prep
- Understanding of Data Blending
- Understanding the concept of Primary and Secondary data sources

### Tableau – Data Blending in Tableau

- Dive Deep into Data Blending
- Understanding the usage of Data Joining, Cross Database Join and Data Blending
- Solving the end-to-end problem using Data Blending

### Tableau – Bins, Parameters and creating our first dashboard

- Creating the map view and understanding the idea of maps
- Creating the histogram and dynamic histogram using parameters
- Creating Donut chart
- Creating the stacked bar chart
- Creating the interactive dashboard

### Tableau – Dashboard and Story

- Difference between Tiled and Floating options
- Adding the buttons to download the dashboard as ppt, image and pdf
- Creating the story
- Understanding the difference between Dashboard and Story

### Tableau – Time series analysis

- Understanding the concepts of Date dimension
- Difference between Discrete and continuous date field
- What is the relation between aggregation and Granularity?
- How to apply normal filters or Quick Filters
- Options in the Normal Filters or Quick Filters
- Understanding Context Filters

### Tableau – Sets and Parameters for more dynamic visualization

- Create a dynamic scatter plot using the concepts of Sets and Parameters
- Understanding the concept of sets
- Understanding the concept of Static Sets
- Understanding the concept of Dynamic Sets
- How to tag parameters with sets to make them dynamic sets
- How to tag parameters with reference lines

### Tableau – Combined Sets and Formatting of the visuals

- Understanding the concept of Combined sets
- Formatting the visuals
- Adding sets as filters

### Tableau – Animation in Tableau – Data Prep

- Formatting the data as per the requirement using Tableau Prep
- Formatting the data as per the requirement using Tableau Desktop
- Advantages and disadvantages of using Tableau Desktop as a data preparation tool

### Tableau – Animation in Tableau – Visualization

- Creating the animated scatter plot using Animation concept
- How to animate the chart using the Pages shelf
- Which data to choose as Primary data source
- Formatting the plot
- Sorting the secondary data source
- Adding show history option in the animation

### Tableau – Level of Details (Include)

- Understanding the concept of Include Function
- Creating the visual using Include function in Tableau

### Tableau – Level of Details (Include and Exclude)

- Implementing the include function on the real time use case
- Creating the visual using Include function in Tableau
- Understanding the concept of Exclude and where to use it
- Implementing the exclude function on the real time use case
- Creating the visual using exclude function in Tableau

### Tableau – Level of Details (Fixed)

- Understanding the concept of Fixed and where to use it
- Implementing the Fixed function on the real time use case
- Creating the visual using Fixed function in Tableau

### Tableau – Level of Details (Real time use cases)

- Customer Order Frequency
- Cohort Analysis
- Daily Profit KPI
- Percentage of Total
- New Customer Acquisition
- Average of top deals by city
- Actual vs. Target
- Value on the last day of the period
- Return Purchase by cohort
- Percentage difference from average across a range
- Relative Period Filtering
- User login frequency
- Proportional Brushing
- Annual Purchase frequency by customer cohort
- Comparative sales analysis

### Tableau – Table Calculation – Data Prep Part 1

- Preparing the data using Tableau Prep
- Understanding the datetime dimension and its purpose

### Tableau – Table Calculation – Data Prep Part 2

- Preparing the data using Tableau Desktop
- Pivoting using Tableau Prep
- Pivoting using Tableau Desktop

### Tableau – Table Calculation – Functions Part 1

- Running Sum
- Difference
- Percentage Difference
- Understanding Table Across, Table Down
- Fixing the calculation by dimension
- Understanding the difference between Table calculation and Calculated Field
- Dive deep on the functions Lookup, ZN

### Tableau – Table Calculation – Functions Part 2

- Moving Average
- Creating the logic for Quality assurance in the output

## Module 4: Statistics

### Statistics – An Introduction

- Types of Data
- Analyzing Categorical Data
- Reading Pictographs
- Reading Bar Graphs
- Reading Pie Chart
- Two way Frequency Table and Venn Diagram
- Marginal and Conditional Distribution
- Displaying Quantitative data with Graphs
- Frequency Table and Dot plots
- Creating Histogram

### Measure of Central Tendency

- Mean
- Median
- Mode

### Measure of Spread – Part 1

- Range
- Variance
- Standard Deviation
- Interquartile Range
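
The measures of central tendency and spread listed above can be computed with Python's built-in `statistics` module. The sketch below is only an illustration: the sample `data` is made up, and the population formulas (`pvariance`, `pstdev`) are used. Quartile conventions differ between tools, so the interquartile range from `statistics.quantiles` (default "exclusive" method) may not match other software exactly.

```python
import statistics as st

# Hypothetical sample data for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = st.mean(data)                # arithmetic average
median = st.median(data)            # middle value of the sorted data
mode = st.mode(data)                # most frequent value

data_range = max(data) - min(data)  # largest minus smallest value
pop_variance = st.pvariance(data)   # average squared deviation from the mean
pop_std = st.pstdev(data)           # square root of the population variance

# Quartiles with the default "exclusive" method; IQR = Q3 - Q1
q1, q2, q3 = st.quantiles(data, n=4)
iqr = q3 - q1

print(mean, median, mode)                # 5 4.5 4
print(data_range, pop_variance, pop_std) # 7 4 2.0
print(iqr)                               # 2.5
```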

### Measure of Spread – Part 2

- Population
- Sample
- Difference in formula for Population parameter and Sample statistic
- Box and Whisker Plot
- Left Skewed and Right Skewed
- Outlier detection
- Mean Absolute Deviation

### Regression – Part 1

- Exploring bivariate numerical data
- Slope of a line
- Intercept

### Regression – Part 2.1

- Covariance
- Covariance Matrix

### Regression – Part 2.2

- Karl Pearson Correlation Coefficient

### Regression – Part 2.3

- Spearman Rank Correlation Coefficient
- Various scenarios

### Regression – Part 3

- Residuals

### Regression – Part 4

- R-Squared and RMSE

### Normal Distribution – Part 1

- Understanding Normal Distribution
- Z-Score concept
- Empirical Rule
- Density Curve

### Normal Distribution – Part 2

- More Examples in Z-score and Z-table

### Normal Distribution – Part 3

- More Examples in Z-score and Z-table
- Exploring Gaussian distribution equation

### Symmetric Distribution, Skewness, Kurtosis and KDE

- Symmetric distribution
- Skewness
- Kurtosis
- KDE

### Probability – An Introduction

- Simple Probability
- Probability – With Counting Outcome
- Sample space and its subset
- Intersection and Union of sets
- Relative complement or Difference between sets
- Universal Set and Absolute complement
- Subset, Strict Subset and Superset
- Set operations together
- Difference between theoretical and experimental probability

### Probability – Part 2

- Statistical Significance of experiment
- Probability with venn diagram
- Addition rule of probability
- Sample space for compound events
- Compound probability of independent events
- Independent events

### Probability – Part 3.1

- Dependent Probability
- Conditional Probability

### Probability – Part 3.2

- Conditional Probability – Tree Diagram
- Conditional Probability and Independence

### Permutations and Combinations

- Permutation Formula
- Understanding of Zero factorial
- Factorial and Counting
- Combinations
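
As a quick check of the counting formulas in this section, the sketch below uses Python's `math` module (Python 3.8 or later) to compute permutations, nPr = n! / (n − r)!, and combinations, nCr = n! / (r! (n − r)!). The values n = 5 and r = 3 are arbitrary choices for illustration.

```python
import math

n, r = 5, 3  # arbitrary example values

# Zero factorial is defined as 1 (there is exactly one way to arrange nothing)
zero_factorial = math.factorial(0)

# Ordered selections of r items out of n
permutations = math.perm(n, r)

# Unordered selections of r items out of n
combinations = math.comb(n, r)

# The same results from the factorial formulas
perm_from_formula = math.factorial(n) // math.factorial(n - r)
comb_from_formula = perm_from_formula // math.factorial(r)

print(zero_factorial)  # 1
print(permutations)    # 60
print(combinations)    # 10
```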

### Combinatorics and Probability

- Probability using combinations
- General form

### Random Variables

- Continuous and Discrete and its distribution
- Mean (Expected value) of a discrete random variable
- Variance and Standard deviation

### Random Variables Variance

- Intuition for why independence matters for variance of sum
- Analyzing distribution of sum of two normally distributed random variables

### Binomial Distribution – Part 1

- Binomial Variables
- Recognizing Binomial variables
- 10% Rule of assuming independence
- Binomial Distribution

### Binomial Distribution – Part 2

- Binomial Probability
- Binompdf and Binomcdf functions
- Mean and variance of Bernoulli distribution
- Expected value and variance of Binomial distribution
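
The "Binompdf" and "Binomcdf" functions mentioned above are calculator functions. A rough Python equivalent can be sketched from the binomial formula P(X = k) = C(n, k) p^k (1 − p)^(n − k). The function names `binom_pmf` and `binom_cdf` and the coin-flip numbers are assumptions chosen for this illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Hypothetical example: 10 fair coin flips
n, p = 10, 0.5

exactly_five = binom_pmf(5, n, p)   # 252 / 1024
at_most_five = binom_cdf(5, n, p)

expected_value = n * p              # mean of a binomial distribution
variance = n * p * (1 - p)          # variance of a binomial distribution

print(exactly_five)              # 0.24609375
print(expected_value, variance)  # 5.0 2.5
```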

### Geometric Random Variables – Part 1

- Geometric random variables – Introduction
- Probability of a geometric random variable
- Cumulative geometric probability

### Geometric Random Variables – Part 2

- Cumulative geometric probability
- Proof of expected value of geometric random variables

### Poisson Distribution

- Detailed understanding of Poisson Distribution

### Sampling Distribution

- Sampling Distributions
- Sampling distribution of sample proportion
- Normal conditions for sampling distributions of sample proportions
- Probability of sample proportions

### Central Limit Theorem

- Inferring population mean from sample mean
- Central Limit Theorem
- Standard Error of the mean

### Confidence Interval

- Confidence intervals and margin of error
- Interpretation of Confidence interval

### Margin of error

- Margin of Error
- Condition for valid confidence interval for a proportion

### T-statistic

- Introduction to T-Statistics
- Conditions for valid t intervals
- Find critical t value
- Confidence interval for a mean with paired data
- Sample size for a given margin of error for a mean
- T-statistic confidence interval
- Small sample size confidence intervals

### Significance Tests – An Introduction

- Simple Hypothesis Testing
- Idea Behind Hypothesis Testing
- Null and alternative hypotheses
- P-values and significance tests

### Type 1 and Type 2 Errors

- Comparing P-values to different significance levels
- Estimating a P-value from a simulation
- Type 1 and Type 2 Errors
- Introduction to Power in significance tests
- Power in significance test

### Constructing Hypothesis for a significance test about a proportion

- Constructing hypothesis for a significance test about a proportion
- Conditions for a z test about a proportion
- Calculating a z statistic in a test about a proportion
- Calculating a P-value given a z statistic
- Making conclusions in a test about a proportion

### Constructing Hypothesis for a significance test about a mean

- Writing Hypothesis for a significance test about a mean
- Condition for a t test about a mean
- When to use z or t statistics in significance tests
- Calculating t statistic for a test about a mean
- Calculate p-value from t statistic
- Comparing P-value from t statistic to significance level
- Significance Test for Mean

### More on Significance Testing

- Hypothesis testing and p-values
- Small sample hypothesis test
- Large Sample proportion hypothesis testing

### Comparing two proportions

- Detailed understanding of Comparing Two Proportions

### Comparing two means

- Statistical Significance of experiment
- Difference of Sample mean distribution
- Confidence interval of difference of means

### Introduction to Chi Squared Distribution

- Chi-Square distribution introduction
- Pearson’s Chi-Square test (Goodness of fit)
- Chi-Square statistic for Hypothesis testing
- Chi-square goodness of fit example
- Filling out frequency table for independent events

### Chi Square Test for Homogeneity and Association

- Contingency table chi-square test
- Chi-Squared test for Homogeneity
- Chi-Squared test for association (Independence)

### Advanced Regression

- Introduction to inference about slope in linear regression
- Conditions for Inference on slope
- Confidence interval for the slope of a regression line
- Calculating t statistic for slope of regression line
- Using a P-value to make conclusions in a test about slope
- Using a confidence interval to test slope

### ANOVA and F-statistic

- ANOVA – Calculating SST, SSW and SSB
- Hypothesis testing with F-statistic

## Module 5: Machine Learning

### Supervised Machine Learning – Classification and Regression – KNN

- How do “Classification” and “Regression” work?
- Data matrix notation.
- Classification vs Regression (examples)
- K-Nearest Neighbors Geometric intuition with a toy example.
- Failure cases of K-NN
- Distance measures: Euclidean (L2), Manhattan (L1), Minkowski, Hamming
- Cosine Distance & Cosine Similarity
- How to measure the effectiveness of K-NN?
- Test/Evaluation time and space complexity.
- k-NN Limitations.
- Decision surface for K-NN as K changes.
- Overfitting and Underfitting.
- Need for Cross validation.
- K-fold cross validation.
- Visualizing train, validation and test data sets
- How to determine overfitting and underfitting?
- Time based splitting
- k-NN for regression.
- Weighted k-NN
- Voronoi diagram.
- Binary search tree
- How to build a kd-tree.
- Find nearest neighbors using kd-tree
- Limitations of kd-tree
- Extensions.
- Hashing vs LSH.
- LSH for cosine similarity
- LSH for euclidean distance.
- Probabilistic class label
- Code Sample: Decision boundary.
- Code Samples: Cross-Validation
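
To give a feel for the K-NN topics above, here is a minimal brute-force sketch of k-NN classification with Euclidean (L2) distance and majority voting. It uses `math.dist` from Python 3.8 or later. The function name `knn_predict` and the toy points are hypothetical. The sketch does not cover kd-trees, LSH, weighting or cross-validation.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by Euclidean distance to the query point
    order = sorted(range(len(train_points)),
                   key=lambda i: dist(train_points[i], query))
    nearest_labels = [train_labels[i] for i in order[:k]]
    # Majority vote among the k nearest neighbours
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups in 2D
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
          (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (2.0, 2.0)))  # A
print(knn_predict(points, labels, (6.0, 6.5)))  # B
```

Each prediction scans every training point, so the cost of a query grows linearly with the size of the training set. That cost is what motivates index structures such as the kd-tree.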

### Classification algorithms in various situations

- Introduction
- Imbalanced vs balanced data set.
- Multi-class classification.
- k-NN, given a distance or similarity matrix
- Train and test set differences.
- Impact of Outliers
- Local Outlier Factor (Simple solution: mean distance to k-NN).
- k-distance(A), N(A)
- reachability-distance(A,B)
- Local-reachability-density(A)
- Local Outlier Factor(A)
- Impact of Scale & Column standardization.
- Interpretability
- Feature importance & Forward Feature Selection
- Handling categorical and numerical features.
- Handling missing values by imputation.
- Curse of dimensionality.
- Bias-Variance tradeoff.
- Intuitive understanding of bias-variance.
- Revision Questions.
- Best and worst case of an algorithm

### Performance Measurement Of Models

- Accuracy
- Confusion matrix, TPR, FPR, FNR, TNR
- Precision & recall, F1-score.
- Receiver Operating Characteristic (ROC) curve and AUC.
- Log-loss.
- R-Squared/ Coefficient of determination.
- Median absolute deviation (MAD)
- Distribution of errors.

### Supervised Machine Learning – Classification – Naive Bayes

- Conditional probability.
- Independent vs Mutually exclusive events.
- Bayes Theorem with examples.
- Exercise problems on Bayes Theorem
- Naive Bayes algorithm.
- Toy example: Train and test stages.
- Naive Bayes on Text data.
- Laplace/Additive Smoothing.
- Log-probabilities for numerical stability.
- Bias and Variance tradeoff.
- Feature importance and interpretability.
- Imbalanced data
- Outliers.
- Missing values.
- Handling Numerical features (Gaussian NB)
- Multiclass classification.
- Similarity or Distance matrix.
- Large dimensionality.
- Best and worst cases.

### Supervised Machine Learning – Classification – Logistic Regression

- Geometric intuition of logistic regression
- Sigmoid function: Squashing
- Mathematical formulation of objective function.
- Weight Vector.
- L2 Regularization: Overfitting and Underfitting.
- L1 regularization and sparsity.
- Probabilistic Interpretation: Gaussian Naive Bayes
- Loss minimization interpretation
- Hyperparameter search: Grid Search and Random Search
- Column Standardization.
- Feature importance and model interpretability.
- Collinearity of features.
- Train & Run time space and time complexity.
- Real world cases.
- Non-linearly separable data & feature engineering.
- Code sample: Logistic regression, Grid Search CV, Random Search CV
- Extensions to Logistic Regression: Generalized linear models (GLM)

### Supervised Machine Learning – Regression – Linear Regression

- Geometric intuition of Linear Regression.
- Mathematical formulation.
- Real world Cases.
- Code sample for Linear Regression

### Solving Optimization Problems

- Differentiation.
- Online differentiation tools
- Maxima and Minima
- Vector calculus: Grad
- Gradient descent: geometric intuition.
- Learning rate.
- Gradient descent for linear regression.
- SGD algorithm
- Constrained optimization & PCA
- Logistic regression formulation revisited.
- Why L1 regularization creates sparsity?

### Supervised Machine Learning – Classification and Regression – Support Vector Machines (SVM)

- Geometric intuition.
- Mathematical derivation.
- Why we take values +1 and -1 for support vector planes
- Loss function (Hinge Loss) based interpretation.
- Dual form of SVM formulation.
- Kernel trick.
- Polynomial kernel.
- RBF-Kernel.
- Domain specific Kernels.
- Train and run time complexities.
- nu-SVM: control errors and support vectors.
- SVM Regression.
- Cases.

### Supervised Machine Learning – Classification and Regression – Decision Trees

- Geometric Intuition of decision tree: Axis parallel hyperplanes.
- Sample Decision tree.
- Building a decision Tree: Entropy (Intuition behind entropy)
- Building a decision Tree: Information Gain
- Building a decision Tree: Gini Impurity.
- Building a decision Tree: Constructing a DT.
- Building a decision Tree: Splitting numerical features.
- Feature standardization.
- Categorical features with many possible values.
- Overfitting and Underfitting.
- Train and Run time complexity.
- Regression using Decision Trees.
- Cases

### Supervised Machine Learning – Classification and Regression – Ensemble Models

- What are ensembles?
- Bootstrapped Aggregation (Bagging) Intuition.
- Random Forest and their construction.
- Bias-Variance tradeoff.
- Train and Run-time Complexity.
- Bagging: code Sample.
- Extremely randomized trees.
- Random Forest: Cases.
- Boosting Intuition
- Residuals, Loss functions, and gradients.
- Gradient Boosting
- Regularization by Shrinkage.
- Train and Run time complexity.
- XGBoost: Boosting + Randomization
- AdaBoost: geometric intuition.
- Stacking models.
- Cascading classifiers.

### Unsupervised learning – Clustering – K means

- What is Clustering?
- Unsupervised learning
- Applications.
- Metrics for Clustering.
- K-Means: Geometric intuition, Centroids.
- K-Means: Mathematical formulation: Objective function
- K-Means Algorithm.
- How to initialize: K-Means++
- Failure cases/Limitations.
- K-Medoids
- Determining the right K.
- Code Samples.
- Time and Space complexity

### Unsupervised learning – Clustering – Hierarchical clustering Technique

- Agglomerative & Divisive, Dendrograms
- Agglomerative Clustering.
- Proximity methods: Advantages and Limitations.
- Time and Space Complexity.
- Limitations of Hierarchical Clustering.
- Code sample.

### Unsupervised learning – Clustering – DBSCAN (Density based clustering)

- Density based clustering
- MinPts and Eps: Density
- Core, Border and Noise points.
- Density edge and Density connected points.
- DBSCAN Algorithm.
- Hyper Parameters: MinPts and Eps.
- Advantages and Limitations of DBSCAN.
- Time and Space Complexity.

### Association Rule Mining (Apriori)

- Understanding of Support
- Understanding of Confidence
- Understanding of Lift
- Time and Space complexity

### Recommender Systems and Matrix Factorization

- Problem formulation: Movie reviews.
- Content based vs Collaborative Filtering.
- Similarity based Algorithms.
- Matrix Factorization: PCA, SVD.
- Matrix Factorization: NMF.
- Matrix Factorization for Collaborative filtering
- Matrix Factorization for feature engineering.
- Clustering as MF.
- Hyperparameter tuning.
- Matrix Factorization for recommender systems: Netflix Prize Solution.
- Cold Start problem.
- Word Vectors as MF.
- Eigen-Faces.

### Dimensionality reduction and Visualization

- What is dimensionality reduction?
- Row vector, and Column vector.
- How to represent a dataset?
- How to represent a dataset as a Matrix.
- Data preprocessing: Feature Normalization
- Mean of a data matrix.
- Data preprocessing: Column Standardization
- Covariance of a data matrix.
- MNIST dataset (784 dimensional)
- Code to load MNIST dataset.

### Principal Component Analysis

- Geometric intuition.
- Mathematical objective function.
- Alternative formulation of PCA: distance minimization
- Eigenvalues and eigenvectors.
- PCA for dimensionality reduction and visualization.
- Visualize MNIST dataset.
- Limitations of PCA
- PCA for dimensionality reduction (not-visualization)

### t-distributed Stochastic Neighbor Embedding (t-SNE)

- What is t-SNE?
- Neighborhood of a point, Embedding.
- Geometric intuition.
- Crowding problem.
- How to apply t-SNE and interpret its output (distill.pub)
- t-SNE on MNIST.

## Module 6: Deep Learning and AI (Self Paced)

- Introduction to Python
- Introduction to Logistic Regression

### Introduction to Artificial Neural Network

- History of Neural networks and Deep Learning.
- How do biological neurons work?
- Growth of biological neural networks.
- Diagrammatic representation: Logistic Regression and Perceptron.
- Multi-Layered Perceptron (MLP).
- Notation.
- Training a single-neuron model.
- Training an MLP: Chain Rule.
- Training an MLP: Memoization.
- Backpropagation.
- Activation functions.
- Vanishing Gradient problem.
- Bias-Variance tradeoff.

### Deep Multi-layer Perceptrons

- Deep Multi-layer perceptrons: 1980s to 2010s
- Dropout layers & Regularization.
- Rectified Linear Units (ReLU).
- Weight initialization.
- Batch Normalization.
- Optimizers: Hill-descent analogy in 2D
- Optimizers: Hill descent in 3D and contours.
- SGD Recap
- Batch SGD with momentum.
- Nesterov Accelerated Gradient (NAG)
- Optimizers: AdaGrad
- Optimizers: Adadelta and RMSProp
- Adam
- Which algorithm to choose when?
- Gradient Checking and clipping
- Softmax and Cross-entropy for multi-class classification.
- How to train a Deep MLP?
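
For the softmax and cross-entropy topic listed above, a small worked sketch may be useful. It uses only the Python standard library, and the function names are assumptions chosen for this example. Subtracting the largest logit before exponentiating is a common stability trick and does not change the result.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # shift by the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: List[float], true_class: int) -> float:
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_class])


def grad_wrt_logits(probs: List[float], true_class: int) -> List[float]:
    """Gradient of softmax + cross-entropy with respect to the logits.

    It simplifies to (predicted probability - one-hot target).
    """
    return [p - (1.0 if i == true_class else 0.0)
            for i, p in enumerate(probs)]


# Made-up scores for a three-class example.
example_probs = softmax([2.0, 1.0, 0.1])
example_loss = cross_entropy(example_probs, true_class=0)
example_grad = grad_wrt_logits(example_probs, true_class=0)
```

The gradient is negative for the true class and positive for the others, so a gradient-descent step raises the correct class's score and lowers the rest.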

### Convolutional Neural Network

- Biological inspiration: Visual Cortex
- Convolution: Edge Detection on images.
- Convolution: Padding and strides
- Convolution over RGB images.
- Convolutional layer.
- Max-pooling.
- CNN Training: Optimization
- Receptive Fields and Effective Receptive Fields
- ImageNet dataset.
- Data Augmentation.
- Convolution Layers in Keras
- AlexNet
- VGGNet

### Recurrent Neural Network

- Why RNNs?
- Recurrent Neural Network
- Training RNNs: Backprop
- Types of RNNs
- Need for LSTM/GRU
- LSTM
- GRUs

## Module 7: Power BI (Self Paced)

- Introduction to Power BI and its architecture
- Import data from CSV files
- Import data from Excel files
- Import data from Web 1
- Import data from Web 2
- Import Real-time Streaming Data
- Import data from Oracle
- Import Data from Folder
- Dataflows – Introduction
- Dataflows – Create Gateway from Scratch
- Dataflows – Create Entities from CSV file
- Dataflows – Create Entities Using SQL Server
- Remove Rows
- Remove Columns
- Use first row as headers
- How to create calculated columns
- How to remove duplicates
- Unpivot columns and split columns
- Change Data type, Replace Values and Rearrange the columns
- Append Queries
- Merge Queries
- Visuals Intro
- Visuals-Bar Charts
- Visuals-Line Charts
- Visuals-Pie Chart
- Stacked bar Chart
- Clustered Column Chart
- Visuals-Area Chart and Analytics Tab Explained-0
- Visuals-Area Chart and Analytics Tab Explained-1
- Visuals-Combo Chart
- Visuals-Scatter Chart
- Visuals-Treemap Chart
- Visuals-Funnel Chart
- Visuals-Card and Multi-Row Card
- Visuals-Gauge Card
- Visuals-KPIs
- Visuals-Matrix
- Visuals-Table
- Visuals-Text boxes – Shapes – Images
- Visuals-Slicers
- Visuals-Maps
- Custom Visuals – Word Cloud
- Visualization interactions
- Modeling and Relationships
- Other ways to create Relationship
- OLTP vs OLAP
- Star Schema vs Snowflake Schema
- DAX101 – Importing Data for DAX Learning
- DAX101 – Resources for DAX Learning
- DAX101 – What is DAX
- DAX101 – DAX Data Types
- DAX101 – DAX Operators and Syntax
- DAX101 – M vs DAX
- DAX101 – Create a Column
- DAX101 – Rules to Create Measures
- DAX101 – Calculated Columns vs Calculated Measures-0
- DAX101 – Calculated Columns vs Calculated Measures-1
- DAX101 – SUM()
- DAX101 – AVERAGE()-MIN()-MAX()
- DAX101 – SUMX()
- DAX101 – DIVIDE()
- DAX101 – COUNT()-COUNTROWS()
- DAX101 – CALCULATE()-0
- DAX101 – CALCULATE()-1
- DAX101 – FILTER()
- DAX101 – ALL()
- DAX101 – Time Intelligence – Create Date Table in M (important)
- DAX101 – Time Intelligence – Create Date Table in DAX
- DAX101 – Time Intelligence – SAMEPERIODLASTYEAR()
- DAX101 – Time Intelligence – TOTALYTD()
- DAX101 – Display Last Refresh Date
- DAX101 – Time Intelligence – PREVIOUSMONTH()
- DAX101 – Time Intelligence – DATEADD()
- DAX101 – Quick Measures
- Power BI Reports
- Power BI Workspaces
- Power BI Datasets
- What are Dashboards
- How to create Workspace and Publish Report
- Favorite dashboards, reports, and apps in Power BI
- Subscribe to a Report or Dashboard
- Rename Workspace or Report or Dashboard
- Display Reports or Dashboards in Full screen mode
- Delete Reports or Dashboards
- Dashboard Menus
- File and View Options
- Printing Dashboard and Reports
- PATH Function
- PATHCONTAINS Function
- PATHITEM Function
- PATHITEMREVERSE Function
- PATHLENGTH Function
- RLS – Static Row Level Security
- RLS – Dynamic Row Level Security
- RLS – Organizational Hierarchy
- Sharing and Collaboration
- Sharing Dashboard
- Sharing Workspaces
- Sharing App
- Publish To Web

## Evaluation Process

The evaluation process for this PG Certification in Data Science online course consists of the following steps: an aptitude test, a personal interview with an academic expert, an admission letter, and enrollment.

### How It Helps

This online PG Certification in Data Science benefits anyone who wants to become a trained data scientist. It is designed for freshers as well as individuals from a variety of backgrounds, and it prepares students for a wide range of interesting and in-demand careers. Learners gain knowledge and employable skills in the most popular coding languages, as well as data science fundamentals such as standard programming in Python, mathematics, statistics, data analysis, and machine learning.

This program was developed by industry professionals to give learners a thorough understanding of real-world scenarios that they can apply on the job. The syllabus covers Python, SQL, Statistics, Machine Learning, Power BI, Tableau, and Deep Learning and AI (Self Paced). Learners receive a course completion certificate after finishing the course.

## PG Certification in Data Science FAQs

### Who is eligible for PG certification in Data Science?

To be eligible for a PG Diploma in Data Science, candidates need a bachelor’s degree with at least 50% overall or an equivalent grade, preferably in science or computer science, from an accredited university. Some colleges may also require relevant work experience for a particular course.

### Can I do PG in Data Science?

A PG in Data Science can be pursued by anyone who wants a career path in Computer Science, Engineering, Management, or Business Analysis. Eligibility differs depending on the type of course you choose (Master’s degree, certificate course, MBA, or PG diploma).

### Is PG Diploma in Data Science worth it?

Higher salary: Data Science is one of the highest-paying fields in the world. Completing a PG Diploma in Data Science can help you command a higher salary than your peers in other fields.
