Data Science

Data science is in high demand today because large volumes of data are generated every day in fields such as BFSI (banking, financial services and insurance), healthcare and telecom. This training builds a conceptual understanding of statistics, machine learning and deep learning using the Python and R programming languages.

Introduction to Data Science

  • What is Data Science?
  • Data science lifecycle
  • Use Cases/applications/examples
  • DS tools and technology

Python Programming

  • Installation
  • Python 2.7 vs 3.4
  • Python programming fundamentals
  • Data types and structures, variables, control flow, and functions
  • Python libraries
  • NumPy, pandas, scikit-learn, Matplotlib

R Programming

  • Introduction to R
  • Vectors
  • Matrices
  • Factors
  • Data Frames
  • Lists

Data Extraction, Wrangling and Exploration

  • Data Analysis Pipeline
  • What is Data Extraction?
  • Types of Data
  • Raw and Processed Data
  • Data Wrangling
  • Exploratory Data Analysis (EDA)
  • Data Structures in pandas - Series and DataFrames

Probability

  • Basic Probability
  • Conditional Probability
  • Properties of Random Variables
  • Expectations
  • Variance
  • Entropy and cross-entropy
  • Covariance and correlation
  • Estimating the probability of a random variable
  • Understanding standard random processes
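As an illustration of the entropy and cross-entropy topic listed above, the following is a minimal sketch using only the Python standard library. The function names and the example distributions (`fair`, `biased`) are assumptions made for this example and are not part of the course material. Entropy is measured in bits, and the cross-entropy function assumes the second distribution assigns non-zero probability wherever the first does.

```python
import math


def entropy(p):
    """Shannon entropy, in bits, of a discrete distribution p."""
    # Terms with zero probability contribute nothing to the sum.
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy, in bits, of q relative to the true distribution p.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical example distributions over two outcomes.
fair = [0.5, 0.5]
biased = [0.9, 0.1]

h_fair = entropy(fair)                  # entropy of a fair coin
ce_fair_biased = cross_entropy(fair, biased)

print(h_fair)
print(ce_fair_biased)
```

Cross-entropy is never smaller than the entropy of the true distribution, and the two are equal when both distributions are the same.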

Inferential Statistics

  • Estimating parameters of a population using sample statistics
  • Hypothesis testing and confidence intervals
  • T-tests and ANOVA
  • Correlation and regression
  • Chi-squared test

Descriptive Statistics

  • Compute and interpret values such as the mean, median, mode, and sample and population standard deviation.
  • Compute simple probabilities.
  • Explore data through the use of bar graphs, histograms and other common visualizations.
  • Investigate distributions and understand a distribution's properties.
  • Manipulate distributions to make probabilistic predictions on data.
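The summary values named above can be computed with Python's built-in `statistics` module. This is a small sketch rather than course code; the `data` list is made up for illustration.

```python
import statistics

# Hypothetical sample of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Population standard deviation divides by n;
# sample standard deviation divides by n - 1.
population_sd = statistics.pstdev(data)
sample_sd = statistics.stdev(data)

print(mean, median, mode, population_sd, sample_sd)
```

The sample standard deviation is slightly larger than the population value because it divides by n - 1 instead of n.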

Data visualization

  • Bar Graph, Histogram, Pie Chart, Line Chart, Box (Whisker) Plot, Scatter Plot, Heat Map

Basic Machine Learning Algorithms

  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • KNN (K-Nearest Neighbours)
  • K-Means Clustering
  • Naïve Bayes
  • Dimensionality Reduction
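To give a flavour of the first algorithm in this list, here is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. The function name `fit_line` and the sample points are assumptions for illustration. In practice a library such as scikit-learn would normally be used.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

slope, intercept = fit_line(xs, ys)
prediction = slope * 4 + intercept  # predicted y for x = 4

print(slope, intercept, prediction)
```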

Advanced algorithms

  • Random Forests
  • Dimensionality Reduction Techniques
  • Support Vector Machines
  • Gradient boosting

Introduction to Deep Learning

  • TensorFlow
  • Neural Networks
  • Biological Neural Networks
  • Understand Artificial Neural Networks
  • Building an Artificial Neural Network
  • How an ANN works
  • Image recognition
  • Image classification

Sentiment Analysis

Text Mining

Natural Language Processing (NLP)

Time Series

  • What is Time Series data?
  • Time Series variables
  • Different components of Time Series data
  • Visualize the data to identify Time Series Components
  • Implement an ARIMA model for forecasting
  • Exponential smoothing models
  • Identifying the time series scenarios in which each exponential smoothing model applies
  • Implement the corresponding ETS model for forecasting
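The simplest of the exponential smoothing models mentioned above, simple exponential smoothing, can be sketched in a few lines of plain Python. The function name, the sample series and the choice to initialise the level with the first observation are assumptions made for this example. This model suits series with no clear trend or seasonality.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level for each point in the series.

    level[t] = alpha * series[t] + (1 - alpha) * level[t - 1]
    The first level is set to the first observation (an assumption).
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    levels = [series[0]]
    for value in series[1:]:
        levels.append(alpha * value + (1 - alpha) * levels[-1])
    return levels


# Hypothetical short series.
series = [10, 20, 30]
levels = simple_exponential_smoothing(series, alpha=0.5)

# The one-step-ahead forecast is the last smoothed level.
forecast = levels[-1]

print(levels, forecast)
```

A larger `alpha` puts more weight on recent observations, while a smaller one produces a smoother, slower-moving level.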

Course Duration: 10 days (80 hours)