Introduction-to-Data-Science-with-Python

This repo was created for the session "Practical Introduction to Data Science with Python".

Photo by Franki Chamaki on Unsplash



Click here for the FDP Data Science Notebook


What you can expect from this notebook:

  1. Introduction to Data Science using Python
    1. What is Data Science?
    2. Why do we need Data Science?
    3. Brief overview of Topics
      1. Big Data Analytics
      2. Machine Learning & Deep Learning
  2. Pythonic way of Data Science
    1. Brief Intro to Python Programming Language
    2. Python for Data Science
    3. Intro to Data Processing, Statistical analysis and Visualization libraries
      1. numpy, pandas, scipy
      2. matplotlib, seaborn, plotly
    4. Intro to Model Building and inference frameworks
      1. scikit-learn, TensorFlow, PyTorch
  3. Approaching a Tabular (Structured) Problem (Hands On)
    1. Understanding the Problem
      1. Understanding the problem type
      2. Class imbalances and necessary fixes
      3. Understanding features and their types
    2. Exploratory Data Analysis
      1. Missing Data Imputation
      2. Identifying correlation and collinearity of features
      3. Data Distribution and statistical analysis
      4. Outlier Analysis
    3. Data Preprocessing
      1. Dimensionality reduction - Curse of dimensionality
      2. Data Preprocessing
        1. Normalization, MinMax Scaler, Standardization
        2. Categorical Encoding - OneHot Encoder
    4. Feature Engineering
      1. Combining Features
      2. Splitting Temporal features
    5. Feature Selection
      1. Removing features
      2. Choosing the right features to improve prediction power
    6. Model Building - A Machine Learning approach
      1. Hyperparameter tuning and Grid Search
      2. Logistic Regression
      3. Ensemble - Bagging and Boosting
        1. Gradient Boosting Classifier, Stochastic Gradient Boosting (SGB), XGBoost, Voting Classifier
      4. Choosing the best classifier
        1. Choosing the right classifier based on evaluation criteria
        2. Classifier Inference on example data
  4. Approaching a Text (NLP) Problem (Hands On)
    1. Importance of solving NLP
    2. Applications of NLP
      1. Chatbots, sentiment analysis, translation, autocomplete, document search, etc.
    3. Intro to Text
      1. Tokens, Corpus, Tokenization, Stemming, Lemmatization, N-grams, etc.
    4. Brief Intro to basic text processing libraries
      1. NLTK, spaCy
    5. Solving a Real World Tweet Classification Problem
      1. Understanding the problem
      2. Basic EDA of tweets
        1. Class distribution, distribution of length of tweets
        2. Common stopwords, words in tweets without stopwords, bigrams in tweets
        3. WordClouds of tweets
      3. Data Cleaning
        1. Handling stopwords, special characters, URLs, HTML, handles, emoji
      4. Text Vectorization
        1. CountVectorizer, Bag of Words, TF-IDF
  5. Approaching a Vision Problem (Hands On)
    1. An introduction to computer vision
      1. What is Computer Vision?
      2. How is computer vision used today?
    2. Image Processing
      1. Point Operators
        1. Pixel Transforms
        2. Color Transforms
        3. Compositing and matting
        4. Histogram Equalization
      2. Linear Filtering
        1. Separable Filtering
        2. Band Pass and Steerable Filters
      3. More neighborhood operators
        1. Non-linear filtering
        2. Bilateral filtering
        3. Binary Image processing
      4. Fourier Transforms
        1. Two-dimensional Fourier Transforms
      5. Pyramids and wavelets
        1. Interpolation
        2. Decimation
        3. Multi-resolution representations
        4. Wavelets
      6. Geometric transformations
        1. Parametric transformations
        2. Mesh-based warping
    3. OpenCV Library (Hands On)
      1. Introduction
      2. Changing colorspaces
      3. Geometric transformations of Images
      4. Image thresholding
      5. Smoothing Images
      6. Morphological Transformations
      7. Image Gradients
      8. Canny Edge Detection
      9. Image Pyramids
      10. Contours
      11. Histograms
      12. Image Transforms
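
As a small taste of the Data Preprocessing step listed in the outline (Normalization, MinMax Scaler, Standardization), here is a minimal sketch in plain Python. It is an illustration only, not code from the notebook: the function names and the `ages` column are hypothetical, and the notebook itself may well use scikit-learn's `MinMaxScaler` and `StandardScaler`, which apply the same two transformations to whole datasets.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Same guard as above for a constant feature.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical numeric feature column, made up for this example.
ages = [18, 25, 32, 47, 60]

scaled = min_max_scale(ages)   # values now lie in [0, 1]
z_scores = standardize(ages)   # values now have mean 0 and std 1

print(scaled)
print(z_scores)
```

Min-max scaling keeps the shape of the distribution but is sensitive to outliers, since a single extreme value stretches the range. Standardization is often the safer default for models such as logistic regression.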