Data Science

Week 1: Core Python, Data Handling, and Statistics (Days 1–7)

📍 Day 1 – Intro to Data Science

  • What is Data Science?
  • Lifecycle: Data Collection → Cleaning → Analysis → Modeling → Deployment
  • Tools: Jupyter, Python, Git, Excel

📍 Day 2 – Python for Data Science

  • Data types, loops, functions
  • List, Dict, Tuple, Set
  • File handling

📍 Day 3 – NumPy & Pandas Basics

  • Arrays and matrix operations (NumPy)
  • Series & DataFrame (Pandas)
  • Importing CSV, Excel

📍 Day 4 – Data Cleaning & Preprocessing

  • Handling missing values
  • Duplicates, nulls
  • Renaming, replacing, mapping

📍 Day 5 – Exploratory Data Analysis (EDA)

  • Descriptive statistics
  • Groupby, sorting, filtering
  • Hands-on: Titanic dataset

📍 Day 6 – Data Visualization

  • Matplotlib, Seaborn
  • Plot types: histogram, bar, scatter, box, heatmap
  • Hands-on: Correlation analysis

📍 Day 7 – Statistics for Data Science

  • Mean, median, mode, std dev, variance
  • Probability basics
  • Distributions: normal, binomial
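
The descriptive measures above can be tried out with Python's built-in `statistics` module. This is a minimal sketch; the `scores` list is invented for illustration and is not course data.

```python
import statistics

# Hypothetical sample, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)        # arithmetic average
median = statistics.median(scores)    # middle value of the sorted data
mode = statistics.mode(scores)        # most frequent value

# Population versions divide by n; sample versions divide by n - 1.
pop_variance = statistics.pvariance(scores)
pop_std = statistics.pstdev(scores)
sample_variance = statistics.variance(scores)

print(mean, median, mode, pop_variance, pop_std, sample_variance)
```

Note the difference between the population and sample formulas: with only eight values the two variances differ noticeably.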

Week 2: Advanced Stats, ML Algorithms, and Model Building (Days 8–14)

📍 Day 8 – Inferential Stats & Hypothesis Testing

  • Confidence intervals
  • t-test, chi-square, ANOVA
  • p-value explained

📍 Day 9 – Linear Regression

  • Simple and multiple regression
  • R², adjusted R²
  • Hands-on: House price prediction
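
As a rough illustration of R² and adjusted R², the sketch below computes both from scratch. The function names and the sample values are assumptions made for this example; in the hands-on session a library such as Scikit-learn would normally do this.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Penalise R² for the number of predictors used."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical targets and predictions, invented for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.5, 8.5]

r2 = r_squared(y_true, y_pred)
adj_r2 = adjusted_r_squared(r2, n_samples=4, n_features=1)
print(round(r2, 3), round(adj_r2, 3))  # 0.95 0.925
```

Adjusted R² is never higher than R², and the gap grows as more features are added without a matching gain in fit.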

📍 Day 10 – Classification: Logistic Regression

  • Binary vs multi-class classification
  • Sigmoid function
  • Evaluation: Confusion matrix, ROC curve
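
To make the sigmoid function and the confusion matrix concrete, here is a small sketch in plain Python. The scores, labels, and the 0.5 threshold are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels 0/1."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["tn" if actual == 0 else "fn"] += 1
    return counts


# Hypothetical linear scores (w·x + b) and true labels.
scores = [-2.0, -0.5, 0.3, 1.5]
y_true = [0, 1, 0, 1]

# Classify as 1 when the predicted probability reaches 0.5.
y_pred = [1 if sigmoid(z) >= 0.5 else 0 for z in scores]
cm = confusion_matrix(y_true, y_pred)
print(y_pred, cm)
```

Moving the threshold away from 0.5 changes the four counts, which is the idea behind sweeping thresholds to draw an ROC curve.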

📍 Day 11 – Decision Trees & Random Forest

  • Splitting criteria: Gini, Entropy
  • Overfitting, pruning
  • Hands-on: Loan approval prediction
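
The two splitting criteria can be sketched in a few lines. The function names and the label lists are illustrative assumptions, not part of any library.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes."""
    n = len(labels)
    return -sum(
        (count / n) * math.log2(count / n) for count in Counter(labels).values()
    )


# Hypothetical loan decisions at two tree nodes.
mixed_node = ["approve", "approve", "reject", "reject"]
pure_node = ["approve", "approve", "approve"]

print(gini(mixed_node), entropy(mixed_node))  # 0.5 1.0
print(gini(pure_node), entropy(pure_node))
```

Both measures are zero for a pure node and largest for an evenly mixed one; a tree picks the split that reduces them the most.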

📍 Day 12 – KNN & Naive Bayes

  • Distance metrics in KNN
  • Bayes' theorem and Gaussian NB
  • Hands-on: Email spam detection
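
A minimal sketch of two common KNN distance metrics and a majority-vote prediction follows. The tiny two-feature "ham"/"spam" training set is invented for illustration; real spam detection would use many more features.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3, distance=euclidean):
    """Return the majority label among the k nearest training points."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical (features, label) pairs.
train = [
    ((1, 1), "ham"), ((1, 2), "ham"), ((2, 1), "ham"),
    ((8, 8), "spam"), ((9, 8), "spam"),
]

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)))  # 5.0 7
print(knn_predict(train, (2, 2)), knn_predict(train, (8, 9)))  # ham spam
```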

📍 Day 13 – Unsupervised Learning

  • K-means clustering
  • Elbow method
  • PCA for dimensionality reduction

📍 Day 14 – Model Evaluation & Tuning

  • Cross-validation
  • GridSearchCV, RandomizedSearchCV
  • Bias-variance tradeoff
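
To show what cross-validation does under the hood, the sketch below splits sample indices into k folds. The helper `k_fold_indices` is a hypothetical name for this example, and it assumes no shuffling; library splitters offer more options.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k (train, test) pairs, in order."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(5, 2)
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each sample lands in the test set exactly once, so the averaged score uses all the data for evaluation.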

Week 3: Projects, Real-World Tools & Career Prep (Days 15–21)

📍 Day 15 – Time Series Analysis

  • Date/time handling
  • Rolling mean, autocorrelation
  • Forecasting with ARIMA (brief)
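
A rolling mean can be sketched without any libraries, as below. The `rolling_mean` helper and the sales figures are assumptions for illustration. Unlike a Pandas rolling window, which pads the first positions with NaN, this version simply returns a shorter list.

```python
def rolling_mean(values, window):
    """Average each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical daily sales figures.
daily_sales = [10, 12, 14, 16, 18]
smoothed = rolling_mean(daily_sales, window=3)
print(smoothed)  # [12.0, 14.0, 16.0]
```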

📍 Day 16 – Natural Language Processing (NLP)

  • Text cleaning (tokenize, stopwords, stemming)
  • TF-IDF
  • Sentiment analysis mini project
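
The idea behind TF-IDF can be sketched with the textbook formula, term frequency multiplied by log(N / document frequency). This is an assumption for teaching purposes: library implementations such as Scikit-learn's `TfidfVectorizer` apply smoothing and normalisation by default, so their numbers will differ. The three short documents are invented.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["good movie", "bad movie", "good good plot"]
weights = tf_idf(docs)
print(weights[1])
```

In the second document "bad" scores higher than "movie" because it appears in fewer documents, which is exactly what makes it more informative.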

📍 Day 17 – SQL for Data Science

  • SELECT, WHERE, JOIN, GROUP BY
  • Subqueries
  • Practice with sample database (e.g., SQLite or MySQL)
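
The query types above can be practised with Python's built-in `sqlite3` module and an in-memory database. The tables, names, and amounts below are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database, nothing saved to disk
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Delhi'), (3, 'Chen', 'Pune');
INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.0), (4, 3, 300.0);
""")

# SELECT + JOIN + WHERE + GROUP BY: total spend per customer in one city.
totals = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE c.city = ?
    GROUP BY c.name
    ORDER BY total DESC
    """,
    ("Pune",),
).fetchall()

# Subquery: orders larger than the average order amount.
above_average = conn.execute(
    """
    SELECT id, amount FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY id
    """
).fetchall()

conn.close()
print(totals)         # [('Asha', 350.0), ('Chen', 300.0)]
print(above_average)  # [(1, 250.0), (4, 300.0)]
```

The `?` placeholder passes the city as a parameter instead of pasting it into the SQL string, which avoids SQL injection.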

📍 Day 18 – Working with Real Datasets

  • Kaggle datasets
  • End-to-end EDA + model
  • Hands-on: Diabetes prediction / Customer churn

📍 Day 19 – Mini Capstone Project

Choose one:
  • Sales prediction
  • Fake news detection
  • Movie recommendation system
  • Smart city traffic analysis

📍 Day 20 – Model Deployment

  • Save model with Pickle/Joblib
  • Flask/Streamlit web app
  • Deploy to Heroku (or local server)
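
Saving and reloading a model with Pickle can be sketched as follows. To stay dependency-free, the "model" here is just a dictionary of made-up parameters; fitted Scikit-learn estimators can be pickled the same way, and Joblib is a common alternative for models holding large arrays.

```python
import os
import pickle
import tempfile

# Hypothetical "fitted" parameters standing in for a trained model.
model = {"slope": 2.0, "intercept": 1.0}


def predict(params, x):
    """Apply the linear model y = slope * x + intercept."""
    return params["slope"] * x + params["intercept"]


# Save the model to disk (a temporary folder is used for this demo).
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Later, for example inside a Flask or Streamlit app, load it back.
with open(path, "rb") as f:
    loaded_model = pickle.load(f)

prediction = predict(loaded_model, 3.0)
print(prediction)  # 7.0
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.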

📍 Day 21 – Career in Data Science

  • Resume tips, GitHub portfolio
  • Data science roles: Data Analyst, ML Engineer, Data Scientist
  • Certifications, interview prep (case studies, SQL/ML Qs)

🧰 Tools & Libraries:

  • Python (Jupyter Notebook)
  • NumPy, Pandas, Matplotlib, Seaborn
  • Scikit-learn
  • SQL (SQLite / MySQL)
  • Streamlit or Flask for deployment
  • Kaggle for datasets