
🧪 05 — Exercises: Machine Learning Basics

Exercise 1 — Identify Problem Type

Classify each task as regression, classification, clustering, or dimensionality reduction:

  • predict house price
  • detect spam email
  • group customers by behavior
  • compress 50 features into 2

Exercise 2 — Train/Test Split

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

data = load_diabetes(as_frame=True)
X = data.data
y = data.target

Split X and y into an 80% training set and a 20% test set.
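
One possible way to do the split is sketched below. It repeats the setup so it runs on its own. The variable names X_train, X_test, y_train, y_test and the seed random_state=42 are choices made for this sketch, not requirements of the exercise.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

data = load_diabetes(as_frame=True)
X = data.data
y = data.target

# test_size=0.2 holds out 20% of the rows; random_state (an arbitrary
# seed chosen for this sketch) makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("train rows:", len(X_train))
print("test rows:", len(X_test))
```

Fixing the seed means you get the same split on every run, which makes later results easier to compare.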

Exercise 3 — First Model

Train a decision tree classifier on the Iris dataset and report its accuracy on a held-out test set.
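
A minimal sketch of one approach follows. The 80/20 split, the stratify=y option, and the seed are assumptions of this sketch; the exercise itself does not prescribe them.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples; stratify keeps class proportions similar
# in both parts (an assumption of this sketch, not part of the exercise).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit on the training rows only.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Accuracy is the fraction of test samples predicted correctly.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

Reporting accuracy on the test rows rather than the training rows matters here: an unpruned decision tree can fit its training data almost perfectly, so training accuracy says little about how it generalizes.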

Exercise 4 — Leakage Check

Explain why scaling the features before the train/test split can cause data leakage.
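
As a hint, the sketch below contrasts the two orders of operations with StandardScaler on the diabetes data. The names leaky_scaler and safe_scaler are hypothetical labels used only for this illustration.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Leaky order: the scaler's mean and standard deviation are computed
# from every row, so test rows influence how training data is scaled.
leaky_scaler = StandardScaler().fit(X)

# Safe order: statistics come from the training rows only, and the same
# fitted scaler is then applied to the test rows.
safe_scaler = StandardScaler().fit(X_train)
X_train_scaled = safe_scaler.transform(X_train)
X_test_scaled = safe_scaler.transform(X_test)

# The two scalers learn different statistics; the gap is information
# that came from the test set.
mean_gap = np.abs(leaky_scaler.mean_ - safe_scaler.mean_).max()
print("largest difference in learned means:", mean_gap)
```

In practice, wrapping the scaler and the model in a scikit-learn Pipeline is a common way to make sure the scaler is only ever fitted on training data, including inside cross-validation.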


Self-Check

  • [ ] I can define features and target
  • [ ] I can choose supervised vs unsupervised
  • [ ] I can split train/test safely
  • [ ] I can train a first scikit-learn model

Next

➡️ Regression Algorithms