🧭 01 — Classification Overview
Classification predicts a discrete category (a class label) for each input, rather than a continuous value.
Examples:
- spam vs not spam
- churn vs not churn
- fraud vs legitimate
- disease vs no disease
- sentiment: positive, neutral, negative
Binary vs Multiclass
| Type | Example |
|---|---|
| Binary | churn yes/no |
| Multiclass | low/medium/high risk |
| Multilabel | movie can be comedy and drama |
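As a rough illustration of how the three types differ in the shape of their targets, the sketch below uses plain Python lists. The variable names and the `to_indicator` helper are hypothetical and chosen only for this example; libraries usually provide their own encoders.

```python
# Binary: one label per sample, exactly two possible classes.
binary_labels = ["yes", "no", "no", "yes"]

# Multiclass: one label per sample, three or more mutually exclusive classes.
multiclass_labels = ["low", "high", "medium", "low"]

# Multilabel: each sample may carry zero, one, or several labels at once.
multilabel_labels = [{"comedy", "drama"}, {"drama"}, set(), {"comedy"}]


def to_indicator(label_sets, classes):
    """Encode label sets as 0/1 rows, one column per class (hypothetical helper)."""
    return [[int(c in labels) for c in classes] for labels in label_sets]


# Collect every genre that appears, in a fixed (sorted) column order.
genres = sorted(set().union(*multilabel_labels))
indicator = to_indicator(multilabel_labels, genres)

print(genres)     # ['comedy', 'drama']
print(indicator)  # [[1, 1], [0, 1], [0, 0], [1, 0]]
```

A multilabel target is therefore a matrix with one column per label, while binary and multiclass targets are a single column.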
Classification Workflow
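- split into train and test sets with stratification, so each set keeps the class proportions
- preprocess features (fit transformations on the training set only)
- train the classifier
- predict labels, probabilities, or both
- evaluate with metrics suited to the task (with imbalanced classes, precision, recall, or F1 rather than accuracy alone)
- inspect the confusion matrix and the misclassified examples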
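The steps above can be sketched with scikit-learn. This is a minimal example under stated assumptions, not a prescribed implementation: the synthetic dataset from `make_classification`, the choice of `StandardScaler` and `LogisticRegression`, and the 75/25 split are all picked for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic, imbalanced binary data stands in for a real dataset.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=0
)

# 1. Split with stratification so train and test keep similar class ratios.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 2-3. Preprocess and train in one pipeline; the scaler is fit on train only.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

# 4. Predict hard labels and class probabilities.
y_pred = clf.predict(X_test)
y_proba = clf.predict_proba(X_test)  # one column per class

# 5. Evaluate with per-class precision, recall, and F1.
print(classification_report(y_test, y_pred))

# 6. Inspect the confusion matrix (rows: true class, columns: predicted class)
#    and collect the indices of misclassified test samples for a closer look.
cm = confusion_matrix(y_test, y_pred)
print(cm)
wrong_idx = (y_pred != y_test).nonzero()[0]
print(f"{len(wrong_idx)} misclassified test samples")
```

Wrapping the scaler and the classifier in one pipeline keeps test data out of the preprocessing fit, which is a common way to avoid leakage.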
Common Models
| Model | Good For |
|---|---|
| Logistic Regression | interpretable baseline |
| KNN | simple distance-based classification |
| Naive Bayes | text and simple probabilistic tasks |
| Decision Tree | interpretable nonlinear rules |
| Random Forest | strong general-purpose baseline |
| Gradient Boosting | high-performing tabular model |
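One way to compare the models in the table is to cross-validate each on the same data. The sketch below assumes scikit-learn, a synthetic dataset, default hyperparameters, and accuracy as the metric; `GaussianNB` is used because the features here are continuous, whereas text tasks more often use a count-based variant such as `MultinomialNB`. Which model ranks first depends on the dataset, so treat the output as illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: a small synthetic dataset stands in for real tabular data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Scale-sensitive models (logistic regression, KNN) get a scaler in front.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    # With an integer cv and a classifier, scikit-learn uses stratified folds.
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    scores[name] = fold_scores.mean()

# Print models from highest to lowest mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {score:.3f}")
```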