🔢 02 — Numeric Features
Numeric features often need scaling, transformation, or binning.
Scaling
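Scale features for models that are sensitive to feature magnitude: distance-based models, regularized linear models, and models trained with gradient descent.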
Models that often benefit:
- Logistic Regression
- Linear Regression with regularization
- KNN
- SVM
- Neural Networks
Tree models usually do not require scaling, because their splits depend on the ordering of values rather than their magnitude.
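A minimal sketch of standardization (zero mean, unit variance), written with plain pandas so each step is visible. The column names and values are hypothetical. The statistics come from the training split only and are reused on the test split; using the population standard deviation (ddof=0) follows the same convention as scikit-learn's StandardScaler.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two features on very different scales.
train = pd.DataFrame(
    {"age": [20.0, 30.0, 40.0], "income": [20_000.0, 50_000.0, 80_000.0]}
)
test = pd.DataFrame({"age": [50.0], "income": [110_000.0]})

# Learn the statistics from the training split only, so information
# from the test split does not leak into preprocessing.
mean = train.mean()
std = train.std(ddof=0)  # population standard deviation

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics
```

After scaling, both columns of train_scaled have mean 0 and standard deviation 1, so neither dominates a distance computation or a regularization penalty.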
Log Transform
Useful for right-skewed, non-negative values such as income or spend.
log1p computes log(1 + x), so a zero input maps to zero instead of negative infinity.
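A short sketch of the transform, assuming a hypothetical income column with a zero and a large outlier. np.expm1 is the inverse of np.log1p, which is handy when predictions have to be mapped back to the original scale.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed values, including a zero.
df = pd.DataFrame({"income": [0.0, 1_000.0, 50_000.0, 2_000_000.0]})

# log(1 + x): compresses the long right tail and maps 0 to 0.
df["income_log"] = np.log1p(df["income"])

# expm1 undoes the transform: exp(y) - 1.
df["income_back"] = np.expm1(df["income_log"])
```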
Binning
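import pandas as pd

# Bins are right-closed: [0, 25], (25, 40], (40, 60], (60, 100].
# include_lowest=True keeps age 0 in the first bin; ages above 100 become NaN.
df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 25, 40, 60, 100],
    labels=["young", "adult", "middle", "senior"],
    include_lowest=True,
)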
Binning can make a feature easier to interpret, but it discards the variation within each bin.
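A small runnable sketch of how pd.cut treats values at and beyond the bin edges, using hypothetical ages. It assumes include_lowest=True so that age 0 lands in the first bin; without it, the first interval is (0, 25] and age 0 becomes NaN.

```python
import pandas as pd

# Hypothetical ages chosen to hit the bin edges and go out of range.
df = pd.DataFrame({"age": [0, 25, 26, 40, 61, 100, 105]})

df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 25, 40, 60, 100],
    labels=["young", "adult", "middle", "senior"],
    include_lowest=True,  # put age 0 in the first bin
)

# Values outside the outermost edges (here 105) are not binned.
n_unbinned = int(df["age_group"].isna().sum())
```

Edge values fall into the lower bin (25 is "young", 40 is "adult"), and checking n_unbinned is a cheap way to catch rows that silently fell outside the bins.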
Interaction Features
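# Replace zero denominators with NaN so the ratios come out missing rather than inf.
df["revenue_per_order"] = df["revenue"] / df["order_count"].replace(0, float("nan"))
df["price_per_unit"] = df["revenue"] / df["quantity"].replace(0, float("nan"))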
Create interactions when domain logic supports them.
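Ratio features like these break down when the denominator is zero: pandas returns inf (or NaN for 0 / 0) rather than raising. One way to guard against that is sketched here, with a hypothetical helper named safe_ratio and made-up data; whether a missing ratio should stay NaN or be filled with some default is a domain decision.

```python
import numpy as np
import pandas as pd

# Hypothetical data with zero denominators.
df = pd.DataFrame(
    {
        "revenue": [100.0, 0.0, 250.0],
        "order_count": [4, 0, 5],
        "quantity": [10, 0, 0],
    }
)


def safe_ratio(numerator: pd.Series, denominator: pd.Series) -> pd.Series:
    """Divide two Series, returning NaN where the denominator is zero."""
    return numerator / denominator.replace(0, np.nan)


df["revenue_per_order"] = safe_ratio(df["revenue"], df["order_count"])
df["price_per_unit"] = safe_ratio(df["revenue"], df["quantity"])
```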