🧪 06 — Exercises: NLP Basics
Tasks
- clean a small text column
- create Bag of Words features
- create TF-IDF features
- train a Logistic Regression sentiment classifier
- inspect the top-weighted words
- explain when transformers are useful
Interview Questions
Q1: What is tokenization?
Splitting text into smaller units like words or subwords.
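To make the word versus subword distinction concrete, here is a small standard-library sketch. The regex-based `word_tokenize`, the greedy `subword_tokenize`, and `toy_vocab` are hypothetical illustrations; real subword tokenizers learn their vocabulary from a corpus rather than using a hand-written set.

```python
import re


def word_tokenize(text):
    """Split into lowercase word tokens, keeping simple contractions together."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())


def subword_tokenize(word, vocab):
    """Greedy longest-match-first split; non-initial pieces are prefixed with ##."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return ["[UNK]"]  # no known piece covers this position
        pieces.append(match)
        start = end
    return pieces


# Hypothetical hand-written vocabulary, for illustration only.
toy_vocab = {"token", "##ization", "##ize", "un", "##believ", "##able"}

print(word_tokenize("Tokenization isn't hard!"))       # ['tokenization', "isn't", 'hard']
print(subword_tokenize("tokenization", toy_vocab))     # ['token', '##ization']
print(subword_tokenize("unbelievable", toy_vocab))     # ['un', '##believ', '##able']
```

Subword splitting lets a model represent rare or unseen words as combinations of known pieces instead of mapping them all to a single unknown token.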
Q2: What is TF-IDF?
A weighting method that scores a word higher when it appears often in one document but rarely across the rest of the corpus.
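A hand-rolled sketch can make the weighting visible. It assumes the plain textbook form, term frequency times log(N / document frequency), and three made-up documents. Libraries such as scikit-learn use smoothed and normalized variants, so their numbers will differ.

```python
import math
from collections import Counter

# Hypothetical mini-corpus, for illustration only.
docs = [
    "the movie was great",
    "the movie was boring",
    "the plot was great fun",
]
tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: in how many documents does each term appear?
df = Counter(term for tokens in tokenized for term in set(tokens))


def tf_idf(tokens):
    """Return {term: tf * idf} for one document, using idf = log(N / df)."""
    counts = Counter(tokens)
    return {
        term: (count / len(tokens)) * math.log(n_docs / df[term])
        for term, count in counts.items()
    }


scores = tf_idf(tokenized[1])
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term:>8}  {score:.3f}")
```

In the second document, "the" and "was" occur in every document, so their idf is log(1) = 0 and they score zero. "boring" occurs in only one document and receives the highest score, with "movie" in between.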