
🧪 06 — Exercises: NLP Basics

Tasks

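  • clean a small text column (lowercase, strip punctuation, collapse whitespace)
  • create Bag of Words features
  • create TF-IDF features
  • train a Logistic Regression sentiment classifier
  • inspect the words with the largest positive and negative weights
  • explain when transformers are useful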
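
As a starting point for the first two tasks, here is a minimal sketch that uses only the Python standard library. The helper names `clean_text` and `bag_of_words` and the two sample reviews are made up for illustration; in practice a library vectorizer such as scikit-learn's `CountVectorizer` would likely replace the hand-written counting.

```python
import re
from collections import Counter


def clean_text(text):
    """Lowercase, replace anything that is not a letter or whitespace, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def bag_of_words(docs):
    """Return (vocabulary, count matrix) for a list of already-cleaned documents."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok, n in Counter(toks).items():
            row[index[tok]] = n
        matrix.append(row)
    return vocab, matrix


# Hypothetical sample data, not taken from any real dataset.
reviews = ["Great movie, GREAT cast!!", "Terrible plot... 2/10"]
cleaned = [clean_text(r) for r in reviews]
vocab, X = bag_of_words(cleaned)

print(cleaned)  # ['great movie great cast', 'terrible plot']
print(vocab)    # ['cast', 'great', 'movie', 'plot', 'terrible']
print(X)        # [[1, 2, 1, 0, 0], [0, 0, 0, 1, 1]]
```

Each row of `X` is one document and each column is one vocabulary word, so word order is discarded and only counts remain. That is the defining trait of a Bag of Words representation.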

Interview Questions

Q1: What is tokenization?

Splitting text into smaller units (tokens) such as words, subwords, or characters, so that each unit can be mapped to a numeric feature or ID.
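
To make the word-versus-subword distinction concrete, here is a small sketch. The `word_tokenize` and `subword_tokenize` helpers and the tiny vocabulary are hypothetical; the subword splitter is a greedy longest-match toy in the spirit of WordPiece, not the algorithm of any particular library.

```python
import re


def word_tokenize(text):
    """Split lowercased text into runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def subword_tokenize(word, vocab):
    """Greedy longest-match split; '##' marks a piece that continues a word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end]
            candidate = piece if start == 0 else "##" + piece
            if candidate in vocab:
                pieces.append(candidate)
                break
            end -= 1
        else:
            # No vocabulary entry matches at this position.
            return ["[UNK]"]
        start = end
    return pieces


# Hypothetical vocabulary; real subword vocabularies are learned from a corpus.
toy_vocab = {"token", "##ization", "un", "##happy"}

print(word_tokenize("Don't panic, it's 2024!"))       # ["don't", 'panic', "it's", '2024']
print(subword_tokenize("tokenization", toy_vocab))    # ['token', '##ization']
print(subword_tokenize("xyz", toy_vocab))             # ['[UNK]']
```

Subword splitting lets a model represent a rare word through pieces it has already seen, instead of mapping the whole word to an unknown token.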

Q2: What is TF-IDF?

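A weighting method that scores a word highly when it appears often in one document (term frequency) but in few documents across the corpus (inverse document frequency). Words that occur everywhere, such as "the", end up with weights near zero.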
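
The idea can be sketched with the plain textbook variant, tf(t, d) × log(N / df(t)), using only the standard library. The `tf_idf` helper and the three sample sentences are made up for illustration. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers will differ from this sketch.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {token: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each token appear?
    df = Counter(tok for toks in tokenized for tok in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        weights.append({
            tok: (n / len(toks)) * math.log(n_docs / df[tok])
            for tok, n in counts.items()
        })
    return weights


# Hypothetical sample corpus.
docs = [
    "the movie was great",
    "the movie was dull",
    "the plot was great fun",
]
scores = tf_idf(docs)

for tok, weight in sorted(scores[1].items(), key=lambda kv: -kv[1]):
    print(f"{tok:6s} {weight:.3f}")
```

In the second document, "dull" gets the highest weight because it appears in no other document, while "the" and "was" appear in all three and receive a weight of exactly zero.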


Next

➡️ End-to-End Mini Project