⚡ 07 — NumPy Cheat Sheet
Quick Reference for NumPy Fundamentals
How to Use This
Keep this file open while solving NumPy exercises. It is designed for fast lookup, not deep explanation.
📦 Import Convention
Importing NumPy under the alias np is the standard convention used in almost every data science project.
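A minimal sketch of the convention, with a quick check that the alias works. The variable name is illustrative.

```python
import numpy as np  # "np" is the conventional alias

# Quick sanity check that the alias works
arr = np.array([1, 2, 3])
print(type(arr).__name__)  # ndarray
```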
🧠 Core Ideas
| Concept | Meaning |
|---|---|
| ndarray | NumPy's main N-dimensional array object |
| Shape | Size of each dimension, e.g. (3, 4) |
| Dimension / Axis | Direction in the array: rows, columns, depth |
| dtype | Data type of array values, e.g. int64, float64 |
| Vectorization | Applying operations to whole arrays without Python loops |
| Broadcasting | Operating on arrays with different but compatible shapes |
Check Array Basics
arr = np.array([[1, 2, 3],
[4, 5, 6]])
print(arr.shape) # (2, 3)
print(arr.ndim) # 2
print(arr.size) # 6
print(arr.dtype) # int64 or int32
🏗️ Array Creation
From Python Lists
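A short sketch of building arrays from lists with np.array. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])               # 1D array, shape (3,)
b = np.array([[1, 2], [3, 4]])        # 2D array from nested lists, shape (2, 2)
c = np.array([1, 2, 3], dtype=float)  # force a dtype: [1. 2. 3.]

print(a.shape, b.shape, c.dtype)      # (3,) (2, 2) float64
```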
Common Constructors
np.zeros(5) # [0. 0. 0. 0. 0.]
np.ones((2, 3)) # 2x3 array of ones
np.full((2, 3), 7) # 2x3 array filled with 7
np.eye(3) # 3x3 identity matrix
np.empty((2, 2)) # uninitialized values
Ranges and Sequences
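A minimal sketch using np.arange and np.linspace, the two constructors listed in the Quick Command Table. The variable names are illustrative.

```python
import numpy as np

r1 = np.arange(5)          # [0 1 2 3 4]
r2 = np.arange(0, 10, 2)   # [0 2 4 6 8]; stop is excluded
r3 = np.linspace(0, 1, 5)  # [0.   0.25 0.5  0.75 1.  ]; stop is included
```

Use arange when you know the step and linspace when you know how many points you need.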
Random Arrays
np.random.seed(42)
np.random.rand(3) # uniform values in [0, 1)
np.random.randn(3) # normal distribution
np.random.randint(1, 10, 5) # 5 random integers in [1, 10)
np.random.choice([10, 20, 30], size=4)
Create Like Another Array
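A sketch assuming this section refers to the *_like constructors, which copy the shape and dtype of an existing array. The variable names are illustrative.

```python
import numpy as np

base = np.array([[1, 2, 3],
                 [4, 5, 6]])

z = np.zeros_like(base)    # same shape and dtype as base, filled with 0
o = np.ones_like(base)     # same shape and dtype as base, filled with 1
f = np.full_like(base, 7)  # same shape and dtype as base, filled with 7
```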
🔢 Data Types
| dtype | Use Case |
|---|---|
| int32, int64 | Whole numbers |
| float32, float64 | Decimal numbers |
| bool | Boolean masks |
| str_ | Text values, less common in NumPy |
Data Science Note
Most ML libraries expect numeric arrays. Convert strings or mixed data before modeling.
📐 Reshape and Flatten
arr = np.arange(12)
arr.reshape(3, 4) # shape (3, 4)
arr.reshape(2, 2, 3) # shape (2, 2, 3)
arr.reshape(-1, 3) # infer rows automatically
arr.flatten() # copy as 1D
arr.ravel() # view as 1D when possible
Transpose
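A short sketch of transposing with .T (listed in the Quick Command Table) and the equivalent transpose call. The variable names are illustrative.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]
t = m.T                         # shape (3, 2): rows become columns
t2 = m.transpose(1, 0)          # same result, with the axis order spelled out

v = np.arange(3)
print(v.T.shape)                # (3,): transposing a 1D array changes nothing
```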
🎯 Indexing and Slicing
1D Arrays
arr = np.array([10, 20, 30, 40, 50])
arr[0] # 10
arr[-1] # 50
arr[1:4] # [20 30 40]
arr[:3] # [10 20 30]
arr[::2] # [10 30 50]
2D Arrays
matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
matrix[0, 0] # 1
matrix[1, 2] # 6
matrix[0, :] # first row
matrix[:, 1] # second column
matrix[0:2, 1:3] # rows 0-1, columns 1-2
Modify Values
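A minimal sketch of assigning through an index or slice. The arrays are illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
arr[0] = 99    # single element
arr[1:3] = 0   # slice assignment broadcasts the scalar
print(arr)     # [99  0  0 40 50]

matrix = np.zeros((2, 3), dtype=int)
matrix[0, :] = [1, 2, 3]  # replace a whole row
matrix[:, 2] = 9          # replace a whole column
print(matrix)             # [[1 2 9]
                          #  [0 0 9]]
```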
✅ Boolean Indexing
arr = np.array([10, 25, 30, 45, 50])
mask = arr > 30
print(mask) # [False False False True True]
print(arr[mask]) # [45 50]
Common Filters
Use &, |, and ~
With NumPy arrays, use & instead of and, | instead of or, and ~ instead of not. Wrap each condition in parentheses.
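A few filters that follow this rule, as a sketch. The array and variable names are illustrative.

```python
import numpy as np

arr = np.array([10, 25, 30, 45, 50])

both = arr[(arr > 20) & (arr < 50)]    # AND: [25 30 45]
either = arr[(arr < 20) | (arr > 40)]  # OR:  [10 45 50]
negated = arr[~(arr > 30)]             # NOT: [10 25 30]
members = arr[np.isin(arr, [10, 50])]  # membership test: [10 50]
```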
Replace Values Conditionally
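Two common ways to do this, sketched with illustrative values: np.where returns a new array, while mask assignment changes the array in place.

```python
import numpy as np

arr = np.array([10, 25, 30, 45, 50])

# np.where(condition, value_if_true, value_if_false) builds a new array
capped = np.where(arr > 30, 30, arr)
print(capped)  # [10 25 30 30 30]

# Mask assignment modifies the array in place (copy first to keep arr intact)
arr2 = arr.copy()
arr2[arr2 < 20] = 0
print(arr2)    # [ 0 25 30 45 50]
```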
🎲 Fancy Indexing
matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
matrix[[0, 2]] # rows 0 and 2
matrix[[0, 1], [1, 2]] # elements (0,1) and (1,2): [2 6]
➕ Vectorized Operations
arr = np.array([1, 2, 3, 4])
arr + 10 # [11 12 13 14]
arr * 2 # [2 4 6 8]
arr ** 2 # [1 4 9 16]
arr / 2 # [0.5 1. 1.5 2. ]
Array-to-Array Operations
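A minimal sketch of element-wise operations between two arrays of the same shape. The names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

total = a + b      # [11 22 33]
product = a * b    # [10 40 90], element-wise, not a dot product
ratio = b / a      # [10. 10. 10.]
is_larger = b > a  # [ True  True  True]
```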
Avoid This
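An illustrative sketch of the pattern to avoid: an explicit Python loop over the elements. The names are hypothetical.

```python
values = [1, 2, 3, 4]

# Element-by-element loop: works, but slow for large inputs
squared = []
for v in values:
    squared.append(v ** 2)

print(squared)  # [1, 4, 9, 16]
```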
Prefer This
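An illustrative sketch of the vectorized form: one expression applied to the whole array. The names are hypothetical.

```python
import numpy as np

values = np.array([1, 2, 3, 4])

# One vectorized expression, no Python loop
squared = values ** 2

print(squared)  # [ 1  4  9 16]
```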
🧮 Universal Functions
arr = np.array([1, 4, 9, 16])
np.sqrt(arr)
np.log(arr)
np.exp(arr)
np.sin(arr)
np.abs(np.array([-3, -2, 1]))
np.round(np.array([1.234, 5.678]), 2)
📊 Aggregations
arr = np.array([10, 20, 30, 40])
arr.sum()
arr.mean()
arr.min()
arr.max()
arr.std()
arr.var()
arr.argmax() # index of max value
arr.argmin() # index of min value
Axis-Based Aggregations
matrix = np.array([[1, 2, 3],
[4, 5, 6]])
matrix.sum() # 21
matrix.sum(axis=0) # column sums: [5 7 9]
matrix.sum(axis=1) # row sums: [6 15]
matrix.mean(axis=0) # column means
matrix.mean(axis=1) # row means
Axis Memory Trick
axis=0 collapses rows and returns one result per column.
axis=1 collapses columns and returns one result per row.
📡 Broadcasting
Broadcasting lets NumPy operate on arrays with different but compatible shapes.
Simple Cases
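Three simple cases, as a sketch: a scalar, a row, and a column combined with a larger array. The arrays are illustrative.

```python
import numpy as np

row = np.array([1, 2, 3])          # shape (3,)
matrix = np.array([[10, 20, 30],
                   [40, 50, 60]])  # shape (2, 3)
col = np.array([[100], [200]])     # shape (2, 1)

s = row + 5       # scalar stretched over the array: [6 7 8]
r = matrix + row  # row added to every row: [[11 22 33], [41 52 63]]
c = matrix + col  # column added to every column: [[110 120 130], [240 250 260]]
```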
Broadcasting Rules
Compare shapes from right to left:
- Dimensions are compatible if they are equal.
- Dimensions are compatible if one of them is 1.
- If neither condition is true, broadcasting fails.
Shape Examples
| Shape A | Shape B | Works? | Result |
|---|---|---|---|
| (3,) | () | Yes | (3,) |
| (2, 3) | (3,) | Yes | (2, 3) |
| (2, 3) | (2, 1) | Yes | (2, 3) |
| (2, 3) | (2,) | No | Error |
| (3, 1) | (1, 4) | Yes | (3, 4) |
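The shape pairs in the table above can be checked without building any arrays. This sketch assumes NumPy 1.20 or later, where np.broadcast_shapes is available. The names pairs, results, and failed are illustrative.

```python
import numpy as np

# Compatible (shape A, shape B) pairs from the table
pairs = [((3,), ()), ((2, 3), (3,)), ((2, 3), (2, 1)), ((3, 1), (1, 4))]
results = [np.broadcast_shapes(a, b) for a, b in pairs]
print(results)  # [(3,), (2, 3), (2, 3), (3, 4)]

# The incompatible pair raises ValueError
try:
    np.broadcast_shapes((2, 3), (2,))
    failed = False
except ValueError:
    failed = True
print(failed)   # True
```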
Add New Axis
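A sketch of adding a length-1 axis with np.newaxis so that shapes become broadcast-compatible. The names are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])      # shape (3,)
col = v[:, np.newaxis]       # shape (3, 1)
row = v[np.newaxis, :]       # shape (1, 3)
also_col = v.reshape(-1, 1)  # shape (3, 1), same result as col

table = col * row            # (3, 1) * (1, 3) -> (3, 3) multiplication table
print(table)                 # [[1 2 3]
                             #  [2 4 6]
                             #  [3 6 9]]
```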
🧼 Common Data Cleaning Patterns
Normalize Values
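A sketch assuming "normalize" here means z-score standardization (subtract the mean, divide by the standard deviation). The values are illustrative.

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0, 40.0])

# z-score: result has mean 0 and standard deviation 1
z = (values - values.mean()) / values.std()

print(z.round(3))  # approximately [-1.342 -0.447  0.447  1.342]
```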
Min-Max Scale
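A minimal sketch of min-max scaling to the range [0, 1]. The values are illustrative, and the formula assumes the array is not constant (max differs from min).

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0, 40.0])

# Smallest value maps to 0, largest to 1
scaled = (values - values.min()) / (values.max() - values.min())

print(scaled.min(), scaled.max())  # 0.0 1.0
```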
Clip Outliers
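A sketch using np.clip with fixed bounds. The bounds 0 and 100 are assumptions chosen for illustration.

```python
import numpy as np

values = np.array([-50, 5, 10, 15, 200])

# Values below 0 become 0, values above 100 become 100
clipped = np.clip(values, 0, 100)

print(clipped)  # [  0   5  10  15 100]
```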
Handle Missing Values
values = np.array([1.0, np.nan, 3.0])
np.isnan(values)
values[~np.isnan(values)]
np.nanmean(values)
np.nanmedian(values)
np.nanstd(values)
Conditional Selection
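A sketch with np.where for two outcomes and np.select for several. The scores, labels, and thresholds are illustrative.

```python
import numpy as np

scores = np.array([35, 60, 85])

# Two outcomes: np.where(condition, value_if_true, value_if_false)
passed = np.where(scores >= 50, "pass", "fail")
print(passed)  # ['fail' 'pass' 'pass']

# Several outcomes: the first matching condition wins
grade = np.select([scores >= 80, scores >= 50], ["A", "B"], default="C")
print(grade)   # ['C' 'B' 'A']
```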
🔗 Combining Arrays
x = np.array([[1, 2],
[3, 4]])
y = np.array([[5, 6],
[7, 8]])
np.vstack([x, y]) # stack vertically: shape (4, 2)
np.hstack([x, y]) # stack horizontally: shape (2, 4)
np.concatenate([x, y], axis=0) # same as vstack here
np.concatenate([x, y], axis=1) # same as hstack here
✂️ Splitting Arrays
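A sketch of the main splitting functions. The arrays and names are illustrative.

```python
import numpy as np

arr = np.arange(6)               # [0 1 2 3 4 5]

parts = np.split(arr, 3)         # 3 equal parts: [0 1], [2 3], [4 5]
at = np.split(arr, [2, 5])       # split at indices 2 and 5: [0 1], [2 3 4], [5]
uneven = np.array_split(arr, 4)  # unequal parts allowed: sizes 2, 2, 1, 1

m = np.arange(8).reshape(2, 4)
left, right = np.hsplit(m, 2)    # two (2, 2) blocks, split by columns
```

np.split raises an error if the array cannot be divided evenly; np.array_split does not.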
🔍 Sorting and Unique Values
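A minimal sketch of np.sort and np.unique, both listed in the Quick Command Table. The array is illustrative.

```python
import numpy as np

arr = np.array([3, 1, 2, 3, 1])

ascending = np.sort(arr)         # sorted copy: [1 1 2 3 3]
descending = np.sort(arr)[::-1]  # reversed:    [3 3 2 1 1]

values, counts = np.unique(arr, return_counts=True)
print(values)  # [1 2 3]
print(counts)  # [2 1 2]
```

np.sort returns a new array; arr.sort() sorts in place.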
Argsort
scores = np.array([70, 90, 80])
order = np.argsort(scores)
print(order) # [0 2 1]
print(scores[order]) # [70 80 90]
🧱 Linear Algebra Basics
a = np.array([[1, 2],
[3, 4]])
b = np.array([[10, 20],
[30, 40]])
a @ b # matrix multiplication
np.dot(a, b) # also matrix multiplication
np.linalg.inv(a) # inverse
np.linalg.det(a) # determinant
* vs @
* performs element-wise multiplication.
@ performs matrix multiplication.
💾 Save and Load Arrays
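A sketch of a binary round trip with np.save and np.load, plus a text round trip with np.savetxt and np.loadtxt. The file names are illustrative, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

arr = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    # Binary .npy format: keeps shape and dtype exactly
    npy_path = os.path.join(tmp, "data.npy")
    np.save(npy_path, arr)
    loaded = np.load(npy_path)

    # Plain text (CSV): readable, but loads back as float64 by default
    txt_path = os.path.join(tmp, "data.csv")
    np.savetxt(txt_path, arr, delimiter=",", fmt="%d")
    from_text = np.loadtxt(txt_path, delimiter=",")

print(np.array_equal(arr, loaded))  # True
print(from_text.dtype)              # float64
```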
🧪 Mini Recipes
Create a 3x3 Matrix from 1 to 9
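One possible way, combining arange and reshape:

```python
import numpy as np

# arange excludes the stop value, so use 10 to include 9
matrix = np.arange(1, 10).reshape(3, 3)

print(matrix)  # [[1 2 3]
               #  [4 5 6]
               #  [7 8 9]]
```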
Extract Even Numbers
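A sketch using a Boolean mask built with the modulo operator. The input range is illustrative.

```python
import numpy as np

arr = np.arange(1, 11)     # 1 through 10
evens = arr[arr % 2 == 0]  # keep values with remainder 0

print(evens)  # [ 2  4  6  8 10]
```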
Replace Negative Values with Zero
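Two equivalent sketches, one with np.where and one with np.maximum. The values are illustrative.

```python
import numpy as np

arr = np.array([-3, 4, -1, 7])

cleaned = np.where(arr < 0, 0, arr)  # [0 4 0 7]
same = np.maximum(arr, 0)            # element-wise max against 0: [0 4 0 7]
```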
Standardize Each Column
data = np.array([[10, 100],
[20, 200],
[30, 300]])
standardized = (data - data.mean(axis=0)) / data.std(axis=0)
Normalize Each Row
data = np.array([[1, 2, 3],
[4, 5, 6]])
row_sums = data.sum(axis=1, keepdims=True)
normalized = data / row_sums
Convert Celsius to Fahrenheit
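A vectorized sketch of the formula F = C * 9/5 + 32. The temperatures are illustrative.

```python
import numpy as np

celsius = np.array([0, 25, 100])
fahrenheit = celsius * 9 / 5 + 32

print(fahrenheit)  # [ 32.  77. 212.]
```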
🚨 Common Errors
Shape Mismatch
Fix by checking shapes:
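A sketch that reproduces the error, inspects the shapes, and then fixes the mismatch by adding an axis. The arrays are illustrative.

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones(2)

# (2, 3) and (2,) are not compatible: trailing dimensions 3 and 2 differ
try:
    a + b
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

print(a.shape, b.shape)       # (2, 3) (2,)

# Turn b into a column of shape (2, 1) so it broadcasts across columns
fixed = a + b[:, np.newaxis]
print(fixed.shape)            # (2, 3)
```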
Using and / or with Arrays
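A sketch of the failure and the fix. Python's and tries to reduce each array to a single True or False, which NumPy refuses to do for arrays with more than one element.

```python
import numpy as np

arr = np.array([10, 25, 30])

try:
    mask = (arr > 15) and (arr < 30)  # wrong: raises ValueError
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

mask = (arr > 15) & (arr < 30)        # right: element-wise AND
print(mask)                           # [False  True False]
```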
Forgetting Parentheses in Boolean Conditions
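A sketch of why the parentheses matter: & binds more tightly than comparison operators, so the unparenthesized form is parsed as a chained comparison around 15 & arr. The array is illustrative.

```python
import numpy as np

arr = np.array([10, 25, 30])

try:
    mask = arr > 15 & arr < 30  # parsed as: arr > (15 & arr) < 30
    raised = False
except ValueError:
    raised = True               # the chained comparison needs a single bool

mask = (arr > 15) & (arr < 30)  # parenthesize each condition
print(mask)                     # [False  True False]
```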
Accidentally Modifying a View
Use .copy() when you need independent data:
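A sketch showing that a basic slice is a view, and that .copy() breaks the link. The names are illustrative.

```python
import numpy as np

original = np.array([1, 2, 3, 4])

view = original[:2]         # basic slicing returns a view
view[0] = 99                # this also changes original
print(original)             # [99  2  3  4]

safe = original[:2].copy()  # independent memory
safe[1] = -1                # original is unaffected
print(original)             # [99  2  3  4]

print(np.shares_memory(original, view))  # True
print(np.shares_memory(original, safe))  # False
```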
🧾 Quick Command Table
| Task | Code |
|---|---|
| Import NumPy | import numpy as np |
| Create array | np.array([1, 2, 3]) |
| Create zeros | np.zeros((2, 3)) |
| Create ones | np.ones((2, 3)) |
| Create range | np.arange(0, 10, 2) |
| Create evenly spaced values | np.linspace(0, 1, 5) |
| Random integers | np.random.randint(1, 10, size=5) |
| Shape | arr.shape |
| Dimensions | arr.ndim |
| Total elements | arr.size |
| Data type | arr.dtype |
| Reshape | arr.reshape(3, 4) |
| Flatten | arr.flatten() |
| Transpose | arr.T |
| Row slice | matrix[0, :] |
| Column slice | matrix[:, 0] |
| Boolean filter | arr[arr > 10] |
| Conditional values | np.where(condition, a, b) |
| Sum | arr.sum() |
| Mean | arr.mean() |
| Standard deviation | arr.std() |
| Min / max | arr.min(), arr.max() |
| Sort | np.sort(arr) |
| Unique values | np.unique(arr) |
| Concatenate | np.concatenate([a, b]) |
| Matrix multiply | a @ b |
| Save array | np.save("file.npy", arr) |
| Load array | np.load("file.npy") |
🎤 Interview Quick Answers
Why is NumPy faster than Python lists?
NumPy arrays store values in contiguous memory with a fixed data type, so operations can run in optimized compiled code instead of slow Python loops.
What is vectorization?
Vectorization means applying operations to entire arrays at once instead of looping element by element in Python.
What is broadcasting?
Broadcasting is NumPy's rule-based system for performing operations between arrays of different but compatible shapes.
What is the difference between arr.shape and arr.size?
shape returns the dimensions of the array. size returns the total number of elements.
What is the difference between a view and a copy?
A view shares memory with the original array. A copy owns independent memory. Modifying a view can modify the original.
What does axis=0 mean?
It means operate down the rows, producing one result per column.
What does axis=1 mean?
It means operate across columns, producing one result per row.
✅ Final Checklist
- [ ] I can create arrays using array, zeros, ones, arange, and linspace
- [ ] I can check shape, ndim, size, and dtype
- [ ] I can index and slice 1D and 2D arrays
- [ ] I can filter arrays using Boolean masks
- [ ] I can replace loops with vectorized operations
- [ ] I can use aggregations with axis=0 and axis=1
- [ ] I can predict simple broadcasting behavior
- [ ] I can handle np.nan values
- [ ] I can reshape, flatten, stack, and split arrays
Tags: #numpy #cheat-sheet #arrays #indexing #vectorization #broadcasting