Python, OpenCV, NumPy, Matplotlib & Image Processing
0. Introduction
This document summarizes the most important fundamentals of image processing using OpenCV, NumPy, and Matplotlib.
All examples are written in Python and run in both standalone scripts and Jupyter Notebooks (with Matplotlib adaptations).
1. Images & NumPy Arrays – Basic Concept
- Images are represented as NumPy arrays: (height, width, channels)
- Pixel values: 0–255, data type uint8
- Colors:
  - RGB = Red / Green / Blue
  - HSV = Hue / Saturation / Value
  - OpenCV uses BGR by default
Loading an Image & Array Info
import cv2
import numpy as np
img = cv2.imread("image.jpg")
print(type(img)) # numpy.ndarray
print(img.shape) # e.g., (1080, 1920, 3)
print(img.dtype) # uint8
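Because an image is an ordinary NumPy array, single pixels and regions can be read and modified with plain array indexing. A minimal sketch (the file name "image.jpg" and the indices used are assumptions; the image must be large enough):
img = cv2.imread("image.jpg")        # BGR array of shape (h, w, 3)
b, g, r = img[100, 200]              # read one pixel (row 100, column 200)
roi = img[0:100, 0:200]              # region of interest: rows 0-99, columns 0-199
img[0:50, 0:50] = (0, 0, 255)        # paint a 50x50 block pure red (BGR order!)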
2. Loading, Displaying, Saving Images (OpenCV)
2.1 Loading an Image
img = cv2.imread("image.jpg", cv2.IMREAD_COLOR)
gray = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
2.2 Displaying an Image (for pure Python scripts)
cv2.imshow("Window", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
2.3 Saving an Image
cv2.imwrite("output.jpg", img)
3. Drawing on Images (Shapes, Text)
import cv2
import numpy as np
img = np.zeros((600, 800, 3), dtype=np.uint8)
# Line
cv2.line(img, (50,50), (300,50), (0,255,0), 3)
# Rectangle
cv2.rectangle(img, (50,100), (300,300), (255,0,0), 5)
# Square (outline)
cv2.rectangle(img, (350,100), (550,300), (0,0,255), 3)
# Circle (filled)
cv2.circle(img, (200,450), 80, (0,255,255), -1)
# Polygon
pts = np.array([[500,400], [700,500], [650,550], [480,500]], np.int32)
cv2.polylines(img, [pts], True, (255,255,0), 3)
# Text
cv2.putText(img, "Hello OpenCV!", (50,580),
            cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2)
cv2.imshow("Drawings", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
4. Mouse Events (Interactive Drawing)
4.1 Example: Draw Circle on Click
import cv2
import numpy as np
img = np.zeros((600,800,3), dtype=np.uint8)
def click_event(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        cv2.circle(img, (x,y), 20, (0,255,0), -1)
        cv2.imshow("Window", img)
cv2.namedWindow("Window")
cv2.setMouseCallback("Window", click_event)
cv2.imshow("Window", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
4.2 Overview of Mouse Events
| Event | Meaning |
|---|---|
| EVENT_LBUTTONDOWN | Left mouse button pressed |
| EVENT_LBUTTONUP | Left mouse button released |
| EVENT_LBUTTONDBLCLK | Left double click |
| EVENT_RBUTTONDOWN | Right mouse button pressed |
| EVENT_RBUTTONUP | Right mouse button released |
| EVENT_MBUTTONDOWN | Middle button (scroll wheel click) |
| EVENT_MOUSEMOVE | Mouse moved |
| EVENT_MOUSEWHEEL | Mouse wheel, vertical |
| EVENT_MOUSEHWHEEL | Mouse wheel, horizontal |
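A small sketch that combines EVENT_LBUTTONDOWN, EVENT_MOUSEMOVE, and EVENT_LBUTTONUP to paint while the left button is held down (window name and canvas size are arbitrary choices):
import cv2
import numpy as np
img = np.zeros((600,800,3), dtype=np.uint8)
drawing = False  # True while the left mouse button is held down
def paint(event, x, y, flags, param):
    global drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
    elif event == cv2.EVENT_MOUSEMOVE and drawing:
        cv2.circle(img, (x,y), 5, (0,0,255), -1)
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False
cv2.namedWindow("Paint")
cv2.setMouseCallback("Paint", paint)
while True:
    cv2.imshow("Paint", img)
    if cv2.waitKey(1) == 27:  # ESC quits
        break
cv2.destroyAllWindows()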
5. Matplotlib in Jupyter
OpenCV stores images in BGR order, while Matplotlib expects RGB, so a conversion is needed before plt.imshow.
import matplotlib.pyplot as plt
import cv2
img = cv2.imread("image.jpg")
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(10, 6))
plt.imshow(img_rgb)
plt.axis("off")
6. Image Processing Basics
6.1 Blending – addWeighted
blend = cv2.addWeighted(img1, 0.7, img2, 0.3, 0)
Allows for image superposition and transparency effects.
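Both inputs must have the same size and data type, otherwise addWeighted raises an error. A minimal sketch that resizes the second image first (the file names "image1.jpg" / "image2.jpg" are assumptions):
import cv2
img1 = cv2.imread("image1.jpg")
img2 = cv2.imread("image2.jpg")
img2 = cv2.resize(img2, (img1.shape[1], img1.shape[0]))  # match width/height of img1
blend = cv2.addWeighted(img1, 0.7, img2, 0.3, 0)         # 70 % img1 + 30 % img2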
6.2 Bitwise Operations
bit_and = cv2.bitwise_and(img, img, mask=mask)
bit_or = cv2.bitwise_or(img, img, mask=mask)
bit_not = cv2.bitwise_not(img)
bit_xor = cv2.bitwise_xor(img, img, mask=mask)
Ideal for masking & segmentation.
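A minimal masking sketch (assuming "image.jpg" exists): a filled white circle on a black single-channel mask keeps only that circular region.
import cv2
import numpy as np
img = cv2.imread("image.jpg")
mask = np.zeros(img.shape[:2], dtype=np.uint8)   # single-channel mask, all black
cv2.circle(mask, (img.shape[1]//2, img.shape[0]//2), 150, 255, -1)  # white = keep
masked = cv2.bitwise_and(img, img, mask=mask)    # only pixels inside the circle survive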
6.3 Color Conversions & Histograms (cvtColor)
OpenCV uses BGR, but many operations (Histograms, Thresholding, Preprocessing, Matplotlib) require GRAY or RGB.
Common Conversions:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
Important reasons for color conversion:
- Gray: ideal for thresholding, edge detection, histograms, feature detection.
- RGB: required by Matplotlib.
- HSV: color segmentation (Hue for color, Value for brightness) – see the sketch after this list.
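A sketch of HSV color segmentation with cv2.inRange; the hue range below roughly covers green and is only an illustrative choice ("image.jpg" assumed):
import cv2
import numpy as np
img = cv2.imread("image.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower = np.array([40, 50, 50])    # OpenCV hue runs 0-179; 40-80 is roughly green
upper = np.array([80, 255, 255])
mask = cv2.inRange(hsv, lower, upper)            # white where the color matches
segmented = cv2.bitwise_and(img, img, mask=mask)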
✔️ Colormapping (Detailed)
A Colormap takes a 1-channel image (e.g., Gray) and assigns a color to every value from 0–255:
heat = cv2.applyColorMap(gray, cv2.COLORMAP_JET)
Typical Colormaps:
- COLORMAP_JET → Blue → Red
- COLORMAP_HOT → Black → Red → Yellow → White
- COLORMAP_OCEAN
- COLORMAP_TURBO (modern alternative to JET)
6.3.1 Histograms – RGB, calcHist, and Histogram Equalization
Histograms show the distribution of pixel intensities.
We distinguish between:
- RGB Histograms → color distribution (3 channels separately)
- Grayscale Histograms → contrast analysis
- Histogram Equalization → contrast enhancement
We show all three channels (R, G, B) in one diagram:
img = cv2.imread("image.jpg")
b, g, r = cv2.split(img)
plt.figure(figsize=(10,5))
plt.title("RGB Histograms")
plt.xlabel("Pixel Value")
plt.ylabel("Count")
plt.plot(cv2.calcHist([b], [0], None, [256], [0,256]), color='blue')
plt.plot(cv2.calcHist([g], [0], None, [256], [0,256]), color='green')
plt.plot(cv2.calcHist([r], [0], None, [256], [0,256]), color='red')
plt.xlim([0,256])
plt.show()
For a grayscale histogram, calcHist is applied to the single channel:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
hist = cv2.calcHist([gray], [0], None, [256], [0,256])
| Parameter | Meaning |
|---|---|
| [gray] | Input image list |
| [0] | Channel 0 (the only channel in a grayscale image) |
| None | No mask |
| [256] | 256 bins in the histogram |
| [0,256] | Range of values |
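The hist array returned by calcHist can be plotted directly (continuing the snippet above, with plt already imported as in section 5):
plt.figure(figsize=(8,4))
plt.title("Grayscale Histogram")
plt.xlabel("Pixel Value")
plt.ylabel("Count")
plt.plot(hist, color='gray')
plt.xlim([0,256])
plt.show()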
Histogram Equalization (Contrast Improvement)
Histogram equalization spreads the pixel intensities over the full 0–255 range → low-contrast regions gain contrast and more details become visible.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
equalized = cv2.equalizeHist(gray)
cv2.imshow("Original Gray", gray)
cv2.imshow("Equalized", equalized)
cv2.waitKey(0)
6.4 Thresholding – Types & Explanation
| Type | Description |
|---|---|
| THRESH_BINARY | Pixel ≥ T → max, else 0 |
| THRESH_BINARY_INV | Inverted: Pixel ≥ T → 0, else max |
| THRESH_TRUNC | Pixel ≥ T → reduced to T, else stays |
| THRESH_TOZERO | Pixel < T → 0, else stays |
| THRESH_TOZERO_INV | Pixel ≥ T → 0, else stays |
| ADAPTIVE_MEAN_C | Local threshold = Mean of neighborhood |
| ADAPTIVE_GAUSSIAN_C | Local threshold = Gaussian-weighted mean |
| OTSU | Automatically calculates the best threshold |
Example:
ret, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
adaptive = cv2.adaptiveThreshold(gray, 255,
                                 cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 11, 2)
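The OTSU flag from the table is combined with a binary threshold; the threshold value passed in is ignored and the automatically chosen one is returned (a sketch continuing with the gray image from above):
ret, otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold:", ret)  # the threshold Otsu's method selected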
6.5 Blurring (Averaging, Gaussian, Median, Bilateral)
Averaging Blur (cv2.blur)
Uniform kernel, smooths heavily.
blur = cv2.blur(img, (7,7))
Gaussian Blur (cv2.GaussianBlur)
Weights pixels according to Gaussian distribution → very soft blur.
gauss = cv2.GaussianBlur(img, (7,7), 1.5)
Median Blur (cv2.medianBlur)
Very good against Salt-and-Pepper Noise.
med = cv2.medianBlur(img, 5)
Bilateral Filter (cv2.bilateralFilter)
Smooths without losing edges, as color differences are taken into account.
bil = cv2.bilateralFilter(img, 9, 75, 75)
6.6 Kernel Size & Interactive Kernel Demo
👉 Interactive Site: https://setosa.io/ev/image-kernels/
Visually demonstrates how convolution kernels work.
6.7 2D Convolution – Briefly Explained
Principle:
$$\text{new value} = \sum (\text{kernel} \times \text{pixel neighborhood})$$
Example:
kernel = np.ones((5,5), np.float32)/25
conv = cv2.filter2D(img, -1, kernel)
Used for:
- Blur
- Sharpen
- Edge detection
- Custom filters (see the sharpening sketch after this list)
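As an illustration of a custom filter, a common 3×3 sharpening kernel (one of many possible choices) can be passed to filter2D:
import cv2
import numpy as np
img = cv2.imread("image.jpg")
sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=np.float32)  # center 5, direct neighbours -1
sharp = cv2.filter2D(img, -1, sharpen_kernel)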
6.8 Noise (Adding Noise)
noise = np.random.randint(0, 100, (img.shape[0], img.shape[1]))
sp = img.copy()
sp[noise < 2] = 0      # "pepper": ~2 % of the pixels become black
sp[noise > 97] = 255   # "salt": ~2 % of the pixels become white
Simulates salt-and-pepper noise by corrupting only a small random fraction of the pixels.
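The noisy image sp is a good test case for the median blur from section 6.5, which suppresses exactly this kind of isolated black/white pixels:
clean = cv2.medianBlur(sp, 5)      # median filter removes most of the salt-and-pepper noise
cv2.imshow("Salt & Pepper", sp)
cv2.imshow("Median Filtered", clean)
cv2.waitKey(0)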
6.9 Bilateral Filter – Brief Description
The bilateral filter considers:
- Spatial distance (like Gaussian)
- Color similarity (prevents blurring of edges)
Result:
- Smooth image
- Sharp edges are preserved
6.10 Morphological Operations – Erosion & Dilation (Briefly touched)
Morphological operations work on binary images (black/white) and change the shape of objects.
They are based on a small structuring element (“Kernel”).
Erosion (cv2.erode)
→ Objects shrink
- Removes small white pixel areas
- "Eats away" pixels at object boundaries
- Very good for removing noise or thin lines
kernel = np.ones((5,5), np.uint8)
eroded = cv2.erode(thresh, kernel, iterations=1)
Dilation (cv2.dilate)
→ Objects grow
- Makes white objects larger
- Closes small holes
- Good after erosion or to highlight contours
kernel = np.ones((5,5), np.uint8)
dilated = cv2.dilate(thresh, kernel, iterations=1)
7. Template Matching, Corner Detection, Edge Detection, Grid Detection & Contours (Compact + Complete)
7.1 Template Matching
Template Matching searches for a small pattern (Template) in a larger image.
6 Matching Methods (OpenCV)
| Method | Meaning |
|---|---|
| TM_SQDIFF | Lower = better match |
| TM_SQDIFF_NORMED | Normalized variant (lower = better) |
| TM_CCORR | Higher = better |
| TM_CCORR_NORMED | Normalized |
| TM_CCOEFF | Correlation coefficient (robust to brightness) |
| TM_CCOEFF_NORMED | Normalized (often the best choice) |
import cv2
import numpy as np
img = cv2.imread("image.jpg", 0)
tpl = cv2.imread("tpl.jpg", 0)
h, w = tpl.shape
methods = [
cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED,
cv2.TM_CCORR, cv2.TM_CCORR_NORMED,
cv2.TM_CCOEFF, cv2.TM_CCOEFF_NORMED
]
for m in methods:
    res = cv2.matchTemplate(img, tpl, m)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    top_left = min_loc if m in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED] else max_loc
    bottom_right = (top_left[0]+w, top_left[1]+h)
    disp = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    cv2.rectangle(disp, top_left, bottom_right, (0,0,255), 2)
    cv2.imshow(str(m), disp)
    cv2.waitKey(0)
cv2.destroyAllWindows()
7.2 Corner Detection
7.2.1 Harris Corner
img = cv2.imread("image.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
dst = cv2.cornerHarris(gray, 2, 3, 0.04)
dst = cv2.dilate(dst, None)
img[dst > 0.01 * dst.max()] = [0, 0, 255]
cv2.imshow("Harris", img)
cv2.waitKey(0)
7.2.2 Shi-Tomasi (Good Features To Track)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
corners = cv2.goodFeaturesToTrack(gray, 80, 0.01, 10)
corners = corners.astype(np.intp)  # np.int0 was removed in newer NumPy versions
for c in corners:
    x, y = c.ravel()
    cv2.circle(img, (x,y), 4, (0,255,0), -1)
cv2.imshow("Shi-Tomasi", img)
cv2.waitKey(0)
7.3 Edge Detection – Canny + Contours
Sobel Operator (Directed Edges)
The Sobel Operator detects edges in x or y direction:
- Sobel X → vertical edges
- Sobel Y → horizontal edges
- Combination → diagonal / general edges
Sobel X
sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
Sobel Y
sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
Magnitude (Strength of the edge)
sobel_mag = cv2.magnitude(sobelx, sobely)
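The Sobel results are float64 (CV_64F) and must be converted back to uint8 before they can be displayed; one common way, continuing the lines above:
sobel_disp = cv2.convertScaleAbs(sobel_mag)  # absolute value + conversion to uint8
cv2.imshow("Sobel Magnitude", sobel_disp)
cv2.waitKey(0)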
Scharr Operator (More Accurate 3×3 Gradient than Sobel)
For very fine details, use the Scharr Operator:
scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)
scharry = cv2.Scharr(img, cv2.CV_64F, 0, 1)
It works like Sobel, but its 3×3 kernel approximates the image gradient more accurately.
Laplacian (Second Derivative → General Edges)
The Laplacian operator detects edges in all directions simultaneously because it calculates the second derivative.
laplace = cv2.Laplacian(img, cv2.CV_64F)
➡️ Result: very “thin”, sharp edges.
➡️ Sensitive to noise → blur beforehand!
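A sketch that follows this advice: blur first, then apply the Laplacian, and convert back to uint8 for display ("image.jpg" assumed):
import cv2
gray = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(gray, (3,3), 0)       # suppress noise first
laplace = cv2.Laplacian(blurred, cv2.CV_64F)
laplace_disp = cv2.convertScaleAbs(laplace)      # back to uint8 for display
cv2.imshow("Laplacian", laplace_disp)
cv2.waitKey(0)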
7.3.1 Canny
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)
cv2.imshow("Canny", edges)
cv2.waitKey(0)
7.3.2 Extracting Contours from Canny
contours, hierarchy = cv2.findContours(edges,
                                       cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
7.3.3 Drawing Contours after Canny
img_c = img.copy()
cv2.drawContours(img_c, contours, -1, (0,255,0), 2)
cv2.imshow("Canny Contours", img_c)
cv2.waitKey(0)
7.3.4 Fine Contours (approxPolyDP)
img_a = img.copy()
for cnt in contours:
    eps = 0.01 * cv2.arcLength(cnt, True)
    approx = cv2.approxPolyDP(cnt, eps, True)
    cv2.drawContours(img_a, [approx], -1, (0,0,255), 2)
cv2.imshow("Canny Approx", img_a)
cv2.waitKey(0)
7.4 Grid Detection (Chessboard & Circle Grid)
Important for calibration, tracking, pose estimation.
7.4.1 Chessboard
pattern = (7,6)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
found, corners = cv2.findChessboardCorners(gray, pattern)
if found:
    cv2.drawChessboardCorners(img, pattern, corners, found)
cv2.imshow("Chessboard", img)
cv2.waitKey(0)
7.4.2 Circle Grid
pattern = (4,11)
found, centers = cv2.findCirclesGrid(gray, pattern,
                                     flags=cv2.CALIB_CB_ASYMMETRIC_GRID)
if found:
    cv2.drawChessboardCorners(img, pattern, centers, found)
cv2.imshow("Circle Grid", img)
cv2.waitKey(0)
7.4.3 Grid Detection in Videos (Briefly)
cap = cv2.VideoCapture("video.mp4")
pattern = (7,6)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        cv2.drawChessboardCorners(frame, pattern, corners, found)
    cv2.imshow("Grid Video", frame)
    if cv2.waitKey(1) == 27:  # ESC quits
        break
cap.release()
cv2.destroyAllWindows()
7.5 Contour Detection (Object Contours)
7.5.1 Finding Contours
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh,
                                       cv2.RETR_TREE,
                                       cv2.CHAIN_APPROX_SIMPLE)
7.5.2 Drawing Contours
img_cnt = img.copy()
cv2.drawContours(img_cnt, contours, -1, (0,255,0), 2)
cv2.imshow("Contours", img_cnt)
cv2.waitKey(0)
7.5.3 Fine Contours (Polygon Approximation)
cnt = max(contours, key=cv2.contourArea)
eps = 0.01 * cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, eps, True)
cv2.drawContours(img, [approx], -1, (0,0,255), 2)
7.5.4 Filtering Contours & Understanding Hierarchy
Filtering by size (e.g., only large objects)
filtered = [c for c in contours if cv2.contourArea(c) > 1000]
cv2.drawContours(img, filtered, -1, (255,255,0), 2)
Only Outer Contours (Background/Top-Level)
cont_ext, hier_ext = cv2.findContours(thresh,
                                      cv2.RETR_EXTERNAL,
                                      cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(img, cont_ext, -1, (0,0,255), 2)
Hierarchy Briefly Explained
hierarchy[0][i] = [next, prev, first_child, parent]
E.g., only contours without a parent → outermost level
top_level = []
for i, cnt in enumerate(contours):
    if hierarchy[0][i][3] == -1:  # parent == -1 → no parent
        top_level.append(cnt)
cv2.drawContours(img, top_level, -1, (0,255,255), 2)