A six-week, first-principles course that builds modern computer vision from scratch in pure NumPy, bridging the classical mathematics of image processing to the convolutional networks that power vision today. Every operator, from a Gaussian blur to a convolutional layer with backpropagation, is derived and implemented by hand before any high-level library is allowed.
Each lecture is a self-contained Jupyter notebook: the mathematical derivation, a from-scratch
implementation, visualizations on real images, verification against scipy/skimage where
relevant, and exercises with worked solutions.
Most computer vision courses start with `import torch` and treat convolution as a black box. This
one starts with the integral that defines convolution and ends with a CNN whose every gradient you
have derived yourself. The through-line is always:
derive the math → implement it in NumPy → visualize it on a real image → verify against a reference → connect it to a modern model.
By the end you will have hand-written: 2D convolution, Gaussian and derivative filters, the Canny edge detector, Harris corners, a Hough transform, the Lucas–Kanade optical flow equations, k-means and graph-based segmentation, a full convolutional layer with backpropagation, and a small trained CNN — and you will understand exactly which classical idea each deep-learning component generalizes.
- Comfortable Python and NumPy (arrays, slicing, broadcasting)
- Linear algebra at the level of the companion linear_algebra_for_ml course (vectors, matrices, convolution as a linear operator, eigenvalues for the Harris detector and PCA)
- Basic calculus (gradients, the chain rule for backprop)
What a digital image is: a function sampled onto a grid of numbers. Pixels, channels, color spaces, histograms, and point operations (brightness, contrast, gamma, thresholding, histogram equalization). The foundation everything else builds on.
- 01_images_as_arrays.ipynb
- 02_point_operations_histograms.ipynb
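As an illustration of the point operations listed above, here is a minimal NumPy sketch of gamma correction and histogram equalization. The function names `gamma_correct` and `equalize_histogram` are hypothetical and not taken from the notebooks, and the low-contrast test image is synthetic.

```python
import numpy as np

def gamma_correct(img, gamma):
    """Point operation out = in**gamma for a float image with values in [0, 1]."""
    return np.clip(img, 0.0, 1.0) ** gamma

def equalize_histogram(img_u8):
    """Histogram-equalize an 8-bit grayscale image through its cumulative histogram."""
    hist = np.bincount(img_u8.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                      # CDF value at the darkest level present
    scale = max(cdf[-1] - cdf_min, 1)              # guard against a constant image
    lut = np.round((cdf - cdf_min) / scale * 255)  # lookup table: old level -> new level
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img_u8]

# Synthetic low-contrast image: gray levels squeezed into [100, 140)
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 140, size=(64, 64)).astype(np.uint8)

equalized = equalize_histogram(low_contrast)          # stretched over the full 0..255 range
brightened = gamma_correct(low_contrast / 255.0, 0.5) # gamma < 1 brightens mid-tones
```

Both functions act on each pixel independently of its neighbors, which is what distinguishes point operations from the filters of the following week.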
The single most important operation in vision. Derive convolution from first principles, implement it from scratch, and build the classic filters: box, Gaussian, sharpening. Separable kernels, borders, and why convolution is a linear shift-invariant system.
- 03_convolution_from_scratch.ipynb
- 04_smoothing_gaussian_filters.ipynb
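To make the convolution and separability ideas concrete, the following is a minimal sketch of a direct 2D convolution with zero padding, followed by a check that a 2D Gaussian can be applied as two 1D passes. The names `conv2d` and `gaussian_1d`, the zero-padding choice, and the random test image are assumptions for illustration, not the notebooks' code.

```python
import numpy as np

def conv2d(image, kernel):
    """'Same'-size 2D convolution with zero padding; assumes odd kernel dimensions."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    flipped = kernel[::-1, ::-1]          # true convolution flips the kernel
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def gaussian_1d(sigma, radius):
    """Sampled 1D Gaussian, normalized so the weights sum to 1."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

g = gaussian_1d(sigma=1.0, radius=2)
kernel_2d = np.outer(g, g)                # separable: 2D Gaussian = outer product of 1D

rng = np.random.default_rng(0)
image = rng.random((16, 16))

full = conv2d(image, kernel_2d)                              # one 5x5 pass
separable = conv2d(conv2d(image, g[None, :]), g[:, None])    # row pass, then column pass

print(np.allclose(full, separable))
```

The separable version does roughly 2k multiplications per pixel instead of k² for a k×k kernel, which is why Gaussian blurs are usually implemented as two 1D passes.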
Where does information live in an image? At its edges. Image gradients, Sobel operators, the Laplacian, and the full multi-stage Canny edge detector built by hand — non-maximum suppression and hysteresis included.
- 05_image_gradients_edges.ipynb
- 06_canny_edge_detector.ipynb
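As a small illustration of the gradient stage that feeds Canny, here is a sketch of the 3×3 Sobel operators written with array slicing, applied to a synthetic vertical step edge. The function name `sobel_gradients` and the edge-replicating border are assumptions; the operators are applied as correlation (no kernel flip), so the sign of `gx` is positive where intensity increases to the right.

```python
import numpy as np

def sobel_gradients(image):
    """Return (gx, gy) from the 3x3 Sobel operators with edge-replicated borders."""
    p = np.pad(image.astype(float), 1, mode="edge")
    # gx: right column minus left column, each smoothed vertically with weights 1-2-1
    gx = ((p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]))
    # gy: bottom row minus top row, each smoothed horizontally with weights 1-2-1
    gy = ((p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]))
    return gx, gy

# Synthetic image: dark left half, bright right half (vertical edge between columns 3 and 4)
image = np.zeros((8, 8))
image[:, 4:] = 1.0

gx, gy = sobel_gradients(image)
magnitude = np.hypot(gx, gy)      # edge strength
direction = np.arctan2(gy, gx)    # gradient angle, used by non-maximum suppression
```

On this image `gy` is zero everywhere and `gx` is nonzero only in the two columns next to the step, so the magnitude forms a two-pixel-wide ridge; thinning such ridges to one pixel is the job of non-maximum suppression.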
Finding and describing distinctive points. The Harris corner detector (an eigenvalue problem — straight from linear algebra), blob detection across scale, simple descriptors, and feature matching between two views — the basis of panoramas and structure-from-motion.
- 07_harris_corners.ipynb
- 08_blobs_features_matching.ipynb
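The eigenvalue view of the Harris detector can be sketched as follows: build the structure tensor from windowed products of the image gradients, then score each pixel with det(M) − k·trace(M)², which is large only when both eigenvalues are large. This sketch makes several simplifying assumptions that are not from the notebooks: `np.gradient` central differences instead of Sobel, a plain box window instead of a Gaussian, and k = 0.05 (a commonly used empirical value). The names `box_sum` and `harris_response` are hypothetical.

```python
import numpy as np

def box_sum(a, radius):
    """Sum of a over a (2r+1) x (2r+1) window around each pixel, zero padded."""
    p = np.pad(a, radius, mode="constant")
    out = np.zeros(a.shape, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_response(image, radius=1, k=0.05):
    """Harris corner response R = det(M) - k * trace(M)**2 at every pixel."""
    iy, ix = np.gradient(image.astype(float))   # derivatives along rows (y) and columns (x)
    sxx = box_sum(ix * ix, radius)              # entries of the 2x2 structure tensor M
    syy = box_sum(iy * iy, radius)
    sxy = box_sum(ix * iy, radius)
    det = sxx * syy - sxy ** 2                  # product of the eigenvalues
    trace = sxx + syy                           # sum of the eigenvalues
    return det - k * trace ** 2

# Synthetic image: a bright quadrant whose only corner sits at row 8, column 8
image = np.zeros((20, 20))
image[8:, 8:] = 1.0

R = harris_response(image)
peak = np.unravel_index(np.argmax(R), R.shape)
```

On this image the response is positive near the corner, negative along the two straight edges (one dominant eigenvalue), and zero in flat regions, which matches the usual interpretation of the two eigenvalues.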
Grouping pixels and tracking them over time. Thresholding, k-means color segmentation, the Hough transform for lines, and the Lucas–Kanade optical flow equations derived and implemented from the brightness-constancy assumption.
- 09_segmentation_hough_kmeans.ipynb
- 10_optical_flow_lucas_kanade.ipynb
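The Lucas–Kanade step can be sketched from the linearized brightness-constancy constraint, Ix·u + Iy·v + It = 0, solved in the least-squares sense over a window. The example below treats the whole image as a single window and uses a synthetic smooth pattern shifted by a known sub-pixel amount; the names `lucas_kanade_window` and `pattern`, and the chosen shift, are assumptions for illustration only.

```python
import numpy as np

def lucas_kanade_window(frame1, frame2):
    """Estimate a single flow vector (u, v) for a patch by least squares."""
    iy, ix = np.gradient(frame1)     # spatial derivatives (axis 0 = y, axis 1 = x)
    it = frame2 - frame1             # temporal derivative between the two frames
    # Drop the border, where np.gradient falls back to one-sided differences
    ix, iy, it = (a[1:-1, 1:-1].ravel() for a in (ix, iy, it))
    A = np.stack([ix, iy], axis=1)   # one row [Ix, Iy] per pixel
    flow, *_ = np.linalg.lstsq(A, -it, rcond=None)   # solve A @ [u, v] = -It
    return flow

def pattern(x, y):
    """Smooth synthetic texture with gradients in both directions."""
    return np.sin(0.2 * x) + np.sin(0.15 * y)

y, x = np.mgrid[0:40, 0:40].astype(float)
true_u, true_v = 0.2, -0.1                      # known motion in pixels (x, y)
frame1 = pattern(x, y)
frame2 = pattern(x - true_u, y - true_v)        # same pattern moved by (true_u, true_v)

u_est, v_est = lucas_kanade_window(frame1, frame2)
print(u_est, v_est)   # should land close to true_u, true_v
```

The estimate is only approximate because the constraint is a first-order Taylor expansion; it degrades as the motion grows relative to the scale of the image structure, which is the usual motivation for coarse-to-fine pyramids.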
The bridge to modern vision. Build a convolutional layer with full forward and backward passes in NumPy, add pooling and a classifier head, train a small CNN on real digits, and visualize the learned filters — connecting them all the way back to the hand-designed edge detectors of Week 3.
- 11_cnn_layer_from_scratch.ipynb
- 12_training_a_cnn_capstone.ipynb
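A stripped-down version of the forward and backward passes might look like the sketch below: a single-channel, single-filter layer using valid cross-correlation (the convention most deep-learning layers use), with a finite-difference check of the weight gradient. Batching, multiple channels, bias, stride, and padding are omitted, and the names `conv_forward` and `conv_backward` are hypothetical rather than the notebooks' API.

```python
import numpy as np

def conv_forward(x, w):
    """Valid cross-correlation of a 2D input x with a 2D filter w."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def conv_backward(x, w, grad_out):
    """Chain rule: gradients w.r.t. x and w, given the upstream gradient dL/d(out)."""
    kh, kw = w.shape
    grad_x = np.zeros_like(x)
    grad_w = np.zeros_like(w)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            # out[i, j] depends on the patch x[i:i+kh, j:j+kw] and on every weight
            grad_w += grad_out[i, j] * x[i:i + kh, j:j + kw]
            grad_x[i:i + kh, j:j + kw] += grad_out[i, j] * w
    return grad_x, grad_w

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
w = rng.standard_normal((3, 3))
grad_out = rng.standard_normal((4, 4))   # stand-in for a gradient arriving from later layers

grad_x, grad_w = conv_backward(x, w, grad_out)

# Finite-difference check of dL/dw for the scalar loss L = sum(out * grad_out)
eps = 1e-6
numeric_grad_w = np.zeros_like(w)
for a in range(w.shape[0]):
    for b in range(w.shape[1]):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[a, b] += eps
        w_minus[a, b] -= eps
        diff = conv_forward(x, w_plus) - conv_forward(x, w_minus)
        numeric_grad_w[a, b] = np.sum(diff * grad_out) / (2 * eps)

print(np.allclose(grad_w, numeric_grad_w, atol=1e-5))
```

A numerical gradient check like this is a cheap way to catch indexing mistakes in a hand-derived backward pass before using it for training.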
computer_vision/
├── README.md
├── requirements.txt
├── notebooks/ # the 12 lecture notebooks
├── utils/ # shared image & plotting helpers
│ └── cv_utils.py
├── data/ # sample images are generated in-notebook (no large binaries)
└── assets/ # exported figures
git clone https://github.com/HAYDARKILIC/computer_vision.git
cd computer_vision
pip install -r requirements.txt
jupyter lab
Open notebooks/01_images_as_arrays.ipynb and work top to bottom. Every notebook runs standalone
and generates its own sample imagery, so there are no large image downloads.
- Read the derivation — each operator is motivated mathematically, not dumped as code.
- Type the implementation yourself before reading the provided one.
- Run the visualizations — change the kernel, the threshold, the image; watch what breaks.
- Do the exercises — solutions are collapsed at the bottom of each notebook.
- Connect it forward — each notebook ends with "Where this shows up in modern vision."
MIT — free to use for teaching and self-study.