HAYDARKILIC/computer_vision

Computer Vision: From Pixels to Deep Networks

A six-week, first-principles course that builds modern computer vision from scratch in pure NumPy — bridging the classical mathematics of image processing to the convolutional networks that power vision today. Every operator, from a Gaussian blur to a backprop'd convolutional layer, is derived and implemented by hand before any high-level library is allowed.

Each lecture is a self-contained Jupyter notebook: the mathematical derivation, a from-scratch implementation, visualizations on real images, verification against scipy/skimage where relevant, and exercises with worked solutions.


Philosophy

Most computer vision courses start with import torch and treat convolution as a black box. This one starts with the integral that defines convolution and ends with a CNN whose every gradient you have derived yourself. The through-line is always:

derive the math → implement it in NumPy → visualize it on a real image → verify against a reference → connect it to a modern model.

By the end you will have hand-written: 2D convolution, Gaussian and derivative filters, the Canny edge detector, Harris corners, a Hough transform, the Lucas–Kanade optical flow equations, k-means and graph-based segmentation, a full convolutional layer with backpropagation, and a small trained CNN — and you will understand exactly which classical idea each deep-learning component generalizes.


Prerequisites

  • Comfortable Python and NumPy (arrays, slicing, broadcasting)
  • Linear algebra at the level of the companion linear_algebra_for_ml course (vectors, matrices, convolution as a linear operator, eigenvalues for the Harris detector and PCA)
  • Basic calculus (gradients, the chain rule for backprop)

Curriculum

Week 1 — Images as Arrays & Point Operations

What a digital image is: a function sampled onto a grid of numbers. Pixels, channels, color spaces, histograms, and point operations (brightness, contrast, gamma, thresholding, histogram equalization). The foundation everything else builds on.

  • 01_images_as_arrays.ipynb
  • 02_point_operations_histograms.ipynb
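
As a rough illustration of what a point operation looks like in NumPy, the sketch below applies gamma correction and histogram equalization to an 8-bit grayscale array. The function names `adjust_gamma` and `equalize_histogram` are hypothetical and are not taken from the course notebooks; the equalization uses the common CDF-based lookup table, which may differ in detail from the version derived in the lecture.

```python
import numpy as np

def adjust_gamma(img, gamma):
    """Power-law point operation on a uint8 image: out = 255 * (in / 255) ** gamma."""
    x = img.astype(np.float64) / 255.0
    return np.round(255.0 * x ** gamma).astype(np.uint8)

def equalize_histogram(img):
    """Histogram-equalize a uint8 grayscale image using its cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)   # 256-bin intensity histogram
    cdf = np.cumsum(hist)                            # cumulative pixel counts
    cdf_min = cdf[cdf > 0][0]                        # count at the darkest occupied bin
    denom = cdf[-1] - cdf_min
    if denom == 0:                                   # constant image: nothing to stretch
        return img.copy()
    # Map each intensity through the normalized CDF; unused dark bins clip to 0.
    lut = np.clip(np.round(255.0 * (cdf - cdf_min) / denom), 0, 255).astype(np.uint8)
    return lut[img]                                  # a point operation is just a lookup
```

Both functions depend only on each pixel's own value (plus, for equalization, the global histogram), which is what makes them point operations rather than neighborhood filters.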

Week 2 — Convolution & Linear Filtering

The single most important operation in vision. Derive convolution from first principles, implement it from scratch, and build the classic filters: box, Gaussian, sharpening. Separable kernels, borders, and why convolution is a linear shift-invariant system.

  • 03_convolution_from_scratch.ipynb
  • 04_smoothing_gaussian_filters.ipynb
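
A minimal sketch of the ideas above, assuming zero padding, odd-sized kernels, and "same"-size output (the notebooks may choose different border handling). The names `conv2d` and `gaussian_kernel_1d` are illustrative, not the course's own API.

```python
import numpy as np

def conv2d(image, kernel):
    """'Same'-size 2D convolution with zero padding, for odd-sized kernels."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    h, w = image.shape
    flipped = kernel[::-1, ::-1]                      # true convolution flips the kernel
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw)))
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate one shifted, weighted copy of the image per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * padded[i:i + h, j:j + w]
    return out

def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1D Gaussian; the default radius of 3*sigma is a common rule of thumb."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()
```

Separability can then be checked directly: with `g = gaussian_kernel_1d(1.0)`, filtering with `np.outer(g, g)` should match filtering with the column kernel `g[:, None]` followed by the row kernel `g[None, :]`, at a cost of 2k multiplications per pixel instead of k² for a k-by-k kernel.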

Week 3 — Gradients, Edges & the Canny Detector

Where does information live in an image? At its edges. Image gradients, Sobel operators, the Laplacian, and the full multi-stage Canny edge detector built by hand — non-maximum suppression and hysteresis included.

  • 05_image_gradients_edges.ipynb
  • 06_canny_edge_detector.ipynb
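
The gradient stage that Canny builds on can be sketched as follows. This assumes replicated-edge padding and applies the Sobel kernels as a correlation (no kernel flip), so a positive `gx` means intensity increases to the right; the helper names are hypothetical. Non-maximum suppression and hysteresis are left out here.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T   # derivative along rows (downward)

def correlate3x3(image, kernel):
    """Apply a 3x3 kernel as a correlation with replicated borders."""
    h, w = image.shape
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def sobel_gradients(image):
    """Return (gx, gy, magnitude, orientation in radians) for a grayscale image."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    magnitude = np.hypot(gx, gy)        # edge strength
    orientation = np.arctan2(gy, gx)    # gradient direction, normal to the edge
    return gx, gy, magnitude, orientation
```

The magnitude and orientation arrays are the inputs a Canny-style detector would then thin with non-maximum suppression.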

Week 4 — Features, Corners & Matching

Finding and describing distinctive points. The Harris corner detector (an eigenvalue problem — straight from linear algebra), blob detection across scale, simple descriptors, and feature matching between two views — the basis of panoramas and structure-from-motion.

  • 07_harris_corners.ipynb
  • 08_blobs_features_matching.ipynb
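
To make the eigenvalue connection concrete, here is a hedged sketch of the Harris response. It assumes central-difference gradients (`np.gradient`), an unweighted box window instead of a Gaussian, and the conventional constant `k = 0.05`; the function names are illustrative. The response `det(M) - k * trace(M)**2` equals `l1 * l2 - k * (l1 + l2)**2` in terms of the eigenvalues of the structure tensor `M`, so no explicit eigen-decomposition is needed.

```python
import numpy as np

def box_sum(a, radius):
    """Sum of `a` over a (2*radius+1)^2 window around each pixel, zero padded."""
    h, w = a.shape
    padded = np.pad(a, radius)
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(2 * radius + 1):
        for j in range(2 * radius + 1):
            out += padded[i:i + h, j:j + w]
    return out

def harris_response(image, radius=1, k=0.05):
    """Harris corner response R = det(M) - k * trace(M)^2 at every pixel."""
    iy, ix = np.gradient(image.astype(np.float64))   # d/drow, d/dcol
    # Entries of the 2x2 structure tensor M, accumulated over the window.
    sxx = box_sum(ix * ix, radius)
    syy = box_sum(iy * iy, radius)
    sxy = box_sum(ix * iy, radius)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2
```

On a bright square over a dark background, the response should be positive at the square's corners (two large eigenvalues), negative along its edges (one large eigenvalue), and zero in flat regions.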

Week 5 — Segmentation & Motion

Grouping pixels and tracking them over time. Thresholding, k-means color segmentation, the Hough transform for lines, and the Lucas–Kanade optical flow equations derived and implemented from the brightness-constancy assumption.

  • 09_segmentation_hough_kmeans.ipynb
  • 10_optical_flow_lucas_kanade.ipynb
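
Under brightness constancy, a small motion `(u, v)` satisfies `Ix*u + Iy*v + It = 0` at each pixel; Lucas–Kanade stacks this constraint over a window and solves it by least squares. The sketch below does this for a single window. It assumes small motion, central-difference spatial gradients, a plain frame difference for `It`, and a window that lies fully inside the image; `lucas_kanade_window` is a hypothetical name, not the notebook's.

```python
import numpy as np

def lucas_kanade_window(frame0, frame1, center, radius=3):
    """Estimate flow (u, v) = (dx, dy) in one window by least squares."""
    f0 = frame0.astype(np.float64)
    f1 = frame1.astype(np.float64)
    iy, ix = np.gradient(f0)          # spatial derivatives (rows = y, cols = x)
    it = f1 - f0                      # temporal derivative
    r, c = center
    win = (slice(r - radius, r + radius + 1), slice(c - radius, c + radius + 1))
    # One brightness-constancy equation per pixel: [Ix Iy] @ [u v]^T = -It
    A = np.stack([ix[win].ravel(), iy[win].ravel()], axis=1)
    b = -it[win].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow
```

The normal-equation matrix `A.T @ A` here is the same structure tensor that appears in the Harris detector, which is why flow is only well determined where the window contains corner-like texture.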

Week 6 — Convolutional Neural Networks from Scratch

The bridge to modern vision. Build a convolutional layer with full forward and backward passes in NumPy, add pooling and a classifier head, train a small CNN on real digits, and visualize the learned filters — connecting them all the way back to the hand-designed edge detectors of Week 3.

  • 11_cnn_layer_from_scratch.ipynb
  • 12_training_a_cnn_capstone.ipynb
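
A stripped-down sketch of the forward and backward passes, assuming a single 2D input, a single filter, stride 1, and no padding (a real layer adds batch and channel dimensions). As in most deep-learning code, the forward pass is a cross-correlation, meaning the kernel is not flipped. The names `conv_forward` and `conv_backward` are illustrative.

```python
import numpy as np

def conv_forward(x, w, b):
    """Valid cross-correlation of 2D input `x` with filter `w`, plus scalar bias `b`."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.full((oh, ow), b, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += w[i, j] * x[i:i + oh, j:j + ow]
    return out

def conv_backward(x, w, dout):
    """Given dL/dout, return (dL/dx, dL/dw, dL/db) for conv_forward."""
    kh, kw = w.shape
    oh, ow = dout.shape
    dx = np.zeros(x.shape, dtype=np.float64)
    dw = np.zeros(w.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            # Each weight touched the input patch x[i:i+oh, j:j+ow] ...
            dw[i, j] = np.sum(dout * x[i:i + oh, j:j + ow])
            # ... and each such patch receives gradient scaled by that weight.
            dx[i:i + oh, j:j + ow] += w[i, j] * dout
    db = dout.sum()
    return dx, dw, db
```

A finite-difference check is a cheap way to confirm the backward pass: perturb one entry of `w` or `x` by a small epsilon, recompute `np.sum(conv_forward(...) * dout)`, and compare the change against the corresponding entry of `dw` or `dx`.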

Repository structure

computer_vision/
├── README.md
├── requirements.txt
├── notebooks/          # the 12 lecture notebooks
├── utils/              # shared image & plotting helpers
│   └── cv_utils.py
├── data/               # sample images are generated in-notebook (no large binaries)
└── assets/             # exported figures

Getting started

git clone https://github.com/HAYDARKILIC/computer_vision.git
cd computer_vision
pip install -r requirements.txt
jupyter lab

Open notebooks/01_images_as_arrays.ipynb and work top to bottom. Every notebook runs standalone and generates its own sample imagery, so there are no large image downloads.


How to use this course

  1. Read the derivation — each operator is motivated mathematically, not dumped as code.
  2. Type the implementation yourself before reading the provided one.
  3. Run the visualizations — change the kernel, the threshold, the image; watch what breaks.
  4. Do the exercises — solutions are collapsed at the bottom of each notebook.
  5. Connect it forward — each notebook ends with "Where this shows up in modern vision."

License

MIT — free to use for teaching and self-study.

About

A research-grade masterclass rebuilding the entire Computer Vision stack from first principles. No OpenCV, no PyTorch—just pure math and NumPy. Implements custom convolutions, Canny edge detection, eigenvalue-based Harris corners, Lucas–Kanade optical flow, and a CNN layer with full backward passes from scratch.
