Back

scale invariant feature transform (SIFT)

SIFT is a robust algorithm for detecting and describing distinctive local features in images that are invariant to scale and rotation and resilient to moderate changes in illumination and viewpoint. It works by:

  1. Detecting scale-space extrema using difference-of-Gaussians to find stable keypoints across scales.
  2. Localizing keypoints and rejecting low-contrast or poorly localized points.
  3. Assigning an orientation to each keypoint from local gradient directions for rotation invariance.
  4. Building a descriptor from weighted gradient histograms around the keypoint (typically 128 dimensions) for distinctive, repeatable matching.

Common uses: object recognition, image stitching, visual SLAM, stereo matching, and 3D reconstruction. Advantages: high distinctiveness and repeatability; robust to scale, rotation, and moderate illumination/viewpoint changes. Drawbacks: relatively computationally expensive and patented (historically), which led to faster alternatives (ORB, SURF) and learned descriptors in modern systems.

Share: