Clustering similar images with pHash


Published on Fri Oct 30 2020 by Kruttika Nadig

Image hashing is a technique for generating distinct "fingerprints" of images, which can be used to identify and group similar images. pHash (perceptual hash) is one of the most popular and effective image hashing algorithms. We tried it on 10k images from our archive and got promising results.

This blog post walks through how we computed pHashes with the ImageHash library, built easily navigable clusters (groups) of images whose fingerprints (hashes) are identical, and found images that are similar to a query image. An elegant feature of pHashes is that similar images have similar hashes. To learn how the hashing algorithm works, check out this other blog.
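The three steps above (hash, cluster, query) can be sketched roughly as follows. This is a minimal illustration and not the code from our notebook: the function names, the sample hash strings and the `max_distance` threshold are all assumptions made for the example. It relies on `imagehash.phash` returning a 64-bit hash by default, which prints as a 16-character hex string, and it measures similarity as the Hamming distance (number of differing bits) between two hashes.

```python
from collections import defaultdict


def compute_phash(path):
    """Return the pHash of the image at `path` as a hex string."""
    # Imported lazily so the helpers below also work without these packages.
    from PIL import Image
    import imagehash

    with Image.open(path) as img:
        return str(imagehash.phash(img))


def cluster_identical(hashes):
    """Group file names whose hash strings are exactly equal."""
    clusters = defaultdict(list)
    for name, h in hashes.items():
        clusters[h].append(name)
    return dict(clusters)


def hamming(hex_a, hex_b):
    """Count the differing bits between two equal-length hex hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def find_similar(query_hash, hashes, max_distance=10):
    """Return (name, distance) pairs within `max_distance` bits, nearest first.

    The default threshold of 10 is an arbitrary assumption; tune it on your data.
    """
    hits = [(name, hamming(query_hash, h)) for name, h in hashes.items()]
    return sorted((hit for hit in hits if hit[1] <= max_distance),
                  key=lambda hit: hit[1])


# Made-up hashes standing in for compute_phash() output on real files.
example = {
    "a.jpg": "ffd8e0c0a0808080",
    "a_copy.jpg": "ffd8e0c0a0808080",
    "a_cropped.jpg": "ffd8e0c0a0808081",
    "b.jpg": "0123456789abcdef",
}
clusters = cluster_identical(example)
similar = find_similar(example["a.jpg"], example, max_distance=4)
```

Here `a.jpg` and `a_copy.jpg` land in the same cluster because their hashes are identical, while `a_cropped.jpg` differs by a single bit and only shows up in the query results. Grouping on exact equality is cheap (one dictionary pass), whereas the query step compares against every stored hash, which is fine for an archive of around 10k images.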

The code for this walkthrough can be found in a Jupyter notebook here. An executed version of the notebook has been archived by the Wayback Machine and can be found here.


Text and illustrations on the website are licensed under the Creative Commons 4.0 License. The code is licensed under GPL. For data, please look at the respective licenses.
