Perceptual hashing

Perceptual hashing is the use of a fingerprinting algorithm that produces a snippet, hash, or fingerprint of various forms of multimedia. A perceptual hash is a type of locality-sensitive hash: the hashes of two items are similar if the features of the multimedia are similar. This is in contrast to cryptographic hashing, which relies on the avalanche effect, where a small change in the input value creates a drastic change in the output value. Perceptual hash functions are widely used to find cases of online copyright infringement and in digital forensics, because hashes can be correlated with one another, so similar data can be found (for instance, copies that differ only in a watermark).
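The contrast with cryptographic hashing can be illustrated with a minimal sketch. It uses a simple "average hash" (one bit per pixel, set when the pixel is at least as bright as the image mean), chosen here only for illustration; it is not the algorithm used by any system named in this article. The function names and the tiny 4x4 grayscale image are hypothetical.

```python
import hashlib


def average_hash(pixels):
    """Illustrative average hash: one bit per pixel, 1 if pixel >= mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p >= mean else "0" for p in flat)


def hamming_distance(a, b):
    """Count the positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def sha256_bits(pixels):
    """SHA-256 digest of the raw pixel bytes, as a 256-character bit string."""
    data = bytes(p for row in pixels for p in row)
    digest = hashlib.sha256(data).digest()
    return "".join(f"{byte:08b}" for byte in digest)


# Hypothetical 4x4 grayscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]

# The same image with every pixel brightened slightly.
brightened = [[min(255, p + 3) for p in row] for row in original]

h1 = average_hash(original)
h2 = average_hash(brightened)

# Perceptual hashes of similar images stay close (here, identical).
perceptual_distance = hamming_distance(h1, h2)

# Cryptographic hashes of the same two images differ in many bits.
crypto_distance = hamming_distance(sha256_bits(original), sha256_bits(brightened))

print("perceptual hash distance:", perceptual_distance)
print("SHA-256 bit distance:", crypto_distance)
```

In this sketch the two perceptual hashes are identical, because brightening every pixel by the same amount shifts the mean by that amount too. The SHA-256 digests of the two images differ in a large share of their 256 bits, which is the avalanche effect described above. Similarity between perceptual hashes is measured here with the Hamming distance; practical systems typically use more robust features than raw pixel brightness.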

The 1980 work of Marr and Hildreth is a seminal paper in this field.

In 2009, Microsoft Corporation developed PhotoDNA in collaboration with Hany Farid, a professor at Dartmouth College. PhotoDNA is a perceptual hashing capability developed to combat the distribution of child sexual abuse material (CSAM) online. Provided by Microsoft at no cost, PhotoDNA remains a critical tool used by major software companies, NGOs, and law enforcement agencies around the world.

The July 2010 thesis of Christoph Zauner is a well-written introduction to the topic.

In June 2016, Azadeh Amir Asgari published work on robust image hash spoofing. Asgari notes that a perceptual hash function, like any other algorithm, is prone to errors.

Researchers remarked in December 2017 that Google image search is based on a perceptual hash.

In research published in November 2021, investigators focused on a manipulated image of Stacey Abrams that was published on the internet prior to her loss in the 2018 Georgia gubernatorial election. They found that the pHash algorithm was vulnerable to nefarious actors.

In August 2021, Apple announced an on-device CSAM scanner called NeuralHash but, after strong privacy backlash, paused the rollout in September 2021 and formally cancelled it in December 2022.
