Delhi's Duplicate Image Problem: The Numbers Exposing a City Archive in Crisis
Thousands of mislabelled and repeated photographs are clogging Delhi's official digital records — and the data trail shows just how deep the problem runs.

At least 34 percent of images held across the Delhi government's digitised heritage and civic documentation archives are either exact duplicates or near-identical copies filed under different reference numbers, according to an internal audit conducted by the Delhi Urban Heritage Foundation in March 2026. The finding has triggered a quiet scramble inside the departments responsible for managing records tied to everything from the Yamuna riverfront redevelopment scheme to Phase 4 Metro corridor approvals.
The problem matters now because 2026 is the year several of those archive-dependent projects are hitting active construction or legal review phases. The Delhi Development Authority's land-acquisition files for the Janakpuri West to RK Ashram Marg corridor rely on photographic documentation to validate pre-construction ground surveys. If the same image appears twice under different site coordinates — which the audit found happening in roughly 1,200 individual file instances — that is not merely a clerical nuisance. It can invalidate an entire survey submission in court or before the National Green Tribunal.
The audit flagged three institutional repositories as the worst offenders. The Shahjahanabad Redevelopment Corporation, which manages documentation for Old Delhi's walled city zone, showed a duplication rate of 41 percent across its photographic asset folders as of January 2026. The Delhi Jal Board's Yamuna Action Plan project files, running since 2022 under a Central government allocation, recorded 2,847 duplicate image entries out of roughly 8,100 total photographs submitted by contracted field teams, or about 35 percent. The third concentration is inside the South Delhi Municipal Corporation's property-mapping database, where automated upload tools introduced in late 2024 inadvertently allowed the same drone-captured frames to be ingested multiple times from different field officers' devices.
The geographic spread is telling. Duplicates cluster most heavily around Chandni Chowk, the Walled City's Lal Kuan area, and the Okhla industrial zone — precisely the locations that have been photographed repeatedly by multiple agencies running overlapping mandates. A single building facade on Nai Sarak in Old Delhi, photographed for heritage survey, pollution monitoring, and encroachment documentation simultaneously, can end up replicated across three separate government drives with three separate file codes.
Storage is the most straightforward metric. The Delhi government's centralised e-District server infrastructure, managed through the National Informatics Centre facility in CGO Complex, Lodhi Road, was holding an estimated 18 terabytes of redundant image data as of the March audit. The Foundation's report placed the associated storage and maintenance cost in the range of several lakh rupees per financial quarter, though it did not publish a single confirmed rupee figure.
The more consequential cost is legal. The NGT has, on at least two documented occasions since 2023, raised objections to photographic evidence submissions from Delhi state bodies where metadata inconsistencies — a classic symptom of duplicate-and-re-file workflows — undermined the credibility of compliance reports. Both instances involved Yamuna floodplain monitoring. Neither resulted in project cancellation, but both caused delays measured in months, not weeks.
A remediation programme called Project CleanFrame was approved by the Delhi government's Department of Information Technology in February 2026, with an implementation deadline of September 30, 2026. It involves deploying perceptual hash-matching software across the three worst-affected repositories, starting with the Shahjahanabad Redevelopment Corporation's files. The software compares image fingerprints and flags near-matches for human review rather than deleting automatically — a safeguard against removing legitimately similar but distinct photographs of changing sites over time.
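For readers curious how fingerprint matching of this kind can work, the following is a minimal sketch of one simple perceptual hash, the "average hash". It is an illustration only: the article does not say which algorithm CleanFrame uses, and the file codes, the toy 4x4 pixel grids, and the distance threshold are all hypothetical. Each image is reduced to a string of bits (1 where a pixel is brighter than the image's mean), and pairs whose bit strings differ in only a few positions are listed for human review rather than deleted.

```python
from itertools import combinations


def average_hash(pixels):
    """Fingerprint a grayscale grid: '1' where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming(a, b):
    """Count the positions at which two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def flag_for_review(images, max_distance=2):
    """Return pairs of file codes whose fingerprints are within max_distance bits.

    Nothing is deleted here; the pairs are only candidates for a human reviewer.
    The threshold of 2 is an arbitrary value chosen for this toy example.
    """
    hashes = {code: average_hash(px) for code, px in images.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(hashes), 2)
        if hamming(hashes[a], hashes[b]) <= max_distance
    ]


# Hypothetical file codes mapped to tiny 4x4 grayscale grids (0-255).
images = {
    # A facade: dark on the left, bright on the right.
    "ref_A": [[10, 10, 200, 200]] * 4,
    # The same facade after re-compression: slightly different pixel values.
    "ref_B": [
        [12, 9, 198, 205],
        [11, 10, 201, 199],
        [9, 12, 202, 200],
        [10, 11, 199, 203],
    ],
    # A different scene: bright on top, dark below.
    "ref_C": [[200] * 4, [200] * 4, [10] * 4, [10] * 4],
}

candidates = flag_for_review(images)
print(candidates)
```

In this toy data only the pair `ref_A` and `ref_B` falls under the threshold. A production system would first resize and convert real photographs to a small grayscale grid, and the choice of threshold decides how many legitimately distinct photographs of a changing site end up in the review queue.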
For residents and contractors who submit photographic documentation to civic portals — the MCD's property portal at mcdonline.nic.in, or the Delhi Jal Board's grievance system — the practical guidance from IT department advisories issued in May 2026 is straightforward: each image upload should carry embedded GPS metadata and a timestamp before submission, which the new CleanFrame system uses as the primary deduplication signal. Without that metadata, even unique photographs risk being flagged as duplicates of earlier submissions from the same address. Contractors working on Phase 4 Metro stations between Aerocity and Tughlakabad have been specifically advised to recheck their survey photo packages before the October 2026 filing window opens.
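A contractor rechecking a survey package could automate the metadata check along the following lines. This is a sketch under stated assumptions: the advisories as reported do not specify field names or formats, so `gps_latitude`, `gps_longitude` and `timestamp` are hypothetical keys, and the metadata is assumed to have already been extracted from each photograph into a plain dictionary.

```python
# Hypothetical field names; the advisory's actual required fields are not published here.
REQUIRED_FIELDS = ("gps_latitude", "gps_longitude", "timestamp")


def missing_fields(metadata):
    """List the required fields that are absent or empty in one photo's metadata."""
    return [field for field in REQUIRED_FIELDS if metadata.get(field) in (None, "")]


def split_package(package):
    """Separate a package into files ready to upload and files that need fixing.

    Returns (ready, needs_fix), where needs_fix maps a filename to its missing fields.
    """
    ready, needs_fix = [], {}
    for filename, metadata in package.items():
        gaps = missing_fields(metadata)
        if gaps:
            needs_fix[filename] = gaps
        else:
            ready.append(filename)
    return ready, needs_fix


# Hypothetical survey package: filenames and values are invented for illustration.
package = {
    "site_01.jpg": {
        "gps_latitude": 28.5562,
        "gps_longitude": 77.1000,
        "timestamp": "2026-06-14T09:30:00+05:30",
    },
    "site_02.jpg": {"gps_latitude": 28.5101, "gps_longitude": 77.2610},
}

ready, needs_fix = split_package(package)
print("Ready:", ready)
print("Needs fixing:", needs_fix)
```

Here `site_02.jpg` is held back because it has no timestamp, which is the situation the guidance warns could lead to a unique photograph being flagged as a duplicate.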
Published by The Daily Delhi