More than 40 percent of the image files uploaded to Delhi government departmental websites are duplicates or near-duplicates, according to an internal audit circulated within the Delhi Secretariat in June 2026. The finding, drawn from a review of over 2.3 lakh image files hosted across portals managed by departments including the Public Works Department and the Delhi Jal Board, has pushed digital archivists and civic technologists to call for systematic image deduplication protocols before the city's next budget cycle in February 2027.
The timing matters. Delhi is midway through a sweeping digitisation push tied to the Phase 4 Delhi Metro expansion, which requires real-time photo documentation of construction progress at 65 new stations across corridors stretching from Janakpuri West to RK Ashram Marg. When duplicate images flood those records, with the same construction-site photograph uploaded six or seven times under different file names, engineers and project monitors lose the ability to track actual progress. A botched visual record doesn't just create administrative noise; it delays sign-offs, clogs procurement audits, and can distort public dashboards that lakhs of commuters rely on.
The Scale of the Problem Across Delhi's Digital Infrastructure
The numbers compound when you look beyond government portals. Civic tech organisation Civis India, which operates a public data portal from its office near Hauz Khas Village in South Delhi, estimates that roughly one in five images submitted through citizen grievance platforms, including the Delhi government's own 311-style app, is a duplicate: often the same pothole or broken streetlight photographed and filed multiple times by different users on the same block. That redundancy inflates complaint counts, making neighbourhoods like Laxmi Nagar in East Delhi or Karol Bagh in Central Delhi appear to have significantly more unresolved civic issues than field teams can verify.
The Yamuna Riverfront project documentation presents its own case study. The Delhi Development Authority has been building a photographic archive of the Yamuna floodplain restoration effort between the Nizamuddin Bridge and the Wazirabad Barrage — a 22-kilometre stretch — since early 2025. According to a project status note reviewed for this article, the archive accumulated over 18,000 images in its first 14 months. Automated hash-matching tools applied in a trial run this May flagged approximately 3,200 of those files — nearly 18 percent — as exact or near-exact duplicates. Removing them reduced the active archive size from 847 gigabytes to under 690 gigabytes, cutting cloud storage costs in that one project by an estimated ₹1.2 lakh per quarter.
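The project note does not say which tool the trial used. As a rough illustration of the exact-match half of hash-matching, the Python sketch below groups files whose bytes are identical by comparing SHA-256 digests. The function names and the folder layout are hypothetical, and near-exact matches (a resized or re-compressed copy) would not be caught by this approach.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` that have byte-identical content.

    Hypothetical helper: returns one list per set of duplicates,
    each list holding two or more paths in sorted order.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of_file(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

In a setup like this, an archivist would keep the first path in each group and review the rest before deletion, since the file name is ignored and only the content is compared.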
Why Deduplication Technology Is Arriving Late
Perceptual hashing, the core algorithm behind most modern duplicate-image detection, has been commercially available since the mid-2010s, but its adoption inside Delhi's civic infrastructure has lagged badly. The reason is structural. Departmental IT budgets in Delhi are allocated piecemeal, with individual directorates procuring their own content management systems. The result is at least 14 different CMS platforms running simultaneously across departments as of April 2026, none of them natively cross-linked. An image uploaded to the Heritage Conservation Committee's portal for Old Delhi sites like Mirza Ghalib Ki Haveli in Ballimaran has no automatic channel to check whether the same photograph already lives on the Archaeological Survey of India's parallel database.
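For readers unfamiliar with the technique, the Python sketch below shows one simple member of the perceptual-hashing family, usually called an average hash. It is an illustration only, not the method any Delhi department is known to use. It assumes the image has already been decoded into a grid of grayscale values (0 to 255) at least 8 pixels on each side; the function names and the threshold of 5 differing bits are hypothetical choices.

```python
def average_hash(pixels: list[list[int]], size: int = 8) -> int:
    """Compute a size*size-bit average hash of a grayscale pixel grid.

    The grid is reduced to size x size block averages; each bit records
    whether a block is brighter than the mean of all blocks.
    Assumes the grid is at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for row in range(size):
        y0, y1 = row * height // size, (row + 1) * height // size
        for col in range(size):
            x0, x1 = col * width // size, (col + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bit positions at which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_near_duplicate(pixels_a, pixels_b, threshold: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ in few bits.

    The threshold is an illustrative assumption and would need tuning
    against real departmental images.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= threshold


# Usage: a horizontal gradient, a slightly brighter copy, and an inverted copy.
gradient = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[value + 3 for value in row] for row in gradient]
inverted = [[255 - value for value in row] for row in gradient]
```

Because the hash depends on relative brightness rather than exact bytes, the brighter copy here hashes identically to the original, while the inverted image differs in every bit. A cross-department check would compare such hashes across portals rather than the files themselves.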
The cost of inaction is measurable. Cloud hosting bills for Delhi government digital assets crossed ₹7.3 crore in the 2025-26 financial year, a figure the Comptroller and Auditor General's office flagged in its latest state audit report. Storage experts consulted by this newspaper, who are not being named pending official confirmation, suggest that aggressive deduplication across all departmental repositories could trim that figure by 15 to 25 percent within 12 months of implementation. Measured against the 2025-26 bill, that would be roughly ₹1.1 crore to ₹1.8 crore a year.
The Delhi government's IT Department is expected to release a revised Digital Data Governance Policy before the monsoon session of the Delhi Assembly ends in late July. That policy, if it mandates standardised image ingestion pipelines with hash-checking built in from day one, would be the first such binding directive any Indian state capital has issued. For departments already wrestling with the Yamuna cleanup audit trail and Metro Phase 4 photo logs, that mandate cannot arrive too soon.