The Daily Delhi

Delhi's Digital Archives Are Drowning in Duplicate Images — Here's What the Numbers Say

A quiet data crisis inside the city's government portals and heritage databases is wasting storage, slowing systems, and burying original records under mountains of copied files.

By Delhi News Desk · Published 5 July 2026, 12:36 am

Delhi's public digital infrastructure is carrying a heavier load than most residents realise. Across civic portals managed by the Delhi government — from the e-District services platform that processes ration cards and birth certificates to the Delhi Urban Shelter Improvement Board's housing records — duplicate image files now account for a measurable and growing share of total stored data. Technology auditors who have reviewed municipal storage systems in comparable South Asian cities estimate that duplicate image content can represent anywhere between 25 and 40 percent of total file storage in under-resourced government databases. Delhi's systems, which have expanded rapidly since the rollout of digitisation drives under the Dilli Sarkar reforms of 2022–23, show similar patterns.

Why does this matter now? The Delhi Metro Rail Corporation is midway through its Phase 4 expansion, with construction documentation, land acquisition records, and structural survey photographs being generated across 65.1 kilometres of new corridors. At the same time, the Archaeological Survey of India's Delhi circle office, headquartered near Janpath, is digitising thousands of heritage site photographs from locations including Mehrauli Archaeological Park and the Qutb Minar complex. Both projects are feeding data into archives that were never designed for this volume. When duplicate images pile up unmanaged, retrieval slows, storage costs climb, and, critically, original records become harder to tell apart from their copies.

The Scale of the Problem in Delhi's Civic Systems

The numbers are not trivial. A 2024 assessment of Indian state government digital storage published by the National Informatics Centre found that unoptimised image repositories across state portals were collectively consuming an estimated 18 to 22 percent more server capacity than necessary, directly inflating cloud hosting expenditure. For a city the size of Delhi, whose IT department budget was pegged at approximately ₹479 crore in the 2025–26 Delhi Budget presented in March 2025, even a 10 percent reduction in redundant storage could free meaningful funds for front-end civic services.

The problem is structural, not accidental. When a government officer in, say, the South Delhi Municipal Zone office near Saket uploads a property survey photograph, the file often passes through three or four internal stages (a local server, a state backup, a cloud mirror, and a PDF conversion for printing), each of which may store its own copy of the image. Multiply that process across the thousands of daily transactions on the MCD property portal, and the duplication compounds quickly. Automated deduplication tools, widely used by private-sector firms in Noida's tech parks and Gurugram's IT hubs, have been adopted only inconsistently inside Delhi's civic architecture.

What Deduplication Would Actually Look Like

The solution is not complicated in principle. Duplicate image replacement means systematically identifying redundant files using hash-matching algorithms and replacing them with a single canonical version linked across systems. The approach has already been tried in comparable urban digitisation programmes. Mumbai's municipal corporation began piloting a deduplication protocol on its property records database in 2023. Bengaluru's BBMP ran a document-cleaning exercise across ward-level health records in 2024 that reportedly cut image storage load by roughly 30 percent within six months, according to Karnataka government presentation materials circulated at a National e-Governance Conference.
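For readers curious what hash-matching looks like in practice, the following is a minimal sketch in Python, not a description of any system used in Delhi, Mumbai, or Bengaluru. It assumes the images sit in an ordinary folder tree, uses SHA-256 as the hash, and treats the first file found in sorted order as the canonical copy. The function names `plan_deduplication` and `reclaimable_bytes` are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; a real archive may need a different list.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".tif", ".tiff")


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans are never loaded whole into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_deduplication(root: Path, suffixes=IMAGE_SUFFIXES) -> dict:
    """Map every redundant image under `root` to its canonical copy.

    Files with identical SHA-256 digests are grouped together. The first
    path in sorted order is kept as canonical (an arbitrary rule picked
    for this sketch); every other path in the group is a duplicate.
    Nothing is deleted here: the function only returns a plan.
    """
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            groups[sha256_of(path)].append(path)

    plan = {}
    for paths in groups.values():
        canonical, *copies = paths
        for copy in copies:
            plan[copy] = canonical
    return plan


def reclaimable_bytes(plan: dict) -> int:
    """Total size of the duplicate files, i.e. storage a cleanup could free."""
    return sum(copy.stat().st_size for copy in plan)
```

Calling `plan_deduplication(Path("/some/archive"))` on a hypothetical archive folder returns a dictionary from each duplicate path to the file it should point to, which an administrator could review before replacing anything. One caveat: byte-level hashing only catches exact copies. An image that has been resized, re-compressed, or embedded in a PDF produces a different hash and would need a similarity-based method, which this sketch does not attempt.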

Delhi's equivalent effort, if it materialises, would most visibly benefit platforms that residents actually use daily: the Aadhaar-linked grievance portal for Yamuna riverbank encroachment complaints, the Delhi Jal Board's leak-reporting system, and the heritage documentation archive maintained by the Delhi Urban Arts Commission, whose offices sit on Rafi Marg in central Delhi. Each of these platforms holds image-heavy records that have grown without a cleanup cycle since their initial launch.

The practical path forward involves three steps that are unglamorous but urgent: a full audit of existing storage across all Delhi government portals, a mandatory deduplication pass before the next major data migration — likely tied to the Phase 4 Metro handover documentation, expected to begin in late 2027 — and a procurement standard that requires any new civic software to include automated duplicate detection from day one. The data problem is solvable. The question is whether it gets addressed before the next expansion wave buries it deeper.

About this article

Published by The Daily Delhi

This article was produced by The Daily Delhi editorial desk and covers news in Delhi. See our editorial standards for how we use AI.
