The Daily Delhi

Delhi's Digital Archives Are Drowning in Duplicate Images — And the Numbers Tell a Damaging Story

Government portals, heritage databases and metro expansion project files across the capital are carrying massive data redundancy, driving up storage costs and slowing down public access.

By Delhi News Desk · Published 5 July 2026, 12:15 am

Delhi's public-sector digital infrastructure is carrying an estimated 30 to 40 percent redundant image data across its major government portals, according to internal assessments reviewed by administrators working on the city's e-governance consolidation drive. The problem — duplicate images stacking up across databases from the Delhi Development Authority to the Archaeological Survey of India's Old Delhi documentation units — is no longer a housekeeping annoyance. It is costing real money and degrading real services.

The issue is especially urgent because the Delhi government is midway through a ₹847-crore digital infrastructure upgrade announced under the 2025-26 Union Budget allocation for smart city initiatives in the capital. Storage redundancy at this scale means a measurable slice of that outlay is being absorbed by files that should have been flagged and removed years ago. With Delhi Metro Phase 4 expansion generating thousands of new site-survey images weekly, captured across corridors from Janakpuri West to R.K. Ashram Marg, duplicate files are accumulating faster than administrators can delete them.

Where the Duplication Is Worst

Three nodes stand out in the capital's digital estate. The Delhi Urban Shelter Improvement Board, which manages resettlement colony records across areas including Seemapuri and Trilokpuri in East Delhi, has been flagged internally for running parallel image repositories on separate servers that share no deduplication protocol. Files documenting the same tenement surveys appear, in some cases, four or five times across the board's storage architecture. The Delhi Heritage Conservation Committee's digital catalogue of monuments in the Walled City, covering structures from the Jama Masjid precinct to the havelis of Chandni Chowk, is understood to carry similar redundancy, with scanned photographic records duplicated during successive digitisation drives in 2019, 2021 and 2023.

The numbers sharpen when you look at what storage actually costs at government rates. Cloud storage provisioned through the National Informatics Centre, which manages backend infrastructure for dozens of Delhi government departments, is billed at approximately ₹2.80 per gigabyte per month under current central empanelment rates. A conservative estimate of 500 terabytes of duplicate image data (a figure consistent with the 30-to-40-percent redundancy range applied to total storage loads reported in NIC's public annual accounts) is 500,000 gigabytes, which at that rate translates to a recurring cost of roughly ₹14 lakh per month, or about ₹1.7 crore annually, for data that serves no functional purpose. That figure does not include the processing overhead that duplicate image requests place on portal response times during peak load.
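The storage arithmetic can be checked with a few lines of Python. This is only a sketch that takes the article's stated inputs at face value (500 terabytes, ₹2.80 per gigabyte per month) and assumes decimal units, with 1 terabyte counted as 1,000 gigabytes; the function names are illustrative and do not come from any NIC billing tool.

```python
GB_PER_TB = 1_000            # assumption: decimal terabytes
RUPEES_PER_LAKH = 100_000
RUPEES_PER_CRORE = 10_000_000


def monthly_cost_rupees(duplicate_tb: float, rate_per_gb_month: float) -> float:
    """Recurring monthly bill, in rupees, for storing the duplicate data."""
    return duplicate_tb * GB_PER_TB * rate_per_gb_month


def annual_cost_rupees(duplicate_tb: float, rate_per_gb_month: float) -> float:
    """Twelve months of the same recurring bill."""
    return 12 * monthly_cost_rupees(duplicate_tb, rate_per_gb_month)


monthly = monthly_cost_rupees(500, 2.80)
annual = annual_cost_rupees(500, 2.80)
print(f"Monthly: ₹{monthly / RUPEES_PER_LAKH:.1f} lakh")
print(f"Annual:  ₹{annual / RUPEES_PER_CRORE:.2f} crore")
```

With these inputs the sketch gives about ₹14 lakh a month and ₹1.68 crore a year. Both figures scale linearly, so a different volume or rate can be substituted directly.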

For ordinary Delhiites, the practical consequence is sluggish portals. The Delhi Jal Board's online grievance system, heavily used by residents in water-stressed colonies in outer districts like Narela and Bawana, loads map and photographic attachments slowly in part because backend image calls pull from redundant, unoptimised stores. Technology administrators working on the Yamuna Action Plan's digital monitoring dashboard have identified duplicate satellite and drone imagery as a specific bottleneck in generating the near-real-time pollution reports the plan requires.

What a Fix Looks Like — and What It Costs

Deduplication is not technically complicated. Perceptual hashing algorithms, software tools that reduce each image to a compact fingerprint so that visually similar images produce similar fingerprints and near-identical copies can be flagged, can process a terabyte of image data in under three hours on standard server hardware. Commercial implementations run between ₹15 lakh and ₹60 lakh for enterprise-scale deployments, depending on integration requirements. Several state governments, including Telangana, have run deduplication pilots on land-record image databases since 2022, with reported storage reductions of 28 to 35 percent in the first cleanup cycle.
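To make the idea concrete, here is a minimal sketch of one simple perceptual hash, the average hash, written in plain Python. It is an illustration rather than the method any department or vendor is known to use. It assumes each image has already been decoded into rows of 0-255 grayscale values and is at least 8 by 8 pixels. The function names and the distance threshold of 5 are hypothetical choices.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed already decoded)


def shrink(pixels: Gray, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging the pixels in each grid cell.

    Assumes the image is at least size pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        row = []
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            cell = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(cell) / len(cell))
        small.append(row)
    return small


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Set one bit per cell: 1 if the cell is brighter than the overall mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Flag two images whose fingerprints differ in only a few bits."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 16x16 test images: a gradient, a slightly brighter copy, an inverse.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(is_near_duplicate(original, brighter))  # brightness shift only
print(is_near_duplicate(original, inverted))  # visually opposite image
```

Because the fingerprint depends on each cell's brightness relative to the image's own mean, a uniformly brightened or re-saved copy tends to keep the same bits, while a genuinely different image does not. A production system would compare fingerprints across millions of files with an index rather than pair by pair.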

For Delhi, the arithmetic is straightforward. A one-time deduplication exercise across the five largest image-heavy departmental databases (DDA, DUSIB, Delhi Heritage Conservation, Delhi Jal Board and the DMRC survey archive) could, at conservative estimates, recover by mid-2027 the capacity the city is currently overpaying for. The Delhi government's IT department is understood to be drafting tender specifications for exactly such a project, with a submission window expected before the end of the current financial quarter. Whether that timeline holds, given the competing demands on the department's attention, is a question the tender register will answer soon enough.

About this article

Published by The Daily Delhi

This article was produced by The Daily Delhi editorial desk and covers news in Delhi. See our editorial standards for how we use AI.
