The Daily Delhi

Delhi's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Sobering Story

From the MCD's property tax database to the Delhi Metro Rail Corporation's engineering records, duplicate image files are quietly consuming server space, inflating storage costs, and slowing down the city's push toward e-governance.

By Delhi News Desk · Published 5 July 2026, 12:46 am

Photo: Abdus Samad Mahkri on Pexels

More than 40 percent of image files stored across Delhi's major municipal digital repositories are exact or near-exact duplicates, according to an internal audit completed by the Department of Information Technology under the GNCT of Delhi in June 2026. The audit, which examined servers across three agencies — the Municipal Corporation of Delhi, the Delhi Jal Board, and the Delhi Development Authority — found the problem had grown sharply since 2022, when the city began digitising paper records at scale.

The timing matters. Delhi is midway through a ₹1,200-crore e-governance overhaul tied to the Unified Delhi Digital Mission, and bloated image databases are now a measurable drag on the programme's targets. Every redundant file costs money to store, replicate, and back up. It also slows retrieval in citizen-facing portals that already struggle under peak-hour load, as anyone who has tried to download a property certificate from the MCD's online portal during morning hours will recognise.

What the Audit Found, Block by Block

The DDA's land records digitisation unit, based out of its Vikas Sadan headquarters near INA Market, had the worst duplication rate in the audit: 54 percent of scanned images in its urban land-use file system were flagged as duplicates. Many were created when field officers uploaded the same site photographs multiple times to meet daily submission quotas, a workaround that has been documented in DDA internal circulars since at least 2023.

The Delhi Jal Board's infrastructure mapping database, which stores photographs of pipelines, pumping stations, and water treatment facilities across the city, including assets along the Yamuna floodplain from Wazirabad to Okhla, held roughly 18 terabytes of data as of March 2026. The audit estimated that 7.4 terabytes of that consisted of duplicate or redundant image files, and that removing them would eliminate a recurring cloud storage cost of approximately ₹3.8 lakh per month. For the DJB, which is already under financial pressure from its ongoing Yamuna cleanup commitments, that figure is not trivial.
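The DJB figures can be checked with simple arithmetic. The short Python sketch below uses only the numbers reported in the audit. The variable names are illustrative, and the annualised figure assumes the monthly rate stays flat for a full year, which the audit does not state.

```python
# Figures reported in the audit for the Delhi Jal Board (March 2026).
total_tb = 18.0            # approximate size of the mapping database
duplicate_tb = 7.4         # estimated duplicate or redundant image files
monthly_cost_lakh = 3.8    # recurring cloud cost attributed to duplicates

# Share of the database taken up by duplicates.
duplicate_share = duplicate_tb / total_tb

# Assumption (not from the audit): the monthly rate holds for 12 months.
annual_cost_lakh = monthly_cost_lakh * 12

print(f"Duplicate share: {duplicate_share:.1%}")
print(f"Annualised cost: Rs {annual_cost_lakh:.1f} lakh")
```

On those inputs, duplicates account for roughly 41 percent of the database, in line with the audit's city-wide estimate, and the avoidable cost comes to about ₹45.6 lakh a year.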

The Delhi Metro Rail Corporation, which was not part of the government audit but runs its own engineering documentation system for the Phase 4 corridor stretching from Janakpuri West to Krishna Park Extension, confirmed in its 2025-26 annual technical report that it had deployed automated deduplication software across its document management system. The corporation did not release specific figures, but described the move as part of routine storage optimisation ahead of Phase 4's partial commissioning.

Why Duplication Happens — and What It Costs to Fix It

The root cause is structural, not accidental. When agencies digitised paper files between 2019 and 2024, different vendors used different scanning protocols. The National Informatics Centre, which provides IT backbone services to the Delhi government from its Lodhi Road campus, flagged the inconsistency in a 2024 advisory: without a unified metadata standard, the same physical document could be scanned, uploaded, renamed, and re-uploaded by different departments with no automated check catching the overlap.
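The kind of automated check the advisory found missing can be illustrated with a content hash, which identifies a file by its bytes rather than its name. The Python sketch below is a minimal illustration, not a description of any tool the agencies use. The function names `file_digest` and `find_exact_duplicates` are hypothetical, and the choice of SHA-256 is an assumption.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large scans do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content; return groups with 2+ files."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Identical bytes give the same digest, whatever the filename.
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

A check like this catches a file that was renamed and re-uploaded, because the bytes are unchanged. It would not catch the same paper document scanned twice, since two scans rarely produce identical bytes. Near-exact duplicates of that kind need a similarity measure such as a perceptual hash.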

Commercial deduplication tools, of the kind used by enterprise data centres in Gurugram's Cyber City corridor, typically cost between ₹15 lakh and ₹60 lakh for an initial deployment at government scale, depending on the size of the repository. Open-source alternatives exist but require significant in-house technical capacity to configure and maintain. The Delhi government's IT Department currently employs around 200 technical staff across all departments, a figure that independent assessments have called insufficient for the scale of the digitisation drive under way.

For citizens, the practical consequences are slower portals, failed downloads, and, more seriously, the risk that a deleted duplicate turns out to be the only surviving copy of a document if the original was misfiled. The Municipal Corporation of Delhi's property tax portal went down for 11 hours on 14 March 2026 after a storage threshold was breached during a batch upload of property photographs from the Lajpat Nagar zone.
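The risk of deleting the last surviving copy is one a purge can be designed around. The Python sketch below shows one possible safeguard, a dry-run plan that always leaves one readable copy per group of duplicates. The function name `plan_purge` and the rule for choosing which copy to keep are hypothetical. It also assumes the groups it receives have already been confirmed as byte-identical.

```python
from pathlib import Path


def plan_purge(groups: dict[str, list[Path]]) -> dict[Path, list[Path]]:
    """Map one keeper per duplicate group to the copies that may be removed.

    Nothing is deleted here. Only copies that exist and are non-empty are
    considered, and a group with fewer than two such copies is skipped, so
    every planned removal leaves a readable keeper behind.
    """
    plan: dict[Path, list[Path]] = {}
    for paths in groups.values():
        existing = [p for p in paths if p.is_file() and p.stat().st_size > 0]
        if len(existing) < 2:
            continue  # zero or one usable copy: nothing is safe to remove
        # Assumption: keep the first path in sort order; a real policy
        # might prefer the copy in the department of record.
        keeper, *extras = sorted(existing)
        plan[keeper] = extras
    return plan
```

Because the function only returns a plan, staff can review the list of proposed removals before any file is touched.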

The IT Department's audit recommends that all three flagged agencies complete a duplicate-image purge by October 2026, before the monsoon season ends and the next round of infrastructure surveys starts adding fresh files. Agencies have been asked to submit deduplication implementation plans to the e-governance directorate at Delhi Secretariat by 31 July 2026. Whether the budget follows the deadline is a question the next quarterly review will answer.

About this article

Published by The Daily Delhi

This article was produced by The Daily Delhi editorial desk and covers news in Delhi. See our editorial standards for how we use AI.
