The Daily Delhi

How Delhi's Government Archives Ended Up Flooded With Duplicate Images — And What It Cost Taxpayers

A years-long chain of digitisation drives, political handoffs, and vendor mismanagement left city databases bloated with redundant files, and officials are only now reckoning with the scale of the problem.

By Delhi News Desk · Published 5 July 2026, 12:14 am

Delhi's public records infrastructure is carrying dead weight in the form of redundant image files. The Municipal Corporation of Delhi and several state departments are sitting on digitised document repositories where, by internal estimates circulated among senior officials earlier this year, between 30 and 45 percent of stored image files are exact or near-exact duplicates. The problem did not emerge overnight. It is the product of at least a decade of overlapping digitisation campaigns, each launched under a different political administration, each contracted to a different vendor, and almost none of them coordinated with the others.

The timing matters because Delhi is now midway through a broader Smart City push that ties into the data systems for the Delhi Metro Phase 4 expansion corridors and the AAP government's drive to put citizen services online through its unified Delhi e-District portal. Storing and serving redundant files is not just a bureaucratic nuisance. It has direct costs in server space, retrieval latency, and the staff hours spent managing databases that are, in places, three times larger than they need to be.
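
As a rough check on how a duplicate share translates into database size, the arithmetic can be sketched as below. This is a back-of-the-envelope illustration, not a figure from the audit: it assumes all files are roughly the same size, and the function names are hypothetical. Under that assumption, a store in which a fraction d of files are duplicates is 1 / (1 - d) times larger than it needs to be.

```python
def size_multiplier(duplicate_share: float) -> float:
    """How many times larger a store is than its deduplicated size.

    Assumes every file is roughly the same size (a simplifying assumption).
    """
    if not 0 <= duplicate_share < 1:
        raise ValueError("duplicate_share must be in [0, 1)")
    return 1 / (1 - duplicate_share)


def duplicate_share_for(multiplier: float) -> float:
    """Inverse: the duplicate share implied by a given size multiplier."""
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    return 1 - 1 / multiplier


# The 30 to 45 percent range quoted in the internal estimates.
for share in (0.30, 0.45):
    print(f"{share:.0%} duplicates -> {size_multiplier(share):.2f}x the needed size")

# A database three times larger than needed implies a higher local share.
print(f"3x the needed size -> {duplicate_share_for(3):.0%} duplicates")
```

On these assumptions the overall estimate corresponds to stores roughly 1.4 to 1.8 times larger than necessary, so a database that is three times oversized would have to carry a much heavier duplicate load than the average.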

How the Duplication Built Up

Trace the problem back to 2014, when the Delhi administration, under President's Rule for most of that year, initiated the first major push to digitise land revenue records under the Revenue Department's DLRC programme. Scanning centres were set up at locations including Dwarka Sub-City, Rohini, and the old secretariat complex near ITO. Contractors were paid per page scanned, an incentive structure that rewarded volume, not deduplication. Files were handed over on external drives, uploaded to department servers, and never cross-checked against what already existed.

Then came the AAP government's own wave of digitisation after 2015, focused on health and ration card records through initiatives tied to Mohalla Clinic data integration. A separate vendor pool was empanelled. The new contractors had no access to the prior corpus and no contractual obligation to check against it. Parallel uploads began. By the time the Smart Cities Mission formally extended to Delhi's administrative systems around 2019-2020, three overlapping layers of scanned documents existed across different servers, with no unified deduplication protocol in place.

The Indira Gandhi National Centre for the Arts on Janpath flagged a version of the same problem in its own digitised manuscript archive as early as 2021, noting that a single-pass scan-and-store workflow without hash-based verification produced significant redundancy. That lesson did not travel across departmental lines in time to prevent similar errors in civic administration databases.
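
For readers unfamiliar with the term, hash-based verification can be sketched in a few lines. The example below is a minimal illustration using Python's standard library, not the workflow of any archive or department named here; the function names are hypothetical. It computes a SHA-256 digest of each file's bytes and groups files that share a digest.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks to keep memory use low."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by digest; keep only groups with 2+ files."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

A check like this only catches byte-identical copies. Two separate scans of the same page, or the same image saved in a different format, produce different bytes and different digests, so near-exact duplicates would need a different technique.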

The Price Tag and What Comes Next

Cloud storage is not free. According to budget documents from the 2024-25 fiscal year reviewed by officials, Delhi government departments collectively spent upwards of Rs 18 crore on data storage and management contracts for digitised records. That figure has risen each year. Technical consultants working on the e-District portal describe the practical cost of duplicate image retention (storage, bandwidth, and manual audit hours) as a significant but still formally unquantified share of that total. A formal audit, expected to be completed by the Delhi Secretariat's IT cell before October 2026, is supposed to produce the first precise number.

The audit itself is being conducted in coordination with the National Informatics Centre (NIC), which manages the backend infrastructure for several Delhi government portals. NIC has developed deduplication tools used in other states, and officials expect those tools to be adapted for Delhi's specific file formats: a mix of TIFF, JPEG, and PDF-embedded images accumulated over years of inconsistent scanning.

For residents who interact with these systems, whether renewing a ration card at a Jan Seva Kendra in Laxmi Nagar or checking land records through the Delhi Revenue portal, the immediate practical effect is sluggish load times and occasional retrieval errors. Those are symptoms. The underlying fix requires going back through the archive, running deduplication algorithms, and establishing a single-entry verification standard for every scan going forward. Officials say the October audit will form the basis of a procurement tender for that cleanup work. Until then, the redundant files stay exactly where they are.
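
What a single-entry verification standard might look like at the point of upload can be sketched as follows. This is an assumption-laden illustration, not a description of any system the government or NIC has specified: the `ScanRegistry` class, its in-memory storage, and the record identifiers are all hypothetical. The idea is that each incoming scan is checked against the digests already on file before it is stored.

```python
import hashlib


class ScanRegistry:
    """Hypothetical in-memory registry mapping content digests to record IDs."""

    def __init__(self) -> None:
        self._by_digest: dict[str, str] = {}

    def register(self, record_id: str, content: bytes) -> tuple[bool, str]:
        """Return (True, record_id) if the scan is new and was stored,
        or (False, existing_id) if identical bytes are already on file."""
        digest = hashlib.sha256(content).hexdigest()
        existing_id = self._by_digest.get(digest)
        if existing_id is not None:
            return False, existing_id
        self._by_digest[digest] = record_id
        return True, record_id


registry = ScanRegistry()
print(registry.register("RC-0001", b"page bytes"))  # (True, 'RC-0001')
print(registry.register("RC-0002", b"page bytes"))  # (False, 'RC-0001')
```

A production system would need persistent storage shared across departments and vendors, which is precisely the coordination the earlier digitisation drives lacked.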

About this article

Published by The Daily Delhi

This article was produced by The Daily Delhi editorial desk and covers news in Delhi. See our editorial standards for how we use AI.
