Delhi's Directorate of Archives, headquartered on Shamnath Marg in Civil Lines, is sitting on a problem years in the making: a backlog of digitised government documents so riddled with duplicate image files that retrieval systems across at least three civic departments have slowed to a near halt. The issue surfaced formally during an internal audit conducted in the first quarter of 2026, but the conditions that produced it stretch back to at least 2014, when multiple agencies began scanning records independently, with no shared protocol and no central deduplication standard.
The timing matters for ordinary Delhiites. The AAP administration has repeatedly cited digital record access as a pillar of its governance transparency agenda, and the Delhi Metro Rail Corporation's Phase 4 expansion — which requires clearance of land acquisition documents stored in exactly these archives — is running against tight deadlines. Redundant image files do not merely waste storage space; they create version-control confusion that slows the issuance of official certificates, property records, and land titles that citizens need to navigate everything from mortgage approvals to court proceedings.
How the Duplication Happened
The problem is, at its core, a coordination failure that compounded over time. Between 2014 and 2019, the Delhi government ran at least four parallel digitisation drives. These covered revenue records at the Divisional Commissioner's office in Indraprastha Estate, property files at the Municipal Corporation of Delhi's Civic Centre headquarters on Minto Road, heritage documentation under the Delhi Urban Heritage Foundation, and birth and death registers managed by district offices across all eleven districts. Each drive produced its own image archive, stored on department-specific servers, with no shared naming convention and no automated check for files already scanned elsewhere.
A single land parcel in, say, Mehrauli or Shahdara might appear in the MCD's property database, the revenue department's digitised khasra records, and a heritage survey simultaneously — each instance a separate TIFF or JPEG file, often scanned at different resolutions, sometimes labelled with different document numbers for the same physical page. By early 2026, the Delhi State Data Centre in Dwarka, which hosts consolidated government data, had flagged storage utilisation levels that administrators described internally as unsustainable, though the precise figure has not been made public by any named official.
The Yamuna riverfront redevelopment files present a particularly acute version of the problem. Documents relating to flood-plain demarcation along the 22-kilometre stretch from Wazirabad to Okhla have been scanned by at least three separate agencies since 2017, producing overlapping records that have complicated both the cleanup project championed by successive Delhi governments and the legal proceedings around encroachment removal that courts have monitored for years.
What Comes Next for the Archives
The audit, completed in March 2026, recommended a deduplication exercise using hash-matching software, a standard tool that computes a digest of each file's bytes and flags files whose digests match. Matching at the binary level catches only byte-identical copies; the same page scanned at a different resolution or saved in a different format produces a different digest and would need a perceptual, content-based comparison to be caught. The Delhi e-Governance Society, which operates under the IT Department, has been tasked with overseeing the process. A pilot covering approximately 40,000 files from the South Delhi district revenue office was expected to begin by June 30, though no public confirmation of its launch had been issued as of this week.
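For readers curious about what hash matching involves, the idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the software the audit recommended or the Society plans to use: the function names, the choice of SHA-256, and the folder layout are all assumptions made for the example. It groups files whose bytes are exactly identical, and it will not catch the same page scanned twice at different resolutions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large TIFF scans are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` that share a digest (byte-identical copies)."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # Only digests seen more than once indicate duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates(Path("archive_root"))` on a hypothetical folder holding each department's scans would return one list per set of identical files, leaving the decision about which copy to keep to a human reviewer.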
For citizens, the practical advice is straightforward: if you are waiting on archived documents, particularly land records, old municipal permits, or heritage property certificates, approach the relevant district office in person rather than relying on the online portal, which draws from the unresolved central repository. The South and East Delhi district offices, in Sector 3, Sadiq Nagar and in Patparganj respectively, have maintained parallel physical files that, in many cases, remain the more reliable reference point until the digital cleanup is complete.
The broader lesson the audit draws is not complicated. When agencies digitise without talking to each other, the archive does not get smarter — it just gets bigger, and slower, and harder to trust. Delhi's government has known this problem was building for years. The question now is whether the deduplication drive moves fast enough to stop it from becoming a full-scale administrative breakdown.