Delhi's public administration is sitting on a problem that nobody built but everyone created: tens of thousands of duplicate images clogging the digital archives of multiple city agencies, slowing down record retrieval, inflating storage costs and, in at least some departments, making it nearly impossible to verify which version of a scanned document is the authoritative one. The problem did not arrive overnight. It accumulated across roughly a decade of rushed digitisation drives, each launched with fanfare and each run largely independently of the ones before it.
The timing matters because Delhi is in the middle of its most aggressive push yet to move citizen services fully online. The AAP government's expanded e-District portal, which now handles everything from domicile certificates to ration card amendments, depends on clean, deduplicated image records. Garbage in, garbage out — and right now, according to internal planning documents circulated among senior officials at the Delhi Secretariat in Indraprastha Estate, the backlog of unresolved duplicate image files runs across at least six major departments.
A Decade of Disconnected Digitisation
The roots of the problem trace back to the early 2010s, when the Municipal Corporation of Delhi and the Delhi State Archives on Shyam Nath Marg each began independent scanning programmes without a shared file-naming convention or a central deduplication protocol. The Delhi State Archives holds records dating to the Mughal period, and the pressure to digitise fragile documents quickly led staff to scan the same folios multiple times, partly as a precaution, partly because different teams working on the same collection did not always communicate.
By 2018, the problem had migrated into the newer systems. The Delhi Development Authority's land records wing, operating out of its Vikas Sadan headquarters in INA, adopted a document management system that was incompatible with the format used by the Revenue Department's DARPAN portal, which went live around the same time. When files were transferred between the two systems, images were frequently re-uploaded rather than linked, creating a second layer of duplication.
The 2020-21 period accelerated everything. With offices shut during the pandemic, agencies across Delhi scrambled to get records online so that clerks working from home could access them. The National Informatics Centre, which supports several Delhi government portals, flagged storage anomalies in two departments during that period, but bandwidth constraints and the immediate public health pressure meant remediation was deferred. That deferral has compounded since. Rough internal estimates — shared in planning committee notes but not yet published — suggest that in some departments, duplicate images account for between 20 and 35 percent of total stored files.
What Deduplication Actually Requires
Fixing the problem is neither simple nor cheap. Deduplication at scale requires either manual review, which is labour-intensive and slow, or automated hash-matching software that flags files for deletion or archiving: cryptographic hashes catch byte-for-byte identical files, while perceptual hashes catch near-identical images such as rescans of the same page. The Delhi government's IT Department, headquartered in the Delhi Secretariat complex, began a pilot deduplication exercise in the Civil Supplies Department in early 2025. That pilot covered roughly 1.2 lakh (120,000) scanned files and took four months to complete, or about 30,000 files a month. Scaling that effort across all affected agencies, by the IT Department's own preliminary estimates shared in a 2025 budget working group, would require sustained funding and a dedicated technical team through at least 2027.
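For readers curious what hash-matching looks like in practice, the following is a minimal sketch of the simplest case: finding byte-for-byte identical files by comparing SHA-256 digests. It is an illustration only, not the software used in the Delhi pilot, which the planning documents do not describe. The function names are hypothetical, and the sketch does not attempt near-identical matching (for example, two separate scans of the same folio), which would need a perceptual hash computed on the decoded image.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-for-byte identical.

    Hypothetical helper for illustration; it only reports groups and never
    deletes anything, leaving the choice of authoritative copy to a reviewer.
    """
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)
    # A digest seen more than once marks a set of duplicate files.
    return [group for group in by_hash.values() if len(group) > 1]
```

Calling `find_exact_duplicates(Path("scans"))` on a hypothetical `scans` directory would return one list per set of identical files. A reviewer, or a rule such as "keep the oldest upload", would still have to decide which copy is authoritative before anything is archived or deleted.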
The political dimension is real. The BJP-controlled central government, through the Ministry of Electronics and Information Technology, has its own digitisation standards under the Digital India programme, and Delhi, as a Union Territory with a complicated power-sharing arrangement, has repeatedly had to negotiate whether state-level systems will be integrated with or subordinate to central platforms. That negotiation has sometimes slowed decisions that could have caught the duplication problem earlier.
For ordinary Delhiites, the practical consequence shows up in wait times. Residents applying for land mutation at tehsil offices in areas like Mehrauli or Shahdara have reported — and officials have acknowledged informally — that document verification sometimes stalls because clerks cannot quickly identify the correct image version in the system. The fix, when it comes, will be invisible. But the groundwork being laid now — cleaner file taxonomies, shared standards between the DDA and Revenue Department, and the expanded deduplication pilot — is what stands between the current muddle and a city administration that can actually deliver on its paperless promises.