The Daily Delhi


How Delhi's Government Records Ended Up Drowning in Duplicate Images — and Why It's Finally Being Fixed

Years of uncoordinated digital scanning drives across municipal offices left the capital's archives bloated, redundant, and increasingly unusable; here is how that happened.

By Delhi News Desk · Published 5 July 2026, 12:55 am



Delhi's sprawling civic bureaucracy has a clutter problem that goes deeper than the filing cabinets stacked floor-to-ceiling in the corridors of the Municipal Corporation of Delhi's headquarters at the Dr. S.P. Mukherjee Civic Centre. Digitisation projects launched over the past decade, some by the AAP government and some by central agencies under the Union Ministry of Housing and Urban Affairs, created tens of millions of scanned document images that were uploaded without deduplication checks. The result: the same land record, the same building plan, or the same voter roll page exists in some databases four or five times over.

This matters now because Delhi is midway through an accelerated push to make public records genuinely searchable ahead of the Phase 4 Delhi Metro expansion, which requires rapid land-acquisition clearances across corridors stretching from Janakpuri West to RK Ashram Marg. When land records are buried under layers of duplicate scans, acquisition officers lose weeks tracing the correct, authoritative version of a document. Delays compound. So does litigation.

How the Duplication Built Up

The problem did not appear overnight. Between 2015 and 2022, at least three separate digitisation initiatives ran concurrently in Delhi with little coordination between them. The Delhi government's own e-District portal, administered through the Department of Revenue, began scanning tehsil records at offices including the Tis Hazari civil courts complex and the South Delhi district collectorate in Saket. Simultaneously, the National Informatics Centre ran its own archiving exercise under a central government mandate. A third sweep was conducted by private vendors contracted by the Delhi Development Authority to digitise its project files going back to the 1960s.

Each initiative used different scanning resolutions, different file-naming conventions, and — crucially — different metadata schemas. When records were later consolidated into shared servers, automated systems had no reliable way to identify that three differently named files were images of the same physical page. Disk storage at the DDA's data centre in Dwarka reportedly ballooned as a consequence, though the authority has not published precise figures on total storage consumption.

The duplication problem was compounded by well-meaning but poorly sequenced emergency scans during the Covid-19 lockdowns of 2020 and 2021, when staff at offices including the Registrar of Assurances in Kashmere Gate were urged to digitise backlogs rapidly so that property transactions could continue with minimal in-person contact. Speed took precedence over verification. Documents already scanned once were scanned again because no central registry confirmed what had already been processed.

The Deduplication Drive Now Underway

The fix being rolled out this year relies on perceptual hashing, a technique that generates a short numerical fingerprint from the visual content of an image. Because visually similar images produce similar fingerprints, software can flag near-identical files even when they have different names or were saved at slightly different resolutions. The National Informatics Centre began piloting the approach at the Shahdara revenue circle in early 2026, targeting roughly 1.4 lakh scanned pages as a test batch. Early results from that pilot, shared at a February 2026 inter-departmental review, indicated a duplication rate of around 34 percent across the sample, meaning roughly one in three images was a redundant copy of something already in the system.
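For readers curious how a perceptual hash works in practice, the sketch below shows one common variant, the average hash. It is illustrative only: the article's sources do not say which algorithm the National Informatics Centre uses, and the function names, the 8x8 grid size, and the match threshold of 5 bits are all assumptions made for this example. Images are represented as plain grids of grayscale values so the code runs without an image library; a real pipeline would first decode each scan into such a grid.

```python
from typing import List

Grid = List[List[float]]  # grayscale pixel values, one inner list per row


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downsample to size x size by averaging rectangular blocks.

    Assumes the image is at least `size` pixels wide and tall.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        row = []
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, threshold: int = 5) -> bool:
    """Hypothetical rule: treat scans as duplicates if few bits differ."""
    return hamming(average_hash(a), average_hash(b)) <= threshold


# A synthetic "page" (left half bright), a rescan of the same layout at double
# the resolution and with lower contrast, and an unrelated page (top half bright).
page = [[255 if x < 8 else 0 for x in range(16)] for y in range(16)]
rescan = [[200 if x < 16 else 50 for x in range(32)] for y in range(32)]
other = [[255 if y < 8 else 0 for x in range(16)] for y in range(16)]

print(hamming(average_hash(page), average_hash(rescan)))  # 0
print(hamming(average_hash(page), average_hash(other)))   # 32
```

The rescan differs from the original in both resolution and brightness, so a byte-for-byte comparison of the two files would fail, yet the fingerprints match exactly. The threshold trades missed duplicates against false matches and would need tuning on real scans.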

The Delhi government's Revenue Department has since directed all 11 district offices to participate in a phased deduplication exercise running through October 2026. Priority is being given to mutation records and sale deed registers, the two document categories most frequently accessed during the Metro Phase 4 land acquisition process. Officers at the Rohini district collectorate, which covers several Phase 4 corridor plots in the northwest, began their local audit in June.

For ordinary Delhiites, the practical payoff should be measurable. Property owners waiting for mutation certificates — a process that currently takes between 21 and 45 working days at most tehsil offices — may see turnaround times drop once clerks are no longer hunting through redundant file trees to locate the authoritative scan. Legal professionals working out of Tis Hazari and Patiala House courts have long complained that mismatched document versions slow down title dispute proceedings. A cleaner archive does not solve the underlying land-records disputes that fill those courtrooms, but it removes one layer of preventable delay from a system that has enough of its own.


About this article

Published by The Daily Delhi

This article was produced by The Daily Delhi editorial desk and covers news in Delhi. See our editorial standards for how we use AI.
