Delhi's Department of Information Technology confirmed this past week that a systematic audit of the capital's centralised e-governance image repository has identified tens of thousands of duplicate and near-duplicate photograph files spread across at least seven separate departmental servers. The problem did not appear overnight. It is the product of nearly a decade of piecemeal digitisation, and officials are now racing to implement a deduplication protocol before Phase 4 of the Delhi Metro expansion — which is generating fresh volumes of documentation daily — makes the backlog unmanageable.
The stakes are higher than a cluttered hard drive. Public records stored by the Delhi government underpin everything from property title disputes in Shahjahanabad to land acquisition notices along the Janakpuri West–Krishna Park Extension Metro corridor. When the same image exists in multiple versions with conflicting metadata — different file names, different upload dates, different departmental tags — it creates legal ambiguity that can delay hearings at the Delhi High Court and stall infrastructure clearances. The problem has quietly been building since 2017, when the AAP administration launched its first major push to scan and upload legacy paper records from the Municipal Corporation of Delhi offices in Civil Lines.
How the Mess Was Made
The root cause is straightforward: there was never a single authority responsible for image standards. Between 2017 and 2023, at least four separate digitisation initiatives ran in parallel: the Delhi Archives modernisation programme based in the Old Secretariat building near Rajghat, the Delhi Urban Shelter Improvement Board's slum-mapping project, the Public Works Department's road-survey photo uploads, and the Delhi Jal Board's Yamuna-cleanup documentation drive. Each unit chose its own file format, resolution standard, and naming convention. JPEG and TIFF copies of identical government-issued maps were uploaded independently by teams that had no visibility into one another's work.
The Yamuna cleanup documentation alone, a politically charged body of work given the river's central role in disputes between the AAP government and the BJP-led central administration, produced an estimated 1.8 lakh photograph files between 2019 and 2024, according to an internal working-group presentation reviewed by The Daily Delhi. Spot checks found that roughly 30 percent of those files were exact or near-exact duplicates, which would amount to about 54,000 files if the sampled rate holds across the full set. Many were created when field teams re-uploaded images after network failures without checking whether the original transmission had succeeded. The Indraprastha Information Technology Institute, which provides technical support to Delhi's IT department under a services agreement, flagged the duplication rate in a November 2025 review, but a formal remediation plan was not signed off until this June.
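For readers curious how exact duplicates of the re-upload kind can be caught, the following is a minimal sketch and not a description of the department's actual tooling. It assumes byte-for-byte identical copies and groups them by a SHA-256 digest of the file contents, so differing file names and upload dates are ignored. The file names and byte strings are hypothetical stand-ins for real image files.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def content_digest(data: bytes) -> str:
    """SHA-256 of the raw file bytes; file names and upload dates play no part."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Return groups of file names whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[content_digest(data)].append(name)
    # Keep only digests shared by more than one file.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical uploads: the second entry is a retry of the first after a
# network failure, stored under a different name.
uploads = {
    "yamuna_site12_0001.jpg": b"\xff\xd8 sample image bytes A",
    "yamuna_site12_0001_retry.jpg": b"\xff\xd8 sample image bytes A",
    "yamuna_site14_0007.jpg": b"\xff\xd8 sample image bytes B",
}

duplicate_groups = group_exact_duplicates(uploads)
print(duplicate_groups)
```

A digest comparison of this kind catches only identical files. A copy that has been resized or re-saved in another format produces a different digest and needs a similarity-based method instead.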
What the Fix Looks Like — and Who Pays
The deduplication project, now formally titled the Delhi Digital Records Consolidation Initiative, will run in two phases. Phase one, covering the Delhi Archives and the Public Works Department servers at the Nirman Bhawan annexe in ITO, is budgeted at ₹4.2 crore and is scheduled for completion by December 2026. Phase two, which brings in the more politically sensitive Yamuna and urban-shelter image sets, has no confirmed budget line yet and depends on whether the Delhi government and the central Ministry of Housing can agree on shared storage standards — a negotiation that has stalled before.
The practical method is not exotic. Perceptual-hashing tools reduce each image to a compact fingerprint that changes little when a file is resized, recompressed, or converted between formats; they then compare fingerprints rather than raw pixels and flag pairs above a set similarity threshold for human review before deletion. The challenge in Delhi's case is that human review requires trained archivists who can distinguish a legitimately updated survey photograph from a true duplicate. The Delhi Archives currently employs fewer than a dozen qualified digital archivists for a collection that spans centuries of records.
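The idea can be illustrated with a small sketch of one simple perceptual hash, the "average hash". This is an assumption chosen for illustration; the article's sources do not say which algorithm or threshold the initiative will use. The sketch assumes each image has already been decoded, converted to grayscale, and shrunk to an 8x8 grid by an image library, so it works on plain lists of brightness values. The 0.9 threshold and the sample grids are hypothetical.

```python
from typing import List

Grid = List[List[int]]  # 8x8 grayscale values in the range 0-255


def average_hash(grid: Grid) -> int:
    """Build a fingerprint with one bit per pixel: 1 if brighter than the mean."""
    pixels = [value for row in grid for value in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for value in pixels:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two fingerprints differ."""
    return bin(a ^ b).count("1")


def similarity(a: int, b: int, n_bits: int = 64) -> float:
    """Share of matching bits, from 0.0 (all differ) to 1.0 (identical)."""
    return 1 - hamming_distance(a, b) / n_bits


def flag_for_review(first: Grid, second: Grid, threshold: float = 0.9) -> bool:
    """Flag a pair for human review; the threshold here is an assumed value."""
    return similarity(average_hash(first), average_hash(second)) >= threshold


# Hypothetical sample data: a gradient, a slightly brightened copy standing in
# for a recompressed file, and an inverted image standing in for a different one.
original = [[(row * 8 + col) * 4 for col in range(8)] for row in range(8)]
recompressed = [[min(255, value + 3) for value in row] for row in original]
inverted = [[252 - value for value in row] for row in original]

print(flag_for_review(original, recompressed))
print(flag_for_review(original, inverted))
```

The output of such a tool is only a list of candidate pairs. Deciding whether a flagged pair is a true duplicate or a legitimately updated photograph remains the archivist's call.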
For residents and legal professionals who rely on these records — particularly those with property matters in Old Delhi's Chandni Chowk ward or redevelopment disputes in Rohini — the practical advice is to request certified copies of any government image records through the Right to Information portal rather than relying on unofficial downloads, which may reflect the uncleaned, duplicate-laden version of the database. The consolidation process, once complete, is meant to produce a single canonical record for each file. Until it does, the archive remains a place where the same photograph can tell two different stories depending on which server you find it on.