Delhi's municipal and state government databases contain tens of thousands of duplicate images — redundant photographs clogging property records, voter ID files, ration card registries and public health portals — and the effort to identify and remove them has barely begun. Officials at the Delhi Urban Shelter Improvement Board, which manages housing records for resettlement colonies from Bhalswa in the north to Madanpur Khadar in the south-east, acknowledged the problem in internal notices circulated in late 2025, though no public remediation timeline has been formally announced.
The issue matters now because Delhi is midway through a push to digitise its civic infrastructure. The Delhi Metro Rail Corporation's Phase 4 expansion, which is extending lines to Janakpuri West and Tughlakabad, is generating new data sets that need to integrate with older archives. Where legacy records contain duplicate or mismatched images — faces attached to wrong names, properties photographed multiple times under different survey codes — the errors compound rather than disappear. The Aam Aadmi Party government's e-Governance push, centred on the Delhi Integrated Multi-Modal Transit System and linked citizen-services portals, has repeatedly flagged data quality as a bottleneck.
In practice, the burden falls on localities that were already digitised in a hurry. Chandni Chowk's property tax records, re-uploaded during a 2021 digitisation drive by the Municipal Corporation of Delhi's Central Zone office on Lothian Road, reportedly contain duplicate image entries for hundreds of heritage structures where photographs were submitted more than once by surveyors using different file-naming conventions. The MCD's own internal audit cell noted the discrepancy in a 2024 report, but no automated deduplication tool was budgeted for the current financial year.
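The simplest case described here, the same photograph uploaded twice under different file names, can in principle be caught by comparing file contents rather than names. The sketch below is illustrative only and does not describe any tool the MCD uses: the file names and byte strings are hypothetical, and it assumes the duplicates are byte-for-byte identical, which would not hold for re-scanned or re-compressed images.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for name, content in files.items():
        # The digest depends only on the content, never on the file name.
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(name)
    # Keep only digests shared by more than one file name.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical uploads: one photograph submitted under two naming conventions.
uploads = {
    "CC_0412_front.jpg": b"\xff\xd8 heritage facade bytes",
    "survey-2021-0412.jpg": b"\xff\xd8 heritage facade bytes",
    "CC_0413_front.jpg": b"\xff\xd8 a different building",
}

duplicate_groups = group_exact_duplicates(uploads)
print(duplicate_groups)
```

Here the two differently named copies of the same photograph end up in one group, while the third file is left alone.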
How Other Cities Are Handling It
The contrast with comparable cities is sharp. Mumbai's Brihanmumbai Municipal Corporation began deploying perceptual-hash deduplication software across its property and identity databases in 2023, after a Right to Information request revealed more than 80,000 duplicate images in the city's electoral roll photo archive alone. Seoul's e-Government Integrated Centre has run automated image-matching protocols since 2019 across the databases of all 25 of the city's gu (district) offices, reducing storage overhead by roughly 30 percent according to figures published by the Korean Ministry of the Interior. London's Government Digital Service mandated metadata-standardisation rules for all borough councils under the 2022 Data Quality Framework, which required councils to submit deduplication compliance reports by March 2024.
Delhi has no equivalent mandate yet. The National Informatics Centre, which hosts the back-end infrastructure for most Delhi government portals including the Delhi Jal Board's customer database, operates deduplication tools at the central level for Aadhaar-linked records. But state-level repositories that sit outside the Aadhaar stack, including land records managed by the Delhi Revenue Department and older scanned archives at the Secretariat on I.P. Estate, have no standardised deduplication layer. Researchers at IIT Delhi's Bharti School of Telecommunication Technology and Management have been in dialogue with state officials about piloting a low-cost hash-matching tool, though no formal contract has been signed.
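For readers unfamiliar with the technique, hash-matching of this kind generally means reducing each image to a short fingerprint and comparing fingerprints instead of pixels. The sketch below shows one common variant, an average hash compared by Hamming distance. It is an assumption-laden illustration, not the BMC's software or the tool under discussion at IIT Delhi: the 4x4 grayscale grids stand in for thumbnails that a real system would first produce by downscaling each image, and the `max_distance` threshold of 2 is arbitrary.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Fingerprint a small grayscale grid: one bit per pixel, set if above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: list[list[int]], b: list[list[int]], max_distance: int = 2) -> bool:
    """Treat two grids as probable duplicates if their fingerprints are close."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical 4x4 grayscale thumbnails (values 0-255).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
# The same picture, uniformly brightened (as after a re-scan or re-export).
brighter = [[min(255, p + 8) for p in row] for row in original]
# A different picture: bright top half, dark bottom half.
different = [
    [200, 210, 205, 215],
    [202, 212, 201, 211],
    [10, 20, 15, 25],
    [12, 22, 11, 21],
]

print(likely_duplicates(original, brighter))
print(likely_duplicates(original, different))
```

Unlike an exact content comparison, this approach tolerates small changes such as brightness shifts, which is why it suits photographs that were re-scanned or re-saved. The trade-off is that the threshold has to be tuned, and borderline pairs still need human review.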
What Needs to Happen Next
The practical consequence for residents is already visible. Applicants for domicile certificates at sub-divisional magistrate offices in Dwarka and Rohini have reported delays traced to duplicate or conflicting image records in the civil registration system. When two images are flagged as potential matches for a single applicant, the file is held for manual review — a queue that, at some SDM offices, runs to several weeks.
Officials familiar with the process say the most straightforward fix is a phased audit beginning with the highest-traffic databases: voter rolls, property tax records, and ration card archives. Mumbai's BMC completed a comparable first-phase audit in roughly eight months at a reported cost of around ₹3.2 crore. Delhi's databases are significantly larger, given a population recorded at about 1.68 crore in the 2011 census (the most recent completed, since the 2021 count was postponed) and estimated to be considerably higher today, but so is the potential payoff in processing speed and storage savings. The window to act is narrow: Phase 4 Metro stations are expected to trigger a new wave of property surveys and resident registrations along corridors through Lajpat Nagar and RK Ashram Marg, adding fresh data to systems that are already carrying dead weight.