Delhi's municipal and heritage agencies are sitting on tens of thousands of duplicate digital images — redundant photographs clogging public infrastructure databases, property records, and cultural archive systems — and the effort to clean them up is running well behind comparable programmes in London, Seoul, and São Paulo. The problem surfaced sharply this year when the Archaeological Survey of India's Delhi Circle flagged overlapping photographic records across at least three separate internal catalogues covering monuments in Mehrauli and Nizamuddin, creating confusion in conservation planning and public-access portals.
The stakes are practical, not merely administrative. Delhi Metro Rail Corporation is midway through its Phase 4 expansion, which requires accurate geo-tagged image records of underground utility corridors and above-ground heritage buffers. Duplicate or mislabelled photographs in the asset management system can — and, according to project documentation reviewed by this reporter, have — caused surveyors to re-inspect sites that had already been cleared, burning time on a project where every delay compounds costs already running above the original 2019 estimates.
Where Delhi Falls Short
Transport for London began a structured deduplication programme for its infrastructure image databases in 2022, applying automated hash-matching software across its engineering asset library. Seoul's Smart City Data Hub, launched under the city's Digital Seoul Master Plan, runs continuous deduplication across municipal image repositories linked to its urban mapping service, with the city publishing annual data-quality reports. São Paulo's GeoSampa platform, managed by the city's Secretaria Municipal de Urbanismo e Licenciamento, has used automated image-matching since 2021 to keep its favela-mapping and zoning-record photographs consistent.
Delhi has no single equivalent platform. The Delhi Development Authority, the Municipal Corporation of Delhi, and the ASI each maintain separate image repositories with no common deduplication protocol between them. Chandni Chowk's streetscape, for instance, has been photographed under at least four separate survey programmes since 2018: by DDA urban planners, MCD building inspectors, the National Mission for Clean Ganga documentation teams covering the Yamuna floodplain edge, and independent researchers working with the Indian National Trust for Art and Cultural Heritage. None of those collections is automatically cross-referenced with the others.
The Yamuna cleanup effort adds another layer. Documentation of riverbank encroachments between Wazirabad Barrage and Okhla has generated an estimated 2.4 lakh photographs since the Supreme Court-monitored cleanup orders began in earnest in 2023, according to project filings cited in National Green Tribunal proceedings. Field workers and legal monitors have both noted that the same photographs, attributed to different inspection dates, have been submitted as evidence in the same compliance reports, an error that courts have flagged more than once.
What a Fix Would Actually Look Like
The technology required is not exotic. Perceptual hashing, a method that computes a compact fingerprint from each image's visual content so that near-identical pictures yield near-identical fingerprints, is standard in content management at organisations as varied as the British Library and Wikimedia Commons. Comparing fingerprints, rather than the image files themselves, is what lets such a system flag a duplicate even after a photograph has been resized or recompressed. The cost of deploying such a system across a single agency's database runs in the range of a few lakh rupees for licensing and integration, well within the discretionary budgets of bodies like DMRC or DDA. The obstacle in Delhi is not money or technology; it is the absence of a coordinating authority with jurisdiction over multiple agencies' data standards.
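For readers who want to see how little code the core idea needs, here is a minimal sketch of one common perceptual hash, the difference hash (dHash). It is an illustration only: nothing in the reporting indicates which algorithm any of the agencies or institutions named here actually uses. The function names, the 8x8 fingerprint size, and the match threshold of 10 bits are all assumptions chosen for the example, and the input is assumed to be a plain grid of grayscale values rather than an image file, which a real system would first decode with an imaging library.

```python
from typing import List

# Hypothetical input format: rows of 0-255 grayscale pixel values.
Gray = List[List[int]]


def shrink(img: Gray, width: int, height: int) -> List[List[float]]:
    """Downscale by averaging the source pixels that fall in each target cell."""
    src_h, src_w = len(img), len(img[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max(y0 + 1, (y + 1) * src_h // height)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max(x0 + 1, (x + 1) * src_w // width)
            cells = [img[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def dhash(img: Gray, size: int = 8) -> int:
    """Difference hash: one bit per horizontal neighbour pair in a tiny thumbnail."""
    small = shrink(img, size + 1, size)  # size+1 columns give `size` comparisons per row
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, threshold: int = 10) -> bool:
    """Flag two images as likely duplicates; the threshold is an assumed tuning value."""
    return hamming(dhash(a), dhash(b)) <= threshold
```

Because each bit records only whether one patch of the picture is brighter than its neighbour, uniformly lightening or darkening a photograph leaves the fingerprint unchanged, while a genuinely different scene flips many bits. In practice the threshold would have to be tuned against each agency's own archive before anyone relied on the results.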
There is a potential opening. The Delhi government's IT department has been developing a unified citizen-services data framework under the e-District Delhi programme, and integration of image deduplication standards could, in principle, be written into the next revision of that framework's technical specifications. Several civic technology organisations working out of the Okhla Industrial Area and the startup corridor near Netaji Subhash Place have the in-house capability to build such tools, if given clear procurement pathways.
For now, heritage conservationists working in areas like Shahjahanabad and the Lodhi Colony art district are doing the deduplication manually — cross-referencing photographs by eye, documenting discrepancies in spreadsheets, and flagging errors to agencies that may or may not act on the reports. It is slow, expensive in human hours, and not remotely scalable to a city that generates new civic image data every day. The comparison with Seoul or London is not flattering, but it is useful: both cities solved this with policy decisions, not just software purchases.