Delhi's Directorate of Information and Publicity quietly began a system-wide audit of its digital asset library in March 2026, and what officials found was, by any measure, an embarrassment: tens of thousands of duplicate image files spread across servers maintained by at least six separate municipal departments, with some photographs stored in as many as nine copies across different drives. The cleanup effort, which involves coordination between the Delhi government's IT department at ITO and the Municipal Corporation of Delhi's own digital records division, has become one of the more unglamorous but consequential administrative projects the capital has undertaken in years.
The problem matters now because Delhi is midway through a digitisation push tied to Phase 4 of the Delhi Metro expansion, which requires integrated communications infrastructure across new corridors including the Janakpuri West–RK Ashram Marg line. Redundant files slow backup cycles, inflate cloud storage costs, and, critically, create version-control nightmares when press offices need to pull verified, current images of infrastructure projects or public health campaigns. With the Yamuna riverfront development project generating fresh documentation weekly, the growing volume of incoming files is only compounding the problem.
How the Duplication Problem Grew
The roots go back to 2010, when Delhi's various departments began digitising records independently, without a shared taxonomy or central repository. The Delhi Secretariat at IP Estate, the PWD communications office in Nirman Bhawan, and the Delhi Jal Board all built their own image libraries in parallel. No interoperability standard was enforced. When the Aam Aadmi Party government launched its mohalla clinic documentation programme after 2015, field photographers uploaded images directly to WhatsApp groups that were later manually downloaded and re-uploaded by junior staff — generating duplicate after duplicate with no automated deduplication filter in place.
Compounding the chaos was the city's reliance on older content management software that lacked hash-based duplicate detection, a feature standard in enterprise digital asset management platforms since roughly 2012. The Delhi government's contract with its primary IT service vendor — renewed in 2019 — did not include a mandatory deduplication module, an omission that administrators now acknowledge created avoidable redundancy across shared drives at data centres in Okhla Phase II.
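For readers unfamiliar with the term, hash-based duplicate detection can be sketched in a few lines. The example below is an illustration only, not the software used by any Delhi department: it assumes the files sit in an ordinary directory tree, and the function names `file_digest` and `find_exact_duplicates` are hypothetical. It catches byte-for-byte copies only; a resized or re-compressed image produces a different digest.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> Dict[str, List[Path]]:
    """Group files under `root` by digest and keep only groups with more than one file."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Each group returned holds files with identical contents regardless of filename or folder, which is the situation created when the same photograph is downloaded and re-uploaded by hand.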
By 2024, storage costs had become measurable enough to flag internally. Cloud hosting expenditure across Delhi government departments rose substantially between 2021 and 2024, with image and video assets accounting for a disproportionate share of that growth, according to budget documents tabled in the Delhi Assembly session of February 2025. The March 2026 audit was ordered after an internal review found that more than 40 percent of the files in the Directorate of Information and Publicity's primary archive were exact or near-exact duplicates of images already held elsewhere in the system.
What the Cleanup Actually Involves
The deduplication exercise is being handled in two phases. The first, running through August 2026, focuses on the Directorate of Information and Publicity's own servers, using automated perceptual hashing tools to flag images that are visually identical even if stored under different filenames or at different resolutions. The second, expected to begin in October, will attempt to harmonise records held by the MCD, Delhi Jal Board, and the Delhi Tourism and Transportation Development Corporation, organisations that have historically guarded their own data infrastructure with limited enthusiasm for central coordination.
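The audit's specific tools have not been named, so the following is only a sketch of one simple perceptual hash, the "average hash", to show why a resized copy can still be matched. It assumes each image has already been decoded into a grid of grayscale values (0 to 255) at least 8 pixels on each side; the names `shrink`, `average_hash`, and `hamming` are hypothetical.

```python
from typing import List

Grid = List[List[float]]  # rows of grayscale pixel values, 0-255


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downscale to size x size by averaging rectangular blocks of pixels."""
    height, width = len(pixels), len(pixels[0])
    small: Grid = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Set one bit per cell of the shrunken image: 1 if brighter than the mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")
```

Two images whose hashes differ in only a few bits are candidates for being the same photograph at different resolutions or compression levels. The cut-off is a tuning choice, and flagged pairs would normally be reviewed before anything is deleted.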
Staff at the Delhi government's e-Governance unit on Parliament Street have been tasked with building a unified metadata standard that will apply to all new uploads from July 2026 onward. New images tied to public projects — including documentation of the Chandni Chowk redevelopment and Yamuna floodplain restoration work — will be tagged with department codes, photographer IDs, and project reference numbers before entering the shared repository.
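The standard itself has not been published. As a rough illustration of what a pre-upload check could look like, the sketch below assumes the three tags are supplied as plain text fields; the field names and the function `missing_tags` are hypothetical and not drawn from any government specification.

```python
from typing import Dict, List

# Hypothetical field names for the three tags described above.
REQUIRED_TAGS = ("department_code", "photographer_id", "project_ref")


def missing_tags(tags: Dict[str, str]) -> List[str]:
    """Return the required tags that are absent or blank in an upload's metadata."""
    return [name for name in REQUIRED_TAGS if not tags.get(name, "").strip()]
```

An upload would be admitted to the shared repository only when `missing_tags` returns an empty list.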
For ordinary Delhiites the practical stakes are indirect but real. Faster, cleaner digital archives mean press releases go out with verified, correctly captioned images, public health campaign materials reach RWA notice boards without version confusion, and RTI applicants asking for project photographs get responsive, organised replies rather than contradictory file dumps. The audit's findings are expected to be placed before a standing committee of the Delhi Assembly by September 2026, which is when the full scale of the storage cost savings — and the administrative failures that made them necessary — will become a matter of public record.