Delhi's push to digitise public records has generated a storage crisis hiding in plain sight. Across municipal departments, Delhi government portals, and the city's expanding network of surveillance and civic infrastructure, duplicate image files now account for a substantial share of total digital storage consumption — wasting public money and creating bottlenecks in record retrieval at a time when the city is racing to modernise.
The scale of the problem has come into sharper focus as the Delhi Metro Rail Corporation's Phase 4 expansion generates thousands of new engineering drawings, site photographs, and scanned permits monthly. Add to that the Municipal Corporation of Delhi's property tax digitisation programme, the ongoing Yamuna riverfront documentation initiative near the Rajghat ghats, and the Delhi Police's expanding CCTV network covering areas from Chandni Chowk to Dwarka Sector 21, and the volume of image data being produced — and carelessly duplicated — is enormous.
What the Data Actually Shows
Storage wastage from duplicate files is not a trivial technical nuisance. Industry benchmarks from enterprise data management studies consistently find that between 25 and 40 percent of unmanaged digital storage in large government deployments consists of redundant or duplicate files, with image files — JPEGs, PNGs, scanned PDFs — among the worst offenders. A single government scanning drive, where officers photograph the same document from multiple angles or rescan rejected uploads, can generate five or six copies of the same image without any formal deduplication step in the workflow.
In Delhi's context, the numbers carry real financial weight. The Delhi government's IT Department has been steadily expanding cloud and on-premises server capacity at the Delhi Secretariat complex in ITO. Server storage at enterprise government rates in India currently runs between ₹8 and ₹15 per gigabyte per month depending on redundancy tier, so even a few hundred terabytes of genuinely redundant image data translates into annual expenditure of several crore rupees for no operational benefit. The problem compounds because duplicates also slow down search and retrieval systems, which affects frontline services at Jan Seva Kendras across districts including Shahdara, South Delhi, and Outer Delhi.
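The arithmetic behind the "several crore" figure can be checked with a short calculation. The sketch below is illustrative only: the 300-terabyte volume is an assumed stand-in for "a few hundred terabytes", decimal terabytes are assumed, and the function name is hypothetical. Only the per-gigabyte rates come from the figures quoted above.

```python
# Illustrative estimate. The 300 TB volume is an assumption, not a reported figure.
GB_PER_TB = 1_000                # assumes decimal terabytes
RUPEES_PER_CRORE = 10_000_000    # 1 crore = 10 million rupees


def annual_cost_crore(redundant_tb, rate_per_gb_month):
    """Yearly cost, in crore rupees, of storing redundant data."""
    monthly_rupees = redundant_tb * GB_PER_TB * rate_per_gb_month
    return monthly_rupees * 12 / RUPEES_PER_CRORE


low = annual_cost_crore(300, 8)     # cheapest quoted tier
high = annual_cost_crore(300, 15)   # most expensive quoted tier
print(f"{low:.2f} to {high:.2f} crore rupees per year")
```

Under those assumptions the redundant data costs between roughly 2.9 and 5.4 crore rupees a year, and the figure scales linearly with the volume assumed.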
The Delhi e-District portal, which handles applications for domicile certificates, income certificates, and land records, processes tens of thousands of document image uploads weekly. Without automated duplicate image replacement protocols — systems that detect visually identical or near-identical image files and replace the redundant copies with pointers to a single stored master — each upload event risks adding another copy to an already bloated archive. Pilot programmes in comparable Indian metropolitan administrations have demonstrated storage savings of 20 to 35 percent following systematic deduplication drives, according to published assessments by the National Informatics Centre.
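The "pointer to a single stored master" idea can be sketched as a content-addressed store. This is a minimal in-memory illustration, not a description of any system the portal actually runs: the class name, method names, and file names are hypothetical, and it catches only byte-identical uploads. Near-identical images would need a perceptual comparison on top.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps one master copy per unique byte content."""

    def __init__(self):
        self.masters = {}    # SHA-256 digest -> stored bytes (the master copy)
        self.pointers = {}   # upload name -> digest of its master

    def upload(self, name, data):
        """Record an upload; return True only if new bytes had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.masters
        if is_new:
            self.masters[digest] = data
        self.pointers[name] = digest   # duplicates become pointers, not copies
        return is_new

    def fetch(self, name):
        """Resolve an upload name to the bytes of its master copy."""
        return self.masters[self.pointers[name]]


store = DedupStore()
scan = b"placeholder bytes standing in for a scanned certificate"
store.upload("income_cert_v1.png", scan)
store.upload("income_cert_reupload.png", scan)        # same bytes, new name
store.upload("domicile_cert.png", b"a different document")
print(len(store.pointers), "uploads,", len(store.masters), "stored masters")
```

Every upload name still resolves to its document, so nothing is lost from the applicant's point of view, while the repeated scan occupies storage only once.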
What Delhi's Civic Agencies Are — and Aren't — Doing
The Municipal Corporation of Delhi launched its unified property records portal in phases starting in 2023, but deduplication tooling was not built into the original architecture. Engineers at the department's data centre in Patparganj Industrial Area have reportedly been flagging the issue internally, though no formal public timeline for a deduplication overhaul has been announced. The Delhi Metro Rail Corporation, to its credit, moved to a structured Engineering Document Management System ahead of Phase 4 civil works, which includes version control features that reduce — though do not eliminate — file-level duplication.
For residents and small businesses interacting with government portals, the practical effects of unchecked duplicate image files are slower portal loading times, failed uploads when servers are under strain, and delays in certificate issuance. Citizens filing applications at the Pragati Maidan-area service centres have experienced system lag during peak submission windows, though attributing that lag solely to storage inefficiency would require a formal audit.
The fix is technically straightforward: perceptual hashing algorithms can identify visually duplicate images even when file names, formats, or compression levels differ, and automated replacement pipelines can consolidate storage without deleting source documents. The harder problem is institutional: getting siloed departments to agree on shared data standards and invest in deduplication infrastructure before storage bills force the issue. Based on the trajectory of Delhi's digitisation calendar, that reckoning may arrive before the current financial year closes in March 2027.
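For readers curious what perceptual hashing involves, here is a minimal sketch of one simple variant, the average hash. It is an illustration under stated assumptions rather than the method any Delhi agency uses: images are assumed to be already decoded into grids of grayscale values at least 8 pixels on each side, and all function names are hypothetical.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values, one inner list per row


def shrink(pixels: Grid, size: int = 8) -> List[float]:
    """Downscale a grid to size x size cells by averaging each block."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Set one bit per cell: 1 if the cell is brighter than the overall mean."""
    cells = shrink(pixels, size)
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 test images: a horizontal gradient, a slightly brighter
# copy (standing in for a rescan), and a vertical gradient (a different image).
original = [[x * 16 for x in range(16)] for _ in range(16)]
rescan = [[min(255, v + 3) for v in row] for row in original]
different = [[y * 16 for _ in range(16)] for y in range(16)]

print(hamming(average_hash(original), average_hash(rescan)))
print(hamming(average_hash(original), average_hash(different)))
```

The brighter copy hashes identically to the original (distance 0), while the unrelated image differs in 32 of 64 bits. In practice a small distance threshold separates near-duplicates from distinct images, and choosing that threshold is a tuning decision that depends on the archive.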