Over the decades I have driven far too many defensible deletion, classification and similar initiatives aimed at removing ROT (redundant, obsolete and trivial data) from corporate local, network and cloud data stores. I have wondered many times whether these were Sisyphean tasks, even as I generated ROI estimates based on data reduction to justify the technology and labor costs. Most of these projects achieved tangible short-term corporate and user benefits that evaporated in the face of evolving systems, work practices and data types. I have heard fellow consultants joke about categorization or retention taxonomy deliverables that are outdated before they are even finalized. IT, Security, Legal and Compliance stakeholders can barely keep up with the barrage of new content and collaboration data streams demanded or adopted by employees. As Jeff Goldblum could have said, “Data finds a way.”
Hundreds of tech companies promise to use AI/ML on corporate systems to solve the classification challenge. After all, eDiscovery has leveraged these technologies to tackle multi-terabyte collections. I have even seen success in some narrowly focused scenarios where the company invested seriously in keeping the systems evergreen. Still, most classification or retention expiry initiatives never make it past the scoping phase as stakeholders grapple with the scale and complexity. I used to blame Google and Microsoft for promising business users unlimited storage in 2014 and 2016. Before that, conscientious knowledge workers frantically filed email and documents or routed them to local/network drives before storage limits locked up their mailboxes or home drives. If storage is unlimited, why do many users still spend hours daily deleting, filing or tagging new data from all of their different systems?
Data and time are the coin of the Knowledge Worker economy. Most of us learned to purge and organize our data for efficient retrieval when network and mailbox searches were a joke. Oh wait, I still cannot search across all my M365, Gmail, client, OneBox and other systems. Until someone comes up with a better way to work without commingling private data or ‘crossing the streams’, I will continue to zero out my Inboxes and file client work product.
Microsoft seems to be acknowledging that users need to ‘clean house’ more than they need to retain labeled documents. Microsoft is rolling out a behavior change for deletion of online files that is similar to how it handles files under Legal Hold. Instead of the user getting the ‘file cannot be deleted’ error message, the default behavior now routes a copy to the “Preservation Hold Library” for eDiscovery and other compliance retrieval. This does mean that a company can create an effective ‘file journaling policy’ by applying a default retention label to all new OneDrive/SharePoint files. So a user deletion removes some of the noise from their view or search, but the file will still count against their storage. Ah yes, all of that ‘unlimited lifetime storage’ marketing crumbled last year as Microsoft and Google imposed new limits. This change may give M365 admins a bit of breathing room, but my clients can tell you how fast users on Legal Hold run up against their 2 TB OneDrive or 100 GB mailbox limits. While this change will not affect your eDiscovery searches, it makes me wonder whether we have reached the tipping point in the filing vs. finding debate. My personal belief is that it is one of the early indicators that deletion is dead and we need to rethink data lifecycle management. Your mileage may vary and I am always open to counterarguments.