Senior Managing Director, FTI Consulting
After significant growth in 2020, cloud storage providers continue to see increasing adoption within enterprises, which in turn is spurring new legal and compliance challenges. The cloud storage market is forecast to grow by more than 20% in the next five years across a range of established and emerging solution providers. With this growth, organizations are scrambling to make sense of how their reliance on providers like Microsoft, Google, Amazon and Dropbox affects their governance and discovery processes.
Documents residing in cloud storage accounts are increasingly coming into scope in digital forensic investigations such as IP theft, regulatory, corruption, merger clearance and civil matters. In any number of legal scenarios, organizations may be obligated to place preservation holds on data residing in cloud storage, collect documents from cloud accounts and produce this data to regulators or courts.
Accessing this type of data for an investigation can be a complicated undertaking no matter the provider, but in several recent matters, our teams have identified specific challenges in extracting forensic artifacts from cloud-based applications. With the most popular platforms in use by hundreds of millions of users around the globe, cloud storage and file sharing data is increasingly arising as a source of evidence in investigations.
This article provides a close examination of conducting digital forensics investigations within cloud-based file sharing applications, and the key challenges teams may face when this source comes into scope in an e-discovery matter.
Identifying the type of account being used is a critical first step in an investigation involving cloud data. Reporting, logging, collection and analysis capabilities vary greatly between the different applications and versions. One of the biggest disparities is the ability to recover deleted data—for example, soft deleted data (data that has been moved to the trash bin but not permanently disposed of) may be retrievable for anywhere between 30 and 180 days depending on an account’s settings. Below is an overview of the key differences across some of the tiers currently available for various platforms, such as Dropbox, OneDrive, Box, Microsoft 365 and Google Workspace.
One important file deletion issue that can arise in certain applications is the distinction between “hard deleted” and “soft deleted” files. Hard deleted files have been removed from the trash bin and permanently erased. Soft deleted files have been moved to the trash bin and may remain recoverable for a period of time that depends on the version in use.
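The recovery window described above is simple date arithmetic, and it can help to work it out explicitly when triaging a matter. The following is a minimal sketch, not any provider's logic: the function names are hypothetical, and the retention period is an input that must be confirmed against the actual account settings.

```python
from datetime import datetime, timedelta, timezone


def recovery_deadline(deleted_at, retention_days):
    """Return the moment a soft deleted file stops being recoverable.

    retention_days is an assumption supplied by the examiner (for example
    30 or 180), taken from the account's settings.
    """
    return deleted_at + timedelta(days=retention_days)


def is_recoverable(deleted_at, retention_days, now, hard_deleted=False):
    """Hard deleted files are never recoverable; soft deleted files are
    recoverable only until the retention window closes."""
    if hard_deleted:
        return False
    return now < recovery_deadline(deleted_at, retention_days)


if __name__ == "__main__":
    deleted = datetime(2021, 1, 1, tzinfo=timezone.utc)
    checked = datetime(2021, 2, 15, tzinfo=timezone.utc)
    for days in (30, 180):
        print(days, recovery_deadline(deleted, days).date(),
              is_recoverable(deleted, days, checked))
```

The same file deleted on the same day can be gone under a 30-day setting and still retrievable under a 180-day setting, which is why identifying the account type comes first.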
Investigators have several options for collecting data from cloud-based storage and file share applications: via an account’s web interface, from the computer’s local application or with a third-party application. No matter which method is selected, quality control is paramount. Investigators must compare collected data back to the live interface to identify any gaps in the dataset that result from sync errors or a collection tool’s failure to collect data from the trash bin or events log. An experienced forensic examiner should be consulted to ensure the metadata matches and provide documentation that the collection methodology worked.
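The quality-control comparison described above can be sketched as a reconciliation of two listings. This example assumes both the live interface and the collection tool can be exported as a mapping of file path to content hash; the function name and the dictionary layout are hypothetical, and real exports will need normalizing before they can be compared this way.

```python
def find_collection_gaps(live_listing, collected):
    """Compare a live-interface listing with a collected dataset.

    Both arguments are dicts mapping file path -> content hash
    (an assumed export format, not a vendor schema).
    """
    live_paths = set(live_listing)
    collected_paths = set(collected)
    return {
        # Present in the live account but absent from the collection,
        # for example trash bin items the tool skipped.
        "missing": sorted(live_paths - collected_paths),
        # Collected but no longer visible in the live account.
        "unexpected": sorted(collected_paths - live_paths),
        # Present on both sides but with different content hashes,
        # which may point to a sync error.
        "mismatched": sorted(
            path for path in live_paths & collected_paths
            if live_listing[path] != collected[path]
        ),
    }


if __name__ == "__main__":
    live = {"/a.docx": "h1", "/b.xlsx": "h2", "/trash/c.pdf": "h3"}
    got = {"/a.docx": "h1", "/b.xlsx": "CHANGED"}
    print(find_collection_gaps(live, got))
```

Any non-empty list is a prompt for follow-up and documentation rather than proof of a problem on its own.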
If using the locally installed application, traditional collection methodologies should be sufficient, as the files are maintained as if they are stored on the local computer’s hard drive and file system metadata can be preserved. Provided that the organization has access to the user’s computer, and the files have been downloaded/synced to the local folder on the custodian’s machine, this tends to be the ideal option—especially for matters where metadata is material to the case. Admin credentials aren’t needed and the files can be imaged directly into a forensic tool.
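To illustrate what preserving file system metadata looks like for a locally synced folder, here is a minimal sketch that records a hash, size and last-modified time for each file. It is not a substitute for forensic imaging; it assumes it is run against a working copy or a mounted image, and the function names and manifest fields are hypothetical.

```python
import hashlib
import os
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(sync_root):
    """Walk a synced folder and record basic metadata for every file."""
    manifest = []
    for folder, _subdirs, names in os.walk(sync_root):
        for name in sorted(names):
            full_path = os.path.join(folder, name)
            info = os.stat(full_path)
            manifest.append({
                "path": os.path.relpath(full_path, sync_root),
                "size": info.st_size,
                "modified_utc": datetime.fromtimestamp(
                    info.st_mtime, tz=timezone.utc
                ).isoformat(),
                "sha256": sha256_of(full_path),
            })
    return manifest
```

Reading files on a live machine can itself change access times, which is one reason examiners typically work from an image rather than the original disk.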
Collecting from a web interface is a good alternative when the “host” computer is not available, or the collection must take place remotely. It requires admin access, however, and file system metadata is not maintained when extracting files via the browser. This may or may not be an issue depending on the nature of the case.
Another option is to collect data with third-party tools that use vendor-provided APIs to connect directly to the environment. These tools require admin credentials. With this method, metadata can be preserved: native files are extracted, and the application collects the corresponding metadata, which can be overlaid onto the native files (thus restoring the metadata for the original documents).
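The overlay step can be thought of as a join between the extracted native files and the metadata records gathered through the API. The sketch below shows that join under stated assumptions: the `file_id`, `original_path`, `modified` and `owner` fields are hypothetical and do not reflect any vendor's schema, and commercial collection tools handle this step internally.

```python
def overlay_metadata(natives, metadata_records):
    """Attach API-collected metadata to extracted native files.

    natives: dict mapping a file identifier -> local path of the
        extracted native file (assumed layout).
    metadata_records: list of dicts with hypothetical keys
        file_id, original_path, modified, owner.
    Returns (overlaid, unmatched), where unmatched lists the identifiers
    of native files that had no metadata record.
    """
    by_id = {record["file_id"]: record for record in metadata_records}
    overlaid, unmatched = [], []
    for file_id, local_path in sorted(natives.items()):
        record = by_id.get(file_id)
        if record is None:
            unmatched.append(file_id)
            continue
        overlaid.append({
            "file_id": file_id,
            "local_path": local_path,
            "original_path": record["original_path"],
            "modified": record["modified"],
            "owner": record["owner"],
        })
    return overlaid, unmatched
```

Keeping the metadata alongside the native file, rather than rewriting timestamps on the file itself, leaves the extracted copy untouched while still carrying the original values into review.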
Dropbox provides a prime example of how using an admin console can provide increased access for investigators. With an admin console, investigators are able to examine a user’s activity, which can be instrumental in identifying suspicious behavior and piecing together additional artifacts that may support the investigation. In Dropbox’s interface, this can inform investigators about user location, when files were downloaded or transferred, soft deletes, hard deletes, previews, uploads, etc. Investigators may create an Excel file of user activity and review that natively, or input it into a database and run queries against it to determine activities on specific dates.
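As an illustration of the database approach, the sketch below loads an exported activity log into SQLite and queries it for selected events on a given date. It assumes the log has been saved as CSV with the columns `timestamp`, `user`, `event` and `path`, and that timestamps start with an ISO date; these column names and event labels are hypothetical, and a real export will need to be mapped to them.

```python
import csv
import io
import sqlite3


def load_activity(conn, csv_text):
    """Load an activity export (CSV text, assumed columns) into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity "
        "(ts TEXT, actor TEXT, event TEXT, path TEXT)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO activity VALUES (?, ?, ?, ?)",
        [(r["timestamp"], r["user"], r["event"], r["path"]) for r in rows],
    )
    conn.commit()


def events_on(conn, day, event_types):
    """Return matching events for one calendar day (YYYY-MM-DD)."""
    placeholders = ",".join("?" for _ in event_types)
    sql = (
        "SELECT ts, actor, event, path FROM activity "
        f"WHERE substr(ts, 1, 10) = ? AND event IN ({placeholders}) "
        "ORDER BY ts"
    )
    return conn.execute(sql, [day, *event_types]).fetchall()


if __name__ == "__main__":
    sample = (
        "timestamp,user,event,path\n"
        "2021-03-01T09:15:00Z,alice,download,/plans/q2.xlsx\n"
        "2021-03-01T09:20:00Z,alice,hard_delete,/plans/q2.xlsx\n"
        "2021-03-02T10:00:00Z,bob,upload,/notes.txt\n"
    )
    conn = sqlite3.connect(":memory:")
    load_activity(conn, sample)
    for row in events_on(conn, "2021-03-01", ["download", "hard_delete"]):
        print(row)
```

A download followed minutes later by a hard delete of the same file is the kind of sequence that a date-scoped query surfaces quickly and that is easy to miss when scrolling a spreadsheet.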
Organizations that have admin-level access should work with their legal teams and digital forensics experts to assess the policies enabled within the target application. Retention periods and parameters should be set for admin console logs, other logs and items located in the trash bin. Organizations that widely use a cloud-based file share but do not have a version that provides admin access should evaluate the risks and challenges that may arise in a future matter as a result.
Conducting investigations in cloud sources is not a straightforward process. Legal and IT teams must be aware of what’s going on within these platforms and the implications from an e-discovery perspective—particularly the limitations on recovering deleted data. Experienced digital forensics investigators can provide important guidance in determining which version of an application is in use and developing a defensible strategy for collecting and recovering critical data.