    March 31, 2026

    Modern Data in eDiscovery: Why the Shift from Documents to Legal Data Intelligence Is Already Underway

    The nature of data in eDiscovery has changed. Traditional workflows built around email and file shares are now being applied to chat platforms, cloud applications, and mobile data sources. As a result, many legal and technical teams are finding that modern data in eDiscovery does not align cleanly with the Electronic Discovery Reference Model (EDRM).

    Recent matters have made this shift more visible. In investigations such as the Nancy Guthrie case, investigators and legal teams have relied on non-traditional data sources, including Ring camera footage, mobile device location data, and other system-generated records, to establish timelines and context. These data types are not documents in the traditional sense. They are dynamic, system-driven, and often require interpretation across multiple sources to form a complete picture.

    The EDRM was designed for a world of relatively static documents. Email messages, Word files, and spreadsheets could be treated as discrete units of data with consistent metadata and clear boundaries. That structure made it possible to move data through identification, collection, processing, review, and production in a predictable way.

    Modern data does not behave that way.

    Collaboration platforms like Slack and Microsoft Teams, cloud storage systems like OneDrive and Google Drive, and mobile messaging tools have introduced data that is conversational, distributed, and constantly changing. These modern data types challenge each phase of the eDiscovery process and are driving a broader shift toward what is increasingly being described as Legal Data Intelligence.

    Identification and Preservation of Modern Data in eDiscovery

    Identifying data sources in eDiscovery used to focus on email systems and file shares. Today, relevant data may exist across multiple cloud platforms, messaging applications, and SaaS tools. Each system has its own structure, retention policies, and access controls.

    Preservation presents additional challenges. Many modern platforms include ephemeral messaging, auto-deletion policies, and user-controlled edits. Legal hold processes do not always map cleanly to these systems. Even when preservation is applied, it may not capture all aspects of the data, including message edits, reactions, or linked documents.

    This creates a practical gap between what is relevant and what can actually be preserved in a defensible manner.

    Collection of Chat Data and Cloud Data

    Collection methods for modern data in eDiscovery often rely on APIs rather than traditional forensic techniques. While API-based collection is efficient and scalable, it may not capture a complete dataset.

    Chat data from platforms like Teams or Slack often includes links to files stored in cloud systems such as OneDrive or Google Drive. Collecting the chat conversation does not necessarily collect the linked document. Even when documents are retrieved, versioning becomes a concern. The version available at the time of the communication may differ from the version collected later.
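
    The versioning concern can be made concrete with a small sketch. Assuming a linked document's version history is available as a list of labels and modification times (the `versions` data and the `version_at` helper below are hypothetical, not any platform's API), the version in effect when a message was sent is the latest one saved at or before the message timestamp:

```python
from bisect import bisect_right
from datetime import datetime, timezone

# Hypothetical version history for one linked document, sorted by save time.
versions = [
    ("1.0", datetime(2026, 1, 5, 13, 0, tzinfo=timezone.utc)),
    ("2.0", datetime(2026, 1, 5, 14, 10, tzinfo=timezone.utc)),
    ("3.0", datetime(2026, 1, 6, 9, 0, tzinfo=timezone.utc)),
]


def version_at(versions, moment):
    """Return the label of the latest version saved at or before `moment`.

    Returns None when the message predates every known version.
    """
    saved_times = [saved for _, saved in versions]
    index = bisect_right(saved_times, moment)
    return versions[index - 1][0] if index else None


# Hypothetical timestamp of the chat message that shared the link.
message_time = datetime(2026, 1, 5, 14, 15, tzinfo=timezone.utc)
shared_version = version_at(versions, message_time)
latest_version = versions[-1][0]

print(f"Version at time of message: {shared_version}")
print(f"Version a later collection would retrieve: {latest_version}")
```

    Whether version history can be retrieved at all depends on the platform and the collection method, so this is an illustration of the comparison rather than a collection technique.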

    In practice, modern eDiscovery collections frequently consist of partial datasets and references to external content rather than fully self-contained documents.
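
    One way to surface those external references is to scan collected messages for links and compare them against what was actually collected. The sketch below is a minimal illustration; the message records, the `collected_urls` set, and the `find_uncollected_links` helper are assumptions made for the example, and real exports vary by platform and tool.

```python
import re

# Hypothetical chat messages; real export schemas differ by platform.
messages = [
    {"id": "m1", "text": "Draft is here: https://contoso.sharepoint.com/sites/legal/Q3-draft.docx"},
    {"id": "m2", "text": "Thanks, will review."},
    {"id": "m3", "text": "See also https://drive.google.com/file/d/abc123/view and the draft."},
]

# Hypothetical set of URLs whose target documents were collected.
collected_urls = {"https://contoso.sharepoint.com/sites/legal/Q3-draft.docx"}

# Deliberately simple URL matcher; good enough for a sketch.
URL_PATTERN = re.compile(r"https?://[^\s<>]+")


def find_uncollected_links(messages, collected_urls):
    """Return (message_id, url) pairs for links with no collected counterpart."""
    gaps = []
    for msg in messages:
        for url in URL_PATTERN.findall(msg["text"]):
            url = url.rstrip(".,;")  # drop trailing sentence punctuation
            if url not in collected_urls:
                gaps.append((msg["id"], url))
    return gaps


gaps = find_uncollected_links(messages, collected_urls)
for message_id, url in gaps:
    print(f"{message_id}: linked content not collected: {url}")
```

    A report like this does not close the gap, but it documents which references point outside the collected dataset.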

    Processing Modern Data: From Ingestion to Reconstruction

    Processing modern data in eDiscovery has evolved beyond traditional ingestion workflows. Many platforms export data in semi-structured formats such as JSON, requiring transformation before the data can be reviewed.

    Chat data processing involves reconstructing conversations, including message threads, edits, deletions, reactions, and attachments. Time zone normalization, participant mapping, and conversation threading all play a role in how the data is ultimately understood.

    This stage increasingly requires technical expertise similar to data engineering. The goal is not just to process data, but to reconstruct it in a way that accurately reflects the original communication.
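
    A minimal sketch of that reconstruction step follows. It assumes a simplified JSON export in which edits arrive as separate events pointing at the original message through an `edit_of` field; the field names, the `participants` lookup, and the `reconstruct` function are hypothetical and do not reflect any specific platform's schema.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical export: timestamps carry mixed UTC offsets and arrive out of order.
raw_export = """
[
  {"id": "2", "thread_id": "1", "user": "U2", "ts": "2026-01-05T09:17:30-05:00", "text": "Can we move it to Friday?"},
  {"id": "1", "thread_id": "1", "user": "U1", "ts": "2026-01-05T14:15:00+00:00", "text": "Filing is due Thursday."},
  {"id": "3", "thread_id": "3", "user": "U1", "ts": "2026-01-05T15:02:00+01:00", "text": "Lunch?"},
  {"id": "2-edit1", "edit_of": "2", "user": "U2", "ts": "2026-01-05T09:20:00-05:00", "text": "Can we move it to Monday?"}
]
"""

# Hypothetical mapping from platform user IDs to display names.
participants = {"U1": "Alex Rivera", "U2": "Sam Chen"}


def to_utc(ts):
    """Parse an ISO 8601 timestamp with an offset and normalize it to UTC."""
    return datetime.fromisoformat(ts).astimezone(timezone.utc)


def reconstruct(records, participants):
    """Group messages by thread, order them in UTC, and attach edit history."""
    originals = {}
    edits = defaultdict(list)
    for rec in records:
        event = {
            **rec,
            "utc": to_utc(rec["ts"]),
            "author": participants.get(rec["user"], rec["user"]),
        }
        if "edit_of" in rec:
            edits[rec["edit_of"]].append(event)
        else:
            originals[rec["id"]] = event

    threads = defaultdict(list)
    for msg in originals.values():
        msg["edits"] = sorted(edits[msg["id"]], key=lambda e: e["utc"])
        threads[msg["thread_id"]].append(msg)
    for msgs in threads.values():
        msgs.sort(key=lambda m: m["utc"])
    return dict(threads)


records = json.loads(raw_export)
threads = reconstruct(records, participants)

for thread_id, msgs in threads.items():
    print(f"Thread {thread_id}")
    for m in msgs:
        suffix = f" (edited {len(m['edits'])}x)" if m["edits"] else ""
        print(f"  {m['utc'].isoformat()} {m['author']}: {m['text']}{suffix}")
```

    Keeping the original text and the edit history side by side, rather than overwriting one with the other, is a design choice made here so that a reviewer can see both what was first said and what it was changed to.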

    Reviewing Chat Data and Fragmented Communications

    Modern data introduces challenges for document review workflows. Traditional review assumes that each item is a document with clear context. Chat data and short-form communications do not follow this model.

    Messages are often brief, fragmented, and dependent on surrounding context. Emojis, reactions, and edits can change meaning. Reviewing messages individually can lead to misinterpretation, while reviewing entire conversations can introduce duplication and reduce efficiency.

    As modern data volumes increase, review strategies must adapt to account for both scale and context.
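
    One common compromise between message-level and conversation-level review is to group messages into review units wherever the conversation pauses. The sketch below assumes messages already carry a normalized `utc` timestamp; the 30-minute gap and the `split_into_review_units` helper are illustrative assumptions, not an established standard.

```python
from datetime import datetime, timedelta, timezone


def split_into_review_units(messages, max_gap=timedelta(minutes=30)):
    """Group time-ordered messages; start a new unit when the gap exceeds max_gap."""
    units = []
    for msg in sorted(messages, key=lambda m: m["utc"]):
        if units and msg["utc"] - units[-1][-1]["utc"] <= max_gap:
            units[-1].append(msg)
        else:
            units.append([msg])
    return units


def at(hour, minute):
    """Build a UTC timestamp on a fixed, hypothetical date."""
    return datetime(2026, 1, 5, hour, minute, tzinfo=timezone.utc)


# Hypothetical channel messages.
messages = [
    {"id": "a", "utc": at(9, 0), "text": "Did the numbers come in?"},
    {"id": "b", "utc": at(9, 5), "text": "Yes, not great."},
    {"id": "c", "utc": at(9, 40), "text": "Call me."},
    {"id": "d", "utc": at(9, 50), "text": "Dialing now."},
]

units = split_into_review_units(messages)
for number, unit in enumerate(units, start=1):
    ids = ", ".join(m["id"] for m in unit)
    print(f"Review unit {number}: {ids}")
```

    Each unit can then be reviewed as one item with its context intact, while a given message still appears in only one unit.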

    Production Challenges with Modern Data Types

    Producing modern data in eDiscovery raises fundamental questions about format and structure. In a chat-based environment, it is not always clear what constitutes a document. A single message may not be meaningful without the surrounding conversation.

    Teams must choose between native productions and rendered formats. Native files may preserve structure and metadata, while rendered outputs are easier to read. Linked documents and version history add further complexity, particularly when content has changed over time.

    Producing defensible and usable outputs requires planning early in the workflow, not just at the production stage.
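
    As an illustration of a rendered format, the sketch below turns a reconstructed conversation into a plain-text transcript with a short header. The message fields, the `edited` flag, and the `render_transcript` function are assumptions for the example; actual production formats are typically negotiated between the parties.

```python
from datetime import datetime, timezone


def render_transcript(channel, messages):
    """Render messages as a readable transcript with a header block."""
    ordered = sorted(messages, key=lambda m: m["utc"])
    participants = sorted({m["author"] for m in ordered})
    first, last = ordered[0]["utc"], ordered[-1]["utc"]
    lines = [
        f"Channel: {channel}",
        f"Participants: {', '.join(participants)}",
        f"Range (UTC): {first:%Y-%m-%d %H:%M} to {last:%Y-%m-%d %H:%M}",
        "",
    ]
    for m in ordered:
        flag = " (edited)" if m.get("edited") else ""
        lines.append(f"[{m['utc']:%Y-%m-%d %H:%M:%S}] {m['author']}{flag}: {m['text']}")
    return "\n".join(lines)


# Hypothetical reconstructed messages with timestamps already normalized to UTC.
messages = [
    {
        "author": "Sam Chen",
        "utc": datetime(2026, 1, 5, 14, 17, 30, tzinfo=timezone.utc),
        "text": "Can we move it to Monday?",
        "edited": True,
    },
    {
        "author": "Alex Rivera",
        "utc": datetime(2026, 1, 5, 14, 15, 0, tzinfo=timezone.utc),
        "text": "Filing is due Thursday.",
    },
]

transcript = render_transcript("deal-team", messages)
print(transcript)
```

    A rendering like this trades structure for readability, which is why the fields it depends on (normalized times, participant names, edit flags) need to be captured earlier in the workflow.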

    Modern Data Does Not Fit the Traditional EDRM

    The challenges associated with modern data in eDiscovery stem from a broader shift in how data is created and stored.

    Modern data is:

    • Relational, connecting messages, users, files, reactions, and system events
    • Dynamic, with content that can be edited, updated, or deleted
    • Distributed, spanning multiple platforms and systems

    The EDRM assumes data that is static, file-based, and self-contained. While the framework still provides value, it does not fully account for the complexity of modern data types such as chat data, cloud data, mobile data, and system-generated records like video and location data.

    The Shift from eDiscovery to Legal Data Intelligence

    As modern data continues to evolve, eDiscovery is expanding beyond its traditional boundaries. The focus is no longer limited to collecting and reviewing documents. Instead, organizations are working to understand data across systems, platforms, and formats.

    This shift is often described as Legal Data Intelligence.

    In practical terms, Legal Data Intelligence involves:

    • Connecting data across email, chat, cloud platforms, mobile devices, and structured systems
    • Preserving relationships between communications, users, content, and events
    • Applying analytics to identify patterns, timelines, and context
    • Supporting investigations, compliance, and litigation from a unified data perspective

    The EDRM remains an important foundation, but it is no longer sufficient on its own. Modern data requires a more flexible and technically grounded approach.

    Organizations that adapt to this shift will be better equipped to manage the realities of modern data in eDiscovery. Those that continue to rely on document-centric assumptions will face increasing challenges as data becomes more complex and distributed.

    The transition is already underway. The question is how quickly organizations will adjust their approach.