Core Concepts

Understanding the fundamental ideas behind Eidetica will help you use it effectively and appreciate its unique capabilities.

Architectural Foundation

Eidetica builds on several powerful concepts from distributed systems and database design:

  1. Content-addressable storage: Data is identified by the hash of its content, similar to Git and IPFS
  2. Directed acyclic graphs (DAGs): Changes form a graph structure rather than a linear history
  3. Conflict-free replicated data types (CRDTs): Data structures that can merge concurrent changes automatically
  4. Immutable data structures: Once created, data is never modified; changes are recorded as new versions

These foundations enable Eidetica’s key features: robust history tracking, efficient synchronization, and eventual consistency in distributed environments.

Merkle-CRDTs

Eidetica is inspired by the Merkle-CRDT concept from OrbitDB, which combines:

  • Merkle DAGs: A data structure where each node contains the cryptographic hashes of the nodes it links to (in Eidetica, an entry’s parents), creating a tamper-evident history
  • CRDTs: Data types designed to resolve conflicts automatically when concurrent changes occur

In a Merkle-CRDT, each update creates a new node in the graph, containing:

  1. References to parent nodes (previous versions)
  2. The updated data
  3. Metadata for conflict resolution

This approach allows for:

  • Strong auditability of all changes
  • Automatic conflict resolution
  • Efficient synchronization between replicas
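
The node layout described above can be sketched in a few lines. This is an illustration only, not Eidetica's Entry type: the Node struct, its field names, and the chain_tip_id helper are hypothetical, a plain timestamp stands in for the conflict-resolution metadata, and the standard library's DefaultHasher stands in for a cryptographic hash such as BLAKE3 (it offers no tamper resistance).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical node shape: parents, data, and conflict-resolution metadata.
#[derive(Hash)]
struct Node {
    parents: Vec<u64>, // ids of the previous versions (order affects the id)
    data: String,      // the updated data
    timestamp: u64,    // stand-in for conflict-resolution metadata
}

impl Node {
    /// The id is derived from everything the node contains, parents included.
    /// ASSUMPTION: DefaultHasher is a non-cryptographic placeholder.
    fn id(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        hasher.finish()
    }
}

/// Builds a root and one child, then returns the child's id.
fn chain_tip_id(root_data: &str) -> u64 {
    let root = Node { parents: vec![], data: root_data.to_string(), timestamp: 1 };
    let child = Node { parents: vec![root.id()], data: "child".to_string(), timestamp: 2 };
    child.id()
}
```

Because the child's id covers its parent's id, chain_tip_id("a") and chain_tip_id("b") differ even though the child's own data is the same. That is the tamper-evidence property: rewriting any ancestor changes the id of every descendant.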

Data Model Layers

Eidetica organizes data in a layered architecture:

+-----------------------+
| User Application      |
+-----------------------+
| Instance              |
+-----------------------+
| Databases             |
+----------+------------+
| Stores   | Operations |
+----------+------------+
| Entries (DAG)         |
+-----------------------+
| Database Storage      |
| (local or daemon RPC) |
+-----------------------+

Each layer builds on the ones below, providing progressively higher-level abstractions:

  1. Database Storage: Physical storage of data (SQLite, PostgreSQL, InMemory, or remote via daemon mode)
  2. Entries: Immutable, content-addressed objects forming the database’s history
  3. Databases & Stores: Logical organization and typed access to data
  4. Operations: Atomic transactions across multiple stores
  5. Instance: The top-level database container and API entry point

Entries and the DAG

At the core of Eidetica is a directed acyclic graph (DAG) of immutable Entry objects:

  • Each Entry represents a point-in-time snapshot of data and has:

    • A unique CID derived from its content (making it content-addressable)
    • Links to parent entries via CID references (forming the graph structure)
    • Data payloads organized by store
    • Metadata for database and store relationships
  • The DAG enables:

    • Full history tracking (nothing is ever deleted)
    • Efficient verification of data integrity
    • Conflict resolution when merging concurrent changes

IPLD Data Model

Eidetica uses the IPLD (InterPlanetary Linked Data) data model for content addressing and serialization:

  • CIDs (Content Identifiers): Every entry is identified by a CID, a self-describing identifier that encodes the hash algorithm, the content codec (DAG-CBOR), and the hash digest. CID strings look like bafyr4i... (base32lower encoding).
  • DAG-CBOR: Entries are serialized using DAG-CBOR, a deterministic subset of CBOR that ensures identical content always produces identical bytes (and therefore identical CIDs).
  • Multihash: The CID carries the hash algorithm in a self-describing prefix defined by the multihash specification. BLAKE3 is the hash algorithm used throughout Eidetica. A database must use a single hash algorithm end-to-end, because parent pointers inside entries reference CIDs that must match the algorithm used to key entries in storage. Cross-algorithm trees (e.g., bootstrapping a pre-BLAKE3 database from a peer) are not supported without explicit multi-CID-per-entry support in the backend.

This aligns Eidetica with the broader content-addressed ecosystem (IPFS, Filecoin, libp2p) and makes entries interoperable with IPLD-aware tools. BLAKE3 is also used as the content-addressing hash by iroh, the P2P transport Eidetica uses for sync.
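
To make the CID layout concrete, the sketch below assembles a CIDv1 byte string by hand and encodes it as base32lower. It is not Eidetica's code path: the function names are hypothetical, and the digest is supplied by the caller because the Rust standard library has no BLAKE3 implementation. The numeric codes (0x01 for CIDv1, 0x71 for dag-cbor, 0x1e for blake3) come from the public multiformats tables.

```rust
/// Codes from the multiformats tables. Each is below 0x80, so its
/// unsigned-varint encoding is a single byte.
const CID_V1: u8 = 0x01;
const DAG_CBOR: u8 = 0x71;
const BLAKE3: u8 = 0x1e;

/// Lowercase RFC 4648 base32 without padding.
fn base32_lower(bytes: &[u8]) -> String {
    const ALPHABET: &[u8; 32] = b"abcdefghijklmnopqrstuvwxyz234567";
    let mut out = String::new();
    let mut buffer: u32 = 0; // holds bits not yet emitted
    let mut bits: u32 = 0;   // number of valid bits in `buffer`
    for &byte in bytes {
        buffer = (buffer << 8) | byte as u32;
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(ALPHABET[((buffer >> bits) & 0x1f) as usize] as char);
        }
        buffer &= (1 << bits) - 1; // keep only the leftover bits
    }
    if bits > 0 {
        // Pad the final group with zero bits on the right.
        out.push(ALPHABET[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Assembles version + codec + multihash (code, length, digest) and
/// prepends the multibase prefix 'b' for base32lower.
fn cid_string(digest: &[u8; 32]) -> String {
    let mut bytes = vec![CID_V1, DAG_CBOR, BLAKE3, digest.len() as u8];
    bytes.extend_from_slice(digest);
    format!("b{}", base32_lower(&bytes))
}
```

The four header bytes alone determine the first seven characters, so cid_string returns a string beginning with bafyr4i for any 32-byte digest, which matches the prefix quoted above.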

Stores: A Core Innovation

Eidetica extends the Merkle-CRDT concept with Stores, which partition data within each Entry:

  • Each store is a named, typed data structure within a Database
  • Stores can use different data models and conflict resolution strategies
  • Stores maintain their own history tracking within the larger Database

This enables:

  • Type-safe, structure-specific APIs for data access
  • Efficient partial synchronization (only needed stores)
  • Modular features through pluggable stores
  • Atomic operations across different data structures

Planned stores include:

  • Object Storage: Efficiently handling large objects with content-addressable hashing
  • Backup: Archiving database history for space efficiency

Atomic Operations and Transactions

All changes in Eidetica happen through atomic Transactions:

  1. A Transaction is created from a Database
  2. Stores are accessed and modified through the Transaction
  3. When committed, all changes across all stores become a single new Entry
  4. If the Transaction fails, no changes are applied

This model ensures data consistency while allowing complex operations across multiple stores.
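
The four steps above can be modelled with a small in-memory sketch. The type and method names below (Database, Transaction, Entry, new_transaction, set, commit) are hypothetical stand-ins chosen to mirror the prose, not Eidetica's actual API, and the "empty key" rule is an invented validation failure used only to show the all-or-nothing behavior.

```rust
use std::collections::BTreeMap;

type StoreName = String;
type StoreData = BTreeMap<String, String>;

/// One committed unit: the changes to every touched store, recorded together.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    changes: BTreeMap<StoreName, StoreData>,
}

#[derive(Default)]
struct Database {
    entries: Vec<Entry>,
}

/// Staged writes; nothing reaches the database until `commit` succeeds.
struct Transaction<'a> {
    db: &'a mut Database,
    staged: BTreeMap<StoreName, StoreData>,
}

impl Database {
    fn new_transaction(&mut self) -> Transaction<'_> {
        Transaction { db: self, staged: BTreeMap::new() }
    }
}

impl<'a> Transaction<'a> {
    /// Stages a write to the named store.
    fn set(&mut self, store: &str, key: &str, value: &str) {
        self.staged
            .entry(store.to_string())
            .or_default()
            .insert(key.to_string(), value.to_string());
    }

    /// Validates all staged writes first, then appends exactly one Entry.
    fn commit(self) -> Result<(), String> {
        // Hypothetical validation rule: keys must be non-empty.
        if self.staged.values().any(|store| store.keys().any(|k| k.is_empty())) {
            return Err("empty key".to_string()); // nothing was applied
        }
        let Transaction { db, staged } = self;
        db.entries.push(Entry { changes: staged });
        Ok(())
    }
}
```

A transaction that writes to two stores and commits adds one Entry containing both sets of changes. A transaction that fails validation leaves db.entries untouched, including the writes that were individually valid.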

Settings as Stores

In Eidetica, even configuration is kept in a Store:

  • A Database’s settings are stored internally in a special “settings” Store that is hidden from regular usage
  • This approach unifies the data model and allows settings to participate in history tracking

CRDT Properties and Eventual Consistency

Eidetica is designed with distributed systems in mind:

  • All data structures have CRDT properties for automatic conflict resolution
  • Different store types implement appropriate CRDT strategies:
    • DocStore uses structural merge by default: concurrent writes to the same key use last-writer-wins (LWW), while writes to different keys are combined. Docs can be marked as atomic to use full LWW replacement, where the entire document replaces its predecessor instead of merging field-by-field. Atomic replacement suits data that must be treated as a complete unit.
    • Table preserves all items, with LWW for updates to the same item

These properties ensure that when Eidetica instances synchronize, they eventually reach a consistent state regardless of the order in which updates are received.
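
The difference between structural merge and atomic replacement can be shown with a small model. This is a sketch under assumptions, not DocStore's implementation: the Versioned struct, the use of a (timestamp, writer, value) tuple order to pick the "last writer", and the function names are all hypothetical. Eidetica's actual ordering rule is not described here.

```rust
use std::collections::BTreeMap;

/// A value plus the fields used to order concurrent writes.
/// ASSUMPTION: derived ordering (timestamp, then writer, then value)
/// is an illustrative tiebreak only.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Versioned {
    timestamp: u64,
    writer: String,
    value: String,
}

type Doc = BTreeMap<String, Versioned>;

fn versioned(timestamp: u64, writer: &str, value: &str) -> Versioned {
    Versioned { timestamp, writer: writer.to_string(), value: value.to_string() }
}

/// Structural merge: keys from both sides are kept; when both sides
/// wrote the same key, the greater Versioned wins (LWW per key).
fn merge(a: &Doc, b: &Doc) -> Doc {
    let mut out = a.clone();
    for (key, theirs) in b {
        let keep_ours = matches!(out.get(key), Some(ours) if ours >= theirs);
        if !keep_ours {
            out.insert(key.clone(), theirs.clone());
        }
    }
    out
}

/// Atomic replacement: the whole document with the greater stamp wins,
/// with no field-by-field merging.
fn merge_atomic(a: &(u64, Doc), b: &(u64, Doc)) -> (u64, Doc) {
    if a >= b { a.clone() } else { b.clone() }
}
```

Because each key resolves to the maximum under a total order, merge(a, b) equals merge(b, a) and merging a result with itself changes nothing. Those two properties are what let replicas converge regardless of the order in which they receive updates.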

History Tracking and Time Travel

One of Eidetica’s most powerful features is comprehensive history tracking:

  • All changes are preserved in the Entry DAG
  • “Tips” represent the latest state of a Database or Store
  • Historical states can be reconstructed by traversing the DAG

This design allows for future capabilities like:

  • Point-in-time recovery
  • Auditing and change tracking
  • Historical queries and analysis
  • Branching and versioning
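
As a rough illustration of tips and history traversal, the sketch below models the DAG as a map from entry id to parent ids. The Dag alias, the string ids, and both function names are hypothetical; real entries are identified by CIDs and Eidetica's own traversal may differ.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maps each entry id to the ids of its parents.
type Dag = BTreeMap<&'static str, Vec<&'static str>>;

/// Tips are the entries that no other entry names as a parent.
fn tips(dag: &Dag) -> BTreeSet<&'static str> {
    let referenced: BTreeSet<&str> = dag.values().flatten().copied().collect();
    dag.keys().copied().filter(|id| !referenced.contains(id)).collect()
}

/// Every entry reachable from `start` through parent links, `start` included.
/// Replaying these entries is how a historical state would be reconstructed.
fn history(dag: &Dag, start: &'static str) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![start];
    while let Some(id) = stack.pop() {
        // `insert` returns false for ids already visited.
        if seen.insert(id) {
            if let Some(parents) = dag.get(&id) {
                stack.extend(parents.iter().copied());
            }
        }
    }
    seen
}
```

For a graph where b and c both descend from a, and d descends from b, the tips are c and d, and the history of d is a, b, and d. Two tips at once is the normal result of concurrent writes; a later entry that lists both as parents merges them back into one.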

Entry Verification and the Verified Frontier

Every entry carries a verification status that records whether this node has cryptographically validated it against the authentication settings it was signed under:

  • Verified: the entry’s signature and permissions were checked and accepted by a local validation pass.
  • Unverified: the entry is stored but not yet validated by this node. Every entry arrives in this state over sync, and it is the only status the storage layer accepts on put. A peer can never assert “this entry is verified”; verification is always a local decision.
  • Failed: the entry was checked and definitively rejected (for example, a bad signature or an unauthorized key). Failed entries are dropped from reads everywhere.

Entries you create locally are validated as part of the commit and become Verified immediately. Entries received from a peer start Unverified; Database::verify() re-validates them against the _settings they pin, and a normal read also triggers an opportunistic verification pass when a tip is still Unverified.

Verification is prefix-closed: an entry can only be Verified if its entire ancestor history is Verified. It is therefore impossible for a tip to be Verified while one of its ancestors is not, and a Failed ancestor taints its descendants (they become Failed too).

Because of this, a Database handle exposes only the Verified frontier by default: the largest ancestor-closed prefix of the DAG in which every entry is Verified. Anything reachable only through a still-Unverified entry is hidden, so a default read never reflects state this node could not authenticate. Call .allow_unverified() on the handle to opt into the looser view that also includes Unverified entries (Failed is always dropped).
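
The frontier rule can be expressed as a small fixed-point computation. The sketch below is a model of the rule as stated in this section, not Eidetica's implementation: the Status enum, Entry struct, and visible function are hypothetical, and treating the looser view as "the entry and all its ancestors are not Failed" is an assumption derived from the taint rule above.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Verified,
    Unverified,
    Failed,
}

struct Entry {
    parents: Vec<&'static str>,
    status: Status,
}

type Dag = BTreeMap<&'static str, Entry>;

/// Returns the ids a read would see.
/// Default view: the entry and every ancestor must be Verified.
/// Looser view (`allow_unverified`): Unverified is also accepted,
/// but anything at or below a Failed entry stays hidden.
fn visible(dag: &Dag, allow_unverified: bool) -> BTreeSet<&'static str> {
    let acceptable = |s: Status| {
        s == Status::Verified || (allow_unverified && s == Status::Unverified)
    };
    let mut seen = BTreeSet::new();
    // Repeat until a pass adds nothing. An entry is added only after all
    // of its parents, so the result is always ancestor-closed.
    loop {
        let before = seen.len();
        for (&id, entry) in dag {
            if acceptable(entry.status) && entry.parents.iter().all(|p| seen.contains(p)) {
                seen.insert(id);
            }
        }
        if seen.len() == before {
            break;
        }
    }
    seen
}
```

An entry whose parent is missing from the map is never added, which matches the conservative reading of the rule: state that cannot be traced back through accepted ancestors is not shown.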

Current Status and Roadmap

Eidetica is under active development, and some features mentioned in this documentation are still in planning or development stages. Here’s a summary of the current status:

Implemented Features

  • Core Entry and Database structure
  • SQLite and PostgreSQL persistent storage backends (InMemory available for testing)
  • DocStore, Table, and YDoc (Y-CRDT) store implementations
  • PasswordStore for transparent password-based encryption wrapping any store
  • CRDT functionality:
    • Doc (hierarchical nested document structure with recursive merging and tombstone support)
  • Atomic operations across stores
  • Tombstone support for proper deletion handling in distributed environments
  • Signature-based authentication with crypto-agile key types and granular permissions
  • Per-entry verification status with prefix-closed verification and Verified-frontier reads
  • User identity and session management with encrypted key storage
  • Peer-to-peer synchronization via iroh and HTTP transports
  • Local service (daemon) mode for multi-process shared access over Unix sockets

Planned Features

  • Object Storage store for efficient handling of large objects
  • Backup store for archiving database history
  • IPFS-compatible addressing for distributed storage
  • Point-in-time recovery
  • Sparse and shallow checkouts

This roadmap is subject to change as development progresses. Check the project repository for the most up-to-date information on feature availability.