Eidetica Documentation

Welcome to the official documentation for Eidetica - a decentralized database built on Merkle-CRDT principles with built-in peer-to-peer synchronization.

Key Features

  • Decentralized Architecture: No central server required - peers connect directly
  • Conflict-Free Replication: Automatic merge resolution using CRDT principles
  • Content-Addressable Storage: Immutable, hash-identified data entries
  • Real-time Synchronization: Background sync with configurable batching and timing
  • Multiple Transport Protocols: HTTP and Iroh P2P with NAT traversal
  • Authentication & Security: Ed25519 signatures for all operations
  • Flexible Data Models: Support for documents, key-value, and structured data

Project Structure

Eidetica is organized as a Cargo workspace:

  • Library (crates/lib/): The core Eidetica library crate
  • CLI Binary (crates/bin/): Command-line interface using the library
  • Examples (examples/): Standalone applications demonstrating usage

Documentation Sections

Examples

User Guide

Welcome to the Eidetica User Guide. This guide will help you understand and use Eidetica effectively in your applications.

What is Eidetica?

Eidetica is a Rust library for managing structured data with built-in history tracking. It combines concepts from distributed systems, Merkle-CRDTs, and traditional databases to provide a unique approach to data management:

  • Efficient data storage with customizable Databases
  • History tracking for all changes via immutable Entries forming a DAG
  • Structured data types via named, typed Stores within logical Databases
  • Atomic changes across multiple data structures using Transactions
  • Designed for distribution (future capability)

How to Use This Guide

This user guide is structured to guide you from basic setup to advanced concepts:

  1. Getting Started: Installation, basic setup, and your first steps.
  2. Basic Usage Pattern: A quick look at the typical workflow.
  3. Core Concepts: Understand the fundamental building blocks.
  4. Tutorial: Todo App: A step-by-step walkthrough using a simple application.
  5. Code Examples: Focused code snippets for common tasks.

Quick Overview: The Core Flow

Eidetica revolves around a few key components working together:

  1. Backend: You start by choosing or creating a storage backend (e.g., InMemory).
  2. Instance: You create an Instance from the backend. This manages users, keys, and infrastructure.
  3. User: You create and log in a User, which owns keys and creates databases.
  4. Database: Through the User, you create or load a Database, which acts as a logical container for related data and tracks its history.
  5. Transaction: To read or write data, you start a Transaction from the Database. This ensures atomicity and consistent views.
  6. Store: Within a Transaction, you get handles to named Stores (like DocStore or Table<YourData>). These provide methods (set, get, insert, remove, etc.) to interact with your structured data.
  7. Commit: Changes made via Store handles within the Transaction are staged. Calling commit() on the Transaction finalizes these changes atomically, creating a new historical Entry in the Database.

Basic Usage Pattern

Here's a quick example showing how to create a user and a database, and how to write new data.

extern crate eidetica;
extern crate serde;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::{DocStore, Table}};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Clone, Debug)]
struct MyData {
    name: String,
}

fn main() -> eidetica::Result<()> {
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;

// Create and login a passwordless user
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;

// Create a database
let mut settings = Doc::new();
settings.set_string("name", "my_database");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// --- Writing Data ---
// Start a Transaction
let txn = database.new_transaction()?;
let inserted_id = { // Scope for store handles
    // Get Store handles
    let config = txn.get_store::<DocStore>("config")?;
    let items = txn.get_store::<Table<MyData>>("items")?;

    // Use Store methods
    config.set("version", "1.0")?;
    items.insert(MyData { name: "example".to_string() })?
}; // Handles drop, changes are staged in txn
// Commit changes
let new_entry_id = txn.commit()?;
println!("Committed changes, new entry ID: {}", new_entry_id);

// --- Reading Data ---
// Use Database::get_store_viewer for a read-only view
let items_viewer = database.get_store_viewer::<Table<MyData>>("items")?;
if let Ok(item) = items_viewer.get(&inserted_id) {
   println!("Read item: {:?}", item);
}
Ok(())
}

See Transactions and Code Examples for more details.

Project Status

Eidetica is currently under active development. The core functionality is working, but APIs are considered experimental and may change in future releases. It is suitable for evaluation and prototyping, but not yet recommended for production systems requiring long-term API stability.

Getting Started

This guide will walk you through the basics of using Eidetica in your Rust applications. We'll cover the essential steps to set up and interact with the database.

Installation

Add Eidetica to your project dependencies:

[dependencies]
eidetica = "0.1.0"  # Update version as appropriate
# Or if using from a local workspace:
# eidetica = { path = "path/to/eidetica/crates/lib" }
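
If you plan to use the YDoc store for collaborative editing (see the Stores chapter), enable the y-crdt feature:

[dependencies]
eidetica = { version = "0.1.0", features = ["y-crdt"] }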

Setting up the Database

To start using Eidetica, you need to:

  1. Choose and initialize a Backend (storage mechanism)
  2. Create an Instance (the infrastructure manager)
  3. Create and login a User (authentication and session)
  4. Create or access a Database through the User (logical container for data)

Here's a simple example:

extern crate eidetica;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc};

fn main() -> eidetica::Result<()> {
    // Create a new in-memory backend
    let backend = InMemory::new();

    // Create the Instance
    let instance = Instance::open(Box::new(backend))?;

    // Create a passwordless user (perfect for embedded/single-user apps)
    instance.create_user("alice", None)?;

    // Login to get a User session
    let mut user = instance.login_user("alice", None)?;

    // Create a database in the user's context
    let mut settings = Doc::new();
    settings.set_string("name", "my_database");

    // Get the default key (earliest created key)
    let default_key = user.get_default_key()?;
    let _database = user.create_database(settings, &default_key)?;

    Ok(())
}

Note: This example uses a passwordless user (password is None) for simplicity, which is perfect for embedded applications and CLI tools. For multi-user scenarios, you can create password-protected users by passing Some("password") instead.

The backend determines how your data is stored. The example above uses InMemory, which keeps everything in memory but can save to a file:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};
use std::path::PathBuf;

fn main() -> eidetica::Result<()> {
// Create instance and user
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "test_db");
let default_key = user.get_default_key()?;
let _database = user.create_database(settings, &default_key)?;

// Use a temporary file path for testing
let temp_dir = std::env::temp_dir();
let path = temp_dir.join("eidetica_test_save.json");

// Save the backend to a file
let backend_guard = instance.backend();
if let Some(in_memory) = backend_guard.as_any().downcast_ref::<InMemory>() {
    in_memory.save_to_file(&path)?;
}

// Clean up the temporary file
if path.exists() {
    std::fs::remove_file(&path).ok();
}
Ok(())
}

You can load a previously saved backend:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};
use std::path::PathBuf;

fn main() -> eidetica::Result<()> {
// First create and save a test backend
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "test_db");
let default_key = user.get_default_key()?;
let _database = user.create_database(settings, &default_key)?;

// Use a temporary file path for testing
let temp_dir = std::env::temp_dir();
let path = temp_dir.join("eidetica_test_load.json");

// Save the backend first
let backend_guard = instance.backend();
if let Some(in_memory) = backend_guard.as_any().downcast_ref::<InMemory>() {
    in_memory.save_to_file(&path)?;
}

// Load a previously saved backend
let backend = InMemory::load_from_file(&path)?;

// Load instance (automatically detects existing system state)
let instance = Instance::open(Box::new(backend))?;

// Login to existing user
let user = instance.login_user("alice", None)?;

// Clean up the temporary file
if path.exists() {
    std::fs::remove_file(&path).ok();
}
Ok(())
}

User-Centric Architecture

Eidetica uses a user-centric architecture:

  • Instance: Manages infrastructure (user accounts, backend, system databases)
  • User: Handles all contextual operations (database creation, key management)

All database and key operations happen through a User session after login. This provides:

  • Clear separation: Infrastructure management vs. contextual operations
  • Strong isolation: Each user has separate keys and preferences
  • Flexible authentication: Users can have passwords or not (passwordless mode)

Passwordless Users (embedded/single-user apps):

instance.create_user("alice", None)?;
let user = instance.login_user("alice", None)?;

Password-Protected Users (multi-user apps):

instance.create_user("bob", Some("password123"))?;
let user = instance.login_user("bob", Some("password123"))?;

The downside of password protection is slower login: instance.login_user must verify the password and decrypt the user's keys, which is by design a relatively slow operation.

Working with Data

Eidetica uses Stores to organize data within a database. One common store type is Table, which maintains a collection of items with unique IDs.

Defining Your Data

Any data you store must be serializable with serde:

Basic Operations

All operations in Eidetica happen within an atomic Transaction:

Inserting Data:

extern crate eidetica;
extern crate serde;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::Table, Database};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct Person {
    name: String,
    age: u32,
}

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Start an authenticated transaction
let op = database.new_transaction()?;

// Get or create a Table store
let people = op.get_store::<Table<Person>>("people")?;

// Insert a person and get their ID
let person = Person { name: "Alice".to_string(), age: 30 };
let _id = people.insert(person)?;

// Commit the changes (automatically signed with the user's key)
op.commit()?;
Ok(())
}

Reading Data:

extern crate eidetica;
extern crate serde;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::Table, Database};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct Person {
    name: String,
    age: u32,
}

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Insert some test data
let op = database.new_transaction()?;
let people = op.get_store::<Table<Person>>("people")?;
let test_id = people.insert(Person { name: "Alice".to_string(), age: 30 })?;
op.commit()?;
let id = &test_id;

let op = database.new_transaction()?;
let people = op.get_store::<Table<Person>>("people")?;

// Get a single person by ID
if let Ok(person) = people.get(id) {
    println!("Found: {} ({})", person.name, person.age);
}

// Search for all people (using a predicate that always returns true)
let all_people = people.search(|_| true)?;
for (id, person) in all_people {
    println!("ID: {}, Name: {}, Age: {}", id, person.name, person.age);
}
Ok(())
}

Updating Data:

extern crate eidetica;
extern crate serde;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::Table, Database};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct Person {
    name: String,
    age: u32,
}

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Insert some test data
let op_setup = database.new_transaction()?;
let people_setup = op_setup.get_store::<Table<Person>>("people")?;
let test_id = people_setup.insert(Person { name: "Alice".to_string(), age: 30 })?;
op_setup.commit()?;
let id = &test_id;

let op = database.new_transaction()?;
let people = op.get_store::<Table<Person>>("people")?;

// Get, modify, and update
if let Ok(mut person) = people.get(id) {
    person.age += 1;
    people.set(id, person)?;
}

op.commit()?;
Ok(())
}

Deleting Data:

extern crate eidetica;
extern crate serde;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::Table, Database};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct Person {
    name: String,
    age: u32,
}

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
let _id = "test_id";

let op = database.new_transaction()?;
let people = op.get_store::<Table<Person>>("people")?;

// FIXME: Table doesn't currently support deletion
// You can overwrite with a "deleted" marker or use other approaches

op.commit()?;
Ok(())
}

Complete Examples

For complete working examples, see:

  • Chat Example - Multi-user chat application demonstrating:

    • User accounts and authentication
    • Real-time synchronization with HTTP and Iroh transports
    • Bootstrap protocol for joining rooms
    • TUI interface with Ratatui
  • Todo Example - Task management application

Next Steps

After getting familiar with the basics, you might want to explore:

Core Concepts

Understanding the fundamental ideas behind Eidetica will help you use it effectively and appreciate its unique capabilities.

Architectural Foundation

Eidetica builds on several powerful concepts from distributed systems and database design:

  1. Content-addressable storage: Data is identified by the hash of its content, similar to Git and IPFS
  2. Directed acyclic graphs (DAGs): Changes form a graph structure rather than a linear history
  3. Conflict-free replicated data types (CRDTs): Data structures that can merge concurrent changes automatically
  4. Immutable data structures: Once created, data is never modified, only new versions are added

These foundations enable Eidetica's key features: robust history tracking, efficient synchronization, and eventual consistency in distributed environments.

Merkle-CRDTs

Eidetica is inspired by the Merkle-CRDT concept from OrbitDB, which combines:

  • Merkle DAGs: A data structure where each node contains a cryptographic hash of its children, creating a tamper-evident history
  • CRDTs: Data types designed to resolve conflicts automatically when concurrent changes occur

In a Merkle-CRDT, each update creates a new node in the graph, containing:

  1. References to parent nodes (previous versions)
  2. The updated data
  3. Metadata for conflict resolution

This approach allows for:

  • Strong auditability of all changes
  • Automatic conflict resolution
  • Efficient synchronization between replicas
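
To make that concrete, here is a toy sketch of such a node; the field names are illustrative and do not match Eidetica's actual Entry type:

// A toy Merkle-CRDT node (illustrative only).
struct Node {
    parents: Vec<String>, // content hashes of parent nodes (previous versions)
    payload: Vec<u8>,     // the updated data
    metadata: String,     // conflict-resolution metadata (e.g., a logical clock)
}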

Data Model Layers

Eidetica organizes data in a layered architecture:

+-----------------------+
| User Application      |
+-----------------------+
| Instance              |
+-----------------------+
| Databases             |
+----------+------------+
| Stores   | Operations |
+----------+------------+
| Entries (DAG)         |
+-----------------------+
| Database Storage      |
+-----------------------+

Each layer builds on the ones below, providing progressively higher-level abstractions:

  1. Database Storage: Physical storage of data (currently InMemory with file persistence)
  2. Entries: Immutable, content-addressed objects forming the database's history
  3. Databases & Stores: Logical organization and typed access to data
  4. Operations: Atomic transactions across multiple stores
  5. Instance: The top-level database container and API entry point

Entries and the DAG

At the core of Eidetica is a directed acyclic graph (DAG) of immutable Entry objects:

  • Each Entry represents a point-in-time snapshot of data and has:

    • A unique ID derived from its content (making it content-addressable)
    • Links to parent entries (forming the graph structure)
    • Data payloads organized by store
    • Metadata for database and store relationships
  • The DAG enables:

    • Full history tracking (nothing is ever deleted)
    • Efficient verification of data integrity
    • Conflict resolution when merging concurrent changes

IPFS Inspiration and Future Direction

While Eidetica draws inspiration from IPFS (InterPlanetary File System), it currently uses its own implementation patterns:

  • IPFS is a content-addressed, distributed storage system where data is identified by cryptographic hashes
  • OrbitDB (which inspired Eidetica) uses IPFS for backend storage and distribution

Eidetica's future plans include:

  • Developing efficient internal APIs for transferring objects between Eidetica instances
  • Potential IPFS-compatible addressing for distributed storage
  • More efficient synchronization mechanisms than traditional IPFS

Stores: A Core Innovation

Eidetica extends the Merkle-CRDT concept with Stores, which partition data within each Entry:

  • Each store is a named, typed data structure within a Database
  • Stores can use different data models and conflict resolution strategies
  • Stores maintain their own history tracking within the larger Database

This enables:

  • Type-safe, structure-specific APIs for data access
  • Efficient partial synchronization (only needed stores)
  • Modular features through pluggable stores
  • Atomic operations across different data structures

Planned future stores include:

  • Object Storage: Efficiently handling large objects with content-addressable hashing
  • Backup: Archiving database history for space efficiency
  • Encrypted Store: Transparent encrypted data storage

Atomic Operations and Transactions

All changes in Eidetica happen through atomic Transactions:

  1. A Transaction is created from a Database
  2. Stores are accessed and modified through the Transaction
  3. When committed, all changes across all stores become a single new Entry
  4. If the Transaction fails, no changes are applied

This model ensures data consistency while allowing complex operations across multiple stores.
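
As a fragment (assuming a database and a person value of a serde-derived Person type, as in the Getting Started examples), the pattern looks like this:

// Two stores, one atomic commit: either both changes land or neither does.
let txn = database.new_transaction()?;
txn.get_store::<DocStore>("config")?.set("mode", "production")?;
txn.get_store::<Table<Person>>("people")?.insert(person)?;
let entry_id = txn.commit()?;
println!("New entry: {}", entry_id);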

Settings as Stores

In Eidetica, even configuration is stored as a store:

  • A Database's settings are stored internally in a special settings Store (the _settings subtree), which is hidden from regular usage
  • This approach unifies the data model and allows settings to participate in history tracking

CRDT Properties and Eventual Consistency

Eidetica is designed with distributed systems in mind:

  • All data structures have CRDT properties for automatic conflict resolution
  • Different store types implement appropriate CRDT strategies:
    • DocStore uses last-writer-wins (LWW) with implicit timestamps
    • Table preserves all items, with LWW for updates to the same item

These properties ensure that when Eidetica instances synchronize, they eventually reach a consistent state regardless of the order in which updates are received.
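
To illustrate the last-writer-wins strategy in isolation, here is a toy merge function; the timestamp field is illustrative, since Eidetica's types use their own internal ordering rather than a plain number:

#[derive(Clone)]
struct Lww<T> {
    value: T,
    timestamp: u64, // illustrative logical timestamp
}

// Merge two concurrent writes to the same key: the later write wins.
fn merge<T: Clone>(a: &Lww<T>, b: &Lww<T>) -> Lww<T> {
    if a.timestamp >= b.timestamp { a.clone() } else { b.clone() }
}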

History Tracking and Time Travel

One of Eidetica's most powerful features is comprehensive history tracking:

  • All changes are preserved in the Entry DAG
  • "Tips" represent the latest state of a Database or Store
  • Historical states can be reconstructed by traversing the DAG

This design allows for future capabilities like:

  • Point-in-time recovery
  • Auditing and change tracking
  • Historical queries and analysis
  • Branching and versioning

Current Status and Roadmap

Eidetica is under active development, and some features mentioned in this documentation are still in planning or development stages. Here's a summary of the current status:

Implemented Features

  • Core Entry and Database structure
  • In-memory database with file persistence
  • DocStore and Table store implementations
  • CRDT functionality:
    • Doc (hierarchical nested document structure with recursive merging and tombstone support for deletions)
  • Atomic operations across stores
  • Tombstone support for proper deletion handling in distributed environments

Planned Features

  • Object Storage store for efficient handling of large objects
  • Backup store for archiving database history
  • Encrypted store for transparent encrypted data storage
  • IPFS-compatible addressing for distributed storage
  • Enhanced synchronization mechanisms
  • Point-in-time recovery

This roadmap is subject to change as development progresses. Check the project repository for the most up-to-date information on feature availability.

Entries & Databases

The basic units of data and organization in Eidetica.

Entries

Entries are the fundamental building blocks in Eidetica. An Entry represents an atomic unit of data with the following characteristics:

  • Content-addressable: Each entry has a unique ID derived from its content, similar to Git commits.
  • Immutable: Once created, entries cannot be modified.
  • Parent references: Entries maintain references to their parent entries, forming a directed acyclic graph (DAG).
  • Database association: Each entry belongs to a database and can reference parent entries within both the main database and stores.
  • Store data: Entries can contain data for one or more stores, representing different aspects or types of data.

Entries function similarly to Git commits: they represent a point-in-time snapshot of data with links to previous states, enabling history tracking.
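
A toy illustration of the content-addressing idea: an entry's ID is a pure function of its payload and parent references, so identical content always yields the same ID. (Eidetica uses a cryptographic hash; std's DefaultHasher below is only for demonstration.)

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn content_id(parents: &[String], payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    parents.hash(&mut h); // parent references are part of the identity
    payload.hash(&mut h); // and so is the data itself
    h.finish()
}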

Databases

A Database in Eidetica is a logical container for related entries, conceptually similar to:

  • A traditional database containing multiple tables
  • A branch in a version control system
  • A collection in a document database

Key characteristics of Databases:

  • Root Entry: Each database has a root entry that serves as its starting point.
  • Named Identity: Databases typically have a name stored in their settings store.
  • History Tracking: Databases maintain the complete history of all changes as a linked graph of entries.
  • Store Organization: Data within a database is organized into named stores, each potentially using different data structures.
  • Atomic Operations: All changes to a database happen through transactions, which create new entries.

Database Transactions

You interact with Databases through Transactions:

extern crate eidetica;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::DocStore, Database};

use eidetica::Result;

fn example(database: Database) -> Result<()> {
    // Create a new transaction
    let op = database.new_transaction()?;

    // Access stores and perform actions
    let settings = op.get_store::<DocStore>("settings")?;
    settings.set("version", "1.2.0")?;

    // Commit the changes, creating a new Entry
    let new_entry_id = op.commit()?;

    Ok(())
}

fn main() -> Result<()> {
    let backend = InMemory::new();
    let instance = Instance::open(Box::new(backend))?;
    instance.create_user("alice", None)?;
    let mut user = instance.login_user("alice", None)?;
    let mut settings = Doc::new();
    settings.set_string("name", "test");
    let default_key = user.get_default_key()?;
    let database = user.create_database(settings, &default_key)?;
    example(database)?;
    Ok(())
}

When you commit a transaction, Eidetica:

  1. Creates a new Entry containing all changes
  2. Links it to the appropriate parent entries
  3. Adds it to the database's history
  4. Returns the ID of the new entry

Database Settings

Each Database maintains its settings as a key-value store in a special "_settings" store:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::SettingsStore, Database};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings_doc = Doc::new();
settings_doc.set("name", "example_database");
settings_doc.set("version", "1.0.0");
let default_key = user.get_default_key()?;
let database = user.create_database(settings_doc, &default_key)?;
// Access database settings through a transaction
let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Access common settings
let name = settings_store.get_name()?;
println!("Database name: {}", name);

// Access custom settings via the underlying DocStore
let doc_store = settings_store.as_doc_store();
if let Ok(version_value) = doc_store.get("version") {
    println!("Database version available");
}

transaction.commit()?;
Ok(())
}

Common settings include:

  • name: The identifier for the database (used by Instance::find_database). This is the primary standard setting currently used.
  • Other application-specific settings can be stored here.

Tips and History

Databases in Eidetica maintain a concept of "tips" - the latest entries in the database's history. Tips represent the current state of the database and are managed automatically by the system.

When you create transactions and commit changes, Eidetica automatically:

  • Updates the database tips to point to your new entries
  • Maintains the complete history of all previous states
  • Ensures efficient access to the current state through tip tracking

This historical information remains accessible, allowing you to:

  • Track all changes to data over time
  • Reconstruct the state at any point in history (requires manual traversal or specific backend support - see Backends)

Database vs. Store

While a Database is the logical container, the actual data is organized into Stores. This separation allows:

  • Different types of data structures within a single Database
  • Type-safe access to different parts of your data
  • Fine-grained history tracking by store
  • Efficient partial replication and synchronization

See Stores for more details on how data is structured within a Database.

Database Storage

Database storage implementations in Eidetica define how and where data is physically stored.

The Database Abstraction

The Database trait abstracts the underlying storage mechanism for Eidetica entries. This separation of concerns allows the core database logic to remain independent of the specific storage details.

Key responsibilities of a Database:

  • Storing and retrieving entries by their unique IDs
  • Tracking relationships between entries
  • Calculating tips (latest entries) for databases and stores
  • Managing the graph-like structure of entry history

Available Database Implementations

InMemory

The InMemory database is currently the primary storage implementation:

  • Stores all entries in memory
  • Can load from and save to a JSON file
  • Well-suited for development, testing, and applications with moderate data volumes
  • Simple to use and requires no external dependencies

Example usage:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};
use std::path::PathBuf;

fn main() -> eidetica::Result<()> {
// Create a new in-memory database
let database = InMemory::new();
let db = Instance::open(Box::new(database))?;

// ... use the database ...

// Save to a file (optional)
let path = PathBuf::from("my_database.json");
let database_guard = db.backend();
if let Some(in_memory) = database_guard.as_any().downcast_ref::<InMemory>() {
    in_memory.save_to_file(&path)?;
}

// Load from a file
let database = InMemory::load_from_file(&path)?;
let _db = Instance::open(Box::new(database))?;
Ok(())
}

Note: The InMemory database is the only storage implementation currently provided with Eidetica.

Database Trait Responsibilities

The Database trait (eidetica::backend::Database) defines the core interface required for storage. Beyond simple get and put for entries, it includes methods crucial for navigating the database's history and structure:

  • get_tips(tree_id): Finds the latest entries in a specific Database.
  • get_subtree_tips(tree_id, subtree_name): Finds the latest entries for a specific Store within a Database.
  • all_roots(): Finds all top-level Database roots stored in the database.
  • get_tree(tree_id) / get_subtree(...): Retrieve all entries for a database/store, typically sorted topologically (required for some history operations, potentially expensive).

Implementing these methods efficiently often requires the database to understand the DAG structure, making the database more than just a simple key-value store.
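
As a hedged fragment, navigating history through these methods might look as follows (this assumes a backend value implementing the trait and a known tree_id; the exact signatures may differ):

// Find the latest state without walking the whole DAG.
let tips = backend.get_tips(&tree_id)?;                        // tips of the whole database
let store_tips = backend.get_subtree_tips(&tree_id, "users")?; // tips of one store
let roots = backend.all_roots()?;                              // every top-level database root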

Database Performance Considerations

The Database implementation significantly impacts database performance:

  • Entry Retrieval: How quickly entries can be accessed by ID
  • Graph Traversal: Efficiency of history traversal and tip calculation
  • Memory Usage: How entries are stored and whether they're kept in memory
  • Concurrency: How concurrent operations are handled

Stores

Stores provide structured, type-safe access to different kinds of data within a Database.

The Store Concept

In Eidetica, Stores extend the Merkle-CRDT concept by explicitly partitioning data within each Entry. A Store:

  • Represents a specific type of data structure (like a key-value store or a collection of records)
  • Has a unique name within its parent Database
  • Maintains its own history tracking
  • Is strongly typed (via Rust generics)

Stores are what make Eidetica practical for real applications, as they provide high-level, data-structure-aware interfaces on top of the core Entry and Database concepts.

Why Stores?

Stores offer several advantages:

  • Type Safety: Each store implementation provides appropriate methods for its data type
  • Isolation: Changes to different stores can be tracked separately
  • Composition: Multiple data structures can exist within a single Database
  • Efficiency: Only relevant stores need to be loaded or synchronized
  • Atomic Operations: Changes across multiple stores can be committed atomically

Available Store Types

Eidetica provides four main store types, each optimized for different data patterns:

Type          | Purpose               | Key Features                              | Best For
DocStore      | Document storage      | Path-based operations, nested structures  | Configuration, metadata, structured docs
Table<T>      | Record collections    | Auto-generated UUIDs, type safety, search | User lists, products, any structured records
SettingsStore | Database settings     | Type-safe settings API, auth management   | Database configuration, authentication
YDoc          | Collaborative editing | Y-CRDT integration, real-time sync        | Shared documents, collaborative text editing

DocStore (Document-Oriented Storage)

The DocStore store provides a document-oriented interface for storing and retrieving structured data. It wraps the crdt::Doc type to provide ergonomic access patterns with both simple key-value operations and path-based operations for nested data structures.

Basic Usage

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::DocStore, path};

fn main() -> eidetica::Result<()> {
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Get a DocStore store
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("app_data")?;

// Set simple values
store.set("version", "1.0.0")?;
store.set("author", "Alice")?;

// Path-based operations for nested structures
// This creates nested maps: {"database": {"host": "localhost", "port": "5432"}}
store.set_path(path!("database.host"), "localhost")?;
store.set_path(path!("database.port"), "5432")?;

// Retrieve values
let version = store.get("version")?; // Returns a Value
let host = store.get_path(path!("database.host"))?; // Returns Value

op.commit()?;
Ok(())
}

Important: Path Operations Create Nested Structures

When using set_path("a.b.c", value), DocStore creates nested maps, not flat keys with dots:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::DocStore, path};

fn main() -> eidetica::Result<()> {
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("app_data")?;
// This code:
store.set_path(path!("user.profile.name"), "Bob")?;

// Creates this structure:
// {
//   "user": {
//     "profile": {
//       "name": "Bob"
//     }
//   }
// }

// NOT: { "user.profile.name": "Bob" } ❌
op.commit()?;
Ok(())
}

Use cases for DocStore:

  • Application configuration
  • Metadata storage
  • Structured documents
  • Settings management
  • Any data requiring path-based access

Table

The Table<T> store manages collections of serializable items, similar to a table in a database:

extern crate eidetica;
extern crate serde;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::Table};
use serde::{Serialize, Deserialize};

fn main() -> eidetica::Result<()> {
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Define a struct for your data
#[derive(Serialize, Deserialize, Clone)]
struct User {
    name: String,
    email: String,
    active: bool,
}

// Get a Table store
let op = database.new_transaction()?;
let users = op.get_store::<Table<User>>("users")?;

// Insert items (returns a generated UUID)
let user = User {
    name: "Alice".to_string(),
    email: "alice@example.com".to_string(),
    active: true,
};
let id = users.insert(user)?;

// Get an item by ID
if let Ok(user) = users.get(&id) {
    println!("Found user: {}", user.name);
}

// Update an item
if let Ok(mut user) = users.get(&id) {
    user.active = false;
    users.set(&id, user)?;
}

// Delete an item
let was_deleted = users.delete(&id)?;
if was_deleted {
    println!("User deleted successfully");
}

// Search for items matching a condition
let active_users = users.search(|user| user.active)?;
for (id, user) in active_users {
    println!("Active user: {} (ID: {})", user.name, id);
}
op.commit()?;
Ok(())
}

Use cases for Table:

  • Collections of structured objects
  • Record storage (users, products, todos, etc.)
  • Any data where individual items need unique IDs
  • When you need to search across records with custom predicates

SettingsStore (Database Settings Management)

The SettingsStore provides a specialized, type-safe interface for managing database settings and authentication configuration. It wraps the internal _settings subtree to provide convenient methods for common settings operations.

Basic Usage

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::SettingsStore};

fn main() -> eidetica::Result<()> {
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Get a SettingsStore for the current transaction
let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Set database name
settings_store.set_name("My Application Database")?;

// Get database name
let name = settings_store.get_name()?;
println!("Database name: {}", name);

transaction.commit()?;
Ok(())
}

Authentication Management

SettingsStore provides convenient methods for managing authentication keys:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::SettingsStore};
use eidetica::auth::{AuthKey, Permission};
use eidetica::auth::crypto::{generate_keypair, format_public_key};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "stores_auth_example");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Generate a keypair for the new user
let (_alice_signing_key, alice_verifying_key) = generate_keypair();
let alice_public_key = format_public_key(&alice_verifying_key);
let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Add a new authentication key
let auth_key = AuthKey::active(
    &alice_public_key,
    Permission::Write(10),
)?;
settings_store.set_auth_key("alice", auth_key)?;

// Get an authentication key
let key = settings_store.get_auth_key("alice")?;
println!("Alice's key: {}", key.pubkey());

// Revoke a key
settings_store.revoke_auth_key("alice")?;

transaction.commit()?;
Ok(())
}

Complex Updates with Closures

For complex operations that need to be atomic, use the update_auth_settings method:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::SettingsStore};
use eidetica::auth::{AuthKey, Permission};
use eidetica::auth::crypto::{generate_keypair, format_public_key};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "complex_auth_example");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Generate keypairs for multiple users
let (_bob_signing_key, bob_verifying_key) = generate_keypair();
let bob_public_key = format_public_key(&bob_verifying_key);
let bob_key = AuthKey::active(&bob_public_key, Permission::Write(20))?;
let (_charlie_signing_key, charlie_verifying_key) = generate_keypair();
let charlie_public_key = format_public_key(&charlie_verifying_key);
let charlie_key = AuthKey::active(&charlie_public_key, Permission::Admin(15))?;
let (_old_user_signing_key, old_user_verifying_key) = generate_keypair();
let old_user_public_key = format_public_key(&old_user_verifying_key);
let old_user_key = AuthKey::active(&old_user_public_key, Permission::Write(30))?;
// Add old_user first so we can revoke it
let setup_txn = database.new_transaction()?;
let setup_store = setup_txn.get_settings()?;
setup_store.set_auth_key("old_user", old_user_key)?;
setup_txn.commit()?;
let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Perform multiple auth operations atomically
settings_store.update_auth_settings(|auth| {
    // Add multiple keys
    auth.overwrite_key("bob", bob_key)?;
    auth.overwrite_key("charlie", charlie_key)?;

    // Revoke an old key
    auth.revoke_key("old_user")?;

    Ok(())
})?;

transaction.commit()?;
Ok(())
}

Advanced Usage

For operations not covered by the convenience methods, access the underlying DocStore:

let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Access underlying DocStore for advanced operations
let doc_store = settings_store.as_doc_store();
doc_store.set_path(path!("custom.config.option"), "value")?;

transaction.commit()?;

Use cases for SettingsStore:

  • Database configuration and metadata
  • Authentication key management
  • User permission management
  • Bootstrap and sync policies
  • Any settings that need type-safe, validated access

YDoc (Y-CRDT Integration)

The YDoc store provides integration with Y-CRDT (Yjs) for real-time collaborative editing. This requires the "y-crdt" feature:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::YDoc};
use eidetica::y_crdt::{Map, Text, Transact};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "y_crdt_stores");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Get a YDoc store
let op = database.new_transaction()?;
let doc_store = op.get_store::<YDoc>("document")?;

// Work with Y-CRDT structures
doc_store.with_doc_mut(|doc| {
    let text = doc.get_or_insert_text("content");
    let metadata = doc.get_or_insert_map("meta");

    let mut txn = doc.transact_mut();

    // Collaborative text editing
    text.insert(&mut txn, 0, "Hello, collaborative world!");

    // Set metadata
    metadata.insert(&mut txn, "title", "My Document");
    metadata.insert(&mut txn, "author", "Alice");

    Ok(())
})?;

op.commit()?;
Ok(())
}

Use cases for YDoc:

  • Real-time collaborative text editing
  • Shared documents with multiple editors
  • Conflict-free data synchronization
  • Applications requiring sophisticated merge algorithms

Store Implementation Details

Each Store implementation in Eidetica:

  1. Implements the Store trait
  2. Provides methods appropriate for its data structure
  3. Handles serialization/deserialization of data
  4. Manages the store's history within the Database

The Store trait defines the minimal interface:

pub trait Store: Sized {
    fn new(op: &Transaction, store_name: &str) -> Result<Self>;
    fn name(&self) -> &str;
}

Store implementations add their own methods on top of this minimal interface.
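
For illustration, a minimal custom store satisfying this trait might look like the following sketch (NullStore is a hypothetical name; a real implementation would also stage data through the Transaction):

struct NullStore {
    name: String,
}

impl Store for NullStore {
    fn new(_op: &Transaction, store_name: &str) -> Result<Self> {
        Ok(NullStore { name: store_name.to_string() })
    }

    fn name(&self) -> &str {
        &self.name
    }
}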

Store History and Merging (CRDT Aspects)

While Eidetica uses Merkle-DAGs for overall history, the way data within a Store is combined when branches merge relies on Conflict-free Replicated Data Type (CRDT) principles. This ensures that even if different replicas of the database have diverged and made concurrent changes, they can be merged back together automatically without conflicts (though the merge result depends on the CRDT strategy).

Each Store type implements its own merge logic, typically triggered implicitly when a Transaction reads the current state of the store (which involves finding and merging the tips of that store's history):

  • DocStore: Implements a Last-Writer-Wins (LWW) strategy using the internal Doc type. When merging concurrent writes to the same key or path, the write associated with the later Entry "wins", and its value is kept. Writes to different keys are simply combined. Deleted keys (via delete()) are tracked with tombstones to ensure deletions propagate properly.

  • Table<T>: Also uses LWW for updates to the same row ID. If two concurrent operations modify the same row, the later write wins. Inserts of different rows are combined (all inserted rows are kept). Deletions generally take precedence over concurrent updates (though precise semantics might evolve).

Note: The CRDT merge logic happens internally when a Transaction loads the initial state of a Store or when a store viewer is created. You typically don't invoke merge logic directly.

Future Store Types

Eidetica's architecture allows for adding new Store implementations. Potential future types include:

  • ObjectStore: For storing large binary blobs.

These are not yet implemented. Development is currently focused on the core API and the existing DocStore and Table types.

Transactions: Atomic Changes

In Eidetica, all modifications to the data stored within a Database's Stores happen through a Transaction. This is a fundamental concept ensuring atomicity and providing a consistent mechanism for interacting with your data.

Authentication Note: All transactions in Eidetica are authenticated by default. Every transaction uses the database's default signing key to ensure that all changes are cryptographically verified and can be traced to their source.

A Transaction bundles multiple Store operations (which affect individual subtrees) into a single atomic Entry that gets committed to the database.

Why Transactions?

Transactions provide several key benefits:

  • Atomicity: Changes made to multiple Stores within a single Transaction are committed together as one atomic unit. If the commit() fails, no changes are persisted. This is similar to transactions in traditional databases.
  • Consistency: A Transaction captures a snapshot of the Database's state (specifically, the tips of the relevant Stores) when it's created or when a Store is first accessed within it. All reads and writes within that Transaction occur relative to this consistent state.
  • Change Staging: Modifications made via Store handles are staged within the Transaction object itself, not written directly to the database until commit() is called.
  • Authentication: All transactions are automatically authenticated using the database's default signing key, ensuring data integrity and access control.
  • History Creation: A successful commit() results in the creation of a new Entry in the Database, containing the staged changes and linked to the previous state (the tips the Transaction was based on). This is how history is built.

The Transaction Lifecycle

Using a Transaction follows a distinct lifecycle:

  1. Creation: Start an authenticated transaction from a Database instance.

    extern crate eidetica;
    use eidetica::{backend::database::InMemory, Instance, crdt::Doc};
    
    fn main() -> eidetica::Result<()> {
    // Setup database
    let backend = InMemory::new();
    let instance = Instance::open(Box::new(backend))?;
    instance.create_user("alice", None)?;
    let mut user = instance.login_user("alice", None)?;
    let mut settings = Doc::new();
    settings.set_string("name", "test");
    let default_key = user.get_default_key()?;
    let database = user.create_database(settings, &default_key)?;
    
    let _txn = database.new_transaction()?; // Automatically uses the database's default signing key
    Ok(())
    }
  2. Store Access: Get handles to the specific Stores you want to interact with. This implicitly loads the current state (tips) of a store into the transaction when it is first accessed.

    extern crate eidetica;
    extern crate serde;
    use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::{Table, DocStore, SettingsStore}, Database};
    use serde::{Serialize, Deserialize};
    
    #[derive(Clone, Debug, Serialize, Deserialize)]
    struct User {
        name: String,
    }
    
    fn main() -> eidetica::Result<()> {
    // Setup database and transaction
    let backend = InMemory::new();
    let instance = Instance::open(Box::new(backend))?;
    instance.create_user("alice", None)?;
    let mut user = instance.login_user("alice", None)?;
    let mut settings = Doc::new();
    settings.set_string("name", "test");
    let default_key = user.get_default_key()?;
    let database = user.create_database(settings, &default_key)?;
    let txn = database.new_transaction()?;
    
    // Get handles within a scope or manage their lifetime
    let _users_store = txn.get_store::<Table<User>>("users")?;
    let _config_store = txn.get_store::<DocStore>("config")?;
    let _settings_store = txn.get_settings()?;  // For database settings
    
    txn.commit()?;
    Ok(())
    }
  3. Staging Changes: Use the methods provided by the Store handles (set, insert, get, remove, etc.). These methods interact with the data staged within the Transaction.

    extern crate eidetica;
    extern crate serde;
    use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::{Table, DocStore, SettingsStore}};
    use serde::{Serialize, Deserialize};
    
    #[derive(Clone, Debug, Serialize, Deserialize)]
    struct User {
        name: String,
    }
    
    fn main() -> eidetica::Result<()> {
    // Setup database and transaction
    let backend = InMemory::new();
    let instance = Instance::open(Box::new(backend))?;
    instance.create_user("alice", None)?;
    let mut user = instance.login_user("alice", None)?;
    let mut settings = Doc::new();
    settings.set_string("name", "test");
    let default_key = user.get_default_key()?;
    let database = user.create_database(settings, &default_key)?;
    let txn = database.new_transaction()?;
    let users_store = txn.get_store::<Table<User>>("users")?;
    let config_store = txn.get_store::<DocStore>("config")?;
    let settings_store = txn.get_settings()?;
    
    // Insert a new user and get their ID
    let user_id = users_store.insert(User { name: "Alice".to_string() })?;
    let _current_user = users_store.get(&user_id)?;
    config_store.set("last_updated", "2024-01-15T10:30:00Z")?;
    settings_store.set_name("Updated Database Name")?;  // Manage database settings
    
    txn.commit()?;
    Ok(())
    }

    Note: get methods within a transaction read from the staged state, reflecting any changes already made within the same transaction.

  4. Commit: Finalize the changes. This consumes the Transaction object, calculates the final Entry content based on staged changes, cryptographically signs the entry, writes the new Entry to the Database, and returns the ID of the newly created Entry.

    extern crate eidetica;
    use eidetica::{backend::database::InMemory, Instance, crdt::Doc};
    
    fn main() -> eidetica::Result<()> {
    // Setup database
    let backend = InMemory::new();
    let instance = Instance::open(Box::new(backend))?;
    instance.create_user("alice", None)?;
    let mut user = instance.login_user("alice", None)?;
    let mut settings = Doc::new();
    settings.set_string("name", "test");
    let default_key = user.get_default_key()?;
    let database = user.create_database(settings, &default_key)?;
    
    // Create transaction and commit
    let txn = database.new_transaction()?;
    let new_entry_id = txn.commit()?;
    println!("Changes committed. New state represented by Entry: {}", new_entry_id);
    Ok(())
    }

    After commit(), the txn variable is no longer valid.

Managing Database Settings

Within transactions, you can manage database settings using SettingsStore. This provides type-safe access to database configuration and authentication settings:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::SettingsStore};
use eidetica::auth::{AuthKey, Permission};
use eidetica::auth::crypto::{generate_keypair, format_public_key};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "settings_example");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Generate keypairs for old user and add it first so we can revoke it
let (_old_user_signing_key, old_user_verifying_key) = generate_keypair();
let old_user_public_key = format_public_key(&old_user_verifying_key);
let old_user_key = AuthKey::active(&old_user_public_key, Permission::Write(15))?;
let setup_txn = database.new_transaction()?;
let setup_store = setup_txn.get_settings()?;
setup_store.set_auth_key("old_user", old_user_key)?;
setup_txn.commit()?;
let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Update database name
settings_store.set_name("Production Database")?;

// Generate keypairs for the new users (in a real app, each user provides their own public key)
let (_new_user_signing_key, new_user_verifying_key) = generate_keypair();
let new_user_public_key = format_public_key(&new_user_verifying_key);
let (_alice_signing_key, alice_verifying_key) = generate_keypair();
let alice_public_key = format_public_key(&alice_verifying_key);

// Add authentication keys
let new_user_key = AuthKey::active(
    &new_user_public_key,
    Permission::Write(10),
)?;
settings_store.set_auth_key("new_user", new_user_key)?;

// Complex auth operations atomically
let alice_key = AuthKey::active(&alice_public_key, Permission::Write(5))?;
settings_store.update_auth_settings(|auth| {
    auth.overwrite_key("alice", alice_key)?;
    auth.revoke_key("old_user")?;
    Ok(())
})?;

transaction.commit()?;
Ok(())
}

This ensures that settings changes are atomic and properly authenticated alongside other database modifications.

Read-Only Access

While Transactions are essential for writes, you can perform reads without an explicit Transaction using Database::get_store_viewer:

extern crate eidetica;
extern crate serde;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::Table, Database};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct User {
    name: String,
}

fn main() -> eidetica::Result<()> {
// Setup database with some data
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "test");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Insert test data
let txn = database.new_transaction()?;
let users_store = txn.get_store::<Table<User>>("users")?;
let user_id = users_store.insert(User { name: "Alice".to_string() })?;
txn.commit()?;

let users_viewer = database.get_store_viewer::<Table<User>>("users")?;
if let Ok(_user) = users_viewer.get(&user_id) {
    // Read data based on the current tips of the 'users' store
}
Ok(())
}

A SubtreeViewer provides read-only access based on the latest committed state (tips) of that specific store at the time the viewer is created. It does not allow modifications and does not require a commit().

Choose Transaction when you need to make changes or require a transaction-like boundary for multiple reads/writes. Choose SubtreeViewer for simple, read-only access to the latest state.

Authentication Guide

How to use Eidetica's authentication system for securing your data.

Quick Start

Every Eidetica database requires authentication. Here's the minimal setup:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};
use eidetica::crdt::Doc;

fn main() -> eidetica::Result<()> {
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;

// Create and login a passwordless user (generates Ed25519 keypair automatically)
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;

// Create a database using the user's default key
let mut settings = Doc::new();
settings.set("name", "my_database");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// All operations are now authenticated
let op = database.new_transaction()?;
// ... make changes ...
op.commit()?;  // Automatically signed
Ok(())
}

Key Concepts

Mandatory Authentication: Every entry must be signed - no exceptions.

Permission Levels:

  • Admin: Can modify settings and manage keys
  • Write: Can read and write data
  • Read: Can only read data

Key Storage: Private keys are stored in Instance, public keys in database settings.
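
A minimal sketch of keys at each level, built with the same helpers used in the examples below (the priority numbers are illustrative):

extern crate eidetica;
use eidetica::auth::{AuthKey, Permission};
use eidetica::auth::crypto::{generate_keypair, format_public_key};

fn main() -> eidetica::Result<()> {
// Generate a keypair and format its public half
let (_signing_key, verifying_key) = generate_keypair();
let pubkey = format_public_key(&verifying_key);

// One AuthKey per permission level; lower numbers = higher priority
let _admin = AuthKey::active(&pubkey, Permission::Admin(0))?;   // manage keys and settings
let _writer = AuthKey::active(&pubkey, Permission::Write(10))?; // read and write data
let _reader = AuthKey::active(&pubkey, Permission::Read)?;      // read only
Ok(())
}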

Common Tasks

Adding Users

Give other users access to your database:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::SettingsStore};
use eidetica::auth::{AuthKey, Permission};
use eidetica::auth::crypto::{generate_keypair, format_public_key};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "auth_example");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
let transaction = database.new_transaction()?;
// Generate a keypair for the new user
let (_alice_signing_key, alice_verifying_key) = generate_keypair();
let alice_public_key = format_public_key(&alice_verifying_key);
let settings_store = transaction.get_settings()?;

// Add a user with write access
let user_key = AuthKey::active(
    &alice_public_key,
    Permission::Write(10),
)?;
settings_store.set_auth_key("alice", user_key)?;

transaction.commit()?;
Ok(())
}

Making Data Public (Read-Only)

Allow anyone to read your database:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};
use eidetica::store::SettingsStore;
use eidetica::auth::{AuthKey, Permission};
use eidetica::crdt::Doc;

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Wildcard key for public read access
let public_key = AuthKey::active(
    "*",
    Permission::Read,
)?;
settings_store.set_auth_key("*", public_key)?;

transaction.commit()?;
Ok(())
}

Collaborative Databases (Read-Write)

Create a collaborative database where anyone can read and write without individual key management:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};
use eidetica::store::SettingsStore;
use eidetica::auth::{AuthKey, Permission};
use eidetica::crdt::Doc;

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "collaborative_notes");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Set up global write permissions
let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Global permission allows any device to read and write
let collaborative_key = AuthKey::active(
    "*",
    Permission::Write(10),
)?;
settings_store.set_auth_key("*", collaborative_key)?;

transaction.commit()?;
Ok(())
}

How it works:

  1. Any device can bootstrap without approval (global permission grants access)
  2. Devices discover available SigKeys using Database::find_sigkeys()
  3. Select a SigKey from the available options (will include "*" for global permissions)
  4. Open the database with the selected SigKey
  5. All transactions automatically use the configured permissions
  6. No individual keys are added to the database's auth settings

Example of opening a collaborative database:

extern crate eidetica;
use eidetica::{Instance, Database, backend::database::InMemory};
use eidetica::auth::crypto::{generate_keypair, format_public_key};
use eidetica::auth::types::SigKey;

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
let (signing_key, verifying_key) = generate_keypair();
let database_root_id = "collaborative_db_root".into();
// Get your public key
let pubkey = format_public_key(&verifying_key);

// Discover all SigKeys this public key can use
let sigkeys = Database::find_sigkeys(&instance, &database_root_id, &pubkey)?;

// Use the first available SigKey (will be "*" for global permissions)
if let Some((sigkey, _permission)) = sigkeys.first() {
    let sigkey_str = match sigkey {
        SigKey::Direct(name) => name.clone(),
        _ => panic!("Delegation paths not yet supported"),
    };

    // Open the database with the discovered SigKey
    let database = Database::open(instance, &database_root_id, signing_key, sigkey_str)?;

    // Create transactions as usual
    let txn = database.new_transaction()?;
    // ... make changes ...
    txn.commit()?;
}
Ok(())
}

This is ideal for:

  • Team collaboration spaces
  • Shared notes and documents
  • Public wikis
  • Development/testing environments

Security note: Use appropriate permission levels. Write(10) allows Write and Read operations but not Admin operations (managing keys and settings).

Revoking Access

Remove a user's access:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};
use eidetica::store::SettingsStore;
use eidetica::auth::{AuthKey, Permission};
use eidetica::crdt::Doc;

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// First add alice key so we can revoke it
let transaction_setup = database.new_transaction()?;
let settings_setup = transaction_setup.get_settings()?;
settings_setup.set_auth_key("alice", AuthKey::active("*", Permission::Write(10))?)?;
transaction_setup.commit()?;
let transaction = database.new_transaction()?;

let settings_store = transaction.get_settings()?;

// Revoke the key
settings_store.revoke_auth_key("alice")?;

transaction.commit()?;
Ok(())
}

Note: Historical entries created by revoked keys remain valid.

Multi-User Setup Example

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::SettingsStore};
use eidetica::auth::{AuthKey, Permission};
use eidetica::auth::crypto::{generate_keypair, format_public_key};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "multi_user_example");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
let transaction = database.new_transaction()?;
let settings_store = transaction.get_settings()?;

// Generate keypairs for different users
let (_super_admin_signing_key, super_admin_verifying_key) = generate_keypair();
let super_admin_public_key = format_public_key(&super_admin_verifying_key);

let (_dept_admin_signing_key, dept_admin_verifying_key) = generate_keypair();
let dept_admin_public_key = format_public_key(&dept_admin_verifying_key);

let (_user1_signing_key, user1_verifying_key) = generate_keypair();
let user1_public_key = format_public_key(&user1_verifying_key);

// Use update_auth_settings for complex multi-key setup
settings_store.update_auth_settings(|auth| {
    // Super admin (priority 0 - highest)
    auth.overwrite_key("super_admin", AuthKey::active(
        &super_admin_public_key,
        Permission::Admin(0),
    )?)?;

    // Department admin (priority 10)
    auth.overwrite_key("dept_admin", AuthKey::active(
        &dept_admin_public_key,
        Permission::Admin(10),
    )?)?;

    // Regular users (priority 100)
    auth.overwrite_key("user1", AuthKey::active(
        &user1_public_key,
        Permission::Write(100),
    )?)?;

    Ok(())
})?;

transaction.commit()?;
Ok(())
}

Key Management Tips

  1. Use descriptive key names: "alice_laptop", "build_server", etc.
  2. Set up admin hierarchy: Lower priority numbers = higher authority
  3. Use SettingsStore methods:
    • set_auth_key() for setting keys (upsert behavior)
    • revoke_auth_key() for removing access
    • update_auth_settings() for complex multi-step operations
  4. Regular key rotation: Periodically update keys for security
  5. Backup admin keys: Keep secure copies of critical admin keys

Advanced: Cross-Database Authentication (Delegation)

Delegation allows databases to reference other databases as sources of authentication keys. This enables powerful patterns like:

  • Users manage their own keys in personal databases
  • Multiple projects share authentication across databases
  • Hierarchical access control without granting admin privileges

How Delegation Works

When you delegate to another database:

  1. The delegating database references another database in its _settings.auth
  2. The delegated database maintains its own keys in its _settings.auth
  3. Permission clamping ensures delegated keys can't exceed specified bounds
  4. Delegation paths reference keys by their name in the delegated database's auth settings

Basic Delegation Setup

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};
use eidetica::auth::{DelegatedTreeRef, Permission, PermissionBounds, TreeReference};
use eidetica::store::SettingsStore;

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let default_key = user.get_default_key()?;

// Create user's personal database
let alice_database = user.create_database(Doc::new(), &default_key)?;

// Create main project database
let project_database = user.create_database(Doc::new(), &default_key)?;
// Get the user's database root and current tips
let user_root = alice_database.root_id().clone();
let user_tips = alice_database.get_tips()?;

// Add delegation reference to project database
let transaction = project_database.new_transaction()?;
let settings = transaction.get_settings()?;

settings.update_auth_settings(|auth| {
    auth.add_delegated_tree("alice@example.com", DelegatedTreeRef {
        permission_bounds: PermissionBounds {
            max: Permission::Write(15),
            min: Some(Permission::Read),
        },
        tree: TreeReference {
            root: user_root,
            tips: user_tips,
        },
    })?;
    Ok(())
})?;

transaction.commit()?;
Ok(())
}

Now any key in Alice's personal database can access the project database, with permissions clamped to the specified bounds.

Understanding Delegation Paths

Critical concept: A delegation path traverses through databases using two different types of key names:

  1. Delegation reference names - Point to other databases (DelegatedTreeRef)
  2. Signing key names - Point to public keys (AuthKey) for signature verification

Delegation Reference Names

These are names in the delegating database's auth settings that point to other databases:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};
use eidetica::auth::{DelegatedTreeRef, Permission, PermissionBounds, TreeReference};
use eidetica::store::SettingsStore;

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let default_key = user.get_default_key()?;
let alice_db = user.create_database(Doc::new(), &default_key)?;
let alice_root = alice_db.root_id().clone();
let alice_tips = alice_db.get_tips()?;
let project_db = user.create_database(Doc::new(), &default_key)?;
let transaction = project_db.new_transaction()?;
let settings = transaction.get_settings()?;
// In project database: "alice@example.com" points to Alice's database
settings.update_auth_settings(|auth| {
    auth.add_delegated_tree(
        "alice@example.com",  // ← Delegation reference name
        DelegatedTreeRef {
            tree: TreeReference {
                root: alice_root,
                tips: alice_tips,
            },
            permission_bounds: PermissionBounds {
                max: Permission::Write(15),
                min: Some(Permission::Read),
            },
        }
    )?;
    Ok(())
})?;
transaction.commit()?;
Ok(())
}

This creates an entry in the project database's auth settings:

  • Name: "alice@example.com"
  • Points to: Alice's database (via TreeReference)

Signing Key Names

These are names in the delegated database's auth settings that point to public keys:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};
use eidetica::auth::{AuthKey, Permission};
use eidetica::store::SettingsStore;

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let default_key_id = user.get_default_key()?;
let alice_db = user.create_database(Doc::new(), &default_key_id)?;
let signing_key = user.get_signing_key(&default_key_id)?;
let alice_pubkey_str = eidetica::auth::crypto::format_public_key(&signing_key.verifying_key());
let transaction = alice_db.new_transaction()?;
let settings = transaction.get_settings()?;
// In Alice's database: "alice_laptop" points to a public key
// (This was added automatically during bootstrap, but we can add aliases)
settings.update_auth_settings(|auth| {
    auth.add_key(
        "alice_work",  // ← Signing key name (alias)
        AuthKey::active(
            &alice_pubkey_str,  // The actual Ed25519 public key
            Permission::Write(10),
        )?
    )?;
    Ok(())
})?;
transaction.commit()?;
Ok(())
}

This creates an entry in Alice's database auth settings:

  • Name: "alice_work" (an alias for the same key as "alice_laptop")
  • Points to: An Ed25519 public key

Using Delegated Keys

A delegation path is a sequence of steps that traverses from the delegating database to the signing key:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};
use eidetica::auth::{SigKey, DelegationStep};
use eidetica::store::DocStore;

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let default_key = user.get_default_key()?;
let project_db = user.create_database(Doc::new(), &default_key)?;
let user_db = user.create_database(Doc::new(), &default_key)?;
let user_tips = user_db.get_tips()?;
// Create a delegation path with TWO steps:
let delegation_path = SigKey::DelegationPath(vec![
    // Step 1: Look up "alice@example.com" in PROJECT database's auth settings
    //         This is a delegation reference name pointing to Alice's database
    DelegationStep {
        key: "alice@example.com".to_string(),
        tips: Some(user_tips),  // Tips for Alice's database
    },
    // Step 2: Look up "alice_laptop" in ALICE'S database's auth settings
    //         This is a signing key name pointing to an Ed25519 public key
    DelegationStep {
        key: "alice_laptop".to_string(),
        tips: None,  // Final step has no tips (it's a pubkey, not a tree)
    },
]);

// Use the delegation path to create an authenticated operation
// Note: This requires the actual signing key to be available
// project_database.new_operation_with_sig_key(delegation_path)?;
Ok(())
}

Path traversal:

  1. Start in project database auth settings
  2. Look up "alice@example.com" → finds DelegatedTreeRef → jumps to Alice's database
  3. Look up "alice_laptop" in Alice's database → finds AuthKey → gets Ed25519 public key
  4. Use that public key to verify the entry signature

Permission Clamping

Permissions from delegated databases are automatically clamped:

User DB key: Admin(5)   →  Project DB clamps to: Write(15)  (max bound)
User DB key: Write(10)  →  Project DB keeps:     Write(10)  (within bounds)
User DB key: Read       →  Project DB keeps:     Read       (above min bound)

Rules:

  • If delegated permission > max bound: lowered to max
  • If delegated permission < min bound: raised to min (if specified)
  • Permissions within bounds are preserved
  • Admin permissions only apply within the delegated database
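
The clamping rule itself is simple. Here is a standalone sketch using a simplified stand-in for the library's permission type (priority numbers omitted):

// Simplified stand-in ordered Read < Write < Admin; not the library's type
#[derive(Clone, Copy, PartialEq, PartialOrd, Debug)]
enum Level { Read, Write, Admin }

fn clamp(delegated: Level, max: Level, min: Option<Level>) -> Level {
    let mut effective = if delegated > max { max } else { delegated };
    if let Some(min) = min {
        if effective < min {
            effective = min;
        }
    }
    effective
}

fn main() {
    // Mirrors the table above: Admin is lowered to the max bound,
    // in-bounds permissions pass through unchanged
    assert_eq!(clamp(Level::Admin, Level::Write, Some(Level::Read)), Level::Write);
    assert_eq!(clamp(Level::Write, Level::Write, Some(Level::Read)), Level::Write);
    assert_eq!(clamp(Level::Read, Level::Write, Some(Level::Read)), Level::Read);
}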

This makes it convenient to reuse the same validation rules across both databases. Only an Admin can grant permissions in a database by modifying its auth settings, but by delegating to a user-controlled database with bounded permissions, you can grant a user lower-level access while letting them use whichever keys they like. The user then manages their own keys with their own Admin keys, under exactly the same rules.

Multi-Level Delegation

Delegated databases can themselves delegate to other databases, creating chains:

// Entry signed through a delegation chain:
{
  "auth": {
    "sig": "...",
    "key": [
      {
        "key": "team@example.com",      // Step 1: Delegation ref in Main DB → Team DB
        "tips": ["team_db_tip"]
      },
      {
        "key": "alice@example.com",     // Step 2: Delegation ref in Team DB → Alice's DB
        "tips": ["alice_db_tip"]
      },
      {
        "key": "alice_laptop",          // Step 3: Signing key in Alice's DB → pubkey
        // No tips - this is a pubkey, not a tree
      }
    ]
  }
}

Path traversal:

  1. Look up "team@example.com" in Main DB → finds DelegatedTreeRef → jump to Team DB
  2. Look up "alice@example.com" in Team DB → finds DelegatedTreeRef → jump to Alice's DB
  3. Look up "alice_laptop" in Alice's DB → finds AuthKey → get Ed25519 public key
  4. Use that public key to verify the signature

Each level applies its own permission clamping, with the final effective permission being the minimum across all levels.
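
The same three-step chain can be expressed with the SigKey API shown earlier. In this sketch, team_db_tips and alice_db_tips are placeholders for tips you would read from each database:

use eidetica::auth::{SigKey, DelegationStep};

let chain = SigKey::DelegationPath(vec![
    // Step 1: delegation ref in Main DB pointing at Team DB
    DelegationStep { key: "team@example.com".to_string(), tips: Some(team_db_tips) },
    // Step 2: delegation ref in Team DB pointing at Alice's DB
    DelegationStep { key: "alice@example.com".to_string(), tips: Some(alice_db_tips) },
    // Step 3: signing key in Alice's DB; the final step carries no tips
    DelegationStep { key: "alice_laptop".to_string(), tips: None },
]);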

Common Delegation Patterns

User-Managed Access:

Project DB → delegates to → Alice's Personal DB
                              ↓
                         Alice manages her own keys

Team Hierarchy:

Main DB → delegates to → Team DB → delegates to → User DB
          (max: Admin)            (max: Write)

Cross-Project Authentication:

Project A ───┐
             ├→ delegates to → Shared Auth DB
Project B ───┘

Key Aliasing

Auth settings can contain multiple names for the same public key with different permissions:

{
  "_settings": {
    "auth": {
      "Ed25519:abc123...": {
        "pubkey": "Ed25519:abc123...",
        "permissions": "admin:0",
        "status": "active"
      },
      "alice_work": {
        "pubkey": "Ed25519:abc123...",
        "permissions": "write:10",
        "status": "active"
      },
      "alice_readonly": {
        "pubkey": "Ed25519:abc123...",
        "permissions": "read",
        "status": "active"
      }
    }
  }
}

This allows:

  • The same key to have different permission contexts
  • Readable delegation path names instead of public key strings
  • Fine-grained access control based on how the key is referenced

Best Practices

  1. Use descriptive delegation names: "alice@example.com", "team-engineering"
  2. Set appropriate permission bounds: Don't grant more access than needed
  3. Update delegation tips: Keep tips current to ensure revocations are respected
  4. Use friendly key names: Add aliases for keys that will be used in delegation paths
  5. Document delegation chains: Complex hierarchies can be hard to debug

Troubleshooting

"Authentication failed": Check that:

  • The key exists in database settings
  • The key status is Active (not Revoked)
  • The key has sufficient permissions for the operation

"Key name conflict": When using set_auth_key() with different public key:

  • set_auth_key() provides upsert behavior for same public key
  • Returns KeyNameConflict error if key name exists with different public key
  • Use get_auth_key() to check existing key before deciding action

"Cannot modify key": Admin operations require:

  • Admin-level permissions
  • Equal or higher priority than the target key

Multi-device conflicts: During bootstrap sync between devices:

  • If same key name with same public key: Operation succeeds (safe)
  • If same key name with different public key: Operation fails (prevents conflicts)
  • Consider using device-specific key names like "alice_laptop", "alice_phone"

Network partitions: Authentication changes merge automatically using Last-Write-Wins. The most recent change takes precedence.

Bootstrap Security Policy

When new devices join existing databases through bootstrap synchronization, Eidetica provides two approval methods to balance security and convenience.

Bootstrap Approval Methods

Eidetica supports two bootstrap approval approaches, checked in this order:

  1. Global Wildcard Permissions - Databases with global '*' permissions automatically approve bootstrap requests if the requested permission is satisfied
  2. Manual Approval - Default secure behavior requiring admin approval for each device

Default Security Behavior

By default, bootstrap requests are rejected for security:

// Bootstrap will fail without explicit policy configuration
client_sync.sync_with_peer_for_bootstrap(
    "127.0.0.1:8080",
    &database_tree_id,
    "device_key_name",
    Permission::Write(100),
).await; // Returns PermissionDenied error

The simplest approach for collaborative databases is to use global wildcard permissions:

let mut settings = Doc::new();
let mut auth_doc = Doc::new();

// Add admin key
auth_doc.set_json("admin", serde_json::json!({
    "pubkey": admin_public_key,
    "permissions": {"Admin": 1},
    "status": "Active"
}))?;

// Add global wildcard permission for automatic bootstrap
auth_doc.set_json("*", serde_json::json!({
    "pubkey": "*",
    "permissions": {"Write": 10},  // Allows Read and Write(11+) requests
    "status": "Active"
}))?;

settings.set_doc("auth", auth_doc);

Benefits:

  • No per-device key management required
  • Immediate bootstrap approval
  • Simple configuration - one permission setting controls all devices
  • See Bootstrap Guide for details

Manual Approval Process

For controlled access scenarios, use manual approval to review each bootstrap request:
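
The admin-side flow looks roughly like this sketch, using the request-queue methods covered in the Bootstrapping chapter below:

// Review queued requests and decide each one with an admin key
let pending = sync.pending_bootstrap_requests()?;
for (request_id, request) in pending {
    println!("{} wants {} on {}",
        request.requesting_key_name,
        request.requested_permission,
        request.tree_id);
    sync.approve_bootstrap_request(&request_id, "admin_key")?;
    // ...or: sync.reject_bootstrap_request(&request_id, "admin_key")?;
}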

Security Recommendations:

  • Development/Testing: Use global wildcard permissions for convenience
  • Production: Use manual approval for controlled access
  • Team Collaboration: Use global wildcard permissions with appropriate permission levels
  • Public Databases: Use global wildcard permissions for open access, or manual approval for controlled access

Bootstrap Flow

  1. Client Request: Device requests access with public key and permission level
  2. Global Permission Check: Server checks if global '*' permission satisfies request
  3. Global Permission Approval: If global permission exists and satisfies request, access is granted immediately
  4. Manual Approval Queue: If no global permission, request is queued for admin review
  5. Admin Decision: Admin explicitly approves or rejects the request
  6. Database Access: Approved devices can read/write according to granted permissions

Synchronization Guide

Eidetica's synchronization system enables real-time data synchronization between distributed peers in a decentralized network. This guide covers how to set up, configure, and use the sync features.

Overview

The sync system uses a BackgroundSync architecture with command-pattern communication:

  • Single background thread handles all sync operations
  • Command-channel communication between frontend and backend
  • Automatic change detection via hook system
  • Multiple transport protocols (HTTP, Iroh P2P)
  • Database-level sync relationships for granular control
  • Authentication and security using Ed25519 signatures
  • Persistent state tracking via DocStore

Quick Start

1. Enable Sync on Your Database

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};

fn main() -> eidetica::Result<()> {
let backend = Box::new(InMemory::new());
// Create a database with sync enabled
let instance = Instance::open(backend)?;
instance.enable_sync()?;

// Create and login a user (generates authentication key automatically)
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
Ok(())
}

2. Enable a Transport Protocol

// Get access to sync module
let sync = instance.sync().unwrap();

// Enable HTTP transport
sync.enable_http_transport()?;

// Start a server to accept connections
sync.start_server_async("127.0.0.1:8080").await?;

3. Bootstrap New Devices

For new devices joining existing databases, use authenticated bootstrap to request access:

// On another device - connect and bootstrap with authentication
let client_sync = client_db.sync().unwrap();
client_sync.enable_http_transport()?;

// Bootstrap with authentication - automatically requests write permission
client_sync.sync_with_peer_for_bootstrap(
    "127.0.0.1:8080",
    &tree_id,
    "device_key",                          // Your local key name
    eidetica::auth::Permission::Write      // Requested permission level
).await?;

4. Connecting to Peers

// Connect and sync with a peer - automatically detects bootstrap vs incremental sync
client_sync.sync_with_peer("127.0.0.1:8080", Some(&tree_id)).await?;

That's it! The sync system handles everything automatically:

  • Handshake and peer registration
  • Bootstrap (full sync) if you don't have the database
  • Incremental sync if you already have it
  • Bidirectional data transfer

Transport Protocols

HTTP Transport

The HTTP transport uses REST APIs for synchronization. Good for simple deployments with fixed IP addresses:

// Enable HTTP transport
sync.enable_http_transport()?;

// Start server
sync.start_server_async("127.0.0.1:8080").await?;

// Connect to peer
sync.sync_with_peer("peer.example.com:8080", Some(&tree_id)).await?;

Iroh Transport

Iroh provides direct peer-to-peer connectivity with NAT traversal. Best for production deployments:

// Enable Iroh transport (uses production relay servers by default)
sync.enable_iroh_transport()?;

// Start server
sync.start_server_async("ignored").await?; // Iroh manages its own addressing

// Get address to share with peers
let my_address = sync.get_server_address_async().await?;

// Connect to peer
sync.sync_with_peer(&peer_address_json, Some(&tree_id)).await?;

How it works:

  • Attempts direct connection via NAT hole-punching (~90% success)
  • Falls back to relay servers if needed
  • Automatically upgrades to direct when possible

Advanced configuration: Iroh supports custom relay servers, staging mode, and relay-disabled mode for local testing. See the Iroh documentation for details.

Sync Configuration

BackgroundSync Architecture

The sync system automatically starts a background thread when transport is enabled. Once configured, all operations are handled automatically:

  • When you commit changes, they're sent immediately via sync callbacks
  • Failed sends are retried with exponential backoff (2^attempts seconds, capped at 64 seconds; see the sketch after this list)
  • Periodic sync runs based on user-configured intervals (default: every 5 minutes)
  • Connection checks every 60 seconds
  • No manual queue management needed
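
The retry backoff is a simple doubling schedule. An illustrative computation (not the library's internals):

fn main() {
    // 2^attempts seconds, capped at 64 (2^6) seconds
    fn backoff_secs(attempts: u32) -> u64 {
        1u64 << attempts.min(6)
    }
    assert_eq!(backoff_secs(0), 1);
    assert_eq!(backoff_secs(3), 8);
    assert_eq!(backoff_secs(10), 64); // capped
}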

Periodic Sync Intervals

Each user can configure how frequently their databases sync automatically using the interval_seconds setting:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};
use eidetica::user::types::{DatabasePreferences, SyncSettings};

fn main() -> eidetica::Result<()> {
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;
instance.enable_sync()?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "my_db");
let key = user.get_default_key()?;
let db = user.create_database(settings, &key)?;
let db_id = db.root_id().clone();
// Configure periodic sync every 60 seconds
let prefs = DatabasePreferences {
    database_id: db_id,
    key_id: user.get_default_key()?,
    sync_settings: SyncSettings {
        sync_enabled: true,
        sync_on_commit: false,
        interval_seconds: Some(60),  // Sync every 60 seconds
        properties: Default::default(),
    },
};

user.add_database(prefs)?;
Ok(())
}

Interval Options:

  • Some(seconds): Sync automatically every N seconds
  • None: No periodic sync (only sync on commit or manual trigger)

Multi-User Behavior:

When multiple users track the same database with different intervals, the system uses the minimum interval (most aggressive sync):

// Alice syncs every 300 seconds
alice.add_database(DatabasePreferences {
    database_id: shared_db_id.clone(),
    sync_settings: SyncSettings {
        interval_seconds: Some(300),
        ..Default::default()
    },
    ..prefs
})?;

// Bob syncs every 60 seconds
bob.add_database(DatabasePreferences {
    database_id: shared_db_id.clone(),
    sync_settings: SyncSettings {
        interval_seconds: Some(60),
        ..Default::default()
    },
    ..prefs
})?;

// Database will sync every 60 seconds (minimum of 300 and 60)

This ensures the database stays as up-to-date as the most active user wants.

Peer Management

The sync_with_peer() method handles peer management automatically:

// Automatic peer connection, handshake, and sync in one call
sync.sync_with_peer("peer.example.com:8080", Some(&tree_id)).await?;

// This automatically:
// 1. Performs handshake with the peer
// 2. Registers the peer if not already known
// 3. Bootstraps the database if it doesn't exist locally
// 4. Performs incremental sync if database exists locally
// 5. Stores peer information for future sync operations

// For subsequent sync operations with the same peer
sync.sync_with_peer("peer.example.com:8080", Some(&tree_id)).await?;
// Reuses existing peer registration and performs incremental sync

Manual Peer Management (Advanced)

For advanced use cases, you can manage peers manually:

// Register a peer manually
sync.register_peer("ed25519:abc123...", Some("Alice's Device"))?;

// Add multiple addresses for the same peer
sync.add_peer_address(&peer_key, Address::http("192.168.1.100:8080")?)?;
sync.add_peer_address(&peer_key, Address::iroh("iroh://peer_id@relay")?)?;

// Use low-level sync method with registered peers
sync.sync_tree_with_peer(&peer_key, &tree_id).await?;

Peer Information and Status

// Get peer information (after connecting)
if let Some(peer_info) = sync.get_peer_info(&peer_key)? {
    println!("Peer: {} ({})",
             peer_info.display_name.unwrap_or("Unknown".to_string()),
             peer_info.status);
}

// List all registered peers
let peers = sync.list_peers()?;
for peer in peers {
    println!("Peer: {} - {}", peer.pubkey, peer.display_name.unwrap_or("Unknown".to_string()));
}

Database Tracking and Preferences

When using Eidetica with user accounts, you can track which databases you want to sync and configure individual sync preferences for each database.

Adding a Database to Track

To add a database to your user's tracked databases:

// Configure preferences for a database
let prefs = DatabasePreferences {
    database_id: db_id.clone(),
    key_id: user.get_default_key()?,
    sync_settings: SyncSettings {
        sync_enabled: true,
        sync_on_commit: false,
        interval_seconds: Some(60),  // Sync every 60 seconds
        properties: Default::default(),
    },
};

// Add to user's tracked databases
user.add_database(prefs)?;

When you add a database, the system automatically discovers which signing key (SigKey) your user key can use to authenticate with that database. This uses the database's permission system to find the best available access level.
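
Conceptually this is the same discovery performed by Database::find_sigkeys in the Authentication Guide. In this sketch, db_id and my_pubkey are placeholders:

// Enumerate the (SigKey, Permission) pairs this public key can use;
// the best available one is stored alongside the preferences
let sigkeys = Database::find_sigkeys(&instance, &db_id, &my_pubkey)?;
if let Some((sigkey, permission)) = sigkeys.first() {
    println!("will authenticate as {sigkey:?} with {permission:?}");
}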

Managing Tracked Databases

// List all tracked databases
let databases = user.list_database_prefs()?;
for db_prefs in databases {
    println!("Database: {}", db_prefs.database_id);
    println!("  Syncing: {}", db_prefs.sync_settings.sync_enabled);
}

// Get preferences for a specific database
let prefs = user.database_prefs(&db_id)?;

// Update sync preferences
let mut updated_prefs = prefs.clone();
updated_prefs.sync_settings.sync_enabled = false;
user.set_prefs(updated_prefs)?;

// Remove a database from tracking
user.remove_database(&db_id)?;

Loading Tracked Databases

Once a database is tracked, you can easily load it:

// Load a tracked database
let database = user.open_database(&db_id)?;

// The user's configured key and SigKey are automatically used
// You can now work with the database normally

Sync Preferences vs Sync Status

It's important to understand the distinction:

  • Preferences (managed by User): What you want to happen (sync enabled, interval, etc.)
  • Status (managed by Sync module): What is actually happening (last sync time, success/failure, etc.)

The user tracking system manages your preferences. The sync module reads these preferences to determine which databases to sync and when.
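
A sketch contrasting the two, using the preferences API above and the SyncStateManager covered later under Monitoring and Diagnostics (peer_key and db_id are placeholders):

use eidetica::sync::state::SyncStateManager;

// Preferences: what you asked for (stored with the user)
let prefs = user.database_prefs(&db_id)?;
println!("sync enabled: {}", prefs.sync_settings.sync_enabled);

// Status: what actually happened (tracked by the sync module)
let op = sync.sync_tree().new_operation()?;
let state = SyncStateManager::new(&op);
let cursor = state.get_sync_cursor(&peer_key, &db_id)?;
println!("last synced entry: {:?}", cursor.last_synced_entry);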

Multi-User Support

Different users can track the same database with different preferences:

// Alice wants to sync this database every minute
alice_user.add_database(DatabasePreferences {
    database_id: shared_db_id.clone(),
    key_id: alice_key.clone(),
    sync_settings: SyncSettings {
        sync_enabled: true,
        interval_seconds: Some(60),
        ..Default::default()
    },
})?;

// Bob wants to sync the same database, but only on commit
bob_user.add_database(DatabasePreferences {
    database_id: shared_db_id.clone(),
    key_id: bob_key.clone(),
    sync_settings: SyncSettings {
        sync_enabled: true,
        sync_on_commit: true,
        interval_seconds: None,
        ..Default::default()
    },
})?;

Each user maintains their own tracking list and preferences independently.

Security

Authentication

All sync operations use Ed25519 digital signatures:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};

fn main() -> eidetica::Result<()> {
// Setup database instance with sync capability
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;
instance.enable_sync()?;

// The sync system automatically uses your user's key for authentication
// Create and login a user (generates the primary key)
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;

// User can add additional keys if needed for backup or multiple devices
user.add_private_key(Some("backup_key"))?;

// Create a database with default authentication
let mut settings = Doc::new();
settings.set_string("name", "my_sync_database");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Set a specific key as default for a database (configuration pattern)
// In production: database.set_default_auth_key("backup_key");
println!("Authentication keys configured for sync operations");
Ok(())
}

Bootstrap Authentication Flow

When joining a new database, the authenticated bootstrap protocol handles permission requests:

  1. Client Request: Device requests access with its public key and desired permission level
  2. Policy Check: Server evaluates bootstrap auto-approval policy (secure by default)
  3. Conditional Approval: Key approved only if policy explicitly allows
  4. Key Addition: Server adds the requesting key to the database's authentication settings
  5. Database Transfer: Complete database state transferred to the client
  6. Access Granted: Client can immediately make authenticated operations

For example, a server can configure a global wildcard permission that a client then bootstraps against:
// Configure database with global wildcard permission (server side)
let mut settings = Doc::new();
settings.set_string("name", "Team Database");

let mut auth_doc = Doc::new();

// Add global wildcard permission for team collaboration
auth_doc.set_json("*", serde_json::json!({
    "pubkey": "*",
    "permissions": {"Write": 10},
    "status": "Active"
}))?;

settings.set_doc("auth", auth_doc);

// Create the database owned by the admin, with anyone able to write
let mut admin = instance.login_user("admin", Some("admin_password"))?;
let admin_key = admin.get_default_key()?;
let database = admin.create_database(settings, &admin_key)?;

// Bootstrap authentication example (client side)
sync.sync_with_peer_for_bootstrap(
    "peer_address",
    &tree_id,
    "my_device_key",
    Permission::Write(15)  // Request write access
).await?;  // Will succeed via global permission

// After successful bootstrap, the device can write to the database
let op = database.new_authenticated_operation("my_device_key")?;
// ... make changes ...
op.commit()?;

Security Considerations:

  • Bootstrap requests are rejected by default for security
  • Global wildcard permissions enable automatic approval without per-device key management
  • Manual approval queues bootstrap requests for administrator review
  • All key additions (in manual approval) are recorded in the immutable database history
  • Permission levels are enforced (Read/Write/Admin)

Peer Verification

During handshake, peers exchange and verify public keys:

// The connect_to_peer method automatically:
// 1. Exchanges public keys
// 2. Verifies signatures
// 3. Registers the verified peer
let verified_peer_key = sync.connect_to_peer(&addr).await?;

Monitoring and Diagnostics

Sync Operations

The BackgroundSync engine handles all operations automatically:

// Entries are synced immediately when committed
// No manual queue management needed

// The background thread handles:
// - Immediate sending of new entries
// - Retry queue with exponential backoff
// - Periodic sync every 5 minutes
// - Connection health checks every minute

// Server status
let is_running = sync.is_server_running();
let server_addr = sync.get_server_address()?;

Sync State Tracking

use eidetica::sync::state::SyncStateManager;

// Get sync state for a database-peer relationship
let op = sync.sync_tree().new_operation()?;
let state_manager = SyncStateManager::new(&op);

let cursor = state_manager.get_sync_cursor(&peer_key, &tree_id)?;
println!("Last synced: {:?}", cursor.last_synced_entry);

let metadata = state_manager.get_sync_metadata(&peer_key)?;
println!("Success rate: {:.2}%", metadata.sync_success_rate() * 100.0);

Bootstrap-First Sync Protocol

Eidetica supports a bootstrap-first sync protocol that enables devices to join existing databases (rooms/channels) without requiring pre-existing local state.

Key Features

Unified API: Single sync_with_peer() method handles both bootstrap and incremental sync

// Works whether you have the database locally or not
sync.sync_with_peer("peer.example.com:8080", Some(&tree_id)).await?;

Automatic Detection: The protocol automatically detects whether bootstrap or incremental sync is needed:

  • Bootstrap sync: If you don't have the database locally, the peer sends the complete database
  • Incremental sync: If you already have the database, only new changes are transferred

True Bidirectional Sync: Both peers exchange data in a single sync operation - no separate client/server roles needed

Protocol Flow

  1. Handshake: Peers exchange device identities and establish trust
  2. Tree Discovery: Client requests information about available trees
  3. Tip Comparison: Compare local vs remote database tips to detect missing data
  4. Bidirectional Transfer: Both peers send missing entries to each other in a single sync operation
    • Client receives missing entries from server
    • Client automatically detects what server is missing and sends those entries back
    • Uses existing IncrementalResponse.their_tips field to enable true bidirectional sync
  5. Verification: Validate received entries and update local database

Use Cases

Joining Chat Rooms:

// Join a chat room by ID
let room_id = "abc123...";
sync.sync_with_peer("chat.example.com", Some(&room_id)).await?;
// Now you have the full chat history and can participate

Document Collaboration:

// Join a collaborative document
let doc_id = "def456...";
sync.sync_with_peer("docs.example.com", Some(&doc_id)).await?;
// You now have the full document and can make edits

Data Synchronization:

// Sync application data to a new device
sync.sync_with_peer("my-server.com", Some(&app_data_id)).await?;
// All your application data is now available locally

Best Practices

1. Use the New Simplified API

Prefer sync_with_peer() over manual peer management:

// ✅ Recommended: Automatic connection and sync
sync.sync_with_peer("peer.example.com", Some(&tree_id)).await?;

// ❌ Avoid: Manual peer setup (unless you need fine control)
sync.register_peer(&pubkey, Some("Alice"))?;
sync.add_peer_address(&pubkey, addr)?;
sync.sync_tree_with_peer(&pubkey, &tree_id).await?;

2. Use Iroh Transport for Production

Iroh provides better NAT traversal and P2P capabilities than HTTP.

3. Leverage Database Discovery

Use discover_peer_trees() to find available databases before syncing:

let available = sync.discover_peer_trees("peer.example.com").await?;
for tree in available {
    if tree.name == "My Project" {
        sync.sync_with_peer("peer.example.com", Some(&tree.tree_id)).await?;
        break;
    }
}

4. Handle Network Failures Gracefully

The sync system automatically retries failed operations, but your application should handle temporary disconnections.

5. Understand Bootstrap vs Incremental Behavior

  • First sync with a database = Bootstrap (full data transfer)
  • Subsequent syncs = Incremental (only changes)
  • No manual state management needed

6. Secure Your Private Keys

Store device keys securely and use different keys for different purposes when appropriate.

Advanced Topics

Custom Write Callbacks for Sync

You can use write callbacks to trigger sync operations when entries are committed:

use std::sync::Arc;

// Get the sync instance
let sync = instance.sync().expect("Sync not enabled");

// Set up a write callback to trigger sync
let sync_clone = sync.clone();
let peer_pubkey = "peer_public_key".to_string();
database.on_local_write(move |entry, db, _instance| {
    // Queue the entry for sync when it's committed
    sync_clone.queue_entry_for_sync(&peer_pubkey, entry.id(), db.root_id())
})?;

This approach allows you to automatically sync entries when they're created, enabling real-time synchronization between peers.

Multiple Database Instances

You can run multiple sync-enabled databases in the same process:

// Database 1
let db1 = Instance::open(Box::new(InMemory::new()))?;
db1.enable_sync()?;
db1.sync()?.enable_http_transport()?;
db1.sync()?.start_server("127.0.0.1:8080")?;

// Database 2
let db2 = Instance::open(Box::new(InMemory::new()))?;
db2.enable_sync()?;
db2.sync()?.enable_http_transport()?;
db2.sync()?.start_server("127.0.0.1:8081")?;

// Connect them
let addr = Address::http("127.0.0.1:8080")?;
let peer_key = db2.sync()?.connect_to_peer(&addr).await?;

Troubleshooting

Common Issues

"No transport enabled" error:

  • Ensure you've called enable_http_transport() or enable_iroh_transport()

Sync not happening:

  • Check peer status is Active
  • Verify database sync relationships are configured
  • Check network connectivity between peers

Performance issues:

  • Consider using Iroh transport for better performance
  • Check retry queue for persistent failures
  • Verify network connectivity is stable

Authentication failures:

  • Ensure private keys are properly configured
  • Verify peer public keys are correct
  • Check that peers are using compatible protocol versions

Complete Synchronization Example

For a full working example that demonstrates real-time synchronization between peers, see the Chat Example in the repository.

The chat application demonstrates:

  • Multi-Transport Sync: Both HTTP (simple client-server) and Iroh (P2P with NAT traversal)
  • Bootstrap Protocol: Automatic access requests when joining existing rooms
  • User API Integration: User-based authentication with automatic key management
  • Sync Hooks: Real-time message updates via periodic refresh
  • Peer Discovery: Server address sharing for easy peer connection
  • Multiple Databases: Each chat room is a separate synchronized database

Quick Start with the Chat Example

# Terminal 1 - Create a room with HTTP transport
cd examples/chat
cargo run -- --username alice --transport http --create-only --room-name "Demo"

# Terminal 2 - Connect to the room
cargo run -- --username bob --transport http --connect "room_id@127.0.0.1:PORT"

See the full chat documentation for detailed usage, transport options, and troubleshooting.

Bootstrapping

Overview

The Bootstrap system provides secure key management for Eidetica databases by controlling how new devices gain access to synchronized databases. It supports two approval methods:

  1. Global Wildcard Permissions - Databases with global '*' permissions automatically approve bootstrap requests without adding new keys
  2. Manual Approval - Bootstrap requests are queued for administrator review and explicit approval

Global Permission Bootstrap

Global '*' permissions provide the simplest and most flexible approach for collaborative or public databases.

How It Works

When a database has global permissions configured (e.g., {"*": {"pubkey": "*", "permissions": "Write: 10"}}), bootstrap requests are automatically approved if the requested permission level is satisfied by the global permission. No new keys are added to the database.

Devices use the global permission for both bootstrap approval and subsequent operations (transactions, reads, writes). The system automatically resolves to the global "*" key when a device's specific key is not present in the database's auth settings.

Advantages

  • No key management: Devices don't need individual keys added to database
  • Immediate access: Bootstrap approval happens instantly
  • Simple configuration: One permission setting controls all devices
  • Flexible permissions: Set exactly the permission level you want to allow

Configuration Example

Configure a database with global write permission:

use eidetica::crdt::Doc;

// Create database with global permission
let mut settings = Doc::new();
let mut auth_doc = Doc::new();

// Add admin key for database management
auth_doc.set_json("admin_key", serde_json::json!({
    "pubkey": "ed25519:admin_public_key_here",
    "permissions": {"Admin": 1},
    "status": "Active"
}))?;

// Add global permission for automatic bootstrap approval
auth_doc.set_json("*", serde_json::json!({
    "pubkey": "*",
    "permissions": {"Write": 10},  // Allows Read and Write(11+) requests
    "status": "Active"
}))?;

settings.set_doc("auth", auth_doc);
let database = instance.new_database(settings, "admin_key")?;

Permission Levels

Eidetica uses lower numbers = higher permissions:

  • Global Write(10) allows: Read, Write(11), Write(15), etc.
  • Global Write(10) denies: Write(5), Admin(*)

Choose your global permission level carefully based on your security requirements.
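
Within a single permission level, the comparison reduces to the priority numbers. An illustrative rule (not the library's code):

fn main() {
    // Lower number = higher permission, so a grant satisfies a request
    // when its priority number is less than or equal to the requested one
    fn satisfies(granted: u32, requested: u32) -> bool {
        granted <= requested
    }
    assert!(satisfies(10, 15));  // Write(10) covers a Write(15) request
    assert!(satisfies(10, 11));  // ...and Write(11)
    assert!(!satisfies(10, 5));  // but not Write(5)
}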

Client Workflow

From the client's perspective, the bootstrap process follows these steps:

1. Initial Bootstrap Attempt

The client initiates a bootstrap request when it needs access to a synchronized database:

client_sync.sync_with_peer_for_bootstrap(
    &server_address,
    &tree_id,
    "client_device_key",     // Client's key name
    Permission::Write(5)     // Requested permission level
).await

2. Response Handling

The client must handle different response scenarios:

  • Global Wildcard Permission Approved:

    • Request succeeds immediately
    • Client gains access via global permission
    • No individual key added to database
    • Can proceed with normal operations
  • Manual Approval Required (default):

    • Request fails with an error
    • Error indicates request is "pending"
    • Bootstrap request is queued for admin review

3. Waiting for Approval

While the request is pending, the client has several options:

  • Polling Strategy: Periodically retry sync operations
  • Event-Based: Wait for notification from server (future enhancement)
  • User-Triggered: Let user manually retry when they expect approval

4. After Admin Decision

If Approved:

  • The initial sync_with_peer_for_bootstrap() will still return an error
  • Client must use normal sync_with_peer() to access the database
  • Once synced, client can load and use the database normally

If Rejected:

  • All sync attempts continue to fail
  • Client remains unable to access the database
  • May submit a new request with different parameters if appropriate

5. Retry Logic Example

async fn bootstrap_with_retry(
    client_sync: &mut Sync,
    server_addr: &str,
    tree_id: &ID,
    key_name: &str,
) -> Result<()> {
    // Initial bootstrap request
    if let Err(_) = client_sync.sync_with_peer_for_bootstrap(
        server_addr, tree_id, key_name, Permission::Write(5)
    ).await {
        println!("Bootstrap request pending approval...");

        // Poll for approval (with backoff)
        for attempt in 0..10 {
            tokio::time::sleep(Duration::from_secs(30 * (attempt + 1))).await;

            // Try normal sync after potential approval
            if client_sync.sync_with_peer(server_addr, Some(tree_id)).await.is_ok() {
                println!("Access granted!");
                return Ok(());
            }
        }

        return Err("Bootstrap request timeout or rejected".into());
    }

    Ok(()) // Auto-approved
}

Usage Examples

Manual Approval Workflow

For administrators managing bootstrap requests:

// 1. List pending requests
let pending = sync.pending_bootstrap_requests()?;
for (request_id, request) in pending {
    println!("Request {}: {} wants {} access to tree {}",
        request_id,
        request.requesting_key_name,
        request.requested_permission,
        request.tree_id
    );
}

// 2. Approve a request
sync.approve_bootstrap_request(
    "bootstrap_a1b2c3d4...",
    "admin_key"  // Your admin key name
)?;

// 3. Or reject a request
sync.reject_bootstrap_request(
    "bootstrap_e5f6g7h8...",
    "admin_key"
)?;

Complete Client Bootstrap Example

// Step 1: Initial bootstrap attempt with authentication
let bootstrap_result = client_sync.sync_with_peer_for_bootstrap(
    &server_address,
    &tree_id,
    "my_device_key",
    Permission::Write(5)
).await;

// Step 2: Handle the response based on approval method
match bootstrap_result {
    Ok(_) => {
        // Global wildcard permission granted immediate access
        println!("Bootstrap approved via global permission! Access granted immediately.");
    },
    Err(e) => {
        // Manual approval required
        // The error indicates the request is pending
        println!("Bootstrap request submitted, awaiting admin approval...");

        // Step 3: Wait for admin to review and approve
        // Options:
        // a) Poll periodically
        // b) Wait for out-of-band notification
        // c) User-triggered retry

        // Step 4: After admin approval, retry with normal sync
        // (bootstrap sync will still fail, use regular sync instead)
        tokio::time::sleep(Duration::from_secs(30)).await;

        // After approval, normal sync will succeed
        match client_sync.sync_with_peer(&server_address, Some(&tree_id)).await {
            Ok(_) => {
                println!("Access granted! Database synchronized.");
                // Client can now load and use the database
                let db = client_instance.load_database(&tree_id)?;
            },
            Err(_) => {
                println!("Still pending or rejected. Check with admin.");
            }
        }
    }
}

// Handling rejection scenario
// If the request was rejected, all sync attempts will continue to fail
// The client will need to submit a new bootstrap request if appropriate

Security Considerations

Trust Model

  • Global Wildcard Permissions: Trusts any device that can reach the sync endpoint

    • Suitable for: Development, collaborative projects, public databases
    • Risk: Any device can gain the configured global permissions
    • Benefit: Simple, immediate access for authorized scenarios
  • Manual Approval: Requires explicit admin action for each device

    • Suitable for: Production, sensitive data, controlled access scenarios
    • Benefit: Complete control over which devices gain access
    • Risk: Administrative overhead for each new device

Troubleshooting

Common Issues

  1. "Authentication required but not configured"

    • Cause: Sync handler cannot authenticate with target database
    • Solution: Ensure proper key configuration for database operations
  2. "Invalid request state"

    • Cause: Attempting to approve/reject non-pending request
    • Solution: Check request status before operation

Performance Considerations

  • Sync database grows linearly with request count
  • Request queries are indexed by ID

Sync Quick Reference

A concise reference for Eidetica's synchronization API with common usage patterns and code snippets.

Setup and Initialization

Basic Sync Setup

use eidetica::{Instance, backend::database::InMemory};

// Create database with sync enabled
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;
instance.enable_sync()?;

// Create and login user (generates authentication key)
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;

// Enable transport
let sync = instance.sync().unwrap();
sync.enable_http_transport()?;
sync.start_server_async("127.0.0.1:8080").await?;

Understanding BackgroundSync

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};

fn main() -> eidetica::Result<()> {
// Setup database instance with sync capability
let backend = Box::new(InMemory::new());
let db = Instance::open(backend)?;
db.enable_sync()?;

// The BackgroundSync engine starts automatically with transport
let sync = db.sync().unwrap();
sync.enable_http_transport()?; // Starts background thread

// Background thread configuration and behavior:
// - Command processing (immediate response to commits)
// - Periodic sync operations (5 minute intervals)
// - Retry queue processing (30 second intervals)
// - Connection health checks (60 second intervals)

// All sync operations are automatic - no manual queue management needed
println!("BackgroundSync configured with automatic operation timers");
Ok(())
}

Peer Management

Authenticated Bootstrap (New Devices)

// For new devices joining existing databases with authentication
sync.sync_with_peer_for_bootstrap(
    "peer.example.com:8080",
    &tree_id,
    "device_key",                    // Local authentication key
    eidetica::auth::Permission::Write // Requested permission level
).await?;

// This automatically:
// 1. Connects to peer and performs handshake
// 2. Requests database access with specified permission level
// 3. Receives auto-approved access (or manual approval in production)
// 4. Downloads complete database state
// 5. Grants authenticated write access

Simplified Sync (Legacy/Existing Databases)

// Single call handles connection, handshake, and sync detection
sync.sync_with_peer("peer.example.com:8080", Some(&tree_id)).await?;

// This automatically:
// 1. Connects to peer and performs handshake
// 2. Bootstraps database if you don't have it locally
// 3. Syncs incrementally if you already have the database
// 4. Handles peer registration internally

Database Discovery

// Discover available databases on a peer
let available_trees = sync.discover_peer_trees("peer.example.com:8080").await?;
for tree in &available_trees {
    println!("Available: {} ({} entries)", tree.tree_id, tree.entry_count);
}

// Bootstrap from discovered database
if let Some(tree) = available_trees.first() {
    sync.sync_with_peer("peer.example.com:8080", Some(&tree.tree_id)).await?;
}

Manual Peer Registration (Advanced)

// Register peer manually (for advanced use cases)
let peer_key = "ed25519:abc123...";
sync.register_peer(peer_key, Some("Alice's Device"))?;

// Add addresses
sync.add_peer_address(peer_key, Address::http("192.168.1.100:8080")?)?;
sync.add_peer_address(peer_key, Address::iroh("iroh://peer_id")?)?;

// Use low-level sync with registered peer
sync.sync_tree_with_peer(&peer_key, &tree_id).await?;

// Note: Manual registration is usually unnecessary
// The sync_with_peer() method handles registration automatically

Peer Status Management

// List all peers
let peers = db.sync()?.list_peers()?;
for peer in peers {
    println!("{}: {} ({})",
        peer.pubkey,
        peer.display_name.unwrap_or("Unknown".to_string()),
        peer.status
    );
}

// Get specific peer info
if let Some(peer) = db.sync()?.get_peer_info(&peer_key)? {
    println!("Status: {:?}", peer.status);
    println!("Addresses: {:?}", peer.addresses);
}

// Update peer status
db.sync()?.update_peer_status(&peer_key, PeerStatus::Inactive)?;

Database Synchronization

Create and Share Database

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::DocStore};

fn main() -> eidetica::Result<()> {
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;
instance.enable_sync()?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
// Create a database to share
let mut settings = Doc::new();
settings.set_string("name", "My Chat Room");
settings.set_string("description", "A room for team discussions");

let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
let tree_id = database.root_id();

// Add some initial data
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("messages")?;
store.set_string("welcome", "Welcome to the room!")?;
op.commit()?;

// Share the tree_id with others
println!("Room ID: {}", tree_id);
Ok(())
}

Bootstrap from Shared Database

// Join someone else's database using the tree_id
let room_id = "abc123..."; // Received from another user
sync.sync_with_peer("peer.example.com:8080", Some(&room_id)).await?;

// You now have the full database locally
let database = db.load_database(&room_id)?;
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("messages")?;
println!("Welcome message: {}", store.get_string("welcome")?);

Ongoing Synchronization

// All changes automatically sync after bootstrap
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("messages")?;
store.set_string("my_message", "Hello everyone!")?;
op.commit()?; // Automatically syncs to all connected peers

// Manually sync to get latest changes
sync.sync_with_peer("peer.example.com:8080", Some(&tree_id)).await?;

Advanced: Manual Sync Relationships

// For fine-grained control (usually not needed)
sync.add_tree_sync(&peer_key, &tree_id)?;

// List synced databases for peer
let databases = sync.get_peer_trees(&peer_key)?;

// List peers syncing a database
let peers = sync.get_tree_peers(&tree_id)?;

// Remove sync relationship
sync.remove_tree_sync(&peer_key, &tree_id)?;

Data Operations (Auto-Sync)

Basic Data Changes

use eidetica::store::DocStore;

// Any database operation automatically triggers sync
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("data")?;

store.set_string("message", "Hello World")?;
store.set_path("user.name", "Alice")?;
store.set_path("user.age", 30)?;

// Commit triggers sync callbacks automatically
op.commit()?; // Entries queued for sync to all configured peers

Bulk Operations

// Multiple operations in single commit
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("data")?;

for i in 0..100 {
    store.set_string(&format!("item_{}", i), &format!("value_{}", i))?;
}

// Single commit, single sync entry
op.commit()?;

Monitoring and Diagnostics

Server Control

// Start/stop sync server
let sync = db.sync()?;
sync.start_server("127.0.0.1:8080")?;

// Check server status
if sync.is_server_running() {
    let addr = sync.get_server_address()?;
    println!("Server running at: {}", addr);
}

// Stop server
sync.stop_server()?;

Sync State Tracking

use eidetica::sync::state::SyncStateManager;

// Get sync database operation
let op = db.sync()?.sync_tree().new_operation()?;
let state_manager = SyncStateManager::new(&op);

// Get sync cursor for a peer-database relationship
if let Some(cursor) = state_manager.get_sync_cursor(&peer_key, &tree_id)? {
    println!("Last synced: {:?}", cursor.last_synced_entry);
    println!("Total synced: {}", cursor.total_synced_count);
}

// Get peer metadata
if let Some(meta) = state_manager.get_sync_metadata(&peer_key)? {
    println!("Successful syncs: {}", meta.successful_sync_count);
    println!("Failed syncs: {}", meta.failed_sync_count);
    println!("Success rate: {:.2}%", meta.sync_success_rate() * 100.0);
    println!("Avg duration: {:.1}ms", meta.average_sync_duration_ms);
}

// Get recent sync history
let history = state_manager.get_sync_history(&peer_key, Some(10))?;
for entry in history {
    println!("Sync {}: {} entries in {:.1}ms",
        entry.sync_id, entry.entries_count, entry.duration_ms);
}

Error Handling

Common Error Patterns

use eidetica::sync::SyncError;

// Connection errors
match sync.connect_to_peer(&addr).await {
    Ok(peer_key) => println!("Connected: {}", peer_key),
    Err(e) if e.is_sync_error() => {
        match e.sync_error().unwrap() {
            SyncError::HandshakeFailed(msg) => {
                eprintln!("Handshake failed: {}", msg);
                // Retry with different address or check credentials
            },
            SyncError::NoTransportEnabled => {
                eprintln!("Enable transport first");
                sync.enable_http_transport()?;
            },
            SyncError::PeerNotFound(key) => {
                eprintln!("Peer {} not registered", key);
                // Register peer first
            },
            _ => eprintln!("Other sync error: {}", e),
        }
    },
    Err(e) => eprintln!("Non-sync error: {}", e),
}

Monitoring Sync Health

// Check server status
if !sync.is_server_running() {
    eprintln!("Warning: Sync server not running");
}

// Monitor peer connectivity
let peers = sync.list_peers()?;
for peer in peers {
    if peer.status != PeerStatus::Active {
        eprintln!("Warning: Peer {} is {}", peer.pubkey, peer.status);
    }
}

// Sync happens automatically, but you can monitor state
// via the SyncStateManager for diagnostics

Configuration Examples

Development Setup

// Fast, responsive sync for development
// Enable HTTP transport for easy debugging
db.sync()?.enable_http_transport()?;
db.sync()?.start_server("127.0.0.1:8080")?;

// Connect to local test peer
let addr = Address::http("127.0.0.1:8081")?;
let peer = db.sync()?.connect_to_peer(&addr).await?;

Production Setup

// Use Iroh for production deployments (defaults to n0's relay servers)
db.sync()?.enable_iroh_transport()?;

// Or configure for specific environments:
use iroh::RelayMode;
use eidetica::sync::transports::iroh::IrohTransport;

// Custom relay server (e.g., enterprise deployment)
let relay_url: iroh::RelayUrl = "https://relay.example.com".parse()?;
let relay_node = iroh::RelayNode {
    url: relay_url,
    quic: Some(Default::default()),
};
let transport = IrohTransport::builder()
    .relay_mode(RelayMode::Custom(iroh::RelayMap::from_iter([relay_node])))
    .build()?;
db.sync()?.enable_iroh_transport_with_config(transport)?;

// Connect to peers
let addr = Address::iroh(peer_node_id)?;
let peer = db.sync()?.connect_to_peer(&addr).await?;

// Sync happens automatically:
// - Immediate on commit
// - Retry with exponential backoff
// - Periodic sync every 5 minutes

Multi-Database Setup

// Run multiple sync-enabled databases
let db1 = Instance::open(Box::new(InMemory::new()))?.enable_sync()?;
db1.sync()?.enable_http_transport()?;
db1.sync()?.start_server("127.0.0.1:8080")?;

let db2 = Instance::open(Box::new(InMemory::new()))?.enable_sync()?;
db2.sync()?.enable_http_transport()?;
db2.sync()?.start_server("127.0.0.1:8081")?;

// Connect them together
let addr = Address::http("127.0.0.1:8080")?;
let peer = db2.sync()?.connect_to_peer(&addr).await?;

Testing Patterns

Testing with Iroh (No Relays)

#[tokio::test]
async fn test_iroh_sync_local() -> Result<()> {
    use iroh::RelayMode;
    use eidetica::sync::transports::iroh::IrohTransport;

    // Configure Iroh for local testing (no relay servers)
    let transport1 = IrohTransport::builder()
        .relay_mode(RelayMode::Disabled)
        .build()?;
    let transport2 = IrohTransport::builder()
        .relay_mode(RelayMode::Disabled)
        .build()?;

    // Setup databases with local Iroh transport
    let db1 = Instance::open(Box::new(InMemory::new()))?.enable_sync()?;
    db1.sync()?.enable_iroh_transport_with_config(transport1)?;
    db1.sync()?.start_server("ignored")?; // Iroh manages its own addresses

    let db2 = Instance::open(Box::new(InMemory::new()))?.enable_sync()?;
    db2.sync()?.enable_iroh_transport_with_config(transport2)?;
    db2.sync()?.start_server("ignored")?;

    // Get the serialized NodeAddr (includes direct addresses)
    let addr1 = db1.sync()?.get_server_address()?;
    let addr2 = db2.sync()?.get_server_address()?;

    // Connect peers using full NodeAddr info
    let peer1 = db2.sync()?.connect_to_peer(&Address::iroh(&addr1)?).await?;
    let peer2 = db1.sync()?.connect_to_peer(&Address::iroh(&addr2)?).await?;

    // Now they can sync directly via P2P
    Ok(())
}

Mock Peer Setup (HTTP)

#[tokio::test]
async fn test_sync_between_peers() -> Result<()> {
    // Setup first peer
    let instance1 = Instance::open(Box::new(InMemory::new()))?.enable_sync()?;
    instance1.create_user("peer1", None)?;
    let mut user1 = instance1.login_user("peer1", None)?;
    instance1.sync()?.enable_http_transport()?;
    instance1.sync()?.start_server("127.0.0.1:0")?; // Random port

    let addr1 = instance1.sync()?.get_server_address()?;

    // Setup second peer
    let instance2 = Instance::open(Box::new(InMemory::new()))?.enable_sync()?;
    instance2.create_user("peer2", None)?;
    let mut user2 = instance2.login_user("peer2", None)?;
    instance2.sync()?.enable_http_transport()?;

    // Connect peers
    let addr = Address::http(&addr1)?;
    let peer1_key = instance2.sync()?.connect_to_peer(&addr).await?;
    instance2.sync()?.update_peer_status(&peer1_key, PeerStatus::Active)?;

    // Setup sync relationship
    let key1 = user1.get_default_key()?;
    let tree1 = user1.create_database(Doc::new(), &key1)?;
    let key2 = user2.get_default_key()?;
    let tree2 = user2.create_database(Doc::new(), &key2)?;

    instance2.sync()?.add_tree_sync(&peer1_key, &tree1.root_id().to_string())?;

    // Test sync
    let op1 = tree1.new_transaction()?;
    let store1 = op1.get_store::<DocStore>("data")?;
    store1.set_string("test", "value")?;
    op1.commit()?;

    // Wait for sync
    tokio::time::sleep(Duration::from_secs(2)).await;

    // Verify sync occurred
    // ... verification logic

    Ok(())
}

Best Practices Summary

✅ Do

  • Use sync_with_peer() for most synchronization needs
  • Enable sync before creating databases you want to synchronize
  • Use Iroh transport for production deployments (better NAT traversal)
  • Use discover_peer_trees() to find available databases before syncing
  • Share tree IDs to allow others to bootstrap from your databases
  • Handle network failures gracefully (sync system auto-retries)
  • Let BackgroundSync handle retry logic automatically

❌ Don't

  • Manually manage peers unless you need fine control (use sync_with_peer() instead)
  • Remove peer relationships for databases you want to synchronize
  • Manually manage sync queues (BackgroundSync handles this)
  • Ignore sync errors in production code
  • Use HTTP transport for high-volume production (prefer Iroh)
  • Assume sync is instantaneous (it's eventually consistent)

🚀 New Bootstrap-First Features

  • Zero-state joining: Join rooms/databases without any local setup
  • Automatic protocol detection: Bootstrap vs incremental sync handled automatically
  • Simplified API: Single sync_with_peer() call handles everything
  • Database discovery: Find available databases on peers
  • Bidirectional sync: Both devices can share and receive databases

🔧 Troubleshooting Checklist

  1. Sync not working?

    • Check transport is enabled and server started
    • Verify peer status is Active
    • Confirm database sync relationships configured
    • Check network connectivity
  2. Performance issues?

    • Consider using Iroh transport
    • Check for network bottlenecks
    • Verify retry queue isn't growing unbounded
    • Monitor peer connectivity status
  3. Memory usage high?

    • Check for dead/unresponsive peers
    • Verify retry queue is processing correctly
    • Consider restarting sync to clear state
  4. Sync delays?

    • Remember sync is immediate on commit
    • Check if entries are in retry queue
    • Verify network is stable
    • Check peer responsiveness

Logging

Eidetica uses the tracing crate for structured logging throughout the library. This provides flexible, performant logging that library users can configure to their needs.

Quick Start

Eidetica uses the tracing crate, which means you can attach any subscriber to capture and format logs. The simplest approach is using tracing-subscriber:

[dependencies]
eidetica = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }

Then initialize the subscriber early in your application:

use tracing_subscriber::EnvFilter;

fn main() -> eidetica::Result<()> {
    // Initialize tracing subscriber to see Eidetica logs
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::from_default_env())
        .init();

    // Now all Eidetica operations will log according to RUST_LOG
    // ...
    Ok(())
}

You can customize formatting, filtering, and output destinations. See the tracing-subscriber documentation for advanced configuration options.

Configuring Log Levels

Control logging verbosity using the RUST_LOG environment variable:

# Show only errors
RUST_LOG=eidetica=error cargo run

# Show info messages and above
RUST_LOG=eidetica=info cargo run

# Show debug messages for sync module
RUST_LOG=eidetica::sync=debug cargo run

# Show all trace messages (very verbose)
RUST_LOG=eidetica=trace cargo run

Log Levels in Eidetica

Eidetica uses the following log levels:

  • ERROR: Critical errors that prevent operations from completing
    • Failed database operations
    • Network errors during sync
    • Authentication failures
  • WARN: Important warnings that don't prevent operation
    • Retry operations after failures
    • Invalid configuration detected
    • Deprecated feature usage
  • INFO: High-level operational messages
    • Sync server started/stopped
    • Successful synchronization with peers
    • Database loaded/saved
  • DEBUG: Detailed operational information
    • Sync protocol details
    • Database synchronization progress
    • Hook execution
  • TRACE: Very detailed trace information
    • Individual entry processing
    • Detailed CRDT operations
    • Network protocol messages

Using Eidetica with Logging

Once you've initialized a tracing subscriber, all Eidetica operations will automatically emit structured logs that you can capture and filter:

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};

fn main() -> eidetica::Result<()> {
let backend = Box::new(InMemory::new());
let instance = Instance::open(backend)?;

// Create and login user - this will log at INFO level
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;

// Create a database - this will log at INFO level
let mut settings = Doc::new();
settings.set("name", "my_database");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Operations will emit logs at appropriate levels
// Use RUST_LOG to control what you see
Ok(())
}

Performance Considerations

The tracing crate is designed to have minimal overhead when logging is disabled. Log statements that aren't enabled at the current level are optimized away at compile time.

For performance-critical code paths, Eidetica uses appropriate log levels:

  • Hot paths use trace! level to avoid overhead in production
  • Synchronization operations use debug! for detailed tracking
  • Only important events use info! and above
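
The same conventions can guide any logging you add around Eidetica calls. A minimal sketch using the tracing macros directly (the function and field names here are illustrative, not part of Eidetica's API):

use tracing::{debug, info, trace};

// Hypothetical batch function showing which level fits where
fn process_batch(count: usize) {
    info!(count, "starting batch");    // high-level operational event
    for i in 0..count {
        trace!(i, "processing entry"); // hot path: near-zero cost when disabled
    }
    debug!("batch complete");          // detailed tracking
}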

Integration with Observability Tools

The tracing ecosystem supports various backends for production observability:

  • Console output: Default, human-readable format
  • JSON output: For structured logging systems
  • OpenTelemetry: For distributed tracing
  • Jaeger/Zipkin: For trace visualization

See the tracing documentation for more advanced integration options.
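
For example, switching the subscriber to JSON output is a one-line change (a sketch; assumes tracing-subscriber is built with its "json" feature):

use tracing_subscriber::EnvFilter;

fn main() {
    // Emit one JSON object per log line for structured logging systems
    tracing_subscriber::fmt()
        .json()
        .with_env_filter(EnvFilter::from_default_env())
        .init();
}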

Developer Walkthrough: Building with Eidetica

This guide walks through the Todo Example (examples/todo/src/main.rs) to explain Eidetica's core concepts. The example is a simple command-line todo app that demonstrates databases, transactions, stores, and Y-CRDT integration.

Core Concepts

The Todo example demonstrates Eidetica's key components working together in a real application.

1. The Database Backend (Instance)

The Instance is your main entry point. It wraps a storage backend and provides access to your databases.

The Todo example implements load_or_create_db() to handle loading existing databases or creating new ones:

fn load_or_create_db(path: &PathBuf) -> Result<Instance> {
    let db = if path.exists() {
        let backend = InMemory::load_from_file(path)?;
        Instance::open(Box::new(backend))?
    } else {
        let backend = InMemory::new();
        Instance::open(Box::new(backend))?
    };

    // Ensure the todo app authentication key exists
    let existing_keys = db.list_private_keys()?;

    if !existing_keys.contains(&TODO_APP_KEY_NAME.to_string()) {
        db.add_private_key(TODO_APP_KEY_NAME)?;
        println!("✓ New authentication key created");
    }

    Ok(db)
}

This shows how the InMemory backend can persist to disk and how authentication keys are managed.

2. Databases (Database)

A Database is the primary organizational unit within an Instance. Think of it somewhat like a schema or a logical database within a larger instance. It acts as a container for related data, managed through Stores. Databases provide versioning and history tracking for the data they contain.

The Todo example uses a single Database named "todo":

fn load_or_create_todo_database(db: &Instance) -> Result<Database> {
    let database_name = "todo";

    // Try to find the database by name
    let mut database = match db.find_database(database_name) {
        Ok(mut databases) => {
            databases.pop().unwrap() // unwrap is safe because find_database errors if empty
        }
        Err(e) if e.is_not_found() => {
            // If not found, create a new one
            println!("No existing todo database found, creating a new one...");
            let mut settings = Doc::new();
            settings.set_string("name", database_name);

            db.new_database(settings, TODO_APP_KEY_NAME)?
        }
        Err(e) => return Err(e),
    };

    // Set the default authentication key for this database
    database.set_default_auth_key(TODO_APP_KEY_NAME);

    Ok(database)
}

This shows how find_database() searches for existing databases by name, and set_default_auth_key() configures automatic authentication for all transactions.

3. Transactions and Stores

All data modifications happen within a Transaction. Transactions ensure atomicity and are automatically authenticated using the database's default signing key.

Within a transaction, you access Stores - flexible containers for different types of data. The Todo example uses Table<Todo> to store todo items with unique IDs.
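
In outline, every write in the example follows the same shape (a condensed sketch of the pattern; add_todo below shows the full version):

let op = database.new_transaction()?;                      // start an atomic transaction
let todos = op.get_store::<Table<Todo>>("todos")?;         // get a typed store handle
let id = todos.insert(Todo::new("example".to_string()))?;  // stage changes
op.commit()?;                                              // finalize atomically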

4. The Todo Data Structure

The example defines a Todo struct that must implement Serialize and Deserialize to work with Eidetica:

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Todo {
    pub title: String,
    pub completed: bool,
    pub created_at: DateTime<Utc>,
    pub completed_at: Option<DateTime<Utc>>,
}

impl Todo {
    pub fn new(title: String) -> Self {
        Self {
            title,
            completed: false,
            created_at: Utc::now(),
            completed_at: None,
        }
    }

    pub fn complete(&mut self) {
        self.completed = true;
        self.completed_at = Some(Utc::now());
    }
}

5. Adding a Todo

The add_todo() function shows how to insert data into a Table store:

fn add_todo(database: &Database, title: String) -> Result<()> {
    // Start an atomic transaction (uses default auth key)
    let op = database.new_transaction()?;

    // Get a handle to the 'todos' Table store
    let todos_store = op.get_store::<Table<Todo>>("todos")?;

    // Create a new todo
    let todo = Todo::new(title);

    // Insert the todo into the Table
    // The Table will generate a unique ID for it
    let todo_id = todos_store.insert(todo)?;

    // Commit the transaction
    op.commit()?;

    println!("Added todo with ID: {todo_id}");

    Ok(())
}

6. Updating a Todo

The complete_todo() function demonstrates reading and updating data:

fn complete_todo(database: &Database, id: &str) -> Result<()> {
    // Start an atomic transaction (uses default auth key)
    let op = database.new_transaction()?;

    // Get a handle to the 'todos' Table store
    let todos_store = op.get_store::<Table<Todo>>("todos")?;

    // Get the todo from the Table
    let mut todo = todos_store.get(id)?;

    // Mark the todo as complete
    todo.complete();

    // Update the todo in the Table
    todos_store.set(id, todo)?;

    // Commit the transaction
    op.commit()?;

    Ok(())
}

These examples show the typical pattern: start a transaction, get a store handle, perform operations, and commit.

7. Y-CRDT Integration (YDoc)

The example also uses YDoc stores for user information and preferences. Y-CRDTs are designed for collaborative editing:

fn set_user_info(
    database: &Database,
    name: Option<&String>,
    email: Option<&String>,
    bio: Option<&String>,
) -> Result<()> {
    // Start an atomic transaction (uses default auth key)
    let op = database.new_transaction()?;

    // Get a handle to the 'user_info' YDoc store
    let user_info_store = op.get_store::<YDoc>("user_info")?;

    // Update user information using the Y-CRDT document
    user_info_store.with_doc_mut(|doc| {
        let user_info_map = doc.get_or_insert_map("user_info");
        let mut txn = doc.transact_mut();

        if let Some(name) = name {
            user_info_map.insert(&mut txn, "name", name.clone());
        }
        if let Some(email) = email {
            user_info_map.insert(&mut txn, "email", email.clone());
        }
        if let Some(bio) = bio {
            user_info_map.insert(&mut txn, "bio", bio.clone());
        }

        Ok(())
    })?;

    // Commit the transaction
    op.commit()?;
    Ok(())
}

The example demonstrates using different store types in one database:

  • "todos" (Table<Todo>): Stores todo items with automatic ID generation
  • "user_info" (YDoc): Stores user profile using Y-CRDT Maps
  • "user_prefs" (YDoc): Stores preferences using Y-CRDT Maps

This shows how you can choose the most appropriate data structure for each type of data.

Running the Todo Example

To see these concepts in action, you can run the Todo example:

# Navigate to the example directory
cd examples/todo

# Build the example
cargo build

# Run commands (this will create todo_db.json)
cargo run -- add "Learn Eidetica"
cargo run -- list
# Note the ID printed
cargo run -- complete <id_from_list>
cargo run -- list

Refer to the example's README.md and test.sh for more usage details.

This walkthrough provides a starting point. Explore the Eidetica documentation and other examples to learn about more advanced features like different store types, history traversal, and distributed capabilities.

Code Examples

This page provides focused code snippets for common tasks in Eidetica.

Assumes basic setup like use eidetica::{Instance, Database, Error, ...}; and error handling (?) for brevity.

1. Initializing the Database (Instance)

extern crate eidetica;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc};
use std::path::PathBuf;

fn main() -> eidetica::Result<()> {
// Use a temporary file for testing
let temp_dir = std::env::temp_dir();
let db_path = temp_dir.join("eidetica_example_init.json");

// First create and save a test database to demonstrate loading
let backend = InMemory::new();
let test_instance = Instance::open(Box::new(backend))?;
test_instance.create_user("alice", None)?;
let mut test_user = test_instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "example_db");
let test_key = test_user.get_default_key()?;
let _database = test_user.create_database(settings, &test_key)?;
let database_guard = test_instance.backend();
if let Some(in_memory) = database_guard.as_any().downcast_ref::<InMemory>() {
    in_memory.save_to_file(&db_path)?;
}

// Option A: Create a new, empty in-memory database
let database_new = InMemory::new();
let _db_new = Instance::open(Box::new(database_new))?;

// Option B: Load from a previously saved file
if db_path.exists() {
    match InMemory::load_from_file(&db_path) {
        Ok(database_loaded) => {
            let _db_loaded = Instance::open(Box::new(database_loaded))?;
            println!("Database loaded successfully.");
            // Use db_loaded
        }
        Err(e) => {
            eprintln!("Error loading database: {}", e);
            // Handle error, maybe create new
        }
    }
} else {
    println!("Database file not found, creating new.");
    // Use db_new from Option A
}

// Clean up the temporary file
if db_path.exists() {
    std::fs::remove_file(&db_path).ok();
}
Ok(())
}

2. Creating or Loading a Database

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc};

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let tree_name = "my_app_data";

let database = match instance.find_database(tree_name) {
    Ok(mut databases) => {
        println!("Found existing database: {}", tree_name);
        databases.pop().unwrap() // Assume first one is correct
    }
    Err(e) if e.is_not_found() => {
        println!("Creating new database: {}", tree_name);
        let mut doc = Doc::new();
        doc.set("name", tree_name);
        let default_key = user.get_default_key()?;
        user.create_database(doc, &default_key)?
    }
    Err(e) => return Err(e.into()), // Propagate other errors
};

println!("Using Database with root ID: {}", database.root_id());
Ok(())
}

3. Writing Data (DocStore Example)

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::DocStore};

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Start an authenticated transaction (automatically uses the database's default key)
let op = database.new_transaction()?;

{
    // Get the DocStore store handle (scoped)
    let config_store = op.get_store::<DocStore>("configuration")?;

    // Set some values
    config_store.set("api_key", "secret-key-123")?;
    config_store.set("retry_count", "3")?;

    // Overwrite a value
    config_store.set("api_key", "new-secret-456")?;

    // Remove a value
    config_store.delete("old_setting")?; // Ok if it doesn't exist
}

// Commit the changes atomically
let entry_id = op.commit()?;
println!("DocStore changes committed in entry: {}", entry_id);
Ok(())
}

4. Writing Data (Table Example)

extern crate eidetica;
extern crate serde;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::Table};
use serde::{Serialize, Deserialize};

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
struct Task {
    description: String,
    completed: bool,
}

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Start an authenticated transaction (automatically uses the database's default key)
let op = database.new_transaction()?;
let inserted_id;

{
    // Get the Table handle
    let tasks_store = op.get_store::<Table<Task>>("tasks")?;

    // Insert a new task
    let task1 = Task { description: "Buy milk".to_string(), completed: false };
    inserted_id = tasks_store.insert(task1)?;
    println!("Inserted task with ID: {}", inserted_id);

    // Insert another task
    let task2 = Task { description: "Write docs".to_string(), completed: false };
    tasks_store.insert(task2)?;

    // Update the first task (requires getting it first if you only have the ID)
    if let Ok(mut task_to_update) = tasks_store.get(&inserted_id) {
        task_to_update.completed = true;
        tasks_store.set(&inserted_id, task_to_update)?;
        println!("Updated task {}", inserted_id);
    } else {
        eprintln!("Task {} not found for update?", inserted_id);
    }

    // Delete a task (if you knew its ID)
    // tasks_store.delete(&some_other_id)?;
}

// Commit all inserts/updates/deletes
let entry_id = op.commit()?;
println!("Table changes committed in entry: {}", entry_id);
Ok(())
}

5. Reading Data (DocStore Viewer)

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::DocStore};

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Get a read-only viewer for the latest state
let config_viewer = database.get_store_viewer::<DocStore>("configuration")?;

match config_viewer.get("api_key") {
    Ok(api_key) => println!("Current API Key: {}", api_key),
    Err(e) if e.is_not_found() => println!("API Key not set."),
    Err(e) => return Err(e.into()),
}

match config_viewer.get("retry_count") {
    Ok(count_str) => {
        // Note: DocStore values can be various types
        if let Some(text) = count_str.as_text() {
            if let Ok(count) = text.parse::<u32>() {
                println!("Retry Count: {}", count);
            }
        }
    }
    Err(_) => println!("Retry count not set or invalid."),
}
Ok(())
}

6. Reading Data (Table Viewer)

extern crate eidetica;
extern crate serde;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::Table};
use serde::{Serialize, Deserialize};

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
struct Task {
    description: String,
    completed: bool,
}

fn main() -> eidetica::Result<()> {
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
let op = database.new_transaction()?;
let tasks_store = op.get_store::<Table<Task>>("tasks")?;
let id_to_find = tasks_store.insert(Task { description: "Test task".to_string(), completed: false })?;
op.commit()?;

// Get a read-only viewer
let tasks_viewer = database.get_store_viewer::<Table<Task>>("tasks")?;

// Get a specific task by ID
match tasks_viewer.get(&id_to_find) {
    Ok(task) => println!("Found task {}: {:?}", id_to_find, task),
    Err(e) if e.is_not_found() => println!("Task {} not found.", id_to_find),
    Err(e) => return Err(e.into()),
}

// Search for all tasks
println!("\nAll Tasks:");
match tasks_viewer.search(|_| true) {
    Ok(tasks) => {
        for (id, task) in tasks {
            println!("  ID: {}, Task: {:?}", id, task);
        }
    }
    Err(e) => eprintln!("Error searching tasks: {}", e),
}
Ok(())
}

7. Working with Nested Data (Path-Based Operations)

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::DocStore, path, Database};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let instance = Instance::open(Box::new(InMemory::new()))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set("name", "test_db");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;
// Start an authenticated transaction (automatically uses the database's default key)
let op = database.new_transaction()?;

// Get the DocStore store handle
let user_store = op.get_store::<DocStore>("users")?;

// Using path-based operations to create and modify nested structures
// Set profile information using paths - creates nested structure automatically
user_store.set_path(path!("user123.profile.name"), "Jane Doe")?;
user_store.set_path(path!("user123.profile.email"), "jane@example.com")?;

// Set preferences using paths
user_store.set_path(path!("user123.preferences.theme"), "dark")?;
user_store.set_path(path!("user123.preferences.notifications"), "enabled")?;
user_store.set_path(path!("user123.preferences.language"), "en")?;

// Set additional nested configuration
user_store.set_path(path!("config.database.host"), "localhost")?;
user_store.set_path(path!("config.database.port"), "5432")?;

// Commit the changes
let entry_id = op.commit()?;
println!("Nested data changes committed in entry: {}", entry_id);

// Read back the nested data using path operations
let viewer_op = database.new_transaction()?;
let viewer_store = viewer_op.get_store::<DocStore>("users")?;

// Get individual values using path operations
let _name_value = viewer_store.get_path(path!("user123.profile.name"))?;
let _email_value = viewer_store.get_path(path!("user123.profile.email"))?;
let _theme_value = viewer_store.get_path(path!("user123.preferences.theme"))?;
let _host_value = viewer_store.get_path(path!("config.database.host"))?;

// Get the entire user object to verify nested structure was created
if let Ok(_user_data) = viewer_store.get("user123") {
    println!("User profile and preferences created successfully");
}

// Get the entire config object to verify nested structure
if let Ok(_config_data) = viewer_store.get("config") {
    println!("Configuration data created successfully");
}

println!("Path-based operations completed successfully");
Ok(())
}

8. Working with Y-CRDT Documents (YDoc)

The YDoc store provides access to Y-CRDT (Yrs) documents for collaborative data structures. This requires the "y-crdt" feature flag.

extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory, crdt::Doc, store::YDoc, Database};
use eidetica::y_crdt::{Map as YMap, Transact};

fn main() -> eidetica::Result<()> {
// Setup database for testing
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "y_crdt_example");
let default_key = user.get_default_key()?;
let database = user.create_database(settings, &default_key)?;

// Start an authenticated transaction (automatically uses the database's default key)
let op = database.new_transaction()?;

// Get the YDoc store handle
let user_info_store = op.get_store::<YDoc>("user_info")?;

// Writing to Y-CRDT document
user_info_store.with_doc_mut(|doc| {
    let user_info_map = doc.get_or_insert_map("user_info");
    let mut txn = doc.transact_mut();

    user_info_map.insert(&mut txn, "name", "Alice Johnson");
    user_info_map.insert(&mut txn, "email", "alice@example.com");
    user_info_map.insert(&mut txn, "bio", "Software developer");

    Ok(())
})?;

// Commit the transaction
let entry_id = op.commit()?;
println!("YDoc changes committed in entry: {}", entry_id);

// Reading from Y-CRDT document
let read_op = database.new_transaction()?;
let reader_store = read_op.get_store::<YDoc>("user_info")?;

reader_store.with_doc(|doc| {
    let user_info_map = doc.get_or_insert_map("user_info");
    let txn = doc.transact();

    println!("User Information:");

    if let Some(name) = user_info_map.get(&txn, "name") {
        let name_str = name.to_string(&txn);
        println!("Name: {name_str}");
    }

    if let Some(email) = user_info_map.get(&txn, "email") {
        let email_str = email.to_string(&txn);
        println!("Email: {email_str}");
    }

    if let Some(bio) = user_info_map.get(&txn, "bio") {
        let bio_str = bio.to_string(&txn);
        println!("Bio: {bio_str}");
    }

    Ok(())
})?;

// Working with nested Y-CRDT maps
let prefs_op = database.new_transaction()?;
let prefs_store = prefs_op.get_store::<YDoc>("user_prefs")?;

prefs_store.with_doc_mut(|doc| {
    let prefs_map = doc.get_or_insert_map("preferences");
    let mut txn = doc.transact_mut();

    prefs_map.insert(&mut txn, "theme", "dark");
    prefs_map.insert(&mut txn, "notifications", "enabled");
    prefs_map.insert(&mut txn, "language", "en");

    Ok(())
})?;

prefs_op.commit()?;

// Reading preferences
let prefs_read_op = database.new_transaction()?;
let prefs_read_store = prefs_read_op.get_store::<YDoc>("user_prefs")?;

prefs_read_store.with_doc(|doc| {
    let prefs_map = doc.get_or_insert_map("preferences");
    let txn = doc.transact();

    println!("User Preferences:");

    // Iterate over all preferences
    for (key, value) in prefs_map.iter(&txn) {
        let value_str = value.to_string(&txn);
        println!("{key}: {value_str}");
    }

    Ok(())
})?;
Ok(())
}

YDoc Features:

  • Collaborative Editing: Y-CRDT documents provide conflict-free merging for concurrent modifications
  • Rich Data Types: Support for Maps, Arrays, Text, and other Y-CRDT types
  • Functional Interface: Access via with_doc() for reads and with_doc_mut() for writes
  • Atomic Integration: Changes are staged within the Transaction and committed atomically

Use Cases for YDoc:

  • User profiles and preferences (as shown in the todo example)
  • Collaborative documents and shared state
  • Real-time data synchronization
  • Any scenario requiring conflict-free concurrent updates

9. Saving the Database (InMemory)

extern crate eidetica;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc};
use std::path::PathBuf;

fn main() -> eidetica::Result<()> {
// Create a test database
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;
instance.create_user("alice", None)?;
let mut user = instance.login_user("alice", None)?;
let mut settings = Doc::new();
settings.set_string("name", "save_example");
let default_key = user.get_default_key()?;
let _database = user.create_database(settings, &default_key)?;

// Use a temporary file for testing
let temp_dir = std::env::temp_dir();
let db_path = temp_dir.join("eidetica_save_example.json");

// Save the database to a file
let database_guard = instance.backend();

// Downcast to the concrete InMemory type
if let Some(in_memory_database) = database_guard.as_any().downcast_ref::<InMemory>() {
    match in_memory_database.save_to_file(&db_path) {
        Ok(_) => println!("Database saved successfully to {:?}", db_path),
        Err(e) => eprintln!("Error saving database: {}", e),
    }
} else {
    eprintln!("Database is not InMemory, cannot save to file this way.");
}

// Clean up the temporary file
if db_path.exists() {
    std::fs::remove_file(&db_path).ok();
}
Ok(())
}

Complete Example: Chat Application

For a full working example that demonstrates Eidetica in a real application, see the Chat Example in the repository.

The chat application showcases:

  • User Management: Automatic passwordless user creation with key management
  • Multiple Databases: Each chat room is a separate database
  • Table Store: Messages stored with auto-generated IDs
  • Multi-Transport Sync: HTTP for local testing, Iroh for P2P with NAT traversal
  • Bootstrap Protocol: Automatic access requests when joining rooms
  • Real-time Updates: Periodic message refresh with automatic sync
  • TUI Interface: Interactive terminal UI using Ratatui

Key Architectural Concepts

The chat example demonstrates several advanced patterns:

1. User API with Automatic Key Management

// Initialize instance with sync enabled
let backend = InMemory::new();
let instance = Instance::open(Box::new(backend))?;
instance.enable_sync()?;

// Create passwordless user (or use existing)
let username = "alice";
let _ = instance.create_user(username, None);

// Login to get User session (handles key management automatically)
let user = instance.login_user(username, None)?;

// User API automatically manages cryptographic keys for databases
let default_key = user.get_default_key()?;
println!("User {} has key: {}", username, default_key);

2. Room Creation with Global Access

// Create a chat room (database) with settings
let mut settings = Doc::new();
settings.set_string("name", "Team Chat");

let key_id = user.get_default_key()?;
let database = user.create_database(settings, &key_id)?;

// Add global wildcard permission so anyone can join and write
let tx = database.new_transaction()?;
let settings_store = tx.get_settings()?;
let global_key = auth::AuthKey::active("*", auth::Permission::Write(10))?;
settings_store.set_auth_key("*", global_key)?;
tx.commit()?;

println!("Chat room created with ID: {}", database.root_id());

3. Message Storage with Table

use chrono::{DateTime, Utc};
use uuid::Uuid;

#[derive(Debug, Clone, Serialize, Deserialize)]
struct ChatMessage {
    id: String,
    author: String,
    content: String,
    timestamp: DateTime<Utc>,
}

impl ChatMessage {
    fn new(author: String, content: String) -> Self {
        Self {
            id: Uuid::new_v4().to_string(),
            author,
            content,
            timestamp: Utc::now(),
        }
    }
}

// Send a message to the chat room
let message = ChatMessage::new("alice".to_string(), "Hello, world!".to_string());

let op = database.new_transaction()?;
let messages_store = op.get_store::<Table<ChatMessage>>("messages")?;
messages_store.insert(message)?;
op.commit()?;

// Read all messages
let viewer_op = database.new_transaction()?;
let viewer_store = viewer_op.get_store::<Table<ChatMessage>>("messages")?;
let all_messages = viewer_store.search(|_| true)?;

for (_, msg) in all_messages {
    println!("[{}] {}: {}", msg.timestamp.format("%H:%M:%S"), msg.author, msg.content);
}

4. Bootstrap Connection to Remote Room

// Join an existing room using bootstrap protocol
let room_address = "abc123def456@127.0.0.1:8080"; // From room creator

// Parse room address (format: room_id@server_address)
let parts: Vec<&str> = room_address.split('@').collect();
let room_id = eidetica::entry::ID::from(parts[0]);
let server_addr = parts[1];

// Enable sync transport
if let Some(sync) = instance.sync() {
    sync.enable_http_transport()?;

    // Request access to the room (bootstrap protocol)
    let key_id = user.get_default_key()?;
    user.request_database_access(
        &sync,
        server_addr,
        &room_id,
        &key_id,
        eidetica::auth::Permission::Write(10),
    ).await?;

    // Register the database with User's key manager
    user.add_database(eidetica::user::types::DatabasePreferences {
        database_id: room_id.clone(),
        key_id: key_id.clone(),
        sync_settings: eidetica::user::types::SyncSettings {
            sync_enabled: true,
            sync_on_commit: true,
            interval_seconds: None,
            properties: std::collections::HashMap::new(),
        },
    })?;

    // Open the synced database
    let database = user.open_database(&room_id)?;
    println!("Joined room successfully!");
}

5. Real-time Sync with Callbacks

// Automatic sync is configured via peer relationships
// When you add a peer for a database, commits automatically trigger sync
if let Some(sync) = instance.sync() {
    if let Ok(peers) = sync.list_peers() {
        if let Some(peer) = peers.first() {
            // Add tree sync relationship - this enables automatic sync on commit
            sync.add_tree_sync(&peer.pubkey, &database.root_id())?;

            println!("Automatic sync enabled for database");
        }
    }
}

// Manually trigger immediate sync for a specific database
sync.sync_with_peer(server_addr, Some(&database.root_id())).await?;

Running the Chat Example

# From the repository root
cd examples/chat

# Create a new room (default uses Iroh P2P transport)
cargo run -- --username alice

# Or use HTTP transport for local testing
cargo run -- --username alice --transport http

# Connect to an existing room
cargo run -- <room_address> --username bob

Creating a new room: When you run without a room address, the app will:

  1. Create a new room
  2. Display the room address that others can use to join
  3. Wait for you to press Enter before starting the chat interface

Example output:

🚀 Eidetica Chat Room Created!
📍 Room Address: abc123@127.0.0.1:54321
👤 Username: alice

Share this address with others to invite them to the chat.
Press Enter to start chatting...

Joining an existing room: When you provide a room address as the first argument, the app connects and starts the chat interface immediately.

Transport Options

HTTP Transport (--transport http):

  • Simple client-server model for local networks
  • Server binds to 127.0.0.1 with random port
  • Address format: room_id@127.0.0.1:PORT
  • Best for testing and same-machine demos

Iroh Transport (--transport iroh, default):

  • Peer-to-peer with built-in NAT traversal
  • Uses QUIC protocol with relay servers
  • Address format: room_id@{node-info-json}
  • Best for internet connections across networks

Architecture Highlights

The chat example demonstrates production-ready patterns:

  • Multi-database architecture: Each room is isolated with independent sync state
  • User session management: Automatic key discovery and database registration
  • Bootstrap protocol: Seamless joining of rooms with access requests
  • Dual transport support: Flexible networking for different environments
  • CRDT-based messages: Eventual consistency with deterministic ordering
  • Automatic sync: Background synchronization triggered by commits via callbacks

See the full chat example documentation for detailed usage instructions, complete workflow examples, troubleshooting tips, and implementation details.

Eidetica Architecture Overview

Eidetica is a decentralized database designed to "Remember Everything." This document outlines the architecture and how different components interact with each other.

Eidetica is built on a foundation of content-addressable entries organized in databases, with a pluggable backend system for storage. Entry objects are immutable and contain Tree/SubTree structures that form the Merkle-DAG, with integrated authentication using Ed25519 digital signatures. The system provides Database and Store abstractions over these internal structures to enable efficient merging and synchronization of distributed data.

See the Core Components section for details on the key building blocks.

graph TD
    A[User Application] --> B[Instance]
    B --> C[Database]
    C --> T[Transaction]
    T --> S[Stores: DocStore, Table, etc.]

    subgraph Backend Layer
        C --> BE[Backend: InMemory, etc.]
        BE --> D[Entry Storage]
    end

    subgraph Entry Internal Structure
        H[EntryBuilder] -- builds --> E[Entry]
        E -- contains --> I[TreeNode]
        E -- contains --> J[SubTreeNode Vector]
        E -- contains --> K[SigInfo]
        I --> IR[Root ID, Parents, Metadata]
        J --> JR[Name, Parents, Data]
    end

    subgraph Authentication System
        K --> N[SigKey]
        K --> O[Signature]
        L[AuthValidator] -- validates --> E
        L -- uses --> M[_settings subtree]
        Q[CryptoModule] -- signs/verifies --> E
    end

    subgraph User Abstractions
        C -.-> |"provides view over"| I
        S -.-> |"provides view over"| J
    end

    T -- uses --> H
    H -- stores --> BE
    C -- uses --> L
    S -- modifies --> J

Architectural Terminology

This document clarifies the important distinction between internal data structure names and user-facing API abstractions in Eidetica's architecture.

Overview

Eidetica uses two parallel naming schemes that serve different purposes:

  1. Internal Data Structures: TreeNode/SubTreeNode - the actual Merkle-DAG data structures
  2. User-Facing Abstractions: Database/Store - high-level views over these structures

Understanding this distinction is crucial for maintaining consistency in code, documentation, and APIs.

Internal Data Structures

TreeNode and SubTreeNode

These are the fundamental building blocks of the Merkle-DAG, defined within the Entry module:

  • TreeNode: The internal representation of the main tree node within an Entry

    • Contains the root ID, parent references, and metadata
    • Represents the core structural data of the Merkle-DAG
    • Always singular per Entry
  • SubTreeNode: The internal representation of named subtree nodes within an Entry

    • Contains subtree name, parent references, and data payload
    • Multiple SubTreeNodes can exist per Entry
    • Each represents a named partition of data (analogous to tables)

When to Use Tree/SubTree Terminology

  • When discussing the actual data structures within Entry
  • In Entry module documentation and implementation
  • When describing the Merkle-DAG at the lowest level
  • In comments that deal with the serialized data format
  • When explaining parent-child relationships in the DAG

User-Facing Abstractions

Database and Store

These represent the current high-level abstraction layer that users interact with:

  • Database: A collection of related entries with shared authentication and history

    • Provides a view over a tree of entries
    • Manages operations, authentication, and synchronization
    • What users think of as a "database" or "collection"
  • Store: Typed data access patterns within a database

    • DocStore, Table, YDoc are concrete Store implementations
    • Provide familiar APIs (key-value, document, collaborative editing)
    • Each Store operates on a named subtree within entries

When to Use Database/Store Terminology

  • In all public APIs and user-facing documentation
  • In user guides, tutorials, and examples
  • When describing current application-level concepts
  • In error messages shown to users
  • In logging that users might see

Future Abstraction Layers

Database/Store represents the current abstraction over TreeNode/SubTreeNode structures, but it is not the only possible abstraction. Future versions of Eidetica may introduce alternative abstraction layers that provide different views or APIs over the same underlying layered Merkle-DAG structures.

The key principle is that TreeNode/SubTreeNode remain the stable internal representation, while various abstractions can be built on top to serve different use cases or API paradigms.

The Relationship

User Application
       ↓
    Database  ←─ User-facing abstraction
       ↓
   Transaction ←─ Operations layer
       ↓
     Entry    ←─ Contains TreeNode + SubTreeNodes
       ↓         (internal data structures)
   Backend    ←─ Storage layer

  • A Database provides operations over a tree of Entry objects
  • Each Entry contains one TreeNode and multiple SubTreeNode structures
  • Store implementations provide typed access to specific SubTreeNode data
  • Users never directly interact with TreeNode/SubTreeNode

Code Guidelines

Internal Implementation

// Correct - dealing with Entry internals
entry.tree.root  // TreeNode field
entry.subtrees.iter()  // SubTreeNode collection
builder.set_subtree_data_mut()  // Working with subtree data structures

Public APIs

// Correct - user-facing abstractions
database.new_transaction()  // Database operations
transaction.get_store::<DocStore>("users")  // Store access
instance.create_database("mydata")  // Database management

Documentation

  • Internal docs: Can reference both levels, explaining their relationship
  • User guides: Only use Database/Store terminology
  • API docs: Use Database/Store exclusively
  • Code comments: Use appropriate terminology for the level being discussed

Rationale

This dual naming scheme serves several important purposes:

  1. Separation of Concerns: Internal structures focus on correctness and efficiency, while abstractions focus on usability

  2. API Stability: Users interact with stable Database/Store concepts, while internal TreeNode/SubTreeNode structures can evolve

  3. Conceptual Clarity: Users think in terms of databases and data stores, not Merkle-DAG nodes

  4. Implementation Flexibility: Internal refactoring doesn't affect user-facing terminology

  5. Domain Appropriateness: Tree/Subtree accurately describes the Merkle-DAG structure, while Database/Store matches user mental models

Core Components

The architectural foundation of Eidetica, implementing the Merkle-CRDT design principles through a carefully orchestrated set of interconnected components.

Component Overview

These components work together to provide Eidetica's unique combination of features: content-addressable storage, cryptographic authentication, conflict-free synchronization, and flexible data access patterns.

Architecture Layers

Entry: The fundamental data unit containing TreeNode and SubTreeNode structures - immutable, content-addressable, and cryptographically signed

Database: User-facing abstraction providing operations over trees of entries with independent history and authentication policies

Instance: The main database orchestration layer managing databases, authentication, and storage

User System: Multi-user account management with per-user key storage, database tracking, and sync preferences

Transaction: Transaction mechanism providing atomic operations across multiple stores

Data Access and Storage

Stores: User-facing typed data access patterns (DocStore, Table, YDoc) that provide application-friendly interfaces over subtree data

Backend: Pluggable storage abstraction supporting different persistence strategies

CRDT: Conflict-free data types enabling distributed merging and synchronization

Security and Synchronization

Authentication: Ed25519-based cryptographic system for signing and verification

Synchronization: Distributed sync protocols built on the Merkle-DAG foundation

Terminology Note

Eidetica uses a dual terminology system:

  • Internal structures: TreeNode/SubTreeNode refer to the actual Merkle-DAG data structures within entries
  • User abstractions: Database/Store refer to the high-level APIs and concepts users interact with

See Terminology for detailed guidelines on when to use each naming scheme.

Entry

The fundamental building block of Eidetica's data model, representing an immutable, cryptographically-signed unit of data within the Merkle-DAG structure.

Conceptual Role

Entries serve as the atomic units of both data storage and version history in Eidetica. They combine the functions of:

  • Data Container: Holding actual application data and metadata
  • Version Node: Linking to parent entries to form a history DAG
  • Authentication Unit: Cryptographically signed to ensure integrity and authorization
  • Content-Addressable Object: Uniquely identified by their content hash for deduplication and verification

Internal Data Structure

Entry contains two fundamental internal data structures that form the Merkle-DAG:

TreeNode: The main tree node containing:

  • Root ID of the tree this entry belongs to
  • Parent entry references for the main tree history
  • Optional metadata (not merged with other entries)

SubTreeNodes: Named subtree nodes, each containing:

  • Subtree name (analogous to store/table names)
  • Parent entry references specific to this subtree's history
  • Serialized CRDT data payload for this subtree

Authentication Envelope: Every entry includes signature information that proves authorization and ensures tamper-detection.

Relationship to User Abstractions

While entries internally use TreeNode and SubTreeNode structures, users interact with higher-level abstractions:

  • Database: Provides operations over the tree of entries (uses TreeNode data)
  • Stores: Typed access patterns (DocStore, Tables, etc.) over subtree data (uses SubTreeNode data)

This separation allows the internal Merkle-DAG structures to remain efficient and correct while providing user-friendly APIs.

Identity and Integrity

Content-Addressable Identity: Each entry's ID is a SHA-256 hash of its canonical content, making entries globally unique and enabling efficient deduplication.

Deterministic Hashing: IDs are computed from a canonical JSON representation, ensuring identical entries produce identical IDs across different systems.

Immutability Guarantee: Once created, entries cannot be modified, ensuring the integrity of the historical record and cryptographic signatures.
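
As a sketch of this scheme (the library defines the exact canonical form; the sha2 and hex crates are used here purely for illustration):

use sha2::{Digest, Sha256};

// ID = SHA-256 over the canonical JSON bytes, rendered as 64 lowercase hex characters
fn entry_id(canonical_json: &str) -> String {
    hex::encode(Sha256::digest(canonical_json.as_bytes()))
}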

Design Benefits

Distributed Synchronization: Content-addressable IDs enable efficient sync protocols where systems can identify missing or conflicting entries.

Cryptographic Verification: Signed entries provide strong guarantees about data authenticity and integrity.

Granular History: The DAG structure enables sophisticated queries like "show me all changes since timestamp X" or "merge these two concurrent branches".

Efficient Storage: Identical entries are automatically deduplicated, and metadata can be stored separately from bulk data.

ID Format Requirements

All IDs in Eidetica must be valid SHA-256 hashes represented as 64-character lowercase hexadecimal strings. This includes:

  • Tree root IDs: The ID of the root entry of a tree
  • Main tree parent IDs: Parent entries in the main tree
  • Subtree parent IDs: Parent entries within specific subtrees
  • Entry IDs: Content-addressable identifiers for entries themselves

Valid ID Format

  • Length: Exactly 64 characters
  • Characters: Only lowercase hexadecimal (0-9, a-f)
  • Example: a1b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef123456

Invalid ID Examples

❌ "parent_id"           # Too short, not hex
❌ "ABCD1234..."         # Uppercase letters
❌ "abcd-1234-..."       # Contains hyphens
❌ "12345678901234567890123456789012345678901234567890123456789012345"  # 63 chars (too short)

Internal Data Structure Detail

An Entry contains the following internal data structures:

struct Entry {
    tree: TreeNode,             // Main tree node - the core Merkle-DAG structure
    subtrees: Vec<SubTreeNode>, // Named subtree nodes - independent data partitions
    sig: SigInfo,               // Authentication and signature information
}

struct TreeNode {
    root: ID,                   // Root entry ID of the tree
    parents: Vec<ID>,           // Parent entries in main tree history
    metadata: Option<RawData>,  // Optional metadata (not merged)
}

struct SubTreeNode {
    name: String,               // Subtree name (e.g., "users", "posts")
    parents: Vec<ID>,           // Parent entries specific to this subtree
    data: RawData,              // Serialized CRDT data for this subtree
}

struct SigInfo {
    sig: Option<String>,        // Base64-encoded Ed25519 signature
    key: SigKey,                // Reference to signing key
}

// Where:
type RawData = String;          // JSON-serialized CRDT structures
type ID = String;               // SHA-256 content hash (hex-encoded)

Key Design Points

  • TreeNode: Represents the entry's position in the main Merkle-DAG tree structure
  • SubTreeNodes: Enable independent histories for different data partitions within the same entry
  • Separation: The tree structure (TreeNode) is separate from the data partitions (SubTreeNodes)
  • Multiple Histories: Each entry can participate in one main tree history plus multiple independent subtree histories

Backend

Pluggable storage abstraction layer supporting different storage implementations.

Architecture

The backend system has two layers:

  • BackendImpl trait: The storage trait that backends implement
  • Backend wrapper: Instance-level wrapper providing future local/remote dispatch

BackendImpl Trait

Abstracts underlying storage to allow different backends without changing core logic.

Core Operations:

  • Entry storage and retrieval by content-addressable ID
  • Verification status tracking for authentication
  • Database and store tip calculation
  • Topological sorting for consistent entry ordering

Current Implementation

InMemory: HashMap-based storage with JSON file persistence

  • Stores entries and verification status
  • Includes save/load functionality for state preservation
  • Supports all BackendImpl trait operations

Verification Status

Verified: Entry cryptographically verified and authorized

Unverified: Entry lacks authentication or failed verification

Status determined during commit based on signature validation and permission checking.

Key Features

Entry Storage: Immutable entries with content-addressable IDs

Tip Calculation: Identifies entries with no children in databases/stores

Height Calculation: Computes topological heights for proper ordering

Graph Traversal: Efficient DAG navigation for database operations

Custom Backend Implementation

Implement BackendImpl trait with:

  1. Storage-specific logic for all trait methods
  2. Verification status tracking support
  3. Thread safety (Send + Sync + Any)
  4. Performance considerations for graph operations

The Backend wrapper will automatically delegate operations to your BackendImpl implementation.
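
A skeleton along these lines may help when starting a custom backend (a sketch only: method names, types, and error handling here are placeholders rather than the real BackendImpl signatures):

use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical in-memory backend; RwLock provides the required thread safety.
struct MyBackend {
    entries: RwLock<HashMap<ID, (Entry, VerificationStatus)>>,
}

impl MyBackend {
    // Placeholder for a trait method: retrieval by content-addressable ID
    fn get(&self, id: &ID) -> Option<Entry> {
        self.entries.read().unwrap().get(id).map(|(e, _)| e.clone())
    }

    // Placeholder for a trait method: store an entry with its verification status
    fn put(&self, entry: Entry, status: VerificationStatus) {
        self.entries.write().unwrap().insert(entry.id(), (entry, status));
    }

    // Tip calculation, topological sorting, etc. would follow the same pattern.
}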

Instance

Purpose and Architecture

Instance manages the multi-user infrastructure and system resources. It separates infrastructure management from contextual operations, providing user account management and coordinating with pluggable storage backends. All contextual operations (database creation, key management) run through User sessions after login.

Each Instance maintains a unique device identity (_device_key) through an automatically-generated Ed25519 keypair, enabling system database authentication and secure multi-device synchronization.

Key Responsibilities

User Management: Creates and authenticates user accounts with optional password protection.

System Database Management: Maintains system databases (_instance, _users, _databases) for infrastructure operations.

Backend Coordination: Interfaces with pluggable storage backends (currently just InMemory) while abstracting storage details from higher-level code.

Device Identity: Automatically maintains device-specific cryptographic identity (_device_key) for system operations and sync.

Design Principles

  • Infrastructure Focus: Instance manages infrastructure, User handles operations
  • User-Centric: All database and key operations run in User context after login
  • Pluggable Storage: Storage backends can be swapped without affecting application logic
  • Multi-User: Always multi-user underneath, supporting both passwordless and password-protected users
  • Sync-Ready: Built-in device identity and callbacks for distributed synchronization

Architecture Layers

Instance provides infrastructure management:

  • User Account Management: Create users with optional passwords, login to obtain User sessions
  • System Databases: Maintain _instance, _users, _databases for infrastructure
  • Backend Access: Coordinate storage operations through pluggable backends

User provides contextual operations (returned from login):

  • Database Operations: Create, load, and find databases in user context
  • Key Management: Add private keys, list keys, get signing keys
  • Session Management: Logout to clear decrypted keys from memory
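
A compact sketch of this split in practice (create_user and login_user follow the API examples later in this document; create_database and logout on the User session are assumed helper names based on the responsibilities listed above):

// Infrastructure: Instance manages accounts and storage
let instance = Instance::open(backend)?;
instance.create_user("alice", None)?;             // passwordless user

// Contextual: database and key operations go through the User session
let user = instance.login_user("alice", None)?;
user.add_private_key(Some("laptop_key"))?;        // key management
let database = user.create_database("mydata")?;   // assumed helper name
user.logout()?;                                   // clears decrypted keys from memory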

Sync Integration

Instance can be extended with synchronization capabilities via enable_sync():

// Enable sync on an instance
let instance = Instance::open(backend)?.enable_sync()?;

// Access sync module via Arc (cheap to clone, thread-safe)
let sync = instance.sync().expect("Sync enabled");

Design:

  • Optional feature: Sync is opt-in via enable_sync() method
  • Arc-based sharing: sync() returns Option<Arc<Sync>>
  • Thread-safe: Arc<Sync> can be shared across threads without additional locking
  • Interior mutability: Sync uses AtomicBool and OnceLock internally, eliminating need for Mutex wrapper
  • Single accessor: Only sync() method (no separate mutable accessor needed)

This design eliminates deadlock risks and simplifies the API by avoiding MutexGuard lifetime management.

Database

Represents an independent, versioned collection of data entries within Eidetica, analogous to a database in a traditional database management system.

Conceptual Model

Databases organize related data entries into a coherent unit with its own history and authentication policies. Each Database is identified by its root entry's content-addressable ID, making it globally unique and verifiable.

Unlike traditional databases, Databases maintain full historical data through a Merkle DAG structure, enabling features like:

  • Conflict-free merging of concurrent changes
  • Cryptographic verification of data integrity
  • Decentralized synchronization across devices
  • Point-in-time queries (unimplemented)

Architecture and Lifecycle

Database Creation: Initialized with settings (stored as a Doc CRDT) and associated with an authentication key for signing operations. Database holds a weak reference to its parent Instance for storage access.

Data Access: Applications interact with Databases through Transaction instances, which provide transactional semantics and store access.

Storage Coordination: Database accesses storage through Instance using weak references, preventing circular dependencies while maintaining clear ownership hierarchy.

Entry History: Each operation creates new entries that reference their parents, building an immutable history DAG.

Settings Management: Database-level configuration (permissions, sync settings, etc.) is stored as CRDT data, allowing distributed updates.

Authentication

Each Database maintains its own authentication configuration in the special _settings store. All entries must be cryptographically signed with Ed25519 signatures - there are no unsigned entries in Eidetica.

Databases support direct keys, delegation to other databases for flexible cross-project authentication, and a three-tier permission hierarchy (Admin, Write, Read) with priority-based key management. Authentication changes merge deterministically using Last-Write-Wins semantics.

For complete details, see Authentication.

Integration Points

Store Access: Databases provide typed access to different data structures (DocStore, Table, YDoc) through the store system.

Synchronization: Databases serve as the primary unit of synchronization, with independent merge and conflict resolution.

User System

Purpose and Architecture

The User system provides multi-user account management with per-user key management, database tracking, and sync preferences. Each user maintains their own encrypted private database for storing keys, database preferences, and personal settings.

Key Responsibilities

Account Management: User creation, login/logout with optional password protection, and session management.

Key Management: Per-user encryption keys with secure storage, key-to-SigKey mapping for database access, and automatic SigKey discovery.

Database Tracking: Per-user list of tracked databases with individual sync preferences, automatic permission discovery, and preference management.

Secure Storage: User data stored in a private database, with password-based encryption for the private keys of password-protected users.

Design Principles

  • Session-Based: All user operations happen through User session objects obtained via login
  • Secure by Default: User keys never stored in plaintext, passwords hashed with Argon2id
  • Separation of Concerns: User manages preferences, other modules read preferences and adjust Instance behavior
  • Auto-Discovery: Automatic SigKey discovery using database permissions
  • Multi-User Support: Different users can have different preferences for the same database

Data Model

UserKey

Each user has one or more cryptographic keys for database authentication:

pub struct UserKey {
    /// Unique identifier for this key
    pub key_id: String,

    /// Encrypted private key bytes
    pub encrypted_key: Vec<u8>,

    /// Encryption nonce
    pub nonce: Vec<u8>,

    /// Per-database SigKey mappings
    pub database_sigkeys: HashMap<ID, String>,

    /// When this key was created
    pub created_at: u64,

    /// Optional display name
    pub display_name: Option<String>,
}

The database_sigkeys HashMap maps database IDs to SigKey identifiers, allowing each user key to authenticate with multiple databases using different SigKeys.

UserDatabasePreferences

Tracks which databases a user wants to sync and their sync configuration:

pub struct UserDatabasePreferences {
    /// Database ID being tracked
    pub database_id: ID,

    /// Which user key to use for this database
    pub key_id: String,

    /// User's sync preferences for this database
    pub sync_settings: SyncSettings,

    /// When user added this database
    pub added_at: u64,
}

SyncSettings

Per-database sync configuration:

pub struct SyncSettings {
    /// Whether user wants to sync this database
    pub sync_enabled: bool,

    /// Sync on commit
    pub sync_on_commit: bool,

    /// Sync interval in seconds (if periodic)
    pub interval_seconds: Option<u64>,

    /// Additional sync configuration
    pub properties: HashMap<String, String>,
}

Field Behavior:

  • sync_enabled: Master switch for syncing this database
  • sync_on_commit: Trigger sync immediately when committing changes
  • interval_seconds: Periodic sync interval in seconds
    • Some(n): Sync automatically every n seconds
    • None: No periodic sync
  • properties: Extensible key-value pairs for future features

Multi-User Merging:

When multiple users track the same database, their settings are merged using the "most aggressive" strategy:

  • sync_enabled: OR (true if any user enables sync)
  • sync_on_commit: OR (true if any user wants commit-sync)
  • interval_seconds: MIN (most frequent sync wins)
  • properties: UNION (combine all properties, later values override)

This ensures the database syncs as frequently and aggressively as any user prefers.
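
A sketch of this merge over the SyncSettings struct above:

// "Most aggressive" combination of two users' settings for one database
fn merge_settings(a: &SyncSettings, b: &SyncSettings) -> SyncSettings {
    let mut properties = a.properties.clone();
    properties.extend(b.properties.clone());                  // UNION; later values override
    SyncSettings {
        sync_enabled: a.sync_enabled || b.sync_enabled,       // OR
        sync_on_commit: a.sync_on_commit || b.sync_on_commit, // OR
        interval_seconds: match (a.interval_seconds, b.interval_seconds) {
            (Some(x), Some(y)) => Some(x.min(y)),             // MIN: most frequent wins
            (x, y) => x.or(y),
        },
        properties,
    }
}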

Storage Architecture

Each user has a private database: user:<username>

  • keys Table: Stores UserKey entries with encrypted private keys
  • databases Table: Stores UserDatabasePreferences for tracked databases
  • settings DocStore: User preferences and configuration

Database Tracking Flow

When a user adds a database to track:

  1. Validate Input: Check database isn't already tracked, verify key_id exists
  2. Derive Public Key: Get public key from the user's private key
  3. Auto-Discovery: Call Database::find_sigkeys() with user's public key
  4. Permission Sorting: Results sorted by permission level (Admin > Write > Read)
  5. Select Best: Choose highest-permission SigKey from results
  6. Store Mapping: Save SigKey mapping in UserKey.database_sigkeys
  7. Save Preferences: Store UserDatabasePreferences in databases Table
  8. Commit: Changes persisted to backend

This automatic discovery eliminates the need for users to manually specify which SigKey to use - the system finds the best available access level.

Key Management

Adding Keys

user.add_private_key(Some("backup_key"))?;

Keys are:

  1. Generated as Ed25519 keypairs
  2. Encrypted using user's encryption key (derived from password or master key)
  3. Stored in the user's private database
  4. Never persisted in plaintext

Key-to-SigKey Mapping

user.map_key("my_key", &db_id, "sigkey_id")?;

Manual mapping is supported for advanced use cases, but most applications use auto-discovery via add_database().

Default Keys

Each user has a default key (usually created during account creation) accessible via:

let default_key_id = user.get_default_key()?;

API Surface

User Creation and Login

// Create user (on Instance)
instance.create_user("alice", Some("password"))?;

// Login to get User session
let user = instance.login_user("alice", Some("password"))?;

Database Tracking

// Add database to tracking
let prefs = DatabasePreferences {
    database_id: db_id.clone(),
    key_id: user.get_default_key()?,
    sync_settings: SyncSettings {
        sync_enabled: true,
        sync_on_commit: false,
        interval_seconds: Some(60),
        properties: Default::default(),
    },
};
user.add_database(prefs)?;

// List tracked databases
let databases = user.list_database_prefs()?;

// Get specific preferences
let prefs = user.database_prefs(&db_id)?;

// Update preferences (upsert behavior)
user.set_database(new_prefs)?;

// Remove from tracking
user.remove_database(&db_id)?;

// Load a tracked database
let database = user.open_database(&db_id)?;

Key Management

// Add a key
user.add_private_key(Some("device_key"))?;

// List all keys
let keys = user.list_keys()?;

// Get default key
let default = user.get_default_key()?;

// Set database-specific SigKey mapping
user.map_key("my_key", &db_id, "sigkey_id")?;

Security Considerations

Password Protection

Password-protected users use Argon2id for key derivation:

let config = argon2::Config {
    variant: argon2::Variant::Argon2id,
    // ... secure parameters
};

This provides resistance against:

  • Brute force attacks
  • Rainbow table attacks
  • Side-channel timing attacks

Key Storage

  • Private keys encrypted at rest
  • Decrypted keys only held in memory during User session
  • Keys cleared from memory on logout
  • No plaintext key material ever persisted

Permission System Integration

The User system integrates with the permission system via SigKey discovery:

  1. User's public key derived from private key
  2. Database queried for SigKeys matching that public key
  3. Results include permission level (Direct or DelegationPath)
  4. Highest permission selected automatically
  5. Currently only Direct SigKeys supported (DelegationPath planned)

Multi-User Support

Different users can track the same database with different preferences:

  • Each user has independent tracking lists
  • Each user can use different keys for the same database
  • Each user can configure different sync settings
  • No coordination needed between users

Transaction

Atomic transaction mechanism for database modifications.

Lifecycle

  1. Creation: Initialize with current database tips as parents
  2. Store Access: Get typed handles for data manipulation
  3. Staging: Accumulate changes in internal entry
  4. Commit: Sign, validate, and store finalized entry

Features

  • Multiple store changes in single commit
  • Automatic authentication using database's default key
  • Type-safe store access
  • Cryptographic signing and validation
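
In application code, the lifecycle follows the pattern from the user guide (store method names per the store documentation):

let txn = database.new_transaction()?;            // parents = current database tips
let users = txn.get_store::<DocStore>("users")?;  // typed store handle
users.set("name", "Alice")?;                      // staged in the internal entry
txn.commit()?;                                    // sign, validate, store atomically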

Integration

Entry Management: Creates and manages entries via EntryBuilder

Authentication: Signs operations and validates permissions

CRDT Support: Enables store conflict resolution

Backend Storage: Stores entries with verification status

Authentication Validation

Transaction commit includes comprehensive authentication validation that distinguishes between valid auth states and corrupted configurations.

Validation Process

During commit() (transaction/mod.rs ~line 938-960), the system validates authentication configuration:

  1. Extract effective settings: Get _settings state at commit time
  2. Check for tombstone: Use is_tombstone("auth") to detect deleted auth
  3. Retrieve auth value: Use get("auth") to get configuration
  4. Validate type: Ensure auth is Doc type (if present)
  5. Parse auth settings: Convert Doc to AuthSettings
  6. Validate operation: Check signature and permissions
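
In pseudocode, the commit-time check behaves roughly as follows (a sketch: names such as effective_settings, is_doc, and validate_operation are placeholders, not real API):

// Steps 1-6 from the list above, in placeholder form
let settings = self.effective_settings()?;                     // 1: settings at commit time
if settings.is_tombstone("auth") {                             // 2: deleted auth
    return Err(TransactionError::CorruptedAuthConfiguration.into());
}
match settings.get("auth") {                                   // 3: retrieve auth value
    None => { /* unsigned mode: nothing to validate */ }
    Some(v) if v.is_doc() => {                                 // 4: must be a Doc
        let auth = AuthSettings::try_from(v)?;                 // 5: parse
        auth.validate_operation(&self.entry, &self.sig_info)?; // 6: signature + permissions
    }
    Some(_) => return Err(TransactionError::CorruptedAuthConfiguration.into()),
}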

Error Types

Defined in transaction/errors.rs:

  • AuthenticationRequired: Unsigned op attempted in signed mode
  • NoAuthConfiguration: Auth lookup failed in signed mode
  • CorruptedAuthConfiguration: Auth has wrong type or is deleted
  • SigningKeyNotFound: Requested signing key doesn't exist
  • InsufficientPermissions: Key lacks required permissions

All are classified as authentication errors via is_authentication_error().

Authentication

Comprehensive Ed25519-based cryptographic authentication system that ensures data integrity and access control across Eidetica's distributed architecture.

Overview

Eidetica provides flexible authentication supporting both unsigned and signed modes, although signed databases are the default. Databases that lack authentication are used for specialized purposes, such as local-only databases or 'overlays'.

Once authentication is configured, all operations require valid Ed25519 signatures, providing strong guarantees about data authenticity and enabling access control in decentralized environments.

The authentication system is deeply integrated with the core database, not merely a consumer of the API. This tight integration enables efficient validation, deterministic conflict resolution during network partitions, and preservation of historical validity.

Authentication States

Databases operate in one of four authentication configuration states:

| State | `_settings.auth` Value | Unsigned Ops | Authenticated Ops | Transition | Error Type |
|-------|------------------------|--------------|-------------------|------------|------------|
| Unsigned Mode | Missing or `{}` (empty Doc) | ✓ Allowed | ✓ Bootstrap | → Signed | N/A |
| Signed Mode | Valid keys configured | ✗ Rejected | ✓ Validated | Permanent | AuthenticationRequired |
| Corrupted | Wrong type (String, etc.) | ✗ Rejected | ✗ Rejected | → Fail-safe | CorruptedAuthConfiguration |
| Deleted | Tombstone (was deleted) | ✗ Rejected | ✗ Rejected | → Fail-safe | CorruptedAuthConfiguration |

State Semantics:

  • Unsigned Mode: Database has no authentication configured (missing or empty _settings.auth). Both missing and empty {} are equivalent. Unsigned operations succeed, authenticated operations trigger automatic bootstrap.

  • Signed Mode: Database has at least one key configured in _settings.auth. All operations require valid signatures. This is a permanent state - cannot return to unsigned mode.

  • Corrupted: Authentication configuration exists but has wrong type (not a Doc). Fail-safe behavior: ALL operations rejected to prevent security bypass through corruption.

  • Deleted: Authentication configuration was explicitly deleted (CRDT tombstone). Fail-safe behavior: ALL operations rejected since this indicates invalid security state.

Fail-Safe Principle: When auth configuration is corrupted or deleted, the system rejects ALL operations rather than guessing or bypassing security. This prevents exploits through auth configuration manipulation. When this state is detected, the affected entries are invalid and will be rejected by any Instance that tries to validate them.

For complete behavioral details, see Authentication Behavior Reference.

Architecture

Storage Location: Authentication configuration resides in the special _settings.auth store of each Database, using Doc CRDT for deterministic conflict resolution.

Validation Component: The AuthValidator provides centralized entry validation with performance-optimized caching.

Signature Format: All entries include authentication information in their structure:

{
  "auth": {
    "sig": "ed25519_signature_base64_encoded",
    "key": "KEY_NAME_OR_DELEGATION_PATH"
  }
}

Permission Hierarchy

Three-tier permission model with integrated priority system:

| Permission | Settings Access | Key Management | Data Write | Data Read | Priority |
|------------|-----------------|----------------|------------|-----------|----------|
| Admin | ✓ | ✓ | ✓ | ✓ | 0-2^32 |
| Write | ✗ | ✗ | ✓ | ✓ | 0-2^32 |
| Read | ✗ | ✗ | ✗ | ✓ | None |

Priority Semantics:

  • Lower numbers = higher priority (0 is highest)
  • Admin/Write permissions include u32 priority value
  • Keys can only modify other keys with equal or lower priority
  • Priority affects administrative operations, NOT CRDT merge resolution
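
Since lower numbers mean higher priority, the administrative check compares numerically "backwards"; a minimal sketch:

// A key with priority `actor` may modify a key with priority `target`
// only when the actor's priority is equal or higher (numerically <=).
fn may_modify(actor: u32, target: u32) -> bool {
    actor <= target
}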

Key Management

Key Management API

The authentication system provides three methods for managing keys with different safety guarantees:

add_key(key_name, auth_key): Adds a new key, fails if key already exists

  • Prevents accidental overwrites during operations like bootstrap sync
  • Recommended for new key creation to avoid conflicts between devices
  • Returns KeyAlreadyExists error if key name is already in use

overwrite_key(key_name, auth_key): Explicitly replaces an existing key

  • Use when intentionally updating or replacing a key
  • Provides clear intent for key replacement operations
  • Always succeeds regardless of whether key exists

can_access(pubkey, requested_permission): Check if a public key has sufficient access

  • Checks both specific key permissions and global '*' permissions
  • Returns true if the key has sufficient permission (either specific or global)
  • Used by bootstrap approval system to avoid unnecessary key additions
  • Supports the flexible access control patterns enabled by wildcard permissions

resolve_sig_key_for_operation(device_pubkey): Resolve which SigKey to use for operations

  • Searches auth settings for a key matching the device's public key
  • Falls back to global "*" permission when specific pubkey is not found in auth settings
  • Returns the SigKey name, granted permission level, and whether pubkey must be included in SigInfo
  • Used by transaction commit to build proper SigInfo structures
  • Enables devices to automatically discover their appropriate authentication method

Key Conflict Prevention

During multi-device synchronization, the system prevents key conflicts:

  • If adding a key that exists with the same public key: Operation succeeds silently (idempotent)
  • If adding a key that exists with a different public key: Operation fails with detailed error
  • This prevents devices from accidentally overwriting each other's authentication keys

Direct Keys

Ed25519 public keys stored directly in the database's _settings.auth:

{
  "_settings": {
    "auth": {
      "KEY_LAPTOP": {
        "pubkey": "ed25519:BASE64_PUBLIC_KEY",
        "permissions": "write:10",
        "status": "active"
      }
    }
  }
}

Key Lifecycle

Keys transition between two states:

  • Active: Can create new entries, all operations permitted
  • Revoked: Cannot create new entries, historical entries remain valid

This design preserves the integrity of historical data while preventing future use of compromised keys.

Wildcard Keys

Special * key enables public access:

  • Can grant any permission level (read, write, or admin)
  • Commonly used for world-readable databases
  • Subject to same revocation mechanisms as regular keys

Delegation System

Databases can delegate authentication to other databases, enabling powerful authentication patterns without granting administrative privileges on the delegating database.

Core Concepts

Delegated Database References: Any database can reference another database as an authentication source:

{
  "_settings": {
    "auth": {
      "user@example.com": {
        "permission-bounds": {
          "max": "write:15",
          "min": "read" // optional
        },
        "database": {
          "root": "TREE_ROOT_ID",
          "tips": ["TIP_ID_1", "TIP_ID_2"]
        }
      }
    }
  }
}

Permission Clamping

Delegated permissions are constrained by bounds:

  • max: Maximum permission level (required)
  • min: Minimum permission level (optional)
  • Effective permission = clamp(delegated_permission, min, max)
  • Priority derives from the effective permission after clamping
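
A minimal sketch of the clamping rule, assuming an illustrative ordering Read < Write < Admin:

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level { Read, Write, Admin }

// effective = clamp(delegated, min, max); min is optional
fn clamp_permission(delegated: Level, min: Option<Level>, max: Level) -> Level {
    let capped = delegated.min(max);              // never exceed max
    min.map_or(capped, |floor| capped.max(floor)) // raise to min when set
}

// e.g. clamp_permission(Level::Admin, None, Level::Write) == Level::Write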

Delegation Chains

Multi-level delegation supported with permission clamping at each level:

{
  "auth": {
    "key": [
      { "key": "org_tree", "tips": ["tip1"] },
      { "key": "team_tree", "tips": ["tip2"] },
      { "key": "ACTUAL_KEY" }
    ]
  }
}

Tip Tracking

"Latest known tips" mechanism ensures key revocations are respected:

  1. Entries include delegated database tips at signing time
  2. Database tracks these as "latest known tips"
  3. Future entries must use equal or newer tips
  4. Prevents using old database states where revoked keys were valid

Authentication Flow

  1. Entry Creation: Application creates entry with auth field
  2. Signing: Entry signed with Ed25519 private key
  3. Resolution: AuthValidator resolves key (direct or delegated)
  4. Status Check: Verify key is Active (not Revoked)
  5. Tip Validation: For delegated keys, validate against latest known tips
  6. Permission Clamping: Apply bounds for delegated permissions
  7. Signature Verification: Cryptographically verify Ed25519 signature
  8. Permission Check: Ensure key has sufficient permissions
  9. Storage: Entry stored if all validations pass

Bootstrap Authentication Flow

For new devices joining existing databases without prior state:

  1. Bootstrap Request: Device sends SyncTreeRequest with empty tips + auth info
  2. Key Validation: Server validates requesting device's public key
  3. Permission Evaluation: Server checks requested permission level
  4. Key Conflict Check: System checks if key name already exists:
    • If key exists with same public key: Bootstrap continues (idempotent)
    • If key exists with different public key: Bootstrap fails with error
    • If key doesn't exist: Key is added to database
  5. Auto-Approval: Server automatically approves key (configurable)
  6. Database Update: Server safely adds key using conflict-safe add_key() method
  7. Bootstrap Response: Complete database sent with key approval confirmation
  8. Local Setup: Device stores database and gains authenticated access

Key Components:

  • sync_with_peer_for_bootstrap(): API for authenticated bootstrap
  • add_key_to_database(): Server-side key approval with conflict handling
  • Protocol extensions in SyncTreeRequest/BootstrapResponse
  • Key conflict resolution during multi-device bootstrap scenarios

Conflict Resolution

Authentication changes use Last-Write-Wins (LWW) semantics based on the DAG structure:

  • Settings conflicts resolved deterministically by Doc CRDT
  • Priority determines who CAN make changes
  • LWW determines WHICH change wins in a conflict
  • Historical entries remain valid even after permission changes
  • Revoked status prevents new entries but preserves existing content

Network Partition Handling

During network splits:

  1. Both sides may modify authentication settings
  2. Upon reconnection, LWW resolves conflicts
  3. Most recent change (by DAG timestamp) takes precedence
  4. All historical entries remain valid
  5. Future operations follow merged authentication state

Security Considerations

Protected Against

  • Unauthorized entry creation (mandatory signatures)
  • Permission escalation (permission clamping)
  • Historical tampering (immutable DAG)
  • Replay attacks (content-addressable IDs)
  • Administrative hierarchy violations (priority system)

Requires Manual Recovery

  • Admin key compromise when no higher-priority key exists
  • Conflicting administrative changes during partitions

Implementation Components

AuthValidator (auth/validation.rs): Core validation logic with caching

Crypto Module (auth/crypto.rs): Ed25519 operations and signature verification

AuthSettings (auth/settings.rs): Settings management and conflict-safe key operations

Permission Module (auth/permission.rs): Permission checking and clamping logic

See Also

Authentication Behavior Reference

This document provides a comprehensive behavioral reference for authentication configuration states in Eidetica. It complements the Authentication Design by documenting the exact behavior of each authentication state with implementation details.

Overview

Eidetica's authentication system operates in two valid modes with proactive corruption prevention:

  1. Unsigned Mode: No authentication configured (missing or empty _settings.auth)
  2. Signed Mode: Valid authentication configuration with at least one key

Corruption Prevention: The system uses two-layer validation to prevent invalid auth states:

  • Proactive Prevention (Layer 1): Transactions that would corrupt or delete auth configuration fail immediately during commit(), before the entry enters the Merkle DAG
  • Reactive Fail-Safe (Layer 2): If auth is already corrupted (from older code or external manipulation), all operations fail with CorruptedAuthConfiguration

Theoretical States (prevented by validation):

  3. Corrupted State: Auth configuration has wrong type (PREVENTED - cannot be created)

  4. Deleted State: Auth configuration was deleted (PREVENTED - cannot be created)

The system enforces aggressive fail-safe behavior: any attempt to corrupt or delete authentication fails immediately, preventing security bypass exploits.

State Definitions

Unsigned Mode

CRDT State: _settings.auth is either:

  • Missing entirely (key doesn't exist in Doc)
  • Contains empty Doc: {"auth": {}}

Both states are equivalent - the system treats missing and empty identically.

Behavior:

  • Unsigned operations succeed: Transactions without signatures commit normally
  • No validation overhead: Authentication validation is skipped entirely
  • 🔒 Not a security weakness: Intended only for specialized databases

Use Cases:

  • Development and testing environments
  • Local-only computation that never syncs
  • Temporary scratch databases
  • Future "overlay" databases for local work

Signed Mode

CRDT State: _settings.auth contains a Doc with at least one key configuration:

{
  "_settings": {
    "auth": {
      "KEY_NAME": {
        "pubkey": "ed25519:...",
        "permissions": "admin:0",
        "status": "active"
      }
    }
  }
}

Behavior:

  • Unsigned operations rejected: All operations must have valid signatures
  • Authenticated operations validated: Signature verification and permission checks
  • 🔒 Mandatory authentication: Security enforced for all future operations
  • ⚠️ Permanent state: Cannot return to unsigned mode without creating new database

Corrupted/Deleted State

Status: This state is prevented by proactive validation and can no longer be created through normal operations.

Theoretical CRDT State: _settings.auth exists but contains wrong type (not a Doc):

  • String value: {"auth": "corrupted_string"}
  • Number value: {"auth": 42}
  • Array value: {"auth": [1, 2, 3]}
  • Tombstone value: {"auth": null}
  • Any non-Doc type

How It's Prevented:

  • Layer 1 (Proactive): Commits that would create wrong-type auth fail before entry creation
  • Layer 2 (Reactive): If somehow corrupted, all subsequent operations fail

If It Existed, Behavior Would Be:

  • ALL operations rejected: Both unsigned and authenticated operations fail
  • 💥 Fail-safe enforcement: Prevents security bypass through corruption
  • 🚨 Error: TransactionError::CorruptedAuthConfiguration

Rationale for Fail-Safe: If auth configuration is corrupted, the system cannot determine:

  • Whether authentication should be required
  • What keys are valid
  • What permissions exist

Rather than guess or bypass security, the system rejects the corrupted entry.

Operation Behavior by State

Complete behavior matrix for all combinations:

| Auth State | Unsigned Transaction | Authenticated Transaction | Behavior |
|------------|----------------------|---------------------------|----------|
| Unsigned Mode (missing) | ✓ Succeeds | ✓ Triggers bootstrap | Normal operation |
| Unsigned Mode (empty `{}`) | ✓ Succeeds | ✓ Triggers bootstrap | Equivalent to missing |
| Signed Mode | ✗ Rejected (auth required) | ✓ Validated normally | Security enforced |
| Corrupted (wrong type) | ✗ Rejected | ✗ Rejected | Fail-safe: all ops fail |
| Deleted (tombstone) | ✗ Rejected | ✗ Rejected | Fail-safe: all ops fail |

Error Messages:

  • Unsigned op in signed mode: AuthenticationRequired or NoAuthConfiguration
  • Corrupted state: CorruptedAuthConfiguration
  • Deleted state: CorruptedAuthConfiguration

Synchronization Architecture

This document describes the internal architecture of Eidetica's synchronization system, including design decisions, data structures, and implementation details.

Architecture Overview

The synchronization system uses a BackgroundSync architecture with command-pattern communication:

  1. Single background thread handling all sync operations
  2. Command channel communication between frontend and background
  3. Merkle-CRDT synchronization for conflict-free replication
  4. Modular transport layer supporting HTTP and Iroh P2P protocols
  5. Hook-based change detection for automatic sync triggering
  6. Persistent state tracking in sync database using DocStore

graph TB
    subgraph "Application Layer"
        APP[Application Code] --> TREE[Database Operations]
    end

    subgraph "Core Database Layer"
        TREE --> ATOMICOP[Transaction]
        BASEDB[Instance] --> TREE
        BASEDB --> SYNC[Sync Module]
        ATOMICOP --> COMMIT[Commit Operation]
        COMMIT --> CALLBACKS[Execute Write Callbacks]
    end

    subgraph "Sync Frontend"
        SYNC[Sync Module] --> CMDTX[Command Channel]
        SYNC --> PEERMGR[PeerManager]
        SYNC --> SYNCTREE[Sync Database]
        CALLBACKS --> QUEUEENTRY[Sync::queue_entry_for_sync]
        QUEUEENTRY --> CMDTX
    end

    subgraph "BackgroundSync Engine"
        CMDTX --> BGSYNC[BackgroundSync Thread]
        BGSYNC --> TRANSPORT[Transport Layer]
        BGSYNC --> RETRY[Retry Queue]
        BGSYNC --> TIMERS[Periodic Timers]
        BGSYNC -.->|reads| SYNCTREE[Sync Database]
        BGSYNC -.->|reads| PEERMGR[PeerManager]
    end

    subgraph "Sync State Management"
        SYNCSTATE[SyncStateManager]
        SYNCCURSOR[SyncCursor]
        SYNCMETA[SyncMetadata]
        SYNCHISTORY[SyncHistoryEntry]
        SYNCSTATE --> SYNCCURSOR
        SYNCSTATE --> SYNCMETA
        SYNCSTATE --> SYNCHISTORY
        BGSYNC --> SYNCSTATE
    end

    subgraph "Storage Layer"
        BACKEND[(Backend Storage)]
        SYNCTREE --> BACKEND
        SYNCSTATE --> SYNCTREE
    end

    subgraph "Transport Layer"
        TRANSPORT --> HTTP[HTTP Transport]
        TRANSPORT --> IROH[Iroh P2P Transport]
        HTTP --> NETWORK1[Network/HTTP]
        IROH --> NETWORK2[Network/QUIC]
    end

Core Components

1. Sync Module (sync/mod.rs)

The main Sync struct is a thread-safe frontend that communicates with a background sync engine using interior mutability:

pub struct Sync {
    /// Communication channel to background sync (initialized once when transport is enabled)
    command_tx: OnceLock<mpsc::Sender<SyncCommand>>,
    /// The instance for read operations and tree management
    instance: Instance,
    /// The tree containing synchronization settings
    sync_tree: Database,
    /// Track if transport has been enabled (atomic for lock-free access)
    transport_enabled: AtomicBool,
}

Design:

  • Thread-safe by design: Uses Arc<Sync> without needing Mutex wrapper
  • Interior mutability: AtomicBool and OnceLock enable &self methods
  • Lock-free operation: No mutex contention, safe to share across threads
  • One-time initialization: OnceLock ensures command channel is initialized exactly once

Key responsibilities:

  • Provides public API methods (all using &self)
  • Sends commands to background thread via channel
  • Manages sync database for peer/relationship storage
  • Registers callbacks that queue entries when commits occur

2. BackgroundSync Engine (sync/background.rs)

The BackgroundSync struct handles all sync operations in a single background thread and accesses peer state directly from the sync database:

pub struct BackgroundSync {
    // Core components
    transport: Box<dyn SyncTransport>,
    instance: WeakInstance,  // Weak reference to Instance for storage access

    // Reference to sync database for peer/relationship management
    sync_tree_id: ID,

    // Server state
    server_address: Option<String>,

    // Retry queue for failed sends
    retry_queue: Vec<RetryEntry>,

    // Communication
    command_rx: mpsc::Receiver<SyncCommand>,
}

BackgroundSync accesses peer and relationship data directly from the sync database:

  • All peer data is stored persistently in the sync database via PeerManager
  • Peer information is read on-demand when needed for sync operations
  • Peer data automatically survives application restarts
  • Single source of truth eliminates state synchronization issues

Command types:

pub enum SyncCommand {
    // Entry operations
    SendEntries { peer: String, entries: Vec<Entry> },
    QueueEntry { peer: String, entry_id: ID, tree_id: ID },

    // Sync control
    SyncWithPeer { peer: String },
    Shutdown,

    // Server operations (with response channels)
    StartServer { addr: String, response: oneshot::Sender<Result<()>> },
    StopServer { response: oneshot::Sender<Result<()>> },
    GetServerAddress { response: oneshot::Sender<Result<String>> },

    // Peer connection operations
    ConnectToPeer { address: Address, response: oneshot::Sender<Result<String>> },
    SendRequest { address: Address, request: SyncRequest, response: oneshot::Sender<Result<SyncResponse>> },
}

Event loop architecture:

The BackgroundSync engine runs a tokio select loop that handles:

  1. Command processing: Immediate handling of frontend commands
  2. Periodic sync: Every 5 minutes, sync with all registered peers
  3. Retry processing: Every 30 seconds, attempt to resend failed entries
  4. Connection checks: Every 60 seconds, verify peer connectivity

All operations are non-blocking and handled concurrently within the single background thread.
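
A sketch of that event loop (interval values from the list above; the handler method names are placeholders):

use std::time::Duration;

let mut periodic_sync = tokio::time::interval(Duration::from_secs(300)); // 5 min
let mut retry_timer = tokio::time::interval(Duration::from_secs(30));
let mut connection_timer = tokio::time::interval(Duration::from_secs(60));

loop {
    tokio::select! {
        Some(cmd) = self.command_rx.recv() => self.handle_command(cmd).await,
        _ = periodic_sync.tick() => self.sync_with_all_peers().await,
        _ = retry_timer.tick() => self.process_retry_queue().await,
        _ = connection_timer.tick() => self.check_connections().await,
    }
}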

Server initialization:

When starting a server, BackgroundSync creates a SyncHandlerImpl with database access:

// Inside handle_start_server()
let handler = Arc::new(SyncHandlerImpl::new(
    self.backend.clone(),
    DEVICE_KEY_NAME,
));
self.transport.start_server(addr, handler).await?;

This enables the transport layer to process incoming sync requests and store received entries.

3. Command Pattern Architecture

The command pattern provides clean separation between the frontend and background sync engine:

Command categories:

  • Entry operations: SendEntries, QueueEntry - Handle network I/O for entry transmission
  • Server management: StartServer, StopServer, GetServerAddress - Manage transport server state
  • Network operations: ConnectToPeer, SendRequest - Perform async network operations
  • Control: SyncWithPeer, Shutdown - Coordinate background sync operations

Data access pattern:

  • Peer and relationship data: Written directly to sync database by frontend, read on-demand by background
  • Network operations: Handled via commands to maintain async boundaries
  • Transport state: Owned and managed by background sync engine

This architecture:

  • Eliminates circular dependencies: Clear ownership boundaries
  • Maintains async separation: Network operations stay in background thread
  • Enables direct data access: Both components access sync database directly for peer data
  • Provides clean shutdown: Graceful handling in both async and sync contexts

4. Change Detection via Write Callbacks

Write callbacks automatically detect when entries need synchronization:

// Callback function type defined in instance/mod.rs (stored internally as Arc by Instance)
pub type WriteCallback = dyn Fn(&Entry, &Database, &Instance) -> Result<()> + Send + Sync;

// Usage for sync integration
let sync = instance.sync().expect("Sync enabled");
let sync_clone = sync.clone();
let peer_pubkey = "peer_key".to_string();
database.on_local_write(move |entry, db, _instance| {
    sync_clone.queue_entry_for_sync(&peer_pubkey, entry.id(), db.root_id())
})?;

Integration flow:

  1. Transaction commits entry and stores in backend
  2. Instance triggers registered write callbacks with Entry, Database, and Instance
  3. Callback invokes Sync::queue_entry_for_sync()
  4. Sync creates QueueEntry command and sends to BackgroundSync via channel
  5. Background thread fetches entry from backend and sends to peer immediately

Callbacks are per-database and per-peer, allowing targeted synchronization. The queue_entry_for_sync method uses try_send to avoid blocking the commit operation.

5. Peer Management (sync/peer_manager.rs)

The PeerManager handles peer registration and relationship management:

impl PeerManager {
    /// Register a new peer
    pub fn register_peer(&self, pubkey: &str, display_name: Option<&str>) -> Result<()>;

    /// Add database sync relationship
    pub fn add_tree_sync(&self, peer_pubkey: &str, tree_root_id: &str) -> Result<()>;

    /// Get peers that sync a specific database
    pub fn get_tree_peers(&self, tree_root_id: &str) -> Result<Vec<String>>;
}

Data storage:

  • Peers stored in peers.{pubkey} paths in sync database
  • Database relationships in peers.{pubkey}.sync_trees arrays
  • Addresses in peers.{pubkey}.addresses arrays

6. Sync State Tracking (sync/state.rs)

Persistent state tracking for synchronization progress:

pub struct SyncCursor {
    pub peer_pubkey: String,
    pub tree_id: ID,
    pub last_synced_entry: Option<ID>,
    pub last_sync_time: String,
    pub total_synced_count: u64,
}

pub struct SyncMetadata {
    pub peer_pubkey: String,
    pub successful_sync_count: u64,
    pub failed_sync_count: u64,
    pub total_entries_synced: u64,
    pub average_sync_duration_ms: f64,
}

Storage organization:

sync_state/
├── cursors/{peer_pubkey}/{tree_id}     -> SyncCursor
├── metadata/{peer_pubkey}              -> SyncMetadata
└── history/{sync_id}                   -> SyncHistoryEntry

7. Transport Layer (sync/transports/)

Modular transport system supporting multiple protocols with SyncHandler architecture:

pub trait SyncTransport: Send + Sync {
    /// Start server with handler for processing requests
    async fn start_server(&mut self, addr: &str, handler: Arc<dyn SyncHandler>) -> Result<()>;

    /// Send entries to peer
    async fn send_entries(&self, address: &Address, entries: &[Entry]) -> Result<()>;

    /// Send sync request and get response
    async fn send_request(&self, address: &Address, request: &SyncRequest) -> Result<SyncResponse>;
}

SyncHandler Architecture:

The transport layer uses a callback-based handler pattern to enable database access:

pub trait SyncHandler: Send + Sync {
    /// Handle incoming sync requests with database access
    async fn handle_request(&self, request: &SyncRequest) -> SyncResponse;
}

This architecture solves the fundamental problem of received data storage by:

  • Providing database backend access to transport servers
  • Enabling stateful request processing (GetTips, GetEntries, SendEntries)
  • Maintaining clean separation between networking and sync logic
  • Supporting both HTTP and Iroh transports with identical handler interface

HTTP Transport:

  • REST API endpoint at /api/v0 for sync operations
  • JSON serialization for wire format
  • Axum-based server with handler state injection
  • Standard HTTP error codes

Iroh P2P Transport:

  • QUIC-based direct peer connections with handler integration
  • Built-in NAT traversal
  • Efficient binary protocol with JsonHandler serialization
  • Bidirectional streams for request/response pattern

Bootstrap-First Sync Protocol

Eidetica implements a bootstrap-first sync protocol that enables devices to join existing databases without prior local state.

Protocol Architecture

Unified SyncTree Protocol: Replaced multiple request/response types with single SyncTreeRequest:

pub struct SyncTreeRequest {
    pub tree_id: ID,
    pub our_tips: Vec<ID>, // Empty = bootstrap needed
}

pub enum SyncResponse {
    Bootstrap(BootstrapResponse),
    Incremental(IncrementalResponse),
    Error(String),
}

pub struct IncrementalResponse {
    pub tree_id: ID,
    pub missing_entries: Vec<Entry>,
    pub their_tips: Vec<ID>, // Enable bidirectional sync
}

Auto-Detection Logic: Server automatically determines sync type:

async fn handle_sync_tree(&self, request: &SyncTreeRequest) -> SyncResponse {
    if request.our_tips.is_empty() {
        // Client has no local state - send full bootstrap
        return self.handle_bootstrap_request(&request.tree_id).await;
    }
    // Client has tips - send incremental updates
    self.handle_incremental_sync(&request.tree_id, &request.our_tips).await
}

Bootstrap Flow

sequenceDiagram
    participant Client as New Client
    participant Server as Existing Peer

    Client->>Server: Handshake (establish identity)
    Server->>Client: HandshakeResponse (tree_count=N)

    Client->>Server: SyncTree(tree_id, our_tips=[])
    Note over Server: Empty tips = bootstrap needed
    Server->>Server: collect_all_tree_entries(tree_id)
    Server->>Client: Bootstrap(root_entry + all entries)

    Client->>Client: store_entries_in_backend()
    Client->>Server: Ack

Incremental Flow (Bidirectional)

sequenceDiagram
    participant Client as Existing Client
    participant Server as Peer

    Client->>Server: Handshake
    Server->>Client: HandshakeResponse

    Client->>Server: SyncTree(tree_id, our_tips=[tip1, tip2])
    Note over Server: Compare tips to find missing entries
    Server->>Client: Incremental(missing_entries, their_tips)

    Client->>Client: store_new_entries()
    Note over Client: Compare server tips to find what they're missing
    Client->>Server: SendEntries(entries_server_missing)
    Server->>Server: store_entries_from_client()
    Server->>Client: Ack

API Integration

New Simplified API:

// Single method handles both bootstrap and incremental
pub async fn sync_with_peer(&mut self, peer_address: &str, tree_id: Option<&ID>) -> Result<()> {
    let peer_pubkey = self.connect_to_peer(peer_address).await?;
    if let Some(tree_id) = tree_id {
        self.sync_tree_with_peer(&peer_pubkey, tree_id).await?;
    }
    Ok(())
}

// Tree discovery for bootstrap scenarios
pub async fn discover_peer_trees(&mut self, peer_address: &str) -> Result<Vec<TreeInfo>> {
    // Returns list of available databases on peer
}

Legacy API Still Supported:

The old manual peer management API (register_peer, add_tree_sync, etc.) still works for advanced use cases.

Data Flow

1. Entry Commit Flow

sequenceDiagram
    participant App as Application
    participant Database as Database
    participant Transaction as Transaction
    participant Callbacks as Write Callbacks
    participant Sync as Sync Module
    participant Cmd as Command Channel
    participant BG as BackgroundSync

    App->>Database: new_transaction()
    App->>Transaction: modify data
    App->>Transaction: commit()
    Transaction->>Backend: store entry
    Transaction->>Callbacks: invoke(entry, db, instance)
    Callbacks->>Sync: queue_entry_for_sync(peer, entry_id, tree_id)
    Sync->>Cmd: try_send(QueueEntry)
    Cmd->>BG: deliver command

    Note over BG: Background thread
    BG->>BG: handle_command()
    BG->>BG: fetch entry from backend
    BG->>Transport: send_entries(peer, entries)

2. BackgroundSync Processing

The background thread processes commands immediately upon receipt:

  • SendEntries: Transmit entries to peer, retry on failure
  • QueueEntry: Fetch entry from backend and send immediately
  • SyncWithPeer: Initiate bidirectional synchronization
  • ConnectToPeer / SendRequest: Establish connections and perform request/response exchanges
  • Server operations: Start/stop transport server

Failed operations are automatically added to the retry queue with exponential backoff timing.
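
The retry delay typically grows exponentially with each failed attempt; a sketch (the constants here are illustrative, not the module's actual values):

use std::time::Duration;

// Delay doubles per attempt (capped) so repeated failures back off quickly.
fn retry_delay(attempt: u32) -> Duration {
    Duration::from_secs(1)
        .saturating_mul(1u32 << attempt.min(8))  // 1s, 2s, 4s, ... up to 256s
        .min(Duration::from_secs(300))           // hard cap at 5 minutes
}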

3. Smart Duplicate Prevention

Eidetica implements semantic duplicate prevention through Merkle-CRDT tip comparison, eliminating the need for simple "sent entry" tracking.

How It Works

Database Synchronization Process:

  1. Tip Exchange: Both peers share their current database tips (frontier entries)
  2. Gap Analysis: Compare local and remote tips to identify missing entries
  3. Smart Filtering: Only send entries the peer doesn't have (based on DAG analysis)
  4. Ancestor Inclusion: Automatically include necessary parent entries

// Background sync's smart duplicate prevention
async fn sync_tree_with_peer(&self, peer_pubkey: &str, tree_id: &ID, address: &Address) -> Result<()> {
    // Step 1: Get our tips for this database
    let our_tips = self.backend.get_tips(tree_id)?;

    // Step 2: Get peer's tips via network request
    let their_tips = self.get_peer_tips(tree_id, address).await?;

    // Step 3: Smart filtering - only send what they're missing
    let entries_to_send = self.find_entries_to_send(&our_tips, &their_tips)?;
    if !entries_to_send.is_empty() {
        self.transport.send_entries(address, &entries_to_send).await?;
    }

    // Step 4: Fetch what we're missing from them
    let missing_entries = self.find_missing_entries(&our_tips, &their_tips)?;
    if !missing_entries.is_empty() {
        let entries = self.fetch_entries_from_peer(address, &missing_entries).await?;
        self.store_received_entries(entries).await?;
    }

    Ok(())
}

Benefits over Simple Tracking:

| Approach | Duplicate Prevention | Correctness | Network Efficiency |
|----------|----------------------|-------------|--------------------|
| Tip-Based (Current) | ✅ Semantic understanding | ✅ Always correct | ✅ Optimal - only sends what's needed |
| Simple Tracking | ❌ Can get out of sync | ❌ May miss updates | ❌ May send unnecessary data |

Merkle-CRDT Synchronization Algorithm

Phase 1: Tip Discovery

sequenceDiagram
    participant A as Peer A
    participant B as Peer B

    A->>B: GetTips(tree_id)
    B->>A: TipsResponse([tip1, tip2, ...])

    Note over A: Compare tips to identify gaps
    A->>A: find_entries_to_send(our_tips, their_tips)
    A->>A: find_missing_entries(our_tips, their_tips)

Phase 2: Gap Analysis

The find_entries_to_send method performs sophisticated DAG analysis:

fn find_entries_to_send(&self, our_tips: &[ID], their_tips: &[ID]) -> Result<Vec<Entry>> {
    // Find tips that peer doesn't have
    let tips_to_send: Vec<ID> = our_tips
        .iter()
        .filter(|tip_id| !their_tips.contains(tip_id))
        .cloned()
        .collect();

    if tips_to_send.is_empty() {
        return Ok(Vec::new()); // Peer already has everything
    }

    // Use DAG traversal to collect all necessary ancestors
    self.collect_ancestors_to_send(&tips_to_send, their_tips)
}

Phase 3: Efficient Transfer

Only entries that are genuinely missing are transferred:

  • No duplicates: Tips comparison guarantees no redundant sends
  • Complete data: DAG traversal ensures all dependencies included
  • Bidirectional: Both peers send and receive simultaneously
  • Incremental: Only new changes since last sync

Integration with Command Pattern

The smart duplicate prevention integrates seamlessly with the command architecture:

Direct Entry Sends:

// Via SendEntries command - caller determines what to send
self.command_tx.send(SyncCommand::SendEntries {
    peer: peer_pubkey.to_string(),
    entries // No filtering - trust caller
}).await?;

Database Synchronization:

// Via SyncWithPeer command - background sync determines what to send
self.command_tx.send(SyncCommand::SyncWithPeer {
    peer: peer_pubkey.to_string()
}).await?;
// Background sync performs tip comparison and smart filtering

Performance Characteristics

Network Efficiency:

  • O(tip_count) network requests for tip discovery
  • O(missing_entries) data transfer (minimal)
  • Zero redundancy in steady state

Computational Complexity:

  • O(n log n) tip comparison where n = tip count
  • O(m) DAG traversal where m = missing entries
  • Constant memory per sync operation

State Requirements:

  • No persistent tracking of individual sends needed
  • Stateless operation - each sync is independent
  • Self-correcting - any missed entries caught in next sync

4. Handshake Protocol

Peer connection establishment:

sequenceDiagram
    participant A as Peer A
    participant B as Peer B

    A->>B: HandshakeRequest { device_id, public_key, challenge }
    B->>B: verify signature
    B->>B: register peer
    B->>A: HandshakeResponse { device_id, public_key, challenge_response }
    A->>A: verify signature
    A->>A: register peer

    Note over A,B: Both peers now registered and authenticated

Performance Characteristics

Memory Usage

BackgroundSync state: Minimal memory footprint

  • Single background thread with owned state
  • Retry queue: O(n) where n = failed entries pending retry
  • Peer state: ~1KB per registered peer
  • Relationships: ~100 bytes per peer-database relationship

Persistent state: Stored in sync database

  • Sync cursors: ~200 bytes per peer-database relationship
  • Metadata: ~500 bytes per peer
  • History: ~300 bytes per sync operation (with cleanup)

Network Efficiency

Immediate processing:

  • Commands processed as received (no batching delay)
  • Failed sends added to retry queue with exponential backoff
  • Automatic compression in transport layer

Background timers:

  • Periodic sync: User-configurable per database via interval_seconds (default: 5 minutes)
  • Retry processing: Every 30 seconds
  • Connection checks: Every 60 seconds

Periodic sync interval merging:

When multiple users track the same database, their interval_seconds preferences are merged using the minimum interval strategy. This ensures databases stay as up-to-date as the most active user wants. The merging happens in UserSyncManager::get_combined_settings() which uses instance::settings_merge::merge_sync_settings():

  • interval_seconds: Some(a), Some(b) → Some(min(a, b))
  • interval_seconds: Some(a), None → Some(a)
  • interval_seconds: None, None → None
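
A minimal sketch of that rule (the actual implementation lives in instance::settings_merge::merge_sync_settings):

fn merge_interval(a: Option<u64>, b: Option<u64>) -> Option<u64> {
    match (a, b) {
        // The most frequent (smallest) interval wins when both users set one.
        (Some(a), Some(b)) => Some(a.min(b)),
        // Otherwise, whichever side is set carries over.
        (Some(a), None) => Some(a),
        (None, b) => b,
    }
}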

Concurrency

Single-threaded design:

  • One background thread handles all sync operations
  • No lock contention or race conditions
  • Commands queued via channel (non-blocking)

Async integration:

  • Tokio-based event loop
  • Non-blocking transport operations
  • Works in both async and sync contexts

Connection Management

Lazy Connection Establishment

Eidetica uses a lazy connection strategy where connections are established on-demand rather than immediately when peers are registered:

Key Design Principles:

  1. No Persistent Connections: Connections are not maintained between sync operations
  2. Transport-Layer Handling: Connection establishment is delegated to the transport layer
  3. Automatic Discovery: Background sync periodically discovers and syncs with all registered peers
  4. On-Demand Establishment: Connections are created when sync operations occur

Connection Lifecycle:

graph LR
    subgraph "Peer Registration"
        REG[register_peer] --> STORE[Store in Sync Database]
    end

    subgraph "Discovery & Connection"
        TIMER[Periodic Timer<br/>Every 5 min] --> SCAN[Scan Active Peers<br/>from Sync Database]
        SCAN --> SYNC[sync_with_peer]
        SYNC --> CONN[Transport Establishes<br/>Connection On-Demand]
        CONN --> XFER[Transfer Data]
        XFER --> CLOSE[Connection Closed]
    end

    subgraph "Manual Connection"
        API[connect_to_peer API] --> HANDSHAKE[Perform Handshake]
        HANDSHAKE --> STORE2[Store Peer Info]
    end

Benefits of Lazy Connection:

  • Resource Efficient: No idle connections consuming resources
  • Resilient: Network issues don't affect registered peer state
  • Scalable: Can handle many peers without connection overhead
  • Self-Healing: Failed connections automatically retried on next sync cycle

Connection Triggers:

  1. Periodic Sync (every 5 minutes):

    • BackgroundSync scans all active peers from sync database
    • Attempts to sync with each peer's registered databases
    • Connections established as needed during sync
  2. Manual Sync Commands:

    • SyncWithPeer command triggers immediate connection
    • SendEntries command establishes connection for data transfer
  3. Explicit Connection:

    • connect_to_peer() API for manual connection establishment
    • Performs handshake and stores peer information

No Alert on Registration:

When register_peer() or add_peer_address() is called:

  • Peer information is stored in the sync database
  • No command is sent to BackgroundSync
  • No immediate connection attempt is made
  • Peer will be discovered in next periodic sync cycle (within 5 minutes)

This design ensures that peer registration is a lightweight operation that doesn't block or trigger network activity.

Transport Implementations

Iroh Transport

The Iroh transport provides peer-to-peer connectivity using QUIC with automatic NAT traversal.

Key Components:

  • Relay Servers: Intermediary servers that help establish P2P connections
  • Hole Punching: Direct connection establishment through NATs (~90% success rate)
  • NodeAddr: Contains node ID and direct socket addresses for connectivity
  • QUIC Protocol: Provides reliable, encrypted communication

Configuration via Builder Pattern:

The IrohTransportBuilder allows configuring:

  • RelayMode: Controls relay server usage
    • Default: Uses n0's production relay servers
    • Staging: Uses n0's staging infrastructure
    • Disabled: Direct P2P only (for local testing)
    • Custom(RelayMap): User-provided relay servers
  • enable_local_discovery: mDNS for local network discovery (future feature)
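
For example, disabling relays for a local test might look like the following sketch; the builder method names are assumptions based on the options above:

// Hedged sketch: direct P2P only, no relay servers.
let transport = IrohTransportBuilder::new()
    .with_relay_mode(RelayMode::Disabled) // method name assumed
    .build()?;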

Address Serialization:

When get_server_address() is called, Iroh returns a JSON-serialized NodeAddrInfo containing:

  • node_id: The peer's cryptographic identity
  • direct_addresses: Socket addresses where the peer can be reached

This allows peers to connect using either relay servers or direct connections, whichever succeeds first.

Connection Flow:

  1. Endpoint initialization with configured relay mode
  2. Relay servers help peers discover each other
  3. Attempt direct connection via hole punching
  4. Fall back to relay if direct connection fails
  5. Upgrade to direct connection when possible

HTTP Transport

The HTTP transport provides traditional client-server connectivity using REST endpoints.

Features:

  • Simple JSON API at /api/v0
  • Axum server with Tokio runtime
  • Request/response pattern
  • No special NAT traversal needed

Architecture Benefits

Command Pattern Advantages

Clean separation of concerns:

  • Frontend handles API and database management
  • Background owns transport and sync state
  • No circular dependencies

Flexible communication:

  • Fire-and-forget for most operations
  • Request-response with oneshot channels when needed
  • Graceful degradation if channel full

Reliability Features

Retry mechanism:

  • Automatic retry queue for failed operations
  • Exponential backoff prevents network flooding
  • Configurable maximum retry attempts
  • Per-entry failure tracking

State persistence:

  • Sync state stored in the sync database via DocStore
  • Tracks sent entries to prevent duplicates
  • Survives restarts and crashes
  • Provides complete audit trail of sync operations

Handshake security:

  • Ed25519 signature verification
  • Challenge-response protocol prevents replay attacks
  • Device key management integrated with backend
  • Mutual authentication between peers

Error Handling

Retry Queue Management

The BackgroundSync engine maintains a retry queue for failed send operations:

  • Exponential backoff: 2^attempts seconds delay (max 64 seconds)
  • Attempt tracking: Failed sends increment attempt counter
  • Maximum retries: Entries dropped after configurable max attempts
  • Periodic processing: Retry timer checks queue every 30 seconds

Each retry entry tracks the peer, entries to send, attempt count, and last attempt timestamp.
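
A hedged sketch of that bookkeeping; the field and method names are illustrative, not the actual struct:

use std::time::{Duration, Instant};

struct RetryEntry {
    peer: String,
    entries: Vec<Entry>,
    attempts: u32,
    last_attempt: Instant,
}

impl RetryEntry {
    // 2^attempts seconds, capped at 64s (attempts >= 6 all wait the maximum).
    fn backoff(&self) -> Duration {
        Duration::from_secs(2u64.pow(self.attempts.min(6)))
    }

    // Due for another attempt once the backoff window has elapsed.
    fn is_due(&self) -> bool {
        self.last_attempt.elapsed() >= self.backoff()
    }
}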

Transport Error Handling

  • Network failures: Added to retry queue with exponential backoff
  • Protocol errors: Logged and skipped
  • Peer unavailable: Entries remain in retry queue

State Consistency

  • Command channel full: Commands dropped (fire-and-forget)
  • Hook failures: Don't prevent commit, logged as warnings
  • Transport errors: Don't affect local data integrity

Testing Architecture

Current Test Coverage

The sync module maintains comprehensive test coverage across multiple test suites:

Unit Tests (6 passing):

  • Hook collection execution and error handling
  • Sync cursor and metadata operations
  • State manager functionality

Integration Tests (78 passing):

  • Basic sync operations and persistence
  • HTTP and Iroh transport lifecycles
  • Peer management and relationships
  • DAG synchronization algorithms
  • Protocol handshake and authentication
  • Bidirectional sync flows
  • Transport polymorphism and isolation

Test Categories

Transport Tests:

  • Server lifecycle management for both HTTP and Iroh
  • Client-server communication patterns
  • Error handling and recovery
  • Address management and peer discovery

Protocol Tests:

  • Handshake with signature verification
  • Version compatibility checking
  • Request/response message handling
  • Entry synchronization protocols

DAG Sync Tests:

  • Linear chain synchronization
  • Branching structure handling
  • Partial overlap resolution
  • Bidirectional sync flows

Implementation Status

Completed Features ✅

Architecture:

  • BackgroundSync engine with command pattern
  • Single background thread ownership model
  • Channel-based frontend/backend communication
  • Automatic runtime detection (async/sync contexts)

Bootstrap-First Sync Protocol:

  • Unified SyncTreeRequest/SyncResponse protocol
  • Automatic bootstrap vs incremental detection
  • Complete database transfer for zero-state clients
  • Simplified sync_with_peer() API
  • Peer discovery via discover_peer_trees()
  • Graceful peer registration (handles PeerAlreadyExists)

Core Functionality:

  • HTTP and Iroh transport implementations with SyncHandler architecture
  • SyncHandler trait enabling database access in transport layer
  • Full protocol support (bootstrap and incremental sync)
  • Ed25519 handshake protocol with signatures
  • Persistent sync state via DocStore
  • Per-peer sync hook creation
  • Retry queue with exponential backoff
  • Periodic sync timers (5 min intervals)

State Management:

  • Sync relationships tracking
  • Peer registration and management
  • Transport address handling
  • Server lifecycle control

Testing:

  • Comprehensive integration tests for bootstrap protocol
  • Zero-state bootstrap verification
  • Incremental sync after bootstrap
  • Complex DAG synchronization scenarios
  • All 490 integration tests passing

Completed Recent Work 🎉

Bootstrap-First Protocol Implementation:

  • Full bootstrap sync from zero local state ✅
  • Automatic protocol detection (empty tips = bootstrap needed) ✅
  • Unified sync handler with SyncTreeRequest processing ✅
  • Background sync integration with bootstrap response handling ✅
  • Peer registration robustness (PeerAlreadyExists handling) ✅
  • Integration test suite validation ✅

Future Enhancements 📋

Performance:

  • Entry batching for large sync operations
  • Compression for network transfers
  • Bandwidth throttling controls
  • Connection pooling

Reliability:

  • Circuit breaker for problematic peers
  • Advanced retry strategies
  • Connection state tracking
  • Automatic reconnection logic

Bootstrap Protocol Extensions:

  • Selective bootstrap (partial tree sync)
  • Progress tracking for large bootstraps
  • Resume interrupted bootstrap operations
  • Bandwidth-aware bootstrap scheduling

Monitoring:

  • Sync metrics collection
  • Health check endpoints
  • Performance dashboards
  • Sync status visualization
  • Bootstrap completion tracking

Bootstrap System

Secure key management and access control for distributed Eidetica databases through a request-approval workflow integrated with the sync module.

Architecture

Storage Location

Bootstrap Request Storage: Requests are stored in the sync database (_sync), not target databases:

  • Subtree: bootstrap_requests
  • Structure: Table<BootstrapRequest> with UUID keys
  • Persistence: Indefinite for audit trail purposes

Global Wildcard Permissions: Databases can enable automatic approval via global * permissions in _settings.auth.*

Core Components

1. Bootstrap Request Manager (bootstrap_request_manager.rs)

The BootstrapRequestManager handles storage and lifecycle of bootstrap requests within the sync database. Key responsibilities:

  • Request Storage: Persists bootstrap requests as structured documents in the bootstrap_requests subtree
  • Status Tracking: Manages request states (Pending, Approved, Rejected)
  • Request Retrieval: Provides query APIs to list and filter requests

2. Sync Handler Extensions

The SyncHandlerImpl processes bootstrap requests during sync operations:

  • Global Permission Check: Checks if global * wildcard permission satisfies the request
  • Automatic Approval: Grants access immediately via global permission (no key addition)
  • Manual Queue: Stores requests for manual review when no global permission exists
  • Response Generation: Returns appropriate sync responses (BootstrapPending, BootstrapResponse)

3. Sync Module Public API (sync/mod.rs)

Request management methods on the Sync struct:

| Method | Description | Returns |
|--------|-------------|---------|
| pending_bootstrap_requests() | Query pending requests | Vec<(String, BootstrapRequest)> |
| approved_bootstrap_requests() | Query approved requests | Vec<(String, BootstrapRequest)> |
| rejected_bootstrap_requests() | Query rejected requests | Vec<(String, BootstrapRequest)> |
| get_bootstrap_request(id) | Retrieve specific request | Option<(String, BootstrapRequest)> |
| approve_bootstrap_request(id, key) | Approve and add key to database | Result<()> |
| reject_bootstrap_request(id, key) | Reject without adding key | Result<()> |
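
A hedged usage sketch of this API; the key-name argument to approve_bootstrap_request is assumed to be the approver's local signing key:

for (request_id, request) in sync.pending_bootstrap_requests()? {
    println!(
        "{} requests {:?} on tree {}",
        request.requesting_pubkey, request.requested_permission, request.tree_id
    );
    // After out-of-band verification of the pubkey, approve the request.
    // "admin_device_key" is a hypothetical local key name.
    sync.approve_bootstrap_request(&request_id, "admin_device_key")?;
}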

Data Flow

sequenceDiagram
    participant Client
    participant SyncHandler
    participant GlobalPermCheck
    participant PolicyCheck
    participant BootstrapManager
    participant Database
    participant Admin

    Client->>SyncHandler: Bootstrap Request<br/>(key, permission)
    SyncHandler->>GlobalPermCheck: Check global '*' permission

    alt Global Permission Grants Access
        GlobalPermCheck-->>SyncHandler: sufficient
        SyncHandler-->>Client: BootstrapResponse<br/>(approved=true, no key added)
    else Global Permission Insufficient
        GlobalPermCheck-->>SyncHandler: insufficient/missing
        SyncHandler->>BootstrapManager: Store request
        BootstrapManager-->>SyncHandler: Request ID
        SyncHandler-->>Client: BootstrapPending<br/>(request_id)

        Note over Client: Waits for approval

        Admin->>BootstrapManager: approve_request(id)
        BootstrapManager->>Database: Add key
        Database-->>BootstrapManager: Success
        BootstrapManager-->>Admin: Approved

        Note over Client: Next sync gets access
    end

Global Permission Auto-Approval

The bootstrap system supports automatic approval through global '*' permissions, which provide immediate access without adding new keys to the database.

How It Works

When a bootstrap request is received, the sync handler first checks if the requesting key already has sufficient permissions through existing auth settings:

  1. Permission Check: AuthSettings::can_access() checks if the requesting public key has sufficient permissions
  2. Global Permission Check: Includes checking for active global '*' permission that satisfies the request
  3. Auto-Approval: If sufficient permission exists (specific or global), approve without adding a new key
  4. Fallback: If no existing permission, proceed to auto-approval policy or manual approval flow

Implementation Details

Key Components (handler.rs:check_existing_auth_permission):

  1. Create database instance for target tree
  2. Get AuthSettings via SettingsStore
  3. Call AuthSettings::can_access(requesting_pubkey, requested_permission)
  4. Return approval decision without modifying database if permission exists

Permission Hierarchy: Eidetica uses an inverted priority system where lower numbers = higher permissions:

  • Write(5) has higher permission than Write(10)
  • Global Write(10) allows bootstrap requests for Read, Write(11), Write(15), etc.
  • Global Write(10) rejects bootstrap requests for Write(5), Write(1), Admin(*)
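
A sketch of that comparison, assuming a Permission enum shaped like Admin(u32) / Write(u32) / Read; the real ordering logic lives in the auth module:

fn global_grant_satisfies(granted: &Permission, requested: &Permission) -> bool {
    use Permission::*;
    match (granted, requested) {
        // Lower number = higher permission, so granted must be <= requested.
        (Admin(g), Admin(r)) => g <= r,
        (Admin(_), Write(_) | Read) => true,
        (Write(g), Write(r)) => g <= r, // Write(10) satisfies Write(11), not Write(5)
        (Write(_), Read) => true,
        (Read, Read) => true,
        _ => false,
    }
}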

Precedence Rules

  1. Global permissions checked first - Before manual approval queue
  2. Global permissions provide immediate access - No admin approval required
  3. No key storage - Global permission grants don't add keys to auth settings
  4. Insufficient global permission - Falls back to manual approval queue

Global Permissions for Ongoing Operations

Once bootstrapped with global permissions, devices use the global "*" key for all subsequent operations:

  • Transaction commits: AuthSettings::resolve_sig_key_for_operation() resolves to global "*" when device's specific key is not in auth settings
  • Entry validation: KeyResolver::resolve_direct_key_with_pubkey() falls back to global "*" permission during signature verification
  • Permission checks: All operations use the same permission hierarchy and validation rules

This unified approach ensures consistent behavior whether a device has a specific key or relies on global permissions.

Use Cases

  • Public databases: Set global Read permission for open access
  • Collaborative workspaces: Set global Write(*) for team environments
  • Development environments: Reduce friction while maintaining some permission control

Data Structures

BootstrapRequest

Stored in sync database's bootstrap_requests subtree using Table<BootstrapRequest>.

Key Structure: Request ID (UUID string) is the table key, not a struct field.

pub struct BootstrapRequest {
    /// Target database/tree ID
    pub tree_id: ID,

    /// Public key of requesting device (ed25519:...)
    pub requesting_pubkey: String,

    /// Key name for the requesting device
    pub requesting_key_name: String,

    /// Permission level requested (Admin, Write, Read)
    pub requested_permission: Permission,

    /// ISO 8601 timestamp of request
    pub timestamp: String,

    /// Current processing status
    pub status: RequestStatus,

    /// Network address for future notifications
    pub peer_address: Address,
}

RequestStatus Enum

pub enum RequestStatus {
    Pending,
    Approved {
        approved_by: String,
        approval_time: String,
    },
    Rejected {
        rejected_by: String,
        rejection_time: String,
    },
}

Implementation Details

Request Lifecycle

1. Request Creation

When a client attempts bootstrap with authentication:

  • Sync handler checks if tree exists
  • Evaluates bootstrap policy in database settings
  • If auto-approval disabled, creates bootstrap request
  • Stores request in sync database's bootstrap_requests subtree

2. Manual Review

Admin query operations:

  • pending_bootstrap_requests() - Filter by status enum discriminant
  • get_bootstrap_request(id) - Direct table lookup
  • Decision criteria: pubkey, permission level, timestamp, out-of-band verification

3. Approval Process

When approving a request:

  1. Load request from sync database
  2. Validate request is still pending
  3. Create transaction on target database
  4. Add requesting key with specified permissions
  5. Update request status to "Approved"
  6. Record approver and timestamp

4. Rejection Process

When rejecting a request:

  1. Load request from sync database
  2. Validate request is still pending
  3. Update status to "Rejected"
  4. Record rejector and timestamp
  5. No keys added to target database

Authentication Integration

Key Addition Flow (handler.rs:add_key_to_database):

  1. Load target database via Database::open_readonly()
  2. Create transaction with device key auth
  3. Get SettingsStore and AuthSettings
  4. Create AuthKey::active() with requested permission
  5. Call settings_store.set_auth_key()
  6. Commit transaction

Global Permission Check (handler.rs:check_existing_auth_permission):

  1. Load database settings via SettingsStore
  2. Check if global * key exists with sufficient permissions
  3. Approve immediately if global permission satisfies request

Audit Trail

Request immutability provides forensic capability:

  • Original request parameters preserved
  • Approval/rejection metadata includes actor and timestamp
  • Complete history of all bootstrap attempts maintained

Concurrency and Persistence

Persistence: No automatic cleanup - requests remain indefinitely for audit trail

Concurrency:

  • Multiple pending requests per database supported
  • UUID keys prevent ID conflicts
  • Status transitions use standard CRDT merge semantics

Duplicate Detection: Not currently implemented - identical requests from same client create separate entries. Future enhancement may consolidate by (tree_id, pubkey) tuple.

Error Handling

Key error scenarios:

  • RequestNotFound: Invalid request ID
  • RequestAlreadyExists: Duplicate request ID
  • InvalidRequestState: Request not in expected state
  • InsufficientPermissions: Approver lacks required permissions

Stores

Typed data access patterns within databases providing structured interaction with Entry RawData.

Core Concepts

SubTree Trait: Interface for typed store implementations accessed through Operation handles.

Reserved Names: Store names with underscore prefix (e.g., _settings) reserved for internal use.

Typed APIs: Handle serialization/deserialization and provide structured access to raw entry data.

Current Implementations

Table

Record-oriented store for managing collections with unique identifiers.

Features:

  • Stores user-defined types (T: Serialize + Deserialize)
  • Automatic UUID generation for records
  • CRUD operations: insert, get, set, delete, search
  • Type-safe access via Operation::get_subtree

Use Cases: User lists, task management, any collection requiring persistent IDs.
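
A brief usage sketch under those assumptions (the Task type is hypothetical):

#[derive(Serialize, Deserialize, Clone)]
struct Task {
    title: String,
    done: bool,
}

let op = database.new_transaction()?;
let tasks = op.get_subtree::<Table<Task>>("tasks")?;

// insert returns the generated UUID for the new record.
let id = tasks.insert(Task { title: "Write docs".into(), done: false })?;
let task = tasks.get(&id)?;
op.commit()?;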

DocStore

Document-oriented store wrapping crdt::Doc for nested structures and path-based access.

Features:

  • Path-based operations for nested data (set_path, get_path, etc.)
  • Simple key-value operations (get, set, delete)
  • Support for nested map structures via Value enum
  • Tombstone support for distributed deletion propagation
  • Last-write-wins merge strategy

Use Cases: Configuration data, metadata, structured documents, sync state.

SettingsStore

Specialized wrapper around DocStore for managing the _settings subtree with type-safe authentication operations.

Features:

  • Type-safe settings management API
  • Convenience methods for authentication key operations
  • Atomic updates via closure pattern (update_auth_settings)
  • Direct access to underlying DocStore for advanced operations
  • Built-in validation for authentication configurations

Architecture:

  • Wraps DocStore instance configured for _settings subtree
  • Delegates to AuthSettings for authentication-specific operations
  • Provides abstraction layer hiding CRDT implementation details
  • Maintains proper transaction boundaries for settings modifications

Operations:

  • Database name management (get_name, set_name)
  • Authentication key lifecycle (set_auth_key, get_auth_key, revoke_auth_key)
  • Bulk auth operations via update_auth_settings closure
  • Auth validation via validate_entry_auth method

Use Cases: Database configuration, authentication key management, settings validation, bootstrap policies.
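
A hedged sketch of typical usage; how the handle is constructed and the exact AuthKey::active signature are assumptions:

let op = database.new_transaction()?;
let settings = SettingsStore::new(&op)?; // constructor name assumed

settings.set_name("inventory-db")?;

// Stage an authentication key; AuthKey::active's arguments are assumed here.
let key = AuthKey::active("laptop_key", Permission::Write(10))?;
settings.set_auth_key(key)?;

op.commit()?;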

YDoc (Y-CRDT Integration)

Real-time collaborative editing with sophisticated conflict resolution.

Features (requires "y-crdt" feature):

  • Y-CRDT algorithms for collaboration
  • Differential saving for storage efficiency
  • Full Y-CRDT API access
  • Caching for performance optimization

Architecture:

  • YrsBinary wrapper implements CRDT traits
  • Differential updates vs full snapshots
  • Binary update merging preserves Y-CRDT algorithms

Operations:

  • Document access with safe closures
  • External update application
  • Incremental change tracking

Use Cases: Collaborative documents, real-time editing, complex conflict resolution.

Custom SubTree Implementation

Requirements:

  1. Struct implementing SubTree trait
  2. Handle creation linked to Transaction
  3. Custom API methods using Transaction interaction:
    • get_local_data for staged state
    • get_full_state for merged historical state
    • update_subtree for staging changes

Integration

Operation Context: All stores accessed through atomic operations

CRDT Support: Stores can implement CRDT trait for conflict resolution

Serialization: Data stored as RawData strings in Entry structure

DocStore

Public store implementation providing document-oriented storage with path-based nested data access.

Overview

DocStore is a publicly available store type that provides a document-oriented interface for storing and retrieving data. It wraps the crdt::Doc type to provide ergonomic access patterns for nested data structures, making it ideal for configuration, metadata, and structured document storage.

Key Characteristics

Public API: DocStore is exposed as part of the public store API and can be used in applications.

Doc CRDT Based: Wraps the crdt::Doc type which provides deterministic merging of concurrent changes.

Path-Based Operations: Supports both flat key-value storage and path-based access to nested structures.

Important Behavior: Nested Structure Creation

Path-Based Operations Create Nested Maps

When using set_path() with dot-separated paths, DocStore creates nested map structures, not flat keys with dots:

// This code:
docstore.set_path("user.profile.name", "Alice")?;

// Creates this structure:
{
  "user": {
    "profile": {
      "name": "Alice"
    }
  }
}

// NOT this:
{ "user.profile.name": "Alice" }  // ❌ This is NOT what happens

Accessing Nested Data

When using get_all() to retrieve all data, you get the nested structure and must navigate it accordingly:

let all_data = docstore.get_all()?;

// Wrong way - looking for a flat key with dots
let value = all_data.get("user.profile.name");  // ❌ Returns None

// Correct way - navigate the nested structure
if let Some(Value::Doc(user_doc)) = all_data.get("user") {
    if let Some(Value::Doc(profile_doc)) = user_doc.get("profile") {
        if let Some(Value::Text(name)) = profile_doc.get("name") {
            println!("Name: {}", name);  // ✅ "Alice"
        }
    }
}

API Methods

Basic Operations

  • set(key, value) - Set a simple key-value pair
  • get(key) - Get a value by key
  • get_as<T>(key) - Get and deserialize a value
  • delete(key) - Delete a key (creates tombstone)
  • get_all() - Get all data as a Map

Path Operations

  • set_path(path, value) - Set a value at a nested path (creates intermediate maps)
  • get_path(path) - Get a value from a nested path
  • get_path_as<T>(path) - Get and deserialize from a path
  • delete_path(path) - Delete a value at a path

Path Mutation Operations

  • modify_path<F>(path, f) - Modify existing value at path
  • get_or_insert_path<F>(path, default) - Get or insert with default
  • modify_or_insert_path<F, G>(path, modify, default) - Modify or insert

Utility Operations

  • contains_key(key) - Check if a key exists
  • contains_path(path) - Check if a path exists

Usage Examples

Application Configuration

let op = database.new_transaction()?;
let config = op.get_subtree::<DocStore>("app_config")?;

// Set configuration values
config.set("app_name", "MyApp")?;
config.set_path("database.host", "localhost")?;
config.set_path("database.port", "5432")?;
config.set_path("features.auth.enabled", "true")?;

op.commit()?;

Sync State Management

DocStore is used internally for sync state tracking in the sync module:

// Creating nested sync state structure
let sync_state = op.get_subtree::<DocStore>("sync_state")?;

// Store cursor information in nested structure
let cursor_path = format!("cursors.{}.{}", peer_pubkey, tree_id);
sync_state.set_path(cursor_path, cursor_json)?;

// Store metadata in nested structure
let metadata_path = format!("metadata.{}", peer_pubkey);
sync_state.set_path(metadata_path, metadata_json)?;

// Store history in nested structure
let history_path = format!("history.{}", sync_id);
sync_state.set_path(history_path, history_json)?;

// Later, retrieve all data and navigate the structure
let all_data = sync_state.get_all()?;

// Navigate to history entries
if let Some(Value::Doc(history_doc)) = all_data.get("history") {
    for (sync_id, entry_value) in history_doc.iter() {
        // Process each history entry
        if let Value::Text(json_str) = entry_value {
            let entry: SyncHistoryEntry = serde_json::from_str(json_str)?;
            // Use the entry...
        }
    }
}

Common Pitfalls

Expecting Flat Keys

The most common mistake is expecting set_path("a.b.c", value) to create a flat key "a.b.c" when it actually creates nested maps.

Incorrect get_all() Usage

When using get_all(), remember that the returned Map contains the nested structure, not flat keys:

// After: docstore.set_path("config.server.port", "8080")

let all = docstore.get_all()?;

// Wrong:
all.get("config.server.port")  // Returns None

// Right:
all.get("config")
   .and_then(|v| v.as_node())
   .and_then(|n| n.get("server"))
   .and_then(|v| v.as_node())
   .and_then(|n| n.get("port"))  // Returns Some(Value::Text("8080"))

Design Rationale

The nested structure approach was chosen because:

  1. Natural Hierarchy: Represents hierarchical data more naturally
  2. Partial Updates: Allows updating parts of a structure without rewriting everything
  3. CRDT Compatibility: Works well with Doc CRDT merge semantics
  4. Query Flexibility: Enables querying at any level of the hierarchy

CRDT Implementation

Trait-based system for Conflict-free Replicated Data Types enabling deterministic conflict resolution.

Core Concepts

CRDT Trait: Defines merge operation for resolving conflicts between divergent states. Requires Serialize, Deserialize, and Default implementations.

Merkle-CRDT Principles: CRDT state stored in Entry's RawData for deterministic merging across distributed systems.

Multiple CRDT Support: Different CRDT types can be used for different stores within the same database.

Doc Type

Doc: The main CRDT document type

  • Hierarchical document structure supporting nested data
  • Provides document-level operations (get, set, merge, etc.)
  • Handles path-based operations for nested data access (dot notation)
  • Supports the Value enum for different data types

Value Types:

  • Text (string)
  • Int (i64 integer)
  • Bool (boolean)
  • Doc (nested document)
  • List (ordered collection with CRDT positioning)
  • Deleted (tombstone marker)

CRDT Behavior:

  • Recursive merging for nested structures
  • Last-write-wins strategy for conflicting leaf values
  • Tombstones for deletion tracking
  • Type-aware conflict resolution

Tombstones

Critical for distributed deletion propagation:

  • Mark data as deleted instead of physical removal
  • Retained and synchronized between replicas
  • Ensure deletions propagate to all nodes
  • Prevent resurrection of deleted data

Merge Algorithm

LCA-Based Computation: Uses Lowest Common Ancestor for efficient state calculation

Process:

  1. Identify parent entries (tips) for store
  2. Find LCA if multiple parents exist
  3. Merge all paths from LCA to parent tips
  4. Cache results for performance

Caching: Automatic caching of computed states with (Entry_ID, Store) keys for dramatic performance improvements.

Custom CRDT Implementation

Requirements:

  1. Struct implementing Default, Serialize, Deserialize
  2. Data marker trait implementation
  3. CRDT trait with deterministic merge logic
  4. Optional SubTree handle for user-friendly API
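
As a sketch, a grow-only counter meeting those requirements might look like this; the Data marker trait and the exact CRDT::merge signature are assumptions:

use std::collections::BTreeMap;

#[derive(Default, Clone, Serialize, Deserialize)]
struct GCounter {
    // One monotonically increasing count per replica ID.
    counts: BTreeMap<String, u64>,
}

impl Data for GCounter {} // marker trait, per requirement 2

impl CRDT for GCounter {
    // Deterministic merge: take the per-replica maximum. Commutative,
    // associative, and idempotent, so replicas converge in any merge order.
    fn merge(&self, other: &Self) -> Result<Self> {
        let mut merged = self.clone();
        for (replica, count) in &other.counts {
            let entry = merged.counts.entry(replica.clone()).or_insert(0);
            *entry = (*entry).max(*count);
        }
        Ok(merged)
    }
}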

Data Flow

The data flow in Eidetica follows a structured sequence of interactions between core components.

Basic Flow

  1. User creates an Instance with a storage backend
  2. User creates Databases within the Instance
  3. Database holds a weak reference to Instance for storage access
  4. Operations construct immutable Entry objects through EntryBuilder
  5. Entries reference parent entries, forming a directed acyclic graph
  6. Database accesses storage through Instance.backend() via weak reference upgrade
  7. Entries are stored and retrieved through the Instance's backend interface
  8. Authentication validates and signs entries when configured

Authentication Flow

When authentication is enabled, additional steps occur during commit:

  • Entry signing with cryptographic signatures
  • Permission validation for the operation type
  • Bootstrap handling for initial admin configuration
  • Verification status assignment based on validation results

This ensures data integrity and access control while maintaining compatibility with unsigned entries.

CRDT Caching Flow

The system uses an efficient caching layer for CRDT state computation:

  • Cache lookup using Entry ID and Store as the key
  • On cache miss, recursive LCA algorithm computes state and caches the result
  • Cache hits return instantly for subsequent queries
  • Performance scales well due to immutable entries and high cache hit rates

CRDT Principles

Eidetica implements a Merkle-CRDT using content-addressable entries organized in a Merkle DAG structure. Entries store data and maintain parent references to form a distributed version history that supports deterministic merging.

Core Concepts

  • Content-Addressable Entries: Immutable data units forming a directed acyclic graph
  • CRDT Trait: Enables deterministic merging of concurrent changes
  • Parent References: Maintain history and define DAG structure
  • Tips Tracking: Identifies current heads for efficient synchronization

Fork and Merge Support

The system supports branching and merging through parent-child relationships:

  • Forking: Multiple entries can share parents, creating divergent branches
  • Merging: Entries with multiple parents merge separate branches
  • Deterministic Ordering: Entries sorted by height then ID for consistent results

Merge Algorithm

Uses a recursive LCA-based approach for computing CRDT states:

  • Cache Check: Avoids redundant computation through automatic caching
  • LCA Computation: Finds lowest common ancestor for multi-parent entries
  • Recursive Building: Computes ancestor states recursively
  • Path Merging: Merges all entries from LCA to parents with proper ordering
  • Local Integration: Applies current entry's data to final state

Key Properties

  • Correctness: Consistent state computation regardless of access patterns
  • Performance: Caching eliminates redundant work
  • Deterministic: Maintains ordering through proper LCA computation
  • Immutable Caching: Entry immutability ensures cache validity

Subtree Parent Relationships in Eidetica

Overview

Subtree parent relationships are a critical aspect of Eidetica's Merkle-CRDT architecture. Each entry in the database can contain multiple subtrees (like "messages", "_settings", etc.), and these subtrees maintain their own parent-child relationships within the larger DAG structure.

How Subtree Parents Work

Subtree Root Entries

Subtree root entries are entries that establish the beginning of a named subtree. They have these characteristics:

  • Contains the subtree: The entry has a SubTreeNode for the named subtree
  • Empty subtree parents: The subtree's parents field is empty ([])
  • Normal main tree parents: The entry still has normal parent relationships in the main tree

Example structure:

Entry {
    tree: TreeNode {
        root: "tree_id",
        parents: ["main_parent_1", "main_parent_2"], // Normal main tree parents
    },
    subtrees: [
        SubTreeNode {
            name: "messages",
            parents: [], // EMPTY - this makes it a subtree root
            data: "first_message_data",
        }
    ],
}

Non-Root Subtree Entries

Subsequent entries in the subtree have the previous subtree entries as parents:

Entry {
    tree: TreeNode {
        root: "tree_id",
        parents: ["main_parent_3"],
    },
    subtrees: [
        SubTreeNode {
            name: "messages",
            parents: ["previous_messages_entry_id"], // Points to previous subtree entry
            data: "second_message_data",
        }
    ],
}

Multi-Layer Validation System

The system uses multi-layer validation to ensure DAG integrity and ID format correctness (see Entry documentation for ID format details):

1. Entry Layer: Structural and Format Validation

The Entry::validate() method enforces critical invariants:

/// CRITICAL VALIDATION RULES:
/// 1. Root entries (with "_root" subtree): May have empty parents
/// 2. Non-root entries: MUST have at least one parent in main tree
/// 3. Empty parent IDs: Always rejected
/// 4. All IDs must be valid 64-character lowercase hex SHA-256 hashes

pub fn validate(&self) -> Result<()> {
    // Non-root entries MUST have main tree parents
    if !self.is_root() && self.parents()?.is_empty() {
        return Err(ValidationError::NonRootEntryWithoutParents);
    }

    // Validate all parent IDs are properly formatted (see Entry docs for format details)
    for parent in self.parents()? {
        if parent.is_empty() {
            return Err(ValidationError::EmptyParentId);
        }
        validate_id_format(parent, "main tree parent ID")?;
    }

    // Validate tree root ID format (when not empty)
    if !self.tree.root.is_empty() {
        validate_id_format(&self.tree.root, "tree root ID")?;
    }

    // Validate subtree parent IDs
    for subtree in &self.subtrees {
        for parent_id in &subtree.parents {
            validate_id_format(parent_id, "subtree parent ID")?;
        }
    }
    // ... additional validation
}

This prevents the creation of orphaned nodes and ensures all IDs are properly formatted.

2. Entry Builder: Build-time Validation

The EntryBuilder::build() method automatically validates entries before returning them:

pub fn build(mut self) -> Result<Entry> {
    // 1. Sort and deduplicate parent lists
    // 2. Sort subtrees by name
    // 3. Create the entry
    let entry = Entry { ... };

    // 4. VALIDATE before returning - catches errors at build time
    entry.validate()?;

    Ok(entry)
}

This means validation errors are caught immediately when building entries, providing clear error messages about ID format violations (see Entry documentation for format details).

3. Transaction Layer: Automatic Parent Discovery

When a transaction accesses a subtree for the first time, only then does it determine the correct subtree parents:

// Get subtree tips based on transaction context
let tips = if main_parents == current_database_tips {
    // Using current database tips - get all current subtree tips
    self.db.backend().get_store_tips(self.db.root_id(), &subtree_name)?
} else {
    // Using custom parent tips - get subtree tips reachable from those parents
    self.db.backend().get_store_tips_up_to_entries(
        self.db.root_id(),
        &subtree_name,
        &main_parents,
    )?
};

// Use the tips directly as subtree parents
builder.set_subtree_parents_mut(&subtree_name, tips);

The transaction system handles:

  • Normal operations: Uses current subtree tips from the database
  • Custom parent scenarios: Finds subtree tips reachable from specific main parents
  • First subtree entry: Returns empty tips, creating a subtree root

4. Backend Storage: Final Validation Gate

The backend put() method serves as the final validation gate before persistence:

/// CRITICAL VALIDATION GATE: Final check before persistence
pub(crate) fn put(
    backend: &InMemory,
    verification_status: VerificationStatus,
    entry: Entry,
) -> Result<()> {
    // Validate entry structure before storing
    entry.validate()?;  // HARD FAILURE on invalid entries

    // ... storage operations
}

5. LCA Traversal: Subtree Root Detection

During LCA (Lowest Common Ancestor) calculations, the system correctly identifies subtree roots:

match entry.subtree_parents(subtree) {
    Ok(parents) => {
        if parents.is_empty() {
            // This entry is a subtree root - don't traverse further up this subtree
        } else {
            // Entry has parents in the subtree, add them to traversal queue
            for parent in parents {
                queue.push_back(parent);
            }
        }
    }
    Err(_) => {
        // Entry doesn't contain this subtree - ERROR, should not happen in LCA
        return Err(BackendError::EntryNotInSubtree { ... });
    }
}

Common Scenarios

Scenario 1: Normal Sequential Operations

Entry 1 (root)
  └─ Entry 2 (messages subtree, parents: [])  // First message (subtree root)
      └─ Entry 3 (messages subtree, parents: [2])  // Second message

Scenario 2: Bidirectional Sync

Device 1: Entry 1 (root) → Entry 2 (message A, subtree parents: [])
Device 2: Syncs, gets Entry 1 & 2
Device 2: Entry 3 (message B, subtree parents: [2])
Device 1: Syncs back, creates Entry 4 (message C, subtree parents: [3])

Scenario 3: Diamond Pattern

        Entry 1 (root)
       /              \
   Entry 2A         Entry 2B
       \              /
        Entry 3 (merge)

The transaction system correctly handles finding subtree parents in diamond patterns using get_store_tips_up_to_entries.

API Usage

// The transaction automatically handles subtree parent discovery
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("messages")?;
store.set("content", "Hello world")?;
let entry_id = op.commit()?; // Parents automatically determined

Manual Entry Creation (Internal Only)

// ✅ CORRECT: Root entry (doesn't need parents)
let entry = Entry::root_builder()
    .set_subtree_data("data", "content")
    .build()
    .expect("Root entry should build successfully");

// ✅ CORRECT: Non-root entry with valid SHA-256 hex IDs
let entry = Entry::builder("a1b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef123456")
    .set_parents(vec!["b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef1234567a"])
    .set_subtree_data("messages", "data")
    .set_subtree_parents("messages", vec!["c3d4e5f6789012345678901234567890abcdef1234567890abcdef1234567ab2"])
    .build()
    .expect("Entry with valid IDs should build successfully");

// ❌ WRONG: Non-root entry without parents (WILL FAIL AT BUILD TIME)
let result = Entry::builder("tree_id").build();
assert!(result.is_err()); // Fails validation

// ❌ WRONG: Invalid ID format (WILL FAIL AT BUILD TIME)
let result = Entry::builder("invalid_id")
    .set_parents(vec!["also_invalid"])
    .build();
assert!(result.is_err()); // Fails ID format validation

Debugging Tips

Identifying Subtree Root Entries

Look for entries where:

  • entry.subtree_parents(subtree_name) returns Ok(vec![]) (empty parents)
  • The entry contains the subtree in question
  • This indicates the entry is the starting point for that subtree

Common Error Messages

  • "Entry is subtree root (empty parents)" - Normal operation, entry starts a new subtree
  • "Entry encountered in subtree LCA that doesn't contain the subtree" - Invalid state, entry should not be in subtree operations
  • "Non-root entry has empty main tree parents" - Validation failure, entry missing required parents
  • "Invalid ID format in main tree parent ID: 'xyz'. IDs must be exactly 64 characters" - ID format validation failure
  • "Invalid ID format in subtree 'messages' parent ID: 'ABC123'. IDs must contain only lowercase hexadecimal characters" - Uppercase or invalid characters in ID

Validation Points

  1. Entry building: ID format and structural validation at build time via EntryBuilder::build()
  2. Entry validation: Check that entries have proper main tree parents and valid ID formats
  3. Transaction commit: Subtree parents are automatically discovered and set
  4. Backend storage: Final validation before persistence
  5. LCA operations: Proper subtree traversal based on subtree parent relationships

Best Practices

  1. Use transactions for all entry creation - they handle parent discovery automatically and generate proper IDs
  2. Use Entry::root_builder() for standalone entries that start new DAGs
  3. Generate proper SHA-256 hex IDs when creating entries manually (for testing or advanced use cases)
  4. Handle build errors - EntryBuilder::build() can fail with validation errors
  5. Test with valid IDs - use proper 64-character hex strings in tests
  6. Monitor debug logs for subtree parent discovery during development

Implementation Details

The subtree parent system is implemented across:

  • crates/lib/src/entry/mod.rs: Entry structure and validation
  • crates/lib/src/transaction/mod.rs: Automatic parent discovery
  • crates/lib/src/backend/database/in_memory/storage.rs: Final validation gate
  • crates/lib/src/backend/database/in_memory/traversal.rs: LCA operations with subtree awareness

Each layer ensures proper subtree parent relationships and DAG integrity.

Testing Architecture

Eidetica employs a comprehensive testing strategy to ensure reliability and correctness. This document outlines our testing approach, organization, and best practices for developers working with or contributing to the codebase.

Test Organization

Eidetica centralizes all its tests into a unified integration test binary located in the tests/it/ directory. All testing is done through public interfaces, without separate unit tests, promoting interface stability.

The main categories of testing activities are:

Comprehensive Integration Tests

All tests for the Eidetica crate are located in the crates/lib/tests/it/ directory. These tests verify both:

  • Component behavior: Validating individual components through their public interfaces
  • System behavior: Ensuring different components interact correctly when used together

This unified suite is organized as a single integration test binary, following the pattern described by matklad.

The module structure within crates/lib/tests/it/ mirrors the main library structure from crates/lib/src/. Each major component has its own test module directory.

Example Applications as Tests

The examples/ directory contains standalone applications that demonstrate library features. While not traditional tests, these examples serve as pragmatic validation of the API's usability and functionality in real-world scenarios.

For instance, the examples/todo/ directory contains a complete Todo application that demonstrates practical usage of Eidetica, effectively acting as both documentation and functional validation.

Test Coverage Goals

Eidetica maintains ambitious test coverage targets:

  • Core Data Types: 95%+ coverage for all core data types (Entry, Database, SubTree)
  • CRDT Implementations: 100% coverage for all CRDT implementations
  • Database Implementations: 90%+ coverage, including error cases
  • Public API Methods: 100% coverage

Testing Patterns and Practices

Test-Driven Development

For new features, we follow a test-driven approach:

  1. Write tests defining expected behavior
  2. Implement features to satisfy those tests
  3. Refactor while maintaining test integrity

Interface-First Testing

We exclusively test through public interfaces. This approach ensures API stability.

Test Helpers

Eidetica provides test helpers organized into main helpers (crates/lib/tests/it/helpers.rs) for common instance and database setup, and module-specific helpers for specialized testing scenarios. Each test module has its own helpers.rs file with utilities specific to that component's testing needs.

Standard Test Structure

Tests follow a consistent setup-action-assertion pattern, utilizing test helpers for environment preparation and result verification.

Error Case Testing

Tests cover both successful operations and error conditions to ensure robust error handling throughout the system.

CRDT-Specific Testing

Given Eidetica's CRDT foundation, special attention is paid to testing CRDT properties:

  1. Merge Semantics: Validating that merge operations produce expected results
  2. Conflict Resolution: Ensuring conflicts resolve according to CRDT rules
  3. Determinism: Verifying that operations are commutative when required

Running Tests

Basic Test Execution

Run all tests with:

cargo test
# Or using the task runner
task test

Eidetica uses nextest for test execution, which provides improved test output and performance:

cargo nextest run --workspace --all-features

Targeted Testing

Run specific test categories:

# Run all integration tests
cargo test --test it

# Run specific integration tests
cargo nextest run tests::it::store

Module patterns also work with plain cargo test, e.g. cargo test --test it auth:: to run only the auth tests.

Coverage Analysis

Eidetica uses tarpaulin for code coverage analysis:

# Run with coverage analysis
task coverage
# or
cargo tarpaulin --workspace --skip-clean --include-tests --all-features --output-dir coverage --out lcov

Module Test Organization

Each test module follows a consistent structure with mod.rs for declarations, helpers.rs for module-specific utilities, and separate files for different features or aspects being tested.

Contributing New Tests

When adding features or fixing bugs:

  1. Add focused tests to the appropriate module within the crates/lib/tests/it/ directory. These tests should cover:
    • Specific functionality of the component or module being changed through its public interface.
    • Interactions between the component and other parts of the system.
  2. Consider adding example code in the examples/ directory for significant new features to demonstrate usage and provide further validation.
  3. Test both normal operation ("happy path") and error cases.
  4. Use the test helpers in crates/lib/tests/it/helpers.rs for general setup, and module-specific helpers for specialized scenarios.
  5. If you need common test utilities for a new pattern, add them to the appropriate helpers.rs file.

Best Practices

  • Descriptive Test Names: Use test_<component>_<functionality> or test_<functionality>_<scenario> naming pattern
  • Self-Documenting Tests: Write clear test code with useful comments
  • Isolation: Ensure tests don't interfere with each other
  • Speed: Keep tests fast to encourage frequent test runs
  • Determinism: Avoid flaky tests that intermittently fail

Performance Considerations

The architecture provides several performance characteristics:

  • Content-addressable storage: Enables efficient deduplication through SHA-256 content hashing.
  • Database structure (DAG): Supports partial replication and sparse checkouts. Tip calculation complexity depends on parent relationships.
  • InMemoryDatabase: Provides high-speed operations but is limited by available RAM.
  • Lock-based concurrency: May create bottlenecks in high-concurrency write scenarios.
  • Height calculation: Uses BFS-based topological sorting with O(V + E) complexity.
  • CRDT merge algorithm: Employs recursive LCA-based merging with intelligent caching.

CRDT Merge Performance

The recursive LCA-based merge algorithm uses caching for performance optimization:

Algorithm Complexity

  • Cached states: O(1) amortized performance
  • Uncached states: O(D × M) where D is DAG depth and M is merge cost
  • Overall performance benefits from high cache hit rates

Key Performance Benefits

  • Efficient handling of complex DAG structures
  • Optimized path finding reduces database calls
  • Cache eliminates redundant computations
  • Scales well with DAG complexity through memoization
  • Memory-computation trade-off favors cached access patterns

Error Handling

The database uses a custom Result (crate::Result) and Error (crate::Error) type hierarchy defined in src/lib.rs. Errors are typically propagated up the call stack using Result.

The Error enum uses a modular approach with structured error types from each component:

  • Io(#[from] std::io::Error): Wraps underlying I/O errors from backend operations or file system access.
  • Serialize(#[from] serde_json::Error): Wraps errors occurring during JSON serialization or deserialization.
  • Auth(auth::AuthError): Structured authentication errors with detailed context.
  • Backend(backend::DatabaseError): Database storage and retrieval errors.
  • Instance(instance::InstanceError): Instance management errors.
  • CRDT(crdt::CRDTError): CRDT operation and merge errors.
  • Store(store::StoreError): Store data access and validation errors.
  • Transaction(transaction::TransactionError): Transaction coordination errors.

The use of #[error(transparent)] allows for zero-cost conversion from module-specific errors into crate::Error using the ? operator. Helper methods like is_not_found(), is_permission_denied(), and is_authentication_error() enable categorized error handling without pattern matching on specific variants.
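
For instance, a caller can branch on categories without matching variants. A minimal sketch; get_entry, process, and create_default_entry are hypothetical:

match database.get_entry(&entry_id) {
    Ok(entry) => process(entry),
    // Categorized check: works regardless of which module produced the error.
    Err(e) if e.is_not_found() => create_default_entry(&entry_id)?,
    Err(e) => return Err(e),
}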

Best Practices

This section documents established patterns and guidelines for developing within the Eidetica codebase. Following these practices ensures consistency, performance, and maintainability across the project.

Overview

The best practices documentation covers:

  • API Design Patterns - Guidelines for string parameters, conversion patterns, and performance considerations
  • Module Organization - Code structure, dependency management, and module design patterns
  • Error Handling - Structured error types, error propagation, and error handling strategies
  • Testing - Integration testing, test organization, and comprehensive validation strategies
  • Performance - Hot path optimization, memory efficiency, and scalable algorithms
  • Security - Authentication, authorization, cryptographic operations, and secure data handling
  • Documentation - Documentation standards, API documentation, and writing guidelines

Core Principles

All best practices in Eidetica are built around these fundamental principles:

1. Performance with Ergonomics

  • Optimize for common use cases without sacrificing API usability
  • Minimize conversion overhead while maintaining flexible parameter types
  • Use appropriate generic bounds to avoid double conversions

2. Consistency Across Components

  • Similar operations should have similar APIs across different modules
  • Follow established patterns for parameter types and method naming
  • Maintain consistent error handling and documentation patterns

3. Clear Intent and Documentation

  • Function signatures should clearly communicate their intended usage
  • Parameter types should indicate whether data is stored or accessed
  • Performance characteristics should be documented for critical paths

4. Future-Ready Design

  • Backward compatibility is NOT required during development
  • Breaking changes are acceptable for both API and storage format
  • Focus on correctness and performance over compatibility at this stage

Quick Reference

For New Contributors

Start with these essential guides:

  1. Module Organization - Understanding code structure and dependencies
  2. Error Handling - How errors work throughout the system
  3. Testing - Writing and running tests effectively
  4. Documentation - Writing good documentation and examples

For API Development

Focus on these areas for public API work:

  1. API Design Patterns - String parameters and method design
  2. Performance - Hot path optimization and memory efficiency
  3. Security - Authentication and secure coding practices

For Internal Development

These guides cover internal implementation patterns:

  1. Module Organization - Internal module structure and abstractions
  2. Performance - CRDT algorithms and backend optimization
  3. Testing - Integration testing and test helper patterns

Implementation Guidelines

When implementing new features or modifying existing code:

  1. Review existing patterns in similar components
  2. Follow the established conventions documented in this section
  3. Add comprehensive tests that validate the patterns
  4. Document the rationale for any deviations from established patterns
  5. Update documentation to reflect new patterns or changes

Contributing to Best Practices

These best practices evolve based on:

  • Lessons learned from real-world usage
  • Performance analysis and optimization needs
  • Developer feedback and common patterns
  • Code review discussions and decisions

When proposing changes to established patterns, include:

  • Rationale for the change
  • Performance impact analysis
  • Updated documentation and examples

API Design Patterns

This document outlines established patterns for API design within the Eidetica codebase, with particular emphasis on string parameter handling, conversion patterns, and performance considerations.

String Parameter Guidelines

One of the most important API design decisions in Rust is choosing the right parameter types for string data. Eidetica follows specific patterns to optimize performance while maintaining ergonomic APIs.

Core Principle: Storage vs Lookup Pattern

The fundamental rule for string parameters in Eidetica:

  • Use Into<String> for parameters that will be stored (converted to owned String)
  • Use AsRef<str> for parameters that are only accessed temporarily (lookup, comparison)

When to Use Into<String>

Use impl Into<String> when the function will store the parameter as an owned String. This avoids double conversion and is more efficient for storage operations while still accepting &str, String, and &String transparently.

When to Use AsRef<str>

Use impl AsRef<str> when the function only needs to read the string temporarily for lookups, comparisons, or validation. This provides maximum flexibility with no unnecessary allocations and clearly indicates the parameter is not stored.

Anti-Patterns to Avoid

Never use AsRef<str> followed by immediate .to_string() - this causes double conversion. Instead, use Into<String> for direct conversion when storing the value.
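
Side by side, with hypothetical helpers:

// Anti-pattern: the caller's owned String is borrowed, then copied again.
fn store_bad(keys: &mut Vec<String>, key: impl AsRef<str>) {
    keys.push(key.as_ref().to_string());
}

// Preferred: an owned String moves through with no extra copy.
fn store_good(keys: &mut Vec<String>, key: impl Into<String>) {
    keys.push(key.into());
}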

Common Conversion Patterns

ID Types

For ID parameters, prefer Into<ID> when working with ID-typed fields for clear intent and type safety.

Path Segments

For path operations, use Into<String> with Clone bounds when segments will be stored as keys.

Performance Guidelines

Hot Path Optimizations

For performance-critical operations:

  1. Bulk Operations: Convert all parameters upfront to avoid per-iteration conversions (see the sketch after this list)
  2. Iterator Chains: Prefer direct loops over complex iterator chains in hot paths
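
A sketch of the upfront-conversion pattern from item 1 (the function is hypothetical):

use std::collections::BTreeMap;

// Convert the shared prefix once, outside the loop, rather than on
// every iteration.
fn insert_with_prefix(
    map: &mut BTreeMap<String, u64>,
    prefix: impl Into<String>,
    values: &[u64],
) {
    let prefix = prefix.into();
    for (i, v) in values.iter().enumerate() {
        map.insert(format!("{prefix}.{i}"), *v);
    }
}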

API Documentation Standards

Always document the expected usage pattern for string parameters, indicating whether the parameter will be stored or just accessed, and which string types are accepted.

Testing Patterns

Ensure APIs work with all string types (&str, String, &String) by testing conversion compatibility.

API Evolution Guidelines

During development, APIs can be freely changed to follow best practices. Update methods directly with improved parameter types, add comprehensive tests, update documentation, and consider performance impact. Breaking changes are acceptable when they improve performance, ergonomics, or consistency.

Summary

Following these patterns ensures:

  • Optimal performance through minimal conversions
  • Consistent APIs across the codebase
  • Clear intent about parameter usage
  • Maximum flexibility for API consumers
  • Maintainable code for future development

When in doubt, ask: "Is this parameter stored or just accessed?" The answer determines whether to use Into<String> or AsRef<str>.

Module Organization

This document outlines best practices for organizing code modules within the Eidetica codebase, focusing on clear separation of concerns, consistent structure, and maintainable hierarchies.

Module Hierarchy Principles

1. Domain-Driven Organization

Organize modules around business domains and functionality rather than technical layers. Each module should have a clear responsibility and evolve independently while maintaining clean boundaries.

2. Consistent Module Structure

Every module should follow a standard internal structure with mod.rs for public API and re-exports, errors.rs for module-specific error types, and separate files for implementation logic. Keep related functionality together within the same module.

3. Error Module Standards

Each module must define its own error type with #[non_exhaustive] for future compatibility, semantic helper methods for error classification, transparent delegation for dependency errors, and contextual information in error variants.
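
A sketch of these conventions using the thiserror crate (the variant and helper names are illustrative, not the crate's actual error types):

use thiserror::Error;

#[derive(Debug, Error)]
pub enum StoreError {
    #[error("store I/O failed")]
    Io(#[from] std::io::Error),
}

#[derive(Debug, Error)]
#[non_exhaustive] // new variants can be added without breaking callers
pub enum AuthError {
    // Contextual information: the error carries the key it failed on.
    #[error("key '{key_name}' not found")]
    KeyNotFound { key_name: String },
    #[error("permission denied for key '{key_name}'")]
    PermissionDenied { key_name: String },
    // Transparent delegation to a dependency's error type.
    #[error(transparent)]
    Store(#[from] StoreError),
}

impl AuthError {
    // Semantic helper so callers match on meaning, not on variants.
    pub fn is_permission_denied(&self) -> bool {
        matches!(self, AuthError::PermissionDenied { .. })
    }
}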

Public API Design

1. Clean Re-exports

Module mod.rs files should provide clean public APIs with clear documentation, selective re-exports of public types, and convenient access to commonly used shared types.

2. Module Documentation Standards

Every module should have comprehensive documentation including purpose, core functionality, usage examples, integration points, and performance considerations.

Dependency Management

1. Dependency Direction

Maintain clear dependency hierarchies where higher-level modules depend on lower-level modules, modules at the same level avoid direct dependencies, and trait abstractions break circular dependencies when needed.

2. Feature Gating

Use feature flags for optional functionality, gating modules and exports appropriately with #[cfg(feature = "...")] attributes.
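
For example, in a module's mod.rs (the feature and module names are hypothetical):

// Compile and export the transport only when the feature is enabled.
#[cfg(feature = "http-transport")]
pub mod http;

#[cfg(feature = "http-transport")]
pub use http::HttpTransport;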

Module Communication Patterns

1. Trait-Based Abstractions

Use traits to define interfaces between modules, allowing implementation modules to depend on abstractions rather than concrete types.

2. Event-Driven Communication

Consider event patterns for decoupled communication, particularly useful for logging, metrics, or cross-cutting concerns without introducing tight coupling.

Testing Integration

Integration tests should mirror the module structure with module-specific helpers for each domain. Test organization should follow the same hierarchy as the source modules.

Common Anti-Patterns to Avoid

  • Circular Dependencies - Modules depending on each other in cycles
  • God Modules - Single modules containing unrelated functionality
  • Leaky Abstractions - Exposing internal implementation details through public APIs
  • Flat Structure - No hierarchy or organization in module layout
  • Mixed Concerns - Business logic mixed with infrastructure code

Migration Guidelines

When restructuring modules: plan the new structure, use deprecation warnings for API changes when needed, create integration tests to verify functionality, update documentation, and consider backward compatibility implications.

Summary

Good module organization provides:

  • Clear separation of concerns with well-defined boundaries
  • Predictable structure that developers can navigate easily
  • Maintainable dependencies with clear hierarchies
  • Testable interfaces with appropriate abstractions
  • Extensible design that can grow with the project

Following these patterns ensures the codebase remains organized and maintainable as it evolves.

Error Handling Best Practices

This document outlines the error handling patterns and practices used throughout the Eidetica codebase, focusing on structured errors, ergonomic APIs, and maintainable error propagation.

Core Error Architecture

1. Unified Result Type

Eidetica uses a unified Result<T> type across the entire codebase with automatic conversion between module-specific errors and the main error type. This provides consistent error handling and a single import for Result type throughout the codebase.

2. Module-Specific Error Types

Each module defines its own structured error type with semantic helpers. Error types include contextual information and helper methods for classification (e.g., is_authentication_error(), is_permission_denied()).

Error Design Patterns

1. Semantic Error Classification

Provide helper methods that allow callers to handle errors semantically, such as is_not_found(), is_storage_error(), or is_corruption_error(). This enables clean error handling based on error semantics rather than type matching.

2. Contextual Error Information

Include relevant context in error variants, such as available options when something is not found, or specific reasons for failures. This debugging information helps users understand what went wrong and can assist in error recovery.

3. Error Conversion Patterns

Use #[from] and #[error(transparent)] for zero-cost error conversion between module boundaries. This allows wrapping errors with additional context or passing them through directly.

4. Non-Exhaustive Error Enums

Use #[non_exhaustive] on error enums to allow adding new error variants in the future without breaking existing code.

Error Handling Strategies

1. Early Return with ? Operator

Use the ? operator for clean error propagation, validating preconditions early and returning errors as soon as they occur.

2. Error Context Enhancement

Add context when propagating errors up the call stack by wrapping lower-level errors with higher-level context that explains what operation failed.

3. Fallible Iterator Patterns

Handle errors in iterator chains gracefully by either failing fast on the first error or collecting all results before handling errors, depending on the use case.

Authentication Error Patterns

1. Permission-Based Errors

Structure authentication errors to be actionable by including what permission was required, what the user had, and potentially what options are available.

2. Security Error Handling

Be careful not to leak sensitive information in error messages. Reference resources by name or ID rather than content, and avoid exposing internal system details.

Performance Considerations

1. Error Allocation Optimization

Minimize allocations in error creation by using static strings for fixed messages and avoiding unnecessary string formatting in hot paths.

2. Error Path Optimization

Keep error paths simple and fast by deferring error creation until actually needed, using closures with ok_or_else() rather than ok_or().
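
The deferred-creation pattern in miniature (a self-contained sketch, not library code):

use std::collections::HashMap;

fn lookup(map: &HashMap<String, u64>, key: &str) -> Result<u64, String> {
    // ok_or(format!(...)) would build the error String even on success;
    // ok_or_else runs the closure only on the error path.
    map.get(key)
        .copied()
        .ok_or_else(|| format!("key '{key}' not found"))
}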

Testing Error Conditions

1. Error Testing Patterns

Test both error conditions and error classification helpers. Verify that error context is preserved through the error chain and that error messages contain expected information.

2. Error Helper Testing

Test semantic error classification helpers to ensure they correctly identify error categories and that the classification logic remains consistent as error types evolve.

Common Anti-Patterns

  • String-based errors - Avoid unstructured string errors that lack context
  • Generic error types - Don't use overly generic errors that lose type information
  • Panic on recoverable errors - Return Result instead of using unwrap() or expect()
  • Leaking sensitive information - Don't expose internal details in error messages

Migration Guidelines

When updating error handling, maintain error semantics, add context gradually, test error paths thoroughly, and keep documentation current.

Summary

Effective error handling in Eidetica provides:

  • Structured error types with rich context and classification
  • Consistent error patterns across all modules
  • Semantic error helpers for easy error handling in calling code
  • Zero-cost error conversion between module boundaries
  • Performance-conscious error creation and propagation
  • Testable error conditions with comprehensive coverage

Following these patterns ensures errors are informative, actionable, and maintainable throughout the codebase evolution.

Testing Best Practices

This document outlines testing patterns and practices used in the Eidetica codebase, focusing on integration testing, test organization, and comprehensive validation strategies.

Testing Architecture

1. Integration-First Testing Strategy

Eidetica uses a single integration test binary approach rather than unit tests, organized in tests/it/ with modules mirroring the main codebase structure.

Key principle: Test through public interfaces to validate real-world usage patterns.

2. Test Module Organization

Each test module mirrors the main codebase structure, with mod.rs for declarations, helpers.rs for utilities, and separate files for different features.

3. Comprehensive Test Helpers

The codebase provides helper functions in tests/it/helpers.rs for common setup scenarios and module-specific helpers for specialized testing needs.

Authentication Testing Patterns

The auth module provides specialized helpers for testing authentication scenarios, including key creation macros, permission setup utilities, and operation validation helpers.

Permission Testing

Test authentication and authorization systematically using the auth module helpers to verify different permission levels and access control scenarios.

CRDT Testing

Test CRDT properties including merge semantics, conflict resolution, and deterministic behavior. The crdt module provides specialized helpers for testing commutativity, associativity, and idempotency of CRDT operations.

Performance Testing

Performance testing can be done using criterion benchmarks alongside integration tests. Consider memory allocation patterns and operation timing in critical paths.

Error Testing

Comprehensive error testing ensures robust error handling throughout the system. Test both error conditions and recovery scenarios to validate system resilience.

Test Data Management

Create realistic test data using builder patterns for complex scenarios. Consider property-based testing for CRDT operations to validate mathematical properties like commutativity, associativity, and idempotency.

Test Organization

Organize tests by functionality and use environment variables for test configuration. Use #[ignore] for expensive tests that should only run on demand.

Testing Anti-Patterns to Avoid

  • Overly complex test setup - Keep setup minimal and use helpers
  • Testing implementation details - Test behavior through public interfaces
  • Flaky tests with timing dependencies - Avoid sleep() and timing assumptions
  • Buried assertions - Make test intent clear with obvious assertions

Summary

Effective testing in Eidetica provides:

  • Integration-focused approach that tests real-world usage patterns
  • Comprehensive helpers that reduce test boilerplate and improve maintainability
  • Authentication testing that validates security and permission systems
  • CRDT testing that ensures merge semantics and conflict resolution work correctly
  • Performance testing that validates system behavior under load
  • Error condition testing that ensures robust error handling and recovery

Following these patterns ensures the codebase maintains high quality and reliability as it evolves.

Performance Best Practices

This document outlines performance optimization patterns used throughout the Eidetica codebase.

Core Performance Principles

1. Hot Path Optimization

Identify and optimize performance-critical code paths. Common hot paths in Eidetica include CRDT state computation, entry storage/retrieval, authentication verification, bulk operations, and string conversions.

2. Memory Efficiency

Minimize allocations through appropriate string parameter types, pre-allocation of collections, stack allocation preference, and efficient caching strategies.

3. Algorithmic Efficiency

Choose algorithms that scale well with data size by using appropriate data structures, implementing caching for expensive computations, and preferring direct iteration over complex iterator chains in hot paths.

String Parameter Optimization

1. Parameter Type Selection

Use Into<String> for stored parameters and AsRef<str> for lookup operations to minimize allocations and conversions.

2. Bulk Operation Optimization

Convert parameters upfront for bulk operations rather than converting on each iteration to reduce overhead.

Memory Allocation Patterns

1. Pre-allocation Strategies

Allocate collections with known or estimated capacity to reduce reallocation overhead. Pre-allocate strings when building keys or compound values.
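
For example (a sketch; Vec::with_capacity(n) serves the same purpose for collections):

fn compound_key(prefix: &str, name: &str) -> String {
    // Size the buffer for "prefix.name" upfront to avoid regrowth.
    let mut key = String::with_capacity(prefix.len() + 1 + name.len());
    key.push_str(prefix);
    key.push('.');
    key.push_str(name);
    key
}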

2. Memory-Efficient Data Structures

Choose data structures based on access patterns: BTreeMap for ordered iteration and range queries, HashMap for fast lookups, and Vec for dense indexed access.

3. Avoiding Unnecessary Clones

Use references and borrowing effectively. Work with references when possible and clone only when ownership transfer is required.

CRDT Performance Patterns

1. State Computation Caching

Cache expensive CRDT state computations using entry ID and store name as cache keys. Immutable entries eliminate cache invalidation concerns.
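
The shape of such a cache, as a sketch (the real implementation's types differ):

use std::collections::HashMap;

// Keyed by (entry_id, store_name); because entries are immutable,
// a cached state never needs invalidation.
struct StateCache<S> {
    states: HashMap<(String, String), S>,
}

impl<S: Clone> StateCache<S> {
    fn get_or_compute(
        &mut self,
        entry_id: &str,
        store: &str,
        compute: impl FnOnce() -> S,
    ) -> S {
        self.states
            .entry((entry_id.to_string(), store.to_string()))
            .or_insert_with(compute)
            .clone()
    }
}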

2. Efficient Merge Operations

Optimize merge algorithms by pre-allocating capacity, performing in-place merges when possible, and cloning only when adding new keys.

3. Lazy Computation Patterns

Defer expensive computations until needed using lazy initialization patterns to avoid unnecessary work.

Backend Performance Patterns

1. Batch Operations

Optimize backend operations for bulk access by implementing batch retrieval and storage methods that leverage backend-specific bulk operations.

2. Connection Pooling and Resource Management

Use connection pooling for expensive resources and implement read caching with bounded LRU caches to reduce backend load.

Algorithm Optimization

1. Direct Loops vs Iterator Chains

Prefer direct loops over complex iterator chains in hot paths for better performance and clearer control flow.

2. Efficient Graph Traversal

Use iterative traversal with explicit stacks to avoid recursion overhead and maintain visited sets to prevent redundant processing in DAG traversal.
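
An iterative sketch of that traversal, generic over a parent-lookup function (not the crate's actual API):

use std::collections::HashSet;

// Visit every ancestor of `start` exactly once, without recursion.
fn collect_ancestors<F>(start: &str, parents_of: F) -> HashSet<String>
where
    F: Fn(&str) -> Vec<String>,
{
    let mut visited = HashSet::new();
    let mut stack = vec![start.to_string()];
    while let Some(id) = stack.pop() {
        // insert() returns false if already visited, pruning redundant work.
        if visited.insert(id.clone()) {
            stack.extend(parents_of(&id));
        }
    }
    visited
}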

Profiling and Measurement

1. Benchmark-Driven Development

Use criterion for performance testing with varied data sizes to understand scaling characteristics.

2. Performance Monitoring

Track operation timings in critical paths to identify bottlenecks and measure optimization effectiveness.

Memory Profiling

1. Memory Usage Tracking

Implement allocation tracking for operations to identify memory-intensive code paths and optimize accordingly.

2. Memory-Efficient Collections

Use adaptive collection types that switch between Vec for small collections and HashMap for larger ones to optimize memory usage patterns.

Common Performance Anti-Patterns

Avoid unnecessary string allocations through repeated concatenation, repeated expensive computations that could be cached, and unbounded cache growth that leads to memory exhaustion.

Summary

Effective performance optimization in Eidetica focuses on string parameter optimization, memory-efficient patterns, hot path optimization, CRDT performance with caching, backend optimization with batch operations, and algorithm efficiency. Following these patterns ensures the system maintains good performance characteristics as it scales.

Security Best Practices

This document outlines security patterns and practices used throughout the Eidetica codebase.

Core Security Architecture

1. Authentication System

Eidetica uses Ed25519 digital signatures to authenticate every entry. Verification is fast, and content-addressable entries enable automatic tamper detection. All entries must be signed by authorized keys, with private keys stored separately from synchronized data.

2. Authorization Model

The system implements a hierarchical permission model with three levels: Read (view data and compute states), Write (create and modify entries), and Admin (manage permissions and authentication settings). Permissions follow a hierarchical structure where higher levels include all lower-level permissions.

3. Secure Entry Creation

All entries require authentication during creation. The system verifies authentication keys exist and have appropriate permissions before creating entries. Each entry is signed and stored with verification to ensure integrity.

Cryptographic Best Practices

1. Digital Signature Handling

Ed25519 signatures provide authentication for all entries. The system creates signatures from canonical byte representations and verifies them against stored public keys to ensure data integrity and authenticity.
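
The overall flow, sketched with the ed25519-dalek crate (v2 API with the rand_core feature; illustrative, not Eidetica's internal code):

use ed25519_dalek::{Signature, Signer, SigningKey, Verifier};
use rand::rngs::OsRng;

fn sign_and_verify(canonical_bytes: &[u8]) -> bool {
    // Generate a key from a cryptographically secure RNG.
    let signing_key = SigningKey::generate(&mut OsRng);
    // Sign the canonical byte representation of the entry.
    let signature: Signature = signing_key.sign(canonical_bytes);
    // Verify against the corresponding public key.
    signing_key
        .verifying_key()
        .verify(canonical_bytes, &signature)
        .is_ok()
}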

2. Key Generation and Storage

Keys are generated using cryptographically secure random number generators. Private keys are stored separately from public keys and are securely cleared from memory when removed to prevent key material leakage.

3. Canonical Serialization

The system ensures consistent serialization for signature verification by sorting all fields deterministically and creating canonical JSON representations. This prevents signature verification failures due to serialization differences.

Permission Management

1. Database-Level Permissions

Each database maintains fine-grained permissions mapping keys to permission levels. The system checks permissions by looking up key-specific permissions or falling back to default permissions. Admin-only operations include permission updates, with safeguards to prevent self-lockout.

2. Operation-Specific Authorization

Different operations require different permission levels: reading data requires Read permission, writing data requires Write permission, and managing settings or permissions requires Admin permission. The system enforces these requirements before allowing any operation to proceed.

Secure Data Handling

1. Input Validation

All inputs undergo validation to prevent injection and malformation attacks. Entry IDs must be valid hex-encoded SHA-256 hashes, key names must contain only safe alphanumeric characters, and store names cannot conflict with reserved system names. The system enforces strict size limits and character restrictions.

2. Secure Serialization

The system prevents deserialization attacks through custom deserializers that validate data during parsing. Entry data is subject to size limits and format validation, ensuring only well-formed data enters the system.

Attack Prevention

1. Denial of Service Protection

The system implements comprehensive resource limits including maximum entry sizes, store counts, and parent node limits. Rate limiting prevents excessive operations per second from any single key, with configurable thresholds to balance security and usability.

2. Hash Collision Protection

SHA-256 hashing ensures content-addressable IDs are collision-resistant. The system verifies that entry IDs match their content hash, detecting any tampering or corruption attempts.

3. Timing Attack Prevention

Security-sensitive comparisons use constant-time operations to prevent timing-based information leakage. This includes signature comparisons and key matching operations.

Audit and Logging

1. Security Event Logging

The system logs all security-relevant events including authentication attempts, permission denials, rate limit violations, and key management operations. Events are timestamped and can be forwarded to external monitoring systems for centralized security analysis.

2. Intrusion Detection

Active monitoring detects suspicious patterns such as repeated authentication failures indicating brute force attempts or unusual operation frequencies suggesting system abuse. The detector maintains sliding time windows to track patterns and generate alerts when thresholds are exceeded.

Common Security Anti-Patterns

Key security mistakes to avoid include storing private keys in plain text, missing input validation, leaking sensitive information in error messages, and using weak random number generation. Always use proper key types with secure memory handling, validate all inputs, provide generic error messages, and use cryptographically secure random number generators.

Summary

Effective security in Eidetica encompasses strong authentication with Ed25519 digital signatures, fine-grained authorization with hierarchical permissions, secure cryptographic operations with proper key management, comprehensive input validation, attack prevention through rate limiting and resource controls, and thorough auditing with intrusion detection capabilities.

Documentation Best Practices

This document outlines documentation standards and practices used throughout the Eidetica codebase.

Documentation Philosophy

Documentation as Code

Documentation receives the same rigor as source code - version controlled, reviewed, tested, and maintained alongside code changes.

Audience-Focused Writing

Each documentation type serves specific audiences: public API docs for library users, internal docs for contributors, architecture docs for design understanding, and best practices for development consistency.

Progressive Disclosure

Information flows from general to specific: overview to getting started to detailed guides to reference documentation.

API Documentation Standards

Module-Level Documentation

Every module requires comprehensive header documentation including core functionality, integration points, security considerations, and performance notes. Module docs should provide an overview of the module's purpose and how it fits into the larger system.

Function Documentation Standards

Document all public functions with: purpose description, parameter details, performance notes, related functions, and error conditions. Focus on what the function does and when to use it, not implementation details.

Type Documentation

Document structs, enums, and traits with context about their purpose, usage patterns, and implementation notes. Focus on when and why to use the type, not just what it does.

Error Documentation

Document error types with context about when they occur, what they mean, and how to recover from them. Include security implications where relevant.

Code Example Standards

All documentation examples must be complete, runnable, and testable. Examples should demonstrate proper error handling patterns and include performance guidance where relevant. Use realistic scenarios and show best practices.

Internal Documentation

Architecture Decision Records (ADRs)

Document significant design decisions with status, context, decision rationale, and consequences. ADRs help future contributors understand why specific choices were made.

Design Rationale Documentation

Complex implementations should include explanations of algorithm choices, performance characteristics, and trade-offs. Focus on the "why" behind implementation decisions.

TODO and Known Limitations

Document current limitations and planned improvements with clear categorization. Include guidance for contributors who want to help address these items.

Documentation Testing

Doctests

All documentation examples must compile and run. Use cargo test --doc to verify examples work correctly. Examples should include proper imports and error handling.

Documentation Coverage

Track coverage with RUSTDOCFLAGS="-D missing_docs" cargo doc to ensure all public APIs are documented. Check for broken links and maintain comprehensive documentation coverage.

External Documentation

User Guide Structure

Organize documentation progressively from overview to detailed reference. Structure includes user guides for problem-solving, internal docs for implementation details, and generated API documentation.

Contribution Guidelines

Different documentation types serve different purposes: user docs focus on solving problems with clear examples, internal docs explain implementation decisions, and API docs provide comprehensive reference material. All examples must compile and demonstrate best practices.

Common Documentation Anti-Patterns

Avoid outdated examples that no longer work with current APIs, incomplete examples missing imports or setup, implementation-focused documentation that explains how rather than what and why, and missing context about when to use functionality.

Good documentation provides clear purpose, complete examples, proper context, parameter descriptions, return value information, and performance characteristics.

Summary

Effective documentation in Eidetica treats documentation as code, focuses on specific audiences, uses progressive disclosure, maintains comprehensive API documentation, provides clear user guides, explains design decisions, ensures all examples are tested and working, and follows consistent standards. These practices ensure documentation remains valuable, accurate, and maintainable as the project evolves.

Logging Best Practices

This guide documents best practices for using the tracing crate within the Eidetica codebase.

Overview

Eidetica uses the tracing crate for all logging needs. This provides:

  • Structured logging with minimal overhead
  • Compile-time optimization for disabled log levels
  • Span-based context for async operations
  • Integration with external observability tools

Log Level Guidelines

Choose log levels based on the importance and frequency of events:

ERROR (tracing::error!)

Use for unrecoverable errors that prevent operations from completing:

tracing::error!("Failed to store entry {}: {}", entry.id(), error);

When to use:

  • Database operation failures
  • Network errors that can't be retried
  • Authentication/authorization failures
  • Corrupted data detection

WARN (tracing::warn!)

Use for important warnings that don't prevent operation:

tracing::warn!("Failed to send to {}: {}. Adding to retry queue.", peer, error);

When to use:

  • Retryable failures
  • Invalid configuration (with fallback)
  • Deprecated feature usage
  • Performance degradation detected

INFO (tracing::info!)

Use for high-level operational messages:

tracing::info!("Sync server started on {}", address);

When to use:

  • Service lifecycle events (start/stop)
  • Successful major operations
  • Configuration changes
  • Important state transitions

DEBUG (tracing::debug!)

Use for detailed operational information:

tracing::debug!("Syncing {} databases with peer {}", tree_count, peer_id);

When to use:

  • Detailed operation progress
  • Protocol interactions
  • Algorithm steps
  • Non-critical state changes

TRACE (tracing::trace!)

Use for very detailed trace information:

tracing::trace!("Processing entry {} with {} parents", entry_id, parent_count);

When to use:

  • Individual item processing
  • Detailed algorithm execution
  • Network packet contents
  • Frequent operations in hot paths

Performance Considerations

Hot Path Optimization

For performance-critical code paths, follow these guidelines:

  1. Use appropriate levels: Hot paths should use trace! to avoid overhead
  2. Avoid string formatting: Use structured fields instead
  3. Check before complex operations: Use tracing::enabled! for expensive log data

// Good: Structured fields, minimal overhead
tracing::trace!(entry_id = %entry.id(), parent_count = parents.len(), "Processing entry");

// Bad: String formatting in hot path
tracing::debug!("Processing entry {} with {} parents", entry.id(), parents.len());

// Good: Check before expensive operation
if tracing::enabled!(tracing::Level::TRACE) {
    let debug_info = expensive_debug_calculation();
    tracing::trace!("Debug info: {}", debug_info);
}

Async and Background Operations

Use spans to provide context for async operations:

use tracing::{info_span, Instrument};

async fn sync_with_peer(peer_id: &str) {
    async {
        tracing::debug!("Starting sync");
        // ... sync logic ...
        tracing::debug!("Sync complete");
    }
    .instrument(info_span!("sync", peer_id = %peer_id))
    .await;
}

Module-Specific Guidelines

Sync Module

  • Use info! for server lifecycle and peer connections
  • Use debug! for sync protocol operations
  • Use trace! for individual entry transfers
  • Use spans for peer-specific context

Backend Module

  • Use error! for storage failures
  • Use debug! for cache operations
  • Use trace! for individual entry operations

Authentication Module

  • Use error! for signature verification failures
  • Use error! for permission violations
  • Use debug! for key operations
  • Never log private keys or sensitive data

CRDT Module

  • Use debug! for merge operations
  • Use trace! for individual CRDT operations
  • Include operation type in structured fields

Structured Logging

Prefer structured fields over string interpolation:

// Good: Structured fields
tracing::info!(
    tree_id = %database.id(),
    entry_count = entries.len(),
    peer = %peer_address,
    "Synchronizing database"
);

// Bad: String interpolation
tracing::info!(
    "Synchronizing database {} with {} entries to peer {}",
    database.id(), entries.len(), peer_address
);

Error Context

When logging errors, include relevant context:

// Good: Includes context
tracing::error!(
    error = %e,
    entry_id = %entry.id(),
    tree_id = %database.id(),
    "Failed to store entry during sync"
);

// Bad: Missing context
tracing::error!("Failed to store entry: {}", e);

Testing with Logs

Automatic Test Logging Setup

Eidetica uses a global test setup with the ctor crate to automatically initialize tracing for all tests. This is configured in tests/it/main.rs:
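
The file's exact contents aren't reproduced here; a minimal sketch of such a setup, assuming tracing_subscriber with its env-filter feature, looks like:

#[ctor::ctor]
fn init_test_tracing() {
    // Respect RUST_LOG if set, defaulting to INFO.
    let filter = tracing_subscriber::EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| tracing_subscriber::EnvFilter::new("info"));
    tracing_subscriber::fmt()
        .with_env_filter(filter)
        // Route output through the test harness so parallel tests don't interleave.
        .with_test_writer()
        .init();
}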

This means all tests automatically have tracing enabled at INFO level without any setup code needed in individual test functions.

Viewing Test Logs

By default, Rust's test harness captures log output and only shows it for failing tests:

# Normal test run - only see logs from failing tests
cargo test

# See logs from ALL tests (passing and failing)
cargo test -- --nocapture

# Control log level with environment variable
RUST_LOG=eidetica=debug cargo test -- --nocapture

# See logs from specific test
cargo test test_sync_operations -- --nocapture

# Trace level for specific module during tests
RUST_LOG=eidetica::sync=trace cargo test -- --nocapture

Writing Tests with Logging

Test code itself should use println! for test-specific output; the test harness captures it alongside the library's tracing logs and shows it only for failing tests or with --nocapture.

Key Benefits

  • Zero setup: No initialization code needed in individual tests
  • Environment control: Use RUST_LOG to control verbosity per test run
  • Clean output: Logs only appear when tests fail or with --nocapture
  • Proper isolation: with_test_writer() ensures logs don't mix between parallel tests
  • Library visibility: See internal library operations during test execution

Common Patterns

Operation Success/Failure

match operation() {
    Ok(result) => {
        tracing::debug!("Operation succeeded");
        result
    }
    Err(e) => {
        tracing::error!(error = %e, "Operation failed");
        return Err(e);
    }
}

Retry Logic

for attempt in 1..=max_attempts {
    match try_operation() {
        Ok(result) => {
            if attempt > 1 {
                tracing::info!("Operation succeeded after {} attempts", attempt);
            }
            return Ok(result);
        }
        Err(e) if attempt < max_attempts => {
            tracing::warn!(
                error = %e,
                attempt,
                max_attempts,
                "Operation failed, retrying"
            );
        }
        Err(e) => {
            tracing::error!(
                error = %e,
                attempts = max_attempts,
                "Operation failed after all retries"
            );
            return Err(e);
        }
    }
}

Anti-Patterns to Avoid

  1. Don't log sensitive data: Never log private keys, passwords, or PII
  2. Don't use println/eprintln in library code: Always use tracing macros
  3. Don't log in tight loops: Use trace level or aggregate logging
  4. Don't format strings unnecessarily: Use structured fields
  5. Don't ignore log levels: Use appropriate levels for context

Future Considerations

As the codebase grows, consider:

  • Adding custom tracing subscribers for specific subsystems
  • Implementing trace sampling for high-volume operations
  • Adding metrics collection alongside tracing
  • Creating domain-specific span attributes

Future Development

Key areas for future development:

  • CRDT Refinement: Enhanced CRDT implementations and merge logic
  • Security: Entry signing and key management systems
  • Persistent Storage: Database backends beyond in-memory storage
  • Blob Storage: Integration with distributed storage systems for large data
  • Querying: Advanced query and filtering capabilities
  • Additional CRDTs: Sequences, Sets, Counters, and other CRDT types
  • Replication: Peer-to-peer synchronization protocols
  • Indexing: Performance optimization for large datasets
  • Concurrency: Improved performance under high load
  • Entry Metadata: Enhanced metadata for better query operations

Design Documents

This section contains formal design documents that capture the architectural thinking, decision-making process, and implementation details for complex features in Eidetica. These documents serve as a historical record of our technical decisions and provide context for future development.

Purpose

Design documents in this section:

  • Document the rationale behind major technical decisions
  • Capture alternative approaches that were considered
  • Outline implementation strategies and tradeoffs
  • Serve as a reference for future developers
  • Help maintain consistency in architectural decisions

Document Structure

Each design document typically includes:

  • Problem statement and context
  • Goals and non-goals
  • Proposed solution
  • Alternative approaches considered
  • Implementation details and tradeoffs
  • Future considerations and potential improvements

Available Design Documents

Implemented

Proposed

  • Users - Multi-user system with password-based authentication, user isolation, and per-user key management
  • Key Management - Technical details for key encryption, storage, and discovery in the Users system
  • Error Handling - Modular error architecture for improved debugging and user experience

Implementation Status:

  • Direct Keys - Fully implemented and functional
  • Delegated Databases - Fully implemented and functional with comprehensive test coverage

Authentication Design

This document outlines the authentication and authorization scheme for Eidetica, a decentralized database built on Merkle-CRDT principles. The design emphasizes flexibility, security, and integration with the core CRDT system while maintaining distributed consistency.

Overview

Eidetica's authentication scheme is designed to leverage the same CRDT and Merkle-DAG principles that power the core database while providing robust access control for distributed environments. Unlike traditional authentication systems, this design must handle authorization conflicts that can arise from network partitions and concurrent modifications to access control rules.

Databases operate in one of two authentication modes: unsigned mode (no authentication configured) or signed mode (authentication required). This design supports security-critical databases requiring signed operations, unsigned and typically local-only databases for higher performance, and unsigned 'overlay' trees that can be computed from signed trees.

The authentication system is not implemented as a pure consumer of the database API but is tightly integrated with the core system. This integration enables efficient validation and conflict resolution during entry creation and database merging operations.

Authentication Modes and Bootstrap Behavior

Eidetica databases support two distinct authentication modes with automatic transitions between them:

Unsigned Mode (No Authentication)

Databases are in unsigned mode when created without authentication configuration. In this mode:

  • The _settings.auth key is either missing or contains an empty Doc ({"auth": {}})
  • Both states are equivalent and treated identically by the system
  • Unsigned operations succeed: Transactions without signatures are allowed
  • No validation overhead: Authentication validation is skipped for performance
  • Suitable for: Local-only databases, temporary workspaces, development environments, overlay networks

Unsigned mode enables use cases where authentication overhead is unnecessary, such as:

  • Local computation that never needs to sync
  • Development and testing environments
  • Temporary scratch databases
  • The upcoming "overlays" feature (see below)

Signed Mode (Mandatory Authentication)

Once authentication is configured, databases are in signed mode where:

  • The _settings.auth key contains at least one authentication key
  • All operations require valid signatures: Only authenticated transactions are valid
  • Fail-safe validation: Corrupted or deleted auth configuration causes all transactions to fail
  • Permanent transition: Cannot return to unsigned mode (would require creating a new database)

In signed mode, unsigned operations will fail with an authentication error. The system enforces mandatory authentication to maintain security guarantees once authentication has been established.

Fail-Safe Behavior:

The validation system uses two-layer protection to prevent and detect authentication corruption:

  1. Proactive Prevention (Layer 1): Transactions that would corrupt or delete auth configuration fail during commit(), before the entry enters the Merkle DAG
  2. Reactive Fail-Safe (Layer 2): If auth is already corrupted (from older code versions or external manipulation), all subsequent operations on top of the corrupted state are also invalid

Validation States:

| Auth State | _settings.auth Value | Unsigned Operations | Authenticated Operations | Status |
|---|---|---|---|---|
| Unsigned Mode | Missing or {} (empty Doc) | ✓ Allowed | ✓ Triggers bootstrap | Valid |
| Signed Mode | Valid key configuration | ✗ Rejected | ✓ Validated | Valid |
| Corrupted | Wrong type (String, etc.) | ✗ PREVENTED | ✗ PREVENTED | Cannot be created |
| Deleted | Tombstone (was deleted) | ✗ PREVENTED | ✗ PREVENTED | Cannot be created |
Note: Corrupted and Deleted states shown in the table are theoretical - the system prevents their creation through proactive validation. The fail-safe layer (Layer 2) remains as defense-in-depth against historical corruption or external DAG manipulation.

This defense-in-depth approach ensures that corrupted authentication configuration cannot be created or exploited to bypass security. See Authentication Behavior Reference for detailed implementation information.

Future: Overlay Databases

The unsigned mode design enables a planned feature called "overlays": computed databases that can be recalculated independently by multiple machines.

The idea is that an "overlay" adds information to a database, backups for example, that can be reconstructed entirely from the original database.

Design Goals and Principles

Primary Goals

  1. Flexible Authentication: Support both unsigned mode for local-only work and signed mode for distributed collaboration
  2. Distributed Consistency: Authentication rules must merge deterministically across network partitions
  3. Cryptographic Security: All authentication based on Ed25519 public/private key cryptography
  4. Hierarchical Access Control: Support admin, read/write, and read-only permission levels
  5. Delegation: Support for delegating authentication to other databases without granting admin privileges on the delegating database
  6. Auditability: All authentication changes are tracked in the immutable DAG history

Non-Goals

  • Perfect Security: Admin key compromise requires manual intervention
  • Real-time Revocation: Key revocation is eventually consistent, not immediate

System Architecture

Authentication Data Location

Authentication configuration is stored in the special _settings store under the auth key. This placement ensures that:

  • Authentication rules are included in _settings, which contains all the data necessary to validate the database and add new Entries
  • Access control changes are tracked in the immutable history
  • Settings can be validated against the current entry being created

The _settings store uses the crate::crdt::Doc type, which is a hierarchical CRDT that resolves conflicts using Last-Write-Wins (LWW) semantics. The ordering for LWW is determined deterministically by the DAG design (see CRDT documentation for details).

Clarification: Throughout this document, when we refer to Doc, this is the hierarchical CRDT document type supporting nested structures. The _settings store specifically uses Doc to enable complex authentication configurations including nested policy documents and key management.

Permission Hierarchy

Eidetica implements a three-tier permission model:

| Permission Level | Modify _settings | Add/Remove Keys | Change Permissions | Read Data | Write Data | Public Database Access |
|---|---|---|---|---|---|---|
| Admin | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Write | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ |
| Read | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ |

Authentication Framework

Key Structure

The current implementation supports direct authentication keys stored in the _settings.auth configuration. Each key consists of:

classDiagram
    class AuthKey {
        String pubkey
        Permission permissions
        KeyStatus status
    }

    class Permission {
        <<enumeration>>
        Admin(priority: u32)
        Write(priority: u32)
        Read
    }

    class KeyStatus {
        <<enumeration>>
        Active
        Revoked
    }

    AuthKey --> Permission
    AuthKey --> KeyStatus

Note: Both direct keys and delegated databases are fully implemented and functional, including DelegatedTreeRef, PermissionBounds, and TreeReference types.

Direct Key Example

{
  "_settings": {
    "auth": {
      "KEY_LAPTOP": {
        "pubkey": "ed25519:PExACKOW0L7bKAM9mK_mH3L5EDwszC437uRzTqAbxpk",
        "permissions": "write:10",
        "status": "active"
      },
      "KEY_DESKTOP": {
        "pubkey": "ed25519:QJ7bKAM9mK_mH3L5EDwszC437uRzTqAbxpkPExACKOW0L",
        "permissions": "read",
        "status": "active"
      },
      "*": {
        "pubkey": "*",
        "permissions": "read",
        "status": "active"
      },
      "PUBLIC_WRITE": {
        "pubkey": "*",
        "permissions": "write:100",
        "status": "active"
      }
    },
    "name": "My Database"
  }
}

Note: The wildcard key * enables global permissions for anyone. Wildcard keys:

  • Can have any permission level: "read", "write:N", or "admin:N"
  • Are commonly used for world-readable databases (with "read" permissions) but can grant broader access
  • Can be revoked like any other key
  • Can be included in delegated databases (if you delegate to a database with a wildcard, that's valid)

Entry Signing Format

Every entry in Eidetica must be signed. The authentication information is embedded in the entry structure:

{
  "database": {
    "root": "tree_root_id",
    "parents": ["parent_entry_id"],
    "data": "{\"key\": \"value\"}",
    "metadata": "{\"_settings\": [\"settings_tip_id\"]}"
  },
  "stores": [
    {
      "name": "users",
      "parents": ["parent_entry_id"],
      "data": "{\"user_data\": \"example\"}"
    }
  ],
  "auth": {
    "sig": "ed25519_signature_base64_encoded",
    "key": "KEY_LAPTOP"
  }
}

The auth.key field can be either:

  • Direct key: A string referencing a key name in this database's _settings.auth
  • Delegation path: An ordered list of {"key": "delegated_tree_1", "tips": ["A", "B"]} elements, where the last element must contain only a "key" field

The auth.sig field contains the base64-encoded Ed25519 signature of the entry's content hash.

Key Management

Key Lifecycle

The current implementation supports two key statuses:

stateDiagram-v2
    [*] --> Active: Key Added
    Active --> Revoked: Revoke Key
    Revoked --> Active: Reactivate Key

    note right of Active : Can create new entries
    note right of Revoked : Historical entries preserved, cannot create new entries

Key Status Semantics

  1. Active: Key can create new entries and all historical entries remain valid
  2. Revoked: Key cannot create new entries. Historical entries remain valid and their content is preserved during merges

Key Behavioral Details:

  • Entries created before revocation remain valid to preserve history integrity
  • An Admin can transition a key back to Active state from Revoked status
  • Revoked status prevents new entries but preserves existing content in merges

Priority System

Priority is integrated into the permission levels for Admin and Write permissions:

  • Admin(priority): Can modify settings and manage keys with equal or lower priority
  • Write(priority): Can write data but not modify settings
  • Read: No priority, read-only access

Priority values are u32 integers where lower values indicate higher priority:

  • Priority 0: Highest priority, typically the initial admin key
  • Higher numbers = lower priority
  • Keys can only modify other keys with equal or lower priority (equal or higher number)

Important: Priority only affects administrative operations (key management). It does not influence CRDT merge conflict resolution, which uses Last Write Wins semantics based on the DAG structure.

Key Naming and Aliasing

Auth settings serve two distinct purposes in delegation:

  1. Delegation references - Names that point to OTHER DATABASES (DelegatedTreeRef containing TreeReference)
  2. Signing keys - Names that point to PUBLIC KEYS (AuthKey containing Ed25519 public key)

Auth settings can also contain multiple names for the same public key, each potentially with different permissions. This enables:

  • Readable delegation paths - Use friendly names like "alice_laptop" instead of long public key strings
  • Permission contexts - Same key can have different permissions depending on how it's referenced
  • API compatibility - Bootstrap can use public key strings while delegation uses friendly names

Example: Multiple names for same key

{
  "_settings": {
    "auth": {
      "Ed25519:abc123...": {
        "pubkey": "Ed25519:abc123...",
        "permissions": "admin:0",
        "status": "active"
      },
      "alice_work": {
        "pubkey": "Ed25519:abc123...",
        "permissions": "write:10",
        "status": "active"
      },
      "alice_readonly": {
        "pubkey": "Ed25519:abc123...",
        "permissions": "read",
        "status": "active"
      }
    }
  }
}

Use Cases:

  • Instance API bootstrap: When using instance.new_database(settings, key_name), the database is automatically bootstrapped with the signing key added to auth settings using the public key string as the name (e.g., "Ed25519:abc123..."). This is the name used for signature verification.

  • User API bootstrap: When using user.new_database(settings, key_id), the behavior is similar - the key is added with its public key string as the name, regardless of any display name stored in user key metadata.

  • Delegation paths: Delegation references keys by their name in auth settings. To enable readable delegation paths like ["alice@example.com", "alice_laptop"] instead of ["alice@example.com", "Ed25519:abc123..."], add friendly name aliases to the delegated database's auth settings.

  • Permission differentiation: The same physical key can have different permission levels depending on which name is used to reference it.

Key Aliasing Pattern:

// Bootstrap creates entry with public key string as name
let database = instance.new_database(settings, "alice_key")?;
// Auth now contains: { "Ed25519:abc123...": AuthKey(...) }

// Add friendly name alias for delegation
let transaction = database.new_transaction()?;
let settings = transaction.get_settings()?;
settings.update_auth_settings(|auth| {
    // Same public key, friendly name, potentially different permission
    auth.add_key("alice_laptop", AuthKey::active(
        "Ed25519:abc123...",  // Same public key
        Permission::Write(10),  // Can differ from bootstrap permission
    )?)?;
    Ok(())
})?;
transaction.commit()?;
// Auth now contains both:
// { "Ed25519:abc123...": AuthKey(..., Admin(0)) }
// { "alice_laptop": AuthKey(..., Write(10)) }

Important Notes:

  • Both entries reference the same cryptographic key but can have different permissions
  • Signature verification works with any name that maps to the correct public key
  • Delegation paths use the key name from auth settings, making friendly aliases essential for readable delegation
  • The name used in the auth.key field (either direct or in a delegation path) must exactly match a name in the auth settings
  • Adding multiple names for the same key does not create duplicates - they are intentional aliases with potentially different permission contexts

Delegation (Delegated Authentication)

Status: Fully implemented and functional with comprehensive test coverage.

Concept and Benefits

Delegation allows any database to be referenced as a source of authentication keys for another database. This enables flexible authentication patterns where databases can delegate authentication to other databases without granting administrative privileges on the delegating database. Key benefits include:

  • Flexible Delegation: Any database can delegate authentication to any other database
  • User Autonomy: Users can manage their own personal databases with keys they control
  • Cross-Project Authentication: Share authentication across multiple projects or databases
  • Granular Permissions: Set both minimum and maximum permission bounds for delegated keys

Delegated databases are normal databases, and their authentication settings are used with permission clamping applied.

Important: Any database can be used as a delegated database - there's no special "authentication database" type. This means:

  • A project's main database can delegate to a user's personal database
  • Multiple projects can delegate to the same shared authentication database
  • Databases can form delegation networks where databases delegate to each other
  • The delegated database doesn't need to know it's being used for delegation

Structure

A delegated database reference in the main database's _settings.auth contains:

{
  "_settings": {
    "auth": {
      "example@eidetica.dev": {
        "permission-bounds": {
          "max": "write:15",
          "min": "read" // optional, defaults to no minimum
        },
        "database": {
          "root": "hash_of_root_entry",
          "tips": ["hash1", "hash2"]
        }
      },
      "another@example.com": {
        "permission-bounds": {
          "max": "admin:20" // min not specified, so no minimum bound
        },
        "database": {
          "root": "hash_of_another_root",
          "tips": ["hash3"]
        }
      }
    }
  }
}

The referenced delegated database maintains its own _settings.auth with direct keys:

{
  "_settings": {
    "auth": {
      "KEY_LAPTOP": {
        "pubkey": "ed25519:AAAAC3NzaC1lZDI1NTE5AAAAI...",
        "permissions": "admin:0",
        "status": "active"
      },
      "KEY_MOBILE": {
        "pubkey": "ed25519:AAAAC3NzaC1lZDI1NTE5AAAAI...",
        "permissions": "write:10",
        "status": "active"
      }
    }
  }
}

Permission Clamping

Permissions from delegated databases are clamped based on the permission-bounds field in the main database's reference:

  • max (required): The maximum permission level that keys from the delegated database can have
    • Must be <= the permissions of the key adding the delegated database reference
  • min (optional): The minimum permission level for keys from the delegated database
    • If not specified, there is no minimum bound
    • If specified, keys with lower permissions are raised to this level

The effective priority is derived from the effective permission returned after clamping. If the delegated key's permission already lies within the min/max bounds, its original priority value is preserved; when a permission is clamped to a bound, the bound's priority value becomes the effective priority:

graph LR
    A["Delegated Database: admin:5"] --> B["Main Database: max=write:10, min=read"] --> C["Effective: write:10"]
    D["Delegated Database: write:8"] --> B --> E["Effective: write:8"]
    F["Delegated Database: read"] --> B --> G["Effective: read"]

    H["Delegated Database: admin:5"] --> I["Main Database: max=read (no min)"] --> J["Effective: read"]
    K["Delegated Database: read"] --> I --> L["Effective: read"]
    M["Delegated Database: write:20"] --> N["Main Database: max=admin:15, min=write:25"] --> O["Effective: write:25"]

Clamping Rules:

  • Effective permission = clamp(delegated_tree_permission, min, max)
    • If delegated database permission > max, it's lowered to max
    • If min is specified and delegated database permission < min, it's raised to min
    • If min is not specified, no minimum bound is applied
  • The max bound must be <= permissions of the key that added the delegated database reference
  • Effective priority = priority embedded in the effective permission produced by clamping. This is either the delegated key's priority (when already inside the bounds) or the priority that comes from the min/max bound that performed the clamp.
  • Delegated database admin permissions only apply within that delegated database
  • Permission clamping occurs at each level of delegation chains
  • Note: There is no "none" permission level - absence of permissions means no access

Multi-Level References

Delegated databases can reference other delegated databases, creating delegation chains:

{
  "auth": {
    "sig": "signature_bytes",
    "key": [
      {
        "key": "example@eidetica.dev",
        "tips": ["current_tip"]
      },
      {
        "key": "old-identity",
        "tips": ["old_tip"]
      },
      {
        "key": "LEGACY_KEY"
      }
    ]
  }
}

Delegation Chain Rules:

  • The auth.key field contains an ordered list representing the delegation path
  • Each element has a "key" field and optionally "tips" for delegated databases
  • The final element must contain only a "key" field (the actual signing key)
  • Each step represents traversing from one database to the next in the delegation chain

Path Traversal:

  • Steps with tips → lookup delegation reference name in current DB → find DelegatedTreeRef → jump to referenced database
  • Final step (no tips) → lookup signing key name in current DB → find AuthKey → get Ed25519 public key for signature verification
  • Key names at each step reference entries in that database's auth settings by name (see Key Naming and Aliasing)

Permission and Validation:

  • Permission clamping applies at each level using the min/max function
  • Priority at each step is the priority inside the permission value that survives the clamp at that level (outer reference, inner key, or bound, depending on which one is selected by the clamping rules)
  • Tips must be valid at each level of the chain for the delegation to be valid

Delegated Database References

The main database must validate the delegated database's structure as well as its own.

Latest Known Tips

"Latest known tips" refers to the latest tips of a delegated database that have been seen used in valid key signatures within the current database. This creates a "high water mark" for each delegated database:

  1. When an entry uses a delegated database key, it includes the delegated database's tips at signing time
  2. The database tracks these tips as the "latest known tips" for that delegated database
  3. Future entries using that delegated database must reference tips that are equal to or newer than the latest known tips, or must be valid at the latest known tips
  4. This ensures that key revocations in delegated databases are respected once observed

Tip Tracking and Validation

To validate entries with delegated database keys:

  1. Check that the referenced tips are descendants of (or equal to) the latest known tips for that delegated database
  2. If they're not, check that the entry validates at the latest known tips
  3. Verify the key exists and has appropriate permissions at those tips
  4. Update the latest known tips if these are newer
  5. Apply permission clamping based on the delegation reference

This mechanism ensures that once a key revocation is observed in a delegated database, no entry can use an older version of that database where the key was still valid.
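
A self-contained sketch of this check, using stand-in types (Dag, EntryId, AuthError) rather than the crate's actual API:

type EntryId = String;

#[derive(Debug)]
enum AuthError {
    StaleDelegatedTips,
    KeyNotValidAtTips,
}

/// Stand-in for whatever DAG access the real validator has.
trait Dag {
    /// True if `tip` is a descendant of (or equal to) `ancestor`.
    fn descends_from(&self, tip: &EntryId, ancestor: &EntryId) -> bool;
    /// True if the key is active with sufficient permission in the
    /// delegated database's settings as of `tips`.
    fn key_valid_at(&self, key_name: &str, tips: &[EntryId]) -> bool;
}

fn validate_delegated_tips(
    dag: &impl Dag,
    key_name: &str,
    referenced: &[EntryId],
    latest_known: &mut Vec<EntryId>,
) -> Result<(), AuthError> {
    // Step 1: are the referenced tips at or past the high water mark?
    let newer = latest_known
        .iter()
        .all(|known| referenced.iter().any(|t| dag.descends_from(t, known)));

    // Steps 2-3: validate the key at the referenced tips, or at the
    // latest known tips when the referenced tips are older.
    let check_at: &[EntryId] = if newer { referenced } else { latest_known };
    if !dag.key_valid_at(key_name, check_at) {
        return Err(if newer {
            AuthError::KeyNotValidAtTips
        } else {
            AuthError::StaleDelegatedTips
        });
    }

    // Step 4: advance the high water mark when these tips are newer.
    if newer {
        *latest_known = referenced.to_vec();
    }
    Ok(())
}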

Key Revocation

Delegated database key deletion is always treated as revoked status in the main database. This prevents new entries from building on the deleted key's content while preserving the historical content during merges. This approach maintains the integrity of existing entries while preventing future reliance on removed authentication credentials.

By treating delegated database key deletion as revoked status in the main database, users can manage their own key lifecycle in their delegated databases while ensuring that:

  • Historical entries remain valid and their content is preserved
  • New entries cannot use the revoked key's entries as parents
  • The merge operation proceeds normally with content preserved
  • Users cannot create conflicts that would affect other users' valid entries

Conflict Resolution and Merging

Conflicts in the _settings store are resolved by the crate::crdt::Doc type using Last Write Wins (LWW) semantics. When the database has diverged and both sides of the merge have written to _settings, the write with the higher logical timestamp (determined by the DAG structure) wins, regardless of the priority of the signing key.

Priority rules apply only to administrative permissions - determining which keys can modify other keys - but do not influence the conflict resolution during merges.

This applies to delegated databases as well. A write to the Main Database must also recursively merge any changed settings in delegated databases using the same LWW strategy, so that network splits within those databases are handled consistently.
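
In miniature, the rule looks like this (an illustrative sketch, not the Doc implementation; the ID tie-break is an assumption, as the document only guarantees deterministic resolution):

// (timestamp, tie-break id, value): the later write wins outright;
// equal timestamps fall back to a deterministic ID comparison.
fn lww_merge<T>(a: (u64, &str, T), b: (u64, &str, T)) -> T {
    if (b.0, b.1) >= (a.0, a.1) { b.2 } else { a.2 }
}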

Key Status Changes in Delegated Databases: Examples

The following examples demonstrate how key status changes in delegated databases affect entries in the main database.

Example 1: Basic Delegated Database Key Status Change

Initial State:

graph TD
    subgraph "Main Database"
        A["Entry A<br/>Settings: delegated_tree1 = max:write:10, min:read<br/>Tip: UA"]
        B["Entry B<br/>Signed by delegated_tree1:laptop<br/>Tip: UA<br/>Status: Valid"]
        C["Entry C<br/>Signed by delegated_tree1:laptop<br/>Tip: UB<br/>Status: Valid"]
    end

    subgraph "Delegated Database"
        UA["Entry UA<br/>Settings: laptop = active"]
        UB["Entry UB<br/>Signed by laptop"]
    end

    A --> B
    B --> C
    UA --> UB

After Key Status Change in Delegated Database:

graph TD
    subgraph "Main Database"
        A["Entry A<br/>Settings: user1 = write:15"]
        B["Entry B<br/>Signed by delegated_tree1:laptop<br/>Tip: UA<br/>Status: Valid"]
        C["Entry C<br/>Signed by delegated_tree1:laptop<br/>Tip: UB<br/>Status: Valid"]
        D["Entry D<br/>Signed by delegated_tree1:mobile<br/>Tip: UC<br/>Status: Valid"]
        E["Entry E<br/>Signed by delegated_tree1:laptop<br/>Parent: C<br/>Tip: UB<br/>Status: Valid"]
        F["Entry F<br/>Signed by delegated_tree1:mobile<br/>Tip: UC<br/>Sees E but ignores since the key is invalid"]
        G["Entry G<br/>Signed by delegated_tree1:desktop<br/>Tip: UB<br/>Still thinks delegated_tree1:laptop is valid"]
        H["Entry H<br/>Signed by delegated_tree1:mobile<br/>Tip: UC<br/>Merges, as there is a valid key at G"]
    end

    subgraph "Delegated Database (delegated_tree1)"
        UA["Entry UA<br/>Settings: laptop = active, mobile = active, desktop = active"]
        UB["Entry UB<br/>Signed by laptop"]
        UC["Entry UC<br/>Settings: laptop = revoked<br/>Signed by mobile"]
    end

    A --> B
    B --> C
    C --> D
    D --> F
    C --> E
    E --> G
    F --> H
    G --> H
    UA --> UB
    UB --> UC

Example 2: Last Write Wins Conflict Resolution

Scenario: Two admins make conflicting authentication changes during a network partition. Priority determines who can make the changes, but Last Write Wins determines the final merged state.

After Network Reconnection and Merge:

graph TD
    subgraph "Merged Main Database"
        A["Entry A"]
        B["Entry B<br/>Alice (admin:10) bans user_bob<br/>Timestamp: T1"]
        C["Entry C<br/>Super admin (admin:0) promotes user_bob to admin:5<br/>Timestamp: T2"]
        M["Entry M<br/>Merge entry<br/>user_bob = admin<br/>Last write (T2) wins via LWW"]
        N["Entry N<br/>Alice attempts to ban user_bob<br/>Rejected: Alice can't modify admin-level user with higher priority"]
    end

    A --> B
    A --> C
    B --> M
    C --> M
    M --> N

Key Points:

  • All administrative actions are preserved in history
  • Last Write Wins resolves the merge conflict: the most recent change (T2) takes precedence
  • Permission-based authorization still prevents unauthorized modifications: Alice (admin:10) cannot ban a higher-priority user (admin:5) due to insufficient priority level
  • The merged state reflects the most recent write, not the permission priority
  • Permission priority rules prevent Alice from making the change in Entry N, as she lacks authority to modify higher-priority admin users

Authorization Scenarios

Network Partition Recovery

When network partitions occur, the authentication system must handle concurrent changes gracefully:

Scenario: Two branches of the database independently modify the auth settings, requiring CRDT-based conflict resolution using Last Write Wins.

Both branches share the same root, but a network partition has caused them to diverge before merging back together.

graph TD
    subgraph "Merged Main Database"
        ROOT["Entry ROOT"]
        A1["Entry A1<br/>admin adds new_developer<br/>Timestamp: T1"]
        A2["Entry A2<br/>dev_team revokes contractor_alice<br/>Timestamp: T3"]
        B1["Entry B1<br/>contractor_alice data change<br/>Valid at time of creation"]
        B2["Entry B2<br/>admin adds emergency_key<br/>Timestamp: T2"]
        M["Entry M<br/>Merge entry<br/>Final state based on LWW:<br/>- new_developer: added (T1)<br/>- emergency_key: added (T2)<br/>- contractor_alice: revoked (T3, latest)"]
    end

    ROOT --> A1
    ROOT --> B1
    A1 --> A2
    B1 --> B2
    A2 --> M
    B2 --> M

Conflict Resolution Rules Applied:

  • Settings Merge: All authentication changes are merged using Doc CRDT semantics with Last Write Wins
  • Timestamp Ordering: Changes are resolved based on logical timestamps, with the most recent change taking precedence
  • Historical Validity: Entry B1 remains valid because it was created before the status change
  • Content Preservation: With "revoked" status, content is preserved in merges but cannot be used as parents for new entries
  • Future Restrictions: Future entries by contractor_alice would be rejected based on the applied status change

Security Considerations

Threat Model

Protected Against

  • Unauthorized Entry Creation: All entries must be signed by valid keys
  • Permission Escalation: Users cannot grant themselves higher privileges than their main database reference allows
  • Historical Tampering: Immutable DAG prevents retroactive modifications
  • Replay Attacks: Content-addressable IDs prevent entry duplication
  • Administrative Hierarchy Violations: Lower priority keys cannot modify higher priority keys (but can modify equal priority keys)
  • Permission Boundary Violations: Delegated database permissions are constrained within their specified min/max bounds
  • Race Conditions: Last Write Wins provides deterministic conflict resolution

Requires Manual Recovery

  • Admin Key Compromise: When no higher-priority key exists
  • Conflicting Administrative Changes: LWW may result in unintended administrative state during network partitions

Cryptographic Assumptions

  • Ed25519 Security: Defaults to Ed25519 signatures with explicit key-type storage
  • Hash Function Security: SHA-256 for content addressing
  • Key Storage: Private keys must be securely stored by clients
  • Network Security: Assumes an eventually consistent but potentially unreliable network

Attack Vectors

Mitigated

  • Key Replay: Content-addressable entry IDs prevent signature replay
  • Downgrade Attacks: Explicit key type storage prevents algorithm confusion
  • Partition Attacks: CRDT merging handles network partition scenarios
  • Privilege Escalation: Permission clamping prevents users from exceeding granted permissions

Partial Mitigation

  • DoS via Large Histories: Priority system limits damage from compromised lower-priority keys
  • Social Engineering: Administrative hierarchy limits scope of individual key compromise
  • Timestamp Manipulation: LWW conflict resolution is deterministic but may be influenced by the chosen timestamp resolution algorithm
  • Administrative Confusion: Network partitions may result in unexpected administrative states due to LWW resolution

Not Addressed

  • Side-Channel Attacks: Client-side key storage security is out of scope
  • Physical Key Extraction: Assumed to be handled by client security measures
  • Long-term Cryptographic Breaks: Future crypto-agility may be needed

Implementation Details

Authentication Validation Process

The current validation process:

  1. Extract Authentication Info: Parse the auth field from the entry
  2. Resolve Key Name: Lookup the direct key in _settings.auth
  3. Check Key Status: Verify the key is Active (not Revoked)
  4. Validate Signature: Verify the Ed25519 signature against the entry content hash
  5. Check Permissions: Ensure the key has sufficient permissions for the operation

Current features include direct key validation, delegated database resolution, tip validation, and permission clamping. A condensed sketch of the validation pipeline follows.
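
All types below are stand-ins; verify abstracts Ed25519 signature verification, and nothing here is Eidetica's actual API:

use std::collections::HashMap;

struct AuthKey {
    pubkey: String,
    active: bool,    // step 3: Active vs Revoked
    can_write: bool, // step 5: simplified permission model
}

fn validate_entry(
    key_name: &str,                  // step 1: parsed from the auth field
    keys: &HashMap<String, AuthKey>, // _settings.auth
    content_hash: &[u8],
    signature: &[u8],
    verify: impl Fn(&str, &[u8], &[u8]) -> bool,
) -> Result<(), String> {
    let key = keys.get(key_name).ok_or("unknown key")?; // step 2
    if !key.active {
        return Err("key revoked".into()); // step 3
    }
    if !verify(&key.pubkey, content_hash, signature) {
        return Err("bad signature".into()); // step 4
    }
    if !key.can_write {
        return Err("insufficient permission".into()); // step 5
    }
    Ok(())
}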

Sync Permissions

Eidetica servers require proof of read permissions before allowing database synchronization. The server challenges the client to sign a random nonce, then validates the signature against the database's authentication configuration.

Authenticated Bootstrap Protocol

The authenticated bootstrap protocol enables devices to join existing databases without prior local state while requesting authentication access:

Bootstrap Flow:

  1. Bootstrap Detection: An empty tips list in SyncTreeRequest signals that a bootstrap is needed
  2. Auth Request: Client includes requesting key, key name, and requested permission
  3. Global Permission Check: Server checks if global * wildcard permission satisfies request
  4. Immediate Approval: If global permission exists and satisfies, access granted immediately
  5. Manual Approval Queue: If no global permission, request stored for admin review
  6. Database Transfer: Complete database state sent with approval confirmation
  7. Access Granted: Client receives database and can make authenticated operations

Protocol Extensions:

  • SyncTreeRequest includes: requesting_key, requesting_key_name, requested_permission
  • BootstrapResponse includes: key_approved, granted_permission
  • BootstrapPending response for manual approval scenarios
  • New sync API: sync_with_peer_for_bootstrap() for authenticated bootstrap scenarios (usage sketched below)
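
A hypothetical client-side call built from the request fields above; the exact signature of sync_with_peer_for_bootstrap() is an assumption, not verified usage:

// Argument names mirror the protocol fields requesting_key_name and
// requested_permission; peer_address and tree_id are placeholders.
let sync = instance.sync()?;
sync.sync_with_peer_for_bootstrap(
    &peer_address,         // address of the serving peer
    &tree_id,              // database to bootstrap
    "laptop_key",          // requesting_key_name
    Permission::Write(10), // requested_permission
)?;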

Security:

  • Ed25519 key cryptography for secure identity
  • Permission levels maintained (Read/Write/Admin)
  • Global wildcard permissions for automatic approval (secure by configuration)
  • Manual approval queue for controlled access (secure by default)
  • Immutable audit trail of all key additions in database history

CRDT Metadata Considerations

The current system uses entry metadata to reference settings tips. With authentication:

  • Metadata continues to reference current _settings tips for validation efficiency
  • Authentication validation uses the settings state at the referenced tips
  • This ensures entries are validated against the authentication rules that were current when created

Implementation Architecture

Core Components

  1. AuthValidator (auth/validation.rs): Validates entries and resolves authentication

    • Direct key resolution and validation
    • Signature verification
    • Permission checking
    • Caching for performance
  2. Crypto Module (auth/crypto.rs): Cryptographic operations

    • Ed25519 key generation and parsing
    • Entry signing and verification
    • Key format: ed25519:<base64-encoded-public-key>
  3. AuthSettings (auth/settings.rs): Settings management interface

    • Add/update/get authentication keys
    • Convert between settings storage and auth types
    • Validate authentication operations
    • Check permission access with can_access() method for both specific and wildcard keys
  4. Permission Module (auth/permission.rs): Permission logic

    • Permission checking for operations
    • Permission clamping for delegated databases

Storage Format

Authentication configuration is stored in _settings.auth as a Doc CRDT:

// Key storage structure
AuthKey {
    pubkey: String,           // Ed25519 public key
    permissions: Permission,  // Admin(u32), Write(u32), or Read
    status: KeyStatus,        // Active or Revoked
}

Future Considerations

Current Implementation Status

  1. Direct Keys: ✅ Fully implemented and tested
  2. Delegated Databases: ✅ Fully implemented with comprehensive test coverage
  3. Permission Clamping: ✅ Functional for delegation chains
  4. Delegation Depth Limits: ✅ Implemented with MAX_DELEGATION_DEPTH=10

Future Enhancements

  1. Advanced Key Status: Add Ignore and Banned statuses for more nuanced key management
  2. Performance Optimizations: Further caching and validation improvements
  3. User experience improvements for key management

References

  1. Eidetica Core Concepts
  2. CRDT Principles
  3. Entry Structure

Synchronization Design Document

This document outlines the design principles, architecture decisions, and implementation strategy for Eidetica's synchronization system.

Design Goals

Primary Objectives

  1. Decentralized Architecture: No central coordination required
  2. Performance: Minimize latency and maximize throughput
  3. Reliability: Handle network failures and recover gracefully
  4. Scalability: Support many peers and large datasets
  5. Security: Authenticated and verified peer communications
  6. Simplicity: Easy to configure and use

Non-Goals

  • Selective sync: Sync entire databases only (not partial)
  • Multi-hop routing: Direct peer connections only
  • Complex conflict resolution: CRDT-based automatic resolution only
  • Centralized coordination: No dependency on coordination servers

Key Design Innovation: Bootstrap-First Sync

Problem: Traditional distributed databases require complex setup procedures for new nodes to join existing networks. Peers must either start with empty databases or go through complex initialization.

Solution: Eidetica's bootstrap-first sync protocol enables zero-state joining:

  • Single API call handles both bootstrap and incremental sync
  • Automatic detection determines whether full or partial sync is needed
  • No setup required - new devices can immediately join existing databases
  • Bidirectional capability - any peer can bootstrap from any other peer

Use Cases Enabled:

  • Chat/messaging apps: Join conversation rooms instantly with full history
  • Collaborative documents: Open shared documents from any device
  • Data synchronization: Sync app data to new devices automatically
  • Backup/restore: Restore complete application state from peers

Core Design Principles

1. Merkle-CRDT Foundation

The sync system builds on Merkle DAG and CRDT principles:

  • Content-addressable entries: Immutable, hash-identified data
  • DAG structure: Parent-child relationships form directed acyclic graph
  • CRDT merging: Deterministic conflict resolution
  • Causal consistency: Operations maintain causal ordering

Benefits:

  • Natural deduplication (same content = same hash)
  • Efficient diff computation (compare tips)
  • Automatic conflict resolution
  • Partition tolerance

2. BackgroundSync Engine with Command Pattern

Decision: Single background thread with command-channel communication

Rationale:

  • Clean architecture: Eliminates circular dependencies
  • Ownership clarity: Background thread owns transport state
  • Non-blocking: Commands sent via channels don't block operations
  • Flexibility: Fire-and-forget or request-response patterns

Implementation:

The sync system uses a thin frontend that sends commands to a background thread:

  • Frontend handles API and peer/relationship management in sync database
  • Background owns transport and handles network operations
  • Both components access sync database directly for peer data
  • Commands used only for operations requiring background processing
  • Failed operations added to retry queue

Trade-offs:

  • ✅ No circular dependencies or complex locking
  • ✅ Clear ownership model (transport in background, data in sync database)
  • ✅ Works in both async and sync contexts
  • ✅ Graceful startup/shutdown handling
  • ❌ All sync operations serialized through single thread

3. Hook-Based Change Detection

Decision: Use write callbacks for change detection and sync triggering

Rationale:

  • Flexible: Callbacks can be attached per-database with full context
  • Consistent: Every commit triggers registered callbacks
  • Simple: Direct function calls with Entry, Database, and Instance parameters
  • Performance: Minimal overhead, no trait dispatch

Architecture:

// Callback function type (stored internally as Arc by Instance)
pub type WriteCallback = dyn Fn(&Entry, &Database, &Instance) -> Result<()> + Send + Sync;

// Integration with Database
impl Database {
    pub fn on_local_write<F>(&self, callback: F) -> Result<()>
    where
        F: Fn(&Entry, &Database, &Instance) -> Result<()> + Send + Sync + 'static
    {
        // Register callback with instance for this database
        // Instance wraps the callback in Arc internally
    }
}

// Usage example for sync
let sync = instance.sync().expect("Sync enabled");
let sync_clone = sync.clone();
let peer_pubkey = "peer_key".to_string();
database.on_local_write(move |entry, db, _instance| {
    sync_clone.queue_entry_for_sync(&peer_pubkey, entry.id(), db.root_id())
})?;

Benefits:

  • Direct access to Entry, Database, and Instance in callbacks
  • No need for context wrappers or trait implementations
  • Callbacks receive full context needed for sync decisions
  • Simple cloning pattern for use in closures
  • Easy testing and debugging

4. Modular Transport Layer with SyncHandler Architecture

Decision: Abstract transport layer with handler-based request processing

Core Interface:

pub trait SyncTransport: Send + Sync {
    /// Start server with handler for processing sync requests
    async fn start_server(&mut self, addr: &str, handler: Arc<dyn SyncHandler>) -> Result<()>;

    /// Send sync request and get response
    async fn send_request(&self, address: &Address, request: &SyncRequest) -> Result<SyncResponse>;
}

pub trait SyncHandler: Send + Sync {
    /// Process incoming sync requests with database access
    async fn handle_request(&self, request: &SyncRequest) -> SyncResponse;
}

Rationale:

  • Database Access: Handlers can store received entries via backend
  • Stateful Processing: Support GetTips, GetEntries, SendEntries operations
  • Clean Separation: Transport handles networking, handler handles sync logic
  • Flexibility: Support different network environments
  • Evolution: Easy to add new transport protocols
  • Testing: Mock transports for unit tests (sketched below)
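
The mock below implements the trait with canned responses, assuming SyncResponse: Clone and glossing over the async-trait/object-safety plumbing a real implementation would need:

use std::sync::Arc;

/// Returns the same canned response for every request; no networking.
struct MockTransport {
    canned: SyncResponse,
}

impl SyncTransport for MockTransport {
    async fn start_server(&mut self, _addr: &str, _handler: Arc<dyn SyncHandler>) -> Result<()> {
        Ok(()) // nothing to start in unit tests
    }

    async fn send_request(&self, _addr: &Address, _req: &SyncRequest) -> Result<SyncResponse> {
        Ok(self.canned.clone())
    }
}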

Supported Transports:

HTTP Transport

pub struct HttpTransport {
    client: reqwest::Client,
    server: Option<HttpServer>,
    handler: Option<Arc<dyn SyncHandler>>,
}

Implementation:

  • Axum server with handler state injection
  • JSON serialization at /api/v0 endpoint
  • Handler processes requests with database access

Use cases:

  • Simple development and testing
  • Firewall-friendly environments
  • Integration with existing HTTP infrastructure

Trade-offs:

  • ✅ Widely supported and debuggable
  • ✅ Works through most firewalls/proxies
  • ✅ Full database access via handler
  • ❌ Less efficient than P2P protocols
  • ❌ Requires port management

Iroh P2P Transport

pub struct IrohTransport {
    endpoint: Option<Endpoint>,
    server_state: ServerState,
    handler: Option<Arc<dyn SyncHandler>>,
}

Implementation:

  • QUIC bidirectional streams for request/response
  • Handler integration in stream processing
  • JsonHandler for serialization consistency

Use cases:

  • Production deployments
  • NAT traversal required
  • Direct peer-to-peer communication

Trade-offs:

  • ✅ Efficient P2P protocol with NAT traversal
  • ✅ Built-in relay and hole punching
  • ✅ QUIC-based with modern networking features
  • ✅ Full database access via handler
  • ❌ More complex setup and debugging
  • ❌ Additional dependency

5. Persistent State Management

Decision: All peer and relationship state stored persistently in sync database

Architecture:

Sync Database (Persistent):
├── peers/{peer_pubkey} -> PeerInfo (addresses, status, metadata)
├── relationships/{peer}/{database} -> SyncRelationship
├── sync_state/cursors/{peer}/{database} -> SyncCursor
├── sync_state/metadata/{peer} -> SyncMetadata
└── sync_state/history/{sync_id} -> SyncHistoryEntry

BackgroundSync (Transient):
├── retry_queue: Vec<RetryEntry> (failed sends pending retry)
└── sync_tree_id: ID (reference to sync database for peer lookups)

Design:

  • All peer data is stored in the sync database via PeerManager
  • BackgroundSync reads peer information on-demand when needed
  • Frontend writes peer/relationship changes directly to sync database
  • Single source of truth in persistent storage

Rationale:

  • Durability: All critical state survives restarts
  • Consistency: Single source of truth in sync database
  • Recovery: Full state recovery after failures
  • Simplicity: No duplicate state management

Architecture Deep Dive

Component Interactions

graph LR
    subgraph "Change Detection"
        A[Transaction::commit] --> B[WriteCallbacks]
        B --> C[Sync::queue_entry_for_sync]
    end

    subgraph "Command Channel"
        C --> D[Command TX]
        D --> E[Command RX]
    end

    subgraph "BackgroundSync Thread"
        E --> F[BackgroundSync]
        F --> G[Transport Layer]
        G --> H[HTTP/Iroh/Custom]
        F --> I[Retry Queue]
        F -.->|reads| ST[Sync Database]
    end

    subgraph "State Management"
        K[SyncStateManager] --> L[Persistent State]
        F --> K
    end

    subgraph "Peer Management"
        M[PeerManager] --> N[Peer Registry]
        F --> M
    end

Data Flow Design

1. Entry Commit Flow

1. Application calls database.new_transaction().commit()
2. Transaction stores entry in backend
3. Transaction triggers write callbacks with Entry, Database, and Instance
4. Callback invokes sync.queue_entry_for_sync()
5. Sync sends QueueEntry command to BackgroundSync via channel
6. BackgroundSync fetches entry from backend
7. Entry sent immediately to peer via transport
8. Failed sends added to retry queue

2. Peer Connection Flow

1. Application calls sync.connect_to_peer(address)
2. Sync creates HandshakeRequest with device info
3. Transport sends handshake to peer
4. Peer responds with HandshakeResponse
5. Both peers verify signatures and protocol versions
6. Successful peers are registered in PeerManager
7. Connection state updated to Connected

3. Sync Relationship Flow

1. Application calls sync.add_tree_sync(peer_id, tree_id)
2. PeerManager stores relationship in sync database
3. Future commits to database trigger sync callbacks
4. Callbacks query relationships from sync database
5. Entries queued for sync with configured peers

BackgroundSync Command Management

Command Structure

The BackgroundSync engine processes commands sent from the frontend:

  • SendEntries: Direct entry transmission to peer
  • QueueEntry: Entry committed, needs sync
  • AddPeer/RemovePeer: Peer registry management
  • CreateRelationship: Database-peer sync mapping
  • StartServer/StopServer: Transport server control
  • ConnectToPeer: Establish peer connection
  • SyncWithPeer: Trigger bidirectional sync
  • Shutdown: Graceful termination

Processing Model

Immediate processing: Commands handled as received

  • No batching delays or queue buildup
  • Failed operations go to retry queue
  • Fire-and-forget for most operations
  • Request-response via oneshot channels when needed

Retry queue: Failed sends with exponential backoff

  • 2^attempts seconds delay, capped at 64s (see the helper below)
  • Configurable max attempts before dropping
  • Processed every 30 seconds by timer
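
As a sketch (not the crate's code), the schedule fits in one helper:

// 2^attempts seconds, capped at 2^6 = 64 seconds.
fn retry_delay_secs(attempts: u32) -> u64 {
    1u64 << attempts.min(6)
}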

Error Handling Strategy

Transient errors: Retry with exponential backoff

  • Network timeouts
  • Temporary peer unavailability
  • Transport-level failures

Persistent errors: Remove after max retries

  • Invalid peer addresses
  • Authentication failures
  • Protocol incompatibilities

Recovery mechanisms:

// Automatic retry tracking
entry.mark_attempted(Some(error.to_string()));

// Cleanup failed entries periodically
queue.cleanup_failed_entries(max_retries)?;

// Metrics for monitoring
let stats = queue.get_sync_statistics()?;

Transport Layer Design

Iroh Transport Configuration

Design Decision: Builder pattern for transport configuration

The Iroh transport uses a builder pattern to support different deployment scenarios:

RelayMode Options:

  • Default: Production deployments use n0's global relay infrastructure
  • Staging: Testing against n0's staging infrastructure
  • Disabled: Local testing without internet dependency
  • Custom: Enterprise deployments with private relay servers

Rationale:

  • Flexibility: Different environments need different configurations
  • Performance: Local tests run faster without relay overhead
  • Privacy: Enterprises can run private relay infrastructure
  • Simplicity: Defaults work for most users without configuration

Address Serialization:

The Iroh transport serializes NodeAddr information as JSON containing:

  • Node ID (cryptographic identity)
  • Direct socket addresses (for P2P connectivity)

This allows the same get_server_address() interface to work for both HTTP (returns socket address) and Iroh (returns rich connectivity info).

Security Design

Authentication Model

Device Identity:

  • Each database instance has an Ed25519 keypair
  • Public key serves as device identifier
  • Private key signs all sync operations

Peer Verification:

  • Handshake includes signature challenge
  • Both peers verify counterpart signatures
  • Only verified peers allowed to sync

Entry Authentication:

  • All entries signed by creating device
  • Receiving peer verifies signatures
  • Invalid signatures rejected

Trust Model

Assumptions:

  • Peers are semi-trusted (authenticated but may be malicious)
  • Private keys are secure
  • Transport layer provides integrity

Threat Mitigation:

  • Man-in-middle: Ed25519 signatures prevent tampering
  • Replay attacks: Entry IDs are content-based (no replays possible)
  • Denial of service: Rate limiting and queue size limits
  • Data corruption: Signature verification catches corruption

Protocol Security

Handshake Protocol:

A -> B: HandshakeRequest {
    device_id: string,
    public_key: ed25519_pubkey,
    challenge: random_bytes(32),
    signature: sign(private_key, challenge)
}

B -> A: HandshakeResponse {
    device_id: string,
    public_key: ed25519_pubkey,
    challenge_response: sign(private_key, original_challenge),
    counter_challenge: random_bytes(32)
}

A -> B: verify(B.public_key, challenge_response, challenge)
B -> A: verify(A.public_key, signature, challenge)
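
The signature steps map onto standard Ed25519 primitives. A minimal illustration using the ed25519-dalek crate (shown for orientation; this is not Eidetica's handshake code, and it assumes ed25519-dalek 2.x with rand 0.8):

use ed25519_dalek::{Signature, Signer, SigningKey, Verifier};
use rand::rngs::OsRng;

fn main() {
    let signing = SigningKey::generate(&mut OsRng);
    let challenge = [7u8; 32]; // stands in for random_bytes(32) above
    let sig: Signature = signing.sign(&challenge);
    // The peer verifies against the advertised public key:
    assert!(signing.verifying_key().verify(&challenge, &sig).is_ok());
}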

Bootstrap-First Protocol:

The sync protocol supports zero-state joining through automatic bootstrap detection:

# Bootstrap Scenario (client has no local database)
A -> B: SyncTreeRequest {
    tree_id: ID,
    our_tips: [] // Empty = bootstrap needed
}

B -> A: BootstrapResponse {
    tree_id: ID,
    root_entry: Entry,
    all_entries: Vec<Entry> // Complete database
}

# Incremental Scenario (client has database)
A -> B: SyncTreeRequest {
    tree_id: ID,
    our_tips: [tip1, tip2, ...] // Current tips
}

B -> A: IncrementalResponse {
    tree_id: ID,
    missing_entries: Vec<Entry>, // New changes for client
    their_tips: [tip1, tip2, ...] // Server's tips for bidirectional sync
}

# Bidirectional Completion (client sends missing entries to server)
A -> B: SendEntriesRequest {
    tree_id: ID,
    entries: Vec<Entry> // Entries server is missing
}

B -> A: SendEntriesResponse {
    success: bool
}

Design Benefits:

  • Unified API: Single request type handles both scenarios
  • Auto-detection: Server determines sync type from empty tips
  • Zero-configuration: No manual bootstrap setup required
  • Efficient: Only transfers necessary data (full or incremental)
  • True Bidirectional: Complete synchronization in a single operation using existing protocol fields

Performance Considerations

Memory Usage

Queue sizing:

  • Default: 100 entries per peer × 100 bytes = 10KB per peer
  • Configurable limits prevent memory exhaustion
  • Automatic cleanup of failed entries

Persistent state:

  • Minimal: ~1KB per peer relationship
  • Periodic cleanup of old history entries
  • Efficient serialization formats

Network Efficiency

Batching benefits:

  • Reduce TCP/HTTP overhead
  • Better bandwidth utilization
  • Fewer transport-layer handshakes

Compression potential:

  • Similar entries share structure
  • JSON/binary format optimization
  • Transport-level compression (HTTP gzip, QUIC)

CPU Usage

Background worker:

  • Configurable check intervals
  • Async processing doesn't block application
  • Efficient queue scanning

Hook execution:

  • Fast in-memory operations only
  • Hook failures don't affect commits
  • Minimal serialization overhead

Configuration Design

Queue Configuration

pub struct SyncQueueConfig {
    pub max_queue_size: usize,      // Size-based flush trigger
    pub max_queue_age_secs: u64,    // Age-based flush trigger
    pub batch_size: usize,          // Max entries per network call
}

Tuning guidelines:

  • High-frequency apps: Lower max_queue_age_secs (5-15s)
  • Batch workloads: Higher max_queue_size (200-1000)
  • Low bandwidth: Lower batch_size (10-25)
  • High bandwidth: Higher batch_size (100-500); an example profile follows
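
For example, a low-bandwidth profile assembled from these guidelines (values are illustrative):

let config = SyncQueueConfig {
    max_queue_size: 100,
    max_queue_age_secs: 30,
    batch_size: 25, // small batches for constrained links
};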

Worker Configuration

pub struct SyncFlushConfig {
    pub check_interval_secs: u64,   // How often to check for flushes
    pub enabled: bool,              // Enable/disable background worker
}

Trade-offs:

  • Lower check_interval = more responsive, higher CPU
  • Higher check_interval = less responsive, lower CPU

Implementation Strategy

Phase 1: Core Infrastructure ✅

  • BackgroundSync engine with command pattern
  • Hook-based change detection
  • Basic peer management
  • HTTP transport
  • Ed25519 handshake protocol

Phase 2: Production Features ✅

  • Iroh P2P transport (handler needs fix)
  • Retry queue with exponential backoff
  • Sync state persistence via DocStore
  • Channel-based communication
  • 78 integration tests passing

Phase 3: Advanced Features

  • Sync priorities and QoS
  • Bandwidth throttling
  • Monitoring and metrics
  • Multi-database coordination

Phase 4: Scalability

  • Persistent queue spillover
  • Streaming for large entries
  • Advanced conflict resolution
  • Performance analytics

Testing Strategy

Unit Testing

Component isolation:

  • Mock transport layer for networking tests
  • In-memory backends for storage tests
  • Deterministic time for age-based tests

Coverage targets:

  • Queue operations: 100%
  • Hook execution: 100%
  • Error handling: 95%
  • State management: 95%

Integration Testing

Multi-peer scenarios:

  • 2-peer bidirectional sync
  • 3+ peer mesh networks
  • Database sync relationship management
  • Network failure recovery

Performance testing:

  • Large queue handling
  • High-frequency updates
  • Memory usage under load
  • Network efficiency measurement

End-to-End Testing

Real network conditions:

  • Simulated network failures
  • High latency connections
  • Bandwidth constraints
  • Concurrent peer connections

Migration and Compatibility

Backward Compatibility

Protocol versioning:

  • Version negotiation in handshake
  • Graceful degradation for older versions
  • Clear upgrade paths

Data format evolution:

  • Extensible serialization formats
  • Schema migration strategies
  • Rollback procedures

Deployment Considerations

Configuration migration:

  • Default configuration for new installations
  • Migration scripts for existing data
  • Validation of configuration parameters

Operational procedures:

  • Health check endpoints
  • Monitoring integration
  • Log aggregation and analysis

Future Evolution

Planned Enhancements

  1. Selective sync: Per-store sync control
  2. Conflict resolution: Advanced merge strategies
  3. Performance: Compression and protocol optimization
  4. Monitoring: Rich metrics and observability
  5. Scalability: Large-scale deployment support

Research Areas

  1. Byzantine fault tolerance: Handle malicious peers
  2. Incentive mechanisms: Economic models for sync
  3. Privacy: Encrypted sync protocols
  4. Consensus: Distributed agreement protocols
  5. Sharding: Horizontal scaling techniques

Success Metrics

Performance Targets

  • Queue latency: < 1ms for queue operations
  • Sync latency: < 5s for small changes in normal conditions
  • Throughput: > 1000 entries/second per peer
  • Memory usage: < 10MB for 100 active peers

Reliability Targets

  • Availability: 99.9% sync success rate
  • Recovery: < 30s to resume after network failure
  • Consistency: 100% eventual consistency (no data loss)
  • Security: 0 known authentication bypasses

Usability Targets

  • Setup time: < 5 minutes for basic configuration
  • Documentation: Complete API and troubleshooting guides
  • Error messages: Clear, actionable error descriptions
  • Monitoring: Built-in observability for operations teams

Settings Storage Design

Overview

This document describes how Eidetica stores, retrieves, and tracks settings in databases. Settings are stored exclusively in the _settings store and tracked via entry metadata for efficient access.

Architecture

Settings Storage

Settings are stored in the _settings store (constant SETTINGS in constants.rs):

// Settings structure in _settings store
{
    "auth": {
        "key_name": {
            "key": "...",           // Public key
            "permissions": "...",   // Permission level
            "status": "..."         // Active/Revoked
        }
    }
    // Future: tree_config, replication, etc.
}

Key Properties:

  • Data Type: Doc CRDT for deterministic merging
  • Location: Exclusively in _settings store
  • Access: Through Transaction::get_settings() method

Settings Retrieval

Settings can be accessed through two primary interfaces:

SettingsStore provides a type-safe, high-level interface for settings management:

use eidetica::store::SettingsStore;

// Create a SettingsStore from a transaction
let settings_store = transaction.get_settings()?;

// Type-safe access to common settings
let database_name = settings_store.get_name()?;
let auth_settings = settings_store.get_auth_settings()?;

Transaction API

Transaction::get_settings() returns a SettingsStore that handles:

  • Historical state: Computed from all relevant entries in the database
  • Staged changes: Any modifications to _settings in the current transaction

Entry Metadata

Every entry includes metadata tracking settings state:

#[derive(Debug, Clone, Serialize, Deserialize)]
struct EntryMetadata {
    /// Tips of the _settings store at the time this entry was created
    settings_tips: Vec<ID>,
    /// Random entropy for ensuring unique IDs for root entries
    entropy: Option<u64>,
}

Metadata Properties:

  • Automatically populated by Transaction::commit()
  • Used for efficient settings validation in sparse checkouts
  • Stored in TreeNode.metadata field as serialized JSON

SettingsStore API

Overview

SettingsStore provides a specialized, type-safe interface for managing the _settings subtree. It wraps DocStore to offer convenient methods for common settings operations while maintaining proper CRDT semantics and transaction boundaries.

Key Benefits

  • Type Safety: Eliminates raw CRDT manipulation for common operations
  • Convenience: Direct methods for authentication key management
  • Atomicity: Closure-based updates ensure atomic multi-step operations
  • Validation: Built-in validation for authentication configurations
  • Abstraction: Hides implementation details while providing escape hatch via as_doc_store()

Primary Methods

impl SettingsStore {
    // Core settings management
    fn get_name(&self) -> Result<String>;
    fn set_name(&self, name: &str) -> Result<()>;

    // Authentication key management
    fn set_auth_key(&self, key_name: &str, key: AuthKey) -> Result<()>;
    fn get_auth_key(&self, key_name: &str) -> Result<AuthKey>;
    fn revoke_auth_key(&self, key_name: &str) -> Result<()>;

    // Complex operations via closure
    fn update_auth_settings<F>(&self, f: F) -> Result<()>
    where F: FnOnce(&mut AuthSettings) -> Result<()>;

    // Advanced access
    fn as_doc_store(&self) -> &DocStore;
    fn validate_entry_auth(&self, sig_key: &SigKey, instance: Option<&Instance>) -> Result<ResolvedAuth>;
}

Data Structures

Entry Structure

pub struct Entry {
    database: TreeNode,        // Main database node with metadata
    stores: Vec<SubTreeNode>,  // Named stores including _settings
    sig: SigInfo,              // Signature information
}

TreeNode Structure

struct TreeNode {
    pub root: ID,                   // Root entry ID of the database
    pub parents: Vec<ID>,           // Parent entry IDs in main database history
    pub metadata: Option<RawData>,  // Structured metadata (settings tips, entropy)
}

Note: TreeNode no longer contains a data field - all data is stored in named stores.

SubTreeNode Structure

struct SubTreeNode {
    pub name: String,        // Store name (e.g., "_settings")
    pub parents: Vec<ID>,    // Parent entries in store history
    pub data: RawData,       // Serialized store data
}

Authentication Settings

Authentication configuration is stored in _settings.auth:

AuthSettings Structure

pub struct AuthSettings {
    inner: Doc,  // Wraps Doc data from _settings.auth
}

Key Operations:

  • add_key(): Add/update authentication keys
  • revoke_key(): Mark keys as revoked
  • get_key(): Retrieve specific keys
  • get_all_keys(): Get all authentication keys

Authentication Flow

  1. Settings Access: Transaction::get_settings() retrieves current auth configuration
  2. Key Resolution: AuthValidator resolves key names to full key information
  3. Permission Check: Validates operation against key permissions
  4. Signature Verification: Verifies entry signatures match configured keys

Usage Patterns

Reading Settings

// In a Transaction context
let settings_store = transaction.get_settings()?;

// Access database name
let name = settings_store.get_name()?;

// Access auth configuration
let auth_settings = settings_store.get_auth_settings()?;

Modifying Settings

Using SettingsStore

use eidetica::store::SettingsStore;
use eidetica::auth::{AuthKey, Permission};

// Get a SettingsStore handle for type-safe operations
let settings_store = transaction.get_settings()?;

// Update database name
settings_store.set_name("My Database")?;

// Set authentication keys with validation (upsert behavior)
let auth_key = AuthKey::active(
    "ed25519:user_public_key",
    Permission::Write(10),
)?;
settings_store.set_auth_key("alice", auth_key)?;

// Perform complex auth operations atomically
settings_store.update_auth_settings(|auth| {
    auth.overwrite_key("bob", bob_key)?;
    auth.revoke_key("old_user")?;
    Ok(())
})?;

// Commit the transaction
transaction.commit()?;

Using DocStore Directly (Low-Level)

// Get a DocStore handle for the _settings store
let mut settings_store = transaction.get_store::<DocStore>("_settings")?;

// Update a setting
settings_store.set("name", "My Database")?;

// Commit the transaction
transaction.commit()?;

Bootstrap Process

When creating a database with authentication:

  1. First entry includes auth configuration in _settings.auth
  2. Transaction::commit() detects bootstrap scenario
  3. Allows self-signed entry to establish initial auth configuration

Design Benefits

  1. Single Source of Truth: All settings in _settings store
  2. CRDT Semantics: Deterministic merge resolution for concurrent updates
  3. Efficient Access: Metadata tips enable quick settings retrieval
  4. Clean Architecture: Entry is pure data, Transaction handles business logic
  5. Extensibility: Easy to add new setting categories alongside auth

Implementation Status: 🔵 Proposed

Users System

This design document outlines a comprehensive multi-user system for Eidetica that provides user isolation, password-based authentication, and per-user key management.

Problem Statement

The current implementation has no concept of users:

  1. No User Isolation: All keys and settings are stored at the Instance level, shared across all operations.

  2. No Authentication: There's no way to protect access to private keys or restrict database operations to specific users.

  3. No Multi-User Support: Only one implicit "user" can work with an Instance at a time.

  4. Key Management Challenges: All private keys are accessible to anyone with Instance access, with no encryption or access control.

  5. No User Preferences: Users cannot have personalized settings, such as which databases they care about or how they want to sync them.

Goals

  1. Unified Architecture: Single implementation that supports both embedded (single-user ergonomics) and server (multi-user) use cases.

  2. Multi-User Support: Multiple users can have accounts on a single Instance, each with isolated keys and preferences.

  3. Password-Based Authentication: Users authenticate with passwords to access their keys and perform operations.

  4. User Isolation: Each user's private keys and preferences are encrypted and isolated from other users.

  5. Root User: A special system user that the Instance uses for infrastructure operations.

  6. User Preferences: Users can configure which databases they care about and how they want to sync them.

  7. Database Tracking: Instance-wide visibility into which databases exist and which users access them.

  8. Ergonomic APIs: Simple single-user API for embedded apps, explicit multi-user API for servers (both build on same foundation).

Non-Goals

  1. Multi-Factor Authentication: Advanced auth methods deferred to future work.
  2. Role-Based Access Control: Complex permission systems beyond user isolation are out of scope.
  3. User Groups: Team/organization features are not included.
  4. Federated Identity: External identity providers are not addressed.

Proposed Solution

Architecture Overview

The system separates infrastructure management (Instance) from contextual operations (User):

Instance (Infrastructure Layer)
├── Backend Storage (local only, not in databases)
│   └── _device_key (SigningKey for Instance identity)
│
├── System Databases (separate databases, authenticated with _device_key)
│   ├── _instance
│   │   └── Instance configuration and metadata
│   ├── _users (Table with UUID primary keys)
│   │   └── User directory: Maps UUID → UserInfo (username stored in UserInfo)
│   ├── _databases
│   │   └── Database tracking: Maps database_id → DatabaseTracking
│   └── _sync
│       └── Sync configuration and bootstrap requests
│
└── User Management
    ├── User creation (with or without password)
    └── User login (returns User session)

User (Operations Layer - returned from login)
├── User session with decrypted keys
├── Database operations (new, load, find)
├── Key management (add, list, get)
└── User preferences

Key Architectural Principle: Instance handles infrastructure (user accounts, backend, system databases). User handles all contextual operations (database creation, key management). All operations run in a User context after login.

Core Data Structures

1. UserInfo (stored in _users database)

Storage: Users are stored in a Table with auto-generated UUID primary keys. The username field is used for login lookups via search operations.

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct UserInfo {
    /// Unique username (login identifier)
    /// Note: Stored with UUID primary key in Table, username used for search
    pub username: String,

    /// ID of the user's private database
    pub user_database_id: ID,

    /// Password hash (using Argon2id)
    /// None for passwordless users (single-user embedded mode)
    pub password_hash: Option<String>,

    /// Salt for password hashing (base64 encoded string)
    /// None for passwordless users (single-user embedded mode)
    pub password_salt: Option<String>,

    /// User account creation timestamp
    pub created_at: u64,

    /// Last login timestamp
    pub last_login: Option<u64>,

    /// Account status
    pub status: UserStatus,
}

#[derive(Clone, Debug, Serialize, Deserialize)]
pub enum UserStatus {
    Active,
    Disabled,
    Locked,
}

2. UserProfile (stored in user's private database _settings subtree)

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct UserProfile {
    /// Username
    pub username: String,

    /// Display name
    pub display_name: Option<String>,

    /// Email or other contact info
    pub contact_info: Option<String>,

    /// User preferences
    pub preferences: UserPreferences,
}

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct UserPreferences {
    /// Default sync behavior
    pub default_sync_enabled: bool,

    /// Other user-specific settings
    pub properties: HashMap<String, String>,
}

3. UserKey (stored in user's private database keys subtree)

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct UserKey {
    /// Key identifier (typically the base64-encoded public key string)
    pub key_id: String,

    /// Private key bytes (encrypted or unencrypted based on encryption field)
    pub private_key_bytes: Vec<u8>,

    /// Encryption metadata
    pub encryption: KeyEncryption,

    /// Display name for this key
    pub display_name: Option<String>,

    /// When this key was created
    pub created_at: u64,

    /// Last time this key was used
    pub last_used: Option<u64>,

    /// Database-specific SigKey mappings
    /// Maps: Database ID → SigKey used in that database's auth settings
    pub database_sigkeys: HashMap<ID, String>,
}

#[derive(Clone, Debug, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "lowercase")]
pub enum KeyEncryption {
    /// Key is encrypted with password-derived key
    Encrypted {
        /// Encryption nonce/IV (12 bytes for AES-GCM)
        nonce: Vec<u8>,
    },
    /// Key is stored unencrypted (passwordless users only)
    Unencrypted,
}
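
The Encrypted variant corresponds to standard AEAD usage. A sketch with the aes-gcm crate; in the real flow the key would be derived from the user's password (e.g. via Argon2) rather than freshly generated:

use aes_gcm::{
    aead::{Aead, AeadCore, KeyInit, OsRng},
    Aes256Gcm,
};

fn demo() -> Result<(), aes_gcm::Error> {
    let key = Aes256Gcm::generate_key(&mut OsRng); // stand-in for a password-derived key
    let cipher = Aes256Gcm::new(&key);
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng); // 12 bytes, stored in KeyEncryption::Encrypted
    let ciphertext = cipher.encrypt(&nonce, b"private key bytes".as_ref())?;
    let plaintext = cipher.decrypt(&nonce, ciphertext.as_ref())?;
    assert_eq!(plaintext, b"private key bytes");
    Ok(())
}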

4. UserDatabasePreferences (stored in user's private database databases Table)

Purpose: Tracks which databases a user cares about and their per-user sync preferences. The User tracks preferences (what the user wants), while the Sync module tracks status (what's happening). This separation allows multiple users with different sync preferences to sync the same database in a single Instance.

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct UserDatabasePreferences {
    /// Database ID being tracked
    pub database_id: ID,

    /// Which user key to use for this database
    pub key_id: String,

    /// User's sync preferences for this database
    pub sync_settings: SyncSettings,

    /// When user added this database
    pub added_at: i64,
}

#[derive(Clone, Debug, Serialize, Deserialize, Default)]
pub struct SyncSettings {
    /// Whether user wants to sync this database
    pub sync_enabled: bool,

    /// Sync on commit
    pub sync_on_commit: bool,

    /// Sync interval (if periodic)
    pub interval_seconds: Option<u64>,

    /// Additional sync configuration
    pub properties: HashMap<String, String>,
}

#[derive(Clone, Debug)]
pub struct DatabasePreferences {
    /// Database ID to add/update
    pub database_id: ID,

    /// Which user key to use for this database
    pub key_id: String,

    /// Sync settings for this database
    pub sync_settings: SyncSettings,
}

Design Notes:

  • SigKey Discovery: When adding a database via add_database(), the system automatically discovers which SigKey the user can use via Database::find_sigkeys(), selecting the highest-permission SigKey available. The discovered SigKey is stored in UserKey.database_sigkeys HashMap.

  • Separation of Concerns: The key_id in UserDatabasePreferences references the user's key, while the actual SigKey mapping is stored in UserKey.database_sigkeys. This allows the same key to use different SigKeys in different databases.

  • Sync Settings vs Sync Status: User preferences indicate what the user wants (sync_enabled, sync_on_commit), while the Sync module tracks actual sync status (last_synced, connection state). Multiple users can have different preferences for the same database.

5. DatabaseTracking (stored in _databases table)

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct DatabaseTracking {
    /// Database ID (this is the key in the table)
    pub database_id: ID,

    /// Cached database name (for quick lookup)
    pub name: Option<String>,

    /// Users who have this database in their preferences
    pub users: Vec<String>,

    /// Database creation time
    pub created_at: u64,

    /// Last modification time
    pub last_modified: u64,

    /// Additional metadata
    pub metadata: HashMap<String, String>,
}

System Databases

The Instance manages four separate system databases, all authenticated with _device_key:

_instance System Database

  • Type: Separate database
  • Purpose: Instance configuration and management
  • Structure: Configuration settings, metadata, system policies
  • Authentication: _device_key as Admin; admin users can be granted access
  • Access: Admin users have Admin permission, regular users have Read permission
  • Created: On Instance initialization

_users System Database

  • Type: Separate database
  • Purpose: User directory and authentication
  • Structure: Table with UUID primary keys, stores UserInfo (username field for login lookups)
  • Authentication: _device_key as Admin
  • Access: Admin users can manage users
  • Created: On Instance initialization
  • Note: Username uniqueness enforced at application layer via search; see Race Conditions section

_databases System Database

  • Type: Separate database
  • Purpose: Instance-wide database registry and optimization
  • Structure: Table mapping database_id → DatabaseTracking
  • Authentication: _device_key as Admin
  • Maintenance: Updated when users add/remove databases from preferences
  • Benefits: Fast discovery of databases, see which users care about each DB
  • Created: On Instance initialization

_sync System Database

  • Type: Separate database (existing)
  • Purpose: Synchronization configuration and bootstrap request management
  • Structure: Various subtrees for sync settings, peer info, bootstrap requests
  • Authentication: _device_key as Admin
  • Access: Managed by Instance and Sync module
  • Created: When sync is enabled via Instance::enable_sync()

Instance Identity vs User Management

The Instance identity is separate from user management:

Instance Identity

The Instance uses _device_key for its identity:

  • Storage: Stored in backend (local storage, not in any database)
  • Purpose: Instance sync identity and system database authentication
  • Access: Available to Instance on startup (no password required)
  • Usage: Used to authenticate to all system databases as Admin

User Management

Users are created by administrators or self-registration:

  • Users authenticate with passwords
  • Each user has isolated key storage and preferences
  • Users must log in to perform operations

User Lifecycle:

  1. Created via Instance::create_user() by an admin
  2. User logs in via Instance::login_user()
  3. User session provides access to keys and preferences
  4. User logs out via User::logout()

Library Architecture Layers

The library separates infrastructure (Instance) from contextual operations (User):

Instance Layer: Infrastructure Management

Instance manages the multi-user infrastructure and system resources:

Initialization:

  1. Load or generate _device_key from backend
  2. Create system databases (_instance, _users, _databases) authenticated with _device_key
  3. Initialize Instance with backend and system databases

Responsibilities:

  • User account management (create, login)
  • System database maintenance
  • Backend coordination
  • Database tracking

Key Points:

  • Instance is always multi-user underneath
  • No direct database or key operations
  • All operations require a User session

User Layer: Contextual Operations

User represents an authenticated session with decrypted keys:

Creation:

  • Returned from Instance::login_user(username, Option<password>)
  • Contains decrypted private keys in memory
  • Has access to user's preferences and database mappings

Responsibilities:

  • Database operations (create_database, open_database, find_database)
  • Key management (add_private_key, list_keys, get_signing_key)
  • Database preferences
  • Bootstrap approval

Key Points:

  • All database creation and key management happens through User
  • Keys are zeroized on logout or drop
  • Clean separation between users

Passwordless Users

For embedded/single-user scenarios, users can be created without passwords:

Creation:

// Create passwordless user
instance.create_user("alice", None)?;

// Login without password
let user = instance.login_user("alice", None)?;

// Use User API normally
let db = user.new_database(settings)?;

Characteristics:

  • No authentication overhead
  • Keys stored unencrypted in user database
  • Perfect for embedded apps, CLI tools, single-user deployments
  • Still uses full User API for operations

Password-Protected Users

For multi-user scenarios, users have password-based authentication:

Creation:

// Create password-protected user
instance.create_user("bob", Some("password123"))?;

// Login with password verification
let user = instance.login_user("bob", Some("password123"))?;

// Use User API normally
let db = user.new_database(settings)?;

Characteristics:

  • Argon2id password hashing (sketched below)
  • AES-256-GCM key encryption
  • Perfect for servers, multi-tenant applications
  • Clear separation between users
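
The Argon2id step, sketched with the argon2 crate; this shows the primitive only, not Eidetica's actual user-creation code:

use argon2::{
    password_hash::{rand_core::OsRng, PasswordHash, PasswordHasher, PasswordVerifier, SaltString},
    Argon2,
};

fn demo() -> Result<(), argon2::password_hash::Error> {
    // At account creation: random salt + Argon2id hash.
    let salt = SaltString::generate(&mut OsRng);
    let hash = Argon2::default()
        .hash_password(b"password123", &salt)?
        .to_string(); // stored as UserInfo::password_hash

    // At login: parse and verify.
    let parsed = PasswordHash::new(&hash)?;
    Argon2::default().verify_password(b"password123", &parsed)
}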

Instance API

Instance manages infrastructure and user accounts:

Initialization

impl Instance {
    /// Create instance
    /// - Loads/generates _device_key from backend
    /// - Creates system databases (_instance, _users, _databases)
    pub fn open(backend: Box<dyn BackendImpl>) -> Result<Self>;
}

User Management

impl Instance {
    /// Create a new user account
    /// Returns user_uuid (the generated primary key)
    pub fn create_user(
        &self,
        username: &str,
        password: Option<&str>,
    ) -> Result<String>;

    /// Login a user (returns User session object)
    /// Searches by username; errors if duplicate usernames detected
    pub fn login_user(
        &self,
        username: &str,
        password: Option<&str>,
    ) -> Result<User>;

    /// List all users (returns usernames)
    pub fn list_users(&self) -> Result<Vec<String>>;

    /// Disable a user account
    pub fn disable_user(&self, username: &str) -> Result<()>;
}

User API

/// User session object, returned after successful login
///
/// Represents an authenticated user with decrypted private keys loaded in memory.
/// All contextual operations (database creation, key management) happen through User.
pub struct User {
    user_uuid: String,   // Stable internal UUID (Table primary key)
    username: String,    // Username (login identifier)
    user_database: Database,
    instance: WeakInstance,  // Weak reference to Instance for storage access
    /// Decrypted user keys (in memory only during session)
    key_manager: UserKeyManager,
}

impl User {
    /// Get the internal user UUID (stable identifier)
    pub fn user_uuid(&self) -> &str;

    /// Get the username (login identifier)
    pub fn username(&self) -> &str;

    // === Database Operations ===

    /// Create a new database, selecting the first available signing key
    /// (invoked as user.new_database(settings) in the examples and flows
    /// in this document)
    pub fn new_database(&self, settings: Doc) -> Result<Database>;

    /// Create a new database in this user's context with an explicit signing key
    pub fn create_database(&self, settings: Doc, signing_key: &str) -> Result<Database>;

    /// Load a database using this user's keys
    pub fn open_database(&self, database_id: &ID) -> Result<Database>;

    /// Find databases by name
    pub fn find_database(&self, name: impl AsRef<str>) -> Result<Vec<Database>>;

    /// Find the best key for accessing a database
    /// (signature reconstructed from the Database Access Flow below:
    /// returns the id of the owned key with the highest permission level)
    pub fn find_key(&self, database_id: &ID) -> Result<String>;

    /// Get the SigKey mapping for a key in a specific database
    pub fn key_mapping(
        &self,
        key_id: &str,
        database_id: &ID,
    ) -> Result<Option<String>>;

    /// Add a SigKey mapping for a key in a specific database
    pub fn map_key(
        &mut self,
        key_id: &str,
        database_id: &ID,
        sigkey: &str,
    ) -> Result<()>;

    // === Database Tracking and Preferences ===

    /// Add a database to this user's tracked databases with auto-discovery of SigKeys.
    pub fn add_database(
        &mut self,
        prefs: DatabasePreferences,
    ) -> Result<()>;

    /// List all databases this user is tracking.
    pub fn list_database_prefs(&self) -> Result<Vec<UserDatabasePreferences>>;

    /// Get the preferences for a specific database.
    pub fn database_prefs(
        &self,
        database_id: &ID,
    ) -> Result<UserDatabasePreferences>;

    /// Set/update preferences for a database (upsert behavior).
    /// Alias for add_database.
    pub fn set_database(
        &mut self,
        prefs: DatabasePreferences,
    ) -> Result<()>;

    /// Remove a database from this user's tracked databases.
    pub fn remove_database(&mut self, database_id: &ID) -> Result<()>;

    // === Key Management ===

    /// Generate a new private key for this user
    pub fn add_private_key(
        &mut self,
        display_name: Option<&str>,
    ) -> Result<String>;

    /// List all key IDs owned by this user
    pub fn list_keys(&self) -> Result<Vec<String>>;

    /// Get a signing key by its ID
    pub fn get_signing_key(&self, key_id: &str) -> Result<SigningKey>;

    // === Session Management ===

    /// Logout (clears decrypted keys from memory)
    pub fn logout(self) -> Result<()>;
}
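
A hedged sketch of a session against this API (database_id is assumed to be an ID obtained elsewhere, e.g. via find_database):

let mut user = instance.login_user("alice", None)?;

// Generate a second key and record the SigKey it is known by
// in a particular database.
let key_id = user.add_private_key(Some("laptop"))?;
user.map_key(&key_id, &database_id, "alice_laptop")?;

// Open a database with the user's keys, then end the session,
// clearing the decrypted keys from memory.
let db = user.open_database(&database_id)?;
user.logout()?;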

UserKeyManager (Internal)

/// Internal key manager that holds decrypted keys during user session
struct UserKeyManager {
    /// Decrypted keys (key_id → SigningKey)
    decrypted_keys: HashMap<String, SigningKey>,

    /// Key metadata (loaded from user database)
    key_metadata: HashMap<String, UserKey>,

    /// User's password-derived encryption key (for saving new keys)
    encryption_key: Vec<u8>,
}

See key_management.md for detailed implementation.

User Flows

User Creation Flow

Password-Protected User:

  1. Admin calls instance.create_user(username, Some(password))
  2. System searches _users Table for existing username (race condition possible)
  3. System hashes password with Argon2id and random salt
  4. Generates default Ed25519 keypair for the user (kept in memory only)
  5. Retrieves instance _device_key public key from backend
  6. Creates user database with authentication for both _device_key (Admin) and user's key (Admin)
  7. Encrypts user's private key with password-derived key (AES-256-GCM)
  8. Stores encrypted key in user database keys Table (using public key as identifier, signed with _device_key)
  9. Creates UserInfo and inserts into _users Table (auto-generates UUID primary key)
  10. Returns user_uuid

Passwordless User:

  1. Admin calls instance.create_user(username, None)
  2. System searches _users Table for existing username (race condition possible)
  3. Generates default Ed25519 keypair for the user (kept in memory only)
  4. Retrieves instance _device_key public key from backend
  5. Creates user database with authentication for both _device_key (Admin) and user's key (Admin)
  6. Stores unencrypted private key in user database keys Table (marked as Unencrypted)
  7. Creates UserInfo with None for password fields and inserts into _users Table
  8. Returns user_uuid

Note: For password-protected users, the keypair is never stored unencrypted in the backend. For passwordless users, keys are stored unencrypted for instant access. The user database is authenticated with both the instance _device_key (for admin operations) and the user's default key (for user ownership). Initial entries are signed with _device_key.

Login Flow

Password-Protected User:

  1. User calls instance.login_user(username, Some(password))
  2. System searches _users Table by username
  3. If multiple users with same username found, returns DuplicateUsersDetected error
  4. Verifies password against stored hash
  5. Loads user's private database
  6. Loads encrypted keys from user database
  7. Derives encryption key from password
  8. Decrypts all private keys
  9. Creates UserKeyManager with decrypted keys
  10. Updates last_login timestamp in _users Table (using UUID)
  11. Returns User session object (contains both user_uuid and username)

Passwordless User:

  1. User calls instance.login_user(username, None)
  2. System searches _users Table by username
  3. If multiple users with same username found, returns DuplicateUsersDetected error
  4. Verifies UserInfo has no password (password_hash and password_salt are None)
  5. Loads user's private database
  6. Loads unencrypted keys from user database
  7. Creates UserKeyManager with keys (no decryption needed)
  8. Returns User session object (contains both user_uuid and username)

Database Creation Flow

  1. User obtains User session via login
  2. User creates database settings (Doc with name, etc.)
  3. Calls user.new_database(settings)
  4. System selects first available signing key from user's keyring
  5. Creates database using Database::new() for root entry creation
  6. Stores database_sigkeys mapping in UserKey for future loads
  7. Returns Database object
  8. User can now create transactions and perform operations on the database

Database Access Flow

The user accesses databases through the User::open_database() method, which handles all key management automatically:

  1. User calls user.open_database(&database_id)
  2. System finds appropriate key via find_key()
    • Checks user's key metadata for SigKey mappings to this database
    • Verifies keys are authorized in database's auth settings
    • Selects key with highest permission level
  3. System retrieves decrypted SigningKey from UserKeyManager
  4. System gets SigKey mapping via key_mapping()
  5. System loads Database with Database::open()
    • Database stores KeySource::Provided with signing key and sigkey
  6. User creates transactions normally: database.new_transaction()
    • Transaction automatically receives provided key from Database
    • No backend key lookup required
  7. User performs operations and commits
    • Transaction uses provided SigningKey directly during commit()

Key Insight: Once a Database is loaded via User::open_database(), all subsequent operations transparently use the user's keys. The user doesn't need to think about key management; it's handled at database load time.
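
A minimal sketch of this flow (database_id is assumed; staging of store changes is elided):

// Key selection and SigKey lookup happen once, at load time.
let db = user.open_database(&database_id)?;

// Transactions transparently receive the provided key.
let txn = db.new_transaction()?;
// ... stage changes via store handles ...
txn.commit()?; // signs with the user's SigningKey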

Key Addition Flow

Password-Protected User:

  1. User calls user.add_private_key(display_name)
  2. System generates new Ed25519 keypair
  3. Encrypts private key with user's password-derived key (AES-256-GCM)
  4. Creates UserKey metadata with Encrypted variant
  5. Stores encrypted key in user database
  6. Adds to in-memory UserKeyManager
  7. Returns key_id

Passwordless User:

  1. User calls user.add_private_key(display_name)
  2. System generates new Ed25519 keypair
  3. Creates UserKey metadata with Unencrypted variant
  4. Stores unencrypted key in user database
  5. Adds to in-memory UserKeyManager
  6. Returns key_id

Bootstrap Integration

The Users system integrates with the bootstrap protocol for access control:

  • User Authentication: Bootstrap requests approved by logged-in users
  • Permission Checking: Only users with a key that has Admin permission for the database can approve bootstrap requests
  • Key Discovery: User's key manager finds appropriate Admin key for database
  • Transaction Creation: Uses user's Admin key SigKey to add requesting key to database auth

See bootstrap.md for detailed bootstrap protocol and wildcard permissions.

Integration with Key Management

The key management design (see key_management.md) provides the technical implementation details for:

  1. Password-Derived Encryption: How user passwords are used to derive encryption keys for private key storage
  2. Key Encryption Format: Specific encryption algorithms and formats used
  3. Database ID → SigKey Mapping: Technical structure and storage
  4. Key Discovery Algorithms: How keys are matched to databases and permissions

The Users system provides the architectural context:

  • Who owns keys (users)
  • How keys are isolated (user databases)
  • When keys are decrypted (during user session)
  • How keys are managed (User API)

Security Considerations

Password Security

  1. Password Hashing: Use Argon2id for password hashing with appropriate parameters
  2. Random Salts: Each user has a unique random salt
  3. No Password Storage: Only hashes stored, never plaintext
  4. Rate Limiting: Login attempts should be rate-limited

Key Encryption

  1. Password-Derived Keys: Use PBKDF2 or Argon2 to derive encryption keys from passwords
  2. Authenticated Encryption: Use AES-GCM or ChaCha20-Poly1305
  3. Unique Nonces: Each encrypted key has a unique nonce/IV
  4. Memory Security: Clear decrypted keys from memory on logout

User Isolation

  1. Database-Level Isolation: Each user's private database is separate
  2. Access Control: Users cannot access other users' databases or keys
  3. Authentication Required: All user operations require valid session
  4. Session Timeouts: Consider implementing session expiration

Instance Identity Protection

  1. Backend Security: _device_key stored in backend with appropriate file permissions
  2. Limited Exposure: _device_key only used for system database authentication
  3. Audit Logging: Log Instance-level operations on system databases
  4. Key Rotation: Support rotating _device_key (requires updating all system databases)

Known Limitations

Username Uniqueness Race Condition

Issue: Username uniqueness is enforced at the application layer using search-then-insert operations, which creates a race condition in distributed/concurrent scenarios.

Current Behavior:

  • create_user() searches for existing username, then inserts if not found
  • Two concurrent creates with same username can both succeed
  • Results in multiple UserInfo records with same username but different UUIDs

Detection:

  • login_user() searches by username
  • If multiple matches found, returns UserError::DuplicateUsersDetected
  • Prevents login until conflict is resolved manually

Performance Implications

  1. Login Cost: Password hashing and key decryption add latency to login (acceptable)
  2. Memory Usage: Decrypted keys held in memory during session
  3. Database Tracking: O(1) lookup for database metadata and user records (via UUID primary key)
  4. Username Lookup: O(n) search for username validation/login (where n = total users)
  5. Key Discovery: O(n) where n = number of user's keys (typically small)

Implementation Strategy

Phase 1: Core User Infrastructure

  1. Define data structures (UserInfo, UserProfile, UserKey, etc.)
  2. Implement password hashing and verification
  3. Implement key encryption/decryption
  4. Create _instance system database
  5. Create _users system database
  6. Create _databases tracking table
  7. Unit tests for crypto and data structures

Phase 2: User Management API

  1. Implement Instance::create_user()
  2. Implement Instance::login_user()
  3. Implement User struct and basic methods
  4. Implement UserKeyManager
  5. Integration tests for user creation and login

Phase 3: Key Management Integration

  1. Implement User::add_private_key()
  2. Implement User::map_key() for database SigKey mappings
  3. Implement key discovery methods
  4. Update Transaction to work with User sessions
  5. Tests for key operations

Phase 4: Database Preferences

  1. Implement database preference storage
  2. Implement database tracking updates
  3. Implement preference query APIs
  4. Tests for preference management

Phase 5: Migration and Integration

  1. Update existing code to work with Users
  2. Provide migration utilities for existing instances
  3. Update documentation and examples
  4. End-to-end integration tests

Future Work

  1. Multi-Factor Authentication: Add support for TOTP, hardware keys
  2. User Groups/Roles: Team collaboration features
  3. Permission Delegation: Allow users to delegate access to specific databases
  4. Key Recovery: Secure key recovery mechanisms
  5. Session Management: Advanced session features (multiple devices, revocation)
  6. Audit Logs: Comprehensive logging of user operations
  7. User Quotas: Storage and database limits per user

Conclusion

The Users system provides a clean separation between infrastructure (Instance) and contextual operations (User):

Core Architecture:

  • Instance manages infrastructure: user accounts, backend, system databases
  • User handles all contextual operations: database creation, key management
  • Separate system databases (_instance, _users, _databases, _sync)
  • Instance identity (_device_key) stored in backend for system database authentication
  • Strong isolation between users

User Types:

  • Passwordless Users: instant login with no authentication overhead, suited to embedded apps and single-user deployments
  • Password-Protected Users: Argon2id password hashing and AES-256-GCM key encryption for multi-user scenarios

Key Benefits:

  • Clean separation: Instance = infrastructure, User = operations
  • All operations run in User context after login
  • Flexible authentication: users can have passwords or not
  • Instance restart just loads _device_key from backend

Implementation Status: 🔵 Proposed

Key Management Technical Details

This design document describes the technical implementation of key storage, encryption, and discovery within the Eidetica Users system. For the overall architecture and user-centric key management, see users.md.

Overview

Keys in Eidetica are managed at the user level. Each user owns a set of private keys that are:

  • Encrypted with the user's password
  • Stored in the user's private database
  • Mapped to specific SigKeys in different databases
  • Decrypted only during active user sessions

Problem Statement

Key management requires solving several technical challenges:

  1. Secure Storage: Private keys must be encrypted at rest
  2. Password-Derived Encryption: Encryption keys derived from user passwords
  3. SigKey Mapping: Same key can be known by different SigKeys in different databases
  4. Key Discovery: Finding which key to use for a given database operation
  5. Memory Security: Clearing sensitive data after use

Technical Components

Password-Derived Key Encryption

Algorithm: Argon2id for key derivation, AES-256-GCM for encryption

Argon2id Parameters:

  • Memory cost: 64 MiB minimum
  • Time cost: 3 iterations minimum
  • Parallelism: 4 threads
  • Output: 32 bytes for AES-256

Encryption Process:

  1. Derive 256-bit encryption key from password using Argon2id
  2. Generate random 12-byte nonce for AES-GCM
  3. Serialize private key to bytes
  4. Encrypt with AES-256-GCM
  5. Store ciphertext and nonce

Decryption Process:

  1. Derive encryption key from password (same parameters)
  2. Decrypt ciphertext using nonce and encryption key
  3. Deserialize bytes back to SigningKey
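
A round-trip sketch of these two processes using the argon2 and aes-gcm crates (parameter choices follow the numbers above; panics stand in for error handling in this sketch):

use aes_gcm::aead::{Aead, AeadCore, KeyInit, OsRng};
use aes_gcm::{Aes256Gcm, Key};
use argon2::{Algorithm, Argon2, Params, Version};

/// Derive a 256-bit encryption key from a password and per-user salt.
fn derive_key(password: &str, salt: &[u8]) -> [u8; 32] {
    // 64 MiB memory cost, 3 iterations, 4 lanes, 32-byte output.
    let params = Params::new(64 * 1024, 3, 4, Some(32)).expect("valid params");
    let argon2 = Argon2::new(Algorithm::Argon2id, Version::V0x13, params);
    let mut key = [0u8; 32];
    argon2
        .hash_password_into(password.as_bytes(), salt, &mut key)
        .expect("derivation succeeds");
    key
}

fn round_trip(password: &str, salt: &[u8], private_key_bytes: &[u8]) {
    let key = derive_key(password, salt);
    let cipher = Aes256Gcm::new(Key::<Aes256Gcm>::from_slice(&key));

    // Encrypt with a fresh random 12-byte nonce, stored with the ciphertext.
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng);
    let ciphertext = cipher.encrypt(&nonce, private_key_bytes).expect("encrypt");

    // Decrypt with the same derived key and stored nonce;
    // the GCM authentication tag rejects any tampering.
    let plaintext = cipher.decrypt(&nonce, ciphertext.as_ref()).expect("decrypt");
    assert_eq!(plaintext, private_key_bytes);
}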

Key Storage Format

Keys are stored in the user's private database in the keys subtree as a Table:

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct UserKey {
    /// Local key identifier (public key string or hardcoded name)
    /// Examples: "ed25519:ABC123..." or "_device_key"
    pub key_id: String,

    /// Encrypted private key bytes (encrypted with user password-derived key)
    pub encrypted_private_key: Vec<u8>,

    /// Nonce/IV used for encryption (12 bytes for AES-GCM)
    pub nonce: Vec<u8>,

    /// Display name for UI/logging
    pub display_name: Option<String>,

    /// Unix timestamp when key was created
    pub created_at: u64,

    /// Unix timestamp when key was last used for signing
    pub last_used: Option<u64>,

    /// Database-specific SigKey mappings
    /// Maps: Database ID → SigKey string
    pub database_sigkeys: HashMap<ID, String>,
}

Storage Location: User database → keys subtree → Table

Table Key: The key_id field doubles as the table key

SigKey Mapping

A key can be known by different SigKeys in different databases:

Local Key: "ed25519:ABC123..."
├── Database A: SigKey "alice"
├── Database B: SigKey "admin"
└── Database C: SigKey "alice_laptop"

Mapping Storage: The database_sigkeys HashMap in UserKey stores these mappings as database_id → sigkey_string.

Lookup: When creating a transaction, retrieve the appropriate SigKey from the mapping using the database ID.
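
A minimal lookup sketch over the UserKey struct above (the helper name is illustrative):

/// Resolve the SigKey this key is known by in a given database, if any.
fn sigkey_for<'a>(key: &'a UserKey, database_id: &ID) -> Option<&'a str> {
    key.database_sigkeys.get(database_id).map(String::as_str)
}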

Database Access Index

To efficiently find which keys can access a database, we build a reverse index from database auth settings:

/// Built by reading _settings.auth from database tips
pub struct DatabaseAccessIndex {
    /// Maps: Database ID → Vec<(local_key_id, permission)>
    access_map: HashMap<ID, Vec<(String, Permission)>>,
}

Index Building: For each database, read its _settings.auth, match SigKeys to user keys via the database_sigkeys mapping, and store the resulting (key_id, permission) pairs.

Key Lookup: Query the index by database ID to get all user keys with access, optionally filtered by minimum permission level.
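
A sketch of index construction under these rules (db.id() and db.auth_permission() are illustrative helpers standing in for reading _settings.auth from database tips):

impl DatabaseAccessIndex {
    fn build(keys: &HashMap<String, UserKey>, databases: &[Database]) -> Self {
        let mut access_map: HashMap<ID, Vec<(String, Permission)>> = HashMap::new();
        for db in databases {
            for (key_id, user_key) in keys {
                // A user key grants access when it has a SigKey mapping for
                // this database and that SigKey appears in _settings.auth.
                if let Some(sigkey) = user_key.database_sigkeys.get(db.id()) {
                    if let Some(permission) = db.auth_permission(sigkey) {
                        access_map
                            .entry(db.id().clone())
                            .or_default()
                            .push((key_id.clone(), permission));
                    }
                }
            }
        }
        Self { access_map }
    }
}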

Key Discovery

Finding the right key for a database operation involves:

  1. Get Available Keys: Query the DatabaseAccessIndex for keys with access to the database, filtered by minimum permission if needed
  2. Filter to Decrypted Keys: Ensure we have the private key decrypted in memory
  3. Select Best Key: Choose the key with highest permission level for the database
  4. Retrieve SigKey: Get the mapped SigKey from the database_sigkeys field for transaction creation

Memory Security

Decrypted keys are held in memory only during active user sessions:

  • Session-Based: Keys decrypted on login, held in memory during session
  • Explicit Clearing: On logout, overwrite key bytes with zeros using the zeroize crate
  • Drop Safety: Implement Drop to automatically clear keys when manager is destroyed
  • Encryption Key: Also clear the password-derived encryption key from memory
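
A sketch of the clearing logic with the zeroize crate (assuming the SigningKey type clears its own material on drop, as ed25519-dalek does with its zeroize feature):

use zeroize::Zeroize;

impl Drop for UserKeyManager {
    fn drop(&mut self) {
        // Overwrite the password-derived encryption key in place.
        self.encryption_key.zeroize();
        // Dropping the decrypted keys lets each SigningKey zeroize itself.
        self.decrypted_keys.clear();
    }
}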

Implementation Details

UserKeyManager Structure

pub struct UserKeyManager {
    /// Decrypted private keys (only in memory during session)
    /// Map: key_id → SigningKey
    decrypted_keys: HashMap<String, SigningKey>,

    /// Key metadata (including SigKey mappings)
    /// Map: key_id → UserKey
    key_metadata: HashMap<String, UserKey>,

    /// User's password-derived encryption key
    /// Used for encrypting new keys during session
    encryption_key: Vec<u8>,

    /// Database access index (for key discovery)
    access_index: DatabaseAccessIndex,
}

Creation: On user login, derive encryption key from password, decrypt all user's private keys, and build the database access index.

Key Operations:

  • Add Key: Encrypt private key with session encryption key, create metadata, store in both maps
  • Get Key: Retrieve decrypted key by ID, update last_used timestamp
  • Serialize: Export all key metadata (with encrypted keys) for storage

Password Change

When a user changes their password, all keys must be re-encrypted:

  1. Verify Old Password: Authenticate user with current password
  2. Derive New Encryption Key: Generate new salt, derive key from new password
  3. Re-encrypt All Keys: Iterate through decrypted keys, encrypt each with new key
  4. Update Password Hash: Hash new password with new salt
  5. Store Updates: Write all updated UserKey records and password hash in transaction
  6. Update In-Memory State: Replace session encryption key with new one
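
A sketch of step 3, re-encrypting the in-memory keys under a freshly derived key (reencrypt_all is an illustrative helper; persisting the updated records and the new password hash is left to the caller):

use aes_gcm::aead::{Aead, AeadCore, KeyInit, OsRng};
use aes_gcm::{Aes256Gcm, Key};
use zeroize::Zeroize;

fn reencrypt_all(mgr: &mut UserKeyManager, new_key: [u8; 32]) -> Vec<UserKey> {
    let cipher = Aes256Gcm::new(Key::<Aes256Gcm>::from_slice(&new_key));
    let mut updated = Vec::new();
    for (key_id, signing_key) in &mgr.decrypted_keys {
        // Fresh nonce per key, as in the normal encryption process.
        let nonce = Aes256Gcm::generate_nonce(&mut OsRng);
        let ciphertext = cipher
            .encrypt(&nonce, signing_key.to_bytes().as_ref())
            .expect("encrypt");
        let mut record = mgr.key_metadata[key_id].clone();
        record.encrypted_private_key = ciphertext;
        record.nonce = nonce.to_vec();
        updated.push(record);
    }
    // Swap the session encryption key, clearing the old one first.
    mgr.encryption_key.zeroize();
    mgr.encryption_key = new_key.to_vec();
    updated
}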

Security Properties

Encryption Strength

  • Key Derivation: Argon2id with 64 MiB memory, 3 iterations
  • Encryption: AES-256-GCM (authenticated encryption)
  • Key Size: 256-bit encryption keys
  • Nonce: Unique 96-bit nonces for each encryption

Attack Resistance

  • Brute Force: Argon2id parameters make password cracking expensive
  • Keystream Reuse: Unique nonces prevent keystream reuse across encryptions
  • Tampering: GCM authentication tag detects modifications
  • Memory Dumps: Keys cleared from memory on logout

Limitations

  • Password Strength: Security depends on user password strength
  • No HSM Support: Keys stored in software (future enhancement)
  • No Key Recovery: Lost password means lost keys (by design)

Performance Considerations

Login Performance

Password derivation is intentionally slow:

  • Argon2id: ~100-200ms per derivation
  • Key decryption: ~1ms per key
  • Total login time: ~200ms + (num_keys × 1ms)

This is acceptable for login operations.

Runtime Performance

During active session:

  • Key lookups: O(1) from HashMap
  • SigKey lookups: O(1) from HashMap
  • Database key discovery: O(n) where n = number of keys
  • No decryption overhead (keys already decrypted)

Testing Strategy

  1. Unit Tests:

    • Password derivation consistency
    • Encryption/decryption round-trips
    • Key serialization/deserialization
    • SigKey mapping operations
  2. Security Tests:

    • Verify different passwords produce different encrypted keys
    • Verify wrong password fails decryption
    • Verify nonce uniqueness
    • Verify memory clearing
  3. Integration Tests:

    • Full user session lifecycle
    • Key addition and usage
    • Password change flow
    • Multiple keys with different SigKey mappings

Future Enhancements

  1. Hardware Security Module Support: Store keys in HSMs
  2. Key Derivation Tuning: Adjust Argon2 parameters based on hardware
  3. Key Backup/Recovery: Secure key recovery mechanisms
  4. Multi-Device Sync: Sync encrypted keys across devices
  5. Biometric Authentication: Use biometrics instead of passwords where available

Conclusion

This key management implementation provides:

  • Strong encryption of private keys at rest
  • User-controlled key ownership through passwords
  • Flexible SigKey mapping for multi-database use
  • Efficient key discovery for database operations
  • Memory security through session-based decryption

For the overall architecture and user management, see the Users design.

Implementation Status: 🔵 Proposed

Bootstrap and Access Control

This design document describes the bootstrap mechanism for requesting access to databases and the wildcard permission system for open access.

Overview

Bootstrap provides a "knocking" mechanism for clients to request access to databases they don't have permissions for. Wildcard permissions provide an alternative for databases that want to allow open access without requiring bootstrap requests.

Problem Statement

When a client wants to sync a database they don't have access to:

  1. No Direct Access: Client's key is not in the database's auth settings
  2. Need Permission Grant: Requires an admin to add the client's key
  3. Coordination Challenge: Client and admin need a way to coordinate the access grant
  4. Public Databases: Some databases should be openly accessible without coordination

Proposed Solution

Two complementary mechanisms:

  1. Wildcard Permissions: For databases that want open access
  2. Bootstrap Protocol: For databases that want controlled access grants

Wildcard Permissions

Wildcard Key

A database can grant universal permissions by setting the special "*" key in its auth settings:

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct AuthSettings {
    /// Maps SigKey → AuthKey
    /// Special key "*" grants permissions to all clients
    keys: HashMap<String, AuthKey>,
}

How It Works

When a client attempts to sync a database:

  1. Check for wildcard key: If "*" exists in _settings.auth, grant the specified permission to any client
  2. No key required: Client doesn't need their key in the database's auth settings
  3. Immediate access: No bootstrap request or approval needed

Use Cases

Public Read Access: Set wildcard key with Read permission to allow anyone to read the database. Clients can sync immediately without bootstrap.

Open Collaboration: Set wildcard key with Write permission to allow anyone to write (use carefully).

Hybrid Model: Combine wildcard Read permission with specific Write/Admin permissions for named keys. This allows public read access while restricting modifications to specific users.
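
A sketch of the hybrid model using the SettingsStore API defined under API Design below (settings is assumed to be the database's SettingsStore handle):

// Anyone may read; writes remain restricted to named keys in auth settings.
settings.set_wildcard_permission(Permission::Read)?;

// Inspect or revoke the open-access level later.
if let Some(level) = settings.get_wildcard_permission()? {
    println!("wildcard access: {level:?}");
}
settings.remove_wildcard_permission()?;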

Security Considerations

  • Use sparingly: Wildcard permissions bypass authentication
  • Read-only common: Most appropriate for public data
  • Write carefully: Wildcard write allows any client to modify the database
  • Per-database: Each database controls its own wildcard settings

Bootstrap Protocol

Overview

Bootstrap provides a request/approval workflow for controlled access grants:

Client                    Server                     User (with Admin key)
  |                         |                             |
  |-- Sync Request -------→ |                             |
  |                         |-- Check Auth Settings       |
  |                         |   (no matching key)         |
  |                         |                             |
  |←- Auth Required --------| (if no global permissions)  |
  |                         |                             |
  |-- Bootstrap Request --→ |                             |
  |   (with key & perms)    |                             |
  |                         |-- Store in _sync DB -------→|
  |                         |                             |
  |←- Request Pending ------| (Bootstrap ID returned)     |
  |                         |                             |
  |   [Wait for approval]   |                             |
  |                         |                             |
  |                         |           ←-- List Pending -|
  |                         |           --- Pending [] --→|
  |                         |                             |
  |                         |           ←-- Approve ------|
  |                         |←- Add Key to DB Auth -------|
  |                         |   (using user's Admin key)  |
  |                         |                             |
  |-- Retry Normal Sync --→ |                             |
  |                         |-- Check Auth (now has key)  |
  |←- Sync Success ---------| (access granted)            |

Client Bootstrap Request

When a client needs access to a database:

  1. Client attempts normal sync
  2. If auth is required, client calls sync_with_peer_for_bootstrap() with key name and requested permission
  3. Server stores bootstrap request in _sync database
  4. Client receives pending status and waits for approval
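
Sketched with the client API from API Design below (peer_addr, tree_id, and the key name are illustrative):

// Step 2: request access after normal sync reports that auth is required.
sync.sync_with_peer_for_bootstrap(
    &peer_addr,          // server to knock on
    &tree_id,            // database being requested
    "laptop_key",        // client key name to add to auth settings
    Permission::Write,   // requested permission level
)
.await?;
// The request is now stored server-side as Pending; retry normal sync
// once a user with Admin permission approves it.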

Bootstrap Request Storage

Bootstrap requests are stored in the _sync database:

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct BootstrapRequest {
    /// Database being requested
    pub tree_id: ID,

    /// Client's public key (for verification)
    pub requesting_pubkey: String,

    /// Client's key name (to add to auth settings)
    pub requesting_key_name: String,

    /// Permission level requested
    pub requested_permission: Permission,

    /// When request was made
    pub timestamp: String,

    /// Current status
    pub status: RequestStatus,

    /// Client's network address
    pub peer_address: Address,
}

#[derive(Clone, Debug, Serialize, Deserialize)]
pub enum RequestStatus {
    Pending,
    Approved,
    Rejected,
}

Approval by User with Admin Permission

Any logged-in user who has a key with Admin permission for the database can approve the request:

  1. User logs in with instance.login_user()
  2. Lists pending requests with user.pending_bootstrap_requests(&sync)
  3. User selects a key they own that has Admin permission on the target database
  4. Calls user.approve_bootstrap_request(&mut sync, request_id, approving_key_id)
  5. System validates the user owns the specified key
  6. System retrieves the signing key from the user's key manager
  7. System explicitly validates the key has Admin permission on the target database
  8. Creates transaction using the user's signing key
  9. Adds requesting key to database's auth settings
  10. Updates request status to Approved in the sync database
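
A sketch of this workflow using the User API from the Users design ("admin_key" is an illustrative key id owned by the approver with Admin permission on the target database):

let user = instance.login_user("admin", Some("password123"))?;

// Step 2: list pending requests stored in the _sync database.
let pending = user.pending_bootstrap_requests(&sync)?;

// Steps 3-10: approve the first request with an owned Admin key.
if let Some((request_id, _request)) = pending.first() {
    user.approve_bootstrap_request(&mut sync, request_id, "admin_key")?;
}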

Permission Validation Strategy

Bootstrap approval and rejection use explicit permission validation:

  • Approval: The system explicitly checks that the approving user has Admin permission on the target database before adding the requesting key. This provides clear error messages (InsufficientPermission) and fails fast if the user lacks the required permission.

  • Rejection: The system explicitly checks that the rejecting user has Admin permission on the target database before allowing rejection. Since rejection only modifies the sync database (not the target database), explicit validation is necessary to enforce the Admin permission requirement.

Rationale: Explicit validation provides:

  • Clear, informative error messages for users
  • Fast failure before attempting database modifications
  • Consistent permission checking across both operations
  • Better debugging experience when permission issues occur

Client Retry After Approval

Once approved, the client retries normal sync, either after a fixed wait or by polling periodically. If access has been granted, the sync succeeds and the client can use the database.

Key Requirements

For Bootstrap Request:

  • Client must have generated a keypair
  • Client specifies the permission level they're requesting

For Approval:

  • User must be logged in
  • User must have a key with Admin permission for the target database
  • That key must be in the database's auth settings

For Rejection:

  • User must be logged in
  • User must have a key with Admin permission for the target database
  • That key must be in the database's auth settings
  • System explicitly validates Admin permission before allowing rejection

Design Decisions

No Auto-Approval

Previous designs included auto-approval based on database policy. This has been removed in favor of:

  1. Global Permissions: Use wildcard "*" key for open access
  2. Manual Approval: All bootstrap requests require explicit approval by a user with Admin permission

Rationale:

  • Simpler architecture (no policy evaluation)
  • Clearer security model (explicit user actions)
  • Global permissions handle "open access" use case
  • Bootstrap is for controlled access grants by authorized users

The existing auto-approval code will be removed once the new system is complete.

API Design

Wildcard Permissions API

impl SettingsStore {
    /// Set wildcard permissions for database
    pub fn set_wildcard_permission(&self, permission: Permission) -> Result<()>;

    /// Remove wildcard permissions
    pub fn remove_wildcard_permission(&self) -> Result<()>;

    /// Check if database has wildcard permissions
    pub fn get_wildcard_permission(&self) -> Result<Option<Permission>>;
}

Bootstrap API

impl Sync {
    /// List pending bootstrap requests
    pub fn pending_bootstrap_requests(&self) -> Result<Vec<(String, BootstrapRequest)>>;

    /// Get specific bootstrap request
    pub fn get_bootstrap_request(&self, request_id: &str) -> Result<Option<BootstrapRequest>>;
}

impl User {
    /// Get all pending bootstrap requests from the sync system
    pub fn pending_bootstrap_requests(
        &self,
        sync: &Sync,
    ) -> Result<Vec<(String, BootstrapRequest)>>;

    /// Approve a bootstrap request (requires Admin permission)
    /// The approving_key_id must be owned by this user and have Admin permission on the target database
    pub fn approve_bootstrap_request(
        &self,
        sync: &mut Sync,
        request_id: &str,
        approving_key_id: &str,
    ) -> Result<()>;

    /// Reject a bootstrap request (requires Admin permission)
    /// The rejecting_key_id must be owned by this user and have Admin permission on the target database
    pub fn reject_bootstrap_request(
        &self,
        sync: &mut Sync,
        request_id: &str,
        rejecting_key_id: &str,
    ) -> Result<()>;
}

// Client-side bootstrap request
impl Sync {
    /// Request bootstrap access to a database
    pub async fn sync_with_peer_for_bootstrap(
        &self,
        peer_addr: &Address,
        tree_id: &ID,
        key_name: &str,
        requested_permission: Permission,
    ) -> Result<()>;
}

Security Considerations

Wildcard Permissions

  1. Public Exposure: Wildcard permissions make databases publicly accessible
  2. Write Risk: Wildcard write allows anyone to modify data
  3. Audit Trail: All modifications still signed by individual keys
  4. Revocation: Can remove wildcard permission at any time

Bootstrap Protocol

  1. Request Validation: Verify requesting public key matches signature
  2. Permission Limits: Clients request permission, approving user decides what to grant
  3. Admin Permission Required: Only users with Admin permission on the database can approve
  4. Request Expiry: Consider implementing request expiration
  5. Rate Limiting: Prevent spam bootstrap requests

Implementation Strategy

Phase 1: Wildcard Permissions

  1. Update AuthSettings to support "*" key
  2. Modify sync protocol to check for wildcard permissions
  3. Add SettingsStore API for wildcard management
  4. Tests for wildcard permission scenarios

Phase 2: Bootstrap Request Storage

  1. Define BootstrapRequest structure
  2. Implement storage in _sync database
  3. Add request listing and retrieval APIs
  4. Tests for request storage and retrieval

Phase 3: Client Bootstrap Protocol

  1. Implement sync_with_peer_for_bootstrap() client method
  2. Add bootstrap request submission to sync protocol
  3. Implement pending status handling
  4. Tests for client bootstrap flow

Phase 4: User Approval

  1. Implement User::approve_bootstrap_request()
  2. Implement User::reject_bootstrap_request()
  3. Add Admin permission checking and key addition logic
  4. Tests for approval workflow

Phase 5: Integration

  1. Update sync protocol to handle bootstrap responses
  2. Implement client retry logic
  3. End-to-end integration tests
  4. Documentation and examples

Future Enhancements

  1. Request Expiration: Automatically expire old pending requests
  2. Notification System: Notify users with Admin permission of new bootstrap requests
  3. Permission Negotiation: Allow approving user to grant different permission than requested
  4. Batch Approval: Approve multiple requests at once
  5. Bootstrap Policies: Configurable rules for auto-rejection (e.g., block certain addresses)
  6. Audit Log: Track all bootstrap requests and decisions

Conclusion

The bootstrap and access control system provides:

Wildcard Permissions:

  • Simple open access for public databases
  • Flexible permission levels (Read, Write, Admin)
  • Per-database control

Bootstrap Protocol:

  • Secure request/approval workflow
  • User-controlled access grants
  • Integration with Users system for authentication

Together, these mechanisms support both open and controlled access patterns for Eidetica databases.

Error Handling Design

Overview

Error handling in Eidetica follows principles of modularity, locality, and user ergonomics using structured error types with zero-cost conversion.

Design Philosophy

Error Locality: Each module owns its error types, keeping them discoverable alongside functions that produce them.

Structured Error Data: Uses typed fields instead of string-based errors for pattern matching, context preservation, and performance.

Progressive Context: Errors gain context moving up the stack - lower layers provide technical details, higher layers add user-facing categorization.

Architecture

Error Hierarchy: Each module defines its own error types, which are aggregated into a top-level Error enum with variants for Io, Serialize, Auth, Backend, Base, CRDT, Store, and Transaction errors.

Module-Specific Errors: Each component has domain-specific error enums covering key resolution, storage operations, database management, merge conflicts, data access, and transaction coordination.

Transparent Conversion: #[error(transparent)] enables zero-cost conversion between module errors and top-level type using ? operator.

Error Categories

By Nature: Not found errors (module-specific variants), permission errors (authentication/authorization), validation errors (input/state consistency), operation errors (business logic violations).

By Layer: Core errors (fundamental operations), storage layer (database/persistence), data layer (CRDT/store operations), application layer (high-level coordination).

Error Handling Patterns

Contextual Propagation: Errors preserve context while moving up the stack, maintaining technical details and enabling categorization.

Classification Helpers: Top-level Error provides methods like is_not_found(), is_permission_denied(), is_authentication_error() for broad category handling.

Non-Exhaustive Enums: All error enums use #[non_exhaustive] for future extension without breaking changes.
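
A condensed sketch of the pattern with thiserror (the variant set and StoreError::KeyNotFound are illustrative; the real enum covers Io, Serialize, Auth, Backend, Base, CRDT, Store, and Transaction):

use thiserror::Error;

#[derive(Debug, Error)]
#[non_exhaustive]
pub enum StoreError {
    #[error("key not found: {key}")]
    KeyNotFound { key: String },
}

#[derive(Debug, Error)]
#[non_exhaustive]
pub enum Error {
    // Zero-cost wrapping: Display and source() pass straight through,
    // and #[from] lets the ? operator convert StoreError automatically.
    #[error(transparent)]
    Store(#[from] StoreError),
}

impl Error {
    /// Classification helper: stays stable for callers as variants grow.
    pub fn is_not_found(&self) -> bool {
        matches!(self, Error::Store(StoreError::KeyNotFound { .. }))
    }
}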

Performance

Zero-Cost Abstractions: Transparent errors eliminate wrapper overhead, structured fields avoid string formatting until display, no heap allocations in common paths.

Efficient Propagation: Seamless ? operator across module boundaries with automatic conversion and preserved context.

Usage Patterns

Library Users: Use helper methods for stable APIs that won't break with new error variants.

Library Developers: Define new variants in appropriate module enums with structured fields for context, add helper methods for classification.

Extensibility

New error variants can be added without breaking existing code. Operations spanning modules can wrap/convert errors for appropriate context. Structured data enables sophisticated error recovery based on specific failure modes.