Eidetica Documentation
Welcome to the official documentation for Eidetica - a decentralized database built on Merkle-CRDT principles with built-in peer-to-peer synchronization.
Key Features
- Decentralized Architecture: No central server required - peers connect directly
- Conflict-Free Replication: Automatic merge resolution using CRDT principles
- Content-Addressable Storage: Immutable, hash-identified data entries
- Real-time Synchronization: Background sync with configurable batching and timing
- Multiple Transport Protocols: HTTP and Iroh P2P with NAT traversal
- Authentication & Security: Ed25519 signatures for all operations
- Flexible Data Models: Support for documents, key-value, and structured data
Project Structure
Eidetica is organized as a Cargo workspace:
- Library (`crates/lib/`): The core Eidetica library crate
- CLI Binary (`crates/bin/`): Command-line interface using the library
- Examples (`examples/`): Standalone applications demonstrating usage
Choose a section to get started:
- User Guide: Learn how to use the Eidetica library.
- Internal Documentation: Understand the internal design and contribute to Eidetica.
- Design Documents: Architectural documents used for development.
User Guide
Welcome to the Eidetica User Guide. This guide will help you understand and use Eidetica effectively in your applications.
What is Eidetica?
Eidetica is a Rust library for managing structured data with built-in history tracking. It combines concepts from distributed systems, Merkle-CRDTs, and traditional databases to provide a unique approach to data management:
- Efficient data storage with customizable Databases
- History tracking for all changes via immutable Entries forming a DAG
- Structured data types via named, typed Stores within logical Databases
- Atomic changes across multiple data structures using Transactions
- Designed for distribution (future capability)
How to Use This Guide
This user guide is structured to guide you from basic setup to advanced concepts:
- Getting Started: Installation, basic setup, and your first steps.
- Basic Usage Pattern: A quick look at the typical workflow.
- Core Concepts: Understand the fundamental building blocks:
- Entries & Databases: The core DAG structure.
- Databases: How data is stored.
- Stores: Where structured data lives (`DocStore`, `Table`, `YDoc`).
- Transactions: How atomic changes are made.
- Tutorial: Todo App: A step-by-step walkthrough using a simple application.
- Code Examples: Focused code snippets for common tasks.
Quick Overview: The Core Flow
Eidetica revolves around a few key components working together:
- `Database` (storage): You start by choosing or creating a storage `Database` (e.g., `InMemory`).
- `Instance`: You create an `Instance`, providing it the storage `Database`. This is your main database handle.
- `Database`: Using the `Instance`, you create or load a `Database`, which acts as a logical container for related data and tracks its history.
- `Transaction`: To read or write data, you start a `Transaction` from the `Database`. This ensures atomicity and consistent views.
- `Store`: Within a `Transaction`, you get handles to named `Store`s (like `DocStore` or `Table<YourData>`). These provide methods (`set`, `get`, `insert`, `remove`, etc.) to interact with your structured data.
- Commit: Changes made via `Store` handles within the `Transaction` are staged. Calling `commit()` on the `Transaction` finalizes these changes atomically, creating a new historical `Entry` in the `Database`.
Basic Usage Pattern
Here's a quick example showing how to load a database and write new data.
```rust
extern crate eidetica;
extern crate serde;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::{DocStore, Table}};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Clone, Debug)]
struct MyData {
    name: String,
}

fn main() -> eidetica::Result<()> {
    let backend = InMemory::new();
    let db = Instance::new(Box::new(backend));
    db.add_private_key("my_private_key")?;

    // Create/Load Database
    let database = match db.find_database("my_database") {
        Ok(mut databases) => databases.pop().unwrap(), // Found existing
        Err(e) if e.is_not_found() => {
            let mut doc = Doc::new();
            doc.set_string("name", "my_database");
            db.new_database(doc, "my_private_key")?
        }
        Err(e) => return Err(e),
    };

    // --- Writing Data ---
    // Start a Transaction
    let txn = database.new_transaction()?;
    let inserted_id = {
        // Scope for store handles
        // Get Store handles
        let config = txn.get_store::<DocStore>("config")?;
        let items = txn.get_store::<Table<MyData>>("items")?;

        // Use Store methods
        config.set("version", "1.0")?;
        items.insert(MyData { name: "example".to_string() })?
    }; // Handles drop, changes are staged in txn

    // Commit changes
    let new_entry_id = txn.commit()?;
    println!("Committed changes, new entry ID: {}", new_entry_id);

    // --- Reading Data ---
    // Use Database::get_store_viewer for a read-only view
    let items_viewer = database.get_store_viewer::<Table<MyData>>("items")?;
    if let Ok(item) = items_viewer.get(&inserted_id) {
        println!("Read item: {:?}", item);
    }

    Ok(())
}
```
See Transactions and Code Examples for more details.
Project Status
Eidetica is currently under active development. The core functionality is working, but APIs are considered experimental and may change in future releases. It is suitable for evaluation and prototyping, but not yet recommended for production systems requiring long-term API stability.
Getting Started
This guide will walk you through the basics of using Eidetica in your Rust applications. We'll cover the essential steps to set up and interact with the database.
Installation
Add Eidetica to your project dependencies:
```toml
[dependencies]
eidetica = "0.1.0" # Update version as appropriate

# Or if using from a local workspace:
# eidetica = { path = "path/to/eidetica/crates/lib" }
```
Setting up the Database
To start using Eidetica, you need to:
- Choose and initialize a Database (storage mechanism)
- Create an Instance (the main entry point)
- Add authentication keys (required for all operations)
- Create or access a Database (logical container for data)
Here's a simple example:
```rust
extern crate eidetica;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc};

fn main() -> eidetica::Result<()> {
    // Create a new in-memory database
    let database = InMemory::new();
    let db = Instance::new(Box::new(database));

    // Add an authentication key (required for all operations)
    db.add_private_key("my_private_key")?;

    // Create a database to store data
    let mut settings = Doc::new();
    settings.set_string("name", "my_database");
    let _database = db.new_database(settings, "my_private_key")?;

    Ok(())
}
```
The database determines how your data is stored. The example above uses `InMemory`, which keeps everything in memory but can save to a file:
```rust
#![allow(unused)]
fn main() {
extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};
use std::path::PathBuf;

fn save_db(db: &Instance) -> eidetica::Result<()> {
    // Save the database to a file
    let path = PathBuf::from("my_database.json");
    let database_guard = db.backend();
    if let Some(in_memory) = database_guard.as_any().downcast_ref::<InMemory>() {
        in_memory.save_to_file(&path)?;
    }
    Ok(())
}
}
```
You can load a previously saved database:
```rust
#![allow(unused)]
fn main() {
extern crate eidetica;
use eidetica::{Instance, backend::database::InMemory};
use std::path::PathBuf;

fn load_instance() -> eidetica::Result<Instance> {
    let path = PathBuf::from("my_database.json");
    let database = InMemory::load_from_file(&path)?;
    // Note: Authentication keys are automatically loaded with the database if they exist
    Ok(Instance::new(Box::new(database)))
}
}
```
Authentication Requirements
Important: All operations in Eidetica require authentication. Every entry created in the database must be cryptographically signed with a valid Ed25519 private key. This ensures data integrity and provides a consistent security model.
Working with Data
Eidetica uses Stores to organize data within a database. One common store type is `Table`, which maintains a collection of items with unique IDs.
Defining Your Data
Any data you store must be serializable with `serde`:
Basic Operations
All operations in Eidetica happen within an atomic Transaction:
Inserting Data:
```rust
#![allow(unused)]
fn main() {
extern crate eidetica;
extern crate serde;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::Table, Database};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct Person {
    name: String,
    age: u32,
}

fn insert_alice(database: &Database) -> eidetica::Result<()> {
    // Start an authenticated transaction
    let op = database.new_transaction()?;

    // Get or create a Table store
    let people = op.get_store::<Table<Person>>("people")?;

    // Insert a person and get their ID
    let person = Person { name: "Alice".to_string(), age: 30 };
    let _id = people.insert(person)?;

    // Commit the changes (automatically signed with the database's default key)
    op.commit()?;
    Ok(())
}
}
```
Reading Data:
```rust
#![allow(unused)]
fn main() {
extern crate eidetica;
extern crate serde;
use eidetica::{Database, store::Table};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct Person {
    name: String,
    age: u32,
}

fn read(database: &Database, id: &str) -> eidetica::Result<()> {
    let op = database.new_transaction()?;
    let people = op.get_store::<Table<Person>>("people")?;

    // Get a single person by ID
    if let Ok(person) = people.get(id) {
        println!("Found: {} ({})", person.name, person.age);
    }

    // Search for all people (using a predicate that always returns true)
    let all_people = people.search(|_| true)?;
    for (id, person) in all_people {
        println!("ID: {}, Name: {}, Age: {}", id, person.name, person.age);
    }
    Ok(())
}
}
```
Updating Data:
```rust
#![allow(unused)]
fn main() {
extern crate eidetica;
extern crate serde;
use eidetica::{Database, store::Table};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct Person {
    name: String,
    age: u32,
}

fn update(database: &Database, id: &str) -> eidetica::Result<()> {
    let op = database.new_transaction()?;
    let people = op.get_store::<Table<Person>>("people")?;

    // Get, modify, and update
    if let Ok(mut person) = people.get(id) {
        person.age += 1;
        people.set(id, person)?;
    }

    op.commit()?;
    Ok(())
}
}
```
Deleting Data:
```rust
#![allow(unused)]
fn main() {
extern crate eidetica;
extern crate serde;
use eidetica::{Database, store::Table};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct Person {
    name: String,
    age: u32,
}

fn delete(database: &Database, id: &str) -> eidetica::Result<()> {
    let op = database.new_transaction()?;
    let people = op.get_store::<Table<Person>>("people")?;

    // Note: Table doesn't currently support deletion
    // You can overwrite with a "deleted" marker or use other approaches

    op.commit()?;
    Ok(())
}
}
```
A Complete Example
For a complete working example, see the Todo Example included in the repository.
Next Steps
After getting familiar with the basics, you might want to explore:
- Core Concepts to understand Eidetica's unique features
- Advanced operations like querying and filtering
- Using different store types for various data patterns
- Configuring and optimizing your database
Core Concepts
Understanding the fundamental ideas behind Eidetica will help you use it effectively and appreciate its unique capabilities.
Architectural Foundation
Eidetica builds on several powerful concepts from distributed systems and database design:
- Content-addressable storage: Data is identified by the hash of its content, similar to Git and IPFS
- Directed acyclic graphs (DAGs): Changes form a graph structure rather than a linear history
- Conflict-free replicated data types (CRDTs): Data structures that can merge concurrent changes automatically
- Immutable data structures: Once created, data is never modified, only new versions are added
These foundations enable Eidetica's key features: robust history tracking, efficient synchronization, and eventual consistency in distributed environments.
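The content-addressing idea can be illustrated with a toy sketch. Eidetica derives entry IDs from a cryptographic hash of content; the std `DefaultHasher` below is a stand-in for illustration only:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy content addressing: the ID is derived from the payload's bytes,
// so identical content always yields the same ID and any modification
// yields a new one. (A real system uses a cryptographic hash.)
fn content_id(payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    payload.hash(&mut h);
    h.finish()
}

fn main() {
    let a = content_id("{\"name\":\"alice\"}");
    let b = content_id("{\"name\":\"alice\"}");
    let c = content_id("{\"name\":\"bob\"}");
    assert_eq!(a, b); // same content, same ID
    assert_ne!(a, c); // different content, different ID
    println!("ok");
}
```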
Merkle-CRDTs
Eidetica is inspired by the Merkle-CRDT concept from OrbitDB, which combines:
- Merkle DAGs: A data structure where each node contains a cryptographic hash of its children, creating a tamper-evident history
- CRDTs: Data types designed to resolve conflicts automatically when concurrent changes occur
In a Merkle-CRDT, each update creates a new node in the graph, containing:
- References to parent nodes (previous versions)
- The updated data
- Metadata for conflict resolution
This approach allows for:
- Strong auditability of all changes
- Automatic conflict resolution
- Efficient synchronization between replicas
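The node shape described above can be sketched with hypothetical types (not Eidetica's actual `Entry`):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy Merkle-CRDT node: each update records its parents' IDs and the
// new data, and derives its own ID from both, which is what makes the
// history tamper-evident.
#[derive(Hash)]
struct Node {
    parents: Vec<u64>, // IDs of previous versions
    data: String,      // the updated payload
}

impl Node {
    fn id(&self) -> u64 {
        let mut h = DefaultHasher::new();
        self.hash(&mut h);
        h.finish()
    }
}

fn main() {
    let root = Node { parents: vec![], data: "v1".into() };
    let child = Node { parents: vec![root.id()], data: "v2".into() };
    // A child's ID depends on its parent's ID, so rewriting history
    // changes every downstream ID.
    assert_ne!(root.id(), child.id());
    println!("root {} -> child {}", root.id(), child.id());
}
```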
Data Model Layers
Eidetica organizes data in a layered architecture:
+-----------------------+
| User Application |
+-----------------------+
| Instance |
+-----------------------+
| Databases |
+----------+------------+
| Stores | Operations |
+----------+------------+
| Entries (DAG) |
+-----------------------+
| Database Storage |
+-----------------------+
Each layer builds on the ones below, providing progressively higher-level abstractions:
- Database Storage: Physical storage of data (currently InMemory with file persistence)
- Entries: Immutable, content-addressed objects forming the database's history
- Databases & Stores: Logical organization and typed access to data
- Operations: Atomic transactions across multiple stores
- Instance: The top-level database container and API entry point
Entries and the DAG
At the core of Eidetica is a directed acyclic graph (DAG) of immutable Entry objects:
-
Each Entry represents a point-in-time snapshot of data and has:
- A unique ID derived from its content (making it content-addressable)
- Links to parent entries (forming the graph structure)
- Data payloads organized by store
- Metadata for database and store relationships
-
The DAG enables:
- Full history tracking (nothing is ever deleted)
- Efficient verification of data integrity
- Conflict resolution when merging concurrent changes
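The notions of forks and tips can be sketched with plain std collections (a hypothetical `tips` helper, not Eidetica's API):

```rust
use std::collections::{HashMap, HashSet};

// Toy tip calculation over a DAG of entries: a "tip" is any entry
// that no other entry lists as a parent.
fn tips<'a>(parents_of: &HashMap<&'a str, Vec<&'a str>>) -> HashSet<&'a str> {
    let referenced: HashSet<&str> = parents_of.values().flatten().copied().collect();
    parents_of.keys().copied().filter(|e| !referenced.contains(e)).collect()
}

fn main() {
    // A fork: b and c both extend a, so both are tips.
    let mut dag = HashMap::new();
    dag.insert("a", vec![]);
    dag.insert("b", vec!["a"]);
    dag.insert("c", vec!["a"]);
    assert_eq!(tips(&dag), HashSet::from(["b", "c"]));

    // A merge entry d with both tips as parents becomes the single tip.
    dag.insert("d", vec!["b", "c"]);
    assert_eq!(tips(&dag), HashSet::from(["d"]));
    println!("ok");
}
```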
IPFS Inspiration and Future Direction
While Eidetica draws inspiration from IPFS (InterPlanetary File System), it currently uses its own implementation patterns:
- IPFS is a content-addressed, distributed storage system where data is identified by cryptographic hashes
- OrbitDB (which inspired Eidetica) uses IPFS for backend storage and distribution
Eidetica's future plans include:
- Developing efficient internal APIs for transferring objects between Eidetica instances
- Potential IPFS-compatible addressing for distributed storage
- More efficient synchronization mechanisms than traditional IPFS
Stores: A Core Innovation
Eidetica extends the Merkle-CRDT concept with Stores, which partition data within each Entry:
- Each store is a named, typed data structure within a Database
- Stores can use different data models and conflict resolution strategies
- Stores maintain their own history tracking within the larger Database
This enables:
- Type-safe, structure-specific APIs for data access
- Efficient partial synchronization (only needed stores)
- Modular features through pluggable stores
- Atomic operations across different data structures
Planned future stores include:
- Object Storage: Efficiently handling large objects with content-addressable hashing
- Backup: Archiving database history for space efficiency
- Encrypted Store: Transparent encrypted data storage
Atomic Operations and Transactions
All changes in Eidetica happen through atomic Transactions:
- A Transaction is created from a Database
- Stores are accessed and modified through the Transaction
- When committed, all changes across all stores become a single new Entry
- If the Transaction fails, no changes are applied
This model ensures data consistency while allowing complex operations across multiple stores.
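The staging-then-commit model can be sketched in a few lines (a toy `Txn` type, not Eidetica's `Transaction` API):

```rust
use std::collections::HashMap;

// Toy sketch of the staging model: writes accumulate inside the
// transaction and only reach the store on commit, all at once.
struct Txn {
    staged: HashMap<String, String>,
}

impl Txn {
    fn set(&mut self, key: &str, value: &str) {
        self.staged.insert(key.to_string(), value.to_string());
    }
    // Consuming self means a transaction can be committed at most once.
    fn commit(self, store: &mut HashMap<String, String>) {
        store.extend(self.staged); // all changes land together
    }
}

fn main() {
    let mut store = HashMap::new();
    let mut txn = Txn { staged: HashMap::new() };
    txn.set("config/version", "1.0");
    txn.set("items/1", "example");
    assert!(store.is_empty()); // nothing visible before commit
    txn.commit(&mut store);
    assert_eq!(store.len(), 2);
    println!("ok");
}
```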
Settings as Stores
In Eidetica, even configuration is stored as a store:
- A Database's settings are stored internally in a special "settings" Store that is hidden from regular usage
- This approach unifies the data model and allows settings to participate in history tracking
CRDT Properties and Eventual Consistency
Eidetica is designed with distributed systems in mind:
- All data structures have CRDT properties for automatic conflict resolution
- Different store types implement appropriate CRDT strategies:
- DocStore uses last-writer-wins (LWW) with implicit timestamps
- Table preserves all items, with LWW for updates to the same item
These properties ensure that when Eidetica instances synchronize, they eventually reach a consistent state regardless of the order in which updates are received.
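A toy last-writer-wins merge illustrates the convergence property (the logical timestamps and types here are hypothetical, not Eidetica's internal representation):

```rust
use std::collections::HashMap;

// Toy LWW map: each key carries a logical timestamp; on merge the
// later write wins, and writes to distinct keys are simply combined.
type Lww = HashMap<String, (u64, String)>; // key -> (timestamp, value)

fn merge(a: &Lww, b: &Lww) -> Lww {
    let mut out = a.clone();
    for (k, (t, v)) in b {
        match out.get(k) {
            Some((t0, _)) if t0 >= t => {} // existing write is newer, keep it
            _ => { out.insert(k.clone(), (*t, v.clone())); }
        }
    }
    out
}

fn main() {
    let mut replica_a = Lww::new();
    let mut replica_b = Lww::new();
    replica_a.insert("name".into(), (1, "alice".into()));
    replica_b.insert("name".into(), (2, "bob".into()));
    // Merge order doesn't matter: both replicas converge.
    let ab = merge(&replica_a, &replica_b);
    let ba = merge(&replica_b, &replica_a);
    assert_eq!(ab, ba);
    assert_eq!(ab["name"].1, "bob"); // later write wins
    println!("ok");
}
```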
History Tracking and Time Travel
One of Eidetica's most powerful features is comprehensive history tracking:
- All changes are preserved in the Entry DAG
- "Tips" represent the latest state of a Database or Store
- Historical states can be reconstructed by traversing the DAG
This design allows for future capabilities like:
- Point-in-time recovery
- Auditing and change tracking
- Historical queries and analysis
- Branching and versioning
Current Status and Roadmap
Eidetica is under active development, and some features mentioned in this documentation are still in planning or development stages. Here's a summary of the current status:
Implemented Features
- Core Entry and Database structure
- In-memory database with file persistence
- DocStore and Table store implementations
- CRDT functionality:
- Doc (hierarchical nested document structure with recursive merging and tombstone support for deletions)
- Atomic operations across stores
- Tombstone support for proper deletion handling in distributed environments
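The tombstone idea can be sketched as follows (a hypothetical `DocMap` with `None` as the deletion marker, not Eidetica's actual `Doc` internals):

```rust
use std::collections::HashMap;

// Toy tombstone: deleting a key stores a timestamped marker (None)
// instead of removing the key, so replicas that merge later still
// learn about the deletion and can discard older concurrent writes.
type DocMap = HashMap<String, (u64, Option<String>)>; // None = tombstone

fn merge_entry(doc: &mut DocMap, key: &str, ts: u64, value: Option<String>) {
    match doc.get(key) {
        Some((t0, _)) if *t0 >= ts => {} // keep the newer record
        _ => { doc.insert(key.to_string(), (ts, value)); }
    }
}

fn main() {
    let mut doc = DocMap::new();
    merge_entry(&mut doc, "name", 1, Some("alice".into()));
    merge_entry(&mut doc, "name", 2, None); // deletion at t=2
    merge_entry(&mut doc, "name", 1, Some("alice".into())); // stale write replays
    // The tombstone survives: a stale write cannot resurrect the key.
    assert_eq!(doc["name"], (2, None));
    println!("ok");
}
```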
Planned Features
- Object Storage store for efficient handling of large objects
- Backup store for archiving database history
- Encrypted store for transparent encrypted data storage
- IPFS-compatible addressing for distributed storage
- Enhanced synchronization mechanisms
- Point-in-time recovery
This roadmap is subject to change as development progresses. Check the project repository for the most up-to-date information on feature availability.
Entries & Databases
The basic units of data and organization in Eidetica.
Entries
Entries are the fundamental building blocks in Eidetica. An Entry represents an atomic unit of data with the following characteristics:
- Content-addressable: Each entry has a unique ID derived from its content, similar to Git commits.
- Immutable: Once created, entries cannot be modified.
- Parent references: Entries maintain references to their parent entries, forming a directed acyclic graph (DAG).
- Database association: Each entry belongs to a database and can reference parent entries within both the main database and stores.
- Store data: Entries can contain data for one or more stores, representing different aspects or types of data.
Entries function similar to commits in Git - they represent a point-in-time snapshot of data with links to previous states, enabling history tracking.
Databases
A Database in Eidetica is a logical container for related entries, conceptually similar to:
- A traditional database containing multiple tables
- A branch in a version control system
- A collection in a document database
Key characteristics of Databases:
- Root Entry: Each database has a root entry that serves as its starting point.
- Named Identity: Databases typically have a name stored in their settings store.
- History Tracking: Databases maintain the complete history of all changes as a linked graph of entries.
- Store Organization: Data within a database is organized into named stores, each potentially using different data structures.
- Atomic Operations: All changes to a database happen through transactions, which create new entries.
Database Transactions
You interact with Databases through Transactions:
```rust
extern crate eidetica;
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::DocStore, Database};
use eidetica::Result;

fn example(database: Database) -> Result<()> {
    // Create a new transaction
    let op = database.new_transaction()?;

    // Access stores and perform actions
    let settings = op.get_store::<DocStore>("settings")?;
    settings.set("version", "1.2.0")?;

    // Commit the changes, creating a new Entry
    let new_entry_id = op.commit()?;
    Ok(())
}

fn main() -> Result<()> {
    let backend = InMemory::new();
    let db = Instance::new(Box::new(backend));
    db.add_private_key("key")?;
    let mut settings = Doc::new();
    settings.set_string("name", "test");
    let database = db.new_database(settings, "key")?;
    example(database)?;
    Ok(())
}
```
When you commit a transaction, Eidetica:
- Creates a new Entry containing all changes
- Links it to the appropriate parent entries
- Adds it to the database's history
- Returns the ID of the new entry
Database Settings
Each Database maintains its settings as a key-value store in a special "settings" store:
```rust
// Get the settings store
let settings = database.get_settings()?;

// Access settings
let name = settings.get("name")?;
let version = settings.get("version")?;
```
Common settings include:
- `name`: The identifier for the database (used by `Instance::find_database`). This is the primary standard setting currently used.
- Other application-specific settings can be stored here.
Tips and History
Databases in Eidetica maintain a concept of "tips" - the latest entries in the database's history:
```rust
// Get the current tip entries
let tips = database.get_tips()?;
```
Tips represent the current state of the database. As new transactions are committed, new tips are created, and the history grows. This historical information remains accessible, allowing you to:
- Track all changes to data over time
- Reconstruct the state at any point in history (requires manual traversal or specific backend support - see Backends)
Database vs. Store
While a Database is the logical container, the actual data is organized into Stores. This separation allows:
- Different types of data structures within a single Database
- Type-safe access to different parts of your data
- Fine-grained history tracking by store
- Efficient partial replication and synchronization
See Stores for more details on how data is structured within a Database.
Database Storage
Database storage implementations in Eidetica define how and where data is physically stored.
The Database Abstraction
The Database trait abstracts the underlying storage mechanism for Eidetica entries. This separation of concerns allows the core database logic to remain independent of the specific storage details.
Key responsibilities of a Database:
- Storing and retrieving entries by their unique IDs
- Tracking relationships between entries
- Calculating tips (latest entries) for databases and stores
- Managing the graph-like structure of entry history
Available Database Implementations
InMemory
The `InMemory` database is currently the primary storage implementation:
- Stores all entries in memory
- Can load from and save to a JSON file
- Well-suited for development, testing, and applications with moderate data volumes
- Simple to use and requires no external dependencies
Example usage:
```rust
// Create a new in-memory database
use eidetica::backend::database::InMemory;

let database = InMemory::new();
let db = Instance::new(Box::new(database));

// ... use the database ...

// Save to a file (optional)
let path = PathBuf::from("my_database.json");
let database_guard = db.backend().lock().unwrap();
if let Some(in_memory) = database_guard.as_any().downcast_ref::<InMemory>() {
    in_memory.save_to_file(&path)?;
}

// Load from a file
let database = InMemory::load_from_file(&path)?;
let db = Instance::new(Box::new(database));
```
Note: The `InMemory` database is the only storage implementation currently provided with Eidetica.
Database Trait Responsibilities
The `Database` trait (`eidetica::backend::Database`) defines the core interface required for storage. Beyond simple `get` and `put` for entries, it includes methods crucial for navigating the database's history and structure:

- `get_tips(tree_id)`: Finds the latest entries in a specific `Database`.
- `get_subtree_tips(tree_id, subtree_name)`: Finds the latest entries for a specific `Store` within a `Database`.
- `all_roots()`: Finds all top-level `Database` roots stored in the database.
- `get_tree(tree_id)` / `get_subtree(...)`: Retrieve all entries for a database/store, typically sorted topologically (required for some history operations, potentially expensive).
Implementing these methods efficiently often requires the database to understand the DAG structure, making the database more than just a simple key-value store.
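As a rough illustration of why this is more than a key-value store, the trait's shape might be sketched like this (simplified hypothetical signatures; the real `eidetica::backend::Database` trait differs in names, types, and error handling):

```rust
use std::collections::HashMap;

type Id = String;

// Hypothetical, simplified storage interface: plain get/put for
// entries plus the DAG-aware queries (tips, roots) described above.
trait Storage {
    fn get(&self, id: &Id) -> Option<&Vec<u8>>;
    fn put(&mut self, id: Id, entry: Vec<u8>);
    fn get_tips(&self, tree_id: &Id) -> Vec<Id>;
    fn all_roots(&self) -> Vec<Id>;
}

// Minimal in-memory stand-in, enough to show where the DAG-awareness
// would live alongside plain key-value access.
struct Mem {
    entries: HashMap<Id, Vec<u8>>,
}

impl Storage for Mem {
    fn get(&self, id: &Id) -> Option<&Vec<u8>> {
        self.entries.get(id)
    }
    fn put(&mut self, id: Id, entry: Vec<u8>) {
        self.entries.insert(id, entry);
    }
    fn get_tips(&self, _tree_id: &Id) -> Vec<Id> {
        Vec::new() // a real implementation tracks parent links here
    }
    fn all_roots(&self) -> Vec<Id> {
        Vec::new()
    }
}

fn main() {
    let mut mem = Mem { entries: HashMap::new() };
    mem.put("e1".to_string(), b"payload".to_vec());
    assert_eq!(mem.get(&"e1".to_string()), Some(&b"payload".to_vec()));
    println!("ok");
}
```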
Database Performance Considerations
The Database implementation significantly impacts database performance:
- Entry Retrieval: How quickly entries can be accessed by ID
- Graph Traversal: Efficiency of history traversal and tip calculation
- Memory Usage: How entries are stored and whether they're kept in memory
- Concurrency: How concurrent operations are handled
Stores
Stores provide structured, type-safe access to different kinds of data within a Database.
The Store Concept
In Eidetica, Stores extend the Merkle-CRDT concept by explicitly partitioning data within each Entry. A Store:
- Represents a specific type of data structure (like a key-value store or a collection of records)
- Has a unique name within its parent Database
- Maintains its own history tracking
- Is strongly typed (via Rust generics)
Stores are what make Eidetica practical for real applications, as they provide high-level, data-structure-aware interfaces on top of the core Entry and Database concepts.
Why Stores?
Stores offer several advantages:
- Type Safety: Each store implementation provides appropriate methods for its data type
- Isolation: Changes to different stores can be tracked separately
- Composition: Multiple data structures can exist within a single Database
- Efficiency: Only relevant stores need to be loaded or synchronized
- Atomic Operations: Changes across multiple stores can be committed atomically
Available Store Types
Eidetica provides three main store types, each optimized for different data patterns:
| Type | Purpose | Key Features | Best For |
|------|---------|--------------|----------|
| `DocStore` | Document storage | Path-based operations, nested structures | Configuration, metadata, structured docs |
| `Table<T>` | Record collections | Auto-generated UUIDs, type safety, search | User lists, products, any structured records |
| `YDoc` | Collaborative editing | Y-CRDT integration, real-time sync | Shared documents, collaborative text editing |
DocStore (Document-Oriented Storage)
The `DocStore` store provides a document-oriented interface for storing and retrieving structured data. It wraps the `crdt::Doc` type to provide ergonomic access patterns with both simple key-value operations and path-based operations for nested data structures.
Basic Usage
```rust
// Get a DocStore store
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("app_data")?;

// Set simple values
store.set("version", "1.0.0")?;
store.set("author", "Alice")?;

// Path-based operations for nested structures
// This creates nested maps: {"database": {"host": "localhost", "port": "5432"}}
store.set_path("database.host", "localhost")?;
store.set_path("database.port", "5432")?;

// Retrieve values
let version = store.get("version")?;          // Returns a Value
let host = store.get_path("database.host")?;  // Navigate nested structure

op.commit()?;
```
Important: Path Operations Create Nested Structures
When using `set_path("a.b.c", value)`, DocStore creates nested maps, not flat keys with dots:

```rust
// This code:
store.set_path("user.profile.name", "Bob")?;

// Creates this structure:
// {
//   "user": {
//     "profile": {
//       "name": "Bob"
//     }
//   }
// }

// NOT: { "user.profile.name": "Bob" } ❌
```
Use cases for `DocStore`:
- Application configuration
- Metadata storage
- Structured documents
- Settings management
- Any data requiring path-based access
Table
The `Table<T>` store manages collections of serializable items, similar to a table in a database:
```rust
// Define a struct for your data
#[derive(Serialize, Deserialize, Clone)]
struct User {
    name: String,
    email: String,
    active: bool,
}

// Get a Table store
let op = database.new_transaction()?;
let users = op.get_store::<Table<User>>("users")?;

// Insert items (returns a generated UUID)
let user = User {
    name: "Alice".to_string(),
    email: "alice@example.com".to_string(),
    active: true,
};
let id = users.insert(user)?;

// Get an item by ID
if let Ok(user) = users.get(&id) {
    println!("Found user: {}", user.name);
}

// Update an item
if let Ok(mut user) = users.get(&id) {
    user.active = false;
    users.set(&id, user)?;
}

// Search for items matching a condition
let active_users = users.search(|user| user.active)?;
for (id, user) in active_users {
    println!("Active user: {} (ID: {})", user.name, id);
}

op.commit()?;
```
Use cases for `Table`:
- Collections of structured objects
- Record storage (users, products, todos, etc.)
- Any data where individual items need unique IDs
- When you need to search across records with custom predicates
YDoc (Y-CRDT Integration)
The `YDoc` store provides integration with Y-CRDT (Yjs) for real-time collaborative editing. This requires the "y-crdt" feature:
```rust
// Enable in Cargo.toml: eidetica = { features = ["y-crdt"] }
use eidetica::store::YDoc;
use eidetica::y_crdt::{Map, Text, Transact};

// Get a YDoc store
let op = database.new_transaction()?;
let doc_store = op.get_store::<YDoc>("document")?;

// Work with Y-CRDT structures
doc_store.with_doc_mut(|doc| {
    let text = doc.get_or_insert_text("content");
    let metadata = doc.get_or_insert_map("meta");
    let mut txn = doc.transact_mut();

    // Collaborative text editing
    text.insert(&mut txn, 0, "Hello, collaborative world!");

    // Set metadata
    metadata.insert(&mut txn, "title", "My Document");
    metadata.insert(&mut txn, "author", "Alice");
    Ok(())
})?;

// Apply updates from other collaborators
let external_update = receive_update_from_network();
doc_store.apply_update(&external_update)?;

// Get updates to send to others
let update = doc_store.get_update()?;
broadcast_update(update);

op.commit()?;
```
Use cases for `YDoc`:
- Real-time collaborative text editing
- Shared documents with multiple editors
- Conflict-free data synchronization
- Applications requiring sophisticated merge algorithms
Store Implementation Details
Each Store implementation in Eidetica:
- Implements the `Store` trait
- Provides methods appropriate for its data structure
- Handles serialization/deserialization of data
- Manages the store's history within the Database
The `Store` trait defines the minimal interface:

```rust
pub trait Store: Sized {
    fn new(op: &Transaction, store_name: &str) -> Result<Self>;
    fn name(&self) -> &str;
}
```
Store implementations add their own methods on top of this minimal interface.
Store History and Merging (CRDT Aspects)
While Eidetica uses Merkle-DAGs for overall history, the way data within a Store is combined when branches merge relies on Conflict-free Replicated Data Type (CRDT) principles. This ensures that even if different replicas of the database have diverged and made concurrent changes, they can be merged back together automatically without conflicts (though the merge result depends on the CRDT strategy).
Each Store type implements its own merge logic, typically triggered implicitly when a `Transaction` reads the current state of the store (which involves finding and merging the tips of that store's history):

- `DocStore`: Implements a Last-Writer-Wins (LWW) strategy using the internal `Doc` type. When merging concurrent writes to the same key or path, the write associated with the later `Entry` "wins", and its value is kept. Writes to different keys are simply combined. Deleted keys (via `delete()`) are tracked with tombstones to ensure deletions propagate properly.
- `Table<T>`: Also uses LWW for updates to the same row ID. If two concurrent operations modify the same row, the later write wins. Inserts of different rows are combined (all inserted rows are kept). Deletions generally take precedence over concurrent updates (though precise semantics might evolve).
Note: The CRDT merge logic happens internally when a `Transaction` loads the initial state of a Store or when a store viewer is created. You typically don't invoke merge logic directly.
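To illustrate the LWW principle (a minimal sketch of the idea, not eidetica's actual internals), the following merges two key-value replicas, assuming each value carries a logical timestamp:

```rust
use std::collections::HashMap;

// LWW merge sketch: each key maps to (logical_timestamp, value).
// On merge, the later write wins; writes to different keys are combined.
fn lww_merge(
    a: &HashMap<String, (u64, String)>,
    b: &HashMap<String, (u64, String)>,
) -> HashMap<String, (u64, String)> {
    let mut merged = a.clone();
    for (key, (ts, val)) in b {
        match merged.get(key) {
            // Existing entry is at least as new: keep it.
            Some((existing_ts, _)) if existing_ts >= ts => {}
            // Otherwise the incoming write wins.
            _ => {
                merged.insert(key.clone(), (*ts, val.clone()));
            }
        }
    }
    merged
}

fn main() {
    let mut replica_a = HashMap::new();
    replica_a.insert("name".to_string(), (1, "Alice".to_string()));
    replica_a.insert("city".to_string(), (2, "Oslo".to_string()));

    let mut replica_b = HashMap::new();
    replica_b.insert("name".to_string(), (3, "Alicia".to_string())); // later write
    replica_b.insert("lang".to_string(), (1, "en".to_string()));     // different key

    let merged = lww_merge(&replica_a, &replica_b);
    assert_eq!(merged["name"].1, "Alicia"); // later write wins
    assert_eq!(merged["city"].1, "Oslo");   // untouched key kept
    assert_eq!(merged["lang"].1, "en");     // disjoint keys combined
}
```

A tombstone-aware merge (as `DocStore` uses for deletions) would keep deleted keys as timestamped markers rather than removing them, so deletions propagate the same way.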
Future Store Types
Eidetica's architecture allows for adding new Store implementations. Potential future types include:
- `ObjectStore`: For storing large binary blobs.

These are not yet implemented. Development is currently focused on the core API and the existing `DocStore` and `Table` types.
Transactions: Atomic Changes
In Eidetica, all modifications to the data stored within a `Database`'s `Store`s happen through a `Transaction`. This is a fundamental concept ensuring atomicity and providing a consistent mechanism for interacting with your data.
Authentication Note: All transactions in Eidetica are authenticated by default. Every transaction uses the database's default signing key to ensure that all changes are cryptographically verified and can be traced to their source.
A `Transaction` bundles multiple Store operations (which affect individual subtrees) into a single atomic Entry that gets committed to the database.
Why Transactions?
Transactions provide several key benefits:
- Atomicity: Changes made to multiple `Store`s within a single `Transaction` are committed together as one atomic unit. If the `commit()` fails, no changes are persisted. This is similar to transactions in traditional databases.
- Consistency: A `Transaction` captures a snapshot of the `Database`'s state (specifically, the tips of the relevant `Store`s) when it's created or when a `Store` is first accessed within it. All reads and writes within that `Transaction` occur relative to this consistent state.
- Change Staging: Modifications made via `Store` handles are staged within the `Transaction` object itself, not written directly to the database until `commit()` is called.
- Authentication: All transactions are automatically authenticated using the database's default signing key, ensuring data integrity and access control.
- History Creation: A successful `commit()` results in the creation of a new `Entry` in the `Database`, containing the staged changes and linked to the previous state (the tips the `Transaction` was based on). This is how history is built.
The Transaction Lifecycle
Using a `Transaction` follows a distinct lifecycle:
1. Creation: Start an authenticated transaction from a `Database` instance.

```rust
use eidetica::{backend::database::InMemory, Instance, crdt::Doc};

fn main() -> eidetica::Result<()> {
    // Setup database
    let backend = InMemory::new();
    let db = Instance::new(Box::new(backend));
    db.add_private_key("key")?;
    let mut settings = Doc::new();
    settings.set_string("name", "test");
    let database = db.new_database(settings, "key")?;

    // Automatically uses the database's default signing key
    let _txn = database.new_transaction()?;
    Ok(())
}
```
2. Store Access: Get handles to the specific `Store`s you want to interact with. This implicitly loads the current state (tips) of that store into the transaction if accessed for the first time.

```rust
use eidetica::{backend::database::InMemory, Instance, crdt::Doc, store::{Table, DocStore}};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct User {
    name: String,
}

fn main() -> eidetica::Result<()> {
    // Setup database and transaction
    let backend = InMemory::new();
    let db = Instance::new(Box::new(backend));
    db.add_private_key("key")?;
    let mut settings = Doc::new();
    settings.set_string("name", "test");
    let database = db.new_database(settings, "key")?;
    let txn = database.new_transaction()?;

    // Get handles within a scope or manage their lifetime
    let _users_store = txn.get_store::<Table<User>>("users")?;
    let _config_store = txn.get_store::<DocStore>("config")?;
    txn.commit()?;
    Ok(())
}
```
3. Staging Changes: Use the methods provided by the `Store` handles (`set`, `insert`, `get`, `remove`, etc.). These methods interact with the data staged within the `Transaction`.

```rust
use eidetica::store::{Table, DocStore};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct User {
    name: String,
}

fn example(users_store: &Table<User>, config_store: &DocStore, user_id: &str) -> eidetica::Result<()> {
    users_store.insert(User { name: "Alice".to_string() })?;
    let _current_name = users_store.get(user_id)?;
    config_store.set("last_updated", chrono::Utc::now().to_rfc3339())?;
    Ok(())
}
```

Note: `get` methods within a transaction read from the staged state, reflecting any changes already made within the same transaction.
Commit: Finalize the changes. This consumes the
Transaction
object, calculates the finalEntry
content based on staged changes, cryptographically signs the entry, writes the newEntry
to theDatabase
, and returns theID
of the newly createdEntry
.extern crate eidetica; use eidetica::{backend::database::InMemory, Instance, crdt::Doc}; fn main() -> eidetica::Result<()> { // Setup database let backend = InMemory::new(); let db = Instance::new(Box::new(backend)); db.add_private_key("key")?; let mut settings = Doc::new(); settings.set_string("name", "test"); let database = db.new_database(settings, "key")?; // Create transaction and commit let txn = database.new_transaction()?; let new_entry_id = txn.commit()?; println!("Changes committed. New state represented by Entry: {}", new_entry_id); Ok(()) }
After `commit()`, the `txn` variable is no longer valid.
Read-Only Access
While `Transaction`s are essential for writes, you can perform reads without an explicit `Transaction` using `Database::get_store_viewer`:
```rust
use eidetica::{Database, store::Table};
use serde::{Serialize, Deserialize};

#[derive(Clone, Debug, Serialize, Deserialize)]
struct User {
    name: String,
}

fn example(database: Database, user_id: &str) -> eidetica::Result<()> {
    let users_viewer = database.get_store_viewer::<Table<User>>("users")?;
    if let Ok(_user) = users_viewer.get(user_id) {
        // Read data based on the current tips of the 'users' store
    }
    Ok(())
}
```
A `SubtreeViewer` provides read-only access based on the latest committed state (tips) of that specific store at the time the viewer is created. It does not allow modifications and does not require a `commit()`.
Choose `Transaction` when you need to make changes or require a transaction-like boundary for multiple reads/writes. Choose `SubtreeViewer` for simple, read-only access to the latest state.
Authentication Guide
How to use Eidetica's authentication system for securing your data.
Quick Start
Every Eidetica database requires authentication. Here's the minimal setup:
```rust
use eidetica::{Instance, backend::database::InMemory};
use eidetica::crdt::Doc;

// Create database
let backend = InMemory::new();
let db = Instance::new(Box::new(backend));

// Add an authentication key (generates Ed25519 keypair)
db.add_private_key("my_key")?;

// Create a database using that key
let mut settings = Doc::new();
settings.set("name", "my_database");
let database = db.new_database(settings, "my_key")?;

// All operations are now authenticated
let op = database.new_transaction()?;
// ... make changes ...
op.commit()?; // Automatically signed
```
Key Concepts
Mandatory Authentication: Every entry must be signed - no exceptions.
Permission Levels:
- Admin: Can modify settings and manage keys
- Write: Can read and write data
- Read: Can only read data
Key Storage: Private keys are stored in Instance, public keys in database settings.
Common Tasks
Adding Users
Give other users access to your database:
```rust
use eidetica::auth::{AuthKey, Permission, KeyStatus};

let op = database.new_transaction()?;
let auth = op.auth_settings()?;

// Add a user with write access
let user_key = AuthKey {
    key: "ed25519:USER_PUBLIC_KEY_HERE".to_string(),
    permissions: Permission::Write(10),
    status: KeyStatus::Active,
};
auth.add_key("alice", user_key)?;
op.commit()?;
```
Making Data Public
Allow anyone to read your database:
```rust
let op = database.new_transaction()?;
let auth = op.auth_settings()?;

// Wildcard key for public read access
let public_key = AuthKey {
    key: "*".to_string(),
    permissions: Permission::Read,
    status: KeyStatus::Active,
};
auth.add_key("*", public_key)?;
op.commit()?;
```
Revoking Access
Remove a user's access:
```rust
let op = database.new_transaction()?;
let auth = op.auth_settings()?;

// Revoke the key
if let Some(mut key) = auth.get_key("alice")? {
    key.status = KeyStatus::Revoked;
    auth.update_key("alice", key)?;
}
op.commit()?;
```
Note: Historical entries created by revoked keys remain valid.
Multi-User Setup Example
```rust
// Initial setup with admin hierarchy
let op = database.new_transaction()?;
let auth = op.auth_settings()?;

// Super admin (priority 0 - highest)
auth.add_key("super_admin", AuthKey {
    key: "ed25519:SUPER_ADMIN_KEY".to_string(),
    permissions: Permission::Admin(0),
    status: KeyStatus::Active,
})?;

// Department admin (priority 10)
auth.add_key("dept_admin", AuthKey {
    key: "ed25519:DEPT_ADMIN_KEY".to_string(),
    permissions: Permission::Admin(10),
    status: KeyStatus::Active,
})?;

// Regular users (priority 100)
auth.add_key("user1", AuthKey {
    key: "ed25519:USER1_KEY".to_string(),
    permissions: Permission::Write(100),
    status: KeyStatus::Active,
})?;

op.commit()?;
```
Key Management Tips
- Use descriptive key names: "alice_laptop", "build_server", etc.
- Set up admin hierarchy: Lower priority numbers = higher authority
- Regular key rotation: Periodically update keys for security
- Backup admin keys: Keep secure copies of critical admin keys
Advanced: Cross-Database Authentication
Databases can delegate authentication to other databases:
```rust
// In the main database, delegate to a user's personal database
let op = main_tree.new_transaction()?;
let auth = op.auth_settings()?;

// Reference another database for authentication
auth.add_delegated_tree("user@example.com", DelegatedTreeRef {
    tree_root: "USER_TREE_ROOT_ID".to_string(),
    max_permission: Permission::Write(15),
    min_permission: Some(Permission::Read),
})?;
op.commit()?;
```
This allows users to manage their own keys in their personal databases while accessing your database with appropriate permissions.
Troubleshooting
"Authentication failed": Check that:
- The key exists in database settings
- The key status is Active (not Revoked)
- The key has sufficient permissions for the operation
"Cannot modify key": Admin operations require:
- Admin-level permissions
- Equal or higher priority than the target key
Network partitions: Authentication changes merge automatically using Last-Writer-Wins (LWW); the most recent change takes precedence.
See Also
- Core Concepts - Understanding Databases and Entries
- Getting Started - Basic database setup
- Authentication Details - Technical implementation
Synchronization Guide
Eidetica's synchronization system enables real-time data synchronization between distributed peers in a decentralized network. This guide covers how to set up, configure, and use the sync features.
Overview
The sync system uses a BackgroundSync architecture with command-pattern communication:
- Single background thread handles all sync operations
- Command-channel communication between frontend and backend
- Automatic change detection via hook system
- Multiple transport protocols (HTTP, Iroh P2P)
- Database-level sync relationships for granular control
- Authentication and security using Ed25519 signatures
- Persistent state tracking via DocStore
Quick Start
1. Enable Sync on Your Database
```rust
use eidetica::{Instance, backend::database::InMemory};

// Create a database with sync enabled
let backend = Box::new(InMemory::new());
let db = Instance::new(backend).with_sync()?;

// Add a private key for authentication
db.add_private_key("device_key")?;
```
2. Enable a Transport Protocol
```rust
// Enable HTTP transport
db.sync_mut()?.enable_http_transport()?;

// Start a server to accept connections
db.sync_mut()?.start_server("127.0.0.1:8080")?;
```
3. Connect to a Remote Peer
```rust
use eidetica::sync::{Address, peer_types::PeerStatus};

// Create an address for the remote peer
let remote_addr = Address::http("192.168.1.100:8080")?;

// Connect and perform handshake
let peer_pubkey = db.sync_mut()?.connect_to_peer(&remote_addr).await?;

// Activate the peer for syncing
db.sync_mut()?.update_peer_status(&peer_pubkey, PeerStatus::Active)?;
```
4. Set Up Database Synchronization
```rust
// Create a database to sync
let database = db.new_database(Doc::new(), "device_key")?;
let tree_id = database.root_id().to_string();

// Configure this database to sync with the peer
db.sync_mut()?.add_tree_sync(&peer_pubkey, &tree_id)?;
```
5. Automatic Synchronization
Once configured, any changes to the database will automatically be queued for synchronization:
```rust
// Make changes to the database - these will be auto-synced
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("data")?;
store.set_string("message", "Hello, distributed world!")?;
op.commit()?; // This triggers a sync queue entry
```
Transport Protocols
HTTP Transport
The HTTP transport uses REST APIs for synchronization:
```rust
// Enable HTTP transport
sync.enable_http_transport()?;

// Start server
sync.start_server("127.0.0.1:8080")?;

// Connect to remote peer
let addr = Address::http("peer.example.com:8080")?;
let peer_key = sync.connect_to_peer(&addr).await?;
```
Iroh P2P Transport (Recommended)
Iroh provides direct peer-to-peer connectivity with NAT traversal:
```rust
// Enable Iroh transport with production defaults (uses n0's relay servers)
sync.enable_iroh_transport()?;

// Or configure for specific environments:
use iroh::RelayMode;
use eidetica::sync::transports::iroh::IrohTransport;

// For local testing without internet (fast, no relays)
let transport = IrohTransport::builder()
    .relay_mode(RelayMode::Disabled)
    .build()?;
sync.enable_iroh_transport_with_config(transport)?;

// For staging/testing environments
let transport = IrohTransport::builder()
    .relay_mode(RelayMode::Staging)
    .build()?;
sync.enable_iroh_transport_with_config(transport)?;

// For enterprise deployments with custom relay servers
use iroh::{RelayMap, RelayNode, RelayUrl};
let relay_url: RelayUrl = "https://relay.example.com".parse()?;
let relay_node = RelayNode {
    url: relay_url,
    quic: Some(Default::default()), // Enable QUIC for better performance
};
let transport = IrohTransport::builder()
    .relay_mode(RelayMode::Custom(RelayMap::from_iter([relay_node])))
    .build()?;
sync.enable_iroh_transport_with_config(transport)?;

// Start the Iroh server (binds to its own ports)
sync.start_server("ignored")?; // Iroh manages its own addressing

// Get the server address for sharing with peers
let my_address = sync.get_server_address()?;
// This returns a JSON string containing:
// - node_id: Your cryptographic node identity
// - direct_addresses: Socket addresses where you can be reached

// Connect to a peer using their address
let addr = Address::iroh(&peer_address_json)?;
let peer_key = sync.connect_to_peer(&addr).await?;

// Or if you only have the node ID (will use relays to discover)
let addr = Address::iroh(peer_node_id)?;
let peer_key = sync.connect_to_peer(&addr).await?;
```
Relay Modes:

- `RelayMode::Default` - Production n0 relay servers (default, recommended for most users)
- `RelayMode::Disabled` - Direct P2P only, no relays (for local testing, requires direct connectivity)
- `RelayMode::Staging` - n0's staging relay servers (for testing against staging infrastructure)
- `RelayMode::Custom(RelayMap)` - Your own relay servers (for enterprise/private deployments)
How Iroh Connectivity Works:
- Peers discover each other through relay servers or direct addresses
- Attempt direct connection via NAT hole-punching (~90% success rate)
- Fall back to relay if direct connection fails
- Automatically upgrade to direct connection when possible
Sync Configuration
BackgroundSync Architecture
The sync system automatically starts a background thread when transport is enabled:
```rust
// The BackgroundSync engine starts automatically when you enable transport
sync.enable_http_transport()?; // This starts the background thread

// The background thread runs an event loop with:
// - Command processing (immediate)
// - Periodic sync timer (5 minutes)
// - Retry queue timer (30 seconds)
// - Connection check timer (60 seconds)
```
Automatic Sync Behavior
Once configured, the system handles everything automatically:
```rust
// When you commit changes, they're sent immediately
let op = database.new_transaction()?;
op.commit()?; // Sync hook sends command to background thread

// Failed sends are retried with exponential backoff:
// 2^attempts seconds delay (max 64 seconds), with a
// configurable max number of attempts before dropping.

// No manual queue management or worker control is needed -
// the BackgroundSync engine handles all operations.
```
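The retry policy described above (delay of 2^attempts seconds, capped at 64) can be sketched as a small helper. The function name and overflow handling here are illustrative, not the actual BackgroundSync internals:

```rust
// Hedged sketch of the documented retry policy:
// delay = 2^attempts seconds, capped at 64 seconds.
fn retry_delay_secs(attempts: u32) -> u64 {
    // checked_shl avoids overflow for very large attempt counts.
    1u64.checked_shl(attempts).unwrap_or(u64::MAX).min(64)
}

fn main() {
    assert_eq!(retry_delay_secs(0), 1);
    assert_eq!(retry_delay_secs(3), 8);
    assert_eq!(retry_delay_secs(6), 64);
    assert_eq!(retry_delay_secs(10), 64); // capped
}
```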
Peer Management
Registering Peers
```rust
// Register a peer manually
sync.register_peer("ed25519:abc123...", Some("Alice's Device"))?;

// Add multiple addresses for the same peer
sync.add_peer_address(&peer_key, Address::http("192.168.1.100:8080")?)?;
sync.add_peer_address(&peer_key, Address::iroh("iroh://peer_id@relay")?)?;
```
Peer Status Management
```rust
use eidetica::sync::peer_types::PeerStatus;

// Activate peer for syncing
sync.update_peer_status(&peer_key, PeerStatus::Active)?;

// Pause syncing with a peer
sync.update_peer_status(&peer_key, PeerStatus::Inactive)?;

// Get peer information
if let Some(peer_info) = sync.get_peer_info(&peer_key)? {
    println!("Peer: {} ({})",
        peer_info.display_name.unwrap_or("Unknown".to_string()),
        peer_info.status);
}
```
Database Sync Relationships
```rust
// Add database to sync relationship
sync.add_tree_sync(&peer_key, &tree_id)?;

// List all databases synced with a peer
let synced_trees = sync.get_peer_trees(&peer_key)?;

// List all peers syncing a specific database
let syncing_peers = sync.get_tree_peers(&tree_id)?;

// Remove database from sync relationship
sync.remove_tree_sync(&peer_key, &tree_id)?;
```
Security
Authentication
All sync operations use Ed25519 digital signatures:
```rust
// The sync system automatically uses your device key for authentication.
// Add additional keys if needed:
db.add_private_key("backup_key")?;

// Set a specific key as default for a database
database.set_default_auth_key("backup_key");
```
Peer Verification
During handshake, peers exchange and verify public keys:
```rust
// The connect_to_peer method automatically:
// 1. Exchanges public keys
// 2. Verifies signatures
// 3. Registers the verified peer
let verified_peer_key = sync.connect_to_peer(&addr).await?;
```
Monitoring and Diagnostics
Sync Operations
The BackgroundSync engine handles all operations automatically:
```rust
// Entries are synced immediately when committed;
// no manual queue management is needed.
// The background thread handles:
// - Immediate sending of new entries
// - Retry queue with exponential backoff
// - Periodic sync every 5 minutes
// - Connection health checks every minute

// Server status
let is_running = sync.is_server_running();
let server_addr = sync.get_server_address()?;
```
Sync State Tracking
```rust
use eidetica::sync::state::SyncStateManager;

// Get sync state for a database-peer relationship
let op = sync.sync_tree().new_operation()?;
let state_manager = SyncStateManager::new(&op);

let cursor = state_manager.get_sync_cursor(&peer_key, &tree_id)?;
println!("Last synced: {:?}", cursor.last_synced_entry);

let metadata = state_manager.get_sync_metadata(&peer_key)?;
println!("Success rate: {:.2}%", metadata.sync_success_rate() * 100.0);
```
Error Handling
The sync system provides detailed error reporting:
```rust
use eidetica::sync::SyncError;

match sync.connect_to_peer(&addr).await {
    Ok(peer_key) => println!("Connected to peer: {}", peer_key),
    Err(e) if e.is_sync_error() => {
        match e.sync_error().unwrap() {
            SyncError::HandshakeFailed(msg) => eprintln!("Handshake failed: {}", msg),
            SyncError::NoTransportEnabled => eprintln!("No transport protocol enabled"),
            SyncError::PeerNotFound(key) => eprintln!("Peer not found: {}", key),
            _ => eprintln!("Sync error: {}", e),
        }
    },
    Err(e) => eprintln!("Other error: {}", e),
}
```
Best Practices
1. Use Iroh Transport for Production
Iroh provides better NAT traversal and P2P capabilities than HTTP.
2. Understand Automatic Sync Behavior
The BackgroundSync engine handles operations automatically:
- Entries sync immediately when committed
- Failed sends retry with exponential backoff (2^attempts seconds)
- Periodic sync runs every 5 minutes for all peers
3. Monitor Sync Health
Regularly check sync statistics and peer status to ensure healthy operation.
4. Handle Network Failures Gracefully
The sync system automatically retries failed operations, but your application should handle temporary disconnections.
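One way to handle temporary disconnections in application code is a generic retry wrapper around calls such as `connect_to_peer`. This is an illustrative sketch; the `with_retries` helper is not part of the eidetica API:

```rust
// Hedged sketch: retry a fallible operation up to max_attempts times.
fn with_retries<T, E>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            // Out of attempts: surface the last error.
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                attempt += 1;
                // In real code, sleep ~2^attempt seconds here,
                // mirroring the BackgroundSync backoff policy.
            }
        }
    }
}

fn main() {
    // Fails twice, then succeeds on the third attempt.
    let mut calls = 0;
    let result = with_retries(5, || {
        calls += 1;
        if calls < 3 { Err("unreachable peer") } else { Ok("connected") }
    });
    assert_eq!(result, Ok("connected"));
    assert_eq!(calls, 3);
}
```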
5. Secure Your Private Keys
Store device keys securely and use different keys for different purposes when appropriate.
Advanced Topics
Custom Sync Hooks
You can implement custom sync hooks to extend the sync system:
```rust
use eidetica::Result;
use eidetica::sync::hooks::{SyncHook, SyncHookContext};

struct CustomSyncHook;

impl SyncHook for CustomSyncHook {
    fn on_entry_committed(&self, context: &SyncHookContext) -> Result<()> {
        println!("Entry {} committed to database {}", context.entry.id(), context.tree_id);
        // Custom logic here
        Ok(())
    }
}
```
Multiple Database Instances
You can run multiple sync-enabled databases in the same process:
```rust
// Database 1
let db1 = Instance::new(Box::new(InMemory::new())).with_sync()?;
db1.sync_mut()?.enable_http_transport()?;
db1.sync_mut()?.start_server("127.0.0.1:8080")?;

// Database 2
let db2 = Instance::new(Box::new(InMemory::new())).with_sync()?;
db2.sync_mut()?.enable_http_transport()?;
db2.sync_mut()?.start_server("127.0.0.1:8081")?;

// Connect them
let addr = Address::http("127.0.0.1:8080")?;
let peer_key = db2.sync_mut()?.connect_to_peer(&addr).await?;
```
Troubleshooting
Common Issues
"No transport enabled" error:
- Ensure you've called `enable_http_transport()` or `enable_iroh_transport()`
Sync not happening:
- Check peer status is `Active`
- Verify database sync relationships are configured
- Check network connectivity between peers
Performance issues:
- Consider using Iroh transport for better performance
- Check retry queue for persistent failures
- Verify network connectivity is stable
Authentication failures:
- Ensure private keys are properly configured
- Verify peer public keys are correct
- Check that peers are using compatible protocol versions
Sync Quick Reference
A concise reference for Eidetica's synchronization API with common usage patterns and code snippets.
Setup and Initialization
Basic Sync Setup
```rust
use eidetica::{Instance, backend::database::InMemory};

// Create database with sync enabled
let backend = Box::new(InMemory::new());
let db = Instance::new(backend).with_sync()?;

// Add authentication key
db.add_private_key("device_key")?;

// Enable transport
db.sync_mut()?.enable_http_transport()?;
db.sync_mut()?.start_server("127.0.0.1:8080")?;
```
Understanding BackgroundSync
```rust
// The BackgroundSync engine starts automatically with transport
db.sync_mut()?.enable_http_transport()?; // Starts background thread

// Background thread handles:
// - Command processing (immediate)
// - Periodic sync (5 min intervals)
// - Retry queue (30 sec intervals)
// - Connection checks (60 sec intervals)

// All sync operations are automatic - no manual queue management needed
```
Peer Management
Connect to Remote Peer
```rust
use eidetica::sync::{Address, peer_types::PeerStatus};

// HTTP connection
let addr = Address::http("192.168.1.100:8080")?;
let peer_key = db.sync_mut()?.connect_to_peer(&addr).await?;

// Iroh P2P connection
let addr = Address::iroh("iroh://peer_id@relay.example.com")?;
let peer_key = db.sync_mut()?.connect_to_peer(&addr).await?;

// Activate peer
db.sync_mut()?.update_peer_status(&peer_key, PeerStatus::Active)?;
```
Manual Peer Registration
```rust
// Register peer manually
let peer_key = "ed25519:abc123...";
db.sync_mut()?.register_peer(peer_key, Some("Alice's Device"))?;

// Add addresses
db.sync_mut()?.add_peer_address(peer_key, Address::http("192.168.1.100:8080")?)?;
db.sync_mut()?.add_peer_address(peer_key, Address::iroh("iroh://peer_id")?)?;

// Note: Registration does NOT immediately connect to the peer.
// Connection happens lazily during the next sync operation or periodic sync (5 min).
// Use connect_to_peer() for an immediate connection if needed.
```
Peer Status Management
```rust
// List all peers
let peers = db.sync()?.list_peers()?;
for peer in peers {
    println!("{}: {} ({})",
        peer.pubkey,
        peer.display_name.unwrap_or("Unknown".to_string()),
        peer.status
    );
}

// Get specific peer info
if let Some(peer) = db.sync()?.get_peer_info(&peer_key)? {
    println!("Status: {:?}", peer.status);
    println!("Addresses: {:?}", peer.addresses);
}

// Update peer status
db.sync_mut()?.update_peer_status(&peer_key, PeerStatus::Inactive)?;
```
Database Synchronization
Setup Database Sync Relationships
```rust
// Create database
let database = db.new_database(Doc::new(), "device_key")?;
let tree_id = database.root_id().to_string();

// Enable sync with peer
db.sync_mut()?.add_tree_sync(&peer_key, &tree_id)?;

// List synced databases for peer
let databases = db.sync()?.get_peer_trees(&peer_key)?;

// List peers syncing a database
let peers = db.sync()?.get_tree_peers(&tree_id)?;
```
Remove Sync Relationships
```rust
// Remove database from sync with peer
db.sync_mut()?.remove_tree_sync(&peer_key, &tree_id)?;

// Remove peer completely
db.sync_mut()?.remove_peer(&peer_key)?;
```
Data Operations (Auto-Sync)
Basic Data Changes
```rust
use eidetica::store::DocStore;

// Any database operation automatically triggers sync
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("data")?;
store.set_string("message", "Hello World")?;
store.set_path("user.name", "Alice")?;
store.set_path("user.age", 30)?;

// Commit triggers sync hooks automatically
op.commit()?; // Entries queued for sync to all configured peers
```
Bulk Operations
```rust
// Multiple operations in a single commit
let op = database.new_transaction()?;
let store = op.get_store::<DocStore>("data")?;
for i in 0..100 {
    store.set_string(&format!("item_{}", i), &format!("value_{}", i))?;
}
// Single commit, single sync entry
op.commit()?;
```
Monitoring and Diagnostics
Server Control
```rust
// Start/stop sync server
let sync = db.sync_mut()?;
sync.start_server("127.0.0.1:8080")?;

// Check server status
if sync.is_server_running() {
    let addr = sync.get_server_address()?;
    println!("Server running at: {}", addr);
}

// Stop server
sync.stop_server()?;
```
Sync State Tracking
```rust
use eidetica::sync::state::SyncStateManager;

// Get sync state manager
let op = db.sync()?.sync_tree().new_operation()?;
let state_manager = SyncStateManager::new(&op);

// Get sync cursor for a peer-database relationship
let cursor = state_manager.get_sync_cursor(&peer_key, &tree_id)?;
if let Some(cursor) = cursor {
    println!("Last synced: {:?}", cursor.last_synced_entry);
    println!("Total synced: {}", cursor.total_synced_count);
}

// Get peer metadata
let metadata = state_manager.get_sync_metadata(&peer_key)?;
if let Some(meta) = metadata {
    println!("Successful syncs: {}", meta.successful_sync_count);
    println!("Failed syncs: {}", meta.failed_sync_count);
}
```
Sync History and Metadata

```rust
use eidetica::sync::state::SyncStateManager;

// Get sync database operation
let op = sync.sync_tree().new_operation()?;
let state_manager = SyncStateManager::new(&op);

// Check sync cursor
let cursor = state_manager.get_sync_cursor(&peer_key, &tree_id)?;
println!("Last synced: {:?}", cursor.last_synced_entry);
println!("Total synced: {}", cursor.total_synced_count);

// Check sync metadata
let metadata = state_manager.get_sync_metadata(&peer_key)?;
println!("Success rate: {:.2}%", metadata.sync_success_rate() * 100.0);
println!("Avg duration: {:.1}ms", metadata.average_sync_duration_ms);

// Get recent sync history
let history = state_manager.get_sync_history(&peer_key, Some(10))?;
for entry in history {
    println!("Sync {}: {} entries in {:.1}ms",
        entry.sync_id, entry.entries_count, entry.duration_ms);
}
```
Error Handling
Common Error Patterns
```rust
use eidetica::sync::SyncError;

// Connection errors
match sync.connect_to_peer(&addr).await {
    Ok(peer_key) => println!("Connected: {}", peer_key),
    Err(e) if e.is_sync_error() => {
        match e.sync_error().unwrap() {
            SyncError::HandshakeFailed(msg) => {
                eprintln!("Handshake failed: {}", msg);
                // Retry with a different address or check credentials
            },
            SyncError::NoTransportEnabled => {
                eprintln!("Enable transport first");
                sync.enable_http_transport()?;
            },
            SyncError::PeerNotFound(key) => {
                eprintln!("Peer {} not registered", key);
                // Register the peer first
            },
            _ => eprintln!("Other sync error: {}", e),
        }
    },
    Err(e) => eprintln!("Non-sync error: {}", e),
}
```
Monitoring Sync Health
```rust
// Check server status
if !sync.is_server_running() {
    eprintln!("Warning: Sync server not running");
}

// Monitor peer connectivity
let peers = sync.list_peers()?;
for peer in peers {
    if peer.status != PeerStatus::Active {
        eprintln!("Warning: Peer {} is {}", peer.pubkey, peer.status);
    }
}

// Sync happens automatically, but you can monitor state
// via the SyncStateManager for diagnostics
```
Configuration Examples
Development Setup
```rust
// Fast, responsive sync for development:
// enable HTTP transport for easy debugging
db.sync_mut()?.enable_http_transport()?;
db.sync_mut()?.start_server("127.0.0.1:8080")?;

// Connect to a local test peer
let addr = Address::http("127.0.0.1:8081")?;
let peer = db.sync_mut()?.connect_to_peer(&addr).await?;
```
Production Setup
```rust
// Use Iroh for production deployments (defaults to n0's relay servers)
db.sync_mut()?.enable_iroh_transport()?;

// Or configure for specific environments:
use iroh::RelayMode;
use eidetica::sync::transports::iroh::IrohTransport;

// Custom relay server (e.g., enterprise deployment)
let relay_url: iroh::RelayUrl = "https://relay.example.com".parse()?;
let relay_node = iroh::RelayNode {
    url: relay_url,
    quic: Some(Default::default()),
};
let transport = IrohTransport::builder()
    .relay_mode(RelayMode::Custom(iroh::RelayMap::from_iter([relay_node])))
    .build()?;
db.sync_mut()?.enable_iroh_transport_with_config(transport)?;

// Connect to peers
let addr = Address::iroh(peer_node_id)?;
let peer = db.sync_mut()?.connect_to_peer(&addr).await?;

// Sync happens automatically:
// - Immediate on commit
// - Retry with exponential backoff
// - Periodic sync every 5 minutes
```
Multi-Database Setup
```rust
// Run multiple sync-enabled databases
let db1 = Instance::new(Box::new(InMemory::new())).with_sync()?;
db1.sync_mut()?.enable_http_transport()?;
db1.sync_mut()?.start_server("127.0.0.1:8080")?;

let db2 = Instance::new(Box::new(InMemory::new())).with_sync()?;
db2.sync_mut()?.enable_http_transport()?;
db2.sync_mut()?.start_server("127.0.0.1:8081")?;

// Connect them together
let addr = Address::http("127.0.0.1:8080")?;
let peer = db2.sync_mut()?.connect_to_peer(&addr).await?;
```
Testing Patterns
Testing with Iroh (No Relays)
```rust
#[tokio::test]
async fn test_iroh_sync_local() -> Result<()> {
    use iroh::RelayMode;
    use eidetica::sync::transports::iroh::IrohTransport;
    use eidetica::sync::Address;

    // Configure Iroh for local testing (no relay servers)
    let transport1 = IrohTransport::builder()
        .relay_mode(RelayMode::Disabled)
        .build()?;
    let transport2 = IrohTransport::builder()
        .relay_mode(RelayMode::Disabled)
        .build()?;

    // Setup databases with local Iroh transport
    let db1 = Instance::new(Box::new(InMemory::new())).with_sync()?;
    db1.sync_mut()?.enable_iroh_transport_with_config(transport1)?;
    db1.sync_mut()?.start_server("ignored")?; // Iroh manages its own addresses

    let db2 = Instance::new(Box::new(InMemory::new())).with_sync()?;
    db2.sync_mut()?.enable_iroh_transport_with_config(transport2)?;
    db2.sync_mut()?.start_server("ignored")?;

    // Get the serialized NodeAddr (includes direct addresses)
    let addr1 = db1.sync()?.get_server_address()?;
    let addr2 = db2.sync()?.get_server_address()?;

    // Connect peers using full NodeAddr info
    let peer1 = db2.sync_mut()?.connect_to_peer(&Address::iroh(&addr1)?).await?;
    let peer2 = db1.sync_mut()?.connect_to_peer(&Address::iroh(&addr2)?).await?;

    // Now they can sync directly via P2P
    Ok(())
}
```
Mock Peer Setup (HTTP)
```rust
#[tokio::test]
async fn test_sync_between_peers() -> Result<()> {
    use std::time::Duration;
    use eidetica::{crdt::Doc, store::DocStore};
    use eidetica::sync::{Address, peer_types::PeerStatus};

    // Setup first peer
    let db1 = Instance::new(Box::new(InMemory::new())).with_sync()?;
    db1.add_private_key("peer1")?;
    db1.sync_mut()?.enable_http_transport()?;
    db1.sync_mut()?.start_server("127.0.0.1:0")?; // Random port
    let addr1 = db1.sync()?.get_server_address()?;

    // Setup second peer
    let db2 = Instance::new(Box::new(InMemory::new())).with_sync()?;
    db2.add_private_key("peer2")?;
    db2.sync_mut()?.enable_http_transport()?;

    // Connect peers
    let addr = Address::http(&addr1)?;
    let peer1_key = db2.sync_mut()?.connect_to_peer(&addr).await?;
    db2.sync_mut()?.update_peer_status(&peer1_key, PeerStatus::Active)?;

    // Setup sync relationship
    let tree1 = db1.new_database(Doc::new(), "peer1")?;
    let tree2 = db2.new_database(Doc::new(), "peer2")?;
    db2.sync_mut()?.add_tree_sync(&peer1_key, &tree1.root_id().to_string())?;

    // Test sync
    let op1 = tree1.new_transaction()?;
    let store1 = op1.get_store::<DocStore>("data")?;
    store1.set_string("test", "value")?;
    op1.commit()?;

    // Wait for sync
    tokio::time::sleep(Duration::from_secs(2)).await;

    // Verify sync occurred
    // ... verification logic
    Ok(())
}
```
Best Practices Summary
✅ Do
- Enable sync before creating databases you want to synchronize
- Use PeerStatus::Active only for peers you want to sync with
- Use Iroh transport for production deployments
- Monitor sync state and peer connectivity
- Handle network failures gracefully
- Let BackgroundSync handle retry logic automatically
❌ Don't
- Disable sync hooks on databases you want to synchronize
- Manually manage sync queues (BackgroundSync handles this)
- Ignore sync errors in production code
- Use HTTP transport for high-volume production (prefer Iroh)
- Assume sync is instantaneous (it's eventually consistent)
🔧 Troubleshooting Checklist
- Sync not working?
  - Check transport is enabled and server started
  - Verify peer status is Active
  - Confirm database sync relationships are configured
  - Check network connectivity
- Performance issues?
  - Consider using Iroh transport
  - Check for network bottlenecks
  - Verify retry queue isn't growing unbounded
  - Monitor peer connectivity status
- Memory usage high?
  - Check for dead/unresponsive peers
  - Verify retry queue is processing correctly
  - Consider restarting sync to clear state
- Sync delays?
  - Remember sync triggers immediately on commit, so persistent delays usually indicate retries
  - Check if entries are in the retry queue
  - Verify network is stable
  - Check peer responsiveness
Logging
Eidetica uses the tracing crate for structured logging throughout the library. This provides flexible, performant logging that library users can configure to their needs.
Quick Start
To enable logging in your Eidetica application, add tracing-subscriber to your dependencies and initialize it in your main function:
[dependencies]
eidetica = "0.1"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
use tracing_subscriber::EnvFilter;
fn main() {
// Initialize tracing with environment filter
tracing_subscriber::fmt()
.with_env_filter(
EnvFilter::from_default_env()
.add_directive("eidetica=info".parse().unwrap())
)
.init();
// Your application code here
}
Configuring Log Levels
Control logging verbosity using the RUST_LOG environment variable:
# Show only errors
RUST_LOG=eidetica=error cargo run
# Show info messages and above
RUST_LOG=eidetica=info cargo run
# Show debug messages for sync module
RUST_LOG=eidetica::sync=debug cargo run
# Show all trace messages (very verbose)
RUST_LOG=eidetica=trace cargo run
Log Levels in Eidetica
Eidetica uses the following log levels:
- ERROR: Critical errors that prevent operations from completing
  - Failed database operations
  - Network errors during sync
  - Authentication failures
- WARN: Important warnings that don't prevent operation
  - Retry operations after failures
  - Invalid configuration detected
  - Deprecated feature usage
- INFO: High-level operational messages
  - Sync server started/stopped
  - Successful synchronization with peers
  - Database loaded/saved
- DEBUG: Detailed operational information
  - Sync protocol details
  - Database synchronization progress
  - Hook execution
- TRACE: Very detailed trace information
  - Individual entry processing
  - Detailed CRDT operations
  - Network protocol messages
Examples
Basic Application Logging
use eidetica::Instance;
use eidetica::backend::database::InMemory;
use tracing_subscriber::EnvFilter;
fn main() -> eidetica::Result<()> {
// Set up logging with default info level
tracing_subscriber::fmt()
.with_env_filter(
EnvFilter::from_default_env()
.add_directive("eidetica=info".parse().unwrap())
)
.init();
let backend = Box::new(InMemory::new());
let db = Instance::new(backend);
// Add private key first
db.add_private_key("my_key")?;
// Create a database
let mut settings = eidetica::crdt::Doc::new();
settings.set_string("name", "my_database");
let database = db.new_database(settings, "my_key")?;
Ok(())
}
Custom Logging Configuration
use tracing_subscriber::{fmt, EnvFilter};
use tracing_subscriber::prelude::*;
fn main() {
// Configure logging with custom format and filtering
let filter = EnvFilter::try_new(
"eidetica=debug,eidetica::sync=trace"
).unwrap();
tracing_subscriber::registry()
.with(fmt::layer()
.with_target(false) // Don't show target module
.compact() // Use compact formatting
)
.with(filter)
.init();
// Your application code here
}
Logging in Tests
For tests, you can use tracing-subscriber's test utilities:
#[cfg(test)]
mod tests {
    use tracing_subscriber::EnvFilter;

    #[test]
    fn test_with_logging() {
        // Initialize logging for this test
        let _ = tracing_subscriber::fmt()
            .with_env_filter(EnvFilter::from_default_env())
            .with_test_writer()
            .try_init();

        // Test code here - logs will be captured with test output
    }
}
Performance Considerations
The tracing crate is designed to have minimal overhead when logging is disabled: statements above the compile-time maximum level are removed entirely, and disabled levels are filtered cheaply at runtime.
For performance-critical code paths, Eidetica uses appropriate log levels:
- Hot paths use trace! level to avoid overhead in production
- Synchronization operations use debug! for detailed tracking
- Only important events use info! and above
Integration with Observability Tools
The tracing ecosystem supports various backends for production observability:
- Console output: Default, human-readable format
- JSON output: For structured logging systems
- OpenTelemetry: For distributed tracing
- Jaeger/Zipkin: For trace visualization
See the tracing documentation for more advanced integration options.
Developer Walkthrough: Building with Eidetica
This guide provides a practical walkthrough for developers starting with Eidetica, using the simple command-line Todo Example to illustrate core concepts.
Core Concepts
Eidetica organizes data differently from traditional relational databases. Here's a breakdown of the key components you'll interact with, illustrated by the Todo example (examples/todo/src/main.rs).
Note: This example uses the Eidetica library from the workspace at crates/lib/.
1. The Database (Instance)
The Instance is your main entry point for interacting with an Eidetica database instance. It manages the underlying storage (the "database") and provides access to data structures called Databases.
In the Todo example, we initialize or load the database using an InMemory database, which can be persisted to a file:
use eidetica::backend::database::InMemory;
use eidetica::Instance;
use std::path::PathBuf;
use anyhow::Result;
fn load_or_create_db(path: &PathBuf) -> Result<Instance> {
if path.exists() {
// Load existing DB from file
let database = InMemory::load_from_file(path)?;
let db = Instance::new(Box::new(database));
// Authentication keys are automatically loaded with the database
Ok(db)
} else {
// Create a new in-memory database
let database = InMemory::new();
let db = Instance::new(Box::new(database));
// Add authentication key (required for all operations)
db.add_private_key("todo_app_key")?;
Ok(db)
}
}
fn save_db(db: &Instance, path: &PathBuf) -> Result<()> {
let database = db.backend();
let database_guard = database.lock().unwrap();
// Cast is needed to call database-specific methods like save_to_file
let in_memory_database = database_guard
.as_any()
.downcast_ref::<InMemory>()
.ok_or(anyhow::anyhow!("Failed to downcast database"))?; // Simplified error
in_memory_database.save_to_file(path)?;
Ok(())
}
// Usage in main:
// let db = load_or_create_db(&cli.database_path)?;
// save_db(&db, &cli.database_path)?;
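The downcast in save_db relies on Rust's std::any::Any machinery. The same pattern can be shown standalone with hypothetical stand-in types (Backend and InMemoryLike below are illustrative, not Eidetica's real trait or struct):

```rust
use std::any::Any;

// Hypothetical stand-in for a backend trait object.
trait Backend: Any {
    // Exposes the concrete type behind the trait object for downcasting.
    fn as_any(&self) -> &dyn Any;
}

struct InMemoryLike {
    entries: usize,
}

impl Backend for InMemoryLike {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

fn main() {
    let backend: Box<dyn Backend> = Box::new(InMemoryLike { entries: 3 });
    // Same shape as save_db: downcast to reach the concrete type's
    // own methods and fields (e.g. a save_to_file on the real InMemory).
    let concrete = backend
        .as_any()
        .downcast_ref::<InMemoryLike>()
        .expect("backend was not InMemoryLike");
    println!("entries: {}", concrete.entries);
}
```

The downcast returns None rather than panicking if the backend is some other concrete type, which is why the real save_db maps that case to an error.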
2. Databases (Database)
A Database is a primary organizational unit within an Instance. Think of it somewhat like a schema or a logical database within a larger instance. It acts as a container for related data, managed through Stores. Databases provide versioning and history tracking for the data they contain.
The Todo example uses a single Database named "todo":
use eidetica::Instance;
use eidetica::Database;
use anyhow::Result;
fn load_or_create_todo_tree(db: &Instance) -> Result<Database> {
let tree_name = "todo";
let auth_key = "todo_app_key"; // Must match the key added to the database
// Attempt to find an existing database by name using find_database
match db.find_database(tree_name) {
Ok(mut databases) => {
// Found one or more databases with the name.
// We arbitrarily take the first one found.
// In a real app, you might want specific logic for duplicates.
println!("Found existing todo database.");
Ok(databases.pop().unwrap()) // Safe unwrap as find_database errors if empty
}
Err(e) if e.is_not_found() => {
// If not found, create a new one
println!("No existing todo database found, creating a new one...");
let mut doc = eidetica::crdt::Doc::new(); // Database settings
doc.set("name", tree_name);
let database = db.new_database(doc, auth_key)?;
// No initial commit needed here as stores like Table handle
// their creation upon first access within an operation.
Ok(database)
}
Err(e) => {
// Handle other potential errors from find_database
Err(e.into())
}
}
}
// Usage in main:
// let todo_database = load_or_create_todo_tree(&db)?;
3. Transactions (Transaction)
All modifications to a Database's data happen within a Transaction. Transactions ensure atomicity – similar to transactions in traditional databases. Changes made within a transaction are only applied to the Database when the transaction is successfully committed.
Every transaction is automatically authenticated using the database's default signing key. This ensures that all changes are cryptographically verified and traceable.
use eidetica::Database;
use anyhow::Result;
fn some_data_modification(database: &Database) -> eidetica::Result<()> {
// Start an authenticated atomic transaction
let op = database.new_transaction()?; // Automatically uses the database's default signing key
// ... perform data changes using the 'op' handle ...
// Commit the changes atomically (automatically signed)
op.commit()?;
Ok(())
}
Read-only access also typically uses a Transaction to ensure a consistent view of the data at a specific point in time.
4. Stores (Store)
Stores are the heart of data storage within a Database. Unlike rigid tables in SQL databases, Stores are highly flexible containers.
- Analogy: You can think of a Store loosely like a table or a collection within a Database.
- Flexibility: Stores aren't tied to a single data type or structure. They are generic containers identified by a name (e.g., "todos").
- Implementations: Eidetica provides several Store implementations for common data patterns. The Todo example uses Table<T>, which is specialized for storing collections of structured data (like rows) where each item has a unique ID. Other implementations might exist for key-value pairs, lists, etc.
- Extensibility: You can implement your own Store types to model complex or domain-specific data structures.
The Todo example uses a Table to store Todo structs:
use eidetica::{Database, Error};
use eidetica::store::Table;
use serde::{Deserialize, Serialize};
use chrono::{DateTime, Utc};
use anyhow::{anyhow, Result};
// Define the data structure (must be Serializable + Deserializable)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Todo {
pub title: String,
pub completed: bool,
pub created_at: DateTime<Utc>,
pub completed_at: Option<DateTime<Utc>>,
}
fn add_todo(database: &Database, title: String) -> Result<()> {
let op = database.new_transaction()?;
// Get a handle to the 'todos' store, specifying its type is Table<Todo>
let todos_store = op.get_subtree::<Table<Todo>>("todos")?;
let todo = Todo::new(title);
// Insert the data - Table assigns an ID
let todo_id = todos_store.insert(todo)?;
op.commit()?;
println!("Added todo with ID: {}", todo_id);
Ok(())
}
fn complete_todo(database: &Database, id: &str) -> Result<()> {
let op = database.new_transaction()?;
let todos_store = op.get_subtree::<Table<Todo>>("todos")?;
// Get data by ID
let mut todo = todos_store.get(id).map_err(|e| anyhow!("Get failed: {}", e))?;
todo.complete();
// Update data by ID
todos_store.set(id, todo)?;
op.commit()?;
Ok(())
}
fn list_todos(database: &Database) -> Result<()> {
let op = database.new_transaction()?;
let todos_store = op.get_subtree::<Table<Todo>>("todos")?;
// Search/scan the store
let todos_with_ids = todos_store.search(|_| true)?; // Get all
// ... print todos ...
Ok(())
}
5. Data Modeling (Serialize, Deserialize)
Eidetica leverages the serde framework for data serialization. Any data structure you want to store needs to implement serde::Serialize and serde::Deserialize. This allows you to store complex Rust types directly.
#[derive(Debug, Clone, Serialize, Deserialize)] // Serde traits
pub struct Todo {
pub title: String,
pub completed: bool,
pub created_at: DateTime<Utc>,
pub completed_at: Option<DateTime<Utc>>,
}
6. Y-CRDT Integration (YDoc)
The Todo example also demonstrates the use of YDoc for collaborative data structures, specifically for user information and preferences. This requires the "y-crdt" feature flag.
use eidetica::store::YDoc;
use eidetica::y_crdt::{Map, Transact};
fn set_user_info(database: &Database, name: Option<&String>, email: Option<&String>, bio: Option<&String>) -> Result<()> {
let op = database.new_transaction()?;
// Get a handle to the 'user_info' YDoc store
let user_info_store = op.get_subtree::<YDoc>("user_info")?;
// Update user information using the Y-CRDT document
user_info_store.with_doc_mut(|doc| {
let user_info_map = doc.get_or_insert_map("user_info");
let mut txn = doc.transact_mut();
if let Some(name) = name {
user_info_map.insert(&mut txn, "name", name.clone());
}
if let Some(email) = email {
user_info_map.insert(&mut txn, "email", email.clone());
}
if let Some(bio) = bio {
user_info_map.insert(&mut txn, "bio", bio.clone());
}
Ok(())
})?;
op.commit()?;
Ok(())
}
fn set_user_preference(database: &Database, key: String, value: String) -> Result<()> {
let op = database.new_transaction()?;
// Get a handle to the 'user_prefs' YDoc store
let user_prefs_store = op.get_subtree::<YDoc>("user_prefs")?;
// Update user preference using the Y-CRDT document
user_prefs_store.with_doc_mut(|doc| {
let prefs_map = doc.get_or_insert_map("preferences");
let mut txn = doc.transact_mut();
prefs_map.insert(&mut txn, key, value);
Ok(())
})?;
op.commit()?;
Ok(())
}
Multiple Store Types in One Database:
The Todo example demonstrates how different store types can coexist within the same database:
- "todos" (Table): Stores todo items with automatic ID generation
- "user_info" (YDoc): Stores user profile information using Y-CRDT Maps
- "user_prefs" (YDoc): Stores user preferences using Y-CRDT Maps
This shows how Eidetica allows you to choose the most appropriate data structure for each type of data within your application, optimizing for different use cases (record storage vs. collaborative editing).
Running the Todo Example
To see these concepts in action, you can run the Todo example:
# Navigate to the example directory
cd examples/todo
# Build the example
cargo build
# Run commands (this will create todo_db.json)
cargo run -- add "Learn Eidetica"
cargo run -- list
# Note the ID printed
cargo run -- complete <id_from_list>
cargo run -- list
Refer to the example's README.md and test.sh for more usage details.
This walkthrough provides a starting point. Explore the Eidetica documentation and other examples to learn about more advanced features like different store types, history traversal, and distributed capabilities.
Code Examples
This page provides focused code snippets for common tasks in Eidetica.
Assumes basic setup like use eidetica::{Instance, Database, Error, ...}; and error handling (?) for brevity.
1. Initializing the Database (Instance)
use eidetica::backend::database::InMemory;
use eidetica::Instance;
use std::path::PathBuf;
// Option A: Create a new, empty in-memory database
let database_new = InMemory::new();
let db_new = Instance::new(Box::new(database_new));
// Option B: Load from a previously saved file
let db_path = PathBuf::from("my_database.json");
if db_path.exists() {
match InMemory::load_from_file(&db_path) {
Ok(database_loaded) => {
let db_loaded = Instance::new(Box::new(database_loaded));
println!("Database loaded successfully.");
// Use db_loaded
}
Err(e) => {
eprintln!("Error loading database: {}", e);
// Handle error, maybe create new
}
}
} else {
println!("Database file not found, creating new.");
// Use db_new from Option A
}
2. Creating or Loading a Database
use eidetica::crdt::Doc;
let db: Instance = /* obtained from step 1 */;
let tree_name = "my_app_data";
let auth_key = "my_key"; // Must match a key added to the database
let database = match db.find_database(tree_name) {
Ok(mut databases) => {
println!("Found existing database: {}", tree_name);
databases.pop().unwrap() // Assume first one is correct
}
Err(e) if e.is_not_found() => {
println!("Creating new database: {}", tree_name);
let mut doc = Doc::new();
doc.set("name", tree_name);
db.new_database(doc, auth_key)? // All databases require authentication
}
Err(e) => return Err(e.into()), // Propagate other errors
};
println!("Using Database with root ID: {}", database.root_id());
3. Writing Data (DocStore Example)
use eidetica::store::DocStore;
let database: Database = /* obtained from step 2 */;
// Start an authenticated transaction (automatically uses the database's default key)
let op = database.new_transaction()?;
{
// Get the DocStore store handle (scoped)
let config_store = op.get_subtree::<DocStore>("configuration")?;
// Set some values
config_store.set("api_key", "secret-key-123")?;
config_store.set("retry_count", "3")?;
// Overwrite a value
config_store.set("api_key", "new-secret-456")?;
// Remove a value
config_store.remove("old_setting")?; // Ok if it doesn't exist
}
// Commit the changes atomically
let entry_id = op.commit()?;
println!("DocStore changes committed in entry: {}", entry_id);
4. Writing Data (Table Example)
use eidetica::store::Table;
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
struct Task {
description: String,
completed: bool,
}
let database: Database = /* obtained from step 2 */;
// Start an authenticated transaction (automatically uses the database's default key)
let op = database.new_transaction()?;
let inserted_id;
{
// Get the Table handle
let tasks_store = op.get_subtree::<Table<Task>>("tasks")?;
// Insert a new task
let task1 = Task { description: "Buy milk".to_string(), completed: false };
inserted_id = tasks_store.insert(task1)?;
println!("Inserted task with ID: {}", inserted_id);
// Insert another task
let task2 = Task { description: "Write docs".to_string(), completed: false };
tasks_store.insert(task2)?;
// Update the first task (requires getting it first if you only have the ID)
if let Ok(mut task_to_update) = tasks_store.get(&inserted_id) {
task_to_update.completed = true;
tasks_store.set(&inserted_id, task_to_update)?;
println!("Updated task {}", inserted_id);
} else {
eprintln!("Task {} not found for update?", inserted_id);
}
// Remove a task (if you knew its ID)
// tasks_store.remove(&some_other_id)?;
}
// Commit all inserts/updates/removes
let entry_id = op.commit()?;
println!("Table changes committed in entry: {}", entry_id);
5. Reading Data (DocStore Viewer)
use eidetica::store::DocStore;
let database: Database = /* obtained from step 2 */;
// Get a read-only viewer for the latest state
let config_viewer = database.get_subtree_viewer::<DocStore>("configuration")?;
match config_viewer.get("api_key") {
Ok(api_key) => println!("Current API Key: {}", api_key),
Err(e) if e.is_not_found() => println!("API Key not set."),
Err(e) => return Err(e.into()),
}
match config_viewer.get("retry_count") {
Ok(count_str) => {
// Note: DocStore values can be various types
let count: u32 = count_str.parse().unwrap_or(0);
println!("Retry Count: {}", count);
}
Err(_) => println!("Retry count not set or invalid."),
}
6. Reading Data (Table Viewer)
use eidetica::store::Table;
// Assume Task struct from example 4
let database: Database = /* obtained from step 2 */;
// Get a read-only viewer
let tasks_viewer = database.get_subtree_viewer::<Table<Task>>("tasks")?;
// Get a specific task by ID
let id_to_find = /* obtained previously, e.g., inserted_id */;
match tasks_viewer.get(&id_to_find) {
Ok(task) => println!("Found task {}: {:?}", id_to_find, task),
Err(e) if e.is_not_found() => println!("Task {} not found.", id_to_find),
Err(e) => return Err(e.into()),
}
// Iterate over all tasks
println!("\nAll Tasks:");
match tasks_viewer.iter() {
Ok(iter) => {
for result in iter {
match result {
Ok((id, task)) => println!(" ID: {}, Task: {:?}", id, task),
Err(e) => eprintln!("Error reading task during iteration: {}", e),
}
}
}
Err(e) => eprintln!("Error creating iterator: {}", e),
}
7. Working with Nested Data (ValueEditor)
use eidetica::store::{DocStore, Value};
let database: Database = /* obtained from step 2 */;
// Start an authenticated transaction (automatically uses the database's default key)
let op = database.new_transaction()?;
// Get the DocStore store handle
let user_store = op.get_subtree::<DocStore>("users")?;
// Using ValueEditor to create and modify nested structures
{
// Get an editor for a specific user
let user_editor = user_store.get_value_mut("user123");
// Set profile information with method chaining - creates paths as needed
user_editor
.get_value_mut("profile")
.get_value_mut("name")
.set(Value::String("Jane Doe".to_string()))?;
user_editor
.get_value_mut("profile")
.get_value_mut("email")
.set(Value::String("jane@example.com".to_string()))?;
// Set preferences as a map
let mut preferences = Map::new(); // Map: the CRDT map type behind Value::Map (import omitted for brevity)
preferences.set_string("theme".to_string(), "dark".to_string());
preferences.set_string("notifications".to_string(), "enabled".to_string());
user_editor
.get_value_mut("preferences")
.set(Value::Map(preferences))?;
// Add to preferences using the editor
user_editor
.get_value_mut("preferences")
.get_value_mut("language")
.set(Value::String("en".to_string()))?;
// Delete a specific preference
user_editor
.get_value_mut("preferences")
.delete_child("notifications")?;
}
// Commit the changes
let entry_id = op.commit()?;
println!("ValueEditor changes committed in entry: {}", entry_id);
// Read back the nested data
let viewer_op = database.new_transaction()?;
let viewer_store = viewer_op.get_subtree::<DocStore>("users")?;
// Get the user data and navigate through it
if let Ok(user_data) = viewer_store.get("user123") {
if let Value::Map(user_map) = user_data {
// Access profile
if let Some(Value::Map(profile)) = user_map.get("profile") {
if let Some(Value::String(name)) = profile.get("name") {
println!("User name: {}", name);
}
}
// Access preferences
if let Some(Value::Map(prefs)) = user_map.get("preferences") {
println!("User preferences:");
for (key, value) in prefs.as_hashmap() {
match value {
Value::String(val) => println!(" {}: {}", key, val),
Value::Deleted => println!(" {}: [deleted]", key),
_ => println!(" {}: [complex value]", key),
}
}
}
}
}
// Using ValueEditor to read nested data (alternative to manual navigation)
{
let editor = viewer_store.get_value_mut("user123");
// Get profile name
match editor.get_value_mut("profile").get_value("name") {
Ok(Value::String(name)) => println!("User name (via editor): {}", name),
_ => println!("Name not found or not a string"),
}
// Check if a preference exists
match editor.get_value_mut("preferences").get_value("notifications") {
Ok(_) => println!("Notifications setting exists"),
Err(e) if e.is_not_found() => println!("Notifications setting was deleted"),
Err(_) => println!("Error accessing notifications setting"),
}
}
// Using get_root_mut to access the entire store
{
let root_editor = viewer_store.get_root_mut();
println!("\nAll users in store:");
match root_editor.get() {
Ok(Value::Map(users)) => {
for (user_id, _) in users.as_hashmap() {
println!(" User ID: {}", user_id);
}
},
_ => println!("No users found or error accessing store"),
}
}
8. Working with Y-CRDT Documents (YDoc)
The YDoc store provides access to Y-CRDT (Yrs) documents for collaborative data structures. This requires the "y-crdt" feature flag.
use eidetica::store::YDoc;
use eidetica::y_crdt::{Map as YMap, Transact};
let database: Database = /* obtained from step 2 */;
// Start an authenticated transaction (automatically uses the database's default key)
let op = database.new_transaction()?;
// Get the YDoc store handle
let user_info_store = op.get_subtree::<YDoc>("user_info")?;
// Writing to Y-CRDT document
user_info_store.with_doc_mut(|doc| {
let user_info_map = doc.get_or_insert_map("user_info");
let mut txn = doc.transact_mut();
user_info_map.insert(&mut txn, "name", "Alice Johnson");
user_info_map.insert(&mut txn, "email", "alice@example.com");
user_info_map.insert(&mut txn, "bio", "Software developer");
Ok(())
})?;
// Commit the transaction
let entry_id = op.commit()?;
println!("YDoc changes committed in entry: {}", entry_id);
// Reading from Y-CRDT document
let read_op = database.new_transaction()?;
let reader_store = read_op.get_subtree::<YDoc>("user_info")?;
reader_store.with_doc(|doc| {
let user_info_map = doc.get_or_insert_map("user_info");
let txn = doc.transact();
println!("User Information:");
if let Some(name) = user_info_map.get(&txn, "name") {
let name_str = name.to_string(&txn);
println!("Name: {name_str}");
}
if let Some(email) = user_info_map.get(&txn, "email") {
let email_str = email.to_string(&txn);
println!("Email: {email_str}");
}
if let Some(bio) = user_info_map.get(&txn, "bio") {
let bio_str = bio.to_string(&txn);
println!("Bio: {bio_str}");
}
Ok(())
})?;
// Working with nested Y-CRDT maps
let prefs_op = database.new_transaction()?;
let prefs_store = prefs_op.get_subtree::<YDoc>("user_prefs")?;
prefs_store.with_doc_mut(|doc| {
let prefs_map = doc.get_or_insert_map("preferences");
let mut txn = doc.transact_mut();
prefs_map.insert(&mut txn, "theme", "dark");
prefs_map.insert(&mut txn, "notifications", "enabled");
prefs_map.insert(&mut txn, "language", "en");
Ok(())
})?;
prefs_op.commit()?;
// Reading preferences
let prefs_read_op = database.new_transaction()?;
let prefs_read_store = prefs_read_op.get_subtree::<YDoc>("user_prefs")?;
prefs_read_store.with_doc(|doc| {
let prefs_map = doc.get_or_insert_map("preferences");
let txn = doc.transact();
println!("User Preferences:");
// Iterate over all preferences
for (key, value) in prefs_map.iter(&txn) {
let value_str = value.to_string(&txn);
println!("{key}: {value_str}");
}
Ok(())
})?;
YDoc Features:
- Collaborative Editing: Y-CRDT documents provide conflict-free merging for concurrent modifications
- Rich Data Types: Support for Maps, Arrays, Text, and other Y-CRDT types
- Functional Interface: Access via with_doc() for reads and with_doc_mut() for writes
- Atomic Integration: Changes are staged within the Transaction and committed atomically
Use Cases for YDoc:
- User profiles and preferences (as shown in the todo example)
- Collaborative documents and shared state
- Real-time data synchronization
- Any scenario requiring conflict-free concurrent updates
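Y-CRDT merge semantics are sophisticated, but the core property — replicas converge to the same state regardless of merge order — can be illustrated with a much simpler last-writer-wins map. This sketch is illustrative only and is not how Yrs works internally:

```rust
use std::collections::HashMap;

type Reg = (u64, String); // (logical timestamp, value)

/// Keep, per key, the entry with the higher (timestamp, value) pair.
/// Breaking timestamp ties on the value keeps the merge commutative,
/// which is what makes the replicas converge in any merge order.
fn merge(a: &HashMap<String, Reg>, b: &HashMap<String, Reg>) -> HashMap<String, Reg> {
    let mut out = a.clone();
    for (k, v) in b {
        let keep_existing = matches!(out.get(k), Some(cur) if cur >= v);
        if !keep_existing {
            out.insert(k.clone(), v.clone());
        }
    }
    out
}

fn main() {
    // Two replicas concurrently set "theme" with different timestamps.
    let r1: HashMap<_, _> = [("theme".to_string(), (1, "dark".to_string()))].into();
    let r2: HashMap<_, _> = [("theme".to_string(), (2, "light".to_string()))].into();
    // Merging in either order yields the same converged state.
    assert_eq!(merge(&r1, &r2), merge(&r2, &r1));
    println!("theme = {}", merge(&r1, &r2)["theme"].1);
}
```

Real CRDTs like Yrs track much richer causality information (so they can merge text and arrays, not just registers), but the "deterministic merge, no coordination" principle is the same.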
9. Saving the Database (InMemory)
use eidetica::backend::database::InMemory;
use std::path::PathBuf;
let db: Instance = /* database instance */;
let db_path = PathBuf::from("my_database.json");
// Lock the database mutex
let database_guard = db.backend().lock().map_err(|_| anyhow::anyhow!("Failed to lock database mutex"))?;
// Downcast to the concrete InMemory type
if let Some(in_memory_database) = database_guard.as_any().downcast_ref::<InMemory>() {
match in_memory_database.save_to_file(&db_path) {
Ok(_) => println!("Database saved successfully to {:?}", db_path),
Err(e) => eprintln!("Error saving database: {}", e),
}
} else {
eprintln!("Database is not InMemory, cannot save to file this way.");
}
Eidetica Architecture Overview
Eidetica is a decentralized database designed to "Remember Everything." This document outlines the architecture and how different components interact with each other.
Eidetica is built on a foundation of content-addressable entries organized in databases, with a pluggable backend system for storage. Entry objects are immutable and contain Tree/SubTree structures that form the Merkle-DAG, with integrated authentication using Ed25519 digital signatures. The system provides Database and Store abstractions over these internal structures to enable efficient merging and synchronization of distributed data.
See the Core Components section for details on the key building blocks.
graph TD
A[User Application] --> B[Instance]
B --> C[Database]
C --> T[Transaction]
T --> S[Stores: DocStore, Table, etc.]
subgraph Backend Layer
C --> BE[Backend: InMemory, etc.]
BE --> D[Entry Storage]
end
subgraph Entry Internal Structure
H[EntryBuilder] -- builds --> E[Entry]
E -- contains --> I[TreeNode]
E -- contains --> J[SubTreeNode Vector]
E -- contains --> K[SigInfo]
I --> IR[Root ID, Parents, Metadata]
J --> JR[Name, Parents, Data]
end
subgraph Authentication System
K --> N[SigKey]
K --> O[Signature]
L[AuthValidator] -- validates --> E
L -- uses --> M[_settings subtree]
Q[CryptoModule] -- signs/verifies --> E
end
subgraph User Abstractions
C -.-> |"provides view over"| I
S -.-> |"provides view over"| J
end
T -- uses --> H
H -- stores --> BE
C -- uses --> L
S -- modifies --> J
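The "content-addressable" property above can be made concrete with a tiny sketch: each entry's ID is a hash of its own content, and that content includes its parents' IDs, so entries form a tamper-evident DAG. This is illustrative only — Eidetica's real entries use cryptographic hashes and Ed25519 signatures, not std's non-cryptographic hasher:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustrative entry: identified by a hash of its own content,
// which includes the IDs of its parents (forming a DAG).
#[derive(Hash)]
struct Entry {
    parents: Vec<u64>,
    data: String,
}

fn id_of(entry: &Entry) -> u64 {
    let mut h = DefaultHasher::new();
    entry.hash(&mut h);
    h.finish()
}

fn main() {
    let root = Entry { parents: vec![], data: "root".into() };
    let root_id = id_of(&root);
    let child = Entry { parents: vec![root_id], data: "child".into() };
    // Changing any content (or any parent ID) changes the entry's ID,
    // which is why entries are immutable once created.
    let tampered = Entry { parents: vec![root_id], data: "child!".into() };
    assert_ne!(id_of(&child), id_of(&tampered));
    println!("root {root_id:x} -> child {:x}", id_of(&child));
}
```

Because an entry's ID commits to its whole ancestry, two peers that hold the same ID are guaranteed to hold the same history — the property synchronization relies on.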
Architectural Terminology
This document clarifies the important distinction between internal data structure names and user-facing API abstractions in Eidetica's architecture.
Overview
Eidetica uses two parallel naming schemes that serve different purposes:
- Internal Data Structures: TreeNode/SubTreeNode - the actual Merkle-DAG data structures
- User-Facing Abstractions: Database/Store - high-level views over these structures
Understanding this distinction is crucial for maintaining consistency in code, documentation, and APIs.
Internal Data Structures
TreeNode and SubTreeNode
These are the fundamental building blocks of the Merkle-DAG, defined within the Entry module:
- TreeNode: The internal representation of the main tree node within an Entry
  - Contains the root ID, parent references, and metadata
  - Represents the core structural data of the Merkle-DAG
  - Always singular per Entry
- SubTreeNode: The internal representation of named subtree nodes within an Entry
  - Contains the subtree name, parent references, and data payload
  - Multiple SubTreeNodes can exist per Entry
  - Each represents a named partition of data (analogous to tables)
When to Use Tree/SubTree Terminology
- When discussing the actual data structures within Entry
- In Entry module documentation and implementation
- When describing the Merkle-DAG at the lowest level
- In comments that deal with the serialized data format
- When explaining parent-child relationships in the DAG
User-Facing Abstractions
Database and Store
These represent the current high-level abstraction layer that users interact with:
- Database: A collection of related entries with shared authentication and history
  - Provides a view over a tree of entries
  - Manages operations, authentication, and synchronization
  - What users think of as a "database" or "collection"
- Store: Typed data access patterns within a database
  - DocStore, Table, and YDoc are concrete Store implementations
  - Provide familiar APIs (key-value, document, collaborative editing)
  - Each Store operates on a named subtree within entries
When to Use Database/Store Terminology
- In all public APIs and user-facing documentation
- In user guides, tutorials, and examples
- When describing current application-level concepts
- In error messages shown to users
- In logging that users might see
Future Abstraction Layers
Database/Store represents the current abstraction over TreeNode/SubTreeNode structures, but it is not the only possible abstraction. Future versions of Eidetica may introduce alternative abstraction layers that provide different views or APIs over the same underlying layered Merkle-DAG structures.
The key principle is that TreeNode/SubTreeNode remain the stable internal representation, while various abstractions can be built on top to serve different use cases or API paradigms.
The Relationship
User Application
↓
Database ←─ User-facing abstraction
↓
Transaction ←─ Operations layer
↓
Entry ←─ Contains TreeNode + SubTreeNodes
↓ (internal data structures)
Backend ←─ Storage layer
- A Database provides operations over a tree of Entry objects
- Each Entry contains one TreeNode and multiple SubTreeNode structures
- Store implementations provide typed access to specific SubTreeNode data
- Users never directly interact with TreeNode/SubTreeNode
Code Guidelines
Internal Implementation
// Correct - dealing with Entry internals
entry.tree.root // TreeNode field
entry.subtrees.iter() // SubTreeNode collection
builder.set_subtree_data_mut() // Working with subtree data structures
Public APIs
// Correct - user-facing abstractions
database.new_transaction() // Database operations
transaction.get_store::<DocStore>("users") // Store access
instance.create_database("mydata") // Database management
Documentation
- Internal docs: Can reference both levels, explaining their relationship
- User guides: Only use Database/Store terminology
- API docs: Use Database/Store exclusively
- Code comments: Use appropriate terminology for the level being discussed
Rationale
This dual naming scheme serves several important purposes:
- Separation of Concerns: Internal structures focus on correctness and efficiency, while abstractions focus on usability
- API Stability: Users interact with stable Database/Store concepts, while internal TreeNode/SubTreeNode structures can evolve
- Conceptual Clarity: Users think in terms of databases and data stores, not Merkle-DAG nodes
- Implementation Flexibility: Internal refactoring doesn't affect user-facing terminology
- Domain Appropriateness: Tree/Subtree accurately describes the Merkle-DAG structure, while Database/Store matches user mental models
Core Components
The architectural foundation of Eidetica, implementing the Merkle-CRDT design principles through a carefully orchestrated set of interconnected components.
Component Overview
These components work together to provide Eidetica's unique combination of features: content-addressable storage, cryptographic authentication, conflict-free synchronization, and flexible data access patterns.
Architecture Layers
Entry: The fundamental data unit containing TreeNode and SubTreeNode structures - immutable, content-addressable, and cryptographically signed
Database: User-facing abstraction providing operations over trees of entries with independent history and authentication policies
Instance: The main database orchestration layer managing databases, authentication, and storage
Transaction: Transaction mechanism providing atomic operations across multiple stores
Data Access and Storage
Stores: User-facing typed data access patterns (DocStore, Table, YDoc) that provide application-friendly interfaces over subtree data
Backend: Pluggable storage abstraction supporting different persistence strategies
CRDT: Conflict-free data types enabling distributed merging and synchronization
Security and Synchronization
Authentication: Ed25519-based cryptographic system for signing and verification
Synchronization: Distributed sync protocols built on the Merkle-DAG foundation
Terminology Note
Eidetica uses a dual terminology system:
- Internal structures: TreeNode/SubTreeNode refer to the actual Merkle-DAG data structures within entries
- User abstractions: Database/Store refer to the high-level APIs and concepts users interact with
See Terminology for detailed guidelines on when to use each naming scheme.
Entry
The fundamental building block of Eidetica's data model, representing an immutable, cryptographically-signed unit of data within the Merkle-DAG structure.
Conceptual Role
Entries serve as the atomic units of both data storage and version history in Eidetica. They combine the functions of:
- Data Container: Holding actual application data and metadata
- Version Node: Linking to parent entries to form a history DAG
- Authentication Unit: Cryptographically signed to ensure integrity and authorization
- Content-Addressable Object: Uniquely identified by their content hash for deduplication and verification
Internal Data Structure
Entry contains two fundamental internal data structures that form the Merkle-DAG:
TreeNode: The main tree node containing:
- Root ID of the tree this entry belongs to
- Parent entry references for the main tree history
- Optional metadata (not merged with other entries)
SubTreeNodes: Named subtree nodes, each containing:
- Subtree name (analogous to store/table names)
- Parent entry references specific to this subtree's history
- Serialized CRDT data payload for this subtree
Authentication Envelope: Every entry includes signature information that proves authorization and ensures tamper-detection.
Relationship to User Abstractions
While entries internally use TreeNode and SubTreeNode structures, users interact with higher-level abstractions:
- Database: Provides operations over the tree of entries (uses TreeNode data)
- Stores: Typed access patterns (DocStore, Tables, etc.) over subtree data (uses SubTreeNode data)
This separation allows the internal Merkle-DAG structures to remain efficient and correct while providing user-friendly APIs.
Identity and Integrity
Content-Addressable Identity: Each entry's ID is a SHA-256 hash of its canonical content, making entries globally unique and enabling efficient deduplication.
Deterministic Hashing: IDs are computed from a canonical JSON representation, ensuring identical entries produce identical IDs across different systems.
Immutability Guarantee: Once created, entries cannot be modified, ensuring the integrity of the historical record and cryptographic signatures.
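The identity properties above can be sketched in a few lines. This is illustrative only: Eidetica hashes a canonical JSON representation with SHA-256, while this sketch uses a hand-rolled canonical string and std's DefaultHasher as a stand-in, with hypothetical field names. The point it demonstrates is that logically identical entries (same root, same parent set, same data) always produce the same ID.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative stand-in for content-addressable entry IDs.
/// The real system hashes canonical JSON with SHA-256; here a canonical
/// string plus DefaultHasher demonstrates the same determinism property.
fn entry_id(root: &str, parents: &[&str], payload: &str) -> String {
    // Canonical form: fixed field order, parents sorted, so logically
    // identical entries serialize (and therefore hash) identically.
    let mut sorted = parents.to_vec();
    sorted.sort_unstable();
    let canonical = format!("root={root};parents={};data={payload}", sorted.join(","));
    let mut h = DefaultHasher::new();
    canonical.hash(&mut h);
    format!("{:016x}", h.finish())
}
```

Because parent order is normalized before hashing, two systems that build the same entry from tips discovered in different orders still agree on its ID.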
Design Benefits
Distributed Synchronization: Content-addressable IDs enable efficient sync protocols where systems can identify missing or conflicting entries.
Cryptographic Verification: Signed entries provide strong guarantees about data authenticity and integrity.
Granular History: The DAG structure enables sophisticated queries like "show me all changes since timestamp X" or "merge these two concurrent branches".
Efficient Storage: Identical entries are automatically deduplicated, and metadata can be stored separately from bulk data.
Internal Data Structure Detail
An Entry contains the following internal data structures:
struct Entry {
    // Main tree node - the core Merkle-DAG structure
    tree: TreeNode {
        root: ID,                  // Root entry ID of the tree
        parents: Vec<ID>,          // Parent entries in main tree history
        metadata: Option<RawData>, // Optional metadata (not merged)
    },
    // Named subtree nodes - independent data partitions
    subtrees: Vec<SubTreeNode> {
        name: String,     // Subtree name (e.g., "users", "posts")
        parents: Vec<ID>, // Parent entries specific to this subtree
        data: RawData,    // Serialized CRDT data for this subtree
    },
    // Authentication and signature information
    sig: SigInfo {
        sig: Option<String>, // Base64-encoded Ed25519 signature
        key: SigKey,         // Reference to signing key
    },
}
// Where:
type RawData = String; // JSON-serialized CRDT structures
type ID = String;      // SHA-256 content hash (hex-encoded)
Key Design Points
- TreeNode: Represents the entry's position in the main Merkle-DAG tree structure
- SubTreeNodes: Enable independent histories for different data partitions within the same entry
- Separation: The tree structure (TreeNode) is separate from the data partitions (SubTreeNodes)
- Multiple Histories: Each entry can participate in one main tree history plus multiple independent subtree histories
Backend
Pluggable storage abstraction layer supporting different storage implementations.
Database Trait
Abstracts underlying storage to allow different backends without changing core logic.
Core Operations:
- Entry storage and retrieval by content-addressable ID
- Verification status tracking for authentication
- Database and store tip calculation
- Topological sorting for consistent entry ordering
Current Implementation
InMemory: HashMap-based storage with JSON file persistence
- Stores entries and verification status
- Includes save/load functionality for state preservation
- Supports all Database trait operations
Verification Status
Verified: Entry cryptographically verified and authorized
Unverified: Entry lacks authentication or failed verification
Status determined during commit based on signature validation and permission checking.
Key Features
Entry Storage: Immutable entries with content-addressable IDs
Tip Calculation: Identifies entries with no children in databases/stores
Height Calculation: Computes topological heights for proper ordering
Graph Traversal: Efficient DAG navigation for database operations
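Tip calculation, mentioned above, reduces to a set difference over the DAG: a tip is any entry that no other entry references as a parent. A minimal sketch, with an illustrative in-memory representation rather than the actual Database trait types:

```rust
use std::collections::HashSet;

/// A tip is an entry that no other entry lists as a parent.
/// `entries` pairs each entry ID with its parent IDs (illustrative types,
/// not the real backend representation).
fn tips<'a>(entries: &'a [(&'a str, Vec<&'a str>)]) -> Vec<&'a str> {
    // Collect every ID that appears as somebody's parent.
    let referenced: HashSet<&str> = entries
        .iter()
        .flat_map(|(_, parents)| parents.iter().copied())
        .collect();
    // Tips are the entries nobody references.
    entries
        .iter()
        .map(|(id, _)| *id)
        .filter(|id| !referenced.contains(id))
        .collect()
}
```

Two concurrent children of the same parent are both tips, which is exactly the "frontier" the sync protocol exchanges.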
Custom Backend Implementation
Implement Database trait with:
- Storage-specific logic for all trait methods
- Verification status tracking support
- Thread safety (Send + Sync + Any)
- Performance considerations for graph operations
Instance
Purpose and Architecture
Instance acts as the orchestration layer between application code and the underlying storage systems. It manages multiple independent Databases, handles cryptographic authentication, and coordinates with pluggable storage backends.
Each Instance maintains a unique device identity through an automatically generated Ed25519 keypair, enabling secure multi-device synchronization.
Key Responsibilities
Database Management: Creates and provides access to Databases, each representing an independent history of data entries.
Authentication Infrastructure: Manages Ed25519 private keys for signing operations and validating permissions. All operations require authenticated access.
Backend Coordination: Interfaces with pluggable storage backends (currently just InMemory) while abstracting storage details from higher-level code.
Device Identity: Automatically maintains device-specific cryptographic identity for sync operations.
Design Principles
- Authentication-First: Every operation requires cryptographic validation
- Pluggable Storage: Storage backends can be swapped without affecting application logic
- Multi-Database: Supports multiple independent data collections within a single instance
- Sync-Ready: Built-in device identity and hooks for distributed synchronization
Database
Represents an independent, versioned collection of data entries within Eidetica, analogous to a database in a traditional database system.
Conceptual Model
Databases organize related data entries into a coherent unit with its own history and authentication policies. Each Database is identified by its root entry's content-addressable ID, making it globally unique and verifiable.
Unlike traditional databases, Databases maintain full historical data through a Merkle DAG structure, enabling features like:
- Conflict-free merging of concurrent changes
- Cryptographic verification of data integrity
- Decentralized synchronization across devices
- Point-in-time queries (not yet implemented)
Architecture and Lifecycle
Database Creation: Initialized with settings (stored as a Doc CRDT) and associated with an authentication key for signing operations.
Data Access: Applications interact with Databases through Transaction instances, which provide transactional semantics and store access.
Entry History: Each operation creates new entries that reference their parents, building an immutable history DAG.
Settings Management: Database-level configuration (permissions, sync settings, etc.) is stored as CRDT data, allowing distributed updates.
Authentication
Each Database maintains its own authentication configuration in the special _settings store. All entries must be cryptographically signed with Ed25519 signatures - there are no unsigned entries in Eidetica.
Databases support direct keys, delegation to other databases for flexible cross-project authentication, and a three-tier permission hierarchy (Admin, Write, Read) with priority-based key management. Authentication changes merge deterministically using Last-Write-Wins semantics.
For complete details, see Authentication.
Integration Points
Store Access: Databases provide typed access to different data structures (DocStore, Table, YDoc) through the store system.
Synchronization: Databases serve as the primary unit of synchronization, with independent merge and conflict resolution.
Transaction
Atomic transaction mechanism for database modifications.
Lifecycle
- Creation: Initialize with current database tips as parents
- Store Access: Get typed handles for data manipulation
- Staging: Accumulate changes in internal entry
- Commit: Sign, validate, and store finalized entry
Features
- Multiple store changes in single commit
- Automatic authentication using database's default key
- Type-safe store access
- Cryptographic signing and validation
Integration
Entry Management: Creates and manages entries via EntryBuilder
Authentication: Signs operations and validates permissions
CRDT Support: Enables store conflict resolution
Backend Storage: Stores entries with verification status
Authentication
Comprehensive Ed25519-based cryptographic authentication system that ensures data integrity and access control across Eidetica's distributed architecture.
Overview
Eidetica implements mandatory authentication for all entries - there are no unsigned entries in the system. Every operation requires valid Ed25519 signatures, providing strong guarantees about data authenticity and enabling sophisticated access control in decentralized environments.
The authentication system is deeply integrated with the core database, not merely a consumer of the API. This tight integration enables efficient validation, deterministic conflict resolution during network partitions, and preservation of historical validity.
Architecture
Storage Location: Authentication configuration resides in the special _settings.auth store of each Database, using the Doc CRDT for deterministic conflict resolution.
Validation Component: The AuthValidator provides centralized entry validation with performance-optimized caching.
Signature Format: All entries include authentication information in their structure:
{
"auth": {
"sig": "ed25519_signature_base64_encoded",
"key": "KEY_NAME_OR_DELEGATION_PATH"
}
}
Permission Hierarchy
Three-tier permission model with integrated priority system:
| Permission | Settings Access | Key Management | Data Write | Data Read | Priority |
|---|---|---|---|---|---|
| Admin | ✓ | ✓ | ✓ | ✓ | 0-2^32 |
| Write | ✗ | ✗ | ✓ | ✓ | 0-2^32 |
| Read | ✗ | ✗ | ✗ | ✓ | None |
Priority Semantics:
- Lower numbers = higher priority (0 is highest)
- Admin/Write permissions include u32 priority value
- Keys can only modify other keys with equal or lower priority
- Priority affects administrative operations, NOT CRDT merge resolution
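The "equal or lower priority" rule above is a one-line comparison once you remember that lower numbers rank higher. A hedged sketch (function name is illustrative, not the library API):

```rust
/// Lower number = higher priority; 0 is the highest priority.
/// A key may only modify keys of equal or lower priority, i.e. keys
/// whose numeric value is equal or larger. Illustrative sketch only.
fn can_modify(actor_priority: u32, target_priority: u32) -> bool {
    actor_priority <= target_priority
}
```

So a priority-0 admin key can manage everything, while a priority-10 key can never touch the priority-0 key that created it.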
Key Management
Direct Keys
Ed25519 public keys stored directly in the database's _settings.auth:
{
"_settings": {
"auth": {
"KEY_LAPTOP": {
"pubkey": "ed25519:BASE64_PUBLIC_KEY",
"permissions": "write:10",
"status": "active"
}
}
}
}
Key Lifecycle
Keys transition between two states:
- Active: Can create new entries, all operations permitted
- Revoked: Cannot create new entries, historical entries remain valid
This design preserves the integrity of historical data while preventing future use of compromised keys.
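The two-state lifecycle can be captured in a few lines. This is a sketch with illustrative names (the real status type and validation entry points live in the auth module): revocation gates new entries, while validity of an already-committed entry depends only on whether it was valid when signed.

```rust
/// Sketch of the two-state key lifecycle (names are illustrative).
#[derive(PartialEq)]
enum KeyStatus {
    Active,
    Revoked,
}

/// A new entry is accepted only if the signing key is currently Active.
fn may_create_entry(status: &KeyStatus) -> bool {
    *status == KeyStatus::Active
}

/// Entries already in the DAG stay valid regardless of the key's
/// current status - history is never retroactively invalidated.
fn historical_entry_valid(_status_now: &KeyStatus, was_valid_when_signed: bool) -> bool {
    was_valid_when_signed
}
```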
Wildcard Keys
The special * key enables public access:
- Can grant any permission level (read, write, or admin)
- Commonly used for world-readable databases
- Subject to same revocation mechanisms as regular keys
Delegation System
Databases can delegate authentication to other databases, enabling powerful authentication patterns without granting administrative privileges on the delegating database.
Core Concepts
Delegated Database References: Any database can reference another database as an authentication source:
{
"_settings": {
"auth": {
"user@example.com": {
"permission-bounds": {
"max": "write:15",
"min": "read" // optional
},
"database": {
"root": "TREE_ROOT_ID",
"tips": ["TIP_ID_1", "TIP_ID_2"]
}
}
}
}
}
Permission Clamping
Delegated permissions are constrained by bounds:
- max: Maximum permission level (required)
- min: Minimum permission level (optional)
- Effective permission = clamp(delegated_permission, min, max)
- Priority derives from the effective permission after clamping
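The clamping rule above is easy to get backwards, so here is a minimal sketch. The Level enum and clamp function are illustrative (the real permission type also carries a priority), relying on the derived ordering Read < Write < Admin:

```rust
/// Illustrative permission levels, ordered Read < Write < Admin
/// by declaration order via the derived PartialOrd.
#[derive(Clone, Copy, PartialEq, PartialOrd, Debug)]
enum Level {
    Read,
    Write,
    Admin,
}

/// Effective permission = clamp(delegated, min, max); min is optional.
/// A delegated database can never grant more than `max`, and `min`
/// (when present) sets a floor.
fn clamp(delegated: Level, min: Option<Level>, max: Level) -> Level {
    let mut p = if delegated > max { max } else { delegated };
    if let Some(lo) = min {
        if p < lo {
            p = lo;
        }
    }
    p
}
```

With max = write, a user who holds admin in the delegated database still only gets write in the delegating one.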
Delegation Chains
Multi-level delegation supported with permission clamping at each level:
{
"auth": {
"key": [
{ "key": "org_tree", "tips": ["tip1"] },
{ "key": "team_tree", "tips": ["tip2"] },
{ "key": "ACTUAL_KEY" }
]
}
}
Tip Tracking
"Latest known tips" mechanism ensures key revocations are respected:
- Entries include delegated database tips at signing time
- Database tracks these as "latest known tips"
- Future entries must use equal or newer tips
- Prevents using old database states where revoked keys were valid
Authentication Flow
- Entry Creation: Application creates entry with auth field
- Signing: Entry signed with Ed25519 private key
- Resolution: AuthValidator resolves key (direct or delegated)
- Status Check: Verify key is Active (not Revoked)
- Tip Validation: For delegated keys, validate against latest known tips
- Permission Clamping: Apply bounds for delegated permissions
- Signature Verification: Cryptographically verify Ed25519 signature
- Permission Check: Ensure key has sufficient permissions
- Storage: Entry stored if all validations pass
Conflict Resolution
Authentication changes use Last-Write-Wins (LWW) semantics based on the DAG structure:
- Settings conflicts resolved deterministically by Doc CRDT
- Priority determines who CAN make changes
- LWW determines WHICH change wins in a conflict
- Historical entries remain valid even after permission changes
- Revoked status prevents new entries but preserves existing content
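A minimal sketch of the LWW idea, under stated assumptions: the `seq` field stands in for position in the DAG (the real Doc CRDT derives ordering from the Merkle-DAG with deterministic tie-breaking), and the type names are illustrative. What matters is that merge is commutative, so both replicas converge.

```rust
/// Sketch of Last-Write-Wins resolution for one settings key.
#[derive(Clone, PartialEq, Debug)]
struct Versioned {
    value: String,
    seq: u64, // stand-in for DAG-derived ordering
}

/// Highest sequence wins; ties broken deterministically by value so
/// that merging in either order yields the same result.
fn lww_merge(a: Versioned, b: Versioned) -> Versioned {
    if (b.seq, &b.value) > (a.seq, &a.value) { b } else { a }
}
```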
Network Partition Handling
During network splits:
- Both sides may modify authentication settings
- Upon reconnection, LWW resolves conflicts
- Most recent change (by DAG timestamp) takes precedence
- All historical entries remain valid
- Future operations follow merged authentication state
Security Considerations
Protected Against
- Unauthorized entry creation (mandatory signatures)
- Permission escalation (permission clamping)
- Historical tampering (immutable DAG)
- Replay attacks (content-addressable IDs)
- Administrative hierarchy violations (priority system)
Requires Manual Recovery
- Admin key compromise when no higher-priority key exists
- Conflicting administrative changes during partitions
Implementation Components
AuthValidator (auth/validation.rs): Core validation logic with caching
Crypto Module (auth/crypto.rs): Ed25519 operations and signature verification
AuthSettings (auth/settings.rs): Settings management and key operations
Permission Module (auth/permission.rs): Permission checking and clamping logic
See Also
- Database - How Databases integrate with authentication
- Entry - Authentication data in entry structure
- Authentication Design - Full design specification
Synchronization Architecture
This document describes the internal architecture of Eidetica's synchronization system, including design decisions, data structures, and implementation details.
Architecture Overview
The synchronization system uses a BackgroundSync architecture with command-pattern communication:
- Single background thread handling all sync operations
- Command channel communication between frontend and background
- Merkle-CRDT synchronization for conflict-free replication
- Modular transport layer supporting HTTP and Iroh P2P protocols
- Hook-based change detection for automatic sync triggering
- Persistent state tracking in sync database using DocStore
graph TB
subgraph "Application Layer"
APP[Application Code] --> TREE[Database Operations]
end
subgraph "Core Database Layer"
TREE --> ATOMICOP[Transaction]
BASEDB[Instance] --> TREE
BASEDB --> SYNC[Sync Module]
ATOMICOP --> COMMIT[Commit Operation]
COMMIT --> HOOKS[Execute Sync Hooks]
end
subgraph "Sync Frontend"
SYNC[Sync Module] --> CMDTX[Command Channel]
SYNC --> PEERMGR[PeerManager]
SYNC --> SYNCTREE[Sync Database]
HOOKS --> SYNCHOOK[SyncHookImpl]
SYNCHOOK --> CMDTX
end
subgraph "BackgroundSync Engine"
CMDTX --> BGSYNC[BackgroundSync Thread]
BGSYNC --> TRANSPORT[Transport Layer]
BGSYNC --> RETRY[Retry Queue]
BGSYNC --> TIMERS[Periodic Timers]
BGSYNC -.->|reads| SYNCTREE[Sync Database]
BGSYNC -.->|reads| PEERMGR[PeerManager]
end
subgraph "Sync State Management"
SYNCSTATE[SyncStateManager]
SYNCCURSOR[SyncCursor]
SYNCMETA[SyncMetadata]
SYNCHISTORY[SyncHistoryEntry]
SYNCSTATE --> SYNCCURSOR
SYNCSTATE --> SYNCMETA
SYNCSTATE --> SYNCHISTORY
BGSYNC --> SYNCSTATE
end
subgraph "Storage Layer"
BACKEND[(Backend Storage)]
SYNCTREE --> BACKEND
SYNCSTATE --> SYNCTREE
end
subgraph "Transport Layer"
TRANSPORT --> HTTP[HTTP Transport]
TRANSPORT --> IROH[Iroh P2P Transport]
HTTP --> NETWORK1[Network/HTTP]
IROH --> NETWORK2[Network/QUIC]
end
Core Components
1. Sync Module (sync/mod.rs)
The main Sync struct is a thin frontend that communicates with the background sync engine:
pub struct Sync {
/// Communication channel to the background sync engine
command_tx: mpsc::Sender<SyncCommand>,
/// The backend for read operations and database management
backend: Arc<dyn Database>,
/// The database containing synchronization settings
sync_tree: Database,
/// Track if transport has been enabled
transport_enabled: bool,
}
Key responsibilities:
- Provides public API methods
- Sends commands to background thread
- Manages sync database for peer/relationship storage
- Creates hooks that send commands to background
2. BackgroundSync Engine (sync/background.rs)
The BackgroundSync struct handles all sync operations in a single background thread and accesses peer state directly from the sync database:
pub struct BackgroundSync {
// Core components
transport: Box<dyn SyncTransport>,
backend: Arc<dyn Database>,
// Reference to sync database for peer/relationship management
sync_tree_id: ID,
// Server state
server_address: Option<String>,
// Retry queue for failed sends
retry_queue: Vec<RetryEntry>,
// Communication
command_rx: mpsc::Receiver<SyncCommand>,
}
BackgroundSync accesses peer and relationship data directly from the sync database:
- All peer data is stored persistently in the sync database via PeerManager
- Peer information is read on-demand when needed for sync operations
- Peer data automatically survives application restarts
- A single source of truth eliminates state synchronization issues
Command types:
pub enum SyncCommand {
// Entry operations
SendEntries { peer: String, entries: Vec<Entry> },
QueueEntry { peer: String, entry_id: ID, tree_id: ID },
// Sync control
SyncWithPeer { peer: String },
Shutdown,
// Server operations (with response channels)
StartServer { addr: String, response: oneshot::Sender<Result<()>> },
StopServer { response: oneshot::Sender<Result<()>> },
GetServerAddress { response: oneshot::Sender<Result<String>> },
// Peer connection operations
ConnectToPeer { address: Address, response: oneshot::Sender<Result<String>> },
SendRequest { address: Address, request: SyncRequest, response: oneshot::Sender<Result<SyncResponse>> },
}
Event loop architecture:
The BackgroundSync engine runs a tokio select loop that handles:
- Command processing: Immediate handling of frontend commands
- Periodic sync: Every 5 minutes, sync with all registered peers
- Retry processing: Every 30 seconds, attempt to resend failed entries
- Connection checks: Every 60 seconds, verify peer connectivity
All operations are non-blocking and handled concurrently within the single background thread.
Server initialization:
When starting a server, BackgroundSync creates a SyncHandlerImpl
with database access:
// Inside handle_start_server()
let handler = Arc::new(SyncHandlerImpl::new(
self.backend.clone(),
DEVICE_KEY_NAME,
));
self.transport.start_server(addr, handler).await?;
This enables the transport layer to process incoming sync requests and store received entries.
3. Command Pattern Architecture
The command pattern provides clean separation between the frontend and background sync engine:
Command categories:
- Entry operations (SendEntries, QueueEntry): handle network I/O for entry transmission
- Server management (StartServer, StopServer, GetServerAddress): manage transport server state
- Network operations (ConnectToPeer, SendRequest): perform async network operations
- Control (SyncWithPeer, Shutdown): coordinate background sync operations
Data access pattern:
- Peer and relationship data: Written directly to sync database by frontend, read on-demand by background
- Network operations: Handled via commands to maintain async boundaries
- Transport state: Owned and managed by background sync engine
This architecture:
- Eliminates circular dependencies: Clear ownership boundaries
- Maintains async separation: Network operations stay in background thread
- Enables direct data access: Both components access sync database directly for peer data
- Provides clean shutdown: Graceful handling in both async and sync contexts
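The command pattern described here can be boiled down to a channel plus a single consumer thread. A minimal sketch: the real engine uses tokio mpsc and oneshot channels inside an async select loop, while this std-only version (with illustrative command variants) shows the ownership shape - the frontend holds only a Sender, and the background thread owns all mutable state.

```rust
use std::sync::mpsc;
use std::thread;

/// Illustrative subset of the command enum.
enum Command {
    SyncWithPeer { peer: String },
    Shutdown,
}

/// Spawn a background worker that owns all sync state; the frontend
/// keeps only the Sender. Returns the handle so tests can observe
/// what the worker processed.
fn spawn_background() -> (mpsc::Sender<Command>, thread::JoinHandle<Vec<String>>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut synced = Vec::new();
        while let Ok(cmd) = rx.recv() {
            match cmd {
                Command::SyncWithPeer { peer } => synced.push(peer),
                Command::Shutdown => break, // graceful shutdown on request
            }
        }
        synced
    });
    (tx, handle)
}
```

Because commands are the only way into the thread, there is no shared mutable state to lock and shutdown is just one more message.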
4. Change Detection Hooks (sync/hooks.rs)
The hook system automatically detects when entries need synchronization:
pub trait SyncHook: Send + Sync {
fn on_entry_committed(&self, context: &SyncHookContext) -> Result<()>;
}
pub struct SyncHookContext {
pub tree_id: ID,
pub entry: Entry,
pub is_root_entry: bool,
}
Integration flow:
- Transaction detects entry commit
- Executes registered sync hooks with entry context
- SyncHookImpl creates QueueEntry command
- Command sent to BackgroundSync via channel
- Background thread fetches and sends entry immediately
The hook implementation is per-peer, allowing targeted synchronization. Commands are fire-and-forget to avoid blocking the commit operation.
5. Peer Management (sync/peer_manager.rs)
The PeerManager handles peer registration and relationship management:
impl PeerManager {
/// Register a new peer
pub fn register_peer(&self, pubkey: &str, display_name: Option<&str>) -> Result<()>;
/// Add database sync relationship
pub fn add_tree_sync(&self, peer_pubkey: &str, tree_root_id: &str) -> Result<()>;
/// Get peers that sync a specific database
pub fn get_tree_peers(&self, tree_root_id: &str) -> Result<Vec<String>>;
}
Data storage:
- Peers stored in peers.{pubkey} paths in the sync database
- Database relationships in peers.{pubkey}.sync_trees arrays
- Addresses in peers.{pubkey}.addresses arrays
6. Sync State Tracking (sync/state.rs)
Persistent state tracking for synchronization progress:
pub struct SyncCursor {
pub peer_pubkey: String,
pub tree_id: ID,
pub last_synced_entry: Option<ID>,
pub last_sync_time: String,
pub total_synced_count: u64,
}
pub struct SyncMetadata {
pub peer_pubkey: String,
pub successful_sync_count: u64,
pub failed_sync_count: u64,
pub total_entries_synced: u64,
pub average_sync_duration_ms: f64,
}
Storage organization:
sync_state/
├── cursors/{peer_pubkey}/{tree_id} -> SyncCursor
├── metadata/{peer_pubkey} -> SyncMetadata
└── history/{sync_id} -> SyncHistoryEntry
7. Transport Layer (sync/transports/)
Modular transport system supporting multiple protocols with the SyncHandler architecture:
pub trait SyncTransport: Send + Sync {
/// Start server with handler for processing requests
async fn start_server(&mut self, addr: &str, handler: Arc<dyn SyncHandler>) -> Result<()>;
/// Send entries to peer
async fn send_entries(&self, address: &Address, entries: &[Entry]) -> Result<()>;
/// Send sync request and get response
async fn send_request(&self, address: &Address, request: &SyncRequest) -> Result<SyncResponse>;
}
SyncHandler Architecture:
The transport layer uses a callback-based handler pattern to enable database access:
pub trait SyncHandler: Send + Sync {
/// Handle incoming sync requests with database access
async fn handle_request(&self, request: &SyncRequest) -> SyncResponse;
}
This architecture solves the fundamental problem of received data storage by:
- Providing database backend access to transport servers
- Enabling stateful request processing (GetTips, GetEntries, SendEntries)
- Maintaining clean separation between networking and sync logic
- Supporting both HTTP and Iroh transports with identical handler interface
HTTP Transport:
- REST API endpoint at /api/v0 for sync operations
- JSON serialization for wire format
- Axum-based server with handler state injection
- Standard HTTP error codes
Iroh P2P Transport:
- QUIC-based direct peer connections with handler integration
- Built-in NAT traversal
- Efficient binary protocol with JsonHandler serialization
- Bidirectional streams for request/response pattern
Data Flow
1. Entry Commit Flow
sequenceDiagram
participant App as Application
participant Database as Database
participant Transaction as Transaction
participant Hooks as SyncHooks
participant Cmd as Command Channel
participant BG as BackgroundSync
App->>Database: new_transaction()
Database->>Transaction: create with sync hooks
App->>Transaction: modify data
App->>Transaction: commit()
Transaction->>Backend: store entry
Transaction->>Hooks: execute_hooks(context)
Hooks->>Cmd: send(QueueEntry)
Cmd->>BG: deliver command
Note over BG: Background thread
BG->>BG: handle_command()
BG->>BG: fetch entry from backend
BG->>Transport: send_entries(peer, entries)
2. BackgroundSync Processing
The background thread processes commands immediately upon receipt:
- SendEntries: Transmit entries to peer, retry on failure
- QueueEntry: Fetch entry from backend and send immediately
- SyncWithPeer: Initiate bidirectional synchronization
- AddPeer/RemovePeer: Update peer registry
- CreateRelationship: Establish database-peer sync mapping
- Server operations: Start/stop transport server
Failed operations are automatically added to the retry queue with exponential backoff timing.
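The exponential backoff mentioned above can be sketched in a few lines. This is illustrative: the field names and the 30-second base interval are assumptions chosen to match the 30-second retry tick described earlier, and the real retry queue may use different constants.

```rust
use std::time::Duration;

/// Sketch of a retry-queue entry (field names illustrative).
struct RetryEntry {
    attempts: u32,
}

/// Delay doubles with each failed attempt, with the exponent capped so
/// a persistently unreachable peer tops out rather than growing forever.
fn next_delay(entry: &RetryEntry) -> Duration {
    let base = Duration::from_secs(30);
    let capped = entry.attempts.min(5); // 30s, 60s, 120s, ... up to 16 min
    base * 2u32.pow(capped)
}
```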
3. Smart Duplicate Prevention
Eidetica implements semantic duplicate prevention through Merkle-CRDT tip comparison, eliminating the need for simple "sent entry" tracking.
How It Works
Database Synchronization Process:
- Tip Exchange: Both peers share their current database tips (frontier entries)
- Gap Analysis: Compare local and remote tips to identify missing entries
- Smart Filtering: Only send entries the peer doesn't have (based on DAG analysis)
- Ancestor Inclusion: Automatically include necessary parent entries
// Background sync's smart duplicate prevention
async fn sync_tree_with_peer(&self, peer_pubkey: &str, tree_id: &ID, address: &Address) -> Result<()> {
    // Step 1: Get our tips for this database
    let our_tips = self.backend.get_tips(tree_id)?;

    // Step 2: Get peer's tips via network request
    let their_tips = self.get_peer_tips(tree_id, address).await?;

    // Step 3: Smart filtering - only send what they're missing
    let entries_to_send = self.find_entries_to_send(&our_tips, &their_tips)?;
    if !entries_to_send.is_empty() {
        self.transport.send_entries(address, &entries_to_send).await?;
    }

    // Step 4: Fetch what we're missing from them
    let missing_entries = self.find_missing_entries(&our_tips, &their_tips)?;
    if !missing_entries.is_empty() {
        let entries = self.fetch_entries_from_peer(address, &missing_entries).await?;
        self.store_received_entries(entries).await?;
    }

    Ok(())
}
Benefits over Simple Tracking:
| Approach | Duplicate Prevention | Correctness | Network Efficiency |
|---|---|---|---|
| Tip-Based (Current) | ✅ Semantic understanding | ✅ Always correct | ✅ Optimal - only sends needed entries |
| Simple Tracking | ❌ Can get out of sync | ❌ May miss updates | ❌ May send unnecessary data |
Merkle-CRDT Synchronization Algorithm
Phase 1: Tip Discovery
sequenceDiagram
participant A as Peer A
participant B as Peer B
A->>B: GetTips(tree_id)
B->>A: TipsResponse([tip1, tip2, ...])
Note over A: Compare tips to identify gaps
A->>A: find_entries_to_send(our_tips, their_tips)
A->>A: find_missing_entries(our_tips, their_tips)
Phase 2: Gap Analysis
The `find_entries_to_send` method performs the DAG gap analysis:
fn find_entries_to_send(&self, our_tips: &[ID], their_tips: &[ID]) -> Result<Vec<Entry>> {
    // Find tips that the peer doesn't have
    let tips_to_send: Vec<ID> = our_tips
        .iter()
        .filter(|&tip_id| !their_tips.contains(tip_id))
        .cloned()
        .collect();

    if tips_to_send.is_empty() {
        return Ok(Vec::new()); // Peer already has everything
    }

    // Use DAG traversal to collect all necessary ancestors
    self.collect_ancestors_to_send(&tips_to_send, their_tips)
}
Phase 3: Efficient Transfer
Only entries that are genuinely missing are transferred:
- No duplicates: Tips comparison guarantees no redundant sends
- Complete data: DAG traversal ensures all dependencies included
- Bidirectional: Both peers send and receive simultaneously
- Incremental: Only new changes since last sync
Integration with Command Pattern
The smart duplicate prevention integrates seamlessly with the command architecture:
Direct Entry Sends:
// Via SendEntries command - caller determines what to send
self.command_tx.send(SyncCommand::SendEntries {
    peer: peer_pubkey.to_string(),
    entries, // no filtering - trust the caller
}).await?;
Database Synchronization:
// Via SyncWithPeer command - background sync determines what to send
self.command_tx.send(SyncCommand::SyncWithPeer {
    peer: peer_pubkey.to_string(),
}).await?;
// Background sync performs tip comparison and smart filtering
Performance Characteristics
Network Efficiency:
- O(tip_count) network requests for tip discovery
- O(missing_entries) data transfer (minimal)
- Zero redundancy in steady state
Computational Complexity:
- O(n log n) tip comparison where n = tip count
- O(m) DAG traversal where m = missing entries
- Constant memory per sync operation
State Requirements:
- No persistent tracking of individual sends needed
- Stateless operation - each sync is independent
- Self-correcting - any missed entries caught in next sync
4. Handshake Protocol
Peer connection establishment:
sequenceDiagram
participant A as Peer A
participant B as Peer B
A->>B: HandshakeRequest { device_id, public_key, challenge }
B->>B: verify signature
B->>B: register peer
B->>A: HandshakeResponse { device_id, public_key, challenge_response }
A->>A: verify signature
A->>A: register peer
Note over A,B: Both peers now registered and authenticated
Performance Characteristics
Memory Usage
BackgroundSync state: Minimal memory footprint
- Single background thread with owned state
- Retry queue: O(n) where n = failed entries pending retry
- Peer state: ~1KB per registered peer
- Relationships: ~100 bytes per peer-database relationship
Persistent state: Stored in sync database
- Sync cursors: ~200 bytes per peer-database relationship
- Metadata: ~500 bytes per peer
- History: ~300 bytes per sync operation (with cleanup)
Network Efficiency
Immediate processing:
- Commands processed as received (no batching delay)
- Failed sends added to retry queue with exponential backoff
- Automatic compression in transport layer
Background timers:
- Periodic sync: Every 5 minutes (configurable)
- Retry processing: Every 30 seconds
- Connection checks: Every 60 seconds
Concurrency
Single-threaded design:
- One background thread handles all sync operations
- No lock contention or race conditions
- Commands queued via channel (non-blocking)
Async integration:
- Tokio-based event loop
- Non-blocking transport operations
- Works in both async and sync contexts
Connection Management
Lazy Connection Establishment
Eidetica uses a lazy connection strategy where connections are established on-demand rather than immediately when peers are registered:
Key Design Principles:
- No Persistent Connections: Connections are not maintained between sync operations
- Transport-Layer Handling: Connection establishment is delegated to the transport layer
- Automatic Discovery: Background sync periodically discovers and syncs with all registered peers
- On-Demand Establishment: Connections are created when sync operations occur
Connection Lifecycle:
graph LR
subgraph "Peer Registration"
REG[register_peer] --> STORE[Store in Sync Database]
end
subgraph "Discovery & Connection"
TIMER[Periodic Timer<br/>Every 5 min] --> SCAN[Scan Active Peers<br/>from Sync Database]
SCAN --> SYNC[sync_with_peer]
SYNC --> CONN[Transport Establishes<br/>Connection On-Demand]
CONN --> XFER[Transfer Data]
XFER --> CLOSE[Connection Closed]
end
subgraph "Manual Connection"
API[connect_to_peer API] --> HANDSHAKE[Perform Handshake]
HANDSHAKE --> STORE2[Store Peer Info]
end
Benefits of Lazy Connection:
- Resource Efficient: No idle connections consuming resources
- Resilient: Network issues don't affect registered peer state
- Scalable: Can handle many peers without connection overhead
- Self-Healing: Failed connections automatically retried on next sync cycle
Connection Triggers:
- Periodic Sync (every 5 minutes):
  - BackgroundSync scans all active peers from the sync database
  - Attempts to sync with each peer's registered databases
  - Connections established as needed during sync
- Manual Sync Commands:
  - `SyncWithPeer` command triggers an immediate connection
  - `SendEntries` command establishes a connection for data transfer
- Explicit Connection:
  - `connect_to_peer()` API for manual connection establishment
  - Performs handshake and stores peer information
No Alert on Registration:
When `register_peer()` or `add_peer_address()` is called:
- Peer information is stored in the sync database
- No command is sent to BackgroundSync
- No immediate connection attempt is made
- Peer will be discovered in next periodic sync cycle (within 5 minutes)
This design ensures that peer registration is a lightweight operation that doesn't block or trigger network activity.
Transport Implementations
Iroh Transport
The Iroh transport provides peer-to-peer connectivity using QUIC with automatic NAT traversal.
Key Components:
- Relay Servers: Intermediary servers that help establish P2P connections
- Hole Punching: Direct connection establishment through NATs (~90% success rate)
- NodeAddr: Contains node ID and direct socket addresses for connectivity
- QUIC Protocol: Provides reliable, encrypted communication
Configuration via Builder Pattern:
The `IrohTransportBuilder` allows configuring:
- `RelayMode`: Controls relay server usage
  - `Default`: Uses n0's production relay servers
  - `Staging`: Uses n0's staging infrastructure
  - `Disabled`: Direct P2P only (for local testing)
  - `Custom(RelayMap)`: User-provided relay servers
- `enable_local_discovery`: mDNS for local network discovery (future feature)
Address Serialization:
When `get_server_address()` is called, Iroh returns a JSON-serialized `NodeAddrInfo` containing:
- `node_id`: The peer's cryptographic identity
- `direct_addresses`: Socket addresses where the peer can be reached
This allows peers to connect using either relay servers or direct connections, whichever succeeds first.
Connection Flow:
- Endpoint initialization with configured relay mode
- Relay servers help peers discover each other
- Attempt direct connection via hole punching
- Fall back to relay if direct connection fails
- Upgrade to direct connection when possible
HTTP Transport
The HTTP transport provides traditional client-server connectivity using REST endpoints.
Features:
- Simple JSON API at `/api/v0`
- Axum server with Tokio runtime
- Request/response pattern
- No special NAT traversal needed
Architecture Benefits
Command Pattern Advantages
Clean separation of concerns:
- Frontend handles API and database management
- Background owns transport and sync state
- No circular dependencies
Flexible communication:
- Fire-and-forget for most operations
- Request-response with oneshot channels when needed
- Graceful degradation if channel full
Reliability Features
Retry mechanism:
- Automatic retry queue for failed operations
- Exponential backoff prevents network flooding
- Configurable maximum retry attempts
- Per-entry failure tracking
State persistence:
- Sync state stored in the database via a DocStore store
- Tip comparison prevents duplicate sends without per-entry tracking
- Survives restarts and crashes
- Provides a complete audit trail of sync operations
Handshake security:
- Ed25519 signature verification
- Challenge-response protocol prevents replay attacks
- Device key management integrated with backend
- Mutual authentication between peers
Error Handling
Retry Queue Management
The BackgroundSync engine maintains a retry queue for failed send operations:
- Exponential backoff: 2^attempts seconds delay (max 64 seconds)
- Attempt tracking: Failed sends increment attempt counter
- Maximum retries: Entries dropped after configurable max attempts
- Periodic processing: Retry timer checks queue every 30 seconds
Each retry entry tracks the peer, entries to send, attempt count, and last attempt timestamp.
Transport Error Handling
- Network failures: Added to retry queue with exponential backoff
- Protocol errors: Logged and skipped
- Peer unavailable: Entries remain in retry queue
State Consistency
- Command channel full: Commands dropped (fire-and-forget)
- Hook failures: Don't prevent commit, logged as warnings
- Transport errors: Don't affect local data integrity
Testing Architecture
Current Test Coverage
The sync module maintains comprehensive test coverage across multiple test suites:
Unit Tests (6 passing):
- Hook collection execution and error handling
- Sync cursor and metadata operations
- State manager functionality
Integration Tests (78 passing):
- Basic sync operations and persistence
- HTTP and Iroh transport lifecycles
- Peer management and relationships
- DAG synchronization algorithms
- Protocol handshake and authentication
- Bidirectional sync flows
- Transport polymorphism and isolation
Test Categories
Transport Tests:
- Server lifecycle management for both HTTP and Iroh
- Client-server communication patterns
- Error handling and recovery
- Address management and peer discovery
Protocol Tests:
- Handshake with signature verification
- Version compatibility checking
- Request/response message handling
- Entry synchronization protocols
DAG Sync Tests:
- Linear chain synchronization
- Branching structure handling
- Partial overlap resolution
- Bidirectional sync flows
Implementation Status
Completed Features ✅
Architecture:
- BackgroundSync engine with command pattern
- Single background thread ownership model
- Channel-based frontend/backend communication
- Automatic runtime detection (async/sync contexts)
Core Functionality:
- HTTP and Iroh transport implementations with SyncHandler architecture
- SyncHandler trait enabling database access in transport layer
- Full protocol support (GetTips, GetEntries, SendEntries)
- Ed25519 handshake protocol with signatures
- Persistent sync state via DocStore
- Per-peer sync hook creation
- Retry queue with exponential backoff
- Periodic sync timers (5 min intervals)
State Management:
- Sync relationships tracking
- Peer registration and management
- Transport address handling
- Server lifecycle control
In Progress 🔄
High Priority:
- Complete the bidirectional sync algorithm in `sync_with_peer()`
- Implement full retry queue processing logic
Medium Priority:
- Device ID management (currently hardcoded)
- Connection health monitoring
- Update remaining test modules for new architecture
Future Enhancements 📋
Performance:
- Entry batching for large sync operations
- Compression for network transfers
- Bandwidth throttling controls
- Connection pooling
Reliability:
- Circuit breaker for problematic peers
- Advanced retry strategies
- Connection state tracking
- Automatic reconnection logic
Monitoring:
- Sync metrics collection
- Health check endpoints
- Performance dashboards
- Sync status visualization
Stores
Typed data access patterns within databases providing structured interaction with Entry RawData.
Core Concepts
SubTree Trait: Interface for typed store implementations accessed through Operation handles.
Reserved Names: Store names with an underscore prefix (e.g., `_settings`) are reserved for internal use.
Typed APIs: Handle serialization/deserialization and provide structured access to raw entry data.
Current Implementations
Table
Record-oriented store for managing collections with unique identifiers.
Features:
- Stores user-defined types (T: Serialize + Deserialize)
- Automatic UUID generation for records
- CRUD operations: insert, get, set, search
- Type-safe access via Operation::get_subtree
Use Cases: User lists, task management, any collection requiring persistent IDs.
DocStore
Document-oriented store wrapping `crdt::Doc` for nested structures and path-based access.
Features:
- Path-based operations for nested data (set_path, get_path, etc.)
- Simple key-value operations (get, set, delete)
- Support for nested map structures via Value enum
- Tombstone support for distributed deletion propagation
- Last-write-wins merge strategy
Use Cases: Configuration data, metadata, structured documents, sync state.
YDoc (Y-CRDT Integration)
Real-time collaborative editing with sophisticated conflict resolution.
Features (requires "y-crdt" feature):
- Y-CRDT algorithms for collaboration
- Differential saving for storage efficiency
- Full Y-CRDT API access
- Caching for performance optimization
Architecture:
- YrsBinary wrapper implements CRDT traits
- Differential updates vs full snapshots
- Binary update merging preserves Y-CRDT algorithms
Operations:
- Document access with safe closures
- External update application
- Incremental change tracking
Use Cases: Collaborative documents, real-time editing, complex conflict resolution.
Custom SubTree Implementation
Requirements:
- Struct implementing SubTree trait
- Handle creation linked to Transaction
- Custom API methods using Transaction interaction:
- get_local_data for staged state
- get_full_state for merged historical state
- update_subtree for staging changes
Integration
Operation Context: All stores accessed through atomic operations
CRDT Support: Stores can implement CRDT trait for conflict resolution
Serialization: Data stored as RawData strings in Entry structure
DocStore
Public store implementation providing document-oriented storage with path-based nested data access.
Overview
DocStore is a publicly available store type that provides a document-oriented interface for storing and retrieving data. It wraps the `crdt::Doc` type to provide ergonomic access patterns for nested data structures, making it ideal for configuration, metadata, and structured document storage.
Key Characteristics
Public API: DocStore is exposed as part of the public store API and can be used in applications.
Doc CRDT Based: Wraps the `crdt::Doc` type, which uses Node structures internally for deterministic merging of concurrent changes.
Path-Based Operations: Supports both flat key-value storage and path-based access to nested structures.
Important Behavior: Nested Structure Creation
Path-Based Operations Create Nested Maps
When using `set_path()` with dot-separated paths, DocStore creates nested map structures, not flat keys with dots:
// This code:
docstore.set_path("user.profile.name", "Alice")?;
// Creates this structure:
{
  "user": {
    "profile": {
      "name": "Alice"
    }
  }
}
// NOT this:
{ "user.profile.name": "Alice" } // ❌ This is NOT what happens
Accessing Nested Data
When using `get_all()` to retrieve all data, you get the nested structure and must navigate it accordingly:
let all_data = docstore.get_all()?;
// Wrong way - looking for a flat key with dots
let value = all_data.get("user.profile.name"); // ❌ Returns None
// Correct way - navigate the nested structure
if let Some(Value::Node(user_node)) = all_data.get("user") {
    if let Some(Value::Node(profile_node)) = user_node.get("profile") {
        if let Some(Value::Text(name)) = profile_node.get("name") {
            println!("Name: {}", name); // ✅ "Alice"
        }
    }
}
API Methods
Basic Operations
- `set(key, value)` - Set a simple key-value pair
- `get(key)` - Get a value by key
- `get_as<T>(key)` - Get and deserialize a value
- `delete(key)` - Delete a key (creates a tombstone)
- `get_all()` - Get all data as a Map
Path Operations
- `set_path(path, value)` - Set a value at a nested path (creates intermediate maps)
- `get_path(path)` - Get a value from a nested path
- `get_path_as<T>(path)` - Get and deserialize from a path
- `delete_path(path)` - Delete a value at a path
Path Mutation Operations
- `modify_path<F>(path, f)` - Modify an existing value at a path
- `get_or_insert_path<F>(path, default)` - Get or insert with a default
- `modify_or_insert_path<F, G>(path, modify, default)` - Modify or insert
Utility Operations
- `contains_key(key)` - Check if a key exists
- `contains_path(path)` - Check if a path exists
Usage Examples
Application Configuration
let op = database.new_transaction()?;
let config = op.get_subtree::<DocStore>("app_config")?;
// Set configuration values
config.set("app_name", "MyApp")?;
config.set_path("database.host", "localhost")?;
config.set_path("database.port", "5432")?;
config.set_path("features.auth.enabled", "true")?;
op.commit()?;
Sync State Management
DocStore is used internally for sync state tracking in the sync module:
// Creating nested sync state structure
let sync_state = op.get_subtree::<DocStore>("sync_state")?;
// Store cursor information in nested structure
let cursor_path = format!("cursors.{}.{}", peer_pubkey, tree_id);
sync_state.set_path(cursor_path, cursor_json)?;
// Store metadata in nested structure
let metadata_path = format!("metadata.{}", peer_pubkey);
sync_state.set_path(metadata_path, metadata_json)?;
// Store history in nested structure
let history_path = format!("history.{}", sync_id);
sync_state.set_path(history_path, history_json)?;
// Later, retrieve all data and navigate the structure
let all_data = sync_state.get_all()?;
// Navigate to history entries
if let Some(Value::Node(history_node)) = all_data.get("history") {
    for (sync_id, entry_value) in history_node.iter() {
        // Process each history entry
        if let Value::Text(json_str) = entry_value {
            let entry: SyncHistoryEntry = serde_json::from_str(json_str)?;
            // Use the entry...
        }
    }
}
Common Pitfalls
Expecting Flat Keys
The most common mistake is expecting `set_path("a.b.c", value)` to create a flat key `"a.b.c"` when it actually creates nested maps.
Incorrect get_all() Usage
When using `get_all()`, remember that the returned Map contains the nested structure, not flat keys:
// After: docstore.set_path("config.server.port", "8080")
let all = docstore.get_all()?;
// Wrong:
all.get("config.server.port") // Returns None
// Right:
all.get("config")
.and_then(|v| v.as_node())
.and_then(|n| n.get("server"))
.and_then(|v| v.as_node())
.and_then(|n| n.get("port")) // Returns Some(Value::Text("8080"))
Design Rationale
The nested structure approach was chosen because:
- Natural Hierarchy: Represents hierarchical data more naturally
- Partial Updates: Allows updating parts of a structure without rewriting everything
- CRDT Compatibility: Works well with Doc CRDT merge semantics
- Query Flexibility: Enables querying at any level of the hierarchy
See Also
- Doc CRDT - Underlying CRDT implementation
- Sync State Management - Primary use case for DocStore
- SubTree Trait - Base trait for all store implementations
CRDT Implementation
Trait-based system for Conflict-free Replicated Data Types enabling deterministic conflict resolution.
Core Concepts
CRDT Trait: Defines merge operation for resolving conflicts between divergent states. Requires Serialize, Deserialize, and Default implementations.
Merkle-CRDT Principles: CRDT state stored in Entry's RawData for deterministic merging across distributed systems.
Multiple CRDT Support: Different CRDT types can be used for different stores within the same database.
Doc and Node Types
Doc: The main CRDT document type that users interact with
- Wraps the internal Node structure for clean API boundaries
- Provides document-level operations (get, set, merge, etc.)
- Handles path-based operations for nested data access
- Separates user-facing API from internal implementation
Node: Internal database structure implementing the actual CRDT logic
- Supports nested maps and values via the Value enum
- Value types: Text (string), Node (nested map), List (ordered collection), Deleted (tombstone)
- Recursive merging for nested structures
- Last-write-wins strategy for conflicting values
- Type-aware conflict resolution
Tombstones
Critical for distributed deletion propagation:
- Mark data as deleted instead of physical removal
- Retained and synchronized between replicas
- Ensure deletions propagate to all nodes
- Prevent resurrection of deleted data
Merge Algorithm
LCA-Based Computation: Uses Lowest Common Ancestor for efficient state calculation
Process:
- Identify parent entries (tips) for store
- Find LCA if multiple parents exist
- Merge all paths from LCA to parent tips
- Cache results for performance
Caching: Automatic caching of computed states with (Entry_ID, Store) keys for dramatic performance improvements.
Custom CRDT Implementation
Requirements:
- Struct implementing Default, Serialize, Deserialize
- Data marker trait implementation
- CRDT trait with deterministic merge logic
- Optional SubTree handle for user-friendly API
Data Flow
The data flow in Eidetica follows a structured sequence of interactions between core components.
Basic Flow
- User creates an Instance with a database backend
- User creates Databases within the Instance
- Operations construct immutable Entry objects through EntryBuilder
- Entries reference parent entries, forming a directed acyclic graph
- Entries are stored and retrieved through the database interface
- Authentication validates and signs entries when configured
Authentication Flow
When authentication is enabled, additional steps occur during commit:
- Entry signing with cryptographic signatures
- Permission validation for the operation type
- Bootstrap handling for initial admin configuration
- Verification status assignment based on validation results
This ensures data integrity and access control while maintaining compatibility with unsigned entries.
CRDT Caching Flow
The system uses an efficient caching layer for CRDT state computation:
- Cache lookup using Entry ID and Store as the key
- On cache miss, recursive LCA algorithm computes state and caches the result
- Cache hits return instantly for subsequent queries
- Performance scales well due to immutable entries and high cache hit rates
CRDT Principles
Eidetica implements a Merkle-CRDT using content-addressable entries organized in a Merkle DAG structure. Entries store data and maintain parent references to form a distributed version history that supports deterministic merging.
Core Concepts
- Content-Addressable Entries: Immutable data units forming a directed acyclic graph
- CRDT Trait: Enables deterministic merging of concurrent changes
- Parent References: Maintain history and define DAG structure
- Tips Tracking: Identifies current heads for efficient synchronization
Fork and Merge Support
The system supports branching and merging through parent-child relationships:
- Forking: Multiple entries can share parents, creating divergent branches
- Merging: Entries with multiple parents merge separate branches
- Deterministic Ordering: Entries sorted by height then ID for consistent results
Merge Algorithm
Uses a recursive LCA-based approach for computing CRDT states:
- Cache Check: Avoids redundant computation through automatic caching
- LCA Computation: Finds lowest common ancestor for multi-parent entries
- Recursive Building: Computes ancestor states recursively
- Path Merging: Merges all entries from LCA to parents with proper ordering
- Local Integration: Applies current entry's data to final state
Key Properties
- Correctness: Consistent state computation regardless of access patterns
- Performance: Caching eliminates redundant work
- Deterministic: Maintains ordering through proper LCA computation
- Immutable Caching: Entry immutability ensures cache validity
Testing Architecture
Eidetica employs a comprehensive testing strategy to ensure reliability and correctness. This document outlines our testing approach, organization, and best practices for developers working with or contributing to the codebase.
Test Organization
Eidetica centralizes all of its tests into a unified integration test binary located in the `tests/it/` directory. All testing is done through public interfaces, without separate unit tests, which promotes interface stability.
The main categories of testing activities are:
Comprehensive Integration Tests
All tests for the Eidetica crate are located in the `crates/lib/tests/it/` directory. These tests verify both:
- Component behavior: Validating individual components through their public interfaces
- System behavior: Ensuring different components interact correctly when used together
This unified suite is organized as a single integration test binary, following the pattern described by matklad.
The module structure within `crates/lib/tests/it/` mirrors the main library structure from `crates/lib/src/`. Each major component has its own test module directory.
Example Applications as Tests
The `examples/` directory contains standalone applications that demonstrate library features. While not traditional tests, these examples serve as pragmatic validation of the API's usability and functionality in real-world scenarios.
For instance, the `examples/todo/` directory contains a complete Todo application that demonstrates practical usage of Eidetica, effectively acting as both documentation and functional validation.
Test Coverage Goals
Eidetica maintains ambitious test coverage targets:
- Core Data Types: 95%+ coverage for all core data types (`Entry`, `Database`, `SubTree`)
- CRDT Implementations: 100% coverage for all CRDT implementations
- Database Implementations: 90%+ coverage, including error cases
- Public API Methods: 100% coverage
Testing Patterns and Practices
Test-Driven Development
For new features, we follow a test-driven approach:
- Write tests defining expected behavior
- Implement features to satisfy those tests
- Refactor while maintaining test integrity
Interface-First Testing
We exclusively test through public interfaces. This approach ensures API stability.
Test Helpers
Eidetica provides test helpers organized into main helpers (`crates/lib/tests/it/helpers.rs`) for common instance and database setup, and module-specific helpers for specialized testing scenarios. Each test module has its own `helpers.rs` file with utilities specific to that component's testing needs.
Standard Test Structure
Tests follow a consistent setup-action-assertion pattern, utilizing test helpers for environment preparation and result verification.
Error Case Testing
Tests cover both successful operations and error conditions to ensure robust error handling throughout the system.
CRDT-Specific Testing
Given Eidetica's CRDT foundation, special attention is paid to testing CRDT properties:
- Merge Semantics: Validating that merge operations produce expected results
- Conflict Resolution: Ensuring conflicts resolve according to CRDT rules
- Determinism: Verifying that operations are commutative when required
Running Tests
Basic Test Execution
Run all tests with:
cargo test
# Or using the task runner
task test
Eidetica uses nextest for test execution, which provides improved test output and performance:
cargo nextest run --workspace --all-features
Targeted Testing
Run specific test categories:
# Run all integration tests
cargo test --test it
# Run specific integration tests
cargo nextest run tests::it::store
Run tests using `cargo test --test it` for all integration tests, or target specific modules with patterns like `cargo test --test it auth::`. The project also supports `cargo nextest` for improved test output and performance.
Coverage Analysis
Eidetica uses tarpaulin for code coverage analysis:
# Run with coverage analysis
task coverage
# or
cargo tarpaulin --workspace --skip-clean --include-tests --all-features --output-dir coverage --out lcov
Module Test Organization
Each test module follows a consistent structure, with `mod.rs` for declarations, `helpers.rs` for module-specific utilities, and separate files for the different features or aspects being tested.
Contributing New Tests
When adding features or fixing bugs:
- Add focused tests to the appropriate module within the `crates/lib/tests/it/` directory. These tests should cover:
  - Specific functionality of the component or module being changed, through its public interface.
  - Interactions between the component and other parts of the system.
- Consider adding example code in the `examples/` directory for significant new features to demonstrate usage and provide further validation.
- Test both normal operation ("happy path") and error cases.
- Use the test helpers in `crates/lib/tests/it/helpers.rs` for general setup, and module-specific helpers for specialized scenarios.
- If you need common test utilities for a new pattern, add them to the appropriate `helpers.rs` file.
Best Practices
- Descriptive Test Names: Use the `test_<component>_<functionality>` or `test_<functionality>_<scenario>` naming pattern
- Self-Documenting Tests: Write clear test code with useful comments
- Isolation: Ensure tests don't interfere with each other
- Speed: Keep tests fast to encourage frequent test runs
- Determinism: Avoid flaky tests that intermittently fail
Performance Considerations
The architecture provides several performance characteristics:
- Content-addressable storage: Enables efficient deduplication through SHA-256 content hashing.
- Database structure (DAG): Supports partial replication and sparse checkouts. Tip calculation complexity depends on parent relationships.
- InMemoryDatabase: Provides high-speed operations but is limited by available RAM.
- Lock-based concurrency: May create bottlenecks in high-concurrency write scenarios.
- Height calculation: Uses BFS-based topological sorting with O(V + E) complexity.
- CRDT merge algorithm: Employs recursive LCA-based merging with intelligent caching.
CRDT Merge Performance
The recursive LCA-based merge algorithm uses caching for performance optimization:
Algorithm Complexity
- Cached states: O(1) amortized performance
- Uncached states: O(D × M) where D is DAG depth and M is merge cost
- Overall performance benefits from high cache hit rates
Key Performance Benefits
- Efficient handling of complex DAG structures
- Optimized path finding reduces database calls
- Cache eliminates redundant computations
- Scales well with DAG complexity through memoization
- Memory-computation trade-off favors cached access patterns
Error Handling
The database uses a custom Result
(crate::Result
) and Error
(crate::Error
) type hierarchy defined in src/lib.rs
. Errors are typically propagated up the call stack using Result
.
The Error
enum uses a modular approach with structured error types from each component:
- Io(#[from] std::io::Error): Wraps underlying I/O errors from backend operations or file system access.
- Serialize(#[from] serde_json::Error): Wraps errors occurring during JSON serialization or deserialization.
- Auth(auth::AuthError): Structured authentication errors with detailed context.
- Backend(backend::DatabaseError): Database storage and retrieval errors.
- Instance(instance::InstanceError): Instance management errors.
- CRDT(crdt::CRDTError): CRDT operation and merge errors.
- Store(store::StoreError): Store data access and validation errors.
- Transaction(transaction::TransactionError): Transaction coordination errors.
The use of #[error(transparent)] allows for zero-cost conversion from module-specific errors into crate::Error using the ? operator. Helper methods like is_not_found(), is_permission_denied(), and is_authentication_error() enable categorized error handling without pattern matching on specific variants.
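The layered error pattern described above can be sketched in std-only Rust. The real crate derives its conversions with thiserror's #[from] and #[error(transparent)]; here the From impl is written by hand to show what that derive generates, and the type and variant names are illustrative, not the crate's actual definitions.

```rust
// Simplified stand-in for a module-specific error type.
#[derive(Debug)]
#[non_exhaustive]
#[allow(dead_code)]
enum BackendError {
    NotFound { id: String },
    Corrupted { reason: String },
}

// Simplified crate-level error enum wrapping module errors.
#[derive(Debug)]
#[non_exhaustive]
#[allow(dead_code)]
enum Error {
    Io(std::io::Error),
    Backend(BackendError),
}

// Hand-written equivalent of thiserror's #[from] conversion.
impl From<BackendError> for Error {
    fn from(e: BackendError) -> Self {
        Error::Backend(e)
    }
}

impl Error {
    // Semantic helper: callers classify errors without matching on
    // every variant of every wrapped module error.
    fn is_not_found(&self) -> bool {
        matches!(self, Error::Backend(BackendError::NotFound { .. }))
    }
}

fn fetch(id: &str) -> Result<String, BackendError> {
    Err(BackendError::NotFound { id: id.to_string() })
}

fn load(id: &str) -> Result<String, Error> {
    // `?` converts BackendError into Error via the From impl above.
    let data = fetch(id)?;
    Ok(data)
}

fn main() {
    let err = load("abc").unwrap_err();
    assert!(err.is_not_found());
}
```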
Best Practices
This section documents established patterns and guidelines for developing within the Eidetica codebase. Following these practices ensures consistency, performance, and maintainability across the project.
Overview
The best practices documentation covers:
- API Design Patterns - Guidelines for string parameters, conversion patterns, and performance considerations
- Module Organization - Code structure, dependency management, and module design patterns
- Error Handling - Structured error types, error propagation, and error handling strategies
- Testing - Integration testing, test organization, and comprehensive validation strategies
- Performance - Hot path optimization, memory efficiency, and scalable algorithms
- Security - Authentication, authorization, cryptographic operations, and secure data handling
- Documentation - Documentation standards, API documentation, and writing guidelines
Core Principles
All best practices in Eidetica are built around these fundamental principles:
1. Performance with Ergonomics
- Optimize for common use cases without sacrificing API usability
- Minimize conversion overhead while maintaining flexible parameter types
- Use appropriate generic bounds to avoid double conversions
2. Consistency Across Components
- Similar operations should have similar APIs across different modules
- Follow established patterns for parameter types and method naming
- Maintain consistent error handling and documentation patterns
3. Clear Intent and Documentation
- Function signatures should clearly communicate their intended usage
- Parameter types should indicate whether data is stored or accessed
- Performance characteristics should be documented for critical paths
4. Future-Ready Design
- Backward compatibility is NOT required during development
- Breaking changes are acceptable for both API and storage format
- Focus on correctness and performance over compatibility at this stage
Quick Reference
For New Contributors
Start with these essential guides:
- Module Organization - Understanding code structure and dependencies
- Error Handling - How errors work throughout the system
- Testing - Writing and running tests effectively
- Documentation - Writing good documentation and examples
For API Development
Focus on these areas for public API work:
- API Design Patterns - String parameters and method design
- Performance - Hot path optimization and memory efficiency
- Security - Authentication and secure coding practices
For Internal Development
These guides cover internal implementation patterns:
- Module Organization - Internal module structure and abstractions
- Performance - CRDT algorithms and backend optimization
- Testing - Integration testing and test helper patterns
Implementation Guidelines
When implementing new features or modifying existing code:
- Review existing patterns in similar components
- Follow the established conventions documented in this section
- Add comprehensive tests that validate the patterns
- Document the rationale for any deviations from established patterns
- Update documentation to reflect new patterns or changes
Contributing to Best Practices
These best practices evolve based on:
- Lessons learned from real-world usage
- Performance analysis and optimization needs
- Developer feedback and common patterns
- Code review discussions and decisions
When proposing changes to established patterns, include:
- Rationale for the change
- Performance impact analysis
- Updated documentation and examples
API Design Patterns
This document outlines established patterns for API design within the Eidetica codebase, with particular emphasis on string parameter handling, conversion patterns, and performance considerations.
String Parameter Guidelines
One of the most important API design decisions in Rust is choosing the right parameter types for string data. Eidetica follows specific patterns to optimize performance while maintaining ergonomic APIs.
Core Principle: Storage vs Lookup Pattern
The fundamental rule for string parameters in Eidetica:
- Use Into<String> for parameters that will be stored (converted to an owned String)
- Use AsRef<str> for parameters that are only accessed temporarily (lookup, comparison)
When to Use Into<String>
Use impl Into<String> when the function will store the parameter as an owned String. This avoids double conversion and is more efficient for storage operations, while still accepting &str, String, and &String transparently.
When to Use AsRef<str>
Use impl AsRef<str> when the function only needs to read the string temporarily for lookups, comparisons, or validation. This provides maximum flexibility with no unnecessary allocations and clearly indicates the parameter is not stored.
Anti-Patterns to Avoid
Never use AsRef<str> followed by an immediate .to_string() - this causes double conversion. Instead, use Into<String> for direct conversion when storing the value.
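The storage-vs-lookup rule can be illustrated with a small sketch. Registry and its methods are hypothetical types invented for this example, not part of the Eidetica API.

```rust
use std::collections::HashMap;

// Hypothetical type used to contrast the two parameter bounds.
struct Registry {
    entries: HashMap<String, u64>,
}

impl Registry {
    // Stored: the key becomes an owned String, so take Into<String>
    // and convert exactly once at the point of storage.
    fn insert(&mut self, key: impl Into<String>, value: u64) {
        self.entries.insert(key.into(), value);
    }

    // Lookup only: the key is read, never stored, so AsRef<str>
    // accepts any string type without allocating.
    fn get(&self, key: impl AsRef<str>) -> Option<u64> {
        self.entries.get(key.as_ref()).copied()
    }
}

fn main() {
    let mut r = Registry { entries: HashMap::new() };
    let owned = String::from("beta");
    r.insert("alpha", 1); // &str, converted once
    r.insert(owned, 2);   // String, moved in without reallocation
    assert_eq!(r.get("alpha"), Some(1));
    assert_eq!(r.get(String::from("beta")), Some(2));
}
```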
Common Conversion Patterns
ID Types
For ID parameters, prefer Into<ID> when working with ID-typed fields for clear intent and type safety.
Path Segments
For path operations, use Into<String> with Clone bounds when segments will be stored as keys.
Performance Guidelines
Hot Path Optimizations
For performance-critical operations:
- Bulk Operations: Convert all parameters upfront to avoid per-iteration conversions
- Iterator Chains: Prefer direct loops over complex iterator chains in hot paths
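The upfront-conversion rule for bulk operations can be sketched as follows; store_batch and its key format are hypothetical, invented for this example.

```rust
use std::collections::HashMap;

// Hypothetical bulk write: the prefix is converted to an owned String
// once, before the loop, so the hot loop only borrows it instead of
// converting on every iteration.
fn store_batch(store: &mut HashMap<String, u64>, prefix: impl Into<String>, values: &[u64]) {
    let prefix = prefix.into(); // upfront conversion, not per iteration
    for (i, v) in values.iter().enumerate() {
        let key = format!("{prefix}:{i}");
        store.insert(key, *v);
    }
}

fn main() {
    let mut store = HashMap::new();
    store_batch(&mut store, "user", &[10, 20]);
    assert_eq!(store.get("user:1"), Some(&20));
    assert_eq!(store.len(), 2);
}
```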
API Documentation Standards
Always document the expected usage pattern for string parameters, indicating whether the parameter will be stored or just accessed, and which string types are accepted.
Testing Patterns
Ensure APIs work with all string types (&str, String, &String) by testing conversion compatibility.
API Evolution Guidelines
During development, APIs can be freely changed to follow best practices. Update methods directly with improved parameter types, add comprehensive tests, update documentation, and consider performance impact. Breaking changes are acceptable when they improve performance, ergonomics, or consistency.
Summary
Following these patterns ensures:
- Optimal performance through minimal conversions
- Consistent APIs across the codebase
- Clear intent about parameter usage
- Maximum flexibility for API consumers
- Maintainable code for future development
When in doubt, ask: "Is this parameter stored or just accessed?" The answer determines whether to use Into<String> or AsRef<str>.
Module Organization
This document outlines best practices for organizing code modules within the Eidetica codebase, focusing on clear separation of concerns, consistent structure, and maintainable hierarchies.
Module Hierarchy Principles
1. Domain-Driven Organization
Organize modules around business domains and functionality rather than technical layers. Each module should have a clear responsibility and evolve independently while maintaining clean boundaries.
2. Consistent Module Structure
Every module should follow a standard internal structure, with mod.rs for the public API and re-exports, errors.rs for module-specific error types, and separate files for implementation logic. Keep related functionality together within the same module.
3. Error Module Standards
Each module must define its own error type with #[non_exhaustive] for future compatibility, semantic helper methods for error classification, transparent delegation for dependency errors, and contextual information in error variants.
Public API Design
1. Clean Re-exports
Module mod.rs files should provide clean public APIs with clear documentation, selective re-exports of public types, and convenient access to commonly used shared types.
2. Module Documentation Standards
Every module should have comprehensive documentation including purpose, core functionality, usage examples, integration points, and performance considerations.
Dependency Management
1. Dependency Direction
Maintain clear dependency hierarchies where higher-level modules depend on lower-level modules, modules at the same level avoid direct dependencies, and trait abstractions break circular dependencies when needed.
2. Feature Gating
Use feature flags for optional functionality, gating modules and exports appropriately with #[cfg(feature = "...")] attributes.
Module Communication Patterns
1. Trait-Based Abstractions
Use traits to define interfaces between modules, allowing implementation modules to depend on abstractions rather than concrete types.
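A minimal sketch of this pattern, assuming a hypothetical Backend trait and InMemory implementation (not the crate's real names): higher-level code is written against the trait, so the concrete storage type can change without touching its callers.

```rust
use std::collections::HashMap;

// Hypothetical backend interface: higher-level modules depend on this
// trait, not on a concrete storage type, keeping the dependency
// direction one-way and making implementations swappable.
trait Backend {
    fn put(&mut self, id: &str, bytes: Vec<u8>);
    fn get(&self, id: &str) -> Option<&[u8]>;
}

struct InMemory {
    data: HashMap<String, Vec<u8>>,
}

impl Backend for InMemory {
    fn put(&mut self, id: &str, bytes: Vec<u8>) {
        self.data.insert(id.to_string(), bytes);
    }
    fn get(&self, id: &str) -> Option<&[u8]> {
        self.data.get(id).map(|v| v.as_slice())
    }
}

// A higher-level function written against the abstraction.
fn roundtrip<B: Backend>(backend: &mut B) -> bool {
    backend.put("e1", vec![1, 2, 3]);
    backend.get("e1") == Some(&[1u8, 2, 3][..])
}

fn main() {
    let mut b = InMemory { data: HashMap::new() };
    assert!(roundtrip(&mut b));
}
```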
2. Event-Driven Communication
Consider event patterns for decoupled communication, particularly useful for logging, metrics, or cross-cutting concerns without introducing tight coupling.
Testing Integration
Integration tests should mirror the module structure with module-specific helpers for each domain. Test organization should follow the same hierarchy as the source modules.
Common Anti-Patterns to Avoid
- Circular Dependencies - Modules depending on each other in cycles
- God Modules - Single modules containing unrelated functionality
- Leaky Abstractions - Exposing internal implementation details through public APIs
- Flat Structure - No hierarchy or organization in module layout
- Mixed Concerns - Business logic mixed with infrastructure code
Migration Guidelines
When restructuring modules: plan the new structure, use deprecation warnings for API changes when needed, create integration tests to verify functionality, update documentation, and consider backward compatibility implications.
Summary
Good module organization provides:
- Clear separation of concerns with well-defined boundaries
- Predictable structure that developers can navigate easily
- Maintainable dependencies with clear hierarchies
- Testable interfaces with appropriate abstractions
- Extensible design that can grow with the project
Following these patterns ensures the codebase remains organized and maintainable as it evolves.
Error Handling Best Practices
This document outlines the error handling patterns and practices used throughout the Eidetica codebase, focusing on structured errors, ergonomic APIs, and maintainable error propagation.
Core Error Architecture
1. Unified Result Type
Eidetica uses a unified Result<T> type across the entire codebase, with automatic conversion between module-specific errors and the main error type. This provides consistent error handling and a single Result import throughout the codebase.
2. Module-Specific Error Types
Each module defines its own structured error type with semantic helpers. Error types include contextual information and helper methods for classification (e.g., is_authentication_error(), is_permission_denied()).
Error Design Patterns
1. Semantic Error Classification
Provide helper methods that allow callers to handle errors semantically, such as is_not_found(), is_storage_error(), or is_corruption_error(). This enables clean error handling based on error semantics rather than type matching.
2. Contextual Error Information
Include relevant context in error variants, such as available options when something is not found, or specific reasons for failures. This debugging information helps users understand what went wrong and can assist in error recovery.
3. Error Conversion Patterns
Use #[from] and #[error(transparent)] for zero-cost error conversion between module boundaries. This allows wrapping errors with additional context or passing them through directly.
4. Non-Exhaustive Error Enums
Use #[non_exhaustive] on error enums to allow adding new error variants in the future without breaking existing code.
Error Handling Strategies
1. Early Return with the ? Operator
Use the ? operator for clean error propagation, validating preconditions early and returning errors as soon as they occur.
2. Error Context Enhancement
Add context when propagating errors up the call stack by wrapping lower-level errors with higher-level context that explains what operation failed.
3. Fallible Iterator Patterns
Handle errors in iterator chains gracefully by either failing fast on the first error or collecting all results before handling errors, depending on the use case.
Authentication Error Patterns
1. Permission-Based Errors
Structure authentication errors to be actionable by including what permission was required, what the user had, and potentially what options are available.
2. Security Error Handling
Be careful not to leak sensitive information in error messages. Reference resources by name or ID rather than content, and avoid exposing internal system details.
Performance Considerations
1. Error Allocation Optimization
Minimize allocations in error creation by using static strings for fixed messages and avoiding unnecessary string formatting in hot paths.
2. Error Path Optimization
Keep error paths simple and fast by deferring error creation until actually needed, using closures with ok_or_else() rather than ok_or().
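The deferral pattern can be shown in a few lines; find and its error message are hypothetical, invented for this example.

```rust
use std::collections::HashMap;

// ok_or() would build its error eagerly, even when the lookup succeeds;
// ok_or_else() defers construction to a closure that only runs on the
// error path, so the format! allocation is skipped for successful hits.
fn find(map: &HashMap<String, u64>, key: &str) -> Result<u64, String> {
    map.get(key)
        .copied()
        .ok_or_else(|| format!("key '{key}' not found"))
}

fn main() {
    let mut m = HashMap::new();
    m.insert("a".to_string(), 7u64);
    assert_eq!(find(&m, "a"), Ok(7));
    assert!(find(&m, "missing").is_err());
}
```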
Testing Error Conditions
1. Error Testing Patterns
Test both error conditions and error classification helpers. Verify that error context is preserved through the error chain and that error messages contain expected information.
2. Error Helper Testing
Test semantic error classification helpers to ensure they correctly identify error categories and that the classification logic remains consistent as error types evolve.
Common Anti-Patterns
- String-based errors - Avoid unstructured string errors that lack context
- Generic error types - Don't use overly generic errors that lose type information
- Panic on recoverable errors - Return Result instead of using unwrap() or expect()
- Leaking sensitive information - Don't expose internal details in error messages
Migration Guidelines
When updating error handling, maintain error semantics, add context gradually, test error paths thoroughly, and keep documentation current.
Summary
Effective error handling in Eidetica provides:
- Structured error types with rich context and classification
- Consistent error patterns across all modules
- Semantic error helpers for easy error handling in calling code
- Zero-cost error conversion between module boundaries
- Performance-conscious error creation and propagation
- Testable error conditions with comprehensive coverage
Following these patterns ensures errors are informative, actionable, and maintainable throughout the codebase evolution.
Testing Best Practices
This document outlines testing patterns and practices used in the Eidetica codebase, focusing on integration testing, test organization, and comprehensive validation strategies.
Testing Architecture
1. Integration-First Testing Strategy
Eidetica uses a single integration test binary approach rather than unit tests, organized in tests/it/ with modules mirroring the main codebase structure.
Key principle: Test through public interfaces to validate real-world usage patterns.
2. Test Module Organization
Each test module mirrors the main codebase structure, with mod.rs for declarations, helpers.rs for utilities, and separate files for different features.
3. Comprehensive Test Helpers
The codebase provides helper functions in tests/it/helpers.rs for common setup scenarios, plus module-specific helpers for specialized testing needs.
Authentication Testing Patterns
The auth module provides specialized helpers for testing authentication scenarios, including key creation macros, permission setup utilities, and operation validation helpers.
Permission Testing
Test authentication and authorization systematically using the auth module helpers to verify different permission levels and access control scenarios.
CRDT Testing
Test CRDT properties including merge semantics, conflict resolution, and deterministic behavior. The crdt module provides specialized helpers for testing commutativity, associativity, and idempotency of CRDT operations.
Performance Testing
Performance testing can be done using criterion benchmarks alongside integration tests. Consider memory allocation patterns and operation timing in critical paths.
Error Testing
Comprehensive error testing ensures robust error handling throughout the system. Test both error conditions and recovery scenarios to validate system resilience.
Test Data Management
Create realistic test data using builder patterns for complex scenarios. Consider property-based testing for CRDT operations to validate mathematical properties like commutativity, associativity, and idempotency.
Test Organization
Organize tests by functionality and use environment variables for test configuration. Use #[ignore] for expensive tests that should only run on demand.
Testing Anti-Patterns to Avoid
- Overly complex test setup - Keep setup minimal and use helpers
- Testing implementation details - Test behavior through public interfaces
- Flaky tests with timing dependencies - Avoid sleep() and timing assumptions
- Buried assertions - Make test intent clear with obvious assertions
Summary
Effective testing in Eidetica provides:
- Integration-focused approach that tests real-world usage patterns
- Comprehensive helpers that reduce test boilerplate and improve maintainability
- Authentication testing that validates security and permission systems
- CRDT testing that ensures merge semantics and conflict resolution work correctly
- Performance testing that validates system behavior under load
- Error condition testing that ensures robust error handling and recovery
Following these patterns ensures the codebase maintains high quality and reliability as it evolves.
Performance Best Practices
This document outlines performance optimization patterns used throughout the Eidetica codebase.
Core Performance Principles
1. Hot Path Optimization
Identify and optimize performance-critical code paths. Common hot paths in Eidetica include CRDT state computation, entry storage/retrieval, authentication verification, bulk operations, and string conversions.
2. Memory Efficiency
Minimize allocations through appropriate string parameter types, pre-allocation of collections, stack allocation preference, and efficient caching strategies.
3. Algorithmic Efficiency
Choose algorithms that scale well with data size by using appropriate data structures, implementing caching for expensive computations, and preferring direct iteration over complex iterator chains in hot paths.
String Parameter Optimization
1. Parameter Type Selection
Use Into<String> for stored parameters and AsRef<str> for lookup operations to minimize allocations and conversions.
2. Bulk Operation Optimization
Convert parameters upfront for bulk operations rather than converting on each iteration to reduce overhead.
Memory Allocation Patterns
1. Pre-allocation Strategies
Allocate collections with known or estimated capacity to reduce reallocation overhead. Pre-allocate strings when building keys or compound values.
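A small sketch of the pre-allocation strategy; build_keys and its key format are hypothetical, invented for this example.

```rust
// Pre-allocation sketch: both the Vec and each String are created with
// their final capacity, so filling them never triggers a reallocation.
fn build_keys(prefix: &str, n: usize) -> Vec<String> {
    let mut keys = Vec::with_capacity(n);
    for i in 0..n {
        let suffix = i.to_string();
        let mut key = String::with_capacity(prefix.len() + 1 + suffix.len());
        key.push_str(prefix);
        key.push(':');
        key.push_str(&suffix);
        keys.push(key);
    }
    keys
}

fn main() {
    assert_eq!(build_keys("entry", 3), vec!["entry:0", "entry:1", "entry:2"]);
}
```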
2. Memory-Efficient Data Structures
Choose data structures based on access patterns: BTreeMap for ordered iteration and range queries, HashMap for fast lookups, and Vec for dense indexed access.
3. Avoiding Unnecessary Clones
Use references and borrowing effectively. Work with references when possible and clone only when ownership transfer is required.
CRDT Performance Patterns
1. State Computation Caching
Cache expensive CRDT state computations using entry ID and store name as cache keys. Immutable entries eliminate cache invalidation concerns.
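A minimal sketch of this caching shape, with StateCache as a hypothetical type and a String standing in for the real computed CRDT state: because entries are immutable and content-addressed, a cached state is never stale, so no invalidation logic is needed.

```rust
use std::collections::HashMap;

// Hypothetical cache keyed by (entry id, store name).
struct StateCache {
    cache: HashMap<(String, String), String>,
    computes: usize, // cache misses, counted for demonstration
}

impl StateCache {
    fn state(&mut self, entry_id: &str, store: &str) -> String {
        let key = (entry_id.to_string(), store.to_string());
        if let Some(s) = self.cache.get(&key) {
            return s.clone(); // hit: no recomputation
        }
        self.computes += 1;
        // Stand-in for the expensive CRDT state computation.
        let s = format!("state({entry_id}/{store})");
        self.cache.insert(key, s.clone());
        s
    }
}

fn main() {
    let mut c = StateCache { cache: HashMap::new(), computes: 0 };
    c.state("abc123", "docs");
    c.state("abc123", "docs"); // served from cache
    assert_eq!(c.computes, 1);
}
```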
2. Efficient Merge Operations
Optimize merge algorithms by pre-allocating capacity, performing in-place merges when possible, and cloning only when adding new keys.
3. Lazy Computation Patterns
Defer expensive computations until needed using lazy initialization patterns to avoid unnecessary work.
Backend Performance Patterns
1. Batch Operations
Optimize backend operations for bulk access by implementing batch retrieval and storage methods that leverage backend-specific bulk operations.
2. Connection Pooling and Resource Management
Use connection pooling for expensive resources and implement read caching with bounded LRU caches to reduce backend load.
Algorithm Optimization
1. Direct Loops vs Iterator Chains
Prefer direct loops over complex iterator chains in hot paths for better performance and clearer control flow.
2. Efficient Graph Traversal
Use iterative traversal with explicit stacks to avoid recursion overhead and maintain visited sets to prevent redundant processing in DAG traversal.
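The stack-plus-visited-set pattern can be sketched as follows; the &str node ids and the ancestors function are illustrative, not the crate's real traversal API.

```rust
use std::collections::{HashMap, HashSet};

// Iterative DAG walk: an explicit stack avoids recursion-depth limits,
// and the visited set guarantees each node is processed once even when
// many paths converge on it.
fn ancestors(parents: &HashMap<&str, Vec<&str>>, start: &str) -> HashSet<String> {
    let mut visited = HashSet::new();
    let mut stack = vec![start.to_string()];
    while let Some(node) = stack.pop() {
        if !visited.insert(node.clone()) {
            continue; // already seen via another path
        }
        if let Some(ps) = parents.get(node.as_str()) {
            for p in ps {
                stack.push((*p).to_string());
            }
        }
    }
    visited
}

fn main() {
    // Diamond: d <- {b, c}, both <- a.
    let mut parents = HashMap::new();
    parents.insert("d", vec!["b", "c"]);
    parents.insert("b", vec!["a"]);
    parents.insert("c", vec!["a"]);
    let seen = ancestors(&parents, "d");
    assert_eq!(seen.len(), 4); // a, b, c, d each visited exactly once
}
```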
Profiling and Measurement
1. Benchmark-Driven Development
Use criterion for performance testing with varied data sizes to understand scaling characteristics.
2. Performance Monitoring
Track operation timings in critical paths to identify bottlenecks and measure optimization effectiveness.
Memory Profiling
1. Memory Usage Tracking
Implement allocation tracking for operations to identify memory-intensive code paths and optimize accordingly.
2. Memory-Efficient Collections
Use adaptive collection types that switch between Vec for small collections and HashMap for larger ones to optimize memory usage patterns.
Common Performance Anti-Patterns
Avoid unnecessary string allocations through repeated concatenation, repeated expensive computations that could be cached, and unbounded cache growth that leads to memory exhaustion.
Summary
Effective performance optimization in Eidetica focuses on string parameter optimization, memory-efficient patterns, hot path optimization, CRDT performance with caching, backend optimization with batch operations, and algorithm efficiency. Following these patterns ensures the system maintains good performance characteristics as it scales.
Security Best Practices
This document outlines security patterns and practices used throughout the Eidetica codebase.
Core Security Architecture
1. Authentication System
Eidetica uses Ed25519 digital signatures for all entry authentication. The system provides high-performance cryptographic verification through content-addressable entries that enable automatic tampering detection. All entries must be signed by authorized keys, with private keys stored separately from synchronized data.
2. Authorization Model
The system implements a hierarchical permission model with three levels: Read (view data and compute states), Write (create and modify entries), and Admin (manage permissions and authentication settings). Permissions follow a hierarchical structure where higher levels include all lower-level permissions.
3. Secure Entry Creation
All entries require authentication during creation. The system verifies authentication keys exist and have appropriate permissions before creating entries. Each entry is signed and stored with verification to ensure integrity.
Cryptographic Best Practices
1. Digital Signature Handling
Ed25519 signatures provide authentication for all entries. The system creates signatures from canonical byte representations and verifies them against stored public keys to ensure data integrity and authenticity.
2. Key Generation and Storage
Keys are generated using cryptographically secure random number generators. Private keys are stored separately from public keys and are securely cleared from memory when removed to prevent key material leakage.
3. Canonical Serialization
The system ensures consistent serialization for signature verification by sorting all fields deterministically and creating canonical JSON representations. This prevents signature verification failures due to serialization differences.
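The determinism property that signatures depend on can be shown with a std-only sketch. The real crate canonicalizes JSON; this hand-rolled "k=v;" format is an assumption made only to demonstrate that sorted-key iteration yields identical bytes regardless of insertion order.

```rust
use std::collections::BTreeMap;

// BTreeMap iterates keys in sorted order, so the serialized output is
// the same no matter the order in which fields were inserted.
fn canonical(fields: &BTreeMap<String, String>) -> String {
    let mut out = String::new();
    for (k, v) in fields {
        out.push_str(k);
        out.push('=');
        out.push_str(v);
        out.push(';');
    }
    out
}

fn main() {
    let mut a = BTreeMap::new();
    a.insert("z".to_string(), "1".to_string());
    a.insert("a".to_string(), "2".to_string());
    let mut b = BTreeMap::new();
    b.insert("a".to_string(), "2".to_string());
    b.insert("z".to_string(), "1".to_string());
    // Same fields, different insertion order, identical canonical form.
    assert_eq!(canonical(&a), canonical(&b));
    assert_eq!(canonical(&a), "a=2;z=1;");
}
```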
Permission Management
1. Database-Level Permissions
Each database maintains fine-grained permissions mapping keys to permission levels. The system checks permissions by looking up key-specific permissions or falling back to default permissions. Admin-only operations include permission updates, with safeguards to prevent self-lockout.
2. Operation-Specific Authorization
Different operations require different permission levels: reading data requires Read permission, writing data requires Write permission, and managing settings or permissions requires Admin permission. The system enforces these requirements before allowing any operation to proceed.
Secure Data Handling
1. Input Validation
All inputs undergo validation to prevent injection and malformation attacks. Entry IDs must be valid hex-encoded SHA-256 hashes, key names must contain only safe alphanumeric characters, and store names cannot conflict with reserved system names. The system enforces strict size limits and character restrictions.
2. Secure Serialization
The system prevents deserialization attacks through custom deserializers that validate data during parsing. Entry data is subject to size limits and format validation, ensuring only well-formed data enters the system.
Attack Prevention
1. Denial of Service Protection
The system implements comprehensive resource limits including maximum entry sizes, store counts, and parent node limits. Rate limiting prevents excessive operations per second from any single key, with configurable thresholds to balance security and usability.
2. Hash Collision Protection
SHA-256 hashing ensures content-addressable IDs are collision-resistant. The system verifies that entry IDs match their content hash, detecting any tampering or corruption attempts.
3. Timing Attack Prevention
Security-sensitive comparisons use constant-time operations to prevent timing-based information leakage. This includes signature comparisons and key matching operations.
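The constant-time idea can be sketched with a hand-rolled byte comparison; production code should prefer a vetted crate such as subtle rather than this illustrative version.

```rust
// Constant-time equality sketch: OR-accumulate the XOR of every byte
// pair so the loop never exits early on a mismatch, and running time
// depends only on the input length, not on where bytes differ.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // lengths are treated as public information
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // no branch on the comparison result
    }
    diff == 0
}

fn main() {
    assert!(ct_eq(b"sig-bytes", b"sig-bytes"));
    assert!(!ct_eq(b"sig-bytes", b"sig-bytez"));
}
```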
Audit and Logging
1. Security Event Logging
The system logs all security-relevant events including authentication attempts, permission denials, rate limit violations, and key management operations. Events are timestamped and can be forwarded to external monitoring systems for centralized security analysis.
2. Intrusion Detection
Active monitoring detects suspicious patterns such as repeated authentication failures indicating brute force attempts or unusual operation frequencies suggesting system abuse. The detector maintains sliding time windows to track patterns and generate alerts when thresholds are exceeded.
Common Security Anti-Patterns
Key security mistakes to avoid include storing private keys in plain text, missing input validation, leaking sensitive information in error messages, and using weak random number generation. Always use proper key types with secure memory handling, validate all inputs, provide generic error messages, and use cryptographically secure random number generators.
Summary
Effective security in Eidetica encompasses strong authentication with Ed25519 digital signatures, fine-grained authorization with hierarchical permissions, secure cryptographic operations with proper key management, comprehensive input validation, attack prevention through rate limiting and resource controls, and thorough auditing with intrusion detection capabilities.
Documentation Best Practices
This document outlines documentation standards and practices used throughout the Eidetica codebase.
Documentation Philosophy
Documentation as Code
Documentation receives the same rigor as source code - version controlled, reviewed, tested, and maintained alongside code changes.
Audience-Focused Writing
Each documentation type serves specific audiences: public API docs for library users, internal docs for contributors, architecture docs for design understanding, and best practices for development consistency.
Progressive Disclosure
Information flows from general to specific: overview to getting started to detailed guides to reference documentation.
API Documentation Standards
Module-Level Documentation
Every module requires comprehensive header documentation including core functionality, integration points, security considerations, and performance notes. Module docs should provide an overview of the module's purpose and how it fits into the larger system.
Function Documentation Standards
Document all public functions with: purpose description, parameter details, performance notes, related functions, and error conditions. Focus on what the function does and when to use it, not implementation details.
Type Documentation
Document structs, enums, and traits with context about their purpose, usage patterns, and implementation notes. Focus on when and why to use the type, not just what it does.
Error Documentation
Document error types with context about when they occur, what they mean, and how to recover from them. Include security implications where relevant.
Code Example Standards
All documentation examples must be complete, runnable, and testable. Examples should demonstrate proper error handling patterns and include performance guidance where relevant. Use realistic scenarios and show best practices.
Internal Documentation
Architecture Decision Records (ADRs)
Document significant design decisions with status, context, decision rationale, and consequences. ADRs help future contributors understand why specific choices were made.
Design Rationale Documentation
Complex implementations should include explanations of algorithm choices, performance characteristics, and trade-offs. Focus on the "why" behind implementation decisions.
TODO and Known Limitations
Document current limitations and planned improvements with clear categorization. Include guidance for contributors who want to help address these items.
Documentation Testing
Doctests
All documentation examples must compile and run. Use cargo test --doc to verify examples work correctly. Examples should include proper imports and error handling.
Documentation Coverage
Track coverage with RUSTDOCFLAGS="-D missing_docs" cargo doc to ensure all public APIs are documented. Check for broken links and maintain comprehensive documentation coverage.
External Documentation
User Guide Structure
Organize documentation progressively from overview to detailed reference. Structure includes user guides for problem-solving, internal docs for implementation details, and generated API documentation.
Contribution Guidelines
Different documentation types serve different purposes: user docs focus on solving problems with clear examples, internal docs explain implementation decisions, and API docs provide comprehensive reference material. All examples must compile and demonstrate best practices.
Common Documentation Anti-Patterns
Avoid outdated examples that no longer work with current APIs, incomplete examples missing imports or setup, implementation-focused documentation that explains how rather than what and why, and missing context about when to use functionality.
Good documentation provides clear purpose, complete examples, proper context, parameter descriptions, return value information, and performance characteristics.
Summary
Effective documentation in Eidetica treats documentation as code, focuses on specific audiences, uses progressive disclosure, maintains comprehensive API documentation, provides clear user guides, explains design decisions, ensures all examples are tested and working, and follows consistent standards. These practices ensure documentation remains valuable, accurate, and maintainable as the project evolves.
Logging Best Practices
This guide documents best practices for using the tracing crate within the Eidetica codebase.
Overview
Eidetica uses the tracing crate for all logging needs. This provides:
- Structured logging with minimal overhead
- Compile-time optimization for disabled log levels
- Span-based context for async operations
- Integration with external observability tools
Log Level Guidelines
Choose log levels based on the importance and frequency of events:
ERROR (tracing::error!)
Use for unrecoverable errors that prevent operations from completing:
tracing::error!("Failed to store entry {}: {}", entry.id(), error);
When to use:
- Database operation failures
- Network errors that can't be retried
- Authentication/authorization failures
- Corrupted data detection
WARN (`tracing::warn!`)
Use for important warnings that don't prevent operation:
tracing::warn!("Failed to send to {}: {}. Adding to retry queue.", peer, error);
When to use:
- Retryable failures
- Invalid configuration (with fallback)
- Deprecated feature usage
- Performance degradation detected
INFO (`tracing::info!`)
Use for high-level operational messages:
tracing::info!("Sync server started on {}", address);
When to use:
- Service lifecycle events (start/stop)
- Successful major operations
- Configuration changes
- Important state transitions
DEBUG (`tracing::debug!`)
Use for detailed operational information:
tracing::debug!("Syncing {} databases with peer {}", tree_count, peer_id);
When to use:
- Detailed operation progress
- Protocol interactions
- Algorithm steps
- Non-critical state changes
TRACE (`tracing::trace!`)
Use for very detailed trace information:
tracing::trace!("Processing entry {} with {} parents", entry_id, parent_count);
When to use:
- Individual item processing
- Detailed algorithm execution
- Network packet contents
- Frequent operations in hot paths
Performance Considerations
Hot Path Optimization
For performance-critical code paths, follow these guidelines:
- Use appropriate levels: Hot paths should use `trace!` to avoid overhead
- Avoid string formatting: Use structured fields instead
- Check before complex operations: Use `tracing::enabled!` for expensive log data
// Good: Structured fields, minimal overhead
tracing::trace!(entry_id = %entry.id(), parent_count = parents.len(), "Processing entry");
// Bad: String formatting in hot path
tracing::debug!("Processing entry {} with {} parents", entry.id(), parents.len());
// Good: Check before expensive operation
if tracing::enabled!(tracing::Level::TRACE) {
let debug_info = expensive_debug_calculation();
tracing::trace!("Debug info: {}", debug_info);
}
Async and Background Operations
Use spans to provide context for async operations:
use tracing::{info_span, Instrument};
async fn sync_with_peer(peer_id: &str) {
async {
tracing::debug!("Starting sync");
// ... sync logic ...
tracing::debug!("Sync complete");
}
.instrument(info_span!("sync", peer_id = %peer_id))
.await;
}
Module-Specific Guidelines
Sync Module
- Use `info!` for server lifecycle and peer connections
- Use `debug!` for sync protocol operations
- Use `trace!` for individual entry transfers
- Use spans for peer-specific context
Backend Module
- Use `error!` for storage failures
- Use `debug!` for cache operations
- Use `trace!` for individual entry operations
Authentication Module
- Use `error!` for signature verification failures
- Use `error!` for permission violations
- Use `debug!` for key operations
- Never log private keys or sensitive data
CRDT Module
- Use `debug!` for merge operations
- Use `trace!` for individual CRDT operations
- Include operation type in structured fields
Structured Logging
Prefer structured fields over string interpolation:
// Good: Structured fields
tracing::info!(
tree_id = %database.id(),
entry_count = entries.len(),
peer = %peer_address,
"Synchronizing database"
);
// Bad: String interpolation
tracing::info!(
"Synchronizing database {} with {} entries to peer {}",
database.id(), entries.len(), peer_address
);
Error Context
When logging errors, include relevant context:
// Good: Includes context
tracing::error!(
error = %e,
entry_id = %entry.id(),
tree_id = %database.id(),
"Failed to store entry during sync"
);
// Bad: Missing context
tracing::error!("Failed to store entry: {}", e);
Testing with Logs
Automatic Test Logging Setup
Eidetica uses a global test setup with the `ctor` crate to automatically initialize tracing for all tests. This is configured in `tests/it/main.rs`.
This means all tests automatically have tracing enabled at INFO level without any setup code needed in individual test functions.
Viewing Test Logs
By default, Rust's test harness captures log output and only shows it for failing tests:
# Normal test run - only see logs from failing tests
cargo test
# See logs from ALL tests (passing and failing)
cargo test -- --nocapture
# Control log level with environment variable
RUST_LOG=eidetica=debug cargo test -- --nocapture
# See logs from specific test
cargo test test_sync_operations -- --nocapture
# Trace level for specific module during tests
RUST_LOG=eidetica::sync=trace cargo test -- --nocapture
Writing Tests with Logging
Tests should use `println!` for test-specific output; the general rule against `println!` applies to library code, where tracing macros must be used instead.
Key Benefits
- Zero setup: No initialization code needed in individual tests
- Environment control: Use `RUST_LOG` to control verbosity per test run
- Clean output: Logs only appear when tests fail or with `--nocapture`
- Proper isolation: `with_test_writer()` ensures logs don't mix between parallel tests
- Library visibility: See internal library operations during test execution
Common Patterns
Operation Success/Failure
match operation() {
Ok(result) => {
tracing::debug!("Operation succeeded");
result
}
Err(e) => {
tracing::error!(error = %e, "Operation failed");
return Err(e);
}
}
Retry Logic
for attempt in 1..=max_attempts {
match try_operation() {
Ok(result) => {
if attempt > 1 {
tracing::info!("Operation succeeded after {} attempts", attempt);
}
return Ok(result);
}
Err(e) if attempt < max_attempts => {
tracing::warn!(
error = %e,
attempt,
max_attempts,
"Operation failed, retrying"
);
}
Err(e) => {
tracing::error!(
error = %e,
attempts = max_attempts,
"Operation failed after all retries"
);
return Err(e);
}
}
}
Anti-Patterns to Avoid
- Don't log sensitive data: Never log private keys, passwords, or PII
- Don't use println/eprintln: Always use tracing macros
- Don't log in tight loops: Use trace level or aggregate logging
- Don't format strings unnecessarily: Use structured fields
- Don't ignore log levels: Use appropriate levels for context
Future Considerations
As the codebase grows, consider:
- Adding custom tracing subscribers for specific subsystems
- Implementing trace sampling for high-volume operations
- Adding metrics collection alongside tracing
- Creating domain-specific span attributes
Future Development
Key areas for future development:
- CRDT Refinement: Enhanced CRDT implementations and merge logic
- Security: Entry signing and key management systems
- Persistent Storage: Database backends beyond in-memory storage
- Blob Storage: Integration with distributed storage systems for large data
- Querying: Advanced query and filtering capabilities
- Additional CRDTs: Sequences, Sets, Counters, and other CRDT types
- Replication: Peer-to-peer synchronization protocols
- Indexing: Performance optimization for large datasets
- Concurrency: Improved performance under high load
- Entry Metadata: Enhanced metadata for better query operations
Design Documents
This section contains formal design documents that capture the architectural thinking, decision-making process, and implementation details for complex features in Eidetica. These documents serve as a historical record of our technical decisions and provide context for future development.
Purpose
Design documents in this section:
- Document the rationale behind major technical decisions
- Capture alternative approaches that were considered
- Outline implementation strategies and tradeoffs
- Serve as a reference for future developers
- Help maintain consistency in architectural decisions
Document Structure
Each design document typically includes:
- Problem statement and context
- Goals and non-goals
- Proposed solution
- Alternative approaches considered
- Implementation details and tradeoffs
- Future considerations and potential improvements
Available Design Documents
Implemented
- Authentication - Mandatory cryptographic authentication for all entries
- Settings Storage - How settings are stored and tracked in databases
Proposed
- Error Handling - Modular error architecture for improved debugging and user experience
Implementation Status:
- ✅ Direct Keys - Fully implemented and functional
- ✅ Delegated Databases - Fully implemented and functional with comprehensive test coverage
Authentication Design
This document outlines the authentication and authorization scheme for Eidetica, a decentralized database built on Merkle-CRDT principles. The design emphasizes flexibility, security, and integration with the core CRDT system while maintaining distributed consistency.
Table of Contents
- Authentication Design
Overview
Eidetica's authentication scheme is designed to leverage the same CRDT and Merkle-DAG principles that power the core database while providing robust access control for distributed environments. Unlike traditional authentication systems, this design must handle authorization conflicts that can arise from network partitions and concurrent modifications to access control rules.
As of the current implementation, authentication is mandatory for all entries. All database operations require valid Ed25519 signatures, eliminating the concept of unsigned entries. This ensures data integrity and provides a consistent security model across all operations.
The authentication system is not implemented as a pure consumer of the database API but is tightly integrated with the core system. This integration enables efficient validation and conflict resolution during entry creation and database merging operations.
Design Goals and Principles
Primary Goals
- Mandatory Authentication: All entries must be cryptographically signed - no unsigned entries allowed
- Distributed Consistency: Authentication rules must merge deterministically across network partitions
- Cryptographic Security: All authentication based on Ed25519 public/private key cryptography
- Hierarchical Access Control: Support admin, read/write, and read-only permission levels
- Delegation: Support for delegating authentication to other databases without granting admin privileges (infrastructure built, activation pending)
- Auditability: All authentication changes are tracked in the immutable DAG history
Non-Goals
- Perfect Security: Admin key compromise requires manual intervention
- Real-time Revocation: Key revocation is eventually consistent, not immediate
System Architecture
Authentication Data Location
Authentication configuration is stored in the special `_settings` store under the `auth` key. This placement ensures that:
- Authentication rules are included in `_settings`, which contains all the data necessary to validate the database and add new Entries
- Access control changes are tracked in the immutable history
- Settings can be validated against the current entry being created
The `_settings` store uses the `crate::crdt::Doc` type, which is a hierarchical CRDT that resolves conflicts using Last-Write-Wins (LWW) semantics. The ordering for LWW is determined deterministically by the DAG design (see CRDT documentation for details).
Clarification: Throughout this document, when we refer to `Doc`, this is the hierarchical CRDT document type that wraps internal Node structures supporting nested maps. The `_settings` store specifically uses `Doc` to enable complex authentication configurations.
Permission Hierarchy
Eidetica implements a three-tier permission model:
Permission Level | Modify _settings | Add/Remove Keys | Change Permissions | Read Data | Write Data | Public Database Access |
---|---|---|---|---|---|---|
Admin | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Write | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ |
Read | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ |
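The three tiers in the table can be modeled as a small capability check. The following sketch is illustrative only; the enum shape mirrors the storage format shown later, but the method names are hypothetical, not the crate's API:

```rust
// Hypothetical model of the three-tier permission hierarchy.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Permission {
    Admin(u32), // carries a priority value
    Write(u32), // carries a priority value
    Read,
}

impl Permission {
    fn can_modify_settings(&self) -> bool {
        matches!(self, Permission::Admin(_))
    }
    fn can_write_data(&self) -> bool {
        matches!(self, Permission::Admin(_) | Permission::Write(_))
    }
    fn can_read_data(&self) -> bool {
        true // all three tiers can read
    }
}

fn main() {
    let admin = Permission::Admin(0);
    let writer = Permission::Write(10);
    let reader = Permission::Read;
    assert!(admin.can_modify_settings() && admin.can_write_data());
    assert!(!writer.can_modify_settings() && writer.can_write_data());
    assert!(reader.can_read_data() && !reader.can_write_data());
}
```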
Authentication Framework
Key Structure
The current implementation supports direct authentication keys stored in the `_settings.auth` configuration. Each key consists of:
classDiagram
class AuthKey {
String pubkey
Permission permissions
KeyStatus status
}
class Permission {
<<enumeration>>
Admin(priority: u32)
Write(priority: u32)
Read
}
class KeyStatus {
<<enumeration>>
Active
Revoked
}
AuthKey --> Permission
AuthKey --> KeyStatus
Note: Both direct keys and delegated databases are fully implemented and functional, including the `DelegatedTreeRef`, `PermissionBounds`, and `TreeReference` types.
Direct Key Example
{
"_settings": {
"auth": {
"KEY_LAPTOP": {
"pubkey": "ed25519:PExACKOW0L7bKAM9mK_mH3L5EDwszC437uRzTqAbxpk",
"permissions": "write:10",
"status": "active"
},
"KEY_DESKTOP": {
"pubkey": "ed25519:QJ7bKAM9mK_mH3L5EDwszC437uRzTqAbxpkPExACKOW0L",
"permissions": "read",
"status": "active"
},
"*": {
"pubkey": "*",
"permissions": "read",
"status": "active"
},
"PUBLIC_WRITE": {
"pubkey": "*",
"permissions": "write:100",
"status": "active"
}
},
"name": "My Database"
}
}
Note: The wildcard key `*` enables global permissions for anyone. Wildcard keys:
- Can have any permission level: "read", "write:N", or "admin:N"
- Are commonly used for world-readable databases (with "read" permissions) but can grant broader access
- Can be revoked like any other key
- Can be included in delegated databases (if you delegate to a database with a wildcard, that's valid)
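One plausible reading of wildcard resolution is an exact-name lookup that falls back to the `*` entry when present. The sketch below is hypothetical and simplifies auth entries to permission strings:

```rust
use std::collections::HashMap;

// Illustrative key resolution with a wildcard fallback: try the exact
// key name first, then fall back to the global "*" entry if one exists.
fn resolve_key<'a>(auth: &'a HashMap<String, String>, name: &str) -> Option<&'a String> {
    auth.get(name).or_else(|| auth.get("*"))
}

fn main() {
    let mut auth = HashMap::new();
    auth.insert("KEY_LAPTOP".to_string(), "write:10".to_string());
    auth.insert("*".to_string(), "read".to_string());
    assert_eq!(resolve_key(&auth, "KEY_LAPTOP").unwrap(), "write:10");
    // Names without a dedicated entry fall back to the wildcard.
    assert_eq!(resolve_key(&auth, "SOMEONE_ELSE").unwrap(), "read");
}
```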
Entry Signing Format
Every entry in Eidetica must be signed. The authentication information is embedded in the entry structure:
{
"database": {
"root": "tree_root_id",
"parents": ["parent_entry_id"],
"data": "{\"key\": \"value\"}",
"metadata": "{\"_settings\": [\"settings_tip_id\"]}"
},
"stores": [
{
"name": "users",
"parents": ["parent_entry_id"],
"data": "{\"user_data\": \"example\"}"
}
],
"auth": {
"sig": "ed25519_signature_base64_encoded",
"key": "KEY_LAPTOP"
}
}
The `auth.key` field can be either:
- Direct key: A string referencing a key name in this database's `_settings.auth`
- Delegation path: An ordered list of `{"key": "delegated_tree_1", "tips": ["A", "B"]}` elements, where the last element must contain only a `"key"` field
The `auth.sig` field contains the base64-encoded Ed25519 signature of the entry's content hash.
Key Management
Key Lifecycle
The current implementation supports two key statuses:
stateDiagram-v2
[*] --> Active: Key Added
Active --> Revoked: Revoke Key
Revoked --> Active: Reactivate Key
note right of Active : Can create new entries
note right of Revoked : Historical entries preserved, cannot create new entries
Key Status Semantics
- Active: Key can create new entries and all historical entries remain valid
- Revoked: Key cannot create new entries. Historical entries remain valid and their content is preserved during merges
Key Behavioral Details:
- Entries created before revocation remain valid to preserve history integrity
- An Admin can transition a key back to Active state from Revoked status
- Revoked status prevents new entries but preserves existing content in merges
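These semantics can be captured in a pair of predicates (hypothetical names, for illustration only):

```rust
// Illustrative encoding of the two key statuses and their semantics.
#[derive(Debug, Clone, Copy, PartialEq)]
enum KeyStatus {
    Active,
    Revoked,
}

impl KeyStatus {
    // Only active keys may sign new entries.
    fn can_create_entries(&self) -> bool {
        *self == KeyStatus::Active
    }
    // Entries created while the key was valid stay valid either way.
    fn historical_entries_valid(&self) -> bool {
        true
    }
}

fn main() {
    assert!(KeyStatus::Active.can_create_entries());
    assert!(!KeyStatus::Revoked.can_create_entries());
    assert!(KeyStatus::Revoked.historical_entries_valid());
}
```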
Priority System
Priority is integrated into the permission levels for Admin and Write permissions:
- Admin(priority): Can modify settings and manage keys with equal or lower priority
- Write(priority): Can write data but not modify settings
- Read: No priority, read-only access
Priority values are u32 integers where lower values indicate higher priority:
- Priority `0`: Highest priority, typically the initial admin key
- Higher numbers = lower priority
- Keys can only modify other keys with equal or lower priority (equal or higher number)
Important: Priority only affects administrative operations (key management). It does not influence CRDT merge conflict resolution, which uses Last Write Wins semantics based on the DAG structure.
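Because lower numbers mean higher priority, the management rule reduces to a numeric comparison. A sketch with a hypothetical helper:

```rust
// Lower number = higher priority. A key may manage targets whose
// priority number is equal or higher (i.e. equal or lower priority).
fn can_manage(actor_priority: u32, target_priority: u32) -> bool {
    actor_priority <= target_priority
}

fn main() {
    assert!(can_manage(0, 10));  // highest-priority admin manages anyone
    assert!(can_manage(10, 10)); // equal priority is allowed
    assert!(!can_manage(10, 0)); // cannot modify a higher-priority key
}
```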
Delegation (Delegated Databases)
Status: Fully implemented and functional with comprehensive test coverage. Delegated databases enable powerful authentication delegation patterns.
Concept and Benefits
Delegation allows any database to be referenced as a source of authentication keys for another database. This enables flexible authentication patterns where databases can delegate authentication to other databases without granting administrative privileges on the delegating database. Key benefits include:
- Flexible Delegation: Any database can delegate authentication to any other database
- User Autonomy: Users can manage their own personal databases with keys they control
- Cross-Project Authentication: Share authentication across multiple projects or databases
- Granular Permissions: Set both minimum and maximum permission bounds for delegated keys
Delegated databases are normal databases, and their authentication settings are used with permission clamping applied.
Important: Any database can be used as a delegated database - there's no special "authentication database" type. This means:
- A project's main database can delegate to a user's personal database
- Multiple projects can delegate to the same shared authentication database
- Databases can form delegation networks where databases delegate to each other
- The delegated database doesn't need to know it's being used for delegation
Structure
A delegated database reference in the main database's `_settings.auth` contains:
{
"_settings": {
"auth": {
"example@eidetica.dev": {
"permission-bounds": {
"max": "write:15",
"min": "read" // optional, defaults to no minimum
},
"database": {
"root": "hash_of_root_entry",
"tips": ["hash1", "hash2"]
}
},
"another@example.com": {
"permission-bounds": {
"max": "admin:20" // min not specified, so no minimum bound
},
"database": {
"root": "hash_of_another_root",
"tips": ["hash3"]
}
}
}
}
}
The referenced delegated database maintains its own `_settings.auth` with direct keys:
{
"_settings": {
"auth": {
"KEY_LAPTOP": {
"pubkey": "ed25519:AAAAC3NzaC1lZDI1NTE5AAAAI...",
"permissions": "admin:0",
"status": "active"
},
"KEY_MOBILE": {
"pubkey": "ed25519:AAAAC3NzaC1lZDI1NTE5AAAAI...",
"permissions": "write:10",
"status": "active"
}
}
}
}
Permission Clamping
Permissions from delegated databases are clamped based on the `permission-bounds` field in the main database's reference:
- max (required): The maximum permission level that keys from the delegated database can have
  - Must be <= the permissions of the key adding the delegated database reference
- min (optional): The minimum permission level for keys from the delegated database
  - If not specified, there is no minimum bound
  - If specified, keys with lower permissions are raised to this level
The effective priority is derived from the effective permission returned after clamping. If the delegated key's permission already lies within the `min`/`max` bounds, its original priority value is preserved; when a permission is clamped to a bound, the bound's priority value becomes the effective priority:
graph LR
A["Delegated Database: admin:5"] --> B["Main Database: max=write:10, min=read"] --> C["Effective: write:10"]
D["Delegated Database: write:8"] --> B --> E["Effective: write:8"]
F["Delegated Database: read"] --> B --> G["Effective: read"]
H["Delegated Database: admin:5"] --> I["Main Database: max=read (no min)"] --> J["Effective: read"]
K["Delegated Database: read"] --> I --> L["Effective: read"]
M["Delegated Database: write:20"] --> N["Main Database: max=admin:15, min=write:25"] --> O["Effective: write:25"]
Clamping Rules:
- Effective permission = clamp(delegated_tree_permission, min, max)
- If delegated database permission > max, it's lowered to max
- If min is specified and delegated database permission < min, it's raised to min
- If min is not specified, no minimum bound is applied
- The max bound must be <= permissions of the key that added the delegated database reference
- Effective priority = priority embedded in the effective permission produced by clamping. This is either the delegated key's priority (when already inside the bounds) or the priority of the `min`/`max` bound that performed the clamp
- Delegated database admin permissions only apply within that delegated database
- Permission clamping occurs at each level of delegation chains
- Note: There is no "none" permission level - absence of permissions means no access
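The clamping rules above can be sketched as a function over permission levels. This is a simplified model: it compares levels only (Read < Write < Admin) and adopts the bound's priority whenever a clamp occurs; the finer priority interactions described above are out of scope for the sketch, and all names are hypothetical:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Permission {
    Read,
    Write(u32),
    Admin(u32),
}

impl Permission {
    fn level(&self) -> u8 {
        match self {
            Permission::Read => 0,
            Permission::Write(_) => 1,
            Permission::Admin(_) => 2,
        }
    }
}

// clamp(delegated, min, max): lower to max if above it, raise to min if
// below it, otherwise keep the delegated permission (and its priority).
fn clamp(delegated: Permission, min: Option<Permission>, max: Permission) -> Permission {
    if delegated.level() > max.level() {
        return max; // clamped down; the bound's priority applies
    }
    if let Some(min) = min {
        if delegated.level() < min.level() {
            return min; // raised; the bound's priority applies
        }
    }
    delegated // within bounds; original priority preserved
}

fn main() {
    // admin:5 clamped by max=write:10 -> write:10
    assert_eq!(
        clamp(Permission::Admin(5), Some(Permission::Read), Permission::Write(10)),
        Permission::Write(10)
    );
    // write:8 already within bounds -> unchanged
    assert_eq!(
        clamp(Permission::Write(8), Some(Permission::Read), Permission::Write(10)),
        Permission::Write(8)
    );
    // read within bounds -> unchanged
    assert_eq!(
        clamp(Permission::Read, Some(Permission::Read), Permission::Write(10)),
        Permission::Read
    );
}
```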
Multi-Level References
Delegated databases can reference other delegated databases, creating delegation chains:
{
"auth": {
"sig": "signature_bytes",
"key": [
{
"key": "example@eidetica.dev",
"tips": ["current_tip"]
},
{
"key": "old-identity",
"tips": ["old_tip"]
},
{
"key": "LEGACY_KEY"
}
]
}
}
Delegation Chain Rules:
- The `auth.key` field contains an ordered list representing the delegation path
- Each element has a `"key"` field and optionally `"tips"` for delegated databases
- The final element must contain only a `"key"` field (the actual signing key)
- Each step represents traversing from one database to the next in the delegation chain
- Permission clamping applies at each level using the minimum function
- Priority at each step is the priority inside the permission value that survives the clamp at that level (outer reference, inner key, or bound, depending on which one is selected by the clamping rules)
- Tips must be valid at each level of the chain for the delegation to be valid
Delegated Database References
The main database must validate the delegated database's structure in addition to its own.
Latest Known Tips
"Latest known tips" refers to the latest tips of a delegated database that have been seen used in valid key signatures within the current database. This creates a "high water mark" for each delegated database:
- When an entry uses a delegated database key, it includes the delegated database's tips at signing time
- The database tracks these tips as the "latest known tips" for that delegated database
- Future entries using that delegated database must reference tips that are equal to or newer than the latest known tips, or must be valid at the latest known tips
- This ensures that key revocations in delegated databases are respected once observed
Tip Tracking and Validation
To validate entries with delegated database keys:
- Check that the referenced tips are descendants of (or equal to) the latest known tips for that delegated database
- If they're not, check that the entry validates at the latest known tips
- Verify the key exists and has appropriate permissions at those tips
- Update the latest known tips if these are newer
- Apply permission clamping based on the delegation reference
This mechanism ensures that once a key revocation is observed in a delegated database, no entry can use an older version of that database where the key was still valid.
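The high-water mark can be sketched as a map from delegated database ID to its latest known tips, with DAG ancestry abstracted behind a caller-supplied predicate. All names are illustrative, and the fallback of validating an entry at the latest known tips is omitted:

```rust
use std::collections::{HashMap, HashSet};

// Illustrative high-water mark: latest known tips per delegated database.
struct TipTracker {
    latest: HashMap<String, HashSet<String>>,
}

impl TipTracker {
    fn new() -> Self {
        TipTracker { latest: HashMap::new() }
    }

    // `descends(a, b)` answers "is `a` a descendant of (or equal to) `b`?"
    // in the delegated database's DAG; a real implementation traverses it.
    fn accept(
        &mut self,
        tree: &str,
        referenced: &HashSet<String>,
        descends: impl Fn(&str, &str) -> bool,
    ) -> bool {
        let known = self.latest.entry(tree.to_string()).or_default();
        // Every previously recorded tip must be covered by a newer tip.
        let ok = known
            .iter()
            .all(|old| referenced.iter().any(|new| descends(new, old)));
        if ok {
            // Advance the high-water mark once newer tips are observed.
            *known = referenced.clone();
        }
        ok
    }
}

fn main() {
    // Toy linear history A -> B -> C, so "B" descends from "A", etc.
    let order = |t: &str| match t { "A" => 0, "B" => 1, _ => 2 };
    let descends = |a: &str, b: &str| order(a) >= order(b);

    let mut tracker = TipTracker::new();
    let tips_b: HashSet<String> = ["B".to_string()].into();
    let tips_a: HashSet<String> = ["A".to_string()].into();
    assert!(tracker.accept("user_tree", &tips_b, descends));
    // Tips older than the recorded high-water mark are rejected.
    assert!(!tracker.accept("user_tree", &tips_a, descends));
}
```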
Key Revocation
Delegated database key deletion is always treated as `revoked` status in the main database. This prevents new entries from building on the deleted key's content while preserving the historical content during merges. This approach maintains the integrity of existing entries while preventing future reliance on removed authentication credentials.
By treating delegated database key deletion as `revoked` status, users can manage their own key lifecycle in their delegated databases while ensuring that:
- Historical entries remain valid and their content is preserved
- New entries cannot use the revoked key's entries as parents
- The merge operation proceeds normally with content preserved
- Users cannot create conflicts that would affect other users' valid entries
Conflict Resolution and Merging
Conflicts in the `_settings` store are resolved by the `crate::crdt::Doc` type using Last Write Wins (LWW) semantics. When the database has diverged, with both sides of the merge having written to the `_settings` store, the write with the higher logical timestamp (determined by the DAG structure) wins, regardless of the priority of the signing key.
Priority rules apply only to administrative permissions - determining which keys can modify other keys - but do not influence the conflict resolution during merges.
This is applied to delegated databases as well. A write to the Main Database must also recursively merge any changed settings in the delegated databases using the same LWW strategy to handle network splits in the delegated databases.
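The selection rule can be illustrated with a toy merge. In Eidetica the timestamp is a logical ordering derived deterministically from the DAG, not a wall clock; this sketch shows only the tiebreak:

```rust
// Illustrative LWW merge: the write with the higher logical timestamp wins.
#[derive(Debug, Clone, PartialEq)]
struct SettingWrite {
    value: String,
    logical_ts: u64, // derived deterministically from the DAG in practice
}

fn lww_merge(a: SettingWrite, b: SettingWrite) -> SettingWrite {
    if a.logical_ts >= b.logical_ts { a } else { b }
}

fn main() {
    // One side bans a user at T1; the other promotes them at T2 > T1.
    let t1 = SettingWrite { value: "user_bob=banned".into(), logical_ts: 1 };
    let t2 = SettingWrite { value: "user_bob=admin:5".into(), logical_ts: 2 };
    // The later write wins regardless of the signing key's priority.
    assert_eq!(lww_merge(t1, t2).value, "user_bob=admin:5");
}
```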
Key Status Changes in Delegated Databases: Examples
The following examples demonstrate how key status changes in delegated databases affect entries in the main database.
Example 1: Basic Delegated Database Key Status Change
Initial State:
graph TD
subgraph "Main Database"
A["Entry A<br/>Settings: delegated_tree1 = max:write:10, min:read<br/>Tip: UA"]
B["Entry B<br/>Signed by delegated_tree1:laptop<br/>Tip: UA<br/>Status: Valid"]
C["Entry C<br/>Signed by delegated_tree1:laptop<br/>Tip: UB<br/>Status: Valid"]
end
subgraph "Delegated Database"
UA["Entry UA<br/>Settings: laptop = active"]
UB["Entry UB<br/>Signed by laptop"]
end
A --> B
B --> C
UA --> UB
After Key Status Change in Delegated Database:
graph TD
subgraph "Main Database"
A["Entry A<br/>Settings: user1 = write:15"]
B["Entry B<br/>Signed by delegated_tree1:laptop<br/>Tip: UA<br/>Status: Valid"]
C["Entry C<br/>Signed by delegated_tree1:laptop<br/>Tip: UB<br/>Status: Valid"]
D["Entry D<br/>Signed by delegated_tree1:mobile<br/>Tip: UC<br/>Status: Valid"]
E["Entry E<br/>Signed by delegated_tree1:laptop<br/>Parent: C<br/>Tip: UB<br/>Status: Valid"]
F["Entry F<br/>Signed by delegated_tree1:mobile<br/>Tip: UC<br/>Sees E but ignores since the key is invalid"]
G["Entry G<br/>Signed by delegated_tree1:desktop<br/>Tip: UB<br/>Still thinks delegated_tree1:laptop is valid"]
H["Entry H<br/>Signed by delegated_tree1:mobile<br/>Tip: UC<br/>Merges, as there is a valid key at G"]
end
subgraph "Delegated Database (delegated_tree1)"
UA["Entry UA<br/>Settings: laptop = active, mobile = active, desktop = active"]
UB["Entry UB<br/>Signed by laptop"]
UC["Entry UC<br/>Settings: laptop = revoked<br/>Signed by mobile"]
end
A --> B
B --> C
C --> D
D --> F
C --> E
E --> G
F --> H
G --> H
UA --> UB
UB --> UC
Example 2: Last Write Wins Conflict Resolution
Scenario: Two admins make conflicting authentication changes during a network partition. Priority determines who can make the changes, but Last Write Wins determines the final merged state.
After Network Reconnection and Merge:
graph TD
subgraph "Merged Main Database"
A["Entry A"]
B["Entry B<br/>Alice (admin:10) bans user_bob<br/>Timestamp: T1"]
C["Entry C<br/>Super admin (admin:0) promotes user_bob to admin:5<br/>Timestamp: T2"]
M["Entry M<br/>Merge entry<br/>user_bob = admin<br/>Last write (T2) wins via LWW"]
N["Entry N<br/>Alice attempts to ban user_bob<br/>Rejected: Alice can't modify admin-level user with higher priority"]
end
A --> B
A --> C
B --> M
C --> M
M --> N
Key Points:
- All administrative actions are preserved in history
- Last Write Wins resolves the merge conflict: the most recent change (T2) takes precedence
- Permission-based authorization still prevents unauthorized modifications: Alice (admin:10) cannot ban a higher-priority user (admin:5) due to insufficient priority level
- The merged state reflects the most recent write, not the permission priority
- Permission priority rules prevent Alice from making the change in Entry N, as she lacks authority to modify higher-priority admin users
Authorization Scenarios
Network Partition Recovery
When network partitions occur, the authentication system must handle concurrent changes gracefully:
Scenario: Two branches of the database independently modify the auth settings, requiring CRDT-based conflict resolution using Last Write Wins.
Both branches share the same root, but a network partition has caused them to diverge before merging back together.
graph TD
subgraph "Merged Main Database"
ROOT["Entry ROOT"]
A1["Entry A1<br/>admin adds new_developer<br/>Timestamp: T1"]
A2["Entry A2<br/>dev_team revokes contractor_alice<br/>Timestamp: T3"]
B1["Entry B1<br/>contractor_alice data change<br/>Valid at time of creation"]
B2["Entry B2<br/>admin adds emergency_key<br/>Timestamp: T2"]
M["Entry M<br/>Merge entry<br/>Final state based on LWW:<br/>- new_developer: added (T1)<br/>- emergency_key: added (T2)<br/>- contractor_alice: revoked (T3, latest)"]
end
ROOT --> A1
ROOT --> B1
A1 --> A2
B1 --> B2
A2 --> M
B2 --> M
Conflict Resolution Rules Applied:
- Settings Merge: All authentication changes are merged using Doc CRDT semantics with Last Write Wins
- Timestamp Ordering: Changes are resolved based on logical timestamps, with the most recent change taking precedence
- Historical Validity: Entry B1 remains valid because it was created before the status change
- Content Preservation: With "revoked" status, content is preserved in merges but cannot be used as parents for new entries
- Future Restrictions: Future entries by contractor_alice would be rejected based on the applied status change
Security Considerations
Threat Model
Protected Against
- Unauthorized Entry Creation: All entries must be signed by valid keys
- Permission Escalation: Users cannot grant themselves higher privileges than their main database reference
- Historical Tampering: Immutable DAG prevents retroactive modifications
- Replay Attacks: Content-addressable IDs prevent entry duplication
- Administrative Hierarchy Violations: Lower priority keys cannot modify higher priority keys (but can modify equal priority keys)
- Permission Boundary Violations: Delegated database permissions are constrained within their specified min/max bounds
- Race Conditions: Last Write Wins provides deterministic conflict resolution
Requires Manual Recovery
- Admin Key Compromise: When no higher-priority key exists
- Conflicting Administrative Changes: LWW may result in unintended administrative state during network partitions
Cryptographic Assumptions
- Ed25519 Security: Default to ed25519 signatures with explicit key type storage
- Hash Function Security: SHA-256 for content addressing
- Key Storage: Private keys must be securely stored by clients
- Network Security: Assumption of eventually consistent but potentially unreliable network
Attack Vectors
Mitigated
- Key Replay: Content-addressable entry IDs prevent signature replay
- Downgrade Attacks: Explicit key type storage prevents algorithm confusion
- Partition Attacks: CRDT merging handles network partition scenarios
- Privilege Escalation: Permission clamping prevents users from exceeding granted permissions
Partial Mitigation
- DoS via Large Histories: Priority system limits damage from compromised lower-priority keys
- Social Engineering: Administrative hierarchy limits scope of individual key compromise
- Timestamp Manipulation: LWW conflict resolution is deterministic but may be influenced by the chosen timestamp resolution algorithm
- Administrative Confusion: Network partitions may result in unexpected administrative states due to LWW resolution
Not Addressed
- Side-Channel Attacks: Client-side key storage security is out of scope
- Physical Key Extraction: Assumed to be handled by client security measures
- Long-term Cryptographic Breaks: Future crypto-agility may be needed
Implementation Details
Authentication Validation Process
The current validation process:
- Extract Authentication Info: Parse the `auth` field from the entry
- Resolve Key Name: Look up the direct key in `_settings.auth`
- Check Key Status: Verify the key is Active (not Revoked)
- Validate Signature: Verify the Ed25519 signature against the entry content hash
- Check Permissions: Ensure the key has sufficient permissions for the operation
Current features include: direct key validation, delegated database resolution, tip validation, and permission clamping.
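The steps above can be sketched as a short pipeline. Every name here is illustrative, and the signature check is a stub standing in for real Ed25519 verification:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum KeyStatus { Active, Revoked }

struct AuthKey {
    pubkey: String,
    status: KeyStatus,
    can_write: bool,
}

struct EntryAuth {
    key_name: String,
    sig: String,
}

// Stub: a real implementation verifies an Ed25519 signature over the
// entry's content hash.
fn verify_signature(pubkey: &str, sig: &str, content_hash: &str) -> bool {
    sig == format!("signed({pubkey},{content_hash})")
}

fn validate(
    settings_auth: &HashMap<String, AuthKey>,
    auth: &EntryAuth,
    content_hash: &str,
    needs_write: bool,
) -> Result<(), &'static str> {
    // 1-2. Extract auth info and resolve the key name.
    let key = settings_auth.get(&auth.key_name).ok_or("unknown key")?;
    // 3. Check key status.
    if key.status != KeyStatus::Active {
        return Err("key revoked");
    }
    // 4. Validate the signature.
    if !verify_signature(&key.pubkey, &auth.sig, content_hash) {
        return Err("bad signature");
    }
    // 5. Check permissions for the operation.
    if needs_write && !key.can_write {
        return Err("insufficient permissions");
    }
    Ok(())
}

fn main() {
    let mut auth = HashMap::new();
    auth.insert("KEY_LAPTOP".to_string(), AuthKey {
        pubkey: "pk1".into(), status: KeyStatus::Active, can_write: true,
    });
    let entry_auth = EntryAuth {
        key_name: "KEY_LAPTOP".into(),
        sig: "signed(pk1,hash123)".into(),
    };
    assert!(validate(&auth, &entry_auth, "hash123", true).is_ok());
}
```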
Sync Permissions
Eidetica servers require proof of read permissions before allowing database synchronization. The server challenges the client to sign a random nonce, then validates the signature against the database's authentication configuration.
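The challenge flow is: the server issues a random nonce, the client returns a signature over it, and the server verifies that signature against keys holding read permission. The sketch below substitutes a keyed hash for Ed25519 and is purely illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for signing: hash the nonce together with a key string.
// A real implementation uses an Ed25519 signature and the public keys
// from the database's _settings.auth configuration.
fn toy_sign(key: &str, nonce: u64) -> u64 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    nonce.hash(&mut h);
    h.finish()
}

struct SyncServer { nonce: u64 }

impl SyncServer {
    fn challenge(&self) -> u64 { self.nonce }
    fn verify(&self, key_on_record: &str, response: u64) -> bool {
        toy_sign(key_on_record, self.nonce) == response
    }
}

fn main() {
    let server = SyncServer { nonce: 0x5eed };
    // Client proves read permission by signing the challenge nonce.
    let response = toy_sign("client-key", server.challenge());
    assert!(server.verify("client-key", response));
    assert!(!server.verify("other-key", response));
}
```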
CRDT Metadata Considerations
The current system uses entry metadata to reference settings tips. With authentication:
- Metadata continues to reference current `_settings` tips for validation efficiency
- Authentication validation uses the settings state at the referenced tips
- This ensures entries are validated against the authentication rules that were current when created
Implementation Architecture
Core Components
- AuthValidator (`auth/validation.rs`): Validates entries and resolves authentication
  - Direct key resolution and validation
  - Signature verification
  - Permission checking
  - Caching for performance
- Crypto Module (`auth/crypto.rs`): Cryptographic operations
  - Ed25519 key generation and parsing
  - Entry signing and verification
  - Key format: `ed25519:<base64-encoded-public-key>`
- AuthSettings (`auth/settings.rs`): Settings management interface
  - Add/update/get authentication keys
  - Convert between settings storage and auth types
  - Validate authentication operations
- Permission Module (`auth/permission.rs`): Permission logic
  - Permission checking for operations
  - Permission clamping for delegated databases
Storage Format
Authentication configuration is stored in _settings.auth as a Doc CRDT:
// Key storage structure
AuthKey {
pubkey: String, // Ed25519 public key
permissions: Permission, // Admin(u32), Write(u32), or Read
status: KeyStatus, // Active or Revoked
}
Future Considerations
Current Implementation Status
- Direct Keys: ✅ Fully implemented and tested
- Delegated Databases: ✅ Fully implemented with comprehensive test coverage
- Permission Clamping: ✅ Functional for delegation chains
- Delegation Depth Limits: ✅ Implemented with MAX_DELEGATION_DEPTH=10
Future Enhancements
- Advanced Key Status: Add Ignore and Banned statuses for more nuanced key management
- Performance Optimizations: Further caching and validation improvements
- User experience improvements for key management
References
Synchronization Design Document
This document outlines the design principles, architecture decisions, and implementation strategy for Eidetica's synchronization system.
Design Goals
Primary Objectives
- Decentralized Architecture: No central coordination required
- Performance: Minimize latency and maximize throughput
- Reliability: Handle network failures and recover gracefully
- Scalability: Support many peers and large datasets
- Security: Authenticated and verified peer communications
- Simplicity: Easy to configure and use
Non-Goals
- Selective sync: Sync entire databases only (not partial)
- Multi-hop routing: Direct peer connections only
- Complex conflict resolution: CRDT-based automatic resolution only
- Centralized coordination: No dependency on coordination servers
Core Design Principles
1. Merkle-CRDT Foundation
The sync system builds on Merkle DAG and CRDT principles:
- Content-addressable entries: Immutable, hash-identified data
- DAG structure: Parent-child relationships form directed acyclic graph
- CRDT merging: Deterministic conflict resolution
- Causal consistency: Operations maintain causal ordering
Benefits:
- Natural deduplication (same content = same hash)
- Efficient diff computation (compare tips)
- Automatic conflict resolution
- Partition tolerance
2. BackgroundSync Engine with Command Pattern
Decision: Single background thread with command-channel communication
Rationale:
- Clean architecture: Eliminates circular dependencies
- Ownership clarity: Background thread owns transport state
- Non-blocking: Commands sent via channels don't block operations
- Flexibility: Fire-and-forget or request-response patterns
Implementation:
The sync system uses a thin frontend that sends commands to a background thread:
- Frontend handles API and peer/relationship management in sync database
- Background owns transport and handles network operations
- Both components access sync database directly for peer data
- Commands used only for operations requiring background processing
- Failed operations added to retry queue
Trade-offs:
- ✅ No circular dependencies or complex locking
- ✅ Clear ownership model (transport in background, data in sync database)
- ✅ Works in both async and sync contexts
- ✅ Graceful startup/shutdown handling
- ❌ All sync operations serialized through single thread
3. Hook-Based Change Detection
Decision: Use trait-based hooks integrated into Transaction commit
Rationale:
- Automatic: No manual sync triggering required
- Consistent: Every commit is considered for sync
- Extensible: Additional hooks can be added
- Performance: Minimal overhead when sync disabled
Architecture:
// Hook trait for extensibility
trait SyncHook {
fn on_entry_committed(&self, context: &SyncHookContext) -> Result<()>;
}
// Integration point in Transaction
impl Transaction {
pub fn commit(self) -> Result<ID> {
let entry = self.build_and_store_entry()?;
// Execute sync hooks after successful storage
if let Some(hooks) = &self.sync_hooks {
let context = SyncHookContext { tree_id, entry, ... };
hooks.execute_hooks(&context)?;
}
Ok(entry.id())
}
}
Benefits:
- Zero-configuration automatic sync
- Guaranteed coverage of all changes
- Failure isolation (hook failures don't affect commits)
- Easy testing and debugging
4. Modular Transport Layer with SyncHandler Architecture
Decision: Abstract transport layer with handler-based request processing
Core Interface:
pub trait SyncTransport: Send + Sync {
/// Start server with handler for processing sync requests
async fn start_server(&mut self, addr: &str, handler: Arc<dyn SyncHandler>) -> Result<()>;
/// Send sync request and get response
async fn send_request(&self, address: &Address, request: &SyncRequest) -> Result<SyncResponse>;
}
pub trait SyncHandler: Send + Sync {
/// Process incoming sync requests with database access
async fn handle_request(&self, request: &SyncRequest) -> SyncResponse;
}
Rationale:
- Database Access: Handlers can store received entries via backend
- Stateful Processing: Support GetTips, GetEntries, SendEntries operations
- Clean Separation: Transport handles networking, handler handles sync logic
- Flexibility: Support different network environments
- Evolution: Easy to add new transport protocols
- Testing: Mock transports for unit tests
Supported Transports:
HTTP Transport
pub struct HttpTransport {
client: reqwest::Client,
server: Option<HttpServer>,
handler: Option<Arc<dyn SyncHandler>>,
}
Implementation:
- Axum server with handler state injection
- JSON serialization at the /api/v0 endpoint
- Handler processes requests with database access
Use cases:
- Simple development and testing
- Firewall-friendly environments
- Integration with existing HTTP infrastructure
Trade-offs:
- ✅ Widely supported and debuggable
- ✅ Works through most firewalls/proxies
- ✅ Full database access via handler
- ❌ Less efficient than P2P protocols
- ❌ Requires port management
Iroh P2P Transport
pub struct IrohTransport {
endpoint: Option<Endpoint>,
server_state: ServerState,
handler: Option<Arc<dyn SyncHandler>>,
}
Implementation:
- QUIC bidirectional streams for request/response
- Handler integration in stream processing
- JsonHandler for serialization consistency
Use cases:
- Production deployments
- NAT traversal required
- Direct peer-to-peer communication
Trade-offs:
- ✅ Efficient P2P protocol with NAT traversal
- ✅ Built-in relay and hole punching
- ✅ QUIC-based with modern networking features
- ✅ Full database access via handler
- ❌ More complex setup and debugging
- ❌ Additional dependency
5. Persistent State Management
Decision: All peer and relationship state stored persistently in sync database
Architecture:
Sync Database (Persistent):
├── peers/{peer_pubkey} -> PeerInfo (addresses, status, metadata)
├── relationships/{peer}/{database} -> SyncRelationship
├── sync_state/cursors/{peer}/{database} -> SyncCursor
├── sync_state/metadata/{peer} -> SyncMetadata
└── sync_state/history/{sync_id} -> SyncHistoryEntry
BackgroundSync (Transient):
├── retry_queue: Vec<RetryEntry> (failed sends pending retry)
└── sync_tree_id: ID (reference to sync database for peer lookups)
Design:
- All peer data is stored in the sync database via PeerManager
- BackgroundSync reads peer information on-demand when needed
- Frontend writes peer/relationship changes directly to sync database
- Single source of truth in persistent storage
Rationale:
- Durability: All critical state survives restarts
- Consistency: Single source of truth in sync database
- Recovery: Full state recovery after failures
- Simplicity: No duplicate state management
Architecture Deep Dive
Component Interactions
graph LR
subgraph "Change Detection"
A[Transaction::commit] --> B[SyncHooks]
B --> C[SyncHookImpl]
end
subgraph "Command Channel"
C --> D[Command TX]
D --> E[Command RX]
end
subgraph "BackgroundSync Thread"
E --> F[BackgroundSync]
F --> G[Transport Layer]
G --> H[HTTP/Iroh/Custom]
F --> I[Retry Queue]
F -.->|reads| ST[Sync Database]
end
subgraph "State Management"
K[SyncStateManager] --> L[Persistent State]
F --> K
end
subgraph "Peer Management"
M[PeerManager] --> N[Peer Registry]
F --> M
end
Data Flow Design
1. Entry Commit Flow
1. Application calls database.new_transaction().commit()
2. Transaction stores entry in backend
3. Transaction executes sync hooks
4. SyncHookImpl creates QueueEntry command
5. Command sent to BackgroundSync via channel
6. BackgroundSync fetches entry from backend
7. Entry sent immediately to peer via transport
8. Failed sends added to retry queue
2. Peer Connection Flow
1. Application calls sync.connect_to_peer(address)
2. Sync creates HandshakeRequest with device info
3. Transport sends handshake to peer
4. Peer responds with HandshakeResponse
5. Both peers verify signatures and protocol versions
6. Successful peers are registered in PeerManager
7. Connection state updated to Connected
3. Sync Relationship Flow
1. Application calls sync.add_tree_sync(peer_id, tree_id)
2. PeerManager stores relationship in sync database
3. Future commits to database trigger sync hooks
4. SyncChangeDetector finds this peer in relationships
5. Entries queued for sync with this peer
BackgroundSync Command Management
Command Structure
The BackgroundSync engine processes commands sent from the frontend:
- SendEntries: Direct entry transmission to peer
- QueueEntry: Entry committed, needs sync
- AddPeer/RemovePeer: Peer registry management
- CreateRelationship: Database-peer sync mapping
- StartServer/StopServer: Transport server control
- ConnectToPeer: Establish peer connection
- SyncWithPeer: Trigger bidirectional sync
- Shutdown: Graceful termination
Processing Model
Immediate processing: Commands handled as received
- No batching delays or queue buildup
- Failed operations go to retry queue
- Fire-and-forget for most operations
- Request-response via oneshot channels when needed
Retry queue: Failed sends with exponential backoff
- 2^attempts seconds delay (max 64s)
- Configurable max attempts before dropping
- Processed every 30 seconds by timer
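The backoff rule stated above (2^attempts seconds, capped at 64s) is simple enough to show directly. The function name is illustrative, and whether the first retry counts as attempt 0 or 1 is an assumption here.

```rust
// Sketch of the retry delay rule: 2^attempts seconds, capped at 64s.
fn retry_delay_secs(attempts: u32) -> u64 {
    // Clamp the exponent so the shift cannot overflow; 2^6 = 64 is the cap.
    1u64 << attempts.min(6)
}

fn main() {
    assert_eq!(retry_delay_secs(0), 1);
    assert_eq!(retry_delay_secs(3), 8);
    assert_eq!(retry_delay_secs(6), 64);
    assert_eq!(retry_delay_secs(20), 64); // capped
}
```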
Error Handling Strategy
Transient errors: Retry with exponential backoff
- Network timeouts
- Temporary peer unavailability
- Transport-level failures
Persistent errors: Remove after max retries
- Invalid peer addresses
- Authentication failures
- Protocol incompatibilities
Recovery mechanisms:
// Automatic retry tracking
entry.mark_attempted(Some(error.to_string()));
// Cleanup failed entries periodically
queue.cleanup_failed_entries(max_retries)?;
// Metrics for monitoring
let stats = queue.get_sync_statistics()?;
Transport Layer Design
Iroh Transport Configuration
Design Decision: Builder pattern for transport configuration
The Iroh transport uses a builder pattern to support different deployment scenarios:
RelayMode Options:
- Default: Production deployments use n0's global relay infrastructure
- Staging: Testing against n0's staging infrastructure
- Disabled: Local testing without internet dependency
- Custom: Enterprise deployments with private relay servers
Rationale:
- Flexibility: Different environments need different configurations
- Performance: Local tests run faster without relay overhead
- Privacy: Enterprises can run private relay infrastructure
- Simplicity: Defaults work for most users without configuration
Address Serialization:
The Iroh transport serializes NodeAddr information as JSON containing:
- Node ID (cryptographic identity)
- Direct socket addresses (for P2P connectivity)
This allows the same get_server_address() interface to work for both HTTP (returns a socket address) and Iroh (returns rich connectivity info).
Security Design
Authentication Model
Device Identity:
- Each database instance has an Ed25519 keypair
- Public key serves as device identifier
- Private key signs all sync operations
Peer Verification:
- Handshake includes signature challenge
- Both peers verify counterpart signatures
- Only verified peers allowed to sync
Entry Authentication:
- All entries signed by creating device
- Receiving peer verifies signatures
- Invalid signatures rejected
Trust Model
Assumptions:
- Peers are semi-trusted (authenticated but may be malicious)
- Private keys are secure
- Transport layer provides integrity
Threat Mitigation:
- Man-in-middle: Ed25519 signatures prevent tampering
- Replay attacks: Entry IDs are content-based (no replays possible)
- Denial of service: Rate limiting and queue size limits
- Data corruption: Signature verification catches corruption
Protocol Security
Handshake Protocol:
A -> B: HandshakeRequest {
device_id: string,
public_key: ed25519_pubkey,
challenge: random_bytes(32),
signature: sign(private_key, challenge)
}
B -> A: HandshakeResponse {
device_id: string,
public_key: ed25519_pubkey,
challenge_response: sign(private_key, original_challenge),
counter_challenge: random_bytes(32)
}
A -> B: verify(B.public_key, challenge_response, challenge)
B -> A: verify(A.public_key, signature, challenge)
Performance Considerations
Memory Usage
Queue sizing:
- Default: 100 entries per peer × 100 bytes = 10KB per peer
- Configurable limits prevent memory exhaustion
- Automatic cleanup of failed entries
Persistent state:
- Minimal: ~1KB per peer relationship
- Periodic cleanup of old history entries
- Efficient serialization formats
Network Efficiency
Batching benefits:
- Reduce TCP/HTTP overhead
- Better bandwidth utilization
- Fewer transport-layer handshakes
Compression potential:
- Similar entries share structure
- JSON/binary format optimization
- Transport-level compression (HTTP gzip, QUIC)
CPU Usage
Background worker:
- Configurable check intervals
- Async processing doesn't block application
- Efficient queue scanning
Hook execution:
- Fast in-memory operations only
- Hook failures don't affect commits
- Minimal serialization overhead
Configuration Design
Queue Configuration
pub struct SyncQueueConfig {
pub max_queue_size: usize, // Size-based flush trigger
pub max_queue_age_secs: u64, // Age-based flush trigger
pub batch_size: usize, // Max entries per network call
}
Tuning guidelines:
- High-frequency apps: Lower max_queue_age_secs (5-15s)
- Batch workloads: Higher max_queue_size (200-1000)
- Low bandwidth: Lower batch_size (10-25)
- High bandwidth: Higher batch_size (100-500)
Worker Configuration
pub struct SyncFlushConfig {
pub check_interval_secs: u64, // How often to check for flushes
pub enabled: bool, // Enable/disable background worker
}
Trade-offs:
- Lower check_interval = more responsive, higher CPU
- Higher check_interval = less responsive, lower CPU
Implementation Strategy
Phase 1: Core Infrastructure ✅
- BackgroundSync engine with command pattern
- Hook-based change detection
- Basic peer management
- HTTP transport
- Ed25519 handshake protocol
Phase 2: Production Features ✅
- Iroh P2P transport (handler needs fix)
- Retry queue with exponential backoff
- Sync state persistence via DocStore
- Channel-based communication
- 78 integration tests passing
Phase 3: Advanced Features
- Sync priorities and QoS
- Bandwidth throttling
- Monitoring and metrics
- Multi-database coordination
Phase 4: Scalability
- Persistent queue spillover
- Streaming for large entries
- Advanced conflict resolution
- Performance analytics
Testing Strategy
Unit Testing
Component isolation:
- Mock transport layer for networking tests
- In-memory backends for storage tests
- Deterministic time for age-based tests
Coverage targets:
- Queue operations: 100%
- Hook execution: 100%
- Error handling: 95%
- State management: 95%
Integration Testing
Multi-peer scenarios:
- 2-peer bidirectional sync
- 3+ peer mesh networks
- Database sync relationship management
- Network failure recovery
Performance testing:
- Large queue handling
- High-frequency updates
- Memory usage under load
- Network efficiency measurement
End-to-End Testing
Real network conditions:
- Simulated network failures
- High latency connections
- Bandwidth constraints
- Concurrent peer connections
Migration and Compatibility
Backward Compatibility
Protocol versioning:
- Version negotiation in handshake
- Graceful degradation for older versions
- Clear upgrade paths
Data format evolution:
- Extensible serialization formats
- Schema migration strategies
- Rollback procedures
Deployment Considerations
Configuration migration:
- Default configuration for new installations
- Migration scripts for existing data
- Validation of configuration parameters
Operational procedures:
- Health check endpoints
- Monitoring integration
- Log aggregation and analysis
Future Evolution
Planned Enhancements
- Selective sync: Per-store sync control
- Conflict resolution: Advanced merge strategies
- Performance: Compression and protocol optimization
- Monitoring: Rich metrics and observability
- Scalability: Large-scale deployment support
Research Areas
- Byzantine fault tolerance: Handle malicious peers
- Incentive mechanisms: Economic models for sync
- Privacy: Encrypted sync protocols
- Consensus: Distributed agreement protocols
- Sharding: Horizontal scaling techniques
Success Metrics
Performance Targets
- Queue latency: < 1ms for queue operations
- Sync latency: < 5s for small changes in normal conditions
- Throughput: > 1000 entries/second per peer
- Memory usage: < 10MB for 100 active peers
Reliability Targets
- Availability: 99.9% sync success rate
- Recovery: < 30s to resume after network failure
- Consistency: 100% eventual consistency (no data loss)
- Security: 0 known authentication bypasses
Usability Targets
- Setup time: < 5 minutes for basic configuration
- Documentation: Complete API and troubleshooting guides
- Error messages: Clear, actionable error descriptions
- Monitoring: Built-in observability for operations teams
Settings Storage Design
Overview
This document describes how Eidetica stores, retrieves, and tracks settings in databases. Settings are stored exclusively in the _settings store and tracked via entry metadata for efficient access.
Architecture
Settings Storage
Settings are stored in the _settings store (constant SETTINGS in constants.rs):
// Settings structure in _settings store
{
"auth": {
"key_name": {
"key": "...", // Public key
"permissions": "...", // Permission level
"status": "..." // Active/Revoked
}
}
// Future: tree_config, replication, etc.
}
Key Properties:
- Data Type: Doc CRDT for deterministic merging
- Location: Exclusively in the _settings store
- Access: Through the Transaction::get_settings() method
Settings Retrieval
Transaction::get_settings() provides unified access to settings:
pub fn get_settings(&self) -> Result<Doc> {
// Get historical settings from the database
let mut historical_settings = self.get_full_state::<Doc>(SETTINGS)?;
// Get any staged changes to the _settings store in this operation
let staged_settings = self.get_local_data::<Doc>(SETTINGS)?;
// Merge using CRDT semantics
historical_settings = historical_settings.merge(&staged_settings)?;
Ok(historical_settings)
}
The method combines:
- Historical state: Computed from all relevant entries in the database
- Staged changes: Any modifications to _settings in the current operation
Entry Metadata
Every entry includes metadata tracking settings state:
#[derive(Debug, Clone, Serialize, Deserialize)]
struct EntryMetadata {
/// Tips of the _settings store at the time this entry was created
settings_tips: Vec<ID>,
/// Random entropy for ensuring unique IDs for root entries
entropy: Option<u64>,
}
Metadata Properties:
- Automatically populated by Transaction::commit()
- Used for efficient settings validation in sparse checkouts
- Stored in the TreeNode.metadata field as serialized JSON
Data Structures
Entry Structure
pub struct Entry {
database: TreeNode, // Main database node with metadata
stores: Vec<SubTreeNode>, // Named stores including _settings
sig: SigInfo, // Signature information
}
TreeNode Structure
struct TreeNode {
pub root: ID, // Root entry ID of the database
pub parents: Vec<ID>, // Parent entry IDs in main database history
pub metadata: Option<RawData>, // Structured metadata (settings tips, entropy)
}
Note: TreeNode no longer contains a data field - all data is stored in named stores.
SubTreeNode Structure
struct SubTreeNode {
pub name: String, // Store name (e.g., "_settings")
pub parents: Vec<ID>, // Parent entries in store history
pub data: RawData, // Serialized store data
}
Authentication Settings
Authentication configuration is stored in _settings.auth:
AuthSettings Structure
pub struct AuthSettings {
inner: Doc, // Wraps Doc data from _settings.auth
}
Key Operations:
- add_key(): Add or update authentication keys
- revoke_key(): Mark keys as revoked
- get_key(): Retrieve a specific key
- get_all_keys(): Get all authentication keys
Authentication Flow
- Settings Access: Transaction::get_settings() retrieves the current auth configuration
- Key Resolution: AuthValidator resolves key names to full key information
- Permission Check: Validates the operation against key permissions
- Signature Verification: Verifies entry signatures match configured keys
Usage Patterns
Reading Settings
// In a Transaction context
let settings = op.get_settings()?;
// Access auth configuration
if let Some(Value::Map(auth_map)) = settings.get("auth") {
// Process authentication settings
}
Modifying Settings
// Get a DocStore handle for the _settings store
let mut settings_store = op.get_subtree::<DocStore>("_settings")?;
// Update a setting
settings_store.set("tree_config.name", "My Database")?;
// Commit the operation
let entry_id = op.commit()?;
Bootstrap Process
When creating a database with authentication:
- First entry includes auth configuration in _settings.auth
- Transaction::commit() detects the bootstrap scenario
- Allows a self-signed entry to establish the initial auth configuration
Design Benefits
- Single Source of Truth: All settings live in the _settings store
- CRDT Semantics: Deterministic merge resolution for concurrent updates
- Efficient Access: Metadata tips enable quick settings retrieval
- Clean Architecture: Entry is pure data; Transaction handles business logic
- Extensibility: Easy to add new setting categories alongside auth
Error Handling Design
Overview
Error handling in Eidetica follows principles of modularity, locality, and user ergonomics using structured error types with zero-cost conversion.
Design Philosophy
Error Locality: Each module owns its error types, keeping them discoverable alongside functions that produce them.
Structured Error Data: Uses typed fields instead of string-based errors for pattern matching, context preservation, and performance.
Progressive Context: Errors gain context moving up the stack - lower layers provide technical details, higher layers add user-facing categorization.
Architecture
Error Hierarchy: A layered structure in which each module defines its own error types, aggregated into a top-level Error enum with variants for Io, Serialize, Auth, Backend, Base, CRDT, Store, and Transaction errors.
Module-Specific Errors: Each component has domain-specific error enums covering key resolution, storage operations, database management, merge conflicts, data access, and transaction coordination.
Transparent Conversion: #[error(transparent)] enables zero-cost conversion between module errors and the top-level type using the ? operator.
Error Categories
By Nature: Not found errors (module-specific variants), permission errors (authentication/authorization), validation errors (input/state consistency), operation errors (business logic violations).
By Layer: Core errors (fundamental operations), storage layer (database/persistence), data layer (CRDT/store operations), application layer (high-level coordination).
Error Handling Patterns
Contextual Propagation: Errors preserve context while moving up the stack, maintaining technical details and enabling categorization.
Classification Helpers: The top-level Error provides methods like is_not_found(), is_permission_denied(), and is_authentication_error() for broad category handling.
Non-Exhaustive Enums: All error enums use #[non_exhaustive] to allow future extension without breaking changes.
Performance
Zero-Cost Abstractions: Transparent errors eliminate wrapper overhead, structured fields avoid string formatting until display, no heap allocations in common paths.
Efficient Propagation: Seamless ? operator use across module boundaries with automatic conversion and preserved context.
Usage Patterns
Library Users: Use helper methods for stable APIs that won't break with new error variants.
Library Developers: Define new variants in appropriate module enums with structured fields for context, add helper methods for classification.
Extensibility
New error variants can be added without breaking existing code. Operations spanning modules can wrap/convert errors for appropriate context. Structured data enables sophisticated error recovery based on specific failure modes.