Skip to main content

HocusPocus Architecture

Overview

The HocusPocus server is the backbone of the collaborative editing for notebook. It provides real-time collaboration capabilities with a multi-layered data persistence strategy to ensure zero data loss during document editing.

The architecture is designed to handle:

  • Multiple simultaneous connections
  • Real-time document updates
  • Graceful disconnections
  • Version management
  • System shutdowns

All while ensuring zero data loss during collaborative editing sessions.

Key Components

Core Components

  • Hocuspocus Server: Central websocket server handling collaborative editing
  • YJS Document Store: In-memory document representation
  • Document Persistence Layers:
    • Immediate operation journaling
    • Redis for quick access caching (1-second debounce)
    • Database for long-term persistence
    • Periodic forced saves

Connection Lifecycle

Authentication Flow

Client                 HocusPocus Server                 Database
| | |
| --- Connect + JWT ------> | |
| | --- Verify JWT -----------> |
| | <-- User Info ------------- |
| | --- Track Activity -------> |
| <-- Auth Success -------- | |
| | |

Connection Handling

1. Authentication

The server authenticates users through JWT tokens:

2. Connection Establishment

When a client connects:

  • The server registers the document in the active documents map
  • Tracks the client connection
  • Marks the document as initializing for this connection
  • Sets up event handlers for document operations

3. Disconnection Handling

When a client disconnects:

  • Cleanup initialization tracking
  • Remove client from active clients list
  • Save the document if it's the last client and there are unsaved changes

Document Changes and Persistence

Change Detection

The server implements smart content change detection:

  • Uses previousDocumentStates to track prior state
  • Compares byte sizes to efficiently detect real changes
  • Only counts meaningful changes toward save thresholds
function hasActualContentChanges(documentName, update, context) {
// First update for this document, store as initial state
if (!previousDocumentStates.has(documentName)) {
previousDocumentStates.set(documentName, update);
return true;
}

const previousState = previousDocumentStates.get(documentName);

// Compare state sizes instead of using equals
const prevSize =
previousState.byteLength ||
(previousState.buffer ? previousState.buffer.byteLength : 0);
const newSize =
update.byteLength || (update.buffer ? update.buffer.byteLength : 0);

// Store current state for next comparison
previousDocumentStates.set(documentName, update);

// If sizes are different, content has changed
if (prevSize !== newSize) {
return true;
}

return false;
}

Multi-Layered Data Persistence

1. Immediate Journaling

Every meaningful update is immediately written to a journal file for disaster recovery:

const journalOperation = async (
documentName,
update,
context,
contentChanged,
) => {
try {
// Only journal if there's an actual content change
if (!contentChanged) {
return;
}

const timestamp = Date.now();
const journalFile = path.join(JOURNAL_DIR, `${documentName}.journal`);
const entry =
JSON.stringify({
timestamp,
userId: context?.user?.userId || SYSTEM_USER_ID,
updateLength: update.length,
updateBase64: update.toString("base64"),
}) + "\n";

// Append to journal file atomically
fs.appendFileSync(journalFile, entry);
} catch (error) {
console.error(`Journal error for ${documentName}:`, error);
}
};

2. Redis Caching (1-second debounce)

Document changes are stored in Redis with a short debounce period for quick access:

store: debounce(async ({ documentName, state, context }) => {
try {
// Always store in Redis regardless of content changes
const redisKey = getRedisKey(documentName);
await redisClient.set(redisKey, Buffer.from(state), 'EX', REDIS_DOC_EXPIRY);

// Check if this is an update we need to track
const contentChanged = hasActualContentChanges(documentName, state, context);

// Only increment counter and consider database persistence for content changes
if (contentChanged) {
const currentCount = (documentUpdateCounters.get(documentName) || 0) + 1;
documentUpdateCounters.set(documentName, currentCount);

// Check if we should save to database
if (shouldSaveToDatabase(documentName)) {
await saveDocument(documentName, state, context);
}
}
} catch (error) {
console.error('Error storing document:', error);
}
}, 1000, { maxWait: 5000 }), // 1 second debounce with max wait of 5 seconds

3. Database Persistence

Documents are saved to the database based on thresholds:

  • After a certain number of updates (UPDATE_THRESHOLD)
  • After a maximum time between saves (MAX_TIME_BETWEEN_SAVES)
  • On disconnection of the last client
  • On system shutdown
const shouldSaveToDatabase = (documentName) => {
const currentCount = documentUpdateCounters.get(documentName) || 0;
const lastSave = lastDocumentSaves.get(documentName) || 0;
const timeSinceLastSave = Date.now() - lastSave;

// Save if we've hit the update threshold or it's been too long since last save
const shouldSave =
currentCount >= UPDATE_THRESHOLD ||
timeSinceLastSave >= MAX_TIME_BETWEEN_SAVES;

return shouldSave;
};

4. Periodic Forced Saves

A periodic task runs to ensure documents with changes are saved regularly, with intelligent optimization to prevent excessive version creation:

// frequency (3 minutes )
setInterval(periodicSaveTask, 3 * 60 * 1000);

// Smart threshold configuration
const PERIODIC_SAVE_MIN_INTERVAL = 5 * 60 * 1000; // Minimum 5 minutes between saves
const PERIODIC_SAVE_MIN_UPDATES = Math.floor(UPDATE_THRESHOLD * 0.1); // Minimum 10% of threshold

async function periodicSaveTask() {
try {
for (const [documentName, docData] of activeDocuments.entries()) {
// Get status information
const updateCount = documentUpdateCounters.get(documentName) || 0;
const lastSave = lastDocumentSaves.get(documentName) || 0;
const timeSinceLastSave = Date.now() - lastSave;

// Smart save conditions
const hasEnoughUpdates = updateCount >= PERIODIC_SAVE_MIN_UPDATES;
const hasBeenLongEnough = timeSinceLastSave >= PERIODIC_SAVE_MIN_INTERVAL;
const hasClients = docData.clients.size > 0;
const isInactiveLongEnough =
Date.now() - docData.lastActivity >= INACTIVE_CLIENT_THRESHOLD;

// Only force save when truly needed to avoid excessive versions
if (
updateCount > 0 &&
hasBeenLongEnough &&
(hasEnoughUpdates || (!hasClients && isInactiveLongEnough))
) {
// Get document state from Redis
const redisKey = getRedisKey(documentName);
const redisData = await redisClient.getBuffer(redisKey);

if (redisData) {
await saveDocument(
documentName,
redisData,
{ user: { userId: SYSTEM_USER_ID } },
true,
);
}
}
}
} catch (error) {
console.error(`Error in periodic save task:`, error);
}
}

This optimized approach ensures documents are saved periodically while preventing excessive version creation by:

  • Only saving documents that have accumulated at least 10% of the normal update threshold
  • Enforcing a minimum of 5 minutes between periodic saves
  • Handling inactive documents with special logic
  • Running less frequently (every 3 minutes instead of every minute)

Document Loading and Version Switching

Loading Documents

When a document is requested, the server:

  1. First checks Redis for the latest state
  2. If not in Redis, retrieves from the database
  3. Decompresses the state and returns it
fetch: async ({ documentName }) => {
try {
// First try to get from Redis for speed
const redisKey = getRedisKey(documentName);
const redisData = await redisClient.getBuffer(redisKey);

if (redisData) {
return redisData; // Return the buffer directly
}

// If not in Redis, get from database
const doc = await notebookService.getDocument(documentName);
if (!doc || !doc.version) {
console.log(`Document ${documentName} not found in database`);
return null;
}

// Get version from database
const version = await versionService.getVersion(doc.version);
if (!version || !version.content) {
console.log(`Version not found for document ${documentName}`);
return null;
}
const decodedState = safeDecompress(version.content);

// Save to Redis for future quick access
await redisClient.set(
redisKey,
Buffer.from(decodedState),
"EX",
REDIS_DOC_EXPIRY,
);

return decodedState;
} catch (error) {
console.error(`Error fetching document ${documentName}:`, error.message);
return null;
}
};

Version Switching

Version switching is handled through the createNewDocumentVersion function:

async function createNewDocumentVersion(documentName, state, userId) {
// Get document from database
const doc = await notebookService.getDocument(documentName);
if (!doc) {
throw new Error(`Notebook not found: ${documentName}`);
}

// Compress document state for storage
const compressed = safeCompress(state);

// Create version in database
const version = await versionService.createVersion({
notebook: doc._id,
content: compressed,
user: userId || SYSTEM_USER_ID,
});

// Update document with new version ID
await notebookService.updateDocument(doc._id, { version: version._id });

// Track notebook modification
await recentActivityService.trackActivity({
user: userId || SYSTEM_USER_ID,
project: doc.project,
item: doc._id,
itemType: "Notebook",
isModification: true,
});

return version;
}

Redis Integration

Redis Role in the Architecture

Redis serves multiple critical roles in the HocusPocus server:

  1. Fast Document Caching:

    • Documents are stored in Redis with a 1-second debounce
    • Acts as a first-level persistence layer and quick access cache
  2. Message Broadcasting:

    • Redis pub/sub channels are used for broadcasting events
    • Notifications about document changes can be shared across server instances
  3. Temporary Storage:

    • Redis stores document state temporarily with an expiry (REDIS_DOC_EXPIRY)
    • Provides a buffer between in-memory and database persistence

Redis Data Flow

Client  -->  Server  -->  Redis  -->  Database
^ | ^ | ^
| | | | |
└----------┘ └----------┘ |
Real-time 1-sec debounce Threshold-based
Updates Persistence

Redis Keys and Expiry

const {
redisClient,
getRedisKey,
REDIS_DOC_EXPIRY,
} = require("../utils/redis");

// Redis key format: notebook:{documentId}
const redisKey = getRedisKey(documentName);

// Store with expiry
await redisClient.set(redisKey, Buffer.from(state), "EX", REDIS_DOC_EXPIRY);

Graceful Shutdown and Data Protection

The server implements graceful shutdown procedures to ensure no data is lost:

// Handle process termination gracefully
process.on("SIGTERM", async () => {
console.log("Received SIGTERM, saving all active documents...");
await saveAllDocuments();
// Clean up state tracking
previousDocumentStates.clear();
process.exit(0);
});

process.on("SIGINT", async () => {
console.log("Received SIGINT, saving all active documents...");
await saveAllDocuments();
// Clean up state tracking
previousDocumentStates.clear();
process.exit(0);
});

async function saveAllDocuments() {
const savePromises = [];
for (const [documentName] of activeDocuments.entries()) {
try {
// Get document state from Redis instead of memory
const redisKey = getRedisKey(documentName);
const redisData = await redisClient.getBuffer(redisKey);

if (redisData) {
savePromises.push(
saveDocument(
documentName,
redisData,
{ user: { userId: SYSTEM_USER_ID } },
true,
),
);
}
} catch (error) {
console.error(
`Error preparing document ${documentName} for shutdown save:`,
error,
);
}
}
await Promise.allSettled(savePromises);
}

Configuration and Performance Tuning

The server includes several tuning parameters that balance performance, responsiveness, and data safety:

  • UPDATE_THRESHOLD: Controls how many updates before saving to the database (default: 400). Higher values reduce database load but increase potential data loss window.

  • MAX_TIME_BETWEEN_SAVES: Maximum time between database saves (2 minutes). Ensures data is periodically persisted even if update threshold isn't reached.

  • INIT_TIMEOUT: Initialization timeout period (5 seconds). Prevents counting initial document loading updates toward save thresholds.

  • REDIS_DOC_EXPIRY: How long documents remain in Redis cache. Balances memory usage with access speed for recently accessed documents.

  • PERIODIC_SAVE_MIN_INTERVAL: Minimum time between periodic saves (5 minutes). Prevents excessive saving while ensuring periodic persistence.

  • debounce: Global update debounce (200ms). Controls how frequently changes are broadcast to all connected clients. Batches rapid changes to reduce network traffic while maintaining responsive collaboration.

  • Redis store debounce: 1 second with maxWait of 5 seconds. Controls frequency of document state saves to Redis cache, balancing performance with data safety.

  • Periodic save interval: Every 3 minutes. How often the system checks for documents that need saving.

  • timeout: Connection timeout (120 seconds). Maximum time to maintain idle connections before terminating them.