Under the hood

Architecture & Technical Design

A deep dive into the graph database, AI pipelines, three-tier storage model, and the design decisions that make Noemata work.

Stack

Built with

Next.js 16
App Router + React 19
TypeScript
Strict mode
Neo4j AuraDB
Graph database
Cloudflare R2
Blob storage
Claude API
Sonnet 4 · AI layer
OpenAI
text-embedding-3-small
Clerk
Authentication
Vercel
Edge deployment

Data model

Three-tier storage architecture

Content flows through three tiers based on size and retrieval needs. Short content lives inline on graph nodes. Longer content is stored as blobs. Large content is chunked for semantic retrieval.

Tier 1

Graph · Neo4j

Node metadata, relationships, vector embeddings, and short content under 100 words. Every node carries its type, timestamps, and properties. Relationships are first-class citizens: 16 semantic types, plus structural links like CHUNK_OF and NEXT_CHUNK.

Tier 2

Blob · Cloudflare R2

Full note bodies, PDFs, images, and attachments for content over 100 words. Stored as markdown files keyed by node type and ID. Referenced via content_uri on the graph node.
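
As a sketch of the Tier 2 hand-off: content over the threshold is written to R2 and the node keeps only a pointer. The key shape below mirrors "keyed by node type and ID" from the text, but the exact naming scheme and the `r2://` URI form are assumptions for illustration.

```typescript
// Hypothetical sketch of Tier 2 blob keying; exact scheme is an assumption.
function blobPointer(nodeType: string, nodeId: string) {
  const key = `${nodeType}/${nodeId}.md`      // R2 object key, e.g. note/abc123.md
  return { key, content_uri: `r2://${key}` }  // content_uri is stored on the graph node
}
```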

Tier 3

RAG · Chunks in Neo4j

Content over 300 words is split into semantic chunks with overlap. Each chunk gets an embedding vector via text-embedding-3-small. Chunks link back to parents via CHUNK_OF and NEXT_CHUNK relationships.

POST /api/ingest → word count check
  > 100 words  → upload to R2, store content_uri on node
  > 300 words  → chunk into ~200-token segments with overlap
                → create Chunk nodes with CHUNK_OF relationships
                → generate embeddings (queued)
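
The routing decision above can be sketched as a pure function of word count. The thresholds come from the flow; the type and function names are illustrative, not the app's actual code.

```typescript
// Tier routing sketch for /api/ingest; names are illustrative.
type StoragePlan = { inline: boolean; uploadToR2: boolean; chunk: boolean }

function planStorage(content: string): StoragePlan {
  const words = content.trim().split(/\s+/).filter(Boolean).length
  return {
    inline: words <= 100,     // Tier 1: short content lives on the node
    uploadToR2: words > 100,  // Tier 2: full body goes to blob storage
    chunk: words > 300,       // Tier 3: long content is chunked for RAG
  }
}
```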

Graph schema

11 node types. 18 relationship types.

Every piece of knowledge is a typed node. Every connection is a semantic relationship. The graph is the source of truth.

Node types

project
area
note
task
person
concept
resource
tag
document
log

Relationship types

BELONGS_TO · PART_OF · DEPENDS_ON · RELATES_TO · EMERGES_FROM · MENTIONS · REFERENCES · SUPPORTS · FOUNDATIONAL_TO · COLLABORATES_ON · CENTRAL_TO · ILLUSTRATES · TAGGED_WITH · MEMORIALIZED_BY · CHUNK_OF · NEXT_CHUNK · PULSE_FOR · HAS_DOCUMENT

Intelligence layer

Two AI pipelines

Claude powers both the ingestion pipeline (auto-linking on save) and the retrieval pipeline (GraphRAG for questions).

Ingestion pipeline

Runs on every note save

01
Extract entities
People, projects, concepts, tags via Claude
02
Search graph
Match entities against existing nodes
03
Suggest links
Claude rates confidence for each relationship
04
Auto-commit or propose
≥85% auto-links, <85% surfaces for review
POST /api/ai/pipeline { nodeId, content }
  → extractEntities(content)        // Claude: people, projects, concepts, tags
  → searchNodesFallback(term, user)  // Neo4j: fuzzy match against graph
  → suggestRelationships(entities)   // Claude: confidence-scored link proposals
  → confidence ≥ 0.85 ? auto-commit : surface for review

GraphRAG retrieval

Runs on every question in the chat sidebar

01
Pin + extract
Load pinned nodes, extract entities from question
02
Graph search
Fuzzy-match entities against Neo4j nodes
03
Traverse
Walk 2-hop neighbors for matched nodes
04
Collect chunks
Gather semantic chunks from matched parents
05
Synthesize
Cap at 8K tokens, send to Claude with citations
POST /api/ai/ask { question, pinnedNodeIds, contextNodeIds, history }
  → load pinned + context nodes (highest priority)
  → extractEntities(question)              // what is the user asking about?
  → searchNodesFallback() × 5 terms        // find matching nodes
  → getNeighbors(nodeId, depth=2)           // traverse the graph
  → collect Chunk children for matched nodes
  → prioritize: direct matches → neighbors → chunks (8K token budget)
  → askWithContext(question, contextPackage) // Claude synthesizes answer
  → extract [citations] from response       // link back to source nodes

Awareness engine

The Pulse system

AI-synthesized snapshots of project state and overall focus. Pulse queries the graph for recent activity, task distributions, and upcoming deadlines — then asks Claude to distill it into actionable awareness.

Project pulse

Generated per-project by querying related nodes from the last 7 days, task status distributions, and recent activity. Stored as a Pulse node linked via PULSE_FOR.

MATCH (p:Project)-[r]-(n)
WHERE n.updatedAt >= $sevenDaysAgo
RETURN n, type(r), labels(n)[0]
ORDER BY n.updatedAt DESC LIMIT 20

Global pulse

Synthesizes all project pulses, recent cross-project activity, and upcoming deadlines into a holistic view. Returns top-of-mind items, priorities, and open threads.

Output shape:
{
  top_of_mind: string[]   // 2-4 items
  priorities: string[]    // 3-5 actionable items
  open_threads: string[]  // 0-5 unresolved items
}

API surface

30+ endpoints. Zero waste.

Graph CRUD

GET /api/graph/nodes
POST /api/graph/nodes
GET /api/graph/nodes/:id
PUT /api/graph/nodes/:id
DELETE /api/graph/nodes/:id
POST /api/graph/relationships
DELETE /api/graph/relationships
GET /api/graph/neighbors/:id
GET /api/graph/search
GET /api/graph/activity

AI + Ingestion

POST /api/ingest
POST /api/ai/ask
POST /api/ai/autolink
POST /api/ai/embed
POST /api/ai/pipeline
POST /api/ai/pipeline/revert

Documents

GET /api/documents
POST /api/documents
GET /api/documents/:id
DELETE /api/documents/:id
POST /api/documents/:id/process
POST /api/documents/:id/resync
GET /api/auth/google
GET /api/drive/picker-token

Storage + System

POST /api/storage/upload
GET /api/storage/:key
GET /api/pulse
POST /api/pulse/refresh
GET /api/settings/note-types
POST /api/settings/note-types
POST /api/onboarding/complete
POST /api/onboarding/complete-nux
POST /api/onboarding/refine-entities

Document pipeline

Upload, parse, and connect

Documents flow through the same ingestion pipeline as notes. Upload a PDF or DOCX, or import from Google Drive — AI extracts entities and connects them to your graph.

01
Upload or import
File upload (PDF, DOCX, TXT, MD) or Google Drive URL
02
Parse content
Extract text with pdf-parse or mammoth, store original + extracted in R2
03
AI summarize
Claude generates a concise summary of the document
04
Chunk & link
Content is chunked, embedded, and run through the entity extraction pipeline
POST /api/documents → create Document node (status: pending)
  → upload original to R2 (documents/{nodeId}.{ext})
  → fire-and-forget: /api/documents/:id/process
    → parse → summarize → chunk → Stage 1 pipeline → embeddings
    → update status: ready

Patterns

Principles that hold the system together

Tenant isolation

Every Cypher query includes WHERE n.userId = $userId. Every API route calls requireUserId() before touching the graph. No data crosses user boundaries.

Module-level singleton

The Neo4j driver is a module-scoped singleton — survives across warm Lambda invocations, reconnects on cold starts. Max pool size of 5, with 10s connection acquisition timeout.
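
A minimal sketch of the pattern, with a stand-in factory where the real code calls neo4j.driver() configured with maxConnectionPoolSize: 5 and a 10s connectionAcquisitionTimeout. All names here are illustrative.

```typescript
// Module-level singleton sketch; createDriver stands in for neo4j.driver(...).
type Driver = { id: number }

let cached: Driver | null = null  // module scope: survives warm invocations
let connects = 0                  // counts actual connections opened

const createDriver = (): Driver => ({ id: ++connects })  // stand-in factory

function getDriver(): Driver {
  if (!cached) cached = createDriver()  // cold start: open the pool once
  return cached                         // warm start: reuse the same pool
}
```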

Parameterized Cypher only

All queries use parameterized Cypher via a helper module. No string interpolation. No template literals in queries. Every value goes through $params.
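
The shape of that discipline, sketched below: the Cypher string is static, and every value, including the tenant's userId from the pattern above, travels in the params object. The helper and query names are assumptions, not the app's actual code.

```typescript
// Illustrative parameterized-query builder; no value ever lands in the string.
function nodeByIdQuery(userId: string, nodeId: string) {
  return {
    cypher: "MATCH (n) WHERE n.id = $nodeId AND n.userId = $userId RETURN n",
    params: { nodeId, userId },  // every value goes through $params
  }
}
```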

Semantic chunking

Content is split on paragraph boundaries, then sentence boundaries for oversized paragraphs. Short segments merge up to ~200 tokens. 20-token overlap between consecutive chunks for context continuity.

Token budget management

GraphRAG caps context at 8,000 tokens. Direct node matches get priority, then graph neighbors, then chunks. The running total is tracked, and once the budget is exhausted no more context is added.
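
Greedy packing in that priority order can be sketched as below. The item type and the chars/4 token heuristic are assumptions for illustration; only the 8K cap and the ordering come from the text.

```typescript
// Greedy token-budget packing sketch; types and heuristic are assumptions.
interface ContextItem { id: string; text: string; tier: number } // 0 = direct match, 1 = neighbor, 2 = chunk

const estimateTokens = (text: string) => Math.ceil(text.length / 4) // rough stand-in heuristic

function packContext(items: ContextItem[], budget = 8000): ContextItem[] {
  const packed: ContextItem[] = []
  let used = 0
  for (const item of [...items].sort((a, b) => a.tier - b.tier)) {
    const cost = estimateTokens(item.text)
    if (used + cost > budget) break  // budget exhausted: stop adding context
    packed.push(item)
    used += cost
  }
  return packed
}
```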

Confidence-tiered automation

AI relationship suggestions above 85% confidence are auto-committed. Below that threshold, they surface in the ReviewModal for human approval. All auto-actions are revertible.
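
The gate itself is simple; here it is as a sketch, with the 0.85 threshold from the text and an assumed shape for a suggestion.

```typescript
// Confidence-tiered triage sketch; Suggestion's shape is an assumption.
interface Suggestion { type: string; confidence: number }

function triage(suggestions: Suggestion[], threshold = 0.85) {
  return {
    autoCommit: suggestions.filter(s => s.confidence >= threshold),  // committed immediately
    review: suggestions.filter(s => s.confidence < threshold),       // surfaced in the ReviewModal
  }
}
```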

Security

Built for trust

Your knowledge graph contains your most personal thoughts. Every layer of the stack is designed to keep them safe.

AI doesn't train on your data

Both Anthropic and OpenAI API policies guarantee that API inputs are not used for model training.

Rate limiting

Upstash Redis-backed sliding window rate limits — 20 req/min on AI routes, 100 req/min on CRUD.
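
To illustrate the policy, here is an in-memory toy of a sliding-window limiter. The production version is @upstash/ratelimit backed by Redis; this sketch only shows the windowing behavior.

```typescript
// In-memory sliding-window limiter sketch (illustration only, not the app's code).
function makeLimiter(limit: number, windowMs: number) {
  const hits = new Map<string, number[]>()
  return (key: string, now: number): boolean => {
    const recent = (hits.get(key) ?? []).filter(t => now - t < windowMs) // drop expired hits
    if (recent.length >= limit) {
      hits.set(key, recent)
      return false  // over the limit inside this window
    }
    recent.push(now)
    hits.set(key, recent)
    return true
  }
}
```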

Input validation

Every POST/PUT endpoint validates request bodies with Zod schemas. Malformed payloads are rejected before touching the database.
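
A hand-rolled stand-in for that step, mirroring Zod's safeParse accept/reject behavior for one endpoint's body (the field names follow the pipeline example above; the real code uses Zod schemas).

```typescript
// Validation sketch mirroring safeParse; the app's actual schemas use Zod.
type IngestBody = { nodeId: string; content: string }

function parseIngestBody(body: unknown): { ok: true; data: IngestBody } | { ok: false } {
  if (typeof body !== "object" || body === null) return { ok: false }
  const b = body as Record<string, unknown>
  const nodeId = b.nodeId
  const content = b.content
  if (typeof nodeId !== "string" || typeof content !== "string" || content.length === 0) {
    return { ok: false }  // malformed payload: rejected before the database
  }
  return { ok: true, data: { nodeId, content } }
}
```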

Security headers

CSP, HSTS, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, and Permissions-Policy on every response.

Clerk authentication

SOC 2 Type II certified auth provider. Middleware enforces auth at the edge before routes execute.

Encrypted storage

All data encrypted at rest (Neo4j AuraDB, Cloudflare R2) and in transit (TLS 1.3).

Algorithm spotlight

Semantic chunking

The chunker balances semantic coherence with retrieval granularity. Paragraphs are the primary boundary, with sentence-level splitting as fallback for oversized blocks.

function chunkText(text, options = {}) {
  const { maxTokens = 200, overlap = 20 } = options
  const segments = []

  // Step 1: Split on paragraph boundaries (double newlines)
  const paragraphs = text.split(/\n\s*\n/)

  // Step 2: Break oversized paragraphs into sentences
  // Sentences split on [.!?] followed by space or end
  for (const para of paragraphs) {
    if (estimateTokens(para) > maxTokens) {
      segments.push(...splitSentences(para))
    } else {
      segments.push(para)
    }
  }

  // Step 3: Merge small segments until ~200 tokens
  // Keeps related content together
  const chunks = []
  let current = ''
  for (const seg of segments) {
    if (current && estimateTokens(current + seg) > maxTokens) {
      chunks.push(current)
      current = seg
    } else {
      current = current ? current + '\n\n' + seg : seg
    }
  }
  if (current) chunks.push(current)

  // Step 4: Add 20-token overlap between consecutive chunks
  // Previous chunk's tail prepended to next chunk's head
  // Maintains context continuity for retrieval
  return chunks.map((chunk, i) =>
    i === 0 ? chunk : lastTokens(chunks[i - 1], overlap) + ' ' + chunk
  )
}

Ready to think in graphs?

Stop organizing. Start connecting.

Get Started