Under the hood
A deep dive into the graph database, AI pipelines, three-tier storage model, and the design decisions that make Noemata work.
Stack
Data model
Content flows through three tiers based on size and retrieval needs. Short content lives inline on graph nodes. Longer content is stored as blobs. Large content is chunked for semantic retrieval.
Node metadata, relationships, vector embeddings, and short content under 100 words. Every node carries its type, timestamps, and properties. Relationships are first-class citizens with 16 semantic types.
Full note bodies, PDFs, images, and attachments for content over 100 words. Stored as markdown files keyed by node type and ID. Referenced via content_uri on the graph node.
Content over 300 words is split into semantic chunks with overlap. Each chunk gets an embedding vector via text-embedding-3-small. Chunks link back to parents via CHUNK_OF and NEXT_CHUNK relationships.
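The tier routing described above can be sketched as a small function (the names `tiersFor` and `Tier` are illustrative, not from the codebase):

```typescript
type Tier = "inline" | "blob" | "chunks"

// Decide which storage tiers apply, based on word count alone.
function tiersFor(wordCount: number): Tier[] {
  const tiers: Tier[] = []
  // Short content lives inline on the graph node
  if (wordCount <= 100) tiers.push("inline")
  // Over 100 words: body goes to blob storage, node keeps a content_uri
  if (wordCount > 100) tiers.push("blob")
  // Over 300 words: additionally chunked for semantic retrieval
  if (wordCount > 300) tiers.push("chunks")
  return tiers
}
```

Note the tiers are cumulative past 300 words: a long note is both a blob and a set of chunks.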
POST /api/ingest → word count check
> 100 words → upload to R2, store content_uri on node
> 300 words → chunk into ~200-token segments with overlap
→ create Chunk nodes with CHUNK_OF relationships
→ generate embeddings (queued)
Graph schema

Every piece of knowledge is a typed node. Every connection is a semantic relationship. The graph is the source of truth.
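The node and relationship shapes implied above might look like this in TypeScript. This is an illustrative sketch: the page names only three of the 16 relationship types, so the union is truncated, and the field names are assumptions.

```typescript
// Only the relationship types named on this page; the real schema has 16.
type RelType = "CHUNK_OF" | "NEXT_CHUNK" | "PULSE_FOR"

interface GraphNode {
  id: string
  type: string          // e.g. "Note", "Project", "Document", "Chunk"
  userId: string        // every node is scoped to its owner
  createdAt: string
  updatedAt: string
  content?: string      // inline content, under 100 words
  content_uri?: string  // blob reference for longer content
  embedding?: number[]  // vector embedding for semantic search
}

// Relationships are first-class: typed edges between two node ids.
interface Relationship {
  type: RelType
  from: string
  to: string
}
```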
Intelligence layer
Claude powers both the ingestion pipeline (auto-linking on save) and the retrieval pipeline (GraphRAG for questions).
Runs on every note save
POST /api/ai/pipeline { nodeId, content }
→ extractEntities(content) // Claude: people, projects, concepts, tags
→ searchNodesFallback(term, user) // Neo4j: fuzzy match against graph
→ suggestRelationships(entities) // Claude: confidence-scored link proposals
→ confidence ≥ 0.85 ? auto-commit : surface for review
Runs on every question in the chat sidebar
POST /api/ai/ask { question, pinnedNodeIds, contextNodeIds, history }
→ load pinned + context nodes (highest priority)
→ extractEntities(question) // what is the user asking about?
→ searchNodesFallback() × 5 terms // find matching nodes
→ getNeighbors(nodeId, depth=2) // traverse the graph
→ collect Chunk children for matched nodes
→ prioritize: direct matches → neighbors → chunks (8K token budget)
→ askWithContext(question, contextPackage) // Claude synthesizes answer
→ extract [citations] from response // link back to source nodes
Awareness engine
AI-synthesized snapshots of project state and overall focus. Pulse queries the graph for recent activity, task distributions, and upcoming deadlines — then asks Claude to distill it into actionable awareness.
Generated per-project by querying related nodes from the last 7 days, task status distributions, and recent activity. Stored as a Pulse node linked via PULSE_FOR.
MATCH (p:Project)-[r]-(n)
WHERE n.updatedAt >= $sevenDaysAgo
RETURN n, type(r), labels(n)[0]
ORDER BY n.updatedAt DESC LIMIT 20
Synthesizes all project pulses, recent cross-project activity, and upcoming deadlines into a holistic view. Returns top-of-mind items, priorities, and open threads.
Output shape:
{
top_of_mind: string[] // 2-4 items
priorities: string[] // 3-5 actionable items
open_threads: string[] // 0-5 unresolved items
}
API surface
Document pipeline
Documents flow through the same ingestion pipeline as notes. Upload a PDF or DOCX, or import from Google Drive — AI extracts entities and connects them to your graph.
POST /api/documents → create Document node (status: pending)
→ upload original to R2 (documents/{nodeId}.{ext})
→ fire-and-forget: /api/documents/:id/process
→ parse → summarize → chunk → Stage 1 pipeline → embeddings
→ update status: ready
Patterns
Every Cypher query includes WHERE n.userId = $userId. Every API route calls requireUserId() before touching the graph. No data crosses user boundaries.
The Neo4j driver is a module-scoped singleton — survives across warm Lambda invocations, reconnects on cold starts. Max pool size of 5, with 10s connection acquisition timeout.
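The pattern can be sketched generically. The `Driver` interface below stands in for the real neo4j-driver instance; the factory-injection shape is for illustration only.

```typescript
interface Driver { close(): Promise<void> }

// Module scope survives warm Lambda invocations; a cold start
// re-runs the module and re-creates the driver.
let driver: Driver | null = null

function getDriver(create: () => Driver): Driver {
  if (!driver) driver = create()  // lazy init on first use
  return driver
}
```

The payoff is that repeated requests on a warm instance reuse one connection pool instead of opening a new one per invocation.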
All queries use parameterized Cypher via a helper module. No string interpolation. No template literals in queries. Every value goes through $params.
Content is split on paragraph boundaries, then sentence boundaries for oversized paragraphs. Short segments merge up to ~200 tokens. 20-token overlap between consecutive chunks for context continuity.
GraphRAG caps context at 8,000 tokens. Direct node matches get priority, then 1-hop neighbors, then chunks. Budget is tracked and stops adding context when exhausted.
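Greedy packing under a fixed token budget might look like this (illustrative names, not the actual implementation):

```typescript
interface ContextItem { text: string; tokens: number }

// Pack context in priority order: direct matches, then 1-hop
// neighbors, then chunks — stopping once the budget is exhausted.
function packContext(
  matches: ContextItem[],
  neighbors: ContextItem[],
  chunks: ContextItem[],
  budget = 8000,
): ContextItem[] {
  const packed: ContextItem[] = []
  let used = 0
  for (const item of [...matches, ...neighbors, ...chunks]) {
    if (used + item.tokens > budget) break  // budget exhausted
    packed.push(item)
    used += item.tokens
  }
  return packed
}
```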
AI relationship suggestions above 85% confidence are auto-committed. Below that threshold, they surface in the ReviewModal for human approval. All auto-actions are revertible.
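The gate reduces to a simple triage — a sketch in which the `Suggestion` shape is assumed:

```typescript
interface Suggestion { from: string; to: string; type: string; confidence: number }

// Suggestions at or above the threshold auto-commit;
// the rest queue for human approval in the ReviewModal.
function triage(suggestions: Suggestion[], threshold = 0.85) {
  const autoCommit: Suggestion[] = []
  const review: Suggestion[] = []
  for (const s of suggestions) {
    (s.confidence >= threshold ? autoCommit : review).push(s)
  }
  return { autoCommit, review }
}
```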
Security
Your knowledge graph contains your most personal thoughts. Every layer of the stack is designed to keep them safe.
Both Anthropic's and OpenAI's API terms state that API inputs are not used to train their models.
Upstash Redis-backed sliding window rate limits — 20 req/min on AI routes, 100 req/min on CRUD.
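A single-process sketch of the sliding-window check — the real limiter keeps the window in Upstash Redis so it holds across serverless instances:

```typescript
// Timestamps of recent requests, keyed per user/route.
const hits = new Map<string, number[]>()

function allow(key: string, limit: number, windowMs: number, now = Date.now()): boolean {
  // Keep only timestamps still inside the sliding window
  const recent = (hits.get(key) ?? []).filter(t => now - t < windowMs)
  if (recent.length >= limit) {
    hits.set(key, recent)
    return false  // over the limit for this window
  }
  recent.push(now)
  hits.set(key, recent)
  return true
}
```

With `limit = 20` and `windowMs = 60_000` this mirrors the 20 req/min cap on AI routes.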
Every POST/PUT endpoint validates request bodies with Zod schemas. Malformed payloads are rejected before touching the database.
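As an illustration of the reject-before-database rule, here is a hand-rolled stand-in for one such check (the real code uses Zod schemas; the `{ nodeId, content }` body shape is taken from the pipeline route above):

```typescript
interface PipelineBody { nodeId: string; content: string }

// Throw before any query runs if the payload is malformed.
function parsePipelineBody(body: unknown): PipelineBody {
  if (
    typeof body !== "object" || body === null ||
    typeof (body as Record<string, unknown>).nodeId !== "string" ||
    typeof (body as Record<string, unknown>).content !== "string"
  ) {
    throw new Error("400: malformed payload")
  }
  const { nodeId, content } = body as PipelineBody
  return { nodeId, content }
}
```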
CSP, HSTS, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, and Permissions-Policy on every response.
SOC 2 Type II certified auth provider. Middleware enforces auth at the edge before routes execute.
All data encrypted at rest (Neo4j AuraDB, Cloudflare R2) and in transit (TLS 1.3).
Algorithm spotlight
The chunker balances semantic coherence with retrieval granularity. Paragraphs are the primary boundary, with sentence-level splitting as fallback for oversized blocks.
// Rough heuristic: ~4 characters per token, so 20 tokens ≈ 80 characters
const estimateTokens = (text: string) => Math.ceil(text.length / 4)

// Sentences split on [.!?] followed by whitespace
const splitSentences = (para: string) => para.split(/(?<=[.!?])\s+/).filter(Boolean)

function chunkText(text: string, maxTokens = 200, overlapChars = 80): string[] {
  // Step 1: Split on paragraph boundaries (double newlines)
  const paragraphs = text.split(/\n\s*\n/).filter(p => p.trim())
  // Step 2: Break oversized paragraphs into sentences
  const segments: string[] = []
  for (const para of paragraphs) {
    if (estimateTokens(para) > maxTokens) segments.push(...splitSentences(para))
    else segments.push(para)
  }
  // Step 3: Merge small segments until ~200 tokens — keeps related content together
  const chunks: string[] = []
  let current = ''
  for (const seg of segments) {
    if (current && estimateTokens(current + ' ' + seg) > maxTokens) {
      chunks.push(current)
      // Step 4: Prepend the previous chunk's ~20-token tail to the next
      // chunk's head — maintains context continuity for retrieval
      current = current.slice(-overlapChars) + ' ' + seg
    } else {
      current = current ? current + ' ' + seg : seg
    }
  }
  if (current) chunks.push(current)
  return chunks
}