A complete technical walkthrough of how documents move from upload to cited answer — without ever leaving your server.
Every step runs locally. No data is transmitted to any external server, API, or cloud service. The entire pipeline executes within a single Docker Compose deployment on your hardware.
Documents are uploaded through the Rendex web interface via a standard HTTPS multipart form. Each document is assigned to a matter (case/project) and stored on the local filesystem.
| Parameter | Value |
|---|---|
| Endpoint | POST /api/documents/upload/:matterId |
| Max file size | 50 MB |
| Storage location | /app/uploads/ (Docker volume) |
| File naming | UUID + original extension |
| Format | MIME Type | Extraction Method |
|---|---|---|
| PDF | application/pdf | pdf-parse — extracts embedded text layer |
| DOCX | application/vnd.openxmlformats-... | mammoth — raw text extraction |
| DOC | application/msword | mammoth — raw text extraction |
| TXT | text/plain | UTF-8 buffer read |
1. File is saved to the local /app/uploads/ volume.
2. A database record is created with status processing.
3. An audit log entry is written (user, document, timestamp, IP).
4. Asynchronous background processing begins immediately.
The system detects the MIME type and dispatches to the appropriate parser. All extraction happens in-process using Node.js libraries — no external services.
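The dispatch step can be sketched as a switch on the detected MIME type. The parser names match the format table above; the function shape itself is illustrative, not Rendex's actual code.

```typescript
// Illustrative MIME-type dispatch matching the format table above.
type Parser = "pdf-parse" | "mammoth" | "utf8";

function resolveParser(mimeType: string): Parser {
  switch (mimeType) {
    case "application/pdf":
      return "pdf-parse"; // embedded text layer; OCR fallback if empty
    case "application/msword":
      return "mammoth"; // legacy .doc
    case "text/plain":
      return "utf8"; // plain UTF-8 buffer read
    default:
      // DOCX uses the long application/vnd.openxmlformats-... MIME prefix.
      if (mimeType.startsWith("application/vnd.openxmlformats-")) {
        return "mammoth";
      }
      throw new Error(`Unsupported MIME type: ${mimeType}`);
  }
}
```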
Scanned PDFs: If a PDF contains no extractable text layer (i.e., it is a scanned image), the system falls back to OCR to recover text from the page images.
The extracted text is a single UTF-8 string containing the full document content, which is then passed to the chunking stage.
The extracted text is split into overlapping chunks. Each chunk becomes a separate vector in the database. The overlap ensures that information spanning a chunk boundary is captured in at least one chunk.
Each chunk retains its character position in the original document (start/end offsets), enabling the UI to highlight the exact source passage when a citation is clicked.
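A minimal sketch of overlapping chunking with character offsets follows. The chunk size and overlap values are illustrative defaults, not Rendex's actual configuration.

```typescript
// Illustrative overlapping chunker. Each chunk records its start/end
// character offsets in the original document, as described above.
interface Chunk {
  text: string;
  start: number; // character offset in the original document
  end: number;   // exclusive end offset
}

function chunkText(text: string, chunkSize = 1000, overlap = 200): Chunk[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap; // windows advance by (size - overlap)
  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push({ text: text.slice(start, end), start, end });
    if (end === text.length) break;
  }
  return chunks;
}
```

Because each window starts `overlap` characters before the previous one ends, any passage shorter than the overlap is guaranteed to appear whole in at least one chunk.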
Each chunk is converted into a 768-dimensional vector using a local embedding model running on your GPU. This is the mathematical representation that enables semantic search.
No external API calls. The embedding model runs entirely on your GPU via Ollama. The vectors never leave the machine. nomic-embed-text is downloaded once during installation and persists in a Docker volume.
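The embedding call can be sketched against Ollama's local HTTP API (its `/api/embeddings` endpoint, on the default port 11434 listed in the network table below). The surrounding function names and error handling are illustrative.

```typescript
// Illustrative local embedding call. The request shape ({ model, prompt })
// follows Ollama's /api/embeddings endpoint; the URL is the default
// loopback address, so nothing leaves the machine.
const OLLAMA_URL = "http://127.0.0.1:11434";

function buildEmbedRequest(text: string): { model: string; prompt: string } {
  return { model: "nomic-embed-text", prompt: text };
}

async function embed(text: string): Promise<number[]> {
  const res = await fetch(`${OLLAMA_URL}/api/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildEmbedRequest(text)),
  });
  if (!res.ok) throw new Error(`Ollama embedding failed: ${res.status}`);
  const { embedding } = (await res.json()) as { embedding: number[] };
  return embedding; // 768 dimensions for nomic-embed-text
}
```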
The embedding vector and its metadata are stored in Qdrant, an open-source vector database running locally. Each chunk becomes a "point" in the collection.
Qdrant maintains payload indexes on matter_id and document_id for fast filtered queries. This is how ethical walls are enforced at the retrieval layer — queries only search vectors belonging to matters the user has permission to access.
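A stored chunk can be sketched as a Qdrant point like the one below. The payload field names (`matter_id`, `document_id`, offsets) follow the description above; the ID scheme and helper name are illustrative.

```typescript
// Illustrative shape of a chunk as a Qdrant point: the vector plus a
// payload carrying the fields the payload indexes above rely on.
interface QdrantPoint {
  id: string;
  vector: number[]; // 768-dim embedding from the previous stage
  payload: {
    matter_id: string;
    document_id: string;
    chunk_index: number;
    start: number; // character offsets, for citation highlighting
    end: number;
  };
}

function buildPoint(
  id: string,
  vector: number[],
  matterId: string,
  documentId: string,
  chunkIndex: number,
  start: number,
  end: number
): QdrantPoint {
  return {
    id,
    vector,
    payload: { matter_id: matterId, document_id: documentId, chunk_index: chunkIndex, start, end },
  };
}
```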
The document's database record is updated to status indexed, the chunk_count is set, and the indexed_at timestamp is recorded. If any step fails, the status becomes failed with a stored error message.
When a user asks a question, Rendex converts it into a vector, searches for the most relevant chunks, builds a context window, and generates a cited answer using a local language model.
| # | Action | Detail |
|---|---|---|
| A | Resolve permissions | Determine which matters the user can access based on their role and matter-level permissions. |
| B | Embed the question | Convert the user's natural-language question into a 768-dim vector using nomic-embed-text. |
| C | Vector search | Query Qdrant for the top 8 most similar chunks, filtered by the user's accessible matter IDs. Similarity metric: cosine. |
| D | Build context | Assemble the 8 retrieved chunks into a numbered context block with source labels. |
| E | Generate answer | Send the context + question to the local LLM (Llama 3 via Ollama) with instructions to cite sources using [Source N] notation. |
| F | Return with citations | The API returns the answer, citation metadata (document name, matter, page, excerpt, similarity score), and confidence metrics. |
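Step D above, context assembly, can be sketched as a small pure function: each retrieved chunk becomes a numbered block the LLM can cite as [Source N]. The field names are illustrative.

```typescript
// Illustrative context-assembly step: retrieved chunks become a numbered
// context block with source labels, matching row D of the table above.
interface RetrievedChunk {
  documentName: string;
  text: string;
  score: number; // cosine similarity from the vector search
}

function buildContext(chunks: RetrievedChunk[]): string {
  return chunks
    .map((c, i) => `[Source ${i + 1}] (${c.documentName})\n${c.text}`)
    .join("\n\n");
}
```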
Queries are filtered at the vector layer. The Qdrant search includes a matter_id filter that restricts results to only the matters the user has been granted access to. A user on Matter A cannot retrieve chunks from Matter B, even if the content is semantically similar. This is enforced at the database level, not the UI level.
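The filtered search can be sketched as the body of a Qdrant `POST /collections/{name}/points/search` request; the `matter_id` filter is the ethical wall. The helper name is illustrative, and the limit of 8 matches the pipeline table above.

```typescript
// Illustrative Qdrant search body: the filter restricts results to the
// matter IDs this user may access, enforced at the database level.
function buildSearchBody(queryVector: number[], accessibleMatterIds: string[]) {
  return {
    vector: queryVector,
    limit: 8,
    with_payload: true,
    filter: {
      must: [{ key: "matter_id", match: { any: accessibleMatterIds } }],
    },
  };
}
```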
Every answer includes a confidence signal derived from the cosine similarity scores of the retrieved chunks.
When the highest similarity score is below the configured threshold, the UI displays a warning banner indicating that the answer may be less reliable. This signals to the attorney that the available documents may not contain a strong match for their question.
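The confidence signal can be sketched as a comparison of the highest retrieved similarity score against the configured threshold. The threshold value and function name here are illustrative.

```typescript
// Illustrative confidence check: the maximum cosine similarity among the
// retrieved chunks, compared against a configurable threshold.
function assessConfidence(
  scores: number[],
  threshold = 0.5 // illustrative default, not Rendex's configured value
): { maxScore: number; lowConfidence: boolean } {
  const maxScore = scores.length ? Math.max(...scores) : 0;
  return { maxScore, lowConfidence: maxScore < threshold };
}
```

When `lowConfidence` is true, the UI shows the warning banner described above.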
When a document is deleted, Rendex removes it from three places simultaneously:
1. File system — the original file is deleted from /app/uploads/.
2. Vector database — all Qdrant points with the matching document_id are purged.
3. PostgreSQL — the document record is removed (cascading to related entries).
No residual knowledge. Because the AI model is pre-trained and never fine-tuned on your documents, deleting a document fully removes it from the system. There are no shadow copies, cached embeddings, or residual model weights.
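The vector-database step of the three-way deletion can be sketched via Qdrant's delete-by-filter request (`POST /collections/{name}/points/delete`); the other two steps are noted as comments. The helper name is illustrative.

```typescript
// Illustrative Qdrant delete-by-filter body: purges every point whose
// payload carries the deleted document's ID.
function buildQdrantDeleteBody(documentId: string) {
  return {
    filter: { must: [{ key: "document_id", match: { value: documentId } }] },
  };
}

// The full three-way deletion described above:
// 1. File system  — unlink the stored file under /app/uploads/
// 2. Vector DB    — POST the body above to /collections/{collection}/points/delete
// 3. PostgreSQL   — delete the document row (cascades to related entries)
```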
Every action in the pipeline is logged to an immutable, append-only audit table in PostgreSQL. The table has a database trigger that prevents all UPDATE and DELETE operations.
| Field | Description |
|---|---|
| user_id | Who performed the action |
| action | What happened (upload, query, delete, login, permission change) |
| resource_type | What was affected (document, matter, user) |
| details | Structured JSON with full context (query text, document name, etc.) |
| ip_address | Client IP address |
| created_at | Timestamp (UTC, immutable) |
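An audit row matching the field table above can be sketched as a typed record; the `details` column carries structured JSON. The builder function is illustrative, not Rendex's actual code.

```typescript
// Illustrative audit entry matching the field table above.
interface AuditEntry {
  user_id: string;
  action: "upload" | "query" | "delete" | "login" | "permission_change";
  resource_type: "document" | "matter" | "user";
  details: Record<string, unknown>; // structured JSON with full context
  ip_address: string;
  created_at: string; // UTC ISO timestamp, immutable once written
}

function buildAuditEntry(
  userId: string,
  action: AuditEntry["action"],
  resourceType: AuditEntry["resource_type"],
  details: Record<string, unknown>,
  ip: string
): AuditEntry {
  return {
    user_id: userId,
    action,
    resource_type: resourceType,
    details,
    ip_address: ip,
    created_at: new Date().toISOString(),
  };
}
```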
| Service | Role |
|---|---|
| Ollama | LLM + embeddings (GPU) |
| Qdrant | Vector database |
| PostgreSQL | Auth, RBAC, audit log |
| Chat UI | Express.js web app |
| Nginx | TLS + reverse proxy |
| Network rule | Value |
|---|---|
| Open ports | 80, 443 |
| Outbound egress | None |
| Internal only | 5432, 6333, 11434 |
| PostgreSQL | 127.0.0.1 only |
Questions? Need a formal security questionnaire completed?
Contact us at info@rendex.ai — we'll turn it around in 48 hours.