Document Ingestion Pipeline
Technical Reference — v1.0
February 2026 • Confidential

How Rendex Ingests, Indexes & Retrieves Your Documents

A complete technical walkthrough of how documents move from upload to cited answer — without ever leaving your server.

Pipeline at a glance: Upload (PDF, DOCX, TXT) → Extract text (+ OCR) → Chunk (1,000 chars, 200 overlap) → Embed (768-dim vectors) → Store (Qdrant + metadata) → Retrieve (cited answer)

Every step runs locally. No data is transmitted to any external server, API, or cloud service. The entire pipeline executes within a single Docker Compose deployment on your hardware.

1 Document Upload

Documents are uploaded through the Rendex web interface via a standard HTTPS multipart form. Each document is assigned to a matter (case/project) and stored on the local filesystem.

Parameter          Value
Endpoint           POST /api/documents/upload/:matterId
Max file size      50 MB
Storage location   /app/uploads/ (Docker volume)
File naming        UUID + original extension
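
For reference, a call to that endpoint might look like the sketch below. It is a minimal illustration rather than the Rendex UI code: the https://rendex.local host, the "file" form-field name, and the bearer-token header are assumptions (Node 18+ global fetch, FormData, and Blob).

import { readFile } from 'node:fs/promises';

async function uploadDocument(matterId, filePath, token) {
  const form = new FormData();
  // Field name "file" is an assumption for this sketch.
  form.append('file', new Blob([await readFile(filePath)], { type: 'application/pdf' }), 'lease-agreement-2024.pdf');

  const res = await fetch(`https://rendex.local/api/documents/upload/${matterId}`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}` }, // auth scheme is an assumption
    body: form,                                    // multipart boundary is set automatically
  });
  if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
  return res.json(); // expected to report the new document with status "processing"
}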

Supported formats

Format   MIME Type                            Extraction Method
PDF      application/pdf                      pdf-parse — extracts embedded text layer
DOCX     application/vnd.openxmlformats-...   mammoth — raw text extraction
DOC      application/msword                   mammoth — raw text extraction
TXT      text/plain                           UTF-8 buffer read

What happens on upload

1. File is saved to the local /app/uploads/ volume.
2. A database record is created with status processing.
3. An audit log entry is written (user, document, timestamp, IP).
4. Asynchronous background processing begins immediately.
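
A condensed handler for that sequence could look like the following sketch (Express + multer). The route matches the endpoint above, but the helpers createDocument, writeAuditLog, processDocument, and markFailed, as well as the req.user auth middleware, are assumptions for illustration, not the actual Rendex source.

import { randomUUID } from 'node:crypto';
import path from 'node:path';
import express from 'express';
import multer from 'multer';

const storage = multer.diskStorage({
  destination: '/app/uploads/',                                      // 1. local Docker volume
  filename: (req, file, cb) =>
    cb(null, `${randomUUID()}${path.extname(file.originalname)}`),   // UUID + original extension
});
const upload = multer({ storage, limits: { fileSize: 50 * 1024 * 1024 } }); // 50 MB cap

const app = express();
app.post('/api/documents/upload/:matterId', upload.single('file'), async (req, res) => {
  const doc = await createDocument({                                 // 2. DB record, status "processing"
    matterId: Number(req.params.matterId),
    name: req.file.originalname,
    storedPath: req.file.path,
    status: 'processing',
  });
  await writeAuditLog({                                              // 3. audit entry
    userId: req.user.id,
    action: 'upload',
    resourceType: 'document',
    details: { documentName: req.file.originalname },
    ipAddress: req.ip,
  });
  processDocument(doc.id).catch((err) => markFailed(doc.id, err));   // 4. async pipeline kickoff
  res.status(202).json({ documentId: doc.id, status: 'processing' });
});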

2 Text Extraction

The system detects the MIME type and dispatches to the appropriate parser. All extraction happens in-process using Node.js libraries — no external services.

Scanned PDFs: If a PDF contains no extractable text layer (i.e. it's a scanned image), the system falls back to OCR processing to extract text from the document images.

The extracted text is a single UTF-8 string containing the full document content, which is then passed to the chunking stage.
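
A sketch of that dispatch, using the pdf-parse and mammoth packages named above, might look like this; the function shape is illustrative and the OCR fallback is only indicated by a comment.

import { readFile } from 'node:fs/promises';
import pdfParse from 'pdf-parse';
import mammoth from 'mammoth';

async function extractText(filePath, mimeType) {
  const buffer = await readFile(filePath);
  switch (mimeType) {
    case 'application/pdf': {
      const { text } = await pdfParse(buffer);   // embedded text layer
      return text;                               // empty result => fall back to OCR
    }
    case 'application/vnd.openxmlformats-officedocument.wordprocessingml.document':
    case 'application/msword': {
      const { value } = await mammoth.extractRawText({ buffer });
      return value;
    }
    case 'text/plain':
      return buffer.toString('utf-8');           // UTF-8 buffer read
    default:
      throw new Error(`Unsupported MIME type: ${mimeType}`);
  }
}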

3 Chunking

The extracted text is split into overlapping chunks. Each chunk becomes a separate vector in the database. The overlap ensures that information spanning a chunk boundary is captured in at least one chunk.

Chunk size        1,000 chars
Overlap           200 chars
Break strategy    Sentence / paragraph boundary
Min break point   50% of target size

Chunking algorithm

1. Start at position 0 in the extracted text.
2. Calculate end = min(start + 1000, text length).
3. If not at the end of the text:
   • Look backwards for a sentence boundary (". ")
   • Look backwards for a paragraph boundary ("\n")
   • Accept the break point if it falls past 50% of the target size
4. Extract and trim the text slice.
5. Record { text, index, charStart, charEnd }.
6. Advance the start position by (chunkSize - overlap).
7. Repeat until the end of the text.

Each chunk retains its character position in the original document (start/end offsets), enabling the UI to highlight the exact source passage when a citation is clicked.
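
A direct sketch of that algorithm follows; the constant and function names are illustrative, not the actual Rendex identifiers.

const CHUNK_SIZE = 1000;
const OVERLAP = 200;
const MIN_BREAK = 0.5; // only accept a boundary past 50% of the target size

function chunkText(text) {
  const chunks = [];
  let start = 0;
  let index = 0;
  while (start < text.length) {
    let end = Math.min(start + CHUNK_SIZE, text.length);
    if (end < text.length) {
      const slice = text.slice(start, end);
      const sentence = slice.lastIndexOf('. ');
      const paragraph = slice.lastIndexOf('\n');
      if (sentence > CHUNK_SIZE * MIN_BREAK) end = start + sentence + 1;  // keep the period
      else if (paragraph > CHUNK_SIZE * MIN_BREAK) end = start + paragraph;
    }
    const chunk = text.slice(start, end).trim();
    if (chunk.length > 0) {
      chunks.push({ text: chunk, index: index++, charStart: start, charEnd: end });
    }
    start += CHUNK_SIZE - OVERLAP; // step 6: advance by (chunkSize - overlap)
  }
  return chunks;
}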

4 Embedding Generation

Each chunk is converted into a 768-dimensional vector using a local embedding model running on your GPU. This is the mathematical representation that enables semantic search.

Model             nomic-embed-text
Dimensions        768
Runtime           Ollama (GPU-accelerated)
API               POST ollama:11434/api/embed
// Each chunk is sent to the local Ollama instance
POST http://ollama:11434/api/embed
{
  "model": "nomic-embed-text",
  "input": "The non-compete clause in Section 4.2 restricts..."
}

// Returns a 768-dimensional floating-point vector
{
  "embeddings": [[ 0.0234, -0.1891, 0.0567, ... ]]
}

No external API calls. The embedding model runs entirely on your GPU via Ollama. The vectors never leave the machine. nomic-embed-text is downloaded once during installation and persists in a Docker volume.
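
In application code, that call reduces to a single request against the internal Ollama service. A minimal sketch, assuming Node 18+ global fetch and the ollama hostname on the compose network:

async function embedChunk(text) {
  const res = await fetch('http://ollama:11434/api/embed', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'nomic-embed-text', input: text }),
  });
  if (!res.ok) throw new Error(`Embedding failed: ${res.status}`);
  const { embeddings } = await res.json();
  return embeddings[0]; // 768-dimensional float array
}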

5 Vector Storage (Qdrant)

The embedding vector and its metadata are stored in Qdrant, an open-source vector database running locally. Each chunk becomes a "point" in the collection.

Database          Qdrant v1.x
Collection        rendex_documents
Distance metric   Cosine similarity
Batch size        100 points per upsert

What's stored per chunk

{ "id": "a7c2e9f1-...", // UUID "vector": [0.0234, -0.189, ...], // 768-dim embedding "payload": { "document_id": 42, "document_name": "lease-agreement-2024.pdf", "matter_id": 7, "matter_number": "2024-0312", "chunk_index": 14, "text": "The non-compete clause in Section 4.2...", "char_start": 11200, "char_end": 12180 } }

Qdrant maintains payload indexes on matter_id and document_id for fast filtered queries. This is how ethical walls are enforced at the retrieval layer — queries only search vectors belonging to matters the user has permission to access.
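
Those payload indexes correspond to a one-time setup call per field. A sketch against Qdrant's REST index endpoint, assuming integer field schemas as in the example payload:

for (const field of ['matter_id', 'document_id']) {
  const res = await fetch('http://qdrant:6333/collections/rendex_documents/index', {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ field_name: field, field_schema: 'integer' }),
  });
  if (!res.ok) throw new Error(`Index creation failed for ${field}: ${res.status}`);
}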

After indexing

The document's database record is updated to status indexed, the chunk_count is set, and the indexed_at timestamp is recorded. If any step fails, the status becomes failed with a stored error message.

6 Query & Retrieval (RAG)

When a user asks a question, Rendex converts it into a vector, searches for the most relevant chunks, builds a context window, and generates a cited answer using a local language model.

Step-by-step query flow

A. Resolve permissions: Determine which matters the user can access based on their role and matter-level permissions.
B. Embed the question: Convert the user's natural-language question into a 768-dim vector using nomic-embed-text.
C. Vector search: Query Qdrant for the top 8 most similar chunks, filtered by the user's accessible matter IDs. Distance metric: cosine similarity.
D. Build context: Assemble the 8 retrieved chunks into a numbered context block with source labels.
E. Generate answer: Send the context + question to the local LLM (Llama 3 via Ollama) with instructions to cite sources using [Source N] notation.
F. Return with citations: The API returns the answer, citation metadata (document name, matter, page, excerpt, similarity score), and confidence metrics.
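
Put together, steps A through F could be sketched as below. The getAccessibleMatterIds helper, the embedChunk helper from the embedding sketch above, the llama3 model tag, and the prompt wording are assumptions for illustration, not the Rendex source.

async function answerQuestion(user, question) {
  const matterIds = await getAccessibleMatterIds(user);              // A. permissions

  const vector = await embedChunk(question);                         // B. embed the question

  const search = await fetch('http://qdrant:6333/collections/rendex_documents/points/search', {
    method: 'POST',                                                  // C. filtered vector search
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      vector,
      limit: 8,
      with_payload: true,
      filter: { must: [{ key: 'matter_id', match: { any: matterIds } }] },
    }),
  });
  const { result: hits } = await search.json();

  const context = hits                                               // D. numbered context block
    .map((h, i) => `[Source ${i + 1}] ${h.payload.document_name}: ${h.payload.text}`)
    .join('\n\n');

  const gen = await fetch('http://ollama:11434/api/generate', {      // E. local LLM answer
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'llama3',
      stream: false,
      prompt: `Answer using only the sources below and cite them as [Source N].\n\n${context}\n\nQuestion: ${question}`,
    }),
  });
  const { response: answer } = await gen.json();

  return {                                                           // F. answer + citations
    answer,
    citations: hits.map((h) => ({ document_name: h.payload.document_name, score: h.score })),
  };
}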

Ethical wall enforcement

Queries are filtered at the vector layer. The Qdrant search includes a matter_id filter that restricts results to only the matters the user has been granted access to. A user on Matter A cannot retrieve chunks from Matter B, even if the content is semantically similar. This is enforced at the database level, not the UI level.

7 Confidence Scoring

Every answer includes a confidence signal derived from the cosine similarity scores of the retrieved chunks.

Metric               Max cosine similarity
Threshold            0.70 (configurable)
Chunks retrieved     Top 8
Low confidence flag  max_score < threshold

When the highest similarity score is below the configured threshold, the UI displays a warning banner indicating that the answer may be less reliable. This signals to the attorney that the available documents may not contain a strong match for their question.
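
Derived from the search hits, the signal itself is only a few lines. A sketch with illustrative names, mirroring the 0.70 default above (the environment variable name is an assumption):

const CONFIDENCE_THRESHOLD = Number(process.env.CONFIDENCE_THRESHOLD ?? 0.70);

function confidenceFor(hits) {
  const maxScore = Math.max(...hits.map((h) => h.score)); // best cosine similarity
  return {
    max_score: maxScore,
    low_confidence: maxScore < CONFIDENCE_THRESHOLD,      // triggers the UI warning banner
    chunks_used: hits.length,
  };
}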

// API response includes confidence data
{
  "answer": "Based on the lease agreement, the auto-renewal...",
  "max_score": 0.87,
  "low_confidence": false,
  "chunks_used": 8,
  "citations": [
    { "document_name": "lease-2024.pdf", "score": 0.87, ... }
  ]
}

8 Document Deletion

When a document is deleted, Rendex removes it from three places simultaneously:

1. File system — the original file is deleted from /app/uploads/.
2. Vector database — all Qdrant points with the matching document_id are purged.
3. PostgreSQL — the document record is removed (cascading to related entries).

No residual knowledge. Because the AI model is pre-trained and never fine-tuned on your documents, deleting a document fully removes it from the system. There are no shadow copies, cached embeddings, or residual model weights.
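
A sketch of that three-way removal follows; the documents table name, the pg pool, and the stored-path field are assumptions for illustration.

import pg from 'pg';
import { unlink } from 'node:fs/promises';

const pool = new pg.Pool(); // connection settings from PG* environment variables

async function deleteDocument(doc) {
  await unlink(doc.storedPath); // 1. remove the original file from /app/uploads/

  await fetch('http://qdrant:6333/collections/rendex_documents/points/delete', {
    method: 'POST',             // 2. purge every vector point carrying this document_id
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      filter: { must: [{ key: 'document_id', match: { value: doc.id } }] },
    }),
  });

  await pool.query('DELETE FROM documents WHERE id = $1', [doc.id]); // 3. cascades to related rows
}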

9 Audit Trail

Every action in the pipeline is logged to an immutable, append-only audit table in PostgreSQL. The table has a database trigger that prevents all UPDATE and DELETE operations.

Field           Description
user_id         Who performed the action
action          What happened (upload, query, delete, login, permission change)
resource_type   What was affected (document, matter, user)
details         Structured JSON with full context (query text, document name, etc.)
ip_address      Client IP address
created_at      Timestamp (UTC, immutable)
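
The append-only guarantee mentioned above boils down to a trigger that rejects UPDATE and DELETE. A sketch of that one-time migration, run through node-postgres; the function and trigger names are illustrative.

import pg from 'pg';

const pool = new pg.Pool(); // connection settings from PG* environment variables

await pool.query(`
  CREATE OR REPLACE FUNCTION audit_log_block_mutation() RETURNS trigger AS $$
  BEGIN
    RAISE EXCEPTION 'audit_log is append-only';
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER audit_log_immutable
    BEFORE UPDATE OR DELETE ON audit_log
    FOR EACH ROW EXECUTE FUNCTION audit_log_block_mutation();
`);

await pool.end();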

Infrastructure Summary

Services (Docker Compose)

Ollama            LLM + embeddings (GPU)
Qdrant            Vector database
PostgreSQL        Auth, RBAC, audit log
Chat UI           Express.js web app
Nginx             TLS + reverse proxy

Network posture

Open ports        80, 443
Outbound egress   None
Internal only     5432, 6333, 11434
PostgreSQL        127.0.0.1 only

Questions? Need a formal security questionnaire completed?

Contact us at info@rendex.ai — we'll turn it around in 48 hours.