Tag

#pgvector

Articles tagged "pgvector" — 6 entries.

Article №54 · observability · Foundation · ~4 hours end-to-end: bring up the cockpit, drive a reindex + two RAG-evals through the control plane, score 44 questions, and ship the artifact
Second Brain

The Machine Manages Its Own Memory — and the Bug the Mocks Slept Through

Driving the Arena recall layer end-to-end on its own corpus: reindex → score → gate, dispatched through the control plane, recall@5 measured against 44 held-out questions. The first real drain caught a bug eight mock-injected unit tests had slept through — the case for operating the thing you built.

uses fieldkit.memory, fieldkit.arena, fieldkit.harness, fieldkit.eval

Article №17 · agentic · NIM · ~90 minutes: 30 min to design the tool surface, 30 min to wire FastMCP + pgvector, 15 min to register with Claude Code, 15 min for the demo and trace
Second Brain

Second Brain as a Tool — Wrapping the RAG Stack in MCP for Claude Code

Closing the Second Brain arc. Four MCP tools wrap the RAG chain — embed, retrieve, optionally rerank, generate — and any Claude Code session anywhere on the box becomes a grounded research client. 200 lines of Python, one launcher, one .mcp.json entry.

Article №09 · inference · Nemotron Reranker + pgvector full-text + Llama 3.1 8B NIM · ~45 minutes on top of the naive-RAG chain
Foundations

Hybrid Retrieval on the Spark — BM25, Dense, Fusion, Rerank

Four retrieval modes on one corpus — naive dense, BM25, Reciprocal Rank Fusion, Nemotron rerank. Dense is already 92% recall@5; rerank adds a point at K=10 and reorders the top. The 8B generator still refuses where retrieval is perfect — grounding, not retrieval, is the new bottleneck.

uses fieldkit.rag

Article №08 · inference · Llama 3.1 8B NIM + Nemotron Retriever + pgvector · ~30 minutes if the three endpoints are already warm
Foundations

Three Endpoints, One Answer — Naive RAG on a DGX Spark

Three endpoints in one curl chain — a query embeds through Nemotron, pgvector returns top-5 chunks in under 80 ms, and a Llama 3.1 8B NIM stuffs them into a strict-context prompt. The chain works; the 8B generator still refuses on questions its own context answers.

uses fieldkit.rag, fieldkit.eval

Article №07 · inference · pgvector · ~15 minutes first install, re-runs in seconds
Foundations

Where Your Vectors Live — pgvector on a DGX Spark

The substrate between the embed call and the retrieve call — pgvector 0.8.2 running as a Postgres 16 container on GB10, with 1000 Nemotron vectors, HNSW and ivfflat both indexed, and a planner that prefers seq scan until you tell it otherwise.

uses fieldkit.rag

Upcoming · inference · NIM · planned · ~14 min read
Machine that Builds Machines

Gates Before the Advisor — Recall Floors, Raw-Base Preflights, and the Bench That Ate Its Own Spec

Before the Advisor trained: a 182-source corpus pack with recall gates on two retrieval lanes (BM25 and live pgvector + NIM embedder), raw-base preflights that failed two NVIDIA bases honestly, and the rebuild that caught the bench's own spec contaminating its retrieval context.