HN
Today

We replaced RAG with a virtual filesystem for our AI documentation assistant

Mintlify describes how it moved its AI documentation assistant beyond traditional RAG. The team built ChromaFs, a virtual filesystem that translates UNIX commands into database queries, cutting session boot time from about 46 seconds to about 100 milliseconds and avoiding the cost of per-session sandboxes. AI agents can explore documentation like a codebase, with access control built in.

Score: 6
Comments: 0
Highest Rank: #2
Time on Front Page: 10h
First Seen: Apr 3, 5:00 PM
Last Seen: Apr 4, 2:00 AM
[Chart: Rank Over Time]

The Lowdown

Mintlify hit a common limitation of RAG (Retrieval-Augmented Generation) in their AI documentation assistant: the assistant could only see the chunks that matched a query, so it struggled when an answer spanned multiple pages or depended on exact syntax that did not land in the top-K results. They wanted the assistant to explore documentation the way a developer explores a codebase, with commands like grep, cat, ls, and find. The conventional approach, giving each session a sandbox with a real filesystem, was too slow (roughly 46 seconds to boot) and too expensive (an estimated $70,000+ per year at about 850,000 conversations a month), which prompted the search for a lighter design.

Mintlify's answer is ChromaFs, a virtual filesystem built atop their existing Chroma database. It intercepts UNIX commands and translates them into database queries, offering significant improvements:

  • Instant Session Creation: Boot time plummeted from ~46 seconds to ~100 milliseconds.
  • Zero Marginal Compute Cost: Because ChromaFs reuses the existing Chroma infrastructure, the per-conversation compute cost is negligible.
  • Leverages just-bash: ChromaFs is built on Vercel Labs' just-bash, a TypeScript re-implementation of bash. just-bash handles parsing and command logic, while ChromaFs handles the filesystem calls.
  • Efficient Directory Tree Bootstrapping: The entire file tree is stored in Chroma as a gzipped JSON document. On initialization, it is fetched and decompressed into in-memory structures, so ls, cd, and find need no network calls once the tree is cached.
  • Granular Access Control: Access policies (like isPublic and groups) are applied during tree building, pruning inaccessible paths before the agent even sees them, simplifying security compared to managing Linux user groups.
  • Page Reassembly and Caching: For cat commands, ChromaFs fetches all chunks that belong to a page, sorts them into document order, and joins them into the full page. Results are cached to avoid repeat database hits.
  • Lazy File Pointers: Large files like OpenAPI specs can be registered as lazy pointers, only fetching content from S3 when explicitly accessed by cat.
  • Read-Only System: The filesystem is strictly read-only, throwing EROFS errors on write attempts, ensuring statelessness and preventing data corruption.
  • Optimized grep: Recursive grep commands are intercepted and translated into Chroma queries that act as a coarse filter. The matching chunks are prefetched into Redis, and just-bash then runs the fine-grained match in memory over only those chunks, which keeps large searches fast.
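
The two-stage grep in the last bullet can be sketched as follows. This is a minimal illustration, not Mintlify's implementation: the Chunk shape, the function names, and the in-memory Map standing in for Redis are all assumptions, and the coarse filter is a plain substring scan where the real system would issue a Chroma query.

```typescript
// Hypothetical chunk shape; the real Chroma schema is not described in the source.
interface Chunk {
  page: string;  // virtual file path, e.g. "/guides/auth.mdx"
  index: number; // position of the chunk within its page
  text: string;
}

interface GrepHit {
  page: string;
  line: string;
}

// Stand-in for the Redis prefetch cache (assumption: keyed by the literal searched).
const prefetchCache = new Map<string, Chunk[]>();

// Stage 1 (coarse): keep only chunks containing the literal substring.
// A real system would push this filter down to the database.
function coarseFilter(store: Chunk[], literal: string): Chunk[] {
  return store.filter((c) => c.text.includes(literal));
}

// Stage 2 (fine): run the actual regular expression line by line, in memory,
// over the much smaller candidate set. Pass a regex without the "g" flag,
// because RegExp.prototype.test is stateful on global regexes.
function fineGrep(candidates: Chunk[], pattern: RegExp): GrepHit[] {
  const hits: GrepHit[] = [];
  for (const chunk of candidates) {
    for (const line of chunk.text.split("\n")) {
      if (pattern.test(line)) hits.push({ page: chunk.page, line });
    }
  }
  return hits;
}

function grepRecursive(store: Chunk[], literal: string, pattern: RegExp): GrepHit[] {
  let candidates = prefetchCache.get(literal);
  if (candidates === undefined) {
    candidates = coarseFilter(store, literal);
    prefetchCache.set(literal, candidates);
  }
  return fineGrep(candidates, pattern);
}

const store: Chunk[] = [
  { page: "/guides/auth.mdx", index: 0, text: "# Auth\nSet the apiKey header." },
  { page: "/guides/auth.mdx", index: 1, text: "Rotate keys monthly." },
  { page: "/api/users.mdx", index: 0, text: "GET /users\nRequires apiKey=<token>" },
];

const hits = grepRecursive(store, "apiKey", /apiKey=/);
console.log(hits);
```

Here the coarse stage narrows three chunks to two, and the regex stage keeps only the single line in /api/users.mdx that contains "apiKey=". The point of the split is that the expensive, exact matching only ever touches chunks the database has already shortlisted.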

ChromaFs now powers documentation assistance for hundreds of thousands of users across tens of thousands of conversations daily. By replacing real sandboxes with a virtual filesystem, Mintlify achieved near-instant session creation, negligible marginal compute cost, and built-in RBAC, all without new infrastructure.
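
To make the access-control and read-only behavior concrete, here is a minimal sketch of a pruned, in-memory, read-only tree. It is an illustration under stated assumptions rather than ChromaFs itself: apart from isPublic and groups, every name (TreeNode, Chunk, prune, ReadOnlyDocsFs) is hypothetical, the tree is taken as already-decompressed JSON, and a plain array stands in for the chunk store.

```typescript
// Hypothetical shapes; only isPublic and groups come from the description above.
interface TreeNode {
  name: string;
  isPublic: boolean;
  groups: string[];      // groups allowed to read a non-public node
  children?: TreeNode[]; // present for directories, absent for files
}

interface Chunk {
  page: string;
  index: number;
  text: string;
}

// Drop every node the user may not read, before any path is exposed to the agent.
function prune(node: TreeNode, userGroups: Set<string>): TreeNode | null {
  const allowed = node.isPublic || node.groups.some((g) => userGroups.has(g));
  if (!allowed) return null;
  if (!node.children) return node;
  const children = node.children
    .map((child) => prune(child, userGroups))
    .filter((child): child is TreeNode => child !== null);
  return { ...node, children };
}

class ReadOnlyDocsFs {
  private readonly root: TreeNode;
  private readonly chunks: Chunk[];
  private readonly pageCache = new Map<string, string>();

  constructor(root: TreeNode, chunks: Chunk[]) {
    this.root = root;
    this.chunks = chunks;
  }

  // Resolve an absolute path against the in-memory tree; no network involved.
  private resolve(path: string): TreeNode | undefined {
    let node: TreeNode | undefined = this.root;
    for (const part of path.split("/").filter((p) => p.length > 0)) {
      node = node?.children?.find((c) => c.name === part);
    }
    return node;
  }

  ls(path: string): string[] {
    const node = this.resolve(path);
    if (!node) throw new Error(`ENOENT: ${path}`);
    if (!node.children) throw new Error(`ENOTDIR: ${path}`);
    return node.children.map((c) => c.name);
  }

  // Reassemble a page: collect its chunks, sort by index, join, then cache.
  cat(path: string): string {
    const cached = this.pageCache.get(path);
    if (cached !== undefined) return cached;
    const node = this.resolve(path);
    if (!node) throw new Error(`ENOENT: ${path}`);
    if (node.children) throw new Error(`EISDIR: ${path}`);
    const text = this.chunks
      .filter((c) => c.page === path)
      .sort((a, b) => a.index - b.index)
      .map((c) => c.text)
      .join("\n");
    this.pageCache.set(path, text);
    return text;
  }

  // Every write path fails the same way, which keeps the session stateless.
  writeFile(path: string, _data: string): never {
    throw new Error(`EROFS: read-only file system, write '${path}'`);
  }
}

const fullTree: TreeNode = {
  name: "", isPublic: true, groups: [],
  children: [
    { name: "guides", isPublic: true, groups: [], children: [
      { name: "auth.mdx", isPublic: true, groups: [] },
    ] },
    { name: "internal", isPublic: false, groups: ["staff"], children: [
      { name: "runbook.mdx", isPublic: false, groups: ["staff"] },
    ] },
  ],
};

const chunks: Chunk[] = [
  { page: "/guides/auth.mdx", index: 1, text: "Send it as a header." },
  { page: "/guides/auth.mdx", index: 0, text: "# Auth" },
];

const visible = prune(fullTree, new Set(["customers"]));
if (visible === null) throw new Error("root is not readable");
const docsFs = new ReadOnlyDocsFs(visible, chunks);

console.log(docsFs.ls("/"));                // the "internal" directory is already gone
console.log(docsFs.cat("/guides/auth.mdx")); // chunks rejoined in index order
```

Because pruning happens once, when the tree is built, ls and cat never need a per-call permission check: a path the user cannot read simply does not exist in their session.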