ILLUME
REPO ANALYZER
A high-performance semantic exploration engine designed to map complex codebase topologies through AST-informed graph neural networks and vector-based relational querying.
System Architecture
AST Decomposition
Illume decomposes raw source code into Abstract Syntax Trees, isolating function signatures, class hierarchies, and variable scopes before tokenization.
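As a rough illustration of this decomposition step, the sketch below uses Python's standard ast module to list classes and functions together with their enclosing scope. The function name decompose and the record layout are assumptions made for this example, not Illume's actual API, and variable scopes are left out for brevity.

```python
import ast

# Hypothetical sample input; any Python source string works.
SOURCE = '''
class Repo:
    def scan(self, path: str, depth: int = 1) -> list:
        found = []
        return found

def main():
    repo = Repo()
'''


def decompose(source: str) -> list[dict]:
    """Walk the AST and record each class and function with its enclosing scope."""
    records: list[dict] = []

    def visit(node: ast.AST, scope: tuple[str, ...]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                bases = [ast.unparse(base) for base in child.bases]
                records.append(
                    {"kind": "class", "name": child.name, "scope": scope, "bases": bases}
                )
                visit(child, scope + (child.name,))
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                params = [arg.arg for arg in child.args.args]
                records.append(
                    {"kind": "function", "name": child.name, "scope": scope, "params": params}
                )
                visit(child, scope + (child.name,))
            else:
                # Keep descending so nested definitions are still found.
                visit(child, scope)

    visit(ast.parse(source), ())
    return records


records = decompose(SOURCE)
```

Each record keeps the chain of enclosing names in scope, which is what later stages would need in order to tell a method from a top-level function.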
Semantic Embedding
Illume transforms structural nodes into 1536-dimensional vectors. Each fragment is contextualized within its parent module so that its relationships to surrounding code are preserved.
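One plausible way to contextualize a fragment is to prefix it with its module path and enclosing scopes before it is sent to the embedding model. The sketch below shows only that text-building step. The helper contextualize, the header layout, and the module name illume.indexer are all hypothetical, and the embedding call itself is deliberately omitted.

```python
def contextualize(module: str, parents: list[str], kind: str, code: str) -> str:
    """Prefix a code fragment with its module path and enclosing scopes."""
    qualified = ".".join([module, *parents])
    header = f"# module: {module}\n# scope: {qualified}\n# kind: {kind}\n"
    return header + code.strip() + "\n"


# Hypothetical fragment: a method named scan inside class Repo.
text = contextualize(
    "illume.indexer", ["Repo"], "function", "def scan(self, path): ..."
)
```

The resulting string, rather than the bare fragment, would then be handed to whichever embedding model produces the 1536-dimensional vector.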
Graph Orchestration
Relationships are persisted in a hybrid graph-relational model, allowing recursive "where-used" lookups across large monoliths and multi-repo codebases.
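A recursive "where-used" lookup over relationally stored edges can be sketched with a recursive common table expression. The example below uses an in-memory SQLite database as a stand-in so that it runs anywhere; the edges table, its columns, and the symbol names are assumptions for illustration, not Illume's schema.

```python
import sqlite3

# In-memory stand-in for the relational side of the graph store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (caller TEXT, callee TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [
        ("api.handler", "service.load"),
        ("cli.main", "service.load"),
        ("service.load", "db.fetch"),
        ("db.fetch", "db.connect"),
    ],
)

# UNION (not UNION ALL) drops duplicate rows, so call cycles still terminate.
WHERE_USED = """
WITH RECURSIVE users(symbol) AS (
    SELECT caller FROM edges WHERE callee = ?
    UNION
    SELECT e.caller FROM edges e JOIN users u ON e.callee = u.symbol
)
SELECT symbol FROM users ORDER BY symbol
"""


def where_used(connection: sqlite3.Connection, symbol: str) -> list[str]:
    """Return every symbol that directly or transitively uses `symbol`."""
    return [row[0] for row in connection.execute(WHERE_USED, (symbol,))]


callers = where_used(conn, "db.fetch")
```

Here callers contains the direct caller service.load as well as the two entry points that reach db.fetch through it. The same WITH RECURSIVE form is also valid PostgreSQL syntax.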
THE STACK
Built to scale
FastAPI
Chosen for its asynchronous request handling and Pydantic-driven data validation, which keep latency low between the vector store and the frontend.
Next.js 14
The frontend uses Server Components to pre-render codebase statistics, giving a responsive, app-like feel during heavy data navigation.
pgvector
Enables fast nearest-neighbor search inside PostgreSQL, keeping vector operations side by side with relational code metadata.
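To show what keeping vectors next to relational metadata can look like, here is a minimal sketch of a nearest-neighbor query that joins embeddings to symbol metadata in a single statement. It uses pgvector's cosine-distance operator <=>. The table names symbols and symbol_embeddings, their columns, and the helper names are hypothetical, and the cursor is assumed to be any DB-API cursor that accepts %s placeholders, such as one from psycopg.

```python
def to_vector_literal(values: list[float]) -> str:
    """Format numbers in pgvector's text form, e.g. '[0.5,1.0]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# Hypothetical schema: symbols(id, qualified_name, file_path) and
# symbol_embeddings(symbol_id, embedding vector(1536)).
NEAREST_SQL = """
SELECT s.qualified_name,
       s.file_path,
       e.embedding <=> %s::vector AS distance
FROM symbol_embeddings AS e
JOIN symbols AS s ON s.id = e.symbol_id
ORDER BY e.embedding <=> %s::vector
LIMIT %s
"""


def nearest_symbols(cursor, query_embedding: list[float], k: int = 5) -> list[tuple]:
    """Return the k symbols whose embeddings are closest to the query vector."""
    literal = to_vector_literal(query_embedding)
    cursor.execute(NEAREST_SQL, (literal, literal, k))
    return cursor.fetchall()
```

Because the distance expression and the metadata join live in one query, no second round trip is needed to resolve a vector hit back to a file path.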
CHALLENGES
The Hallucination Gap
Standard RAG pipelines often hallucinated function signatures that did not exist in the current version of the repo.
Token Overflow
Parsing massive monoliths meant hitting context limits almost immediately during the indexing phase.
SOLUTIONS
AST-Informed Validation
Implemented a post-processing layer that cross-references LLM outputs with the original AST to verify symbol existence before delivery.
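The idea behind this check can be sketched in a few lines: collect the symbols defined in the repository's AST, collect the names a generated snippet calls, and flag anything left over. The function names below are assumptions for this example rather than Illume's implementation. The sketch matches on bare names only, and a fuller version would also resolve imports and qualified paths.

```python
import ast
import builtins


def defined_symbols(source: str) -> set[str]:
    """Collect the name of every class and function defined in the source."""
    return {
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    }


def called_names(snippet: str) -> set[str]:
    """Collect names that a snippet calls, e.g. load(x) or repo.scan()."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(snippet)):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                names.add(node.func.id)
            elif isinstance(node.func, ast.Attribute):
                names.add(node.func.attr)
    return names


def unknown_symbols(repo_source: str, llm_snippet: str) -> set[str]:
    """Return called names with no definition in the repo (builtins are allowed)."""
    known = defined_symbols(repo_source) | set(dir(builtins))
    return called_names(llm_snippet) - known


# Hypothetical repository source and model output.
REPO_SOURCE = "class Repo:\n    def scan(self, path):\n        return []\n"
LLM_SNIPPET = "repo = Repo()\nprint(repo.scan('.'))\nrepo.scan_all('.')\n"

missing = unknown_symbols(REPO_SOURCE, LLM_SNIPPET)
```

In this example missing holds only scan_all, the one call with no matching definition. A snippet that is not valid Python raises SyntaxError in ast.parse, which a real layer would also treat as a failed validation.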
Recursive Chunking
Developed a custom chunking algorithm that breaks files down by logical scope (class > method > block) rather than by raw character count.
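Scope-based chunking of this kind can be approximated with the standard ast module: keep a class or function whole when it fits a line budget, and descend into its body only when it does not. This is a simplified sketch under that assumption, not the algorithm Illume ships. The names chunk_source and max_lines are hypothetical, and decorators, comments, and blank lines between statements are not assigned to any chunk.

```python
import ast

# Node types treated as scopes that may be split further.
SCOPES = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def chunk_source(source: str, max_lines: int = 40) -> list[tuple[str, int, int]]:
    """Return (label, start_line, end_line) chunks split along logical scopes."""
    chunks: list[tuple[str, int, int]] = []

    def visit(node: ast.AST, label: str) -> None:
        for child in node.body:
            name = getattr(child, "name", type(child).__name__)
            child_label = f"{label}.{name}" if label else name
            length = child.end_lineno - child.lineno + 1
            if isinstance(child, SCOPES) and length > max_lines:
                # Too large to keep whole: split into its inner scopes or blocks.
                visit(child, child_label)
            else:
                chunks.append((child_label, child.lineno, child.end_lineno))

    visit(ast.parse(source), "")
    return chunks


# Hypothetical nine-line sample file.
SAMPLE = (
    "class Repo:\n"
    "    def scan(self):\n"
    "        found = []\n"
    "        for p in self.paths:\n"
    "            found.append(p)\n"
    "        return found\n"
    "\n"
    "    def reset(self):\n"
    "        self.paths = []\n"
)

chunks = chunk_source(SAMPLE, max_lines=3)
```

With a budget of three lines, the class and the scan method are both split, while the two-line reset method stays intact. Raising max_lines above the file length returns the whole class as a single chunk.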
READY FOR DEPLOYMENT.
Illume is open-source and ready for integration. Explore the codebase or spin up a local instance using the CLI tool.