A lightweight semantic memory layer that gives AI agents long-term recall. Vector embeddings, Weaviate storage, and RAG — packaged in a single Docker container.
Converts messages into vector embeddings using the open-source nomic-embed-text model via Ollama — fully local and private.
Retrieves the most relevant memories using Weaviate's high-performance vector similarity search with configurable top-K and distance thresholds.
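In production the ranking happens inside Weaviate, but the query shape is easy to illustrate in plain Go. This sketch implements cosine distance (Weaviate's default metric, where 0 means identical) and a top-K filter with a distance cutoff; the `Memory` type and parameter names are assumptions for the example:

```go
package main

import (
	"math"
	"sort"
)

// Memory pairs stored text with its embedding.
type Memory struct {
	Text   string
	Vector []float32
}

// cosineDistance returns 1 - cosine similarity, matching Weaviate's
// default "cosine" distance metric (0 = identical, 2 = opposite).
func cosineDistance(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return 1 - dot/(math.Sqrt(na)*math.Sqrt(nb))
}

// TopK returns up to k memories whose distance to the query vector is
// at most maxDist, closest first.
func TopK(memories []Memory, query []float32, k int, maxDist float64) []Memory {
	type scored struct {
		m Memory
		d float64
	}
	var hits []scored
	for _, m := range memories {
		if d := cosineDistance(m.Vector, query); d <= maxDist {
			hits = append(hits, scored{m, d})
		}
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].d < hits[j].d })
	if len(hits) > k {
		hits = hits[:k]
	}
	out := make([]Memory, len(hits))
	for i, h := range hits {
		out[i] = h.m
	}
	return out
}
```

The distance threshold keeps weakly related memories out of the prompt even when fewer than K strong matches exist.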
Automatically injects retrieved memories into the LLM prompt through middleware, making every response context-aware.
New memories are saved asynchronously via a background worker — no latency added to the user-facing response path.
Everything runs inside Docker. Ollama, Weaviate, and the Go server spin up together with a single docker-compose command.
All embeddings are generated locally. Your data never leaves your infrastructure — no third-party API calls for vectorisation.
```bash
# Clone the repository
git clone https://github.com/sobowalebukola/memcortex.git
cd memcortex

# Configure environment
cp .env.example .env

# Launch everything
docker-compose up -d --build

# Test it out
curl -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -H "X-User-ID: user-1" \
  -d '{"message":"Remember this."}'
```