Open Source · Memory-RAG

Persistent memory
for your LLMs

A lightweight semantic memory layer that gives AI agents long-term recall. Vector embeddings, Weaviate storage, and RAG — packaged in a single Docker Compose stack.

How it works
💬
User
API
🧠
Memory
🔍
Weaviate
🤖
LLM

Memory that scales with your agent

🔗

Semantic Embeddings

Converts messages into vector embeddings using the open-source nomic-embed-text model via Ollama — fully local and private.

📡

Vector Search

Retrieves the most relevant memories using Weaviate's high-performance vector similarity search with configurable top-K and distance thresholds.
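Conceptually, a top-K search with a distance threshold looks like the sketch below — a brute-force cosine-similarity version of what Weaviate's indexed `nearVector` search does far more efficiently. All names here are illustrative:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// memory pairs a stored text with its embedding vector.
type memory struct {
	Text   string
	Vector []float64
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns the k memories closest to the query vector, keeping
// only those within the given cosine-distance threshold — the two
// knobs the memory layer exposes for retrieval.
func topK(query []float64, mems []memory, k int, maxDistance float64) []memory {
	var hits []memory
	for _, m := range mems {
		if 1-cosine(query, m.Vector) <= maxDistance {
			hits = append(hits, m)
		}
	}
	sort.Slice(hits, func(i, j int) bool {
		return cosine(query, hits[i].Vector) > cosine(query, hits[j].Vector)
	})
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	mems := []memory{
		{"likes Go", []float64{1, 0}},
		{"likes tea", []float64{0, 1}},
		{"writes Go servers", []float64{0.9, 0.1}},
	}
	// "likes tea" falls outside the distance threshold and is dropped.
	for _, m := range topK([]float64{1, 0}, mems, 2, 0.5) {
		fmt.Println(m.Text)
	}
}
```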

💉

Context Injection

Automatically injects retrieved memories into the LLM prompt through middleware, making every response context-aware.
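The injection step amounts to assembling a prompt from the retrieved memories before the request reaches the LLM. A minimal sketch — the section headers and function name are illustrative, not the project's exact prompt format:

```go
package main

import (
	"fmt"
	"strings"
)

// injectContext builds the final LLM prompt by prepending the
// retrieved memories to the user's message, so the model answers
// with prior context in view.
func injectContext(memories []string, userMessage string) string {
	if len(memories) == 0 {
		return userMessage // nothing retrieved: pass through unchanged
	}
	var b strings.Builder
	b.WriteString("Relevant memories:\n")
	for _, m := range memories {
		b.WriteString("- " + m + "\n")
	}
	b.WriteString("\nUser: " + userMessage)
	return b.String()
}

func main() {
	prompt := injectContext(
		[]string{"The user's name is Ada.", "The user prefers Go."},
		"What language should I use?",
	)
	fmt.Println(prompt)
}
```

In the real service this runs as middleware, so handlers stay unaware of the memory layer entirely.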

⚡

Async Persistence

New memories are saved asynchronously via a background worker — no latency added to the user-facing response path.
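The background-worker pattern is the standard Go way to take a slow write off the request path: the handler pushes onto a channel and returns, while a goroutine drains the queue. A self-contained sketch (the real worker would call the embedding and Weaviate APIs where this one appends to a slice):

```go
package main

import (
	"fmt"
	"sync"
)

// memoryStore buffers new memories on a channel and persists them
// on a background goroutine, so the request handler never blocks
// on the slow embed-and-write path.
type memoryStore struct {
	queue chan string
	mu    sync.Mutex
	saved []string
	wg    sync.WaitGroup
}

func newMemoryStore() *memoryStore {
	s := &memoryStore{queue: make(chan string, 64)}
	s.wg.Add(1)
	go s.worker()
	return s
}

// Save enqueues a memory and returns immediately.
func (s *memoryStore) Save(text string) { s.queue <- text }

// worker drains the queue; in the real service this is where the
// embedding call and the Weaviate write would happen.
func (s *memoryStore) worker() {
	defer s.wg.Done()
	for text := range s.queue {
		s.mu.Lock()
		s.saved = append(s.saved, text)
		s.mu.Unlock()
	}
}

// Close stops accepting memories and waits for the queue to drain.
func (s *memoryStore) Close() {
	close(s.queue)
	s.wg.Wait()
}

func main() {
	store := newMemoryStore()
	store.Save("user prefers dark mode")
	store.Save("user lives in Lagos")
	store.Close()
	fmt.Println(len(store.saved), "memories persisted")
}
```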

🐳

One Command Deploy

Everything runs inside Docker. Ollama, Weaviate, and the Go server spin up together with a single docker-compose command.
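The repository ships its own docker-compose.yml; as a rough sketch, the three-service shape looks like the following (service names, images, and port mappings here are illustrative, not the repo's actual file):

```yaml
services:
  ollama:
    image: ollama/ollama              # local embedding model host
    ports: ["11434:11434"]
  weaviate:
    image: semitechnologies/weaviate  # vector store
    ports: ["8081:8080"]
  server:
    build: .                          # the Go API server
    ports: ["8080:8080"]
    depends_on: [ollama, weaviate]
```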

🛡️

Privacy First

All embeddings are generated locally. Your data never leaves your infrastructure — no third-party API calls for vectorisation.

Up and running in 60 seconds

terminal
# Clone the repository
git clone https://github.com/sobowalebukola/memcortex.git
cd memcortex

# Configure environment
cp .env.example .env

# Launch everything
docker-compose up -d --build

# Test it out
curl -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -H "X-User-ID: user-1" \
  -d '{"message":"Remember this."}'

Proven open-source foundations

🐹 Go
🐳 Docker
🦙 Ollama
🔷 Weaviate
📐 nomic-embed-text