---
name: cerebrun
description: Unified AI Memory Gateway & MCP Server. Persistent cross-model context, 4-layer architecture (preferences, work context, personal identity, encrypted vault), semantic vector search, multi-LLM gateway with BYOK, knowledge base, and cross-conversation memory. Use when you need persistent memory, context sharing between AI models, encrypted secret storage, or inter-model communication.
version: 0.5.0
homepage: https://cereb.run
metadata:
  requires:
    env:
      - CEREBRUN_API_KEY
  primaryEnv: CEREBRUN_API_KEY
---

# Cerebrun — Unified AI Memory Gateway

Cerebrun is a Model Context Protocol (MCP) server that creates a persistent "Digital Self" — a unified memory and context layer shared across all your AI agents. Instead of each agent starting from zero, they all read from and write to the same structured context.

## Quick Start

### 1. Get Your API Key

Log in to the Cerebrun Dashboard and create an API key with the permissions you need:

- **Basic**: Layer 0 access only (language, timezone, preferences)
- **Layer 1**: Work context (projects, goals, memories)
- **Layer 2**: Personal identity (name, location, interests)

### 2. Configure MCP Connection

Add to your MCP client configuration:

```json
{
  "mcpServers": {
    "cerebrun": {
      "url": "https://cereb.run/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    }
  }
}
```

### 3. Startup Workflow

Every session should begin with:

1. Call `get_context` with layer 0 to load user preferences
2. Call `get_context` with layer 1 (if permitted) for active projects and goals
3. Call `list_available_providers` to see which LLMs are available for inter-model communication
4. Use `search_context` to find relevant prior knowledge before injecting context

## Layer Architecture

Cerebrun organizes user data into four distinct layers with escalating sensitivity:

### Layer 0 — Public Preferences (Always Accessible)

- Language, timezone, communication style
- Output format preferences, blocked topics
- **When to update**: User says "I prefer concise answers", "use UTC+3", "speak in Turkish"

### Layer 1 — Work Context (Requires `layer1` Permission)

- Active projects, current goals, working directories
- Pinned memories (important things to remember)
- **When to update**: User starts/finishes projects, sets goals, says "remember this"

### Layer 2 — Personal Identity (Requires `layer2` Permission)

- Display name, location, interests, contact preferences
- Relationship notes
- **When to update**: User shares personal info — name, city, hobbies

### Layer 3 — Encrypted Vault (Dashboard Only)

- API keys, tokens, passwords, secrets
- AES-256-GCM encryption at rest
- **NEVER store secrets in Layers 0-2 or the Knowledge Base**
- Read access requires explicit user consent via `request_vault_access`
- Write access is dashboard-only — direct users to Dashboard > Vault

## Available Tools (14 Total)

### Context Management

| Tool | Description |
|------|-------------|
| `get_context` | Read any layer (0-3). Layer 3 requires prior consent. |
| `update_context` | Update Layers 0-2. Auto-embeds Layer 1 for semantic search. |
| `search_context` | Semantic vector search across context + knowledge. Use before injecting context to prevent token waste. |
| `request_vault_access` | Submit a consent request for vault read access. User must approve via the dashboard. |

### Knowledge Base

| Tool | Description |
|------|-------------|
| `push_knowledge` | Store a categorized knowledge entry. Auto-vectorized for semantic search. |
| `query_knowledge` | Search knowledge by keyword, category, tag, or project. |
| `list_knowledge_categories` | List all categories with entry counts. |

### LLM Gateway & Inter-Model Communication

| Tool | Description |
|------|-------------|
| `list_available_providers` | List configured LLM providers with available models. Call this before `chat_with_llm`. |
| `chat_with_llm` | Send a message to any LLM (OpenAI, Gemini, Anthropic, Ollama). Auto-uses stored API keys and injects cross-conversation memory. |
| `fork_conversation` | Fork a conversation at any message to a different LLM. |
| `get_llm_usage` | Token usage metrics by provider and model. |

### Conversation History

| Tool | Description |
|------|-------------|
| `list_conversations` | List recent LLM conversations. |
| `get_conversation` | Get a full conversation with all messages. |
| `search_conversations` | Keyword search across conversation history. |

## Knowledge Base Best Practices

Use `push_knowledge` to persist important information across sessions:

```
Categories: project_update, code_change, decision, learning, todo,
insight, architecture, bug_fix, feature, note
```

**Always include:**

- `summary` — one-line description for quick scanning
- `category` — appropriate category from the list above
- `tags` — relevant keywords for filtering (e.g., `["rust", "auth", "api"]`)
- `source_project` — project name when applicable

**Security guard:** The server blocks any attempt to store API keys, passwords, or tokens in the Knowledge Base. If detected, you will receive a BLOCKED error. Direct users to store secrets in the Vault instead.

## Inter-Model Communication

You can communicate with other AI models through the Gateway:

### Example: Get a Second Opinion

```
1. list_available_providers → shows: ollama (qwen3-coder, gpt-oss)
2. chat_with_llm(provider: "ollama", model: "qwen3-coder",
   message: "Review this architecture decision: ...")
3. Use the response to inform your own analysis
```

### Example: Brainstorming Session

```
1. chat_with_llm(provider: "openai", model: "gpt-4.1",
   message: "Generate 5 alternative approaches for...",
   title: "Architecture Brainstorm")
2. Fork the conversation to another model for comparison:
   fork_conversation(conversation_id: "...", new_provider: "anthropic",
                     new_model: "claude-sonnet-4.6")
```

### Cross-Conversation Memory

When using `chat_with_llm`, the Gateway automatically:

- Injects recent messages from other conversations as context
- Searches the knowledge base for relevant entries via vector similarity
- Respects the conversation's token budget to prevent context overflow

## Security Rules

1. **NEVER** store API keys, passwords, or tokens in Layers 0-2 or the Knowledge Base
2. **NEVER** echo secrets back to users in plain text
3. If a user pastes an API key in chat, tell them to store it in Dashboard > Vault
4. Use `search_context` before injecting large context blocks — prevents token waste
5. Respect layer permissions — don't attempt to access layers you don't have permission for
6. All blocked security attempts are logged to the audit trail

## Context Injection (Over-Injection Prevention)

Cerebrun uses a smart context injection system:

- **Only Layer 0** preferences are auto-injected into LLM conversations
- For deeper context, use `search_context` to find only relevant information
- Each conversation has a configurable `context_token_budget` (default: 2000 tokens)
- Cross-conversation memory is grouped and truncated to stay within budget

## Error Handling

| Error | Meaning | Resolution |
|-------|---------|------------|
| `No permission for Layer X` | API key lacks required scope | Create a new key with appropriate permissions in the dashboard |
| `No API key configured for provider: X` | User hasn't added a provider key | Direct user to Dashboard > LLM Keys |
| `BLOCKED: Content contains sensitive credentials` | Secret detection triggered | Remove actual key values; store metadata only |
| `Vault access requires explicit consent` | Layer 3 read needs approval | Use `request_vault_access` first, then wait for user approval |

## Supported LLM Providers

| Provider | Embedding Support |
|----------|-------------------|
| **OpenAI** | text-embedding-3-small |
| **Anthropic** | — |
| **Google Gemini** | — |
| **Ollama Cloud** | nomic-embed-text |

Any model available from these providers can be used. Use `list_available_providers` to see which providers the user has configured.
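
For clients that talk to the MCP endpoint directly rather than through an MCP SDK, the helper below is a minimal Python sketch of the request body a `chat_with_llm` call would carry. The JSON-RPC `tools/call` envelope comes from the MCP specification, and the `provider`/`model`/`message` argument names come from the examples in this document; everything else (function name, argument defaults) is illustrative, not an official client API.

```python
import json


def mcp_tool_call(tool: str, arguments: dict, request_id: int = 1) -> str:
    """Build an MCP `tools/call` JSON-RPC 2.0 request body.

    The envelope shape follows the MCP spec; the tool and argument
    names are taken from this document's examples.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Example: ask another model for a second opinion via the Gateway.
body = mcp_tool_call("chat_with_llm", {
    "provider": "ollama",
    "model": "qwen3-coder",
    "message": "Review this architecture decision: ...",
})
```

The resulting string would be POSTed to `https://cereb.run/mcp` with the same `Authorization: Bearer` header shown in the Quick Start configuration.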
## API Endpoints (REST)

For direct HTTP integration (non-MCP):

| Method | Path | Description |
|--------|------|-------------|
| `GET/PUT` | `/api/v0/context` | Layer 0 preferences |
| `GET/PUT` | `/api/v1/context` | Layer 1 work context |
| `GET/PUT` | `/api/v2/context` | Layer 2 personal data |
| `PUT` | `/api/v3/context` | Vault storage (session auth) |
| `POST` | `/mcp` | MCP protocol endpoint |
| `GET/POST` | `/api/knowledge` | Knowledge CRUD |
| `PUT/DELETE` | `/api/knowledge/:id` | Update/delete entry |
| `POST` | `/api/knowledge/:id/move-to-vault` | Move to encrypted vault |
| `POST` | `/api/llm/conversations/:id/chat` | LLM chat (non-streaming) |
| `POST` | `/api/llm/conversations/:id/stream` | LLM chat (SSE streaming) |
| `POST` | `/api/llm/compare` | A/B comparison across models |
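
As a rough sketch of the layer-to-path mapping in the table above, the Python helpers below build (but do not send) a context read request. The paths come from the endpoint table and the homepage URL from the frontmatter; the helper names are hypothetical, and Bearer auth on the REST endpoints is an assumption carried over from the MCP configuration (note the table marks `PUT /api/v3/context` as session auth, handled dashboard-side).

```python
import urllib.request

BASE_URL = "https://cereb.run"  # homepage from the frontmatter


def context_endpoint(layer: int) -> str:
    """Return the REST path for a context layer (0-3), per the endpoint table."""
    if layer not in (0, 1, 2, 3):
        raise ValueError("Cerebrun defines four context layers: 0-3")
    return f"/api/v{layer}/context"


def get_context_request(layer: int, api_key: str) -> urllib.request.Request:
    """Build an unsent GET request for a context layer.

    Assumption: the REST API accepts the same Bearer token as the MCP
    endpoint. Layer 3 reads additionally require explicit user consent.
    """
    return urllib.request.Request(
        BASE_URL + context_endpoint(layer),
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

A client would pass the built request to `urllib.request.urlopen` (or translate the same URL and header into any HTTP library).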