MemQ Token Compression API
Purpose: Reduces input token costs for AI agents that load persistent memory into LLM context. Operates as stateless compression middleware, not as a memory system. Compatible with any upstream memory store.

Supported memory formats: claude-md (CLAUDE.md), openclaw-md, chatgpt, google-aom, generic markdown

Endpoints:

POST /v1/compress: Structural compression. Strips filler phrases,…
MemQ Token Compression API endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/bulk-optimize | Full MEMORY.md optimization with TF-IDF cosine similarity semantic deduplication, paraphrase detection, section merging, and compression. Achieves 50-95% token reduction. Uses… |
| POST | /v1/compress/stream | Compress a single new memory entry for real-time ingest. |
| GET | /v1/stats | Calculate projected token cost savings for a specific LLM model and usage pattern. |
| POST | /v1/compress | Compress raw AI memory content into compact LLM-parseable format. Reduces token count by 40-55% structurally. Uses real GPT tokenizer for accurate token counts. |
| POST | /v1/compress/incremental | Add new memory entries to an already-compressed block. Uses TF-IDF cosine similarity with synonym normalization to detect duplicates and paraphrases. Only genuinely new… |
| POST | /v1/compress/chunked | Compress very large memory files (200K+ tokens) by splitting into manageable chunks and reassembling. Supports up to 10MB payloads. Uses real GPT tokenizer for accurate token… |
| POST | /v1/memory/wrap | Embed a MemQ YAML frontmatter header into any memory file without compressing it. The header contains agent instructions to automatically call /v1/recall before loading and run… |
| POST | /v1/recall | Retrieve only the memory sections relevant to a given query. Scores all sections using TF-IDF cosine similarity and returns only those that match within a token budget. Supports… |
| POST | /v1/memory/health | Analyze memory content for quality issues. Detects redundancy (duplicate pairs via TF-IDF cosine similarity), bloat (verbose entries that could be more compact), and staleness… |
| POST | /v1/decompress | Reverse compression back to human-readable Markdown. |
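
Several endpoints above describe duplicate detection via TF-IDF cosine similarity. The listing does not document how MemQ implements it, so the following is only a minimal illustrative sketch of the general technique in plain Python. The function names (`tfidf_vectors`, `cosine`, `deduplicate`), the smoothed IDF formula, and the 0.8 threshold are all assumptions chosen for this example, not part of the API. Synonym normalization and paraphrase detection, which the service mentions, are not modelled here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs as terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption for this sketch), so that terms found
        # in every document still carry a nonzero weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(entries, threshold=0.8):
    """Keep an entry only if its similarity to every kept entry is below threshold."""
    vectors = tfidf_vectors(entries)
    kept = []  # indices of entries retained so far
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return [entries[i] for i in kept]


if __name__ == "__main__":
    memory = [
        "User prefers dark mode in the editor",
        "user prefers dark mode in the editor.",
        "Deploys run on Fridays at noon",
    ]
    for entry in deduplicate(memory):
        print(entry)
```

The same similarity score could in principle rank memory sections against a query, which is how the /v1/recall description reads, but the actual scoring and token-budget logic of the service is not specified in this listing.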
MemQ Token Compression API pricing
| Plan | Price | Rate limit | Quotas |
|---|---|---|---|
| BASIC | Free | - | |
| PRO | $4.99 / month | 5 / minute | |
| ULTRA | $9.99 / month | 5 / minute | |
| MEGA | $14.99 / month | 20 / minute | |