# niuma_code: Full Documentation

> A companion tool for Claude Code. Subtract context, multiply attention: every LLM token lands exactly where it matters.

## What is niuma_code

niuma_code is a Python-based terminal tool designed for LLM coding tools like Claude Code. It is packaged as a portable niuma.exe: download and run, no installation needed.

The core idea stems from "Attention Is All You Need": models don't lack attention mechanisms, they lack attention placed in the right places. The longer the context, the more noise dilutes the truly critical information. niuma_code does one thing: subtract context, multiply attention.

**Version**: 0.2.0 (Alpha)

**License**: MIT

**Platform**: Windows

---

## Core Features in Detail

### 1. harness Autonomous Chat (Default Mode)

No commands needed. The LLM explores autonomously in a tool loop, thinking and acting at the same time, and finishes the task in one go.

- **tool_use loop**: LLM autonomously decides to read files / edit code / run commands until complete
- **API error auto-retry**: Streaming rendering + ESC cancel + auto-compression for long contexts
- For tasks with fuzzy boundaries that are hard to pre-decompose, the LLM judges when to stop
- No re-planning back-edges: the clearer the initial input, the more accurately it runs

### 2. loop Engineered Orchestration

`/loop` enters a goal-oriented self-loop: build a checklist with verification commands, then execute sequentially, verify each step, and self-correct on failure.

- **Plan → Execute → Verify → Failure retry**: 3-strike mechanism that feeds failure reasons back to avoid repeating mistakes
- **Two orthogonal gates**: `MAX_ROUNDS=20` hard task limit + `MAX_RETRIES=3` per-task self-correction
- Each round internally calls harness for a single step, suitable for decomposable, verifiable engineering tasks
- A task that fails 3 times does not interrupt the run; it is recorded and execution continues. Check the [!!] markers in the summary

### 3. TUI Full-Screen Mode

A prompt_toolkit-based full-screen interface with the input box fixed at the bottom. New messages can be appended at any time during LLM streaming output.

- 5 modal overlays: model settings, message queue, context switching, conversation management, permission confirmation
- Mouse drag selection with highlight, scroll browsing, left-click auto-copy, Ctrl+scroll font scaling
- Real-time status bar: thinking preview, input/output token counts, compression progress, ESC cancel

### 4. IDE Orchestration Mode

A full-screen code editor for writing orchestration scripts in Python-native syntax, compiling LLM calls into controllable script workflows.

- Injected `llm_call` / `llm_confirm` / `llm_judge` functions for controllable LLM orchestration
- F5 preview (AST static analysis step expansion), F6 orchestration run (unattended, auto-approve tool permissions)
- Safe sandbox execution + dangerous import blacklist (os/subprocess/socket, etc.)

### 5. Multi-Provider Routing

Configure multiple API endpoints in settings.json; requests are routed automatically to the correct provider by model.

- Each provider declares `base_url` / `api_key` / supported model list
- Use `#tag` or `/model` to switch models; requests automatically hit the correct endpoint
- Three-layer config overlay (user / project / project-local), later overrides earlier

### 6. Multi-Context Parallelism

Extend a single conversation into N parallel contexts, each independently maintaining history, token counts, and compression state.

- LRU eviction, default limit 5 (`max_contexts` configurable), evicted contexts archived and recoverable
- `/context` visual overlay for switching; background async summary generation when leaving
- `/messages` multi-select by conversation unit, delete / move / LLM summary merge

### 7. Code Knowledge Graph

Parse code structure via tree-sitter and build graphs of symbol definitions, calls, and dependency relationships.
- 4 read-only retrieval tools: locate symbols, check file dependencies, find references, trace calls
- Returns `file_path:line_number` with signatures, replacing grep-based searching
- Three-factor decision for on-demand rebuild, avoiding full parse on every startup

### 8. Perception-Driven Memory

Persistent memory based on memory-palace, transcribing runtime events into memory in real-time.

- 10 perception events: eye/body/tongue/nose/outcome, etc., written in real-time rather than extracted at conversation end
- Fact triples + conversation summaries, 4-layer retrieval + Bayesian decay
- Auto-retrieves and injects into system prompt on next conversation, stable recall positions

### 9. Outcome Reward Tracking

Track the full lifecycle of task goals, compute reward scores based on completion quality.

- OutcomeTracker computes 0.0~1.0 reward scores per task lifecycle
- Reward scores serve as memory reuse value assessment signals
- One of the 10 perception events, linked with read/write/tool calls

### 10. Sub-Agent Parallel Research

Read-only sub-agents execute tools in parallel within isolated contexts, return summaries after research.
- Multiple sub-agents execute read-only tools in parallel without blocking
- Isolated context research, only returns summaries, doesn't pollute main conversation
- Belongs to the default autonomous chat paradigm, called on-demand rather than as a standalone mode

---

## Quick Start

### Download

Download [niuma.exe](https://niumacode.cn/download/niuma.exe) from the official site (no installation needed, double-click to run).

### Configuration

Create `~/.niuma/settings.json`:

```json
{
  "factories": [
    {
      "base_url": "https://api.anthropic.com",
      "api_key": "your-key",
      "options": ["claude-sonnet-4-6"]
    }
  ]
}
```

### Launch

```bash
niuma.exe
```

---

## Command Reference

| Command | Description |
|---------|-------------|
| `/ide` | Enter full-screen code editor |
| `/context` | Multi-context management (new/rename/switch/delete) |
| `/help` | Show help information |
| `/copy` | Copy recent reply to clipboard |
| `/resume` | Resume incomplete tasks |
| `/clear` | Clear conversation context |
| `/restart` | Restart niuma |
| `/effort` | View/switch effort level |
| `/model` | View/switch model |
| `/quit` | Exit program |

---

## Links

- [Official Site](https://niumacode.cn/)
- [GitHub](https://github.com/zhiguoliu/niuma-code)
- [GitHub Issues](https://github.com/zhiguoliu/niuma-code/issues)
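
The `factories` block from the Configuration section can be read as a lookup table from model name to endpoint. The following is a minimal sketch of that idea, not niuma_code's actual implementation. It assumes `options` is the provider's supported model list, and `build_routes` and `resolve` are hypothetical helper names introduced only for illustration.

```python
import json

# Hypothetical settings text mirroring the Configuration example.
SETTINGS_JSON = """
{
  "factories": [
    {
      "base_url": "https://api.anthropic.com",
      "api_key": "your-key",
      "options": ["claude-sonnet-4-6"]
    }
  ]
}
"""


def build_routes(settings: dict) -> dict:
    """Map each model name to the provider entry that lists it.

    Assumption: `options` holds the models a provider supports.
    """
    routes = {}
    for provider in settings.get("factories", []):
        for model in provider.get("options", []):
            # First declaration wins in this sketch; the tool's real
            # tie-breaking rule is not documented here.
            routes.setdefault(model, provider)
    return routes


def resolve(routes: dict, model: str) -> dict:
    """Return the provider entry for a model, or raise if none is configured."""
    try:
        return routes[model]
    except KeyError:
        raise ValueError(f"no provider configured for model {model!r}") from None


routes = build_routes(json.loads(SETTINGS_JSON))
print(resolve(routes, "claude-sonnet-4-6")["base_url"])
```

With more than one entry in `factories`, the same lookup would send each model to whichever provider lists it, which is one plausible reading of "auto-route requests to the correct provider by model".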