Auto-tuned launcher for GGUF models on llama.cpp / ik_llama.cpp — OpenAI-compatible server with multi-GPU tensor-split, MoE expert placement, measured flag tuning (AI Tune), hardware-matched HuggingFace downloads, and crash recovery. An Ollama alternative for multi-GPU rigs.
Single-kernel Codex runtime that distills skills, agents, plugins, and workflows into atomic capabilities and routes them through one main chain.
CacheGuard ("Cache Guard"): a drop-in proxy that keeps DeepSeek's server-side prefix cache stable in front of any coding agent, so cache-hit pricing never silently breaks.