Pure Rust Inference Engine
Updated Jun 30, 2026 - Rust
Fully uncensored, capability-enhanced abliteration of Qwen3.6-27B. NVFP4 + z-lab DFlash speculative decoding (n=12) on the unified ghcr.io/aeon-7/aeon-vllm-ultimate:latest container, tuned for long-context draft acceptance on DGX Spark. 6 HF variants (BF16/NVFP4/MTP/MTP-XS), docker-compose, and QuickStart.
One-command vLLM installation for NVIDIA DGX Spark with Blackwell GB10 GPUs (sm_121 architecture)
Headless remote desktop setup for NVIDIA DGX Spark using Sunshine streaming
Local diagnostic CLI for NVIDIA DGX Spark (GB10). Detects power caps, unified memory pressure, thermal risk, Docker/runtime issues, and validates vLLM/Ollama/llama.cpp/SGLang recipes.
Bleeding-edge ComfyUI for NVIDIA DGX Spark (GB10/Blackwell/sm_121a). CUDA 13 + SageAttention v3 (sm_121a) + NVFP4 + 14 custom-node packs + Flux 2 Dev / LTX 2.3 22B / ACE-Step v1.5 XL Turbo pre-bundled with abliterated text-encoder paths.
DGX Spark / GB10 vLLM Docker stack for large-model serving, presets, patches, and validation notes.
Serve the home! An inference stack for your NVIDIA DGX Spark, aka the Grace Blackwell AI supercomputer on your desk. Mostly vLLM-based and single-Spark for now. For the not-so-rich buddies. If you want the latest, in-testing changes, look at the branches.
Headless 4K remote desktop for the NVIDIA DGX Spark (GB10): one-command installer for Sunshine + Moonlight low-latency game streaming with NVENC hardware encoding, a software virtual display (no HDMI dummy plug), GDM autologin, and optional Tailscale.
vLLM + Qwen3.5-122B-A10B-NVFP4 on NVIDIA DGX Spark (GB10/SM121) — single-GPU NVFP4 W4A4 with MTP speculative decoding, self-contained Docker build
Some benchmark results of small models and quants that fit on DGX Spark
Private LLM/RAG platform in one command for NVIDIA DGX Spark / GB10 (arm64). Validated on real hardware.
GPU-accelerated WhisperX on NVIDIA Blackwell (SM_121) - DGX Spark compatible
Single-file web UI for NVIDIA DGX Spark — pull Ollama models, browse and download from HuggingFace, manage LiteLLM routing, and control SGLang, vLLM, llama.cpp, LocalAI, and ComfyUI. All from one browser tab.