Deploy GLM-4.6V-Flash (9B dense VLM) on Huawei Ascend 910B NPU with vLLM - multimodal, OpenAI-compatible API, single/dual-card serving, reproducible benchmarks.
Updated Jun 9, 2026 - HTML