User runs local Qwen models on ARC B70 for AI coding work
A Reddit user shared their experience running local AI workloads on an Intel ARC B70 graphics card. They said they used Qwen3.6-35B-A3B-GGUF with Hermes agent and Qwen3.6-27B-MTP-GGUF with Claude Code. They tried LM Studio, the official llama.cpp Docker image, a community Docker image, and kyuz0’s tools before building a Docker image themselves, which they said improved speed and stability.
Key points
- The author says the ARC B70 handled local AI workloads well for them.
- They used Claude Code with Qwen3.6-27B-MTP-GGUF.
- They used Hermes agent with Qwen3.6-35B-A3B-GGUF.
- They dropped LM Studio because it lacked SYCL backend support.
- They reported better speed and stability after building the Docker image themselves.
Quick term guide
- AI workload
- A set of tasks that use an AI model or process AI results.
- workloads
- The tasks a computer is expected to handle.
- graphics card
- A component inside a computer that handles displaying images and running game visuals.
- Hermes Agent
- It appears to be a tool or community for building and managing AI agents.
- Solo developer
- An individual who handles all parts of creating a project or product alone.
- developers
- Developers are people who build software, apps, or websites.
- performance
- How fast and smoothly a site loads and works.
- SYCL backend support
- Support that helps AI software use certain hardware, such as Intel GPUs, for faster calculations.