AIImportance: Medium

User runs local Qwen models on ARC B70 for AI coding work

r/LocalLLMJun 12, 2026 · 4h ago

A Reddit user shared their experience running local AI workloads on an Intel ARC B70 graphics card. They said they used Qwen3.6-35B-A3B-GGUF with Hermes agent and Qwen3.6-27B-MTP-GGUF with Claude Code. They tried LM Studio, the official llama.cpp Docker image, a community Docker image, and kyuz0’s tools before building a Docker image themselves, which they said improved speed and stability.

Key points

The author says the ARC B70 handled local AI workloads well for them.
They used Claude Code with Qwen3.6-27B-MTP-GGUF.
They used Hermes agent with Qwen3.6-35B-A3B-GGUF.
They dropped LM Studio because it lacked SYCL backend support.
They reported better speed and stability after building the Docker image themselves.

Quick term guide

AI workload: A set of tasks that use an AI model or process AI results.
workloads: The tasks a computer is expected to handle.
graphics card: A component inside a computer that handles displaying images and running game visuals.
Hermes Agent: It appears to be a tool or community for building and managing AI agents.
Solo developer: An individual who handles all parts of creating a project or product alone.
developers: Developers are people who build software, apps, or websites.
performance: How fast and smoothly a site loads and works.
SYCL backend support: Support that helps AI software use certain hardware, such as Intel GPUs, for faster calculations.

Read original ↗