Apple announces a new local AI engine for its own chips
The Reddit post says Apple announced CoreAI, a new on-device inference engine for Apple Silicon, at WWDC. The author describes CoreAI as a possible future replacement for CoreML and an alternative to MLX, llama.cpp, and PyTorch for running models on phones and tablets. The post says model weights must be converted with a Python script, and that no performance numbers are available yet. It also points to Apple’s claim about deploying a 20B-parameter model on device.
Key points
- Apple announced CoreAI for running AI models on Apple Silicon, according to the post.
- The author says it may replace CoreML for some local AI use cases.
- Model weights must be converted with a Python script before they can run in CoreAI.
- The post says there are no performance numbers yet.
- Apple’s claim of deploying a 20B-parameter model on device is highlighted as a sign that larger local models may be possible.
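To put the 20B figure in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would take at a few common precisions. The post does not say which precision or quantization Apple uses, so the bit widths below are assumptions chosen for illustration, and the helper name `weight_memory_gb` is hypothetical. The estimate ignores everything besides the weights, such as activations and cache.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1e9


PARAMS_20B = 20e9  # 20 billion parameters, per the claim cited in the post

# Assumed precisions for illustration; the post does not specify one.
for label, bits in [("float16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>7}: {weight_memory_gb(PARAMS_20B, bits):.1f} GB")
```

Under these assumptions the weights come to roughly 40 GB, 20 GB, and 10 GB respectively, which suggests why some form of reduced precision would likely be needed to fit a model of this size on a phone or tablet.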
Quick term guide
- on-device inference
- Running an AI model on your own machine instead of sending the work to a cloud service.
- Apple Silicon
- Apple's own line of chips, such as the M-series (M1 through M5) in Macs and iPads and the A-series in iPhones, known for performance and efficiency.
- model weights
- The internal numbers an AI model learns during training. Saving them lets you reuse or share the trained model.
- Python script
- A small program written in the Python language.
- local execution
- Running an AI model directly on your own computer instead of using a remote server.
- long sessions
- Extended AI work sessions where many messages or steps happen over time.
- local model
- An AI model that runs directly on your own computer or device instead of a company server, with no internet connection or external service needed.