Mac Studio versus RTX hardware for running a local LLM

A Reddit user says they want to run a local LLM for email tagging, summaries, and related tasks. They currently use GPT-5-mini through the OpenAI API and say it gets about 75% on their own benchmark. Larger models reach about 90%, but their goal is about 70% quality locally with decent speed. They are comparing a Mac Studio M3 Ultra, a possible future M5 Mac Studio, and RTX Pro 5000/6000 hardware.

Quick term guide

local LLM
An AI language model that runs on your own computer instead of on a remote server.
GPT-5-mini
A smaller OpenAI language model used for reading and generating text.
OpenAI API
A way for an app to send requests to OpenAI and get AI results back.
benchmark
A test used to compare speed, quality, or cost.
Mac Studio
A powerful desktop Mac made by Apple.
hardware
The physical parts of a computer that you can touch.
Mac mini server
A Mac mini used as an always-on computer for files, apps, backups, or automation.
Apple Silicon
Apple's own line of chips (M1, M2, M3, M4, M5) used in Macs, known for performance and efficiency.