M2 16GB Mac test hit memory trouble during compression
A user says they added Gwen3 4b as an auxiliary model for compression in Hermes Agent. They saw a token speed of 20k/s during testing. While summarizing 16000 MD files, RAM use rose to 12GB and the Mac mini shut down immediately.
Key points
- The user tested Gwen3 4b as an auxiliary model in Hermes Agent.
- They reported a token speed of 20k/s.
- RAM use reached 12GB during a 16000 MD file summarization test.
- The Mac mini shut down right away and could not be turned on remotely.
- On an M2 16GB Mac, large summary jobs should be tested in smaller batches first.
Quick term guide
- Gwen3 4b
- A likely AI model name as written in the Reddit title.
- auxiliary model
- A helper AI model used alongside the main model.
- compression
- A process that shortens older chat details so the AI can keep working in a long session.
- compress
- To take a lot of information and turn it into a shorter, simpler version.
- Hermes Agent
- It appears to be a tool or community for building and managing AI agents.
- testing
- The process of checking that software does what it's supposed to do, usually by running it and looking for errors.
- MD file
- A text document written in Markdown format.
- Mac mini
- A small desktop computer made by Apple.