**Santiago** @svpino 2026-04-08 If you ask me, Gemma 4 is one of the best models out there for a single reason: You can run it locally and it’s really, really good (probably Sonnet level?) And, of course, ya can now use it to power OpenClaw and show a middle finger to the company who doesn’t like it.
**Piotr Pomorski** @PtrPomorski [2026-04-09](https://x.com/PtrPomorski/status/2042108784550588592) “You can run it locally and it’s really, really good” -> the X slogan for basically EVERY SINGLE open source LLM that comes out
**Konstantin Gladych** @gladkos [2026-04-08](https://x.com/gladkos/status/2042015698030096832) Google Turboquant makes a large context window possible for agentic tasks even on mid-end devices.
**Jatin Garg** @jatingargiitk [2026-04-09](https://x.com/jatingargiitk/status/2042098843484033238) running it locally is the real moat here, but “sonnet level” depends a lot on the task. for coding and long-horizon reasoning, that’s still a stretch. the interesting part is getting near-frontier quality without the API leash.
**Sam Selvanathan** @samselvanathan\_ [2026-04-09](https://x.com/samselvanathan_/status/2042068569769570727) 'Sonnet level' is the wrong frame. 26B MoE with ~4B active parameters per forward pass isn't competing with Sonnet quality-wise. It's frontier-adjacent reasoning at local-inference economics. Different deployment category. The teams that get this aren't asking 'is it as good as
**Pia Red Dragon** @PiaRedDragon [2026-04-09](https://x.com/PiaRedDragon/status/2042084912401174912) Agreed, but if you are not going to use the full sized model use these guys version, the 30GB version gave me BETTER results on MMLU. I am not sure what their proprietary quantization method is, but it is insane!
**GooGZ AI** @PaulGugAI [2026-04-08](https://x.com/PaulGugAI/status/2042012661916078116) Which flavor of Gemma 4 can run on a 16GB mini, and would it be 'clever enough' for most things, however? I've been hearing mixed reviews/reports so far about the efficacy of the smaller E2B/E4B variants. I think the jury is still out.
**Balvinder Kalon** @BalvinderKalon [2026-04-09](https://x.com/BalvinderKalon/status/2042110482329289096) 25 tok/s on a MacBook Air is genuinely usable. the game changer with local models isn't matching frontier quality, it's having a capable model running with zero latency and zero cost for the 80% of tasks that don't need opus-level reasoning. gemma 4 hits that sweet spot.
**NColonJr** @NelsonColonJr [2026-04-09](https://x.com/NelsonColonJr/status/2042074194683347400) It's insanely efficient, I thought it was just as good as claude at first, it's not even close. Still buggy, gives me cracked code, crashes ~80k-100k token spent...still sufficient just not as good.
**Alexandru Indrei** @aindrei [2026-04-09](https://x.com/aindrei/status/2042035222636851423) lol. It's definitely not even close to sonnet level when running locally. Give it a try actually.
**Erich Cervantez** @erichcervantez [2026-04-09](https://x.com/erichcervantez/status/2042047551617298687) Wonder how the A.I. tech bros feel about this after blowing tens of thousands on mac studio ultra machines that burn tokens like it's the fourth of july 🎇
**Nishan** @nishancodes [2026-04-09](https://x.com/nishancodes/status/2042168888084017459) Not Sonnet level, not even near. But it's an insanely good model compared to its oss peers
**Matthias Kampmann** @M\_Kampmann [2026-04-09](https://x.com/M_Kampmann/status/2042105368898080849) I wish this were true. Tool calling is not reliable though. Also it plays back the instructions of repeating jobs instead of executing them quite often. GLM 5.1 though…
**Milinaire** @myonemillions [2026-04-09](https://x.com/myonemillions/status/2042043532693926096) I played with the Gemma 4 lightweight base model and it could not answer how many P in pineapple correctly. While it works offline / airplane mode is fantastic.
**Theo FunnyStrip** @thecodingtheo [2026-04-09](https://x.com/thecodingtheo/status/2042118190524715082) Every time someone runs a capable model locally, a subscription-based AI company loses its leverage. Beautiful.
2026-04-08 Run OpenClaw with Gemma 4 and Atomic Chat MacBook Air M4 · 16 GB RAM · 25 tok/s No cloud! No subscription fees! Open-source local model. Runs on your regular device