Gemma 3 270m 4-bit DWQ is up. Same speed, same memory, much better quality:
Awni Hannun · Aug 15, 2025
Gemma 3 270m 4-bit generates text at over 650 (!) tok/sec on an M4 Max with mlx-lm and uses < 200MB. Not sped up:
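As a rough sanity check on the numbers quoted above, here is a back-of-the-envelope sketch. It assumes a nominal 270M parameters (taken from the model name), all of them quantized to 4 bits in groups of 64 weights, with one 16-bit scale and one 16-bit bias stored per group. Those layout details are assumptions for illustration, not something the post states, and the estimate ignores activations, the KV cache, and any layers kept at higher precision.

```python
def quantized_size_mb(n_params: float, bits: int = 4,
                      group_size: int = 64,
                      overhead_bits_per_group: int = 32) -> float:
    """Approximate weight storage in MB for group-wise quantization.

    overhead_bits_per_group is the assumed per-group metadata
    (here: a 16-bit scale plus a 16-bit bias).
    """
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return n_params * bits_per_weight / 8 / 1e6


def seconds_for_tokens(n_tokens: int, tok_per_sec: float = 650.0) -> float:
    """Time to generate n_tokens at a steady decoding rate."""
    return n_tokens / tok_per_sec


if __name__ == "__main__":
    size = quantized_size_mb(270e6)  # nominal parameter count (assumption)
    print(f"estimated weights: {size:.1f} MB")
    print(f"1000 tokens at 650 tok/sec: {seconds_for_tokens(1000):.2f} s")
```

Under these assumptions the weights come to roughly 150 MB, which is consistent with the "< 200MB" figure, and 1000 tokens take about a second and a half at the quoted rate.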