Laguna XS.2 on Mac: Coding Model, Benchmarks and RAM Limits
Laguna XS.2 from Poolside scores 69.9% on SWE-bench Verified. What runs locally on Mac, which Ollama tags matter and where Qwen3.6 leads.
Laguna XS.2 — Facts First
- Released: April 28, 2026
- Parameters: 33B (3B activated per token)
- Architecture: Mixture-of-Experts (MoE), 256+1 experts, 40 layers
- Context window: Hugging Face lists 262,144 tokens; Ollama lists the available tags with 128K context
- License: Apache 2.0
- Download: Hugging Face | Ollama
Laguna XS.2 from Poolside scores 69.9% on SWE-bench Verified in the current model card and is officially positioned as local-ready for Macs with around 36 GB of unified memory. That makes it interesting for Apple Silicon users with large unified-memory configurations, but it does not make it the strongest model in every benchmark category. Honestly, with 36 GB as the entry point, my 32 GB Mac mini M4 sits just below the threshold, so I am watching this one from the sidelines for now.
Diagram based on the Laguna XS.2 model card, Poolside’s blog post and the Ollama model page. Sources: Hugging Face Model Card, Poolside Blog, Ollama Laguna XS.2. Checked May 27, 2026.
Benchmark Comparison: The Official Numbers
Official comparison values from the Laguna XS.2 model card.
| Benchmark | Laguna XS.2 (33B-A3B) | Qwen3.6-35B-A3B | Claude Haiku 4.5 |
|---|---|---|---|
| SWE-bench Verified | 69.9% | 73.4% | 73.3% |
| SWE-bench Multilingual | 57.7% | 67.2% | — |
| SWE-bench Pro | 46.3% | 49.5% | 39.5% |
| Terminal-Bench 2.0 | 35.7% | 51.5% | 29.8% |
Sources: Hugging Face Model Card, Poolside Blog — checked May 27, 2026.
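The table shows that Laguna XS.2 is competitive for a locally usable open-weight coding model. However, Qwen3.6-35B-A3B leads Laguna XS.2 on all four benchmarks in the official comparison values, with the widest gaps on SWE-bench Multilingual and Terminal-Bench 2.0. Against Claude Haiku 4.5, Laguna XS.2 trails on SWE-bench Verified but leads on SWE-bench Pro and Terminal-Bench 2.0.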
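To make the gaps concrete, here is a small sketch that computes the difference in percentage points from the table values. The dictionary keys and the gap helper are illustrative names, not part of any official tooling.

```python
# Scores in percent, copied from the comparison table above.
# None marks a value the table does not report.
SCORES = {
    "SWE-bench Verified": {"laguna": 69.9, "qwen": 73.4, "haiku": 73.3},
    "SWE-bench Multilingual": {"laguna": 57.7, "qwen": 67.2, "haiku": None},
    "SWE-bench Pro": {"laguna": 46.3, "qwen": 49.5, "haiku": 39.5},
    "Terminal-Bench 2.0": {"laguna": 35.7, "qwen": 51.5, "haiku": 29.8},
}


def gap(benchmark, other):
    """Laguna XS.2 minus `other`, in percentage points (None if not reported)."""
    row = SCORES[benchmark]
    if row[other] is None:
        return None
    return round(row["laguna"] - row[other], 1)


if __name__ == "__main__":
    for name in SCORES:
        print(f"{name}: vs Qwen {gap(name, 'qwen')}, vs Haiku {gap(name, 'haiku')}")
```

A negative value means Laguna XS.2 trails the other model on that benchmark; a positive value means it leads.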
Methodology matters: Poolside benchmarks Laguna XS.2 with the Harbor Framework agent harness and multiple runs per benchmark. Comparison values come from official releases or leaderboards where available.
The Technical Recipe
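Mixture-of-Experts with 256+1 experts: Every forward pass activates only about 3B of the 33B parameters per token. This keeps compute per token low while preserving the capacity of the full 33B model for complex reasoning tasks. Memory is a separate matter: all 33B parameters still have to be loaded, so the footprint of the weights depends on quantization rather than on the active parameter count.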
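A rough back-of-the-envelope sketch of the two numbers involved: the active fraction drives compute per token, while weight storage scales with the total parameter count. The weight_gb helper is a hypothetical name and ignores format overhead such as quantization scales.

```python
TOTAL_PARAMS = 33e9   # total parameters, as quoted above
ACTIVE_PARAMS = 3e9   # parameters activated per token, as quoted above


def weight_gb(params, bits_per_param):
    """Approximate weight size in GB (10^9 bytes), ignoring format overhead."""
    return params * bits_per_param / 8 / 1e9


# Share of the model that participates in each forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

if __name__ == "__main__":
    print(f"active fraction: {active_fraction:.1%}")
    print(f"16-bit weights: {weight_gb(TOTAL_PARAMS, 16):.1f} GB")
    print(f"4-bit weights (naive): {weight_gb(TOTAL_PARAMS, 4):.1f} GB")
```

The 16-bit estimate of 66 GB lands close to the 67 GB Ollama lists for BF16. The naive 4-bit estimate is lower than the listed 4-bit tags, presumably because real packages carry scale metadata and keep some tensors at higher precision; that explanation is an assumption, not something the model card states.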
Sliding Window Attention with per-head gating: 30 of 40 layers use Sliding Window Attention (512-token window), 10 layers use global attention. Because the sliding-window layers only cache the most recent 512 tokens, the KV cache grows with context length mainly in the 10 global layers. Combined with MoE, the result is an efficient architecture that runs on local GPUs and on Macs with sufficient unified memory.
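The effect on the KV cache can be estimated without knowing head counts or head dimensions. The sketch below compares cached token positions against a hypothetical all-global variant of the same model, assuming every layer has the same per-token KV size and ignoring per-head gating. Both are simplifying assumptions, and kv_cache_ratio is an illustrative name.

```python
def kv_cache_ratio(context_tokens, total_layers=40, swa_layers=30, window=512):
    """Cached positions with sliding-window layers, relative to all-global attention.

    Assumes identical per-token KV size in every layer (a simplification).
    """
    global_layers = total_layers - swa_layers
    # Sliding-window layers never cache more than `window` positions.
    swa_cached = swa_layers * min(window, context_tokens)
    # Global layers cache every position in the context.
    global_cached = global_layers * context_tokens
    return (swa_cached + global_cached) / (total_layers * context_tokens)


if __name__ == "__main__":
    for n in (512, 8192, 131072, 262144):
        print(f"{n:>7} tokens: {kv_cache_ratio(n):.3f} of the all-global cache")
```

At long contexts the ratio approaches 10/40, so the cache is roughly a quarter of what a 40-layer all-global model would need under the same assumptions.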
Native Reasoning + Tool-Calling: Laguna XS.2 supports thought generation directly in the output. The model can interleave reasoning steps with tool calls — critical for agentic use cases like SWE-bench, where code is written, tested, and iterated.
FP8 KV Cache: The KV cache is stored in FP8, further reducing memory per token. Hugging Face lists 262,144 tokens of context for the model, while the Ollama tags are currently listed with 128K context. For Mac users, the concrete runtime and tag matter more than the theoretical model-card maximum.
Memory Requirements: 36 GB as an Entry Point
36 GB should be understood as an entry point. Ollama lists variants such as latest at 23 GB, nvfp4 at 22 GB, mxfp4 at 36 GB and BF16 variants at 67 GB. Real unified-memory use is still higher than package size because context, KV cache, runtime and other apps add overhead.
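As a rough planning aid, the listed package sizes can be checked against a machine's unified memory with some headroom for context, KV cache, runtime and other apps. The 8 GB default headroom is a placeholder assumption, not a measured value, and "bf16" is used here as a label for the BF16 variants rather than an exact tag name.

```python
# Package sizes in GB as listed above. "bf16" is a label for the BF16 variants.
TAG_SIZES_GB = {"latest": 23, "nvfp4": 22, "mxfp4": 36, "bf16": 67}


def tags_that_fit(unified_memory_gb, headroom_gb=8):
    """Return tags whose package size plus headroom fits into unified memory.

    headroom_gb is a hypothetical allowance for context, KV cache,
    runtime and other apps; tune it to your own setup.
    """
    return sorted(
        tag
        for tag, size_gb in TAG_SIZES_GB.items()
        if size_gb + headroom_gb <= unified_memory_gb
    )


if __name__ == "__main__":
    for memory_gb in (16, 24, 36, 64, 96):
        print(memory_gb, "GB:", tags_that_fit(memory_gb))
```

With these placeholder numbers, a 36 GB machine fits the two smallest tags but not mxfp4 or BF16, which matches the reading of 36 GB as an entry point rather than a comfortable fit for every variant.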
How Laguna XS.2 Fits In
Local models (Mac-compatible): With 33B MoE (3B activated), Laguna XS.2 is unusual for a locally usable open-weight coding model. Its main appeal is not that it wins every benchmark, but that it brings coding performance to suitable Apple Silicon hardware.
Open license: The Apache 2.0 license allows commercial use and lowers the barrier compared with API-only models. Safety review, use-case fit and Poolside’s use guidance still matter.
Coding focus: The agentic-coding focus shows in the current model-card values: SWE-bench Verified is 69.9%, SWE-bench Pro is 46.3% and Terminal-Bench 2.0 is 35.7%.
For Mac Users Specifically
What runs:
- Ollama: ollama run laguna-xs.2 (the model is available in the Ollama Library)
- 36 GB unified memory as a local entry point according to Poolside; this makes it more relevant for MacBook Pro or Mac Studio configurations with large unified memory. A Mac mini with 16 or 24 GB is not a stable default recommendation
- On Apple Silicon, local runtimes such as Ollama, MLX or other Metal-oriented setups are the more realistic path. eGPU setups are not a normal recommendation for modern Apple Silicon Macs.
What doesn’t run:
- Mac mini with 16 GB — clearly below the official local positioning
- MacBook Air or MacBook Pro with 24 GB — below the official recommendation; at most experimental with small quantizations and limited context
Recommendation: Laguna XS.2 is most interesting for MacBook Pro and Mac Studio configurations with at least 36 GB of unified memory. Machines with 16 GB or 24 GB sit below the official recommendation. If you have enough unified memory and want to test local coding models, Laguna XS.2 is an Apache 2.0 licensed 33B MoE coding model worth trying.
If you want the highest coding benchmark numbers, Qwen3.6-35B-A3B is stronger in the official comparison table. If you want a coding model that is explicitly positioned for local use and comes with Apache 2.0 licensing and Ollama tags, Laguna XS.2 is the more relevant candidate to test. What they don’t tell you is that “local-ready” at 36 GB means you need a pretty beefy Mac. This is not a MacBook Air kind of model.
Frequently Asked Questions
Can Laguna XS.2 run on a Mac with 36 GB RAM?
Yes, Poolside positions the model as locally viable on machines with around 36 GB of unified memory. Treat that as an entry point, not a guarantee for long context or high precision.
What makes Laguna XS.2 special for coding?
Laguna XS.2 combines 33B total parameters, 3B active parameters per token, Apache 2.0 licensing and official local-ready positioning. Current model-card values are strong, but Qwen3.6-35B-A3B leads on several coding benchmarks.
Is Laguna XS.2 free to use?
The model weights are available under Apache 2.0, including commercial use. You still need to evaluate safety, suitability and licensing obligations for your own use case.
Does Laguna XS.2 run on a Mac with 16 GB of RAM?
Not as a sensible default recommendation. The official positioning points to around 36 GB of unified memory as the local entry point, so a 16 GB Mac sits clearly below that.
Is 24 GB of unified memory enough for Laguna XS.2?
24 GB may be interesting for experiments depending on quantization and runtime, but it is below the official 36 GB recommendation. For reliable local use, plan with 36 GB or more.
Is Laguna XS.2 better than Qwen3.6-35B-A3B?
Not across the board. In Poolside/model-card comparison values, Qwen3.6-35B-A3B leads Laguna XS.2 on several coding benchmarks. Laguna XS.2 remains interesting because of local usability, open licensing and its coding focus.
How do I run Laguna XS.2 with Ollama?
With Ollama, use: ollama run laguna-xs.2. The library lists several tags from about 22 GB to 67 GB; actual memory use depends on quantization, context length, KV cache and runtime.
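If you would rather call the model from a script than from the interactive prompt, Ollama also exposes a local HTTP API. The following is a minimal sketch that assumes an Ollama server is running on its default port and that the model name matches the run command above. The num_ctx value of 8192 is an arbitrary illustration, and build_payload and generate are hypothetical helper names.

```python
import json
import urllib.request

# Default local endpoint of a running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt, model="laguna-xs.2", num_ctx=8192):
    """Build the JSON body for a non-streaming generate request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                  # return one JSON object instead of a stream
        "options": {"num_ctx": num_ctx},  # context length; larger values use more memory
    }


def generate(prompt, **kwargs):
    """Send the request to the local Ollama server and return the generated text."""
    data = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    print(generate("Write a Python function that reverses a string."))
```

Keeping num_ctx modest is one way to stay within memory on machines close to the 36 GB entry point, since the KV cache grows with context length.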