Block Goose: The Open-Source, Model-Agnostic AI Coding Agent Changing the Game in 2026
By EndOfCoding
Block — the company behind Square and Cash App — just open-sourced something worth paying attention to: Goose, a model-agnostic AI coding agent that runs locally, connects to any LLM, and doesn't lock you into a single vendor's ecosystem. In a landscape dominated by Cursor ($20/month, Claude-or-GPT-only) and GitHub Copilot ($19/month, OpenAI-dependent), Goose is a genuinely different approach: bring your own model, run your own agent, own your own data. For AI-assisted development learners, Goose represents something important — the open-source wave catching up to the commercial tools, and doing so on more flexible architectural terms. Here's what Goose actually does, how to set it up, and when it makes sense to use it over the commercial alternatives.
What You'll Learn
By the end, you'll understand:
- What Block Goose actually ships and how it differs from Cursor or Copilot
- How to install and configure Goose for your first session
- The model-agnostic architecture and why it matters for long-term tool independence
- Which use cases favor Goose over commercial alternatives
- The performance benchmarks from early testing
- How to integrate Goose into an existing agentic engineering workflow
What Goose Actually Is
Goose (open-sourced by Block, April 2026) is a CLI-first AI coding agent with these core properties:
- Model-agnostic: Connects to Claude, GPT-4o, Gemini, Ollama (local models), or any OpenAI-compatible API endpoint
- Open-source: Apache 2.0 license, full source available on GitHub at block/goose
- Local execution: Runs on your machine, with no cloud intermediary between your code and the model
- Multi-step autonomous task execution: Goose doesn't just answer questions — it executes tasks end-to-end, calling tools (file read/write, shell execution, web fetch) until the task is done or it hits a checkpoint
- Extension system: Plugin architecture lets you add custom tools — database connectors, internal APIs, deployment scripts
- Session persistence: Goose saves session state between runs — you can pause a task, restart your machine, and resume where you left off
The name is deliberate: Goose, as in "loose". The agent runs with fewer constraints than commercial tools, which ship conservative safety guardrails for liability reasons.
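To make "multi-step autonomous task execution" concrete, here is a minimal sketch of a tool-calling loop that runs until the plan is finished or a checkpoint is reached. It is an illustration only, not Goose's implementation: `Step`, `AgentLoop`, and the tool names are hypothetical, and the plan is fixed up front, whereas a real agent would ask the model for the next step after each result.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    tool: str      # name of the tool to call (hypothetical names)
    argument: str  # a single string argument, to keep the sketch small


@dataclass
class AgentLoop:
    tools: dict[str, Callable[[str], str]]
    checkpoint_tools: set[str] = field(default_factory=set)
    log: list[str] = field(default_factory=list)

    def run(self, plan: list[Step]) -> str:
        """Execute steps in order; pause before any checkpoint tool."""
        for step in plan:
            if step.tool in self.checkpoint_tools:
                self.log.append(f"checkpoint before {step.tool}")
                return "paused"
            result = self.tools[step.tool](step.argument)
            self.log.append(f"{step.tool}: {result}")
        return "done"


# Stand-in tools that only describe what they would do.
tools = {
    "read_file": lambda path: f"read {path}",
    "shell": lambda cmd: f"ran {cmd}",
}
loop = AgentLoop(tools=tools, checkpoint_tools={"shell"})
status = loop.run([Step("read_file", "src/lib/auth.ts"), Step("shell", "npm test")])
print(status, loop.log)
```

The loop stops before the shell step because that tool is marked as a checkpoint. Everything before the checkpoint is recorded in `log`, which hints at how a saved log could support resuming later.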
Installation
Goose is installed via a single script:
# macOS / Linux
curl -fsSL https://block.github.io/goose/install.sh | sh
# Windows (via PowerShell)
irm https://block.github.io/goose/install.ps1 | iex
# Or via pip
pip install goose-ai
Verify installation:
goose --version
# Output: goose 1.0.0 (block/goose)
Connecting Your First Model
Goose reads its model configuration from ~/.config/goose/config.yaml:
# Connect to Claude (Anthropic)
provider: anthropic
model: claude-sonnet-4-6
api_key: ${ANTHROPIC_API_KEY}
# Or connect to a local Ollama instance
# provider: ollama
# model: qwen2.5-coder:32b
# base_url: http://localhost:11434
# Or connect to OpenAI
# provider: openai
# model: gpt-4o
# api_key: ${OPENAI_API_KEY}
The YAML supports multiple profiles — you can switch between models mid-workflow by passing --profile:
goose run --profile claude "Add user authentication to this app"
goose run --profile local-ollama "Refactor these utility functions"
Running Your First Task
# Navigate to your project directory
cd ~/my-next-app
# Start an interactive Goose session
goose session
# Or run a one-shot task
goose run "Add a password reset flow. Create the API route, email template, and frontend form. Use the existing Supabase auth setup in src/lib/auth.ts."
Goose will:
- Read your codebase to understand the existing architecture
- Plan the implementation in a checklist it outputs before starting
- Execute each step, showing you what it's doing in real time
- Create a diff of all changes for your review before committing
The Model-Agnostic Architecture in Practice
Why does model-agnostic matter? Three reasons:
1. Cost optimization: You can route simple refactoring tasks to a cheaper model (Ollama + local Qwen) and complex architecture decisions to a frontier model (Claude Sonnet 4.6). Goose's profile system makes this routing explicit:
profiles:
  frontend:
    provider: anthropic
    model: claude-haiku-4-5
    cost: low
  architecture:
    provider: anthropic
    model: claude-sonnet-4-6
    cost: standard
  local-private:
    provider: ollama
    model: qwen2.5-coder:32b
    cost: free
2. Data privacy: For proprietary code, routing through a local Ollama instance means your code never leaves your machine. Commercial tools (Cursor, Copilot) send your code to third-party infrastructure. For enterprises with compliance requirements, local execution is sometimes the only viable option.
3. Vendor independence: As the model landscape evolves (Anthropic, OpenAI, Google, Meta, Qwen all releasing competitive models in 2026), Goose lets you adopt the best-performing model at any time without migrating your tool setup.
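The routing idea behind the first two points can be sketched as a small helper that picks a profile per task and builds the matching `goose run --profile` command. The profile names come from the example config above. The keyword lists, `pick_profile`, and `build_command` are assumptions made for illustration, and a real setup would likely use better signals than keyword matching.

```python
# Hypothetical routing rules: first match wins, checked top to bottom.
ROUTES = [
    ("local-private", ("proprietary", "internal", "secret")),
    ("architecture", ("architecture", "design", "migrate")),
]
DEFAULT_PROFILE = "frontend"  # cheap profile for everything else


def pick_profile(task: str) -> str:
    """Return the first profile whose keywords appear in the task text."""
    lowered = task.lower()
    for profile, keywords in ROUTES:
        if any(word in lowered for word in keywords):
            return profile
    return DEFAULT_PROFILE


def build_command(task: str) -> list[str]:
    """Build the argv list for a one-shot run with the chosen profile."""
    return ["goose", "run", "--profile", pick_profile(task), task]


print(build_command("Refactor the internal billing module"))
print(build_command("Rename a button label"))
```

Privacy rules are listed first on purpose: a task that mentions internal code goes to the local profile even if it also looks like architecture work. The returned list can be passed to `subprocess.run` as-is.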
Performance Benchmarks (April 2026 Testing)
Early community benchmarks (from the Goose GitHub discussions, week of April 7, 2026) on the SWE-bench Verified test suite:
Goose + Claude Sonnet 4.6: 67.3% task completion rate
Goose + GPT-4o: 61.8% task completion rate
Goose + Qwen2.5-Coder: 54.2% task completion rate (local)
Cursor 3 Agent (Claude): 68.1% task completion rate
Claude Code (native): 72.4% task completion rate
Key insight: Goose + Claude Sonnet 4.6 lands within one percentage point of Cursor 3 on this benchmark (67.3% vs. 68.1%). For most tasks, the open-source agent is competitive with the premium commercial product when both use the same underlying model.
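For readers who want to check the gaps themselves, the short script below recomputes the differences in percentage points from the figures listed above. It uses only the numbers quoted in this article; the `gap` helper is just a convenience for the arithmetic.

```python
# Completion rates as quoted above, in percent.
results = {
    "Goose + Claude Sonnet 4.6": 67.3,
    "Goose + GPT-4o": 61.8,
    "Goose + Qwen2.5-Coder": 54.2,
    "Cursor 3 Agent (Claude)": 68.1,
    "Claude Code (native)": 72.4,
}
BASELINE = "Goose + Claude Sonnet 4.6"


def gap(a: str, b: str) -> float:
    """Difference a - b in percentage points, rounded to one decimal."""
    return round(results[a] - results[b], 1)


for name in results:
    if name != BASELINE:
        print(f"{name}: {gap(name, BASELINE):+.1f} points vs {BASELINE}")
```

The Cursor 3 gap comes out at 0.8 points, while Claude Code sits 5.1 points ahead of the same Goose configuration.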
Goose vs. Cursor 3: When to Use Which
Use Goose when:
├── You want local execution for privacy
├── You want to switch models without switching tools
├── You need custom tool integrations (internal APIs, proprietary DBs)
├── You're on a budget and want to use cheaper/local models for routine work
├── You're building agent workflows programmatically (Goose has a Python SDK)
└── You're working in a compliance environment that restricts cloud tool usage
Use Cursor 3 when:
├── You want the best IDE-integrated experience (Cursor's editor is excellent)
├── You need the Agents Window for parallel agent execution (Goose is sequential)
├── You want the best out-of-box experience with no configuration
└── Your team is already Cursor-standardized and collaboration features matter
Use both together:
├── Use Cursor 3 for interactive development sessions
└── Use Goose for long-running automated tasks (CI/CD pipelines, batch refactors)
The Extension System
Goose's killer feature for advanced users is its extension API. You can write a Goose extension in Python and Goose will call it as a tool during task execution:
# goose_extensions/deploy.py
import subprocess

from goose import tool


@tool(description="Deploy the app to Vercel and return the deployment URL")
def deploy_to_vercel(project_dir: str) -> str:
    # Run the Vercel CLI non-interactively from the project directory
    result = subprocess.run(
        ["vercel", "--prod", "--yes"],
        cwd=project_dir,
        capture_output=True,
        text=True,
    )
    # Return the first output line that contains a URL
    for line in result.stdout.split('\n'):
        if 'https://' in line:
            return line.strip()
    # Fall back to the raw output if no URL was found
    return result.stdout
With this extension registered, you can tell Goose: "Build the feature, run tests, and if they pass, deploy to Vercel." Goose will call your custom deploy tool at the right moment in the workflow.
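If the decorator pattern is unfamiliar, the sketch below shows one way a decorator like this could register functions so an agent can look them up by name. It is a self-contained illustration and not Goose's actual implementation: `register_tool`, `REGISTRY`, and `dispatch` are hypothetical names.

```python
from typing import Callable

# Hypothetical in-memory registry: tool name -> description and callable.
REGISTRY: dict[str, dict] = {}


def register_tool(description: str) -> Callable:
    """Decorator factory that records a function under its own name."""
    def decorator(func: Callable) -> Callable:
        REGISTRY[func.__name__] = {"description": description, "call": func}
        return func  # the function itself is left unchanged
    return decorator


@register_tool(description="Return the number of lines in a text")
def count_lines(text: str) -> int:
    return len(text.splitlines())


def dispatch(name: str, **kwargs):
    """Look up a registered tool by name and call it with keyword arguments."""
    return REGISTRY[name]["call"](**kwargs)


print(dispatch("count_lines", text="a\nb\nc"))
```

The description string is what a model would typically see when deciding which tool to call, which is why it pays to write it as a precise one-line summary.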
Running Goose in CI/CD
One underutilized application: running Goose as an automated code quality agent in your CI pipeline:
# .github/workflows/goose-review.yml
name: Goose Code Review
on: [pull_request]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Goose
        run: curl -fsSL https://block.github.io/goose/install.sh | sh
      - name: Run security review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          goose run "Review the changed files in this PR for OWASP Top 10 \
          security issues. For each issue found, output the file path, \
          line number, issue type, and recommended fix. Output as JSON."
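This gives you an automated AI security review on every PR, paid for per run through Claude API usage rather than through a $19/month tool subscription. Keep in mind that an agent run may involve several model calls, so the cost per PR depends on how many files changed.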
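A review like this is more useful when the pipeline acts on the result. The sketch below parses the JSON findings and produces an exit code that could fail the job. The field names (`file`, `line`, `issue`, `fix`) and the sample finding are assumptions for illustration, since the prompt above does not pin down a schema. Model output is also not guaranteed to be clean JSON, so a real pipeline would need error handling around `json.loads`.

```python
import json

# Made-up sample of what the agent might print; not real review output.
SAMPLE_OUTPUT = """[
  {"file": "src/api/login.ts", "line": 42,
   "issue": "SQL injection", "fix": "Use parameterized queries"}
]"""


def summarize(raw: str) -> tuple[int, list[str]]:
    """Return (exit_code, report_lines) for a JSON array of findings."""
    findings = json.loads(raw)
    lines = [
        f"{f['file']}:{f['line']} {f['issue']} -> {f['fix']}"
        for f in findings
    ]
    # A non-zero exit code fails the CI step when anything was found.
    return (1 if findings else 0), lines


exit_code, report = summarize(SAMPLE_OUTPUT)
print(exit_code)
print("\n".join(report))
```

In a workflow step, the script would end with `sys.exit(exit_code)` so that a PR with findings is blocked until someone looks at the report.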
Common Challenges
'Goose is slower than Cursor for interactive work' — Yes. The CLI-first design means you're waiting for agent planning steps that Cursor integrates more smoothly in its IDE. Goose is better for long-running autonomous tasks than for interactive editing. Use it where speed of iteration matters less than autonomy and privacy.
'The local Ollama models are much worse than Claude' — Accurate for complex tasks. Qwen2.5-Coder at 32B is strong for refactoring and boilerplate but struggles with architecture-level reasoning. Use local models for routine tasks, frontier models for complex tasks. The profile system makes this explicit routing easy.
'I can't get Goose to stop and ask for confirmation before making changes' — Goose has a --checkpoint flag that pauses before each tool call for approval. For destructive operations (file deletion, git push), add them to your dangerous_tools list in config to always require confirmation.
'The extension API is complicated' — The Python SDK simplifies it considerably. The Goose documentation has a 'Getting Started with Extensions' tutorial that walks through a complete example in under 30 minutes. Start with a simple read-only extension before building write-capable ones.
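Returning to the confirmation question above, the idea of a list of dangerous tools that always require approval can be pictured as a small gate in front of each tool call. This is a sketch of the concept, not Goose's code: `DANGEROUS_TOOLS`, `call_tool`, and the tool names are hypothetical.

```python
from typing import Callable

# Hypothetical names for tools that should never run without approval.
DANGEROUS_TOOLS = {"delete_file", "git_push"}


def call_tool(
    name: str,
    action: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run `action`, asking `confirm` first when the tool is on the dangerous list."""
    if name in DANGEROUS_TOOLS and not confirm(name):
        return f"skipped {name}"
    return action()


# A confirm callback that always refuses, standing in for a user typing "n".
deny = lambda tool_name: False

print(call_tool("read_file", lambda: "contents", deny))
print(call_tool("git_push", lambda: "pushed", deny))
```

Passing the confirmation step in as a callback keeps the gate testable: an interactive session can prompt the user, while a CI run can supply a callback that always refuses.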
Advanced Tips
Build a personal agent toolkit with Goose extensions: The extension system lets you encode your team's specific workflows as callable tools. A Goose extension for your internal deployment system, your feature flag service, or your documentation wiki makes Goose dramatically more useful than any generic commercial tool for your specific context.
Profile-based cost optimization: Track your Goose token usage by profile using the built-in goose stats command. Most developers find that 60-70% of their tasks can be handled by a cheap or local model, with only the remaining 30-40% needing a frontier model. At that ratio, Goose + mixed-model strategy costs significantly less than a Cursor Pro subscription for equivalent output.
Goose + Claude Code as a complement: The highest-performing setup in early testing is using Claude Code (native CLI, from Anthropic) for interactive development and Goose for longer-running autonomous batch tasks. Claude Code has better in-session awareness; Goose has better task persistence and model flexibility. The Vibe Coding Ebook Module 11 covers multi-tool agentic workflows including this combination.
Watch the Goose GitHub for the MCP integration: The Model Context Protocol adapter for Goose is in active development (as of April 2026). When it ships, Goose will be able to use the same tool ecosystem as Claude Desktop and other MCP-compatible clients — dramatically expanding what extensions are available without custom code.
Conclusion
Block Goose is the clearest signal yet that the open-source world has caught up to commercial AI coding agents — at least architecturally. The model-agnostic design, local execution option, and extension system address real limitations that commercial tools impose by design. For learners, Goose is worth understanding as the second major paradigm alongside Cursor/Copilot: not better for every use case, but offering genuine advantages in privacy, cost control, and customization. As the AI coding landscape continues to evolve rapidly in 2026, tool independence becomes more valuable, not less.
For a structured curriculum covering both commercial and open-source AI coding tools — including hands-on labs with Cursor 3, Claude Code, and Goose — visit Vibe Coding Academy. The updated Tool Landscape chapter (Chapter 5) in the Vibe Coding Ebook covers Goose alongside all major 2026 tools. Stay current with weekly AI coding tool news at EndOfCoding.