Difficulty: Beginner | Duration: 15 minutes | Takeaway: Master gstack core usage and understand AI programming workflow design concepts
Target Audience
You're already using Claude Code to write code, but building products solo still lacks a "team feel": no one to review your architecture, no one to run QA, and no one to vet your designs. Want to upgrade AI programming from "advanced autocomplete" to a "real engineering team"?
This article is for you. We'll start from installation and walk through a complete /office-hours → /ship workflow, so you can experience firsthand how gstack turns one person into a whole army.
TIP
The author of gstack is Garry Tan, President of Y Combinator. According to him, he and his team used these tools to deliver 600,000 lines of production code in the first two months of 2026, peaking at 10k-20k lines per day. The same tools are open-source and free.
Prerequisites: Basic knowledge of Git and command line, understanding of basic Claude Code usage.
Core Dependencies and Environment
| Dependency | Description | Minimum Version |
|---|---|---|
| Claude Code | AI programming tool; download it from the official site | Latest |
| Git | Version control | Any |
| Bun | JavaScript runtime, used to compile gstack binaries | v1.0+ |
| Node.js | Required for Windows users only (resolves known Bun bugs on Windows) | Any |
Repository Address: https://github.com/garrytan/gstack
Operating Systems: macOS, Linux, Windows 11 (WSL or Git Bash)
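Before installing, it can help to confirm that the tools from the dependency table are on your PATH. The following is a minimal preflight sketch, not part of gstack: the helper names version_ge and check_tool are made up for illustration, and the version comparison assumes purely numeric dotted versions.

```shell
# Preflight sketch: report which gstack dependencies are on PATH.
# version_ge and check_tool are illustrative helpers, not gstack commands.

# Succeeds when dotted version $1 is >= dotted version $2 (numeric fields only).
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}; y=${b%%.*}
        # Drop the field just read; clear the variable when it was the last one.
        [ "$a" = "$x" ] && a= || a=${a#*.}
        [ "$b" = "$y" ] && b= || b=${b#*.}
        x=${x:-0}; y=${y:-0}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
    done
    return 0
}

# Prints "ok: NAME" or "missing: NAME" depending on whether NAME is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Node.js only matters on Windows, so a "missing: node" line is fine elsewhere.
for tool in git bun node; do
    check_tool "$tool"
done

# The table above asks for Bun v1.0 or newer.
if command -v bun >/dev/null 2>&1; then
    bun_version=$(bun --version)
    if version_ge "${bun_version%%-*}" "1.0"; then
        echo "ok: bun $bun_version"
    else
        echo "too old: bun $bun_version (need 1.0+)"
    fi
fi
```

The script only reports; it never exits early, so you see the status of every dependency in one run.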
Complete Project Structure
After cloning locally, gstack's core directory structure is as follows:
gstack/
├── browse/ # Persistent headless browser engine and /browse skill (Playwright + custom CDP layer; persistent Chromium, ~100ms/command)
│   ├── src/ # CLI + HTTP server + command implementation
│   └── dist/ # Compiled single-file binary (~58MB)
├── office-hours/ # YC-style product dialogue, starts all design docs
├── plan-ceo-review/ # CEO-level product strategy review (scope, positioning, priority)
├── plan-eng-review/ # Engineering architecture review (data flow, state machines, test plans)
├── plan-design-review/ # Design review (UI/UX scoring, Slop detection)
├── design-consultation/ # Building design systems from scratch
├── review/ # PR code review (auto-fix + issue grading)
├── investigate/ # Systematic root cause debugging (Iron Law: No investigation, no fix)
├── qa/ # QA testing + automated atomic commit fixes
├── qa-only/ # QA report mode (report only, no code changes)
├── cso/ # Chief Security Officer mode (OWASP Top 10 + STRIDE)
├── ship/ # Release workflow (sync main → test → push → open PR)
├── land-and-deploy/ # Merge PR → deploy → production health verification
├── canary/ # Post-launch monitoring loop (error rates, performance regression)
├── benchmark/ # Performance benchmarking (Core Web Vitals, page size)
├── document-release/ # Post-release documentation sync
├── retro/ # Retrospective meetings (supports global cross-project mode)
├── codex/ # OpenAI Codex second opinion (cross-model analysis)
├── setup-browser-cookies/ # Import cookies from real browsers (Chrome/Arc/Brave/Edge)
├── setup-deploy/ # Deployment configuration wizard (detects platform, URL, commands)
├── careful/ # Dangerous command warnings (rm -rf, DROP TABLE, etc.)
├── freeze/ # Lock editing scope to prevent out-of-bounds modifications
├── guard/ # Combines careful + freeze, equivalent to maximum security mode
├── unfreeze/ # Remove freeze
├── autoplan/ # Automated chaining: full CEO → Design → Engineering review process
└── bin/ # CLI toolset (gstack-config / gstack-analytics, etc.)
28 skills covering the full lifecycle: product design → architecture review → development → security → QA → deployment → retrospective.
Step-by-Step Guide
Step 1: Installing gstack
Global Installation (Recommended)
Open Claude Code and paste the following command into the chat box; Claude will automatically complete the remaining steps:
git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup
The ./setup script will:
- Detect the system environment (Bun / Node / Git)
- Compile the browse/dist/browse binary (bun build --compile, approx. 10 seconds)
- Register all 28 skills with Claude Code
After installation, you need to tell Claude Code about the gstack configuration. Add the following content to the CLAUDE.md file in your project root:
## gstack
Use /browse from gstack for all web browsing. Never use mcp__claude-in-chrome__* tools.
Available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review,
/design-consultation, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse,
/qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro,
/investigate, /document-release, /codex, /cso, /autoplan, /careful, /freeze, /guard,
/unfreeze, /gstack-upgrade.
Project-level Installation (For teammates)
If you want team members to automatically get gstack after a git clone:
cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup
WARNING
Project-level installation commits the actual gstack files to your repository (not as a git submodule), and gstack's own .git/ directory is deleted from the copy. Make sure you understand this before proceeding.
Verify Installation
# Check if binary exists
ls ~/.claude/skills/gstack/browse/dist/browse
# Check if skills are successfully registered, type in Claude Code:
/help
# You should see gstack-related skills in the list of available commands
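The binary check above can be wrapped in a small function that distinguishes "missing" from "present but not executable". This is a sketch under assumptions: check_browse_binary and the GSTACK_DIR variable are hypothetical names introduced here, and only the path layout comes from the installation steps.

```shell
# Sketch: confirm the compiled browse binary exists and is executable.
# check_browse_binary and GSTACK_DIR are illustrative, not part of gstack.
check_browse_binary() {
    bin="$1/browse/dist/browse"
    if [ ! -e "$bin" ]; then
        echo "missing: $bin"
        return 1
    elif [ ! -x "$bin" ]; then
        echo "not executable: $bin"
        return 2
    fi
    echo "ok: $bin"
}

# Defaults to the global install location used in this guide.
check_browse_binary "${GSTACK_DIR:-$HOME/.claude/skills/gstack}" || true
```

A "missing" result usually means ./setup has not completed; "not executable" points at file permissions instead.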
Step 2: Running Your First Complete Workflow
This is the core usage of gstack: 5 commands covering the full process from product design to release. Let's run it in a real local project.
2.1 Define Product with /office-hours
/office-hours
gstack starts YC Office Hours mode: it asks you 6 mandatory questions to reframe your product thinking. The key is to voice real pain points, not just describe features.
Example Dialogue:
You: I want a feature that generates a daily briefing based on my calendar every morning.
gstack: I'm reframing your thinking. You're talking about a "briefing generator," but the pain points you've described are actually:
1. Multiple Google Calendars with stale info
2. Preparation time is too long
3. Results aren't good enough
You're actually building a "Personal Chief of Staff AI."
Should we re-define the scope? Or tell me why the original scope was correct?
After the design dialogue, gstack generates a design document that automatically flows into subsequent review skills.
2.2 Review Product Strategy with /plan-ceo-review
/plan-ceo-review
gstack reads the design document from the previous step and performs a four-quadrant review from a CEO/Founder perspective:
- Expand: Where to expand the scope?
- Selectively Expand: Keep the core, what to cut?
- Maintain Scope: Current size is just right
- Contract: Focus on the narrowest wedge
It also provides 3 implementation paths, each comparing a manual effort estimate with an AI-assisted estimate (two of the paths are shown here):
| Implementation Path | Manual Estimate | AI+gstack Estimate |
|---|---|---|
| MVP (Narrowest Wedge) | 2 Weeks | 1 Day |
| Full v1.0 | 3 Months | 1 Week |
2.3 Review Code with /review
Once the code is written, run this on your branch:
/review
gstack will:
- Auto-fix obvious issues (e.g., unhandled edge cases)
- Pop up an AskUserQuestion for complex issues to get your input
- Output a graded report: [AUTO-FIXED]/[ASK]/[WARN]
Example Output:
[/review] Analysis complete:
[AUTO-FIXED] 2 items
- src/api/calendar.ts:67 - Missing Promise rejection handling
- src/utils/date.ts:12 - Hardcoded timezone offset
[ASK] Race condition in refresh loop. Recommended fixes:
A) Add mutex lock (safer, +15 lines)
B) Add debounce (simpler, +5 lines)
Recommendation: A, to handle concurrency scenarios comprehensively.
2.4 Test Your App with /qa
# Replace with your staging URL
/qa https://staging.myapp.com
gstack's /qa launches a persistent Chromium browser (reusing cookies and login state) to automatically:
- Open the page and take screenshots
- Click through critical paths (Login → Core Feature → Edge Scenarios)
- Find bugs → Atomic commit fixes → Re-verify
- Generate regression tests for each fix
Difference from standard automated testing: Standard tests run assertions you've already written. /qa proactively seeks scenarios where you haven't written assertions: empty states, error states, and concurrency boundaries.
2.5 Release with /ship
/ship
/ship executes the full release pipeline:
# Roughly equivalent to these manual operations, run from your feature branch:
git fetch origin main
git merge origin/main # Sync the latest main into your branch
pnpm test # Or your test command
pnpm build
git push -u origin HEAD
gh pr create # /ship opens the PR with a change summary
Each step has a checklist and only proceeds if passed. If an issue is encountered, it stops and waits for your decision.
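The "only proceeds if passed" behavior can be pictured as a gated pipeline. The sketch below is an illustration of that idea only, not gstack's implementation; run_gated is a hypothetical helper and the echo commands are harmless stand-ins for real test, build, and push steps.

```shell
# Sketch of a gated pipeline: run steps in order, stop at the first failure.
# run_gated is an illustrative helper, not how /ship is implemented.
run_gated() {
    for step in "$@"; do
        echo "==> $step"
        if ! sh -c "$step"; then
            echo "stopped at: $step"
            return 1
        fi
    done
    echo "all steps passed"
}

# Stand-in steps; swap in your real sync, test, and build commands.
run_gated "echo sync main" "echo run tests" "echo build"
```

Because the function returns as soon as a step fails, later steps such as pushing or opening a PR never run on a broken build.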
Step 3: Deep Dive into Core Skills by Stage
3.1 Design Phase: office-hours and plan series
The core of this phase is turning "ideas" into "design documents" that all subsequent steps can read.
/office-hours: not a brainstorm, but a product reframing. gstack will:
- Identify unspoken real needs (back-solving from described pain points)
- Challenge your default assumptions
- Generate an implementation path with specific milestones
/plan-ceo-review: reads the design docs and judges them on market size, competitive landscape, and priority.
/plan-eng-review: locks down the technical architecture:
- ASCII data flow diagrams
- State machine design
- Failure path analysis
- Test matrix (at least one test per branch)
TIP
Each plan skill outputs Markdown files to the project root. These files are inputs for downstream skills: /review reads architecture decisions, and /qa reads test plans.
3.2 Development Phase: review and investigate
The core of /review is finding bugs that pass CI but fail in production. It doesn't replace a linter; it performs semantic analysis for:
- Logical errors (missing conditions, unbalanced branches)
- Concurrency issues (race conditions, deadlock patterns)
- Security risks (injection, auth bypass)
/investigate is dedicated to debugging. When a bug keeps recurring, run:
/investigate
gstack will:
- Propose hypotheses
- Verify them one by one (no fixing, just investigating)
- Stop once the root cause is found (Iron Law: No fixing without investigation)
3.3 Release Phase: ship, land-and-deploy, canary
/ship # After code review passes → open PR
/land-and-deploy # Merge PR → deploy → verify production health
/canary # Monitor for 30 minutes after deployment
/canary monitors three metrics:
- Console Error Rate: Are JavaScript exceptions rising?
- Performance Regression: Have Core Web Vitals worsened?
- Page Failure Rate: Are critical pages returning 500s?
If any metric is abnormal, it auto-alerts and can trigger a rollback.
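To make "abnormal" concrete, one common approach is to compare each metric against a baseline with a tolerance. The sketch below assumes that approach; the function names, metric names, and thresholds are all invented for illustration and say nothing about the rules /canary actually applies.

```shell
# Sketch: flag a metric when it exceeds its baseline by more than a tolerance.
# Names and thresholds are assumptions, not gstack's actual rules.
is_regression() {
    baseline=$1 current=$2 tolerance_pct=$3
    # Integer math for: current > baseline * (100 + tolerance) / 100
    [ $((current * 100)) -gt $((baseline * (100 + tolerance_pct))) ]
}

check_metric() {
    name=$1
    if is_regression "$2" "$3" "$4"; then
        echo "ALERT $name: $3 vs baseline $2"
    else
        echo "ok $name"
    fi
}

check_metric console_errors_per_min 4 5 50   # 5 is within 50% of 4
check_metric lcp_ms 2000 2600 20             # 2600 is above 2000 + 20%
check_metric http_500_count 0 3 0            # any increase from zero alerts
```

A zero tolerance on server errors and a looser one on noisy metrics is a design choice you would tune per project.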
Step 4: Browser Automation in Practice
/browse is gstack's most distinctive capability: it runs a long-lived Chromium process that receives commands via a local HTTP API, with an average response time of ~100ms, and it supports true login state reuse.
4.1 Basic Usage
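# $B is assumed to hold the path to the browse binary, for example:
# B=~/.claude/skills/gstack/browse/dist/browse
# Open a page
$B goto https://example.com
# View interactive elements (auto-numbered @e1, @e2 ...)
$B snapshot -i
# Click the 3rd element
$B click @e3
# Fill out a form (placeholder credentials)
$B fill @e1 "user@example.com"
$B fill @e2 "password123"
# Screenshot
$B screenshot /tmp/result.png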
IMPORTANT
The first call auto-starts Chromium (~3 seconds), and subsequent commands take ~100-200ms. The browser auto-closes after 30 minutes of inactivity.
4.2 Verifying UI Changes
# Baseline snapshot
$B snapshot -i
# Perform action
$B click @e5
# Diff comparison (shows only what changed)
$B snapshot -D
The -D flag outputs in unified diff format, telling you which elements appeared, disappeared, or changed content after an operation.
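If unified diff format is unfamiliar, the standard diff -u tool produces the same style of output. The sketch below uses diff -u on two made-up snapshot files to show how an appeared element is marked; the snapshot text is invented and real $B snapshot output may look different.

```shell
# Sketch: approximate a before/after comparison with standard diff -u.
# The snapshot contents are invented; real snapshot output may differ.
before=$(mktemp)
after=$(mktemp)

cat > "$before" <<'EOF'
@e1 [button] "Sign in"
@e2 [link] "Pricing"
EOF

cat > "$after" <<'EOF'
@e1 [button] "Sign in"
@e2 [link] "Pricing"
@e3 [dialog] "Welcome back"
EOF

# diff exits with status 1 when files differ, so guard it to keep going.
changes=$(diff -u "$before" "$after" || true)
echo "$changes"
rm -f "$before" "$after"
```

Lines prefixed with + exist only in the second snapshot, lines prefixed with - only in the first, and unprefixed lines are unchanged context.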
4.3 Testing Login States
Import cookies from a real browser:
# Interactive selector (macOS supports Chrome / Arc / Brave / Edge)
$B cookie-import-browser
# Specify domain directly
$B cookie-import-browser chrome --domain .github.com
WARNING
Cookie import relies on the macOS Keychain; a system authorization dialog will pop up the first time. gstack does not access it silently: the user must manually click "Allow." Cookie values are decrypted in memory and injected into the browser without touching the disk.
4.4 Responsive Testing
# Generate screenshots for mobile, tablet, and desktop simultaneously
$B goto https://yourapp.com
$B responsive /tmp/layout
# Output: layout-mobile.png, layout-tablet.png, layout-desktop.png
Step 5: Team Collaboration and Advanced Usage
5.1 Vendored Installation Explained
Global installation is for personal use; Vendored installation allows team sharing:
# Execute in project root
git clone https://github.com/garrytan/gstack.git .claude/skills/gstack
cd .claude/skills/gstack
rm -rf .git # Drop gstack's own history so the files are committed as a plain snapshot
./setup
Key points:
- You are installing a snapshot, not a submodule
- The .git directory is deleted, so it won't pollute your repo history
- Upgrading: Each member reruns ./setup after pulling the latest
5.2 Parallel Sprints and Conductor
gstack's structure naturally supports parallelism. A single Claude Code instance is one "person," but multiple instances can run different sprint stages in parallel.
Conductor (officially recommended by gstack) can launch multiple Claude Code instances simultaneously, each running in an independent workspace:
- Instance A: /office-hours re-defining the product
- Instance B: Implementing a specific feature
- Instance C: /review checking another branch
- Instance D: /qa testing the staging environment
All instances share the same Git repository. As long as each instance works on its own branch in its own workspace, Git's branching model keeps their changes isolated until you merge.
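One way to get a separate working directory per branch by hand is git worktree, a standard Git feature. The sketch below demonstrates it in a throwaway repository; it makes no claim about how Conductor sets up its workspaces, and the directory and branch names are placeholders.

```shell
# Sketch: one isolated working directory per branch using git worktree.
# Demonstrated in a throwaway repo; all names here are placeholders.
root=$(mktemp -d)
git init -q "$root/main"
cd "$root/main"
git -c user.name="Example" -c user.email="dev@example.com" \
    commit -q --allow-empty -m "initial commit"

# Each worktree checks out its own new branch in its own directory.
git worktree add -b feature-a "$root/feature-a" >/dev/null 2>&1
git worktree add -b review-b "$root/review-b" >/dev/null 2>&1

# Lists the main checkout plus the two extra worktrees.
git worktree list
```

Each directory can then host its own Claude Code session, and edits in one directory do not touch the files in another.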
5.3 Safety Mode
Enable safety guardrails when handling production code:
/guard # = /careful + /freeze
- /careful: Dangerous commands (rm -rf, DROP TABLE, git push --force) will trigger confirmation pop-ups
- /freeze: Limits file editing scope to specified directories to prevent accidental production code changes during debugging
- /unfreeze: Lifts the freeze
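The two guardrails boil down to two checks: does a command match a risky pattern, and does a file path stay inside an allowed directory. The sketch below shows both under stated assumptions: is_dangerous and is_within are hypothetical names, the pattern list is just the three examples above, and the real skills may work quite differently.

```shell
# Sketch of the two guardrail checks described above.
# is_dangerous and is_within are illustrative names, not gstack internals.

# Succeeds when the command text contains one of the risky patterns.
is_dangerous() {
    case "$1" in
        *"rm -rf"*|*"DROP TABLE"*|*"git push --force"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds when path $2 lies under directory $1 (both relative to the repo).
is_within() {
    case "$2" in
        ../*|*/../*|*/..) return 1 ;;   # refuse paths that climb upward
        "$1"/*) return 0 ;;
        *) return 1 ;;
    esac
}

for cmd in "ls -la" "rm -rf build" "psql -c 'DROP TABLE users'"; do
    if is_dangerous "$cmd"; then
        echo "confirm first: $cmd"
    else
        echo "ok: $cmd"
    fi
done
```

Substring matching is deliberately conservative: it may ask for confirmation more often than needed, which is the safer failure mode for destructive commands.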
Troubleshooting FAQs
1. Skills Not Showing Up
# Go to the gstack directory and run setup manually
cd ~/.claude/skills/gstack && ./setup
If binary compilation fails, check your Bun version:
bun --version # Needs v1.0+
For macOS/Linux installation issues:
# Ensure directory exists
mkdir -p ~/.claude/skills/
# Ensure execution permissions
chmod +x ~/.claude/skills/gstack/setup
2. /browse Fails to Start
# Manually rebuild the binary (setup lives in the gstack root, not in browse/)
cd ~/.claude/skills/gstack
bun install
./setup
If the Bun version is too low:
# Upgrade Bun
curl -fsSL https://bun.sh/install | bash
3. Codex Error "Skipped loading skill(s) due to invalid SKILL.md"
Codex's skill descriptions are stale. Fix:
# For globally installed Codex
cd ~/.codex/skills/gstack && git pull && ./setup --host codex
# For Vendored installation of Codex
cd "$(readlink -f .agents/skills/gstack)" && git pull && ./setup --host codex
4. Windows Compatibility
gstack on Windows relies on WSL or Git Bash (not PowerShell). Ensure:
# Run in Git Bash or WSL
# Both Bun and Node must be in PATH
which bun # Should have output
which node # Should have output
On Windows, Bun has a known bug that affects Playwright (bun#4253); gstack will automatically fall back to Node.js execution.
5. Cookie Import Failed
Cookie import is supported only on macOS (using system Keychain). Linux and Windows are currently not supported.
If you need to test pages requiring login on Linux, you can manually export cookies as JSON:
[
{"name": "session_token", "value": "...", "domain": ".example.com", "path": "/"}
]
Then use:
$B cookie-import /path/to/cookies.json
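If you build the cookie file by hand, a small helper keeps the JSON shape consistent. This is a sketch only: write_cookie_file is a hypothetical helper, the values are placeholders, and the field set simply mirrors the example above.

```shell
# Sketch: write a cookie file in the JSON shape shown above.
# write_cookie_file and the placeholder values are illustrative.
write_cookie_file() {
    # $1 = output path, $2 = cookie name, $3 = cookie value, $4 = domain
    # Values are inserted verbatim, so they must not contain quotes or backslashes.
    cat > "$1" <<EOF
[
  {"name": "$2", "value": "$3", "domain": "$4", "path": "/"}
]
EOF
    chmod 600 "$1"   # cookie values are credentials; keep the file private
}

cookie_file=$(mktemp)
write_cookie_file "$cookie_file" "session_token" "PLACEHOLDER" ".example.com"
cat "$cookie_file"
```

Delete the file once the import is done, since it holds a live session token in plain text.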
6. Version Upgrades
# Option 1: Run the upgrade skill
/gstack-upgrade
# Option 2: Manually pull latest code
cd ~/.claude/skills/gstack && git pull && ./setup
gstack-upgrade will detect if it's a Global or Vendored installation and handle it accordingly.
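To see which kind of installation you have before upgrading, you can check where the gstack directory lives. The sketch below guesses from the two locations used in this guide; detect_install is a hypothetical helper, checking the project directory first is an assumption, and none of this reflects what /gstack-upgrade runs internally.

```shell
# Sketch: guess the install type from where the gstack directory lives.
# detect_install is illustrative, not what /gstack-upgrade actually runs.
detect_install() {
    # $1 = project root, $2 = home directory
    if [ -d "$1/.claude/skills/gstack" ]; then
        echo "vendored"
    elif [ -d "$2/.claude/skills/gstack" ]; then
        echo "global"
    else
        echo "none"
    fi
}

detect_install "$PWD" "$HOME"
```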
Extended Reading / Advanced Directions
ETHOS.md: the core philosophy document of gstack. The "Boil the Lake" principle: AI makes the marginal cost of full implementation near zero, so don't settle for "good enough." Also "Search Before Building," a three-layer knowledge system: verified solutions (Layer 1), trending new solutions (Layer 2), and first principles (Layer 3).
Autoplan: a single command that chains CEO Review → Design Review → Engineering Review. If you don't want to call skills one by one, use /autoplan; it runs all plan reviews in order and asks for confirmation at key decision points.
Conductor: the multi-session parallel management tool that gstack recommends. It organizes multiple Claude Code instances into a real AI engineering team, suitable for advancing multiple features simultaneously.
Source Code Reading Path:
- browse/src/commands.ts: Registry for all browser commands, the single source of truth
- browse/src/snapshot.ts: Core implementation of the Ref system (ARIA tree → Playwright Locator)
- scripts/gen-skill-docs.ts: SKILL.md automatic generation pipeline
- ARCHITECTURE.md: Complete architectural design document
TIP
gstack follows the SKILL.md standard and can be used with any compatible AI Agent (Codex, Cursor, etc.). If you use multiple AI programming tools, ~/.codex/skills/gstack and ~/.claude/skills/gstack can coexist without interference.