
Get started with Agent Etna

Agent Etna is the harness around an AI agent you have already built. It profiles the agent, runs simulations that probe where it breaks, proposes ranked growths (changes) to fix what it finds, and ships them back to your repository under the same review process the rest of your code already uses. This guide walks through the lifecycle, then through the actions you actually take in the dashboard.

1. How Agent Etna works

The product is a closed loop, run as a "cycle." Each step's output becomes the next step's input, and everything happens inside one chat-centric home view — there are no separate tabs to hunt through.

Connect

Point Agent Etna at the agent — a GitHub repository, or a direct URL if it's already running somewhere. No setup wizard, no JSON config to fill in.

Profile

Agent Etna reads the agent's own instructions, tools, and documented behaviour, and builds a capability map — what the agent is supposed to be able to do. This map is what the rest of the cycle reasons about.

Run a simulation

Agent Etna builds situations at the agent's capability frontier and runs them live, so you can watch the agent respond as it happens.

Score

Each situation gets a verdict, and the capability map updates: every capability is marked confirmed, still developing, or failing, with the scenario count behind each judgment rather than just a single number.
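To make the shape of that judgment concrete, here is a minimal sketch of a capability entry that keeps the scenario counts next to the status. It is an illustration only: the class name, both thresholds, and the extra "untested" state are assumptions, not Agent Etna's actual scoring rules.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, chosen only for this illustration.
CONFIRMED_AT = 0.9    # pass rate at or above this counts as "confirmed"
FAILING_BELOW = 0.5   # pass rate below this counts as "failing"


@dataclass
class CapabilityResult:
    name: str
    passed: int  # scenarios that received a passing verdict
    total: int   # scenarios run against this capability

    def status(self) -> str:
        if self.total == 0:
            return "untested"  # assumed extra state, not named in the guide
        rate = self.passed / self.total
        if rate >= CONFIRMED_AT:
            return "confirmed"
        if rate < FAILING_BELOW:
            return "failing"
        return "still developing"

    def summary(self) -> str:
        # Report the counts with the status, not a bare score.
        return f"{self.name}: {self.status()} ({self.passed}/{self.total} scenarios)"


print(CapabilityResult("refund handling", 3, 5).summary())
# refund handling: still developing (3/5 scenarios)
```

Keeping the counts visible lets you tell a status backed by two scenarios from one backed by twenty.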

Propose growths

For each gap the simulation surfaced, Agent Etna proposes a specific change — a new tool, an updated instruction, whatever the failure calls for — and ranks every proposal by impact, confidence, and effort, so you see what's worth building next, not just a pile of suggestions.
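One common way to combine three factors like these is to multiply impact by confidence and divide by effort. The sketch below ranks a few made-up proposals that way. The formula, the 0 to 1 scales, the Growth class, and the example proposals are all assumptions for illustration; Agent Etna's real ranking may weigh the factors differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Growth:
    title: str
    impact: float      # assumed scale: 0 (none) to 1 (large)
    confidence: float  # assumed scale: 0 (guess) to 1 (certain)
    effort: float      # assumed relative cost, must be > 0


def priority(growth: Growth) -> float:
    # Higher impact and confidence raise priority; higher effort lowers it.
    return growth.impact * growth.confidence / growth.effort


def rank(growths: List[Growth]) -> List[Growth]:
    # Highest priority first.
    return sorted(growths, key=priority, reverse=True)


backlog = [
    Growth("Add refund tool", impact=0.9, confidence=0.8, effort=2.0),
    Growth("Tighten escalation instruction", impact=0.6, confidence=0.9, effort=0.5),
    Growth("Rewrite planner", impact=0.9, confidence=0.4, effort=5.0),
]

for growth in rank(backlog):
    print(growth.title)
# Tighten escalation instruction
# Add refund tool
# Rewrite planner
```

Note how the cheap instruction change outranks the higher-impact tool because it costs far less to build.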

Sandbox-verify

Every accepted change proves itself first — built, run, and tested against the situation that motivated it, plus the agent's existing scenarios, before it ever reaches your repo.
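As a rough sketch of that gate: a change passes only if the motivating situation and every existing scenario pass. The run callable and the scenario names here are hypothetical stand-ins; how Agent Etna actually builds and executes the sandbox is not described in this guide.

```python
from typing import Callable, List, Sequence, Tuple


def sandbox_verify(
    run: Callable[[str], bool],
    motivating: str,
    existing: Sequence[str],
) -> Tuple[bool, List[str]]:
    """Return (ok, failures) for a candidate change."""
    # Check the situation that motivated the change, then the regression set.
    failures = [name for name in [motivating, *existing] if not run(name)]
    return (not failures, failures)


# Stand-in runner: pretend the changed agent still fails "handles-refund".
def fake_run(scenario: str) -> bool:
    return scenario != "handles-refund"


ok, failures = sandbox_verify(fake_run, "handles-refund", ["greets-user", "escalates"])
print(ok, failures)  # False ['handles-refund']
```

Returning the failing scenarios, not just a boolean, makes it clear whether the change missed its own target or broke something that used to work.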

Ship

Approve, and it opens a real GitHub pull request — branch protection, CODEOWNERS, required reviewers, and your existing CI all run on it like any other PR. One click merges it; one click rolls it back.

Learn

Every accepted change, every rejected one, every rollback sharpens what the next cycle proposes and how it ranks the backlog. None of the signal is wasted.

2. Finding your way around

The sidebar has: Home (your agent's chat and capability map), Sandbox (every change awaiting review, badged with a pending count), your list of connected Agents, any Groups you've set up for multi-agent coordination, and Settings / API keys / Team under your account.

From Home, the Tools menu opens focused views into a specific agent:

Architecture — triggers, files, externals, data stores.
Skills — recipes Etna can teach your agent.
Personality — how your agent talks and decides.
Security — scan for risks.
Hygiene — repo health and maintainability.
Consumption — token use, cost, and budgets.
Quality & plan — improvement plan and quality model.

3. Connect your first agent

Click Connect new agent in the sidebar (or Connect your first agent if you have none yet). You'll pick who you are, then how the agent is hosted.

Option A

GitHub (recommended)

Best path if your agent's code lives in a GitHub repo. Agent Etna will:

Search your repos, or let you type owner/repo directly.
Ask if the agent lives in a subfolder of that repo.
Ask for the agent's URL (if it's already running) and any keys or secrets it needs at runtime.

Option B

Desktop / Direct URL

Your agent is already running somewhere — on your laptop (http://localhost:3000), on a VPS, on Render. No repo needed. Paste the URL and Agent Etna will talk to it over HTTP.

4. Run your first simulation

Click Run a simulation — the chip sits right above the chat input on your agent's Home. A live progress card appears in the chat as the cycle moves through its steps; when it finishes you'll see the capability map (in the side rail once there's been a cycle) and any proposed growths, with the highest-priority one called out in a Build this next banner.

5. Review and ship a change

Each growth proposal shows its rationale plus three chips — Score, Confidence, Effort — so you can judge it at a glance instead of reading prose. From there:

Click Send to Sandbox (or Ship this from a chat card) to accept a proposal.
It lands in the Live list of the Sandbox view, shaped like the pull request it will become.
Click Ship it (merge) when you're ready — this is a separate, confirm-gated step from acceptance, so nothing reaches your main branch by accident.

6. Slash commands

Typing / in the chat input opens a menu of shortcuts: /simulation to run one, /agent for the agent's profile, /risks for the security view, /spec for its architecture, /check for a system health check, and /help for the full list.

Use responsibly

Only simulate and security-scan agents you own or have explicit authorization to test. Agent Etna produces real adversarial payloads when probing for risks.

7. Troubleshooting

After GitHub sign-in I land on a blank dashboard

Usually a stale cookie or cached HTML. Hard-refresh with Cmd+Shift+R (or Ctrl+Shift+R) and try again. If the problem persists, open the service logs on your deploy host; they'll show the GitHub OAuth callback URL and any `Session save failed` entries.

GitHub OAuth says `redirect_uri_mismatch`

The callback URL in your GitHub OAuth App must match `<BASE_URL>/auth/github/callback` exactly. Check your GitHub OAuth App settings and your `BASE_URL` env var.
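If the two values look the same at a glance, a small script can compare them character for character. This is a minimal sketch, assuming the expected callback is your BASE_URL value with /auth/github/callback appended literally; the function names and example hosts are made up for illustration.

```python
from typing import List
from urllib.parse import urlsplit

CALLBACK_PATH = "/auth/github/callback"  # path taken from this guide


def expected_callback(base_url: str) -> str:
    # Literal concatenation, mirroring <BASE_URL>/auth/github/callback.
    return base_url + CALLBACK_PATH


def diagnose(base_url: str, registered: str) -> List[str]:
    """Return hints about why the URLs differ; an empty list means they match."""
    expected = expected_callback(base_url)
    if expected == registered:
        return []
    hints = [f"expected {expected!r}, GitHub OAuth App has {registered!r}"]
    exp, reg = urlsplit(expected), urlsplit(registered)
    if exp.scheme != reg.scheme:
        hints.append("scheme differs (http vs https)")
    if exp.netloc != reg.netloc:
        hints.append("host or port differs")
    if exp.path != reg.path:
        hints.append("path differs (check for a trailing slash on BASE_URL)")
    return hints


# Example: BASE_URL has a trailing slash, so the built URL contains "//auth".
for hint in diagnose("https://etna.example.com/",
                     "https://etna.example.com/auth/github/callback"):
    print(hint)
```

Paste in the BASE_URL from your environment and the callback URL from your GitHub OAuth App settings; any printed hint points at the part to fix.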

The profile step comes back thin

Agent Etna profiles your agent from whatever it can find: system prompt, README, tool definitions. If the profile looks sparse, add a short description of what the agent is supposed to do somewhere in the repo (an instructions file, a prompt file, or README.md all work) and run another simulation.
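Before re-running, you can check a local checkout for non-empty description files. The sketch below is only a convenience: the file stems come from the examples in this guide, and which files Agent Etna actually reads is not specified here, so treat the list as an assumption.

```python
from pathlib import Path
from typing import List

# Assumed stems, based on the examples named in this guide.
DESCRIPTION_STEMS = {"readme", "instructions", "prompt"}


def find_description_files(repo_root: str) -> List[Path]:
    """Return non-empty files whose name (without extension) looks like a description."""
    root = Path(repo_root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.stem.lower() in DESCRIPTION_STEMS
        and path.stat().st_size > 0  # an empty file gives the profiler nothing
    )


if __name__ == "__main__":
    found = find_description_files(".")
    if found:
        for path in found:
            print(path)
    else:
        print("No description files found; consider adding a README.md.")
```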