
Get started with Agent Etna

Agent Etna is the harness around an AI agent you have already built. It tests the agent, finds where it breaks, proposes fixes, and ships them back to your repository under the same review process the rest of your code already uses. This guide walks through the lifecycle, then through the actions a new user takes in the dashboard.

1. How Agent Etna works

The product is a closed loop. Each step happens where it makes sense (the agent's repository, the simulator, the sandbox, or the audit log), and each step's output becomes the next step's input.

1

Connect

Point Agent Etna at the agent. The supported path is a GitHub repository; a direct URL works if the agent is already running somewhere. No setup wizard, no JSON config to fill in.

2

Read the agent

Agent Etna parses the agent's source — system prompt, tools it can call, model it uses, any documented behaviour — and builds an internal map. This map is what the rest of the system reasons about.

3

Generate tests

The system prompt is treated as a specification. Tests are derived from it: refusals it should make, capabilities it should have, edge cases it should handle. The tests are not handwritten and not generic; they are derived from the agent's own description of its job.

4

Run the suite

The agent answers each generated conversation. Replies are scored on five dimensions (helpfulness, accuracy, safety, brand voice, conciseness). You see what passed, what failed, and the actual conversation in each case.
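As a rough illustration of how five per-dimension scores could roll up into a pass/fail verdict, here is a minimal sketch. The 0-to-1 scale, the 0.7 threshold, and the names `DIMENSIONS`, `failing_dimensions`, and `verdict` are assumptions made for this example; the guide does not say how Agent Etna combines the scores.

```python
from typing import Dict, List

# The five dimensions named in this guide.
DIMENSIONS = ("helpfulness", "accuracy", "safety", "brand_voice", "conciseness")


def failing_dimensions(scores: Dict[str, float], threshold: float = 0.7) -> List[str]:
    """Return the dimensions whose score is below the (assumed) threshold."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    return [d for d in DIMENSIONS if scores[d] < threshold]


def verdict(scores: Dict[str, float], threshold: float = 0.7) -> str:
    """A reply passes only if no dimension falls below the threshold."""
    return "fail" if failing_dimensions(scores, threshold) else "pass"
```

Requiring every dimension to clear the bar, rather than averaging, keeps a strong helpfulness score from hiding a weak safety score. That is a design choice of this sketch, not a documented behaviour.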

5

Propose a fix

For each failure Agent Etna identifies the most likely cause (system prompt, a tool, a parameter, missing context) and proposes a patch against the file where the issue lives. The diff is shown, not summarised.

6

Sandbox-verify

The patch runs against the same tests that surfaced the failure. If it does not pass, no pull request is opened. You see pass/fail on the patched version before being asked to approve anything.
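The gate described above can be sketched as a single predicate. This is an illustration only: `should_open_pr` and the `run_test` callback are hypothetical names, and the real sandbox presumably does far more than call a function per test.

```python
from typing import Callable, Iterable


def should_open_pr(failing_tests: Iterable[str], run_test: Callable[[str], bool]) -> bool:
    """Return True only if every test that surfaced the failure now passes.

    `run_test` is a hypothetical callback that runs one named test against
    the patched agent and reports whether it passed.
    """
    results = [run_test(name) for name in failing_tests]
    # No failing tests means there was nothing to fix, so no PR either.
    return bool(results) and all(results)
```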

7

Open a real pull request

Approved patches land as a real GitHub PR labelled agent-etna. Branch protection, CODEOWNERS, required reviewers, and your existing CI run on it the same way they run on every other PR. The commit is cryptographically signed and tied to your account.
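Because the pull requests carry the agent-etna label, they can be listed with GitHub's ordinary REST API. The sketch below uses the "list repository issues" endpoint, which also returns pull requests (each marked with a `pull_request` key). The function names and the `acme/support-bot` repository are placeholders, not part of Agent Etna.

```python
import json
import urllib.parse
import urllib.request

API = "https://api.github.com"


def label_query_url(owner: str, repo: str, label: str = "agent-etna", state: str = "open") -> str:
    """Build the issues-endpoint URL filtered by label and state."""
    query = urllib.parse.urlencode({"labels": label, "state": state})
    return f"{API}/repos/{owner}/{repo}/issues?{query}"


def only_pull_requests(items: list) -> list:
    """The issues endpoint mixes issues and PRs; PRs carry a 'pull_request' key."""
    return [item for item in items if "pull_request" in item]


def list_labelled_prs(owner: str, repo: str, token: str = "") -> list:
    """Fetch open PRs with the label. A token is needed for private repos."""
    request = urllib.request.Request(
        label_query_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return only_pull_requests(json.load(response))
```

For example, `list_labelled_prs("acme", "support-bot")` would return the open labelled pull requests for that (hypothetical) public repository.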

8

Learn

Every accepted fix, every rejected fix, and every rollback sharpens the standard the next test run is scored against. The system you open on Monday has more of your decisions to draw on than the one you closed on Friday, so none of the signal is wasted.

2. What Agent Etna actually does

The lifecycle above is the loop. In the dashboard it shows up as three modes you switch between.

3. Connect your first agent

Click + Connect Agent on the landing page. You'll see three options — pick the one that matches how your agent is hosted.

Option A

GitHub (recommended)

Best path if your agent's code lives in a GitHub repo. Agent Etna will:

  1. Open GitHub for you to authorize the app.
  2. Return to the modal with your repos auto-populated — pick one.
  3. Scan it for server.js, instructions.txt, or prompt.txt.
  4. Ask for an agent name, role, and the URL where the agent is running.

Option B

Desktop / Direct URL

Your agent is already running somewhere — on your laptop (http://localhost:3000), on a VPS, on Render. No repo needed. Paste the URL and Agent Etna will talk to it over HTTP.

What does Agent Etna expect from the agent?

An HTTP endpoint at /api/status for health checks, and one at /api/simulator/send that accepts {text, channel} and returns the agent's reply plus an optional pipeline trace. If you're building your own agent, mirror this shape and the simulator will work with it right away.
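To make the expected shape concrete, here is a minimal sketch of an agent that exposes both endpoints using only Python's standard library. The response keys `reply` and `trace`, the `{"status": "ok"}` health body, and the echo logic are assumptions for illustration; the guide names the paths and the `{text, channel}` request fields but not the exact response schema.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_reply(payload: dict) -> dict:
    """Turn a {text, channel} request into a response body.

    The keys "reply" and "trace" are assumed names, not a documented schema.
    """
    text = payload.get("text")
    channel = payload.get("channel")
    if not isinstance(text, str) or not isinstance(channel, str):
        raise ValueError("expected string fields 'text' and 'channel'")
    # Placeholder agent logic: echo the message back.
    return {"reply": f"[{channel}] You said: {text}", "trace": []}


class AgentHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        if self.path == "/api/status":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self) -> None:
        if self.path != "/api/simulator/send":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            self._send_json(200, build_reply(payload))
        except (ValueError, AttributeError) as exc:
            # Malformed JSON, a non-object body, or missing fields.
            self._send_json(400, {"error": str(exc)})


def serve(port: int = 3000) -> None:
    """Start the sketch server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), AgentHandler).serve_forever()
```

Calling `serve()` starts the agent on http://localhost:3000, the same address used as the laptop example above. Swap the echo in `build_reply` for a call into your real agent.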

4. Open your agent

Back on the landing page, click the agent row. You'll land in the simulator. From the top-right you can switch between three modes.

5. Run a baseline

Inside the Test Suite tab you have three ways to get tests:

1

Baseline

Runs tests that already live in your agent's repo (exposed via the agent's /api/simulator/tests endpoint). Best if you've hand-written a suite.

2

From Instructions

Agent Etna reads your agent's instructions.txt, prompt.txt, or README.md, then produces 10 scenarios derived directly from what the agent is supposed to do. Each test shows the capability it probes and a one-line rationale tying it back to your instructions. This is the right place to start if you have nothing else.

3

New Tests

Freeform generation: 10 varied prompts. Useful for regression testing once you already have a baseline you trust.

6. Fix what breaks

When a test fails, expand it to see the agent's reply and the specific assertions that didn't pass. Hit Fix This and Agent Etna will:

  1. Pull the relevant source file from the agent's repo.
  2. Ask Claude for a patch that specifically addresses the failure without breaking anything else.
  3. Show you the diff.
  4. Commit to GitHub when you approve.

For a batch fix, click Fix All Failures after running a suite.

7. Add new powers to your agent

The Add Powers tab is a catalog of agentic capabilities you can teach your agent: voice, memory, web search, vision, tool use, RAG, and more. Pick one and Agent Etna reads your agent's instructions and code, then writes a tailored implementation guide.

The catalog covers eight high-leverage capabilities today: Voice Output, Voice Input, Long-Term Memory, Web Search, Vision, Function Calling / Tools, Scheduled Tasks, and RAG.

8. Adversarial testing

Click Security Scan inside Test Suite. Agent Etna generates 20 adversarial prompts covering prompt injection, data exfiltration, role confusion, social engineering, and authorization bypass. Each finding is categorised by severity so you can triage fast.
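If you export or copy the findings, triage amounts to sorting by severity. The sketch below assumes each finding is a dict with `category` and `severity` keys and that severities use the labels critical, high, medium, and low. Both are assumptions, since the guide does not specify the finding format or the severity scale.

```python
from typing import Dict, List

# Assumed severity labels, most urgent first.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def triage(findings: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Sort findings most-severe first; unknown severities go last.

    sorted() is stable, so findings of equal severity keep their original order.
    """
    return sorted(
        findings,
        key=lambda finding: SEVERITY_ORDER.get(finding["severity"], len(SEVERITY_ORDER)),
    )
```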

Use responsibly

Only run the security scan against agents you own or have explicit authorization to test. Agent Etna produces real adversarial payloads.

9. Troubleshooting

After GitHub sign-in I land on a blank dashboard

Usually a stale cookie or cached HTML. Hard-refresh with Cmd+Shift+R (or Ctrl+Shift+R on Windows and Linux) and try again. If the problem persists, open the service logs on your deploy host; they'll show the GitHub OAuth callback URL and any "Session save failed" entries.

GitHub OAuth says redirect_uri_mismatch

The callback URL in your GitHub OAuth App must match <BASE_URL>/auth/github/callback exactly. Check your GitHub OAuth App settings and your BASE_URL env var.
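A quick way to spot the mismatch is to build the expected callback from BASE_URL and compare it piece by piece with what is registered in GitHub. This is a minimal sketch: `callback_problems` is a hypothetical helper, and it assumes the callback is formed by plain concatenation, as the `<BASE_URL>/auth/github/callback` pattern suggests.

```python
from typing import List
from urllib.parse import urlsplit

CALLBACK_PATH = "/auth/github/callback"


def callback_problems(base_url: str, registered: str) -> List[str]:
    """List the differences between the expected and registered callback URLs."""
    expected = base_url + CALLBACK_PATH  # assumed: plain concatenation
    if expected == registered:
        return []
    problems = []
    exp, reg = urlsplit(expected), urlsplit(registered)
    if exp.scheme != reg.scheme:
        problems.append(f"scheme differs: {exp.scheme} vs {reg.scheme}")
    if exp.netloc != reg.netloc:
        problems.append(f"host or port differs: {exp.netloc} vs {reg.netloc}")
    if exp.path != reg.path:
        problems.append(f"path differs: {exp.path} vs {reg.path}")
    return problems or [f"URLs differ: {expected} vs {registered}"]
```

Typical culprits are http versus https, a missing port, and a trailing slash on BASE_URL, which produces a double slash in the path.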

"From Instructions" says no instructions found

Agent Etna looks for a file named instructions or prompt (for example instructions.txt or prompt.txt) or a README.md in the agent's repo. Add one, even a short paragraph describing what the agent does, and retry.
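To check a local clone before retrying, a lookup along these lines can confirm that a candidate file exists. It is a sketch under assumptions: it checks only the repository root, uses the three exact filenames mentioned in this guide, and tries them in that order. Agent Etna's actual matching rules may be looser.

```python
from pathlib import Path
from typing import Optional

# Filenames mentioned in this guide; the search order is an assumption.
CANDIDATES = ("instructions.txt", "prompt.txt", "README.md")


def find_instructions(repo_root: str) -> Optional[Path]:
    """Return the first candidate file found in the repo root, or None."""
    root = Path(repo_root)
    for name in CANDIDATES:
        path = root / name
        if path.is_file():
            return path
    return None
```

Running `find_instructions(".")` from the root of the clone returns the matching path, or `None` if the repo has nothing Agent Etna could read.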



© 2026 Agent Etna, Inc.