Environment Factory Guide

The Big Picture

Before Autonoma runs an E2E test, it needs two things:

  1. Data — a user account, some test records, whatever the test scenario requires
  2. Authentication — a way to log in as that user (cookies, headers, or credentials)

After the test finishes, everything gets cleaned up so the next test starts fresh.

Your job is to implement one endpoint that handles three actions:

| Action | When it’s called | What you do |
| --- | --- | --- |
| discover | When Autonoma connects | Return a list of available scenarios (e.g., “standard”, “empty”) |
| up | Before each test run | Create data, generate auth credentials, return everything |
| down | After each test run | Verify the request is legitimate, then delete the data you created |

That’s it. One endpoint, three actions, and Autonoma handles the rest.

Why “scenarios”?

Different tests need different data. A test for “empty state messaging” needs an org with zero data. A test for “pagination in the runs table” needs hundreds of runs. Instead of one giant seed, you define named scenarios — each one creates exactly the data its tests need.

How the Protocol Works

All communication is a single POST request with a JSON body. The action field tells your endpoint what to do.
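
As a rough illustration of the single-endpoint design, the sketch below routes a raw POST body to a handler chosen by its action field. The `dispatch` function, the handler signature, and the `INVALID_BODY` code are assumptions made for this example; the guide itself only specifies a 400 status for malformed bodies.

```python
import json
from typing import Any, Callable

# Hypothetical handler type: takes the parsed body, returns (status, response).
Handler = Callable[[dict], tuple]


def dispatch(raw_body: bytes, handlers: dict) -> tuple:
    """Route one POST body to the handler named by its "action" field."""
    try:
        body: Any = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        # "INVALID_BODY" is an assumed code, not one defined by the guide.
        return 400, {"error": "Body is not valid JSON", "code": "INVALID_BODY"}
    if not isinstance(body, dict):
        return 400, {"error": "Body must be a JSON object", "code": "INVALID_BODY"}

    action = body.get("action")
    handler = handlers.get(action) if isinstance(action, str) else None
    if handler is None:
        return 400, {"error": f"Unknown action: {action!r}", "code": "UNKNOWN_ACTION"}
    return handler(body)
```

A real endpoint would register one handler each for discover, up, and down, and would run this only after the security checks described later in the guide.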

Discover

Autonoma asks: “What scenarios do you support?”

Request fields:

| Field | Type | Description |
| --- | --- | --- |
| action | "discover" | Always the string "discover" |

Response fields:

| Field | Type | Description |
| --- | --- | --- |
| environments | array | List of available scenarios |
| environments[].name | string | Scenario identifier (e.g., "standard", "empty") |
| environments[].description | string | Human-readable description. Autonoma’s AI reads this to choose the right scenario |
| environments[].fingerprint | string | 16-character hex hash of the scenario’s data structure |

Example:

// → POST /your-endpoint
{ "action": "discover" }

// ← 200 OK
{
  "environments": [
    {
      "name": "standard",
      "description": "Full dataset: users, products, orders...",
      "fingerprint": "a1b2c3d4e5f67890"
    },
    {
      "name": "empty",
      "description": "Empty org, no data",
      "fingerprint": "f0e1d2c3b4a59687"
    }
  ]
}

Up

Autonoma says: “Create the standard scenario for test run run-abc123.”

Request fields:

| Field | Type | Description |
| --- | --- | --- |
| action | "up" | Always the string "up" |
| environment | string | The scenario name (must match one returned by discover) |
| testRunId | string | Unique identifier for this test run. Use it to make emails and org names unique |

Response fields:

| Field | Type | Description |
| --- | --- | --- |
| auth | object | Credentials Autonoma uses to act as the test user |
| auth.cookies | array | Session cookies to inject. Each has name, value, httpOnly, sameSite, path |
| refs | object | IDs of everything you created. These come back verbatim in down |
| refsToken | string | A signed (JWT or equivalent) copy of refs |
| metadata | object | Extra info for Autonoma’s AI agent (email, role, org name, etc.) |

Example:

// → POST /your-endpoint
{
  "action": "up",
  "environment": "standard",
  "testRunId": "run-abc123"
}

// ← 200 OK
{
  "auth": {
    "cookies": [
      {
        "name": "session",
        "value": "eyJ...",
        "httpOnly": true,
        "sameSite": "lax",
        "path": "/"
      }
    ]
  },
  "refs": {
    "organizationId": "org_xyz",
    "userId": "usr_abc",
    "productIds": ["prod_1", "prod_2"]
  },
  "refsToken": "eyJhbGciOiJIUzI1NiIs...",
  "metadata": {
    "email": "test-user@example.com",
    "scenario": "standard"
  }
}

Down

Autonoma says: “I’m done with test run run-abc123. Here are the refs you gave me — delete everything.”

Request fields:

| Field | Type | Description |
| --- | --- | --- |
| action | "down" | Always the string "down" |
| testRunId | string | Same test run ID from the up call |
| refs | object | The exact refs object returned by up |
| refsToken | string | The exact refsToken returned by up |

Response fields:

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | true if teardown completed |

Example:

// → POST /your-endpoint
{
  "action": "down",
  "testRunId": "run-abc123",
  "refs": {
    "organizationId": "org_xyz",
    "userId": "usr_abc",
    "productIds": ["prod_1", "prod_2"]
  },
  "refsToken": "eyJhbGciOiJIUzI1NiIs..."
}

// ← 200 OK
{ "success": true }

Before deleting anything, you must verify the refsToken and confirm it matches the refs in the request body. This prevents anyone from crafting a fake down request to delete arbitrary data.

Security Model

Three layers of security protect your endpoint:

Layer 1: Environment Gating

Your endpoint should not exist in production unless explicitly enabled. The simplest approach: return 404 when NODE_ENV=production (or your framework’s equivalent) unless you’ve set a specific override flag.

This is the first line of defense. Even if someone discovers the URL, it doesn’t respond in production.
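
One possible shape for this gate, as a minimal sketch: the function below decides whether the endpoint should respond at all. `NODE_ENV` comes from the text above; the override flag name `AUTONOMA_FACTORY_ENABLED` is a hypothetical choice for illustration.

```python
import os


def endpoint_enabled(env=None) -> bool:
    """Return True if the environment factory endpoint may respond."""
    env = os.environ if env is None else env
    # NODE_ENV is the example from the guide; other stacks have their own equivalent.
    in_production = env.get("NODE_ENV") == "production"
    # AUTONOMA_FACTORY_ENABLED is an assumed name for the explicit override flag.
    override = env.get("AUTONOMA_FACTORY_ENABLED") == "true"
    return (not in_production) or override
```

When this returns False, the route would answer 404 before doing any other work, so the endpoint looks as if it does not exist.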

Layer 2: Request Signing (HMAC-SHA256)

Every request from Autonoma includes a signature header:

x-signature: <hex-digest>

The signature is an HMAC-SHA256 of the raw request body, using a shared secret that only you and Autonoma know. Your endpoint must:

  1. Read the raw request body (before JSON parsing)
  2. Compute HMAC-SHA256 of that body using your shared secret
  3. Compare your result with the x-signature header, using a constant-time comparison
  4. Reject if they don’t match (return 401)

This ensures that every accepted request was signed by someone who holds the shared secret, which should be only Autonoma.
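
The verification steps above could look roughly like this using only the Python standard library. The function names are illustrative, and the sketch assumes the header carries a lowercase hex digest, which is what the openssl commands later in this guide produce.

```python
import hashlib
import hmac
from typing import Optional


def sign_body(raw_body: bytes, secret: str) -> str:
    """Hex HMAC-SHA256 of the raw, unparsed request body."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def signature_is_valid(raw_body: bytes, header_value: Optional[str], secret: str) -> bool:
    """Check the x-signature header against the body; False means respond 401."""
    if not header_value:
        return False  # missing header
    expected = sign_body(raw_body, secret)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of a guessed signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

The important detail is that `raw_body` is the exact byte string received. Re-serializing parsed JSON can change whitespace or key order and produce a different digest.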

Layer 3: Signed Refs (for down only)

When up creates data, it signs the refs map into a JWT token (refsToken). When down receives the token back:

  1. Verify the JWT signature and expiry (24h)
  2. Decode the refs from inside the token
  3. Compare them with the refs in the request body
  4. Only proceed if they match exactly

This guarantees that down can only delete data that up actually created.

Error Responses

Use consistent error codes so Autonoma can handle failures gracefully:

| Situation | HTTP Status | Error Code |
| --- | --- | --- |
| Unknown action | 400 | UNKNOWN_ACTION |
| Unknown scenario name | 400 | UNKNOWN_ENVIRONMENT |
| up fails during creation | 500 | UP_FAILED |
| down fails during deletion | 500 | DOWN_FAILED |
| Invalid, expired, or mismatched refs | 403 | INVALID_REFS_TOKEN |
| Missing or invalid HMAC signature | 401 | (no code needed) |

Response shape:

{ "error": "Human-readable description", "code": "ERROR_CODE" }
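
To keep status codes and error codes in sync, one option is a small lookup keyed by error code. This is a sketch, and the `error_response` helper is a hypothetical name; the codes and statuses are copied from the table above.

```python
# Status for each documented error code (the 401 case carries no code).
ERROR_STATUS = {
    "UNKNOWN_ACTION": 400,
    "UNKNOWN_ENVIRONMENT": 400,
    "UP_FAILED": 500,
    "DOWN_FAILED": 500,
    "INVALID_REFS_TOKEN": 403,
}


def error_response(code: str, message: str) -> tuple:
    """Build (status, body) in the documented error shape."""
    return ERROR_STATUS[code], {"error": message, "code": code}
```

For example, `error_response("UNKNOWN_ENVIRONMENT", "No scenario named 'foo'")` yields a 400 status with a body carrying both the message and the code.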

Implementing the Actions

Implementing Discover

This is the simplest action. It returns your list of scenarios with their metadata.

What to return for each scenario:

| Field | Type | Description |
| --- | --- | --- |
| name | string | Identifier (e.g., "standard", "empty") |
| description | string | Human-readable description. Autonoma’s AI reads this to choose the right scenario |
| fingerprint | string | A 16-character hex hash of the scenario’s data structure |

function handleDiscover():
    scenarios = getAllRegisteredScenarios()
    return {
        environments: scenarios.map(s => ({
            name: s.name,
            description: s.description,
            fingerprint: s.computeFingerprint()
        }))
    }

Implementing Up

This is where the real work happens. up receives a scenario name and a test run ID, and creates all the data.

Step by step:

  1. Find the scenario by name. Return 400 UNKNOWN_ENVIRONMENT if not found.
  2. Call the scenario’s up function, which creates all database records and collects their IDs into a refs map.
  3. Sign the refs into a JWT token (the refsToken).
  4. Create auth credentials — whatever your app needs to log in as the test user.
  5. Return everything: auth, refs, refsToken, metadata.

Important design decisions:

  • Every up creates a NEW isolated dataset. Use the testRunId to make names/emails unique (e.g., test-user-run-abc123@example.com). This allows parallel test runs without collisions.
  • Collect ALL created IDs into refs. You’ll need them for teardown.
  • Handle creation order carefully. Parent records must be created before children.
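
A minimal sketch of step 2 and the design decisions above, using an in-memory dict as a stand-in for a real database. The scenario function `up_standard`, the table names, and the ID format are all assumptions for illustration; token signing and auth creation (steps 3 and 4) are left out.

```python
import itertools

_ids = itertools.count(1)  # stand-in for database-generated IDs


def up_standard(db: dict, test_run_id: str) -> dict:
    """Hypothetical "standard" scenario: create an org, a user, and two products."""

    def insert(table: str, row: dict) -> str:
        row_id = f"{table}_{next(_ids)}"
        db.setdefault(table, {})[row_id] = row
        return row_id

    # Parent first: the org must exist before anything that references it.
    org_id = insert("org", {"name": f"Test Org {test_run_id}"})
    # testRunId keeps the email unique across parallel runs.
    user_id = insert(
        "usr", {"org_id": org_id, "email": f"test-user-{test_run_id}@example.com"}
    )
    product_ids = [
        insert("prod", {"org_id": org_id, "name": f"Product {i}"}) for i in range(2)
    ]
    # Every created ID goes into refs so teardown can find it later.
    return {"organizationId": org_id, "userId": user_id, "productIds": product_ids}
```

Calling it twice with different test run IDs produces two disjoint datasets, which is the property that makes parallel runs safe.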

Implementing Down

down receives the refs map and the signed token, verifies them, and deletes everything.

Step by step:

  1. Verify the refsToken — decode the JWT, check it hasn’t expired (24h max), extract the refs.
  2. Compare decoded refs with request refs — they must match exactly. If someone sends a valid token but swaps the refs in the request body, reject with 403.
  3. Determine which scenario was used (from the refs structure, or store the scenario name in refs).
  4. Call the scenario’s down function, which deletes all records.
  5. Return { success: true }.
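
The steps above might be wired together as follows. This is a sketch under stated assumptions: `decode_token` and `delete_refs` are hypothetical callables supplied by your own code, where `decode_token` returns the refs stored in a valid, unexpired token (or None), and `delete_refs` covers steps 3 and 4 by finding the right scenario and deleting its records.

```python
from typing import Any, Callable, Optional


def handle_down(
    body: dict,
    decode_token: Callable[[Any], Optional[dict]],
    delete_refs: Callable[[dict], None],
) -> tuple:
    """Verify the token, compare refs, delete, and report success."""
    # Step 1: verify the token and extract the refs it vouches for.
    decoded_refs = decode_token(body.get("refsToken", ""))
    # Step 2: the refs in the request body must match the signed refs exactly.
    if decoded_refs is None or decoded_refs != body.get("refs"):
        return 403, {"error": "Refs token is invalid, expired, or mismatched",
                     "code": "INVALID_REFS_TOKEN"}
    # Steps 3 and 4: delegate scenario lookup and deletion.
    try:
        delete_refs(decoded_refs)
    except Exception as exc:
        return 500, {"error": f"Teardown failed: {exc}", "code": "DOWN_FAILED"}
    # Step 5.
    return 200, {"success": True}
```

Deleting from `decoded_refs` rather than from the request body means the handler only ever acts on IDs that were signed during up.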

Scenario Fingerprinting

Each scenario has a fingerprint — a hash of its structural definition. It serves two purposes: drift detection and validation.

The problem it solves

You add a new field to your users table, but forget to update the scenario’s up function to populate it. Now your tests are running against incomplete data. The fingerprint catches this.

How Autonoma uses it

Autonoma stores the fingerprint from your last successful run. Before each new test run, it calls discover and compares fingerprints. If they differ, Autonoma knows the scenario data has changed and can re-analyze accordingly.

How to build it

  1. Define a descriptor object that mirrors the structure of what your up creates
  2. JSON-serialize it and hash with SHA-256
  3. Take the first 16 hex characters

descriptor = {
    users: 4,
    products: { count: 10, statuses: { active: 8, draft: 2 } },
    orders: 5
}
fingerprint = sha256(JSON.stringify(descriptor)).substring(0, 16)

The key property: The fingerprint is computed from the same constants your up function reads. When you add a product, the descriptor’s count changes, and the fingerprint changes automatically.
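
For a concrete version of the recipe, here is a small Python sketch. The `STANDARD` descriptor mirrors the example above and is assumed to be the same constant your up function reads. Sorting keys and fixing separators is an extra choice made here so the result does not depend on dict ordering; the resulting hash will therefore differ from one computed over `JSON.stringify` output.

```python
import hashlib
import json

# Hypothetical scenario constants; the up function would read these same values.
STANDARD = {
    "users": 4,
    "products": {"count": 10, "statuses": {"active": 8, "draft": 2}},
    "orders": 5,
}


def fingerprint(descriptor: dict) -> str:
    """First 16 hex characters of the SHA-256 of a canonical JSON serialization."""
    # sort_keys and fixed separators make the serialization deterministic.
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
```

Changing any count in the descriptor changes the fingerprint, while reordering its keys does not.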

Signed Refs — How Teardown Stays Safe

This is the most important security concept. Here’s the full flow:

┌── up ──────────────────────────────────────────────┐
│                                                    │
│  1. Create org, users, products...                 │
│  2. Collect IDs: refs = { orgId, userIds, ... }    │
│  3. Sign: refsToken = JWT.sign({ refs }, secret)   │
│  4. Return both refs AND refsToken                 │
│                                                    │
└────────────────────────────────────────────────────┘
        │  (Autonoma runs tests)
        ▼
┌── down ────────────────────────────────────────────┐
│                                                    │
│  1. Receive refs AND refsToken                     │
│  2. Verify: decoded = JWT.verify(refsToken)        │
│  3. Compare: decoded.refs === request.refs?        │
│       NO  → 403 INVALID_REFS_TOKEN                 │
│       YES → proceed to delete                      │
│  4. Delete everything in refs                      │
│                                                    │
└────────────────────────────────────────────────────┘

What this prevents:

| Attack | Why it fails |
| --- | --- |
| Attacker sends fake refs with made-up IDs | No valid token → rejected |
| Attacker sends a valid token but changes the refs | Refs don’t match token → rejected |
| Attacker replays a token from a week ago | Token expired (24h) → rejected |

No server-side state needed. The token itself is the proof.
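
To make the flow concrete, the sketch below signs and verifies an HS256 token carrying refs, using only the Python standard library so it stays self-contained. In practice a maintained JWT library is usually the better choice; the function names and the `now` parameter (added so expiry can be tested) are assumptions for this example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256)
    return _b64url(digest.digest())


def sign_refs(refs: dict, secret: str, ttl_seconds: int = 24 * 3600, now=None) -> str:
    """Return a compact HS256 JWT whose payload holds refs and a 24h expiry."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps({"refs": refs, "exp": now + ttl_seconds}).encode("utf-8"))
    return f"{header}.{payload}.{_mac(f'{header}.{payload}', secret)}"


def verify_refs(token: str, secret: str, now=None):
    """Return the refs from a valid, unexpired token, or None otherwise."""
    now = int(time.time()) if now is None else now
    try:
        header, payload, signature = token.split(".")
        expected = _mac(f"{header}.{payload}", secret)
        if not hmac.compare_digest(expected.encode(), signature.encode()):
            return None  # tampered token or wrong secret
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, AttributeError):
        return None  # not a well-formed token
    if claims.get("exp", 0) <= now:
        return None  # expired
    return claims.get("refs")
```

The down handler would then compare the returned refs with the refs in the request body and respond 403 INVALID_REFS_TOKEN if `verify_refs` returns None or the two differ.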

Authentication Strategies

The auth object in your up response tells Autonoma how to log in as the test user.

Option A: Session Cookies (most common)

If your app uses cookie-based sessions, generate a session during up and return the cookies:

{
  "auth": {
    "cookies": [
      {
        "name": "session-token",
        "value": "abc123",
        "httpOnly": true,
        "sameSite": "lax",
        "path": "/"
      }
    ]
  }
}

Works with: NextAuth, custom JWT cookies, session stores, etc.

Option B: Bearer Token / Headers

If your app uses API tokens or bearer auth:

{
  "auth": {
    "headers": {
      "Authorization": "Bearer eyJ..."
    }
  }
}

Works with: Auth0, custom API keys, OAuth tokens, etc.

Option C: Username + Password

If your app has a login page and you want Autonoma to log in through it:

{
  "auth": {
    "credentials": {
      "email": "test-user@example.com",
      "password": "TestP@ssw0rd123!"
    }
  }
}

Options A and B can be used together. Cookies or headers are preferred because Autonoma can use them directly without navigating a login page.

Writing Your Teardown Function

Teardown is where most bugs hide. Key rules:

Rule 1: Delete in reverse creation order

If up creates: org → users → products → orders, then down must delete: orders → products → users → org. Foreign key constraints enforce this.

Rule 2: Don't rely on ORM cascade behavior

ORMs have inconsistent cascade defaults. Explicit deletion in reverse order is always safer.

Rule 3: Handle circular foreign keys

If your schema has tables that reference each other, you can’t delete from either table first without violating the other’s foreign key.

Solution: Use raw SQL in a transaction to temporarily drop the FK constraint:

BEGIN;
ALTER TABLE components DROP CONSTRAINT components_default_version_id_fkey;
DELETE FROM component_versions WHERE org_id = $1;
DELETE FROM components WHERE org_id = $1;
ALTER TABLE components ADD CONSTRAINT components_default_version_id_fkey
    FOREIGN KEY (default_version_id) REFERENCES component_versions(id);
COMMIT;

Rule 4: Handle nested/self-referential records

If a table references itself (e.g., folders with parent folders), delete children before parents:

DELETE FROM folders WHERE org_id = $1 AND parent_id IS NOT NULL;
DELETE FROM folders WHERE org_id = $1;

Testing Your Implementation

Write integration tests that cover the full lifecycle.

Happy Path Tests

| Test | What it verifies |
| --- | --- |
| discover returns scenarios | Correct names, descriptions, 16-char fingerprints |
| Fingerprints are stable | Calling discover twice returns identical fingerprints |
| up creates data | Query your database after up and verify entity counts |
| down deletes data | Query your database after down and verify everything is gone |
| Full round-trip | up → verify data exists → down → verify data is gone |

Security Tests

| Test | What it verifies |
| --- | --- |
| Tampered token | Send a random string as refsToken → expect 403 |
| Mismatched refs | Send a valid token but change the refs body → expect 403 |
| Expired token | Create a token with past expiry → expect 403 |
| Missing signature | Send a request without x-signature → expect 401 |
| Invalid signature | Send a request with a wrong signature → expect 401 |

Error Handling Tests

| Test | What it verifies |
| --- | --- |
| Unknown action | { action: "explode" } → expect 400 |
| Unknown environment | { action: "up", environment: "nonexistent" } → expect 400 |
| Malformed body | Send non-JSON → expect 400 |

Manual Testing with curl

curl commands for discover, up, and down

Set your signing secret first:

export SECRET="your-signing-secret"
export BASE_URL="https://your-app.example.com"

Discover:

BODY='{"action":"discover"}'
SIG=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | sed 's/.*= //')
curl -s -X POST "$BASE_URL/api/autonoma" \
  -H "Content-Type: application/json" \
  -H "x-signature: $SIG" \
  -d "$BODY" | jq .

Up:

BODY='{"action":"up","environment":"standard","testRunId":"manual-test-001"}'
SIG=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | sed 's/.*= //')
UP=$(curl -s -X POST "$BASE_URL/api/autonoma" \
  -H "Content-Type: application/json" \
  -H "x-signature: $SIG" \
  -d "$BODY")
echo "$UP" | jq .
# Save refs and refsToken for the down call
REFS=$(echo "$UP" | jq -c '.refs')
TOKEN=$(echo "$UP" | jq -r '.refsToken')

Down:

BODY=$(jq -n -c --argjson refs "$REFS" --arg token "$TOKEN" \
  '{action:"down", testRunId:"manual-test-001", refs:$refs, refsToken:$token}')
SIG=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | sed 's/.*= //')
curl -s -X POST "$BASE_URL/api/autonoma" \
  -H "Content-Type: application/json" \
  -H "x-signature: $SIG" \
  -d "$BODY" | jq .

Deployment Checklist

Before sharing your endpoint URL with Autonoma:

  • Production guard works — endpoint returns 404 in production (unless explicitly overridden)
  • Signing secret configured — the shared HMAC secret is set in your environment
  • JWT secret configured — used for signing refs tokens
  • discover returns correct data — scenario names, descriptions, and fingerprints
  • up creates all entities — spot-check counts in your database
  • Auth works — use the returned cookies/headers to navigate your app
  • down deletes all entities — no orphaned records left behind
  • down rejects bad tokens — tampered, expired, and mismatched refs return 403
  • Response times acceptable: up < 30s, down < 10s
  • Integration tests pass

Troubleshooting

| Problem | Cause | Fix |
| --- | --- | --- |
| up fails with FK violation | Creating child before parent | Check your creation order: parents first |
| down fails with FK violation | Deleting parent before child | Check your deletion order: children first |
| down fails on circular FK | Two tables reference each other | Drop the constraint temporarily in a transaction |
| Signature verification fails locally | Secret not set or wrong value | Check your env vars match between client and server |
| Fingerprint changes between calls | Non-deterministic data in descriptor | Remove timestamps, random values from descriptor |
| openssl dgst output looks wrong | Different OpenSSL versions | Use sed 's/.*= //' instead of awk '{print $2}' |
| Token expired immediately | Clock skew or wrong expiry | Check server time, ensure JWT expiry is 24h not -24h |
| Parallel tests collide | Same email/name used across runs | Use testRunId in all unique fields |