Argus.
The Core is the eyes. The Agent is the judgment. The boundary between them is the architecture.
A smart-transaction stack that watches Solana in real time, lands Jito bundles intelligently, tracks every submission across commitment levels, and delegates one decision — failure diagnosis — to an AI agent that reasons over the raw failure surface instead of running a script.
System architecture
Two runtimes, one contract. A Rust Core holds every deterministic, network-facing concern. A TypeScript Agent holds a single judgment call. They speak only over HTTP/JSON — and that process boundary is the challenge's required clean separation between AI layer and core stack, made literal.
Rust earns its place on the Yellowstone gRPC firehose: tokio gives idiomatic bounded-channel backpressure and reconnection — the one genuinely hard feature the brief names. TypeScript hosts the agent for LLM ergonomics and fast prompt iteration. Neither leaks into the other. ADR 0001
flowchart LR
subgraph EXT["External infrastructure"]
direction TB
YS["SolInfra Yellowstone gRPC<br/>slot + tx streams"]
RPC["SolInfra RPC<br/>blockhash · simulate"]
SE["Jito searcher gRPC<br/>next leader"]
TF["Jito tip floor"]
BE["Jito block engine<br/>sendBundle · 8 regions"]
OR["OpenRouter<br/>model access"]
SOL["Solana mainnet"]
end
subgraph CORE["Core — Rust · the eyes"]
direction TB
STR["streaming"]
LDR["leader"]
TIP["tip"]
BDL["bundle"]
LFC["lifecycle"]
FAIL["failure / classify"]
ACL["agent_client"]
DB[("SQLite")]
end
subgraph AG["Agent — TS · the judgment"]
direction TB
HTTP["/decide · /health"]
DEC["decide()"]
end
YS --> STR --> LFC --> DB
RPC --> FAIL
SE --> LDR
TF --> TIP
TIP --> BDL
LDR -.timing.-> BDL
BDL --> BE --> SOL
FAIL --> ACL
ACL -->|"raw failure surface"| HTTP --> DEC --> OR
ACL --> DB
Deployment
- Layout: core/ (Rust), agent/ (TypeScript), docs/ (ADRs + plan), logs/ (SQLite + JSONL + Markdown Lifecycle Log).
- Mainnet for the real path, with SolInfra credits and a dedicated low-balance keypair: Jito only lands bundles on mainnet, and judges verify slots on explorers. Devnet is a sandbox only. ADR 0002
- One contract: a scored run hard-gates on the agent's /health and refuses to start if it is down.
Key components
Core (Rust) — deterministic, network-facing
| Module | Responsibility | Key surface |
|---|---|---|
| streaming | Yellowstone slot + tx subscriptions; resilient driver with reconnect + backpressure. | track_lifecycle, resilient_subscribe |
| leader | Next Jito leader window over gRPC; a soft timing signal, never a gate. | next_scheduled_leader |
| tip | Base tip from the live Jito tip floor percentile; clamped to sane bounds. | fetch_tip_lamports |
| bundle | All-or-nothing Jito bundle (payload + tip); 8-region concurrent submit. | build_bundle, submit_all_regions |
| rpc | Blockhash, simulateTransaction, balance, aged blockhash for injection. | simulate_transaction, SimResult |
| failure | Fault injection, baseline classification, remedy execution, the Policy seam. | classify_failure, apply_remedy |
| agent_client | The one HTTP boundary: send the raw surface, receive the Decision. | AgentClient, Decision |
| storage | SQLite source of truth; first-observation-wins stage stamps. | Store, record_decision |
| export | Render the Lifecycle Log (JSONL + Markdown) purely from SQLite. | write_lifecycle_log |
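The tip row ("live tip floor percentile; clamped to sane bounds") can be sketched as below. This is an illustration under stated assumptions, not the repository's fetch_tip_lamports: the TipFloor struct, the SOL-denominated percentiles, and the two bound constants and their values are all hypothetical, and the network fetch is left out.

```rust
/// Lamports per SOL (protocol constant).
const LAMPORTS_PER_SOL: f64 = 1_000_000_000.0;

/// Hypothetical clamp bounds; the real values are not stated in this document.
const MIN_TIP_LAMPORTS: u64 = 1_000;
const MAX_TIP_LAMPORTS: u64 = 1_000_000;

/// Hypothetical snapshot of the tip floor, assumed to arrive in SOL.
#[derive(Debug, Clone, Copy)]
pub struct TipFloor {
    pub p50_sol: f64,
    pub p75_sol: f64,
}

/// Which percentile to use as the base tip (the document's default is p75).
#[derive(Debug, Clone, Copy)]
pub enum Percentile {
    P50,
    P75,
}

/// Convert the chosen percentile to lamports and clamp it into the bounds.
pub fn base_tip_lamports(floor: TipFloor, pct: Percentile) -> u64 {
    let sol = match pct {
        Percentile::P50 => floor.p50_sol,
        Percentile::P75 => floor.p75_sol,
    };
    // Treat NaN, infinities, and non-positive values as "no signal" so the
    // lower bound applies instead of a garbage tip.
    let lamports = if sol.is_finite() && sol > 0.0 {
        (sol * LAMPORTS_PER_SOL).round() as u64
    } else {
        0
    };
    lamports.clamp(MIN_TIP_LAMPORTS, MAX_TIP_LAMPORTS)
}
```

Clamping after the conversion keeps a stale or spiking floor from producing either a zero tip or an unbounded one.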
Agent (TypeScript) — the single judgment
- index.ts: Express service with GET /health and POST /decide (zod-validated), port 8787.
- decide.ts: the OpenRouter call (prompt, reasoning request, submit_decision tool parse).
- types.ts: zod schemas mirroring the Rust types (snake_case).
Storage — the Lifecycle Log is the deliverable
submissions
run_id · attempt · nonce · signature · tip_lamports · landed_slot · processed_at · confirmed_at · finalized_at · failure_class
decisions
remedy · baseline_remedy · diagnosis · triage · rationale · confidence · reasoning_trace · model
A Run is a prefix, not a column: the session is run-{ts} and payload k runs under child run_id = run-{ts}-p{k}, giving unique keys with zero schema change. ADR 0011
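The prefix keying can be illustrated with a few string helpers. The function names are hypothetical, and treating {ts} as a millisecond timestamp is an assumption based on the shape of the run ID quoted later in this document.

```rust
/// Session-level run id: `run-{ts}` (ts assumed to be epoch milliseconds).
pub fn session_run_id(ts_ms: u64) -> String {
    format!("run-{ts_ms}")
}

/// Child run id for payload `k`: `run-{ts}-p{k}`.
pub fn child_run_id(session: &str, k: usize) -> String {
    format!("{session}-p{k}")
}

/// True if `run_id` is the session itself or one of its `-p{k}` children.
/// A plain prefix test would also match `run-{ts}0-p1`, so the suffix is checked.
pub fn belongs_to(run_id: &str, session: &str) -> bool {
    match run_id.strip_prefix(session) {
        Some("") => true,
        Some(rest) => rest
            .strip_prefix("-p")
            .map_or(false, |k| !k.is_empty() && k.bytes().all(|b| b.is_ascii_digit())),
        None => false,
    }
}
```

Because the payload index lives in the key rather than in a new column, existing rows and queries keep working; grouping a session is a prefix match.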
Data flow between services
Happy path — submit, track, persist
The subscription opens before the bundle is sent, so inclusion is never missed; tracking is reconciled afterward against getSignatureStatuses in case a Landed frame is dropped.
sequenceDiagram
autonumber
participant O as Orchestrator
participant B as bundle
participant J as Jito
participant Y as Yellowstone
participant DB as SQLite
O->>B: build_bundle(payload, tip)
O->>DB: record_submission
O->>Y: subscribe slot + tx — before submit
Y-->>O: on_subscribed
O->>J: submit_all_regions
Y-->>O: Landed (slot)
O->>DB: set_landed_slot
Y-->>O: Processed → Confirmed → Finalized
O->>DB: mark_stage
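Because a stage can be reported both by the stream and by the later reconciliation pass, stage stamps need to be idempotent. The sketch below shows first-observation-wins stamping with an in-memory struct standing in for the SQLite row; the column names come from the submissions table, while the Stage enum and the boolean return value are assumptions.

```rust
/// Commitment stages tracked per submission.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Processed,
    Confirmed,
    Finalized,
}

/// In-memory stand-in for the stage columns of one `submissions` row.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct StageStamps {
    pub processed_at: Option<u64>,
    pub confirmed_at: Option<u64>,
    pub finalized_at: Option<u64>,
}

impl StageStamps {
    /// Record `at_ms` for `stage` only if no stamp exists yet.
    /// Returns true when the stamp was written, false when an earlier
    /// observation already won.
    pub fn mark_stage(&mut self, stage: Stage, at_ms: u64) -> bool {
        let slot = match stage {
            Stage::Processed => &mut self.processed_at,
            Stage::Confirmed => &mut self.confirmed_at,
            Stage::Finalized => &mut self.finalized_at,
        };
        if slot.is_some() {
            return false;
        }
        *slot = Some(at_ms);
        true
    }
}
```

With this rule a late duplicate from reconciliation cannot overwrite the earlier, more accurate stream timestamp.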
The lifecycle, measured
One submission's progression on mainnet, with the real deltas from the graded run. Two adjacent deltas, two orders of magnitude apart, measuring different physics.
- Submitted: bundle sent to the Jito block engine.
- Landed: included in a Jito leader's slot; binary, detected on the tx stream. Faulted bundles never reach inclusion and are recorded with no slot.
- Processed: block replayed by a node.
- Confirmed: ≥ ⅔ of stake voted on the slot.
- Finalized: slot rooted and irreversible.
123 ms vs 12.2 s. The first measures how fast votes propagate; the second waits for the chain to root. Same instrument, two different questions.
processed → confirmed: ~123 ms · confirmed → finalized: ~12.2 s, about 100× longer.
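The two deltas reduce to simple arithmetic over the stage timestamps. The helpers below are a hypothetical sketch (names and millisecond units are assumptions); the only figures taken from the run are the 123 ms median and the ~12.2 s rooting delay.

```rust
/// Elapsed milliseconds between two stage stamps.
/// None if either stamp is missing or the later one precedes the earlier one.
pub fn delta_ms(earlier: Option<u64>, later: Option<u64>) -> Option<u64> {
    later?.checked_sub(earlier?)
}

/// Median of a set of deltas; None for an empty set.
/// For an even count, the mean of the two middle values (integer division).
pub fn median_ms(samples: &[u64]) -> Option<u64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2
    })
}

/// How many times longer one delta is than another; None if the divisor is 0.
pub fn ratio(longer_ms: u64, shorter_ms: u64) -> Option<f64> {
    if shorter_ms == 0 {
        None
    } else {
        Some(longer_ms as f64 / shorter_ms as f64)
    }
}
```

For the figures above, ratio(12_200, 123) comes to roughly 99, which is where "about 100×" comes from.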
Failure path — diagnose, remedy, resubmit
A Jito bundle is all-or-nothing, so a faulted transaction never lands and leaves no on-chain error. The one deterministic pre-submit source of truth is a preflight simulateTransaction — and that output is the raw surface handed to the agent.
sequenceDiagram
autonumber
participant O as injection_run
participant R as rpc.simulate
participant A as Agent → OpenRouter
participant DB as SQLite
O->>R: simulateTransaction
R-->>O: err · instruction_error · logs
O->>A: POST /decide (raw surface, no failure_class)
A-->>O: diagnosis · triage · remedy · trace
O->>DB: record_decision (agent + baseline)
alt remedy = abort
O-->>O: stop — no retry
else recoverable
O->>O: attempt-2 (fresh blockhash / raised CU)
end
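The alt branch in the diagram can be sketched as a function that either stops or produces the parameters for attempt 2. Everything here is illustrative: the Remedy variants, the TxParams and Measured structs, and plan_attempt_2 are hypothetical names, and the sketch assumes Core supplies the measured values while the agent only names the remedy.

```rust
/// Remedies named in this document (variant names are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Remedy {
    RefreshBlockhash,
    RaiseCuLimit,
    BumpTip,
    Abort,
}

/// Hypothetical subset of the parameters that differ between attempts.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TxParams {
    pub blockhash: String,
    pub cu_limit: u32,
    pub tip_lamports: u64,
}

/// Magnitudes measured by Core, never chosen by the agent.
#[derive(Debug, Clone)]
pub struct Measured {
    pub fresh_blockhash: String,
    pub resimulated_cu: u32,
    pub raised_tip_lamports: u64,
}

/// None means "stop, no retry"; Some carries the parameters for attempt 2.
/// Only the field the remedy names is changed.
pub fn plan_attempt_2(remedy: Remedy, base: &TxParams, m: &Measured) -> Option<TxParams> {
    let mut next = base.clone();
    match remedy {
        Remedy::Abort => return None,
        Remedy::RefreshBlockhash => next.blockhash = m.fresh_blockhash.clone(),
        Remedy::RaiseCuLimit => next.cu_limit = m.resimulated_cu,
        Remedy::BumpTip => next.tip_lamports = m.raised_tip_lamports,
    }
    Some(next)
}
```

Returning an Option makes the abort branch impossible to ignore at the call site: there is simply no second attempt to submit.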
Infrastructure decisions
Every decision is recorded as an ADR in the repo. The load-bearing ones:
| Decision | What & why | ADR |
|---|---|---|
| Mainnet, not devnet | Jito lands bundles only on mainnet; slots must be explorer-verifiable. SolInfra credits remove the cost argument; a low-balance keypair caps exposure. | 0002 |
| Streams, not polling | Inclusion from the tx stream, commitment from the slot stream. getBundleStatuses is a cross-check only. | 0004 |
| Dynamic tips | Base tip = a live tip-floor percentile (default p75), rotated across accounts — never hardcoded. The agent may raise it as a remedy; base tipping stays in Core. | 0005 |
| OpenRouter | OpenAI-compatible API normalizes reasoning traces and a submit_decision tool across providers — the model is env-configurable and rotatable. | 0006 |
| Jito bundles are scored | Real sendBundle, multi-region fan-out. A Jito auth UUID makes the engine forward bundles. Helius Sender is a keyless backstop, never the scored path. | 0007 |
| Leader via searcher gRPC | getNextScheduledLeader is gRPC-only; a minimal vendored proto avoids a conflicting SDK. Timing is a soft signal, never a gate. | 0008 |
| Stream resilience | A receive task feeds a bounded channel; exponential-backoff reconnect, a cumulative ceiling, and shed-and-count give genuine backpressure. | 0009 |
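The stream-resilience row names three mechanisms: exponential backoff, a cumulative ceiling, and shed-and-count. A minimal sketch of each follows. It uses std's bounded sync_channel as a stand-in for the tokio channel the Core is described as using, and the Backoff and Shedder types, their fields, and the parameter values in the usage note are all assumptions.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};
use std::time::Duration;

/// Exponential backoff that gives up once total waiting would pass a ceiling.
pub struct Backoff {
    next: Duration,
    max_step: Duration,
    spent: Duration,
    ceiling: Duration,
}

impl Backoff {
    pub fn new(initial: Duration, max_step: Duration, ceiling: Duration) -> Self {
        Backoff { next: initial, max_step, spent: Duration::from_secs(0), ceiling }
    }

    /// Next reconnect delay, or None when the cumulative ceiling is exhausted.
    pub fn next_delay(&mut self) -> Option<Duration> {
        let delay = self.next.min(self.max_step);
        let total = self.spent.checked_add(delay)?;
        if total > self.ceiling {
            return None;
        }
        self.spent = total;
        self.next = delay.saturating_mul(2);
        Some(delay)
    }
}

/// Producer side of a bounded channel that drops and counts on overflow
/// instead of blocking the receive task.
pub struct Shedder<T> {
    tx: SyncSender<T>,
    shed: u64,
}

impl<T> Shedder<T> {
    pub fn new(tx: SyncSender<T>) -> Self {
        Shedder { tx, shed: 0 }
    }

    /// True if the item was queued; false if it was shed or the consumer is gone.
    pub fn offer(&mut self, item: T) -> bool {
        match self.tx.try_send(item) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) => {
                self.shed += 1;
                false
            }
            Err(TrySendError::Disconnected(_)) => false,
        }
    }

    pub fn shed_count(&self) -> u64 {
        self.shed
    }
}
```

With an initial delay of 100 ms, a 400 ms step cap, and a 1 s ceiling, next_delay yields 100 ms, 200 ms, 400 ms and then None, because a fourth wait would push the total past the ceiling.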
Failure handling strategy
Failure is the heart of the system — happy-path-only submissions are disqualified. Argus handles it on two axes: a bounded four-class baseline for remedy variation, and an unbounded program-error tail for diagnosis variation.
The bounded baseline — four classes
| Failure class | Induced by | Default remedy |
|---|---|---|
| Expired blockhash | Sign against a real blockhash aged ~200 slots (past the ~150 window) | refresh blockhash |
| Compute exceeded | CU limit set to 1, below need | raise CU limit (from re-simulation) |
| Bundle failure | Include a failing instruction | abort / rebuild |
| Fee too low | Tip below the live floor under contention | bump tip |
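The table above is small enough to write as a match, which is the point: a bounded baseline is a lookup. The sketch below is not the repository's classify_failure. The enum and function names are hypothetical, the text markers are assumptions about what a simulation error might contain, and fee-too-low detection is omitted because it would need tip-floor context rather than error text.

```rust
/// The four baseline classes from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureClass {
    ExpiredBlockhash,
    ComputeExceeded,
    BundleFailure,
    FeeTooLow,
}

/// Default remedies from the table (`Abort` stands for "abort / rebuild").
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Remedy {
    RefreshBlockhash,
    RaiseCuLimit,
    Abort,
    BumpTip,
}

/// The class-to-remedy lookup.
pub fn default_remedy(class: FailureClass) -> Remedy {
    match class {
        FailureClass::ExpiredBlockhash => Remedy::RefreshBlockhash,
        FailureClass::ComputeExceeded => Remedy::RaiseCuLimit,
        FailureClass::BundleFailure => Remedy::Abort,
        FailureClass::FeeTooLow => Remedy::BumpTip,
    }
}

/// Text-only baseline: anything it does not recognise becomes BundleFailure.
/// The substrings are illustrative markers, not an exhaustive list.
pub fn classify_by_text(error_text: &str) -> FailureClass {
    if error_text.contains("BlockhashNotFound") {
        FailureClass::ExpiredBlockhash
    } else if error_text.contains("exceeded CUs meter")
        || error_text.contains("Computational budget exceeded")
    {
        FailureClass::ComputeExceeded
    } else {
        FailureClass::BundleFailure
    }
}
```

Feeding it three unrelated program errors, for example "custom program error: 0x1786", "invalid instruction data", and "invalid account data for instruction" (illustrative strings, not taken from the graded run), returns BundleFailure every time.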
The unbounded tail — where a classifier goes blind
One identical malformed instruction — [0xff; 8], zero accounts — sent to three different real programs produces three distinct errors. The four-class baseline collapses all three to one verdict. The agent does not.
Retry, recovery, degradation
- Remedy execution stays in Core. The agent names the remedy; Core owns the magnitudes — e.g. the raised CU limit comes from a max-CU re-simulation, not a tuned constant.
- Attempt-2 is seeded clean, so a remedy is tested honestly rather than inheriting the injected fault.
- Loud degradation. If the agent is unreachable within ~45 s, Core falls back to the baseline and records model="local-fallback": visible in the log, never silent.
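The loud-degradation rule can be sketched with a timed wait on a channel. This is an assumption-heavy illustration: the real Core talks to the agent over HTTP, whereas here a std channel stands in for the pending reply, and the Decision struct is reduced to three fields. Only the "local-fallback" marker and the ~45 s budget come from this document.

```rust
use std::sync::mpsc::Receiver;
use std::time::Duration;

/// Marker recorded in the `model` column when the agent did not answer.
pub const FALLBACK_MODEL: &str = "local-fallback";

/// Reduced, hypothetical view of a decision.
#[derive(Debug, Clone, PartialEq)]
pub struct Decision {
    pub remedy: String,
    pub rationale: String,
    pub model: String,
}

/// Wait up to `budget` for the agent's reply; on timeout or a dropped
/// sender, return the baseline remedy tagged with the fallback marker.
pub fn decide_or_fallback(
    agent_reply: &Receiver<Decision>,
    budget: Duration,
    baseline_remedy: &str,
) -> Decision {
    match agent_reply.recv_timeout(budget) {
        Ok(decision) => decision,
        Err(_) => Decision {
            remedy: baseline_remedy.to_string(),
            rationale: "agent unreachable within budget; baseline remedy used".to_string(),
            model: FALLBACK_MODEL.to_string(),
        },
    }
}
```

A caller would pass Duration::from_secs(45) as the budget. Because the fallback is written into the same record as a normal decision, a degraded run is distinguishable after the fact by filtering on the model field.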
AI agent responsibilities
The agent owns exactly one operational decision — failure diagnosis. It observes a failed transaction, reasons about why it failed, and decides what must change before retrying. Retry decisions come from the agent, not from hardcoded logic.
The contract
| Direction | Payload |
|---|---|
| Core → Agent: POST /decide | error_text, instruction_error, failing_program_id, program_logs[], tip_floor_p50/p75, blockhash_age_slots, cu_limit, cu_used. No failure_class is sent. |
| Agent → Core: submit_decision | diagnosis (free text), triage, remedy, rationale, confidence, plus reasoning_trace and the serving model. |
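On the Rust side, the contract might look like the two structs below. The field names follow the table; every field type, the use of Option, and the is_well_formed check (including the assumption that confidence lies in [0, 1]) are guesses for illustration. Serialization is omitted to keep the sketch dependency-free.

```rust
/// Request body for POST /decide. Note the absence of any failure_class field.
#[derive(Debug, Clone)]
pub struct FailureSurface {
    pub error_text: String,
    pub instruction_error: Option<String>,
    pub failing_program_id: Option<String>,
    pub program_logs: Vec<String>,
    pub tip_floor_p50: u64,
    pub tip_floor_p75: u64,
    pub blockhash_age_slots: u64,
    pub cu_limit: u32,
    pub cu_used: Option<u64>,
}

/// Response parsed from the submit_decision tool call.
#[derive(Debug, Clone)]
pub struct Decision {
    pub diagnosis: String,
    pub triage: String,
    pub remedy: String,
    pub rationale: String,
    pub confidence: f64,
    pub reasoning_trace: String,
    pub model: String,
}

impl Decision {
    /// Hypothetical acceptance check before a decision is recorded:
    /// confidence within [0, 1] and non-empty diagnosis and reasoning trace.
    pub fn is_well_formed(&self) -> bool {
        (0.0..=1.0).contains(&self.confidence)
            && !self.diagnosis.trim().is_empty()
            && !self.reasoning_trace.trim().is_empty()
    }
}
```

Keeping failure_class out of the request type means the baseline verdict cannot leak to the agent by accident; the compiler has no field to put it in.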
Triage — the axis the agent reasons on
Any decision that can be specified cleanly enough to grade can also be encoded as a classifier: legible ⟹ enumerable ⟹ lookup-replicable. Handing the agent a four-class verdict and a five-element remedy set yields a 4→5 mapping that a match statement replicates, which is the "simple wrapper" the brief disqualifies. The escape is a different input: the unbounded, unstructured raw failure surface. A single AMM defines its own custom-error enum (the meaning of Custom(6022) differs per program and version), so a static classifier would need a combinatorial, perpetually stale table. Reasoning over the raw surface does not.
The honesty boundary: on a permanent failure the agent and the baseline both abort — the agent's value there is the reason, not a different action. It is graded on the diagnoses a lookup can't produce, not on theatrical disagreement. ADR 0012
Operational evidence
From the graded mainnet run run-1781958744615, committed to the repo as logs/lifecycle-1781958744615.{md,jsonl}.
The graded run — every submission, explorable
All 15 real submissions from run-1781958744615. Faulted rows expand to the agent's diagnosis, triage, and full reasoning trace; landed rows show the commitment deltas drawn to scale.
Four payloads the baseline collapses to one verdict drew four distinct diagnoses. The two recoverable injections — expired blockhash (aged 200 slots, conf. 0.99) and compute exceeded (cu_limit=1, conf. 0.99) — were triaged and landed on attempt 2 (slots 427724252, 427724375). Every decision carried a non-empty reasoning trace.
The three required questions, from this run
Q1 · What does the processed→confirmed delta tell you?
Vote-aggregation latency (≥⅔ stake voting) — consensus health, not inclusion speed. This run: 87–272 ms, median 123 ms. The next hop, confirmed→finalized, took ~12.2 s (rooting), two orders of magnitude larger.
Q2 · Why never use a finalized blockhash for a time-sensitive tx?
A blockhash is valid for only ~150 slots (~60–90 s); a finalized one is already ~31 slots old on receipt, so ~20% of the window is burned before you submit. Shown directly: a blockhash aged 200 slots was rejected with BlockhashNotFound; recovery needed a fresh one.
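The arithmetic behind that answer, as a small sketch. The 150-slot window is the approximate figure used above, and the function names are hypothetical; the exact expiry boundary on-chain is not modelled here.

```rust
/// Approximate blockhash validity window, in slots.
pub const BLOCKHASH_WINDOW_SLOTS: u64 = 150;

/// Slots of validity left for a blockhash of the given age.
pub fn remaining_slots(age_slots: u64) -> u64 {
    BLOCKHASH_WINDOW_SLOTS.saturating_sub(age_slots)
}

/// Share of the window already consumed, as a percentage capped at 100.
pub fn window_burned_pct(age_slots: u64) -> f64 {
    let used = age_slots.min(BLOCKHASH_WINDOW_SLOTS);
    used as f64 / BLOCKHASH_WINDOW_SLOTS as f64 * 100.0
}

/// True once the age is past the approximate window.
pub fn is_expired(age_slots: u64) -> bool {
    age_slots > BLOCKHASH_WINDOW_SLOTS
}
```

A finalized blockhash at ~31 slots old leaves 119 slots and has burned about 20.7% of the window; the injected blockhash at 200 slots is expired outright.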
Q3 · What if the Jito leader skips their slot?
The bundle is slot-specific and atomic — not included, not auto-forwarded, and no tip charged (tips pay only on inclusion). Resubmit to the next leader window with a fresh blockhash. All 6 faulted bundles here were sent free; the recoverable two landed on resubmission.
Decision record
Full context and consequences live in the repository under docs/adr/.
| # | Decision |
|---|---|
| 0001 | Two-runtime split: Rust Core + TypeScript Agent over HTTP |
| 0002 | Run the real path on mainnet, not devnet |
| 0003 | Agent owns Failure Reasoning (superseded by 0012) |
| 0004 | Confirmation via Yellowstone streams; bundle-status RPC is cross-check only |
| 0005 | Dynamic tips from the tip floor; Core sets base, Agent adjusts on failure |
| 0006 | Model access via OpenRouter, not a single-vendor SDK |
| 0007 | Jito bundles are the scored path; Helius Sender is a backstop |
| 0008 | Leader-window timing via a minimal gRPC searcher client |
| 0009 | Resilient subscriptions: bounded-channel backpressure + reconnect |
| 0010 | Deterministic classification via preflight simulation (amended by 0011, 0012) |
| 0011 | The Run: single-session orchestrator, Run-ID-prefix keying |
| 0012 | Agent owns Failure Diagnosis over the unbounded program-error tail |