Nexus

The verification layer for AI agent networks. A2A and IATP prove agents are who they say they are. Nexus checks if they're telling the truth.

The Problem

Today's agent protocols verify the sender and the delivery. None of them verify the answer.

A2A Signed Agent Cards prove the agent is who it claims to be.
IATP scores reputation based on past behavior.
PayCrow / ERC-8004 release escrow when bytes arrive matching a JSON schema.
Nava verifies intent against the user's request.

But Agent B can return a cryptographically-signed, schema-valid, on-time response that is factually wrong — and every layer above passes it through. Partial cheaters, style mimics, semantic swaps, coordinated collusion: all defeat identity-based trust because they aren't lying about who they are. They're lying about what they know.

Nexus is the missing verdict. Claim-level semantic verification that runs over A2A, IATP, or standalone. Verified outputs settle escrow. Failed verifications slash trust. Every step is audited.

Why Not Just A2A?

A2A is the right transport. Nexus is the verdict on top.

Layer	A2A / IATP	PayCrow / ERC-8004 / Nava	Nexus
Sender authenticity	✅ Signed Agent Cards	—	(delegated to A2A)
Identity reputation	✅ Trust score 0-1000	—	(delegated to IATP)
Delivery / schema match	—	✅ Smart-contract escrow	(delegated)
Content correctness	❌	❌	✅ Claim-level verification
Adversarial test coverage	Identity attacks	Replay / non-delivery	11/11 adversarial patterns and 4/4 meta-agent attacks caught, including partial cheat, style mimic, omission, collusion, meaning swap, negation, context shift

Use A2A for who and whether. Use Nexus for what.

60-Second Demo

One script shows the entire story:

git clone https://github.com/timmeck/nexus.git
cd nexus
pip install -r requirements.txt
python demo_cheat.py

What happens:

1. A cheater agent returns garbage with 95% confidence.
2. An honest agent returns real analysis.
3. Nexus verification catches the cheater (26% consensus score).
4. The consumer disputes the escrow, and credits are returned.
5. The cheater is slashed: its trust score drops and it loses credits.

Red Team Results

We ran 4 rounds of adversarial testing against the verification system. Every round exposed weaknesses; each one was fixed and regression-tested.

Adversarial Pattern	Verdict	Score	Result
Baseline (2 honest agents)	PASS	100%	Correct
Dumb Liar (obvious garbage, high confidence)	FAIL	100%	CAUGHT
Plausible Liar (correct format, wrong facts)	FAIL	65%	CAUGHT
Partial Cheater (80% correct, wrong penalties)	FAIL	82%	CAUGHT
Style Mimic (exact format, EU->US, euros->dollars)	FAIL	66%	CAUGHT
Lazy Agent (generic fluff, no data)	FAIL	100%	CAUGHT
Confidence Gamer (99% confidence, vague output)	FAIL	86%	CAUGHT
Colluding Pair vs 1 Honest	FAIL	60%	CAUGHT
Colluding Pair vs 2 Honest	FAIL	60%	CAUGHT
Omission Attack (hides all numbers/dates)	FAIL	100%	CAUGHT
Word Numbers ("thirty five million")	FAIL	65%	CAUGHT
Coordinated Collusion (varied wrong answers)	FAIL	74%	CAUGHT

11/11 adversarial patterns caught. 0 escaped. Plus 4 meta-agent attacks (agents that know the verifier logic): all 4 caught (3 as SUSPICIOUS, 1 as FAIL). See Known Detection Boundaries.

False positive tests (4 honest agents with different styles): 4/4 pass, 0 false positives.

Run it yourself: python red_team_isolated.py

Quick Start

# Start Nexus
python run.py
# Dashboard: http://localhost:9500
# API docs:  http://localhost:9500/docs

10-Line Integration

Any FastAPI agent joins the network with the standalone SDK (zero nexus dependencies):

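from fastapi import FastAPI
from nexus_sdk import NexusAdapter

app = FastAPI()  # your existing FastAPI application

# Register this agent with a local Nexus instance and advertise its capabilities.
adapter = NexusAdapter(
    app=app,
    agent_name="my-agent",
    nexus_url="http://localhost:9500",
    endpoint="http://localhost:8000",
    capabilities=[
        {"name": "summarization", "description": "Summarizes text", "price_per_request": 0.01},
    ],
)

# Requests for the "summarization" capability are dispatched to this handler.
@adapter.handle("summarization")
async def handle(query: str, params: dict) -> dict:
    result = await my_summarize(query)  # my_summarize is your own async function
    return {"result": result, "confidence": 0.9, "cost": 0.01}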

The adapter handles registration, heartbeats (every 30s), HMAC verification, and request/response serialization automatically.

How It Works

Consumer                         Nexus                        Provider
   |                              |                              |
   |-- request ------------------>|                              |
   |                     [POLICY CHECK]                          |
   |                     [ROUTE TO BEST AGENT]                   |
   |                     [BUDGET CHECK]                          |
   |                     [CREATE ESCROW]                         |
   |                              |-- signed request ----------->|
   |                              |<-- response + confidence ----|
   |                     [RECORD TRUST]                          |
   |                     [SETTLE OR DISPUTE]                     |
   |<-- result + audit trail -----|                              |

Every state transition is validated. Invalid jumps raise InvalidTransitionError. Terminal states cannot be mutated.
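
The lifecycle rule above can be illustrated with a small state machine. This is a minimal sketch, not the Nexus implementation: the state names, the `ALLOWED` table, and the `TrackedRequest` class are hypothetical and only loosely follow the diagram. Only the `InvalidTransitionError` name comes from the text above.

```python
from enum import Enum


class InvalidTransitionError(Exception):
    """Raised when a request attempts a transition that is not allowed."""


class State(Enum):
    # Hypothetical state names, loosely inferred from the flow diagram.
    RECEIVED = "received"
    ESCROWED = "escrowed"
    DISPATCHED = "dispatched"
    RESPONDED = "responded"
    SETTLED = "settled"
    DISPUTED = "disputed"


# Each state maps to the set of states it may move to next.
# Terminal states map to an empty set, so they can never be mutated.
ALLOWED = {
    State.RECEIVED: {State.ESCROWED},
    State.ESCROWED: {State.DISPATCHED},
    State.DISPATCHED: {State.RESPONDED},
    State.RESPONDED: {State.SETTLED, State.DISPUTED},
    State.SETTLED: set(),
    State.DISPUTED: set(),
}


class TrackedRequest:
    """Hypothetical request wrapper that validates every state change."""

    def __init__(self) -> None:
        self.state = State.RECEIVED
        self.events = [State.RECEIVED]  # audit trail of visited states

    def transition(self, new_state: State) -> None:
        if new_state not in ALLOWED[self.state]:
            raise InvalidTransitionError(f"{self.state.name} -> {new_state.name}")
        self.state = new_state
        self.events.append(new_state)
```

Keeping the allowed transitions in one table means a skipped step (for example, settling before a response arrives) fails loudly instead of silently corrupting the audit trail.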

Verification System

Nexus uses claim-level verification, not just string similarity:

1. Extract factual claims from each agent's answer (numbers, currencies, dates, jurisdictions, percentages)
2. Normalize claims ("35 million" = "35M" = "thirty five million" = 35000000)
3. Compare critical fields across agents with weighted scoring
4. Veto PASS when critical facts disagree (wrong amounts, wrong jurisdiction, wrong dates)
5. Detect omissions when an agent suspiciously hides specific data
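
The normalization step can be sketched as follows. This is an illustrative approximation under stated assumptions, not the Nexus extractor: `normalize_number` and its lookup tables are hypothetical, and the sketch only handles non-negative English number phrases.

```python
import re

# Hypothetical lookup tables for this sketch.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}
SUFFIXES = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def normalize_number(text: str) -> int:
    """Map '35M', '35 million', 'thirty five million', '35000000' to one integer."""
    s = text.strip().lower().replace(",", "").replace("-", " ")
    # Compact forms: "35000000", "35m", "1.5b".
    compact = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([kmb])?", s)
    if compact:
        return int(round(float(compact.group(1)) * SUFFIXES.get(compact.group(2), 1)))
    total, current = 0, 0
    for word in s.split():
        if re.fullmatch(r"\d+(?:\.\d+)?", word):
            current += float(word)
        elif word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:
            total += current * SCALES[word]
            current = 0
        elif word != "and":
            raise ValueError(f"unrecognized number word: {word!r}")
    return int(round(total + current))


forms = ["35 million", "35M", "thirty five million", "35000000"]
print({normalize_number(f) for f in forms})  # → {35000000}
```

Once every surface form collapses to the same integer, writing a number as words no longer hides a disagreement from the comparison step.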

This catches the attacks that naive string matching misses: partial cheaters (80% correct, wrong penalties), style mimics (same format, different country/currency), and adversarial formatting (numbers as words).
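
A minimal sketch of the compare-and-veto idea, assuming claims have already been extracted and normalized into flat dictionaries. The field names, weights, `PASS_THRESHOLD`, and sample values below are illustrative assumptions, not the values Nexus uses.

```python
# Illustrative sketch only: field names, weights, and threshold are assumptions.
CRITICAL_FIELDS = {"amount", "currency", "jurisdiction", "date"}
WEIGHTS = {"amount": 3.0, "currency": 2.0, "jurisdiction": 2.0, "date": 2.0}
DEFAULT_WEIGHT = 1.0
PASS_THRESHOLD = 0.6


def compare_claims(reference: dict, candidate: dict) -> dict:
    """Score agreement between two claim sets and veto on critical conflicts."""
    agreed = total = 0.0
    conflicts = []
    for field in sorted(set(reference) | set(candidate)):
        weight = WEIGHTS.get(field, DEFAULT_WEIGHT)
        total += weight
        if field in reference and field in candidate:
            if reference[field] == candidate[field]:
                agreed += weight
            elif field in CRITICAL_FIELDS:
                conflicts.append(field)
    score = agreed / total if total else 0.0
    # A critical conflict vetoes PASS no matter how high the weighted score is.
    verdict = "PASS" if score >= PASS_THRESHOLD and not conflicts else "FAIL"
    return {"verdict": verdict, "score": round(score, 2), "conflicts": conflicts}


# Made-up sample claims for demonstration.
honest = {"amount": 35_000_000, "currency": "EUR", "jurisdiction": "EU",
          "date": "2025-01-01", "percentage": 7}
partial_cheater = {**honest, "amount": 3_500_000}
style_mimic = {**honest, "currency": "USD", "jurisdiction": "US"}

print(compare_claims(honest, partial_cheater))
print(compare_claims(honest, style_mimic))
```

In this sketch the partial cheater still agrees on most weighted fields and would clear the threshold on score alone; the veto on the conflicting `amount` is what turns the verdict into FAIL.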

Defense Mechanisms

Mechanism	What it does
Escrow	Payments held during settlement window. Consumer can dispute.
Slashing	Bad output + high confidence = trust AND credit penalty
Challenges	Any agent can dispute another's output
Sybil Detection	Rate-limited registration, similarity flagging
Replay Protection	HMAC + timestamp + signature cache (3-layer)
Reconciliation	Background job catches stuck requests and orphaned escrows
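
The three replay-protection layers in the table (HMAC, timestamp, signature cache) can be sketched with the standard library. This is a hedged illustration, not the Nexus wire format: the `ReplayGuard` class, the `timestamp.body` message layout, and the 300-second window are assumptions.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional

MAX_SKEW_SECONDS = 300  # assumed freshness window


def sign(secret: bytes, body: bytes, timestamp: int) -> str:
    """Sign the timestamp together with the body so neither can be swapped."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class ReplayGuard:
    """Hypothetical verifier combining HMAC, freshness, and a signature cache."""

    def __init__(self, secret: bytes, max_skew: int = MAX_SKEW_SECONDS) -> None:
        self.secret = secret
        self.max_skew = max_skew
        self._seen: Dict[str, int] = {}  # signature -> timestamp

    def verify(self, body: bytes, timestamp: int, signature: str,
               now: Optional[int] = None) -> bool:
        now = int(time.time()) if now is None else now
        # Layer 1: the HMAC must match (constant-time comparison).
        if not hmac.compare_digest(sign(self.secret, body, timestamp), signature):
            return False
        # Layer 2: the timestamp must fall inside the freshness window.
        if abs(now - timestamp) > self.max_skew:
            return False
        # Layer 3: a signature may be accepted only once. Entries older than
        # the window are evicted; layer 2 already rejects them.
        self._seen = {s: t for s, t in self._seen.items()
                      if abs(now - t) <= self.max_skew}
        if signature in self._seen:
            return False
        self._seen[signature] = timestamp
        return True
```

The cache only needs to remember signatures for as long as the timestamp window is open, which keeps its size bounded.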

Connected Agents

8 agents are already integrated via the NexusAdapter SDK:

Agent	Capabilities
Cortex	text_generation, code_analysis
DocBrain	document_analysis, knowledge_retrieval
Mnemonic	memory_management, context_tracking
DeepResearch	deep_research, fact_checking
Sentinel	security_analysis, threat_detection
CostControl	cost_tracking, budget_management
SafetyProxy	prompt_injection_detection, pii_detection
LogAnalyst	log_analysis, error_explanation

Architecture

Nexus has 9 layers. The differentiated value lives in Verification + Defense + Trust — these are what no other agent network does today. The rest exist so Nexus can also run standalone, but are best understood as substrate that A2A can replace.

Differentiating layers (the product):

Layer	Purpose
Verification	Claim-level extraction, normalization, semantic-tension detection, SUSPICIOUS verdict
Defense	Slashing, escrow disputes, challenges, sybil detection — driven by verification verdicts
Trust	Append-only reputation ledger fed by verification outcomes
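
An append-only reputation ledger can be sketched as a list of immutable events whose score is always recomputed from history. The `TrustLedger` class and the per-verdict deltas below are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrustEvent:
    """One immutable ledger entry produced by a verification outcome."""
    agent_id: str
    verdict: str  # "PASS", "SUSPICIOUS", or "FAIL"
    delta: float


class TrustLedger:
    # Hypothetical score changes per verdict; real values would differ.
    DELTAS = {"PASS": 1.0, "SUSPICIOUS": 0.0, "FAIL": -5.0}

    def __init__(self) -> None:
        self._events: List[TrustEvent] = []

    def record(self, agent_id: str, verdict: str) -> TrustEvent:
        """Append a new event; existing entries are never edited or removed."""
        event = TrustEvent(agent_id, verdict, self.DELTAS[verdict])
        self._events.append(event)
        return event

    def score(self, agent_id: str) -> float:
        """Derive the score by replaying the agent's full history."""
        return sum(e.delta for e in self._events if e.agent_id == agent_id)

    def history(self, agent_id: str) -> Tuple[TrustEvent, ...]:
        """Return a read-only view of the agent's events."""
        return tuple(e for e in self._events if e.agent_id == agent_id)
```

Because the score is derived rather than stored, every change to an agent's reputation can be traced back to the verification verdict that caused it.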

Substrate layers (replaceable by A2A or similar):

Layer	Purpose	Can be replaced by
Discovery	Agent registry, capability search, heartbeat monitoring	A2A Agent Cards
Protocol	NexusRequest/NexusResponse lifecycle	A2A JSON-RPC 2.0
Routing	Best, cheapest, fastest, or most trusted agent matching	A2A discovery + routing
Federation	Cross-instance agent registry sync	A2A federation
Payments	Credit wallets, pay-per-request	x402 / PayCrow / on-chain escrow
Schemas	Formal capability definitions	A2A Agent Cards
Policy	Data locality, compliance claims, routing policies	— (keep, used by both modes)

Roadmap: an A2A bridge is the top Phase 2 item. Nexus will consume A2A Agent Cards for discovery, accept A2A JSON-RPC requests, and emit Nexus verdicts back into A2A's response flow. See ROADMAP.md.

Known Detection Boundaries

We tested Nexus with meta-agents — adversarial agents that know how the verifier works and craft responses to bypass it:

Attack	What it does	Result	How
Low Amplitude	34M instead of 35M	FAIL	Claim extraction distinguishes the integers
Meaning Swap	"penalty" becomes "subsidy", same numbers	SUSPICIOUS	Semantic tension: role conflict (penalty vs incentive language)
Negation	"can NOT exceed 35M" — opposite meaning	SUSPICIOUS	Negation divergence detected (2 markers vs 0)
Context Shift	Same numbers, different regulation	SUSPICIOUS	Regulation mismatch (AI Act vs Digital Markets Act)
Shared Hallucination	All agents agree on the same wrong fact	PASS	Consensus is not truth — by design

The SUSPICIOUS verdict means: "claims match numerically, but semantic signals diverge — manual review recommended." Three lightweight heuristics power this (no LLM needed): trigger word divergence, negation surface check, and entity/regulation anchoring.
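
One of these heuristics, the negation surface check, can be sketched in a few lines. This is an assumption-laden illustration: the marker list, the "any difference in counts" rule, and the function names are hypothetical, not the Nexus heuristic itself.

```python
import re

# Hypothetical marker list for this sketch.
NEGATION_MARKERS = {"not", "no", "never", "cannot", "without", "neither", "nor"}


def count_negations(text: str) -> int:
    """Count surface negation markers, including contractions like "can't"."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for t in tokens if t in NEGATION_MARKERS or t.endswith("n't"))


def negation_divergence(answer_a: str, answer_b: str) -> bool:
    """Flag a pair of answers whose negation counts differ."""
    return count_negations(answer_a) != count_negations(answer_b)


def verdict(claims_match: bool, answer_a: str, answer_b: str) -> str:
    """Numeric mismatch fails; numeric match with divergent negation is SUSPICIOUS."""
    if not claims_match:
        return "FAIL"
    return "SUSPICIOUS" if negation_divergence(answer_a, answer_b) else "PASS"


honest = "Fines can exceed 35M."
negated = "Fines can NOT exceed 35M."
print(verdict(True, honest, negated))  # → SUSPICIOUS
```

A surface check like this is cheap and needs no model call, at the cost of missing negation expressed without any of the listed markers.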

Remaining architectural limit: shared hallucination (all agents wrong in the same way). Addressing this requires external truth anchoring — planned for future phases.

What Nexus catches: numeric drift, format tricks, omissions, collusion, style variation, confidence gaming, meaning swaps, negation, context shifts. What Nexus cannot catch: coordinated identical hallucination.

Nexus makes incorrect behavior harder, more visible, and less profitable than correct behavior.

Testing

224 tests + adversarial red team suite:

# Unit + integration tests
pytest -v               # 224 passed

# Killer demo
python demo_cheat.py    # Cheater caught in 60 seconds

# Full red team suite (12 adversarial + 4 false-positive tests)
python red_team_isolated.py

API Reference

Registry

Method	Endpoint	Description
`POST`	`/api/registry/agents`	Register agent
`GET`	`/api/registry/agents`	List agents
`GET`	`/api/registry/agents/{id}`	Get agent
`POST`	`/api/registry/agents/{id}/heartbeat`	Heartbeat
`GET`	`/api/registry/discover`	Find by capability

Protocol

Method	Endpoint	Description
`POST`	`/api/protocol/request`	Submit request (enforced lifecycle)
`POST`	`/api/protocol/verify`	Multi-agent verification
`GET`	`/api/protocol/requests/{id}/events`	Audit trail

Trust & Defense

Method	Endpoint	Description
`GET`	`/api/trust/report/{id}`	Trust report
`POST`	`/api/defense/slash`	Slash agent
`GET`	`/api/defense/escrows`	List escrows
`POST`	`/api/defense/escrows/{id}/dispute`	Dispute escrow

Payments

Method	Endpoint	Description
`GET`	`/api/payments/wallets`	List wallets
`GET`	`/api/payments/wallets/{id}/balance`	Get balance
`POST`	`/api/payments/wallets/{id}/topup`	Add credits

System

Method	Endpoint	Description
`GET`	`/health`	Health check
`GET`	`/api/stats`	Network stats
`WS`	`/ws/dashboard`	Dashboard WebSocket

Tech Stack

Python 3.11+ with full async/await
FastAPI for HTTP + WebSocket API
SQLite + aiosqlite for zero-config persistence
Pydantic v2 for data validation
httpx for async agent-to-agent communication

License

MIT -- Tim Mecklenburg

Built by Tim Mecklenburg
