An exchange-rate oracle
for AI agent labor.

Empirical reference rates for AI agent classes. Patent-pending methodology. Currently in private testing with eight frontier models across five task domains.

What it looks like

Trading-card view of one Agent Class on a single Task Domain. Live exchange rates and quality scores updated daily. Full preview behind authenticated access.

How it works

Cross-platform blind grading
Each platform scores its peers without knowing whose output it's grading. Self-grading bias is removed by construction.
Multiplicative capability coupling
Efficiency acts as a coupling constant on the base value, not an additive factor. Doubling efficiency more than doubles value.
Adaptive sampling
Per-platform, per-domain sampling rates self-adjust based on confidence interval width. Stable performers get sampled less; volatile ones more.
Real-time exchange rates
Pair rates derive from rolling ALU values, FOREX-style. Each platform always has a current rate from its rolling EWMA.
Agent Class as the unit of comparison
Scoring is computed per (Agent Class × Task Domain). An Agent Class is a peer group of agents that compete for similar work — currently the Budget and Premium tiers across Anthropic, OpenAI, Google, and xAI. The framework generalizes to any meaningful grouping as the field expands.
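The adaptive-sampling rule above can be sketched in a few lines. Only the phase bands are published (full grading for the first 10 observations, then 10–30% established, 5–15% trusted); the trusted-phase threshold of 100 observations and the 0.2 CI-width cap below are illustrative assumptions, not the real parameters.

```python
def sampling_rate(n_obs, ci_width):
    """Per-platform, per-domain sampling rate.

    Phase is set by observation count; within a phase, a wider
    confidence interval pushes the rate toward the top of its band.
    The n_obs >= 100 trusted threshold and 0.2 CI cap are assumptions.
    """
    if n_obs < 10:
        return 1.0                                    # probation: grade everything
    lo, hi = (0.05, 0.15) if n_obs >= 100 else (0.10, 0.30)
    spread = min(ci_width / 0.2, 1.0)                 # normalize CI width to [0, 1]
    return lo + spread * (hi - lo)                    # stable -> lo, volatile -> hi
```

A stable, long-observed performer bottoms out at 5%, while a volatile platform just past probation is sampled at up to 30%.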
Glossary of terms

Foundational

ALU (Agent Liquidity Unit)
The composite score that prices an agent class on a specific task domain. Combines five base factors with a multiplicative capability efficiency term. Higher ALU means more value per unit of agent labor.
Agent Class
A peer group of AI agents that compete for similar work and are measured the same way. Currently spans tier (Budget vs Premium) but generalizes to any meaningful grouping — capability bracket, model family, or specialty. The framework scales from 8 frontier models today to N models across M domains tomorrow.
Domain (Task Domain)
The category of work being scored. Current domains: code generation, legal, financial, research, general. Scoring is computed per (Agent Class × Domain) pair, not at the model level.
Iteration
A single scoring event: one prompt sent to all active agents in a class, graded under the methodology, and recorded once. Daily auto-runs accumulate iterations over time.
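An ALU can be sketched as the five base factors blended into a base value that the Capability Efficiency term then scales. The weights and exponent below are illustrative placeholders only; the real coefficients are trade secrets held under NDA.

```python
# Illustrative ALU composition. Every weight and the CE exponent k
# are placeholders -- the actual coefficients are not public.

def alu(quality, trust, demand, scarcity, history, ce, k=1.5):
    """Price an (Agent Class x Task Domain) pair.

    The five base factors blend into a base value; Capability
    Efficiency (CE) then couples multiplicatively, scaling the whole.
    """
    base = (0.40 * quality +      # fast-moving recent performance
            0.25 * trust +        # slow-moving earned reputation
            0.10 * demand +       # held neutral pending marketplace data
            0.15 * scarcity +     # exceedance over the field average
            0.10 * history)       # trend signal
    return base * ce ** k         # k > 1: doubling CE more than doubles ALU
```

With these placeholder weights, an agent class at CE = 2.0 prices at 2^1.5 ≈ 2.83× its CE = 1.0 value, not merely 2×.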

The five base factors

Quality
How good is the answer? Composite of accuracy, completeness, reasoning, format quality, and efficiency. Cross-platform blind-graded to remove self-grading bias.
Trust
Earned reputation. Slow-moving signal that takes 50–100 observations to meaningfully shift. Distinct from Quality, which is the fast-moving recent-performance signal.
Demand
Marketplace request volume. Currently held neutral pending real marketplace integration; the framework supports it as a first-class factor.
Scarcity
How much an agent's domain performance exceeds the field's rolling average. Earned through measured performance, not assumed.
History
Trend signal. Are token values rising, declining, or steady over time?
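Of the five factors, Scarcity is the most mechanical to illustrate: it is earned only when measured performance exceeds the field's rolling average. A minimal sketch — the zero floor and the plain mean are assumptions, not the published normalization:

```python
def scarcity(agent_score, field_scores):
    """Exceedance over the field's rolling average, floored at zero.

    Scarcity is earned through measured performance: it is only
    positive when this agent beats the mean of the active field.
    """
    field_avg = sum(field_scores) / len(field_scores)
    return max(0.0, agent_score - field_avg)
```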

The CE innovation

Capability Efficiency (CE)
Quality-per-token, normalized against the field. Acts as a coupling constant on the base value, not an additive factor. The patent's central claim.
Coupling constant (vs additive factor)
Multiplicative coupling means doubling efficiency more than doubles value. Treating efficiency additively dilutes that signal. The distinction is patent-relevant.
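The distinction shows up numerically. In the sketch below (the exponent k and weight w are illustrative placeholders, not the patented coefficients), doubling CE under multiplicative coupling lifts value by 2^k, while the additive treatment barely moves it:

```python
def coupled(base, ce, k=1.5):
    """Multiplicative coupling: CE scales the entire base value."""
    return base * ce ** k

def additive(base, ce, w=0.2):
    """Additive treatment: CE is just one more weighted term."""
    return base + w * ce

base = 0.65
lift_coupled = coupled(base, 2.0) / coupled(base, 1.0)     # 2 ** 1.5, about 2.83x
lift_additive = additive(base, 2.0) / additive(base, 1.0)  # about 1.24x: diluted
```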

Reliability composition

Reliability
The composed reliability number on each card. Equals trust × reliability rate. The number that actually feeds ALU.
Reliability rate
Fraction of recent API calls that returned successfully. Real uptime measurement, not assumed.
Effective trust
Internal name for the composed Reliability number — kept distinct in the API for audit.
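The composition is a straight product, sketched here with a boolean call log standing in for the real uptime telemetry:

```python
def reliability_rate(call_log):
    """Fraction of recent API calls that returned successfully."""
    return sum(call_log) / len(call_log)

def effective_trust(trust, call_log):
    """The composed Reliability number that feeds ALU: earned trust
    discounted by measured uptime, never inflated by it."""
    return trust * reliability_rate(call_log)
```

A platform with trust 0.8 that failed 1 of its last 10 calls composes to 0.8 × 0.9 = 0.72.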

Methodology pillars

Cross-platform blind grading
Each platform scores its peers without knowing whose output it's grading. Self-grading bias is removed by construction.
Hybrid scoring
A blend of deterministic rubric checks and cross-grade evaluations. Avoids both mock-only data and pure-rubric heuristic noise.
Adaptive sampling
Per-platform, per-domain sampling rates self-adjust based on confidence interval width. Three phases: probation (full grading for the first 10 observations), established (10–30%), trusted (5–15%).
Anomaly detection
3-sigma threshold with a 2-flag confirmation window. Confirmed anomalies escalate the platform back to full grading and apply a trust penalty.
Bias correction
Per-grader rolling deviation tracking with statistical significance gating. Credibility penalties self-calibrate with sample volume.
EWMA
Exponentially weighted moving average. The core statistical tool. Quality uses a fast EWMA; Trust uses a slow EWMA. Same input signal, different frequencies.
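The fast/slow split can be sketched with one update rule and two smoothing constants (the alpha values below are illustrative, not the production parameters):

```python
def ewma_update(prev, observation, alpha):
    """One EWMA step: alpha sets how fast new evidence moves the average."""
    return alpha * observation + (1 - alpha) * prev

quality = trust = 0.5
for score in [0.9, 0.9, 0.9]:                          # same input signal ...
    quality = ewma_update(quality, score, alpha=0.30)  # ... fast frequency
    trust = ewma_update(trust, score, alpha=0.02)      # ... slow frequency
# quality has moved most of the way to 0.9; trust has barely shifted
```

Three strong scores move the fast Quality signal sharply while the slow Trust signal stays near its prior, which is why Trust needs 50–100 observations to meaningfully shift.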

Output artifacts

Exchange rate
Pair ratio between two agent classes' ALU values. Like FOREX — expresses how much of A's labor you get per unit of B's.
Latency p50 / p95
Round-trip API response time at the 50th and 95th percentiles. p95 is the realistic worst case for agent routing decisions.
Methodology version
Stamp identifying which methodology version produced the data. Bumped whenever the scoring weights, cross-grade protocol, or ALU formula changes.
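An exchange rate is then just the ratio of two rolling ALU values, FOREX-style:

```python
def exchange_rate(alu_a, alu_b):
    """Units of B's labor equivalent to one unit of A's.

    Because each side is a rolling EWMA of ALU, every platform has a
    current rate even between scoring iterations.
    """
    return alu_a / alu_b

# A class pricing at 1.84 ALU against one at 0.92 ALU: one unit of
# A's labor buys two of B's.
rate = exchange_rate(1.84, 0.92)
```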

IP & process

Provisional patent
USPTO filing that establishes a priority date. Two were filed on April 16, 2026.
Trade secret coefficients
Specific weights, exponents, and parameter values intentionally not disclosed publicly. Available under NDA.
Field-relative
Scoring that is normalized against the active field of agents at any moment. Adding a new agent shifts everyone's relative position. Distinct from absolute scoring.

Intellectual property

Two provisional patents filed at the United States Patent and Trademark Office — April 16, 2026.

Co-inventors: James Gill, David Schwartz.
Joint IP Ownership Agreement executed.
Non-provisional conversion pending Q1 2027.

Request access

Authenticated visitors can view a fuller methodology preview with redacted sample data. Investors and IP attorneys: please include your firm in the message body.

Send Access Request

Or write directly to james.gill@agentvaluelab.com.