This document defines the public dual-layer grading model used in the HaleES Architecture Specification.
The rubric combines:
- Gradient scoring (0–100) for evaluative detail
- Binary decision (0|1) for enforceable pass/fail gating
Each output is graded across five required categories:
- accuracy
- Correctness of content versus contract requirements and inputs.
- efficiency
- Resource-consciousness and avoidance of unnecessary complexity.
- constraint_adherence
- Compliance with explicit contract constraints and boundaries.
- quality
- Clarity, structure, usability, and professional completeness.
- timeliness
- Alignment to required response window or operational tempo.
All five categories must be present for a valid grading result.
- Each category score: 0 to 100
global_score: 0 to 100confidence: 0.0 to 1.0 (tracked independently)
Interpretation guideline:
- 90–100: excellent
- 85–89: acceptable pass band
- 70–84: near-pass but requires revision
- below 70: insufficient
Default public calculation uses equal weighting across the five categories:
[ \text{global_score} = \text{round}\left(\frac{accuracy + efficiency + constraint_adherence + quality + timeliness}{5}\right) ]
Rounding method should be consistent per implementation (recommended: nearest integer).
Decision threshold:
binary_decision = 1ifglobal_score >= 85binary_decision = 0ifglobal_score < 85
This threshold is the normative public baseline.
confidence is recorded separately from pass/fail.
- Confidence does not override the threshold rule.
- A high confidence score cannot convert a failing global score into pass.
- A low confidence score does not automatically fail a passing output, but may trigger human review depending on policy.
The grading model is intentionally dual-layer:
- 0–100 evaluates the degree of quality and compliance.
- 0|1 decides acceptance status.
Operational rule:
- No decision exists without scoring.
- No scoring matters without a decision.
This creates both explainability (why) and enforceability (what outcome).
When binary_decision = 0, system behavior should be:
- Append specific feedback by category.
- Re-run execution against the same contract (or a versioned amendment if approved).
- Re-grade and re-evaluate decision.
- Stop once pass is achieved or max iterations reached.
Default max iterations: 5.
If max iterations are exhausted without passing, mark result as fail and escalate per local policy.
The following sample is included as a normative example:
- accuracy: 92
- efficiency: 88
- constraint_adherence: 96
- quality: 90
- timeliness: 94
- global_score: 92
- confidence: 0.91
- binary_decision: 1
- result: PASS
{
"rubric_version": "halees-dual-layer-v1",
"category_scores": {
"accuracy": 92,
"efficiency": 88,
"constraint_adherence": 96,
"quality": 90,
"timeliness": 94
},
"global_score": 92,
"threshold": 85,
"binary_decision": 1,
"confidence": 0.91,
"result": "PASS",
"iteration": 2,
"feedback": []
}### Grading Result
- rubric_version: halees-dual-layer-v1
- accuracy: 92
- efficiency: 88
- constraint_adherence: 96
- quality: 90
- timeliness: 94
- global_score: 92
- threshold: 85
- binary_decision: 1
- confidence: 0.91
- result: PASS
- iteration: 2This rubric specifies public grading semantics and reporting structure only.
It does not disclose closed-source implementation details such as internal model routing logic, proprietary grading engines, hosted enforcement internals, or production runtime infrastructure.