Counterfactual testing outside of loopback

As discussed for the ICLR submission, the counterfactual tests are part of our evaluation of the LLMs' performance, not part of the proposed Interpretation Assistant architecture (counterfactual tests aren't generally available). The current implementation treats counterfactual test failures like any other loopback error, which skews the reported success rate.

We need to run the counterfactual testing separately from the loopback (conceptually, not necessarily "algorithmically") and then revisit RQ2.
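
One way to picture the separation is a report that keeps the two figures apart, so a counterfactual failure never counts against the loopback success rate. This is only a sketch: the `Outcome` record, its fields, and `summarise` are hypothetical names, not taken from the codebase, and it assumes each test records whether it had counterfactual tests at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    # Hypothetical record of one test case; field names are illustrative only.
    name: str
    loopback_ok: bool                  # loopback finished without errors
    counterfactual_ok: Optional[bool]  # None when the test has no counterfactual tests


def summarise(outcomes):
    """Report loopback success and counterfactual pass rate as separate figures."""
    total = len(outcomes)
    loopback_passed = sum(o.loopback_ok for o in outcomes)
    # Only tests that actually have counterfactual tests enter the second rate.
    with_cf = [o for o in outcomes if o.counterfactual_ok is not None]
    cf_passed = sum(o.counterfactual_ok for o in with_cf)
    return {
        "loopback_success_rate": loopback_passed / total if total else 0.0,
        "counterfactual_pass_rate": cf_passed / len(with_cf) if with_cf else None,
    }
```

Whether the counterfactual rate should be computed over all tests or only over those that passed loopback is an open choice that this sketch does not settle.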

- [ ] Fail validation if any `testing-variables` are not found in the dataset
- [ ] Fail validation if any counterfactual variable fails to generate a counterfactual output under the gold solution
- [ ] Report the proportion of tests that have counterfactual tests
- [ ] Push call to `evaluateExpression` inside `validate`
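
A rough sketch of the first and third items, assuming (hypothetically) that each test case is a dict with an optional `testing-variables` list and that the dataset exposes a set of column names. The helper names and the `validate` signature here are illustrative and not the existing `validate`. The gold-solution check is left out because it depends on how `evaluateExpression` is called.

```python
def missing_testing_variables(test_case, dataset_columns):
    """Return the testing variables that the dataset does not contain."""
    declared = test_case.get("testing-variables", [])
    return [v for v in declared if v not in dataset_columns]


def validate(test_case, dataset_columns):
    """Raise ValueError if any testing variable is absent from the dataset."""
    missing = missing_testing_variables(test_case, dataset_columns)
    if missing:
        raise ValueError(f"testing-variables not found in dataset: {missing}")


def counterfactual_coverage(test_cases):
    """Proportion of test cases that declare at least one testing variable."""
    if not test_cases:
        return 0.0
    covered = sum(1 for t in test_cases if t.get("testing-variables"))
    return covered / len(test_cases)
```

Treating "has counterfactual tests" as "declares at least one testing variable" is an assumption; if the real criterion differs, only `counterfactual_coverage` would need to change.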

Done/dropped:

- [x] Fix weird way to exit loop
- [x] Report on total number of test "problems"
- [x] Why is the `logs` folder still populated?

See also:
- #206 

Counterfactual testing outside of loopback #275
