Conversation
There was a problem hiding this comment.
Code Review
This pull request updates the LLM document parsing logic by introducing flags to track parsing success and the presence of errors, which are then used to determine the final status. Feedback suggests improving robustness by verifying that the 'errors' field is a list to avoid potential TypeErrors and refining the output labels and reasons for cases where parsing fails or no errors are detected.
| errors = result_data.get("errors", []) | ||
| parse_ok = True | ||
| errors_nonempty = len(errors) > 0 |
There was a problem hiding this comment.
The current implementation is not robust against unexpected JSON structures from the LLM. If the errors field is not a list (e.g., a string or null), len(errors) will raise a TypeError, and the subsequent loop will also fail. It's safer to verify that errors is indeed a list before proceeding.
| errors = result_data.get("errors", []) | |
| parse_ok = True | |
| errors_nonempty = len(errors) > 0 | |
| errors = result_data.get("errors", []) | |
| if isinstance(errors, list): | |
| parse_ok = True | |
| errors_nonempty = bool(errors) | |
| else: | |
| errors = [] |
|
|
||
| result = EvalDetail(metric=cls.__name__) | ||
| result.status = False | ||
| result.status = (not parse_ok) or errors_nonempty |
There was a problem hiding this comment.
The updated logic for result.status correctly identifies issues, but the subsequent code for constructing result.label has two problems:
- If no errors are found (
statusisFalse), the label becomes["."]becausetypesandnamesare empty. It should ideally be["QUALITY_GOOD"]. - If parsing fails (
parse_okisFalse), the label also becomes["."], which is not descriptive.
Returning early for the success case and setting a specific label for parsing errors would improve the output quality.
| result.status = (not parse_ok) or errors_nonempty | |
| result.status = (not parse_ok) or errors_nonempty | |
| if not result.status: | |
| result.label = ["QUALITY_GOOD"] | |
| result.reason = [json_str] if 'json_str' in locals() else [response] | |
| return result | |
| if not parse_ok: | |
| types, names = ["QUALITY_BAD"], ["ParseError"] |
No description provided.