Is it GLM-OCR or PP-DocLayoutV3

Hello,

I’m encountering issues with document information extraction and would like to clarify whether the problem could be related to the OCR stage or to the PP-DocLayoutV3 component.

I am working with complex industrial documents, and I have tested the pipeline across more than 30 systems/models without success. In all cases, the extraction results are either incomplete or inaccurate.

Could you please help with the following questions:

* Are extraction failures more likely caused by OCR inaccuracies or by limitations in PP-DocLayoutV3?
* Are there recommended debugging steps or diagnostics to isolate whether the issue originates in the OCR stage or in layout detection?
* Does PP-DocLayoutV3 have known limitations when handling highly complex or dense industrial documents?

Any guidance on how to better troubleshoot or improve performance in such scenarios would be greatly appreciated.

Thank you.

Is it GLM-OCR or PP-DocLayoutV3 #178
