We have observed an error with DIA-NN 1.8.1 reading certain mzML files created with timsconvert:
E:\EXP24054_2024ms0528X1_A_GA1_1_27896.mzML(1) : parseOffset() 2: Syntax error parsing XML.
Running DIA-NN under strace suggests that the crash occurs when it attempts to read the <indexList> element, which is not located at the reported indexListOffset.
The issue with the mzML can be verified using tail:
$ tail -c 200 EXP24054_2024ms0528X1_A_GA1_1_27896.mzML
idRef="scan=51335">846541370</offset>
</index>
</indexList>
<indexListOffset>846547400</indexListOffset>
<fileChecksum>2a757d5a71d7aca0788a302e8b881e42d4045446</fileChecksum>
$ tail -c +846547400 EXP21063_2022bruker038bX25_A_BA4_1_757.mzML | head
ML>
<indexList count="1">
<index name="spectrum">
<offset idRef="scan=1">3832</offset>
<offset idRef="scan=2">143201</offset>
<offset idRef="scan=3">149715</offset>
<offset idRef="scan=4">156293</offset>
<offset idRef="scan=5">161496</offset>
<offset idRef="scan=6">167723</offset>
<offset idRef="scan=7">174251</offset>
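The same check can be scripted so that it runs over many files. The following is a minimal sketch, not part of timsconvert or DIA-NN: the function name check_index_list_offset and the tail_bytes parameter are hypothetical, and it assumes indexListOffset is meant to be the zero-based byte position of the first character of <indexList.

```python
import re


def check_index_list_offset(path, tail_bytes=4096):
    """Return (reported_offset, is_valid) for an indexed mzML file.

    Assumption: <indexListOffset> should hold the zero-based byte
    position at which the text "<indexList" starts.
    """
    with open(path, "rb") as fh:
        # Read only the end of the file, where the offset element lives.
        fh.seek(0, 2)
        size = fh.tell()
        fh.seek(max(0, size - tail_bytes))
        tail = fh.read()

        match = re.search(rb"<indexListOffset>(\d+)</indexListOffset>", tail)
        if match is None:
            raise ValueError("no <indexListOffset> element near end of file")
        reported = int(match.group(1))

        # Jump to the reported offset and inspect what is actually there.
        fh.seek(reported)
        head = fh.read(len(b"<indexList"))

    return reported, head == b"<indexList"
```

For a file affected by this issue, the second element of the returned tuple would be False.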
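In the tail output above, the data at the reported offset begins partway through the preceding closing tag (ML>) rather than at the start of <indexList>, so indexListOffset points a few bytes too early. The root cause of this error appears to be the update_spectra_count() function, which is not guaranteed to preserve byte offsets in the file.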
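As a purely hypothetical illustration of how editing a file after its offsets have been recorded can invalidate them, consider the toy example below. It does not reproduce the actual update_spectra_count() code. The document contents and variable names are invented, and it only shows that replacing a short attribute value with a longer one shifts every later byte.

```python
# Toy document: the offset of <indexList is recorded while the
# spectrum count is still a one-character placeholder.
placeholder = b'<spectrumList count="0"><spectrum/></spectrumList><indexList/>'
recorded_offset = placeholder.index(b"<indexList")

# Later, the count is rewritten with the real value, which is longer.
updated = placeholder.replace(b'count="0"', b'count="51335"', 1)
actual_offset = updated.index(b"<indexList")

# Every byte after the edit has moved by the difference in length.
shift = actual_offset - recorded_offset
print(f"recorded={recorded_offset} actual={actual_offset} shift={shift}")
print(updated[recorded_offset:recorded_offset + 10])
```

In this toy case the recorded offset ends up pointing before <indexList, which is the same kind of mismatch as the one shown by tail. Either recomputing the offsets after the rewrite or writing a fixed-width count would avoid the shift.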
We will provide a PR patching this issue.