-
Notifications
You must be signed in to change notification settings - Fork 290
Fix unwanted text‑node merging on libxml2 2.12+ by manually updating doc pointers #573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I will be adding tests for this ticket in the next few hours sorry. |
7643fdd to
ec641d9
Compare
|
tests have been pushed |
|
I'm not familiar with libxml2, but the explanation of the change here sounds reasonable, so this looks good in principle. |
Ah ok I will work on this thank you. |
ec641d9 to
4a3fd62
Compare
|
Ok @rfm I have tried this on different linux platforms and I believe this fixes the issue. I can't seem to be able to run the CI so can you pleas run it for me? |
|
Looks like it's fine on linux but failing on windows :-( |
|
Ah ok. Thank you so much for trying I need to setup a Windows environment to mirror your CI environment. I will start on that now. I really appreciate you being patient there is actually several bugs in newer libxml2 versions so I guess I'm not suprised. |
|
Thanks for working on this. It's good to see someone knows what they are doing with libxml2 :-) |
4a3fd62 to
f9273f3
Compare
Of course I don't mind! I believe I have fixed this now I tested all Windows platforms except MSVC because I don't have any of that setup. I will verify all the linux builds tomorrow morning before we run the checks again. Turns out Windows is on an even later version of Libxml2 and it has problems with namespace prefixes. Kind of amazing how many bugs different versions of libxml have. |
|
Looking at the logs, I think what's still wrong with the testcases still is the use of the non-portable exception handler code '@Try {...} @catch {...}' |
f9273f3 to
e3dde8c
Compare
|
Ah ok I have made that change. I did not realize that was the difference between these handlers. Thank you for that. |
|
FYI NeXTstep exception handling used those macros to wrap setjmp() calls for exception handling. Newer compilers added support for exception handling as an optional extension to the language itself. But if you are using an older compiler, or a compiler where the appropriate flag to turn on the language extensions has not been used, then you need the older mechanism. |
Ah ok thank you for clarifying that it makes sense. I believe all the checks have passed now thank you so much @rfm. |
rfm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks good now, thanks.
|
One note ... the testcases show that the code contains some memory leaks; bugs in libxml2 I expect. If you define the environment variable GNUSTEP_WITH_ASAN=1 before building the base library and running the tests, you will see that description.m and xpath.m fail (due to leaked memory), and the log file reports that the memory leaked was allocated by xmlStrndup and xmlStrdup I'm not sure how important it is, and tracking down the leaks would likely be very time consuming (perhaps compiling libxml2 with address sanitizer would help), but I thought you would want to know. |
Ok @rfm I am probably going to look into this still but it may be tomorrow. Thank you for merging for now. |
This change updates our libxml2 integration to correctly handle node adoption and manual linking when building against libxml2 2.12.0 or newer. In 2.12.x, libxml2 rewrote significant portions of its tree-manipulation code, including adding internal normalization during both xmlSetTreeDoc and manual node linking. Previously, manually linking child nodes avoided automatic text‑node merging, but in 2.12.x the library now performs merging internally even when bypassing xmlSetTreeDoc. To prevent unwanted normalization, this update adds a manual doc‑update routine for subtree nodes that sets the doc pointer without triggering merges, ensures every text node receives a unique name pointer so libxml2 will not merge them based on pointer equality, updates attributes and namespace nodes safely, and applies these updates before assigning the parent to avoid libxml2 normalization during insertion. For older libxml2 versions we continue to use xmlDOMWrapAdoptNode or xmlSetTreeDoc as before. This fixes text‑node coalescing introduced by libxml2’s internal changes and restores the expected behavior for manually linked XML nodes.