HTML elements are reordered. #2267
Comments
Hi there, I can't seem to replicate it. Can you show your code? With your example HTML in try.jsoup: https://try.jsoup.org/~KDhcX48QN1pnbYhCj-WlusIyAWo The result of
Which is the same as Chrome's rendered output:
I looked into the example you shared here. I copied both the "input HTML" and the "renderer output" into an HTML renderer, and it seems like the HTML structure is being reordered, as shown in the attached image. I should also mention I'm using version 1.18.3 (which I believe is the latest version) as per https://github.com/jhy/jsoup/blob/master/CHANGES.md. Interestingly, in the example you sent, the text appears in the correct order, but the HTML itself is still being rearranged. Do you know why this might be happening?
Sorry, but I can't make out what you're trying to show me in that image. Given how badly formed the HTML is (incorrect closers for formatters, etc.), there is a chance that the adoption agency algorithm is being executed differently by jsoup and by your browser (which browser are you testing with? You haven't said). Or something akin. To get to the bottom of it, my suggestion would be to write some debug code that traces the specific order of the DOM for both jsoup and the browser. Add IDs to each element so we can see which one is which. Then traverse the doc and serialize the tag + ID. For the browser you can do this in JS. Then the difference is likely to be apparent. But I am confused as to why my original output is different to yours. Can you show your code and confirm which version you're on?
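For the comparison step suggested above, here is a minimal sketch in plain Java (no jsoup dependency) that finds the first position where two "tag + ID" traces diverge. It assumes each trace has already been collected as a list of strings in document order, one from jsoup and one from the browser. The class name `TraceDiff`, the `tag#id` entry format, and the sample traces are all hypothetical and only for illustration; they are not taken from the reporter's HTML.

```java
import java.util.List;

public class TraceDiff {

    /**
     * Returns the index of the first entry at which the two traces differ,
     * or -1 if they are identical. If one trace is a strict prefix of the
     * other, the returned index is the length of the shorter trace.
     */
    public static int firstDifference(List<String> a, List<String> b) {
        int shared = Math.min(a.size(), b.size());
        for (int i = 0; i < shared; i++) {
            if (!a.get(i).equals(b.get(i))) {
                return i;
            }
        }
        return a.size() == b.size() ? -1 : shared;
    }

    /** Returns the entry at index i, or a marker if the trace has ended. */
    public static String entryAt(List<String> trace, int i) {
        return i < trace.size() ? trace.get(i) : "<end of trace>";
    }

    public static void main(String[] args) {
        // Hypothetical traces: each entry is "tag#id" in document order.
        List<String> jsoupTrace = List.of("div#n0", "b#n1", "span#n2", "p#n3");
        List<String> browserTrace = List.of("div#n0", "b#n1", "p#n3", "span#n2");

        int i = firstDifference(jsoupTrace, browserTrace);
        if (i < 0) {
            System.out.println("Traces are identical");
        } else {
            System.out.println("First difference at index " + i
                    + ": jsoup=" + entryAt(jsoupTrace, i)
                    + ", browser=" + entryAt(browserTrace, i));
        }
    }
}
```

Because every element carries a unique ID, the first mismatching entry points at the element the two parsers placed differently, which is a reasonable place to start looking at how the misnested formatting tags were handled.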
I've uploaded an example.txt file that contains the HTML example. Unfortunately I couldn't get the example to be any smaller, so I apologise for the largeish HTML example.
When rendering the original HTML file in an HTML renderer you get:
Once you parse the HTML with Jsoup.parse(html) and render the new HTML with a renderer once again you get:
Notice how the text has been re-ordered. One thing that may help in investigation is that when you remove all of the inline style attributes, this issue no longer occurs and the text is in the correct order after parsing with Jsoup.parse().
Is this expected behaviour?