Is there any method to Extract Numbered Headings with Docx4j? #574

Alex-Victor · 2024-02-27T07:39:33Z

I'm trying to extract text from a .docx file using docx4j. So far I can extract all of the body text and tables, but I'm having trouble extracting numbered headings and lists, such as:
1. Heading 1
text....
1.1 Heading 2
text....
text....
2. Heading 1
text....
2.1 Heading 2
....
After renaming the .docx file to .zip and opening word/document.xml, I found that the paragraphs for these headings and list items each contain a numPr element, like this:
<w:numPr>
<w:ilvl w:val="0"/>
<w:numId w:val="2"/>
</w:numPr>
<w:numPr>
<w:ilvl w:val="1"/>
<w:numId w:val="2"/>
</w:numPr>

My question: is there an easy way to get the numbering text of these headings and list items (1., 1.1, a), ...)? How can I convert these tags into text?
I'd greatly appreciate your help.
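
To make the question concrete, here is a minimal sketch of the conversion I have in mind, in plain Java without docx4j. It keeps one set of counters per numId, increments the counter at the paragraph's ilvl, restarts the deeper levels, and fills a pattern in the style of w:lvlText (where %1 is the level-0 counter and %2 the level-1 counter). The class name NumberingSketch and the patterns "%1." and "%1.%2" are my own assumptions for illustration; in a real document the patterns, start values and number formats would come from word/numbering.xml, and this sketch handles decimal numbering only. docx4j seems to have a helper for this in org.docx4j.model.listnumbering (the Emulator class), but I have not verified how it is meant to be called.

```java
import java.util.HashMap;
import java.util.Map;

public class NumberingSketch {

    // Running counters per list (one entry per w:numId), indexed by w:ilvl.
    private final Map<Integer, int[]> countersByNumId = new HashMap<>();

    // Hypothetical label patterns per level, in the style of w:lvlText.
    private final String[] levelText;

    public NumberingSketch(String... levelText) {
        this.levelText = levelText;
    }

    // Advances the counter for (numId, ilvl) and returns the rendered label.
    public String next(int numId, int ilvl) {
        int[] counters = countersByNumId.computeIfAbsent(
                numId, k -> new int[levelText.length]);
        counters[ilvl]++;
        // A new item at this level restarts every deeper level.
        for (int i = ilvl + 1; i < counters.length; i++) {
            counters[i] = 0;
        }
        // Substitute %1, %2, ... with the current counter of each level.
        String label = levelText[ilvl];
        for (int i = 0; i <= ilvl; i++) {
            label = label.replace("%" + (i + 1), Integer.toString(counters[i]));
        }
        return label;
    }

    public static void main(String[] args) {
        // Assumed patterns: level 0 renders as "1.", level 1 as "1.1".
        NumberingSketch numbering = new NumberingSketch("%1.", "%1.%2");
        // (numId, ilvl) pairs in document order, as read from each w:numPr.
        int[][] paragraphs = {{2, 0}, {2, 1}, {2, 0}, {2, 1}};
        for (int[] p : paragraphs) {
            System.out.println(numbering.next(p[0], p[1]));
        }
        // Prints: 1.  1.1  2.  2.1 (one per line)
    }
}
```

This reproduces the 1. / 1.1 / 2. / 2.1 sequence from my example above, but it ignores w:start, w:numFmt (letters, roman numerals), level overrides, and numbering that a paragraph inherits from its style.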

Replies: 0 comments
