Feature Request: Docx native capabilities #34
Comments
Below is a copy of the docx
|
this seems to be true, figure captions are just paragraphs with a different style afaik |
There was talk of using a similar mechanism as the MMD ODT writer (where figure and caption are contained in a text box). But I think that never really happened as using keep-with-next works well enough... |
Hi @iandol. I'm not really sure what you mean by "keep-with-next". Can you please explain? My understanding is that docx natively supports figure captions and automatic figure numbering. Pandoc-fignos and friends should be using that. What I need is a model The model docx file should include a single captioned figure with automatic figure numbering turned on (i.e., nothing hard-coded). A short sentence with a reference to that figure would be helpful as well. To obtain the |
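For anyone preparing such a model file: a docx is a zip archive, and the main body lives in the `word/document.xml` member. The following is a minimal sketch, not part of pandoc-fignos, that pulls that member out and lists any field instructions it contains, which is a quick way to check whether the numbering is native or hard-coded. The function names `read_document_xml` and `field_instructions` are made up for this illustration.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx):
    """Return the raw bytes of word/document.xml from a path or file-like object."""
    with zipfile.ZipFile(docx) as archive:
        return archive.read("word/document.xml")


def field_instructions(xml_bytes):
    """List field instructions: simple fields first, then complex-field instrText."""
    root = ET.fromstring(xml_bytes)
    simple = [(el.get(f"{{{W}}}instr") or "").strip()
              for el in root.iter(f"{{{W}}}fldSimple")]
    complex_fields = [(el.text or "").strip()
                      for el in root.iter(f"{{{W}}}instrText")]
    return simple + complex_fields


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real docx, so the sketch runs without a file.
    body = (f'<w:document xmlns:w="{W}"><w:body><w:p>'
            '<w:fldSimple w:instr=" SEQ Figure \\* ARABIC "/>'
            '</w:p></w:body></w:document>')
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", body)
    buf.seek(0)
    print(field_instructions(read_document_xml(buf)))
```

An empty list for a document that visibly contains "Figure 1" would suggest the number is plain text rather than a field.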
I'm not using fignos yet, so I can't comment on whether it is outputting correctly formed DOCX files. But just in case, here is a document.xml, generated in Word 2016. I dragged an image into a new blank document, then added a default caption with the caption text "This is a simple test". Then I added a paragraph in front of the picture: "This is a simple test to see if [Figure 1] is hyperlinked?" The bracketed text is a cross-reference I inserted. Line 101 of the XML is where the caption paragraph starts: https://gist.github.com/iandol/a3d7a456776002719e2ea139e681790e
What I meant above is that MultiMarkdown wraps the figure and caption in a text box, but Pandoc doesn't. Pandoc uses a paragraph style mechanism to keep the caption always underneath the figure. In Word, selecting "keep-with-next" makes two adjacent paragraphs "stick together", so they behave as if they were grouped in a text box. MMD uses a text frame/box in DOCX and ODT. I got confused: in my comment above I mentioned the Pandoc ODT writer, but I meant MMD (MMD also uses auto-numbering for figures).
Here is some tangential discussion about using frames for captioned figures in Pandoc (for ODT output but some discussion on DOCX): And possibly: jgm/pandoc#3177 may have an influence on fignos as it is resolved... |
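In WordprocessingML, "keep with next" is the `w:keepNext` element inside a paragraph's `w:pPr`. The sketch below builds the paragraph properties for a figure paragraph with that flag set. It is an illustration only: it sets the flag directly on the paragraph, whereas a style-based approach would set it once in the style definition, and the helper name `figure_paragraph_properties` is hypothetical. The style id is borrowed from the XML posted later in this thread.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace; registering it keeps the familiar "w:" prefix on output
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def figure_paragraph_properties(style_id="CaptionedFigure"):
    """Build <w:pPr> with a paragraph style followed by the keep-with-next flag."""
    ppr = ET.Element(w("pPr"))
    ET.SubElement(ppr, w("pStyle"), {w("val"): style_id})
    # keepNext asks Word to keep this paragraph on the same page as the next one,
    # which is what holds a figure and its caption together
    ET.SubElement(ppr, w("keepNext"))
    return ppr


if __name__ == "__main__":
    print(ET.tostring(figure_paragraph_properties(), encoding="unicode"))
```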
Thank you, @iandol. This is enormously helpful. I will look at the docx and see what pandoc-fignos can do to better support docx. We can have a look into odt after that. Thanks for the heads up on jgm/pandoc#3177. I have subscribed to it and will have pandoc-fignos and friends adjust to the new behaviours as they emerge. |
I can also add the docx/odt output from MMD which does generate text frame and auto-numbered figure legend if it helps. As pointed out on jgm/pandoc#2401 one issue is hard coding the English "Figure" text in the legend, and I'm not sure if the easiest solution isn't allowing a YAML variable for the user to change. |
A possible solution is to replace entirely pandoc's Image element with custom ooxml (including the caption), and then insert custom ooxml for the figure reference. For this I need a model document.xml file. One difficulty with what you posted, @iandol (although much appreciated), is that it does not appear that the native cross-referencing mechanism for docx was used. The figure number appears to be hard-coded. I could be mistaken. OOXml is pretty hard for a human to read. I tried to generate it anew using LibreOffice. Unfortunately, when I imported the resulting docx file back into LibreOffice, the cross-references were broken. It turns out that this is a long-standing issue (since 2011!) with no end in sight: see here. So, I am still needing the following to make progress: A docx with a captioned figure (using Word's native caption feature), automatic numbering of that figure, and a native reference in some text to the figure (i.e., not hard-coded). Cheers, |
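To make the "custom ooxml for the figure reference" idea concrete, here is a rough sketch of how a filter could emit a native `REF` field as a raw `openxml` inline in pandoc's JSON AST. It is not the pandoc-fignos implementation. The helper names and the bookmark name `fig_demo` are hypothetical, and the text between the `separate` and `end` markers is only the cached result that Word replaces when the field is updated.

```python
import json
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def ref_field_xml(bookmark, cached_text):
    """Runs of a complex field: begin, instruction, separate, cached result, end."""
    return (
        '<w:r><w:fldChar w:fldCharType="begin"/></w:r>'
        f'<w:r><w:instrText xml:space="preserve"> REF {bookmark} \\h </w:instrText></w:r>'
        '<w:r><w:fldChar w:fldCharType="separate"/></w:r>'
        f'<w:r><w:t>{escape(cached_text)}</w:t></w:r>'
        '<w:r><w:fldChar w:fldCharType="end"/></w:r>'
    )


def raw_openxml(xml):
    """Pandoc JSON AST node for a raw inline that the docx writer passes through."""
    return {"t": "RawInline", "c": ["openxml", xml]}


if __name__ == "__main__":
    node = raw_openxml(ref_field_xml("fig_demo", "Figure 1"))
    # Sanity check: the fragment parses once the w: prefix is declared on a wrapper
    ET.fromstring(f'<root xmlns:w="{W}">{node["c"][1]}</root>')
    print(json.dumps(node, indent=2))
```

A filter would substitute such a node for the reference it recognises in the input. The target bookmark still has to exist around the caption number in the generated document.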
I've generated a sample document, hope this helps. |
I just saw this pandoc filter, which would be a huge improvement to my workflow if it used the right way of producing the references in docx. However, there are many different ways to represent a figure caption number, both in the caption itself and in the text referring to it. The document supplied by @tstenner is a good one, but I typically do not include the whole caption text in the reference but just the number. Here’s @tstenner’s example the way I typically use figures and references: Figure_example2.docx The style for the figure caption is called „Beschriftung“ in document.xml (a German term for caption). I guess the formatting of the figure number is encoded in Word’s format for this caption type and can be changed in Word itself. Thus, this is nothing pandoc-fignos must do if the user wants a different numbering scheme. Right? Pandoc-fignos could provide these different schemes but it would be OK to just use one default setting. In the text reference to the figure, however, if the user chooses to include the caption text, this text is part of the content of the I hope this gives you sufficient material to actually implement the feature. That would be great! Torsten |
Hello. I am trying to export from Org to DOCX, but I don't get a numeric reference. I would like to know if someone can help me. I create the LaTeX file with Org, and then do org-latex-pandoc+fignos-docx.tar.gz pandoc 2.0.5 Compiled with pandoc-types 1.17.3, texmath 0.10, skylighting 0.5 |
Hi, @broncodev. Pandoc-fignos only works for markdown input files. |
I have created a Wiki page to help formulate docx support in pandoc-fignos: https://github.com/tomduck/pandoc-fignos/wiki/Development#docx-output This page describes the progress and what needs to be done next to solve the docx problem. |
In https://github.com/tomduck/pandoc-eqnos/wiki/Development @tomduck mentioned the issue below: Yes, that is correct. There is a document about this issue. |
in https://github.com/tomduck/pandoc-eqnos/wiki/Development @tomduck mentioned the issue below:
I have created a blank docx file and written an OMML equation manually.
What I suggest is not to override pandoc's implementation of equations; I think it involves much more work than you have considered. |
in https://github.com/tomduck/pandoc-eqnos/wiki/Development @tomduck mentioned the issue below:
I have tested the original equation numbering in docx and I have to say, it is a very stupid design. To implement the correct form, I have to do a lot of dirty work, as listed below:
Then I can cite the equation correctly. After doing this, the XML document looks like this:
Hardly any docx user uses this stupid method, because there is third-party software such as MathType. In my opinion, pandoc-xnos should still use hard-coded numbers but use a table style to lay out the equation. |
@tomduck Thanks. I will try by first exporting latex to markdown then (or something similar). |
@shixuguo @tomduck
So the changes are:
The equation number (between the brackets) remains hard-coded, but I don't have a problem with that. |
"Pandoc-fignos only works for markdown input files." It would be great if you could clarify this at the beginning of the home page. That would save a lot of frustration. Also, would it work for the native (native-pandoc) format? (I couldn't make it work.) |
I am very interested in what pandoc-fignos does, but I work with RST. It would be great if fignos handled RST input as well. |
@tomduck FWIW, I tried to solve the To Do task on the Docx output wiki page. I've created a Word document that comes close to what the desired docx output should be. It uses native docx capabilities for numbering the figures and for referencing. As asked on the wiki page, the doc has one captioned figure and one reference to it. I replaced the hardcoded 1 in "Figure 1: Caption" with a numbered caption (In Word: Reference > Insert Caption). I created the reference using Reference > Insert reference. Here's the xml from word/document.xml:
<?xml version="1.0" encoding="UTF-8"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se wp14">
<w:body>
<w:p w:rsidR="00BD015B" w:rsidRDefault="00EF2D29">
<w:pPr>
<w:pStyle w:val="CaptionedFigure" />
</w:pPr>
<w:bookmarkStart w:id="0" w:name="fig:1" />
<w:r>
<w:rPr>
<w:noProof />
<w:lang w:val="fr-FR" w:eastAsia="fr-FR" />
</w:rPr>
<w:drawing>
<wp:inline distT="0" distB="0" distL="0" distR="0">
<wp:extent cx="914400" cy="457200" />
<wp:effectExtent l="0" t="0" r="0" b="0" />
<wp:docPr id="1" name="Picture" descr="Figure 1: Caption" />
<wp:cNvGraphicFramePr />
<a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:nvPicPr>
<pic:cNvPr id="0" name="Picture" descr="img1.jpg" />
<pic:cNvPicPr>
<a:picLocks noChangeAspect="1" noChangeArrowheads="1" />
</pic:cNvPicPr>
</pic:nvPicPr>
<pic:blipFill>
<a:blip r:embed="rId8" />
<a:stretch>
<a:fillRect />
</a:stretch>
</pic:blipFill>
<pic:spPr bwMode="auto">
<a:xfrm>
<a:off x="0" y="0" />
<a:ext cx="914400" cy="457200" />
</a:xfrm>
<a:prstGeom prst="rect">
<a:avLst />
</a:prstGeom>
<a:noFill />
<a:ln w="9525">
<a:noFill />
<a:headEnd />
<a:tailEnd />
</a:ln>
</pic:spPr>
</pic:pic>
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing>
</w:r>
</w:p>
<w:p w:rsidR="00BD015B" w:rsidRDefault="00EF2D29" w:rsidP="00B227A1">
<w:pPr>
<w:pStyle w:val="Image Caption" />
</w:pPr>
<w:bookmarkStart w:id="1" w:name="_Ref12885797" />
<w:r>
<w:t xml:space="preserve">Figure </w:t>
</w:r>
<w:fldSimple w:instr=" SEQ Figure \* ARABIC ">
<w:r w:rsidR="00B227A1">
<w:rPr>
<w:noProof />
</w:rPr>
<w:t>1</w:t>
</w:r>
</w:fldSimple>
<w:bookmarkEnd w:id="1" />
<w:r>
<w:t>: Cap</w:t>
</w:r>
<w:bookmarkStart w:id="2" w:name="_GoBack" />
<w:bookmarkEnd w:id="2" />
<w:r>
<w:t>tion</w:t>
</w:r>
</w:p>
<w:bookmarkEnd w:id="0" />
<w:p w:rsidR="00BD015B" w:rsidRDefault="00B227A1">
<w:pPr>
<w:pStyle w:val="Corpsdetexte" />
</w:pPr>
<w:r>
<w:t xml:space="preserve">Reference to </w:t>
</w:r>
<w:r>
<w:fldChar w:fldCharType="begin" />
</w:r>
<w:r>
<w:instrText xml:space="preserve"> REF _Ref12885797 \h </w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate" />
</w:r>
<w:r>
<w:t xml:space="preserve">Figure </w:t>
</w:r>
<w:r>
<w:rPr>
<w:noProof />
</w:rPr>
<w:t>1</w:t>
</w:r>
<w:r>
<w:fldChar w:fldCharType="end" />
</w:r>
<w:r w:rsidR="00EF2D29">
<w:t>.</w:t>
</w:r>
</w:p>
<w:sectPr w:rsidR="00BD015B">
<w:pgSz w:w="12240" w:h="15840" />
<w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="720" w:footer="720" w:gutter="0" />
<w:cols w:space="720" />
</w:sectPr>
</w:body>
</w:document>
The numbering and the reference use fields. See the Answering the questions from your wiki page:
Hope this helps. If you want, I can paste the XML into the wiki. Here's the docx file. |
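The caption paragraph in the XML above can be condensed into a small generator. The following is a sketch under stated assumptions, not pandoc's or pandoc-fignos' code: the function name `caption_paragraph` and its parameters are invented for illustration, the default style value is copied from the posted XML, and the number inside `w:fldSimple` is only the cached value that Word recomputes when fields are updated.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def caption_paragraph(bookmark, label, cached_number, text,
                      style="Image Caption", bookmark_id=1):
    """Caption paragraph whose number is a SEQ field wrapped in a bookmark.

    The bookmark spans the label and the field ("Figure 1"), so a
    "REF <bookmark> \\h" field elsewhere in the document shows that span.
    """
    return (
        f'<w:p xmlns:w="{W}">'
        f'<w:pPr><w:pStyle w:val="{escape(style)}"/></w:pPr>'
        f'<w:bookmarkStart w:id="{bookmark_id}" w:name="{bookmark}"/>'
        f'<w:r><w:t xml:space="preserve">{escape(label)} </w:t></w:r>'
        f'<w:fldSimple w:instr=" SEQ {label} \\* ARABIC ">'
        f'<w:r><w:t>{cached_number}</w:t></w:r>'
        '</w:fldSimple>'
        f'<w:bookmarkEnd w:id="{bookmark_id}"/>'
        f'<w:r><w:t xml:space="preserve">: {escape(text)}</w:t></w:r>'
        '</w:p>'
    )


if __name__ == "__main__":
    xml = caption_paragraph("_Ref12885797", "Figure", 1, "Caption")
    ET.fromstring(xml)  # raises ParseError if the fragment is not well-formed
    print(xml)
```

Moving `w:bookmarkEnd` to the end of the paragraph would make a reference show the whole caption text instead of just "Figure 1", which is the distinction raised earlier in the thread.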
Thanks for your work on this @ociule. I'm planning to release pandoc-fignos 2.0.0 soon. I expect to revisit the issue of docx support this fall, and the leg-work you have done will be helpful. Cheers, |
Thank you @tomduck, this is indeed a much needed feature for all of us that don't want to use Word but have to generate Word documents. @ociule made the point that indeed, standard insertion of tables, figures, and documents are some "special" bookmarks. I feel like the important part is the following:
I tried a simple docx document with one equation, and this showed up in the xml:
I guess this is relevant for |
Is there anything more needed for this feature? Is there anything I could do to help this along? This is a feature I would use heavily so I would be willing to help out. Thanks for all the work on this and many other features. |
So I did some playing around and the label in the Figure caption is currently working correctly. It has the form:
In order to make the reference to the figure work the following xml code is needed:
I think I have what I need to make the changes, but I'm having trouble actually figuring out where the changes need to be made in the code. I don't see any of the existing markup in the code, so I am a bit confused about where it comes from. If someone can help me out with where to look, I can try to implement the changes. |
I was able to get the native figure numbering working on my computer. I think I have a good start on this, but I am guessing that these changes would break things for other users. Let me know if you think I should make a pull request to work out some of these issues or if it is easier to do it here (I've never written a pull request before). Next, you have to use the native_numbering extension which then creates the necessary field in the figure caption that can be "cross referenced" (linked) to. I have been running:
This will create the necessary field in the caption:
Note that this looks like the figure number is being hard-coded, but when you open the Word doc, all you have to do is select the text and hit "update field" and it will update the text inside the field. Also of note is that the above field is contained within the following xml block, which creates a bookmark:
This contains the key Note that all of the above is created by the pandoc writer without pandoc-xnos. That means that we actually don't need pandoc-fignos to do anything at all to our captions (Otherwise you end up with Figure X: Figure X:). On my machine I added the following to
This also means that we don't need anything word specific under The next step is to adjust the reference which is under the pandoc-xnos package. I added the following to
This results in the following xml:
Which again looks like it is hard coded, but it will be able to update if say new figures are added in word after the document's creation. Here are the outstanding issues/TODO items:
|
Internal links now use docx's native capabilities (closing Issue #25, submitted by @krnlyng). Figure and reference numbers should do the same. Currently, the numbers are hard-coded.