Progress on the active parser ("citationjs") #3
I have a different mental model of this now, with two stages of parsing: one of entries, and one of values. name (and date) field parsing is still at mapping for the moment, but |
What does the first stage do? It looks like it does more than tokenization, but assuming the first stage also does command -> unicode mapping, I'm not clear on how you brought verbatim parsing to the 2nd stage. What does |
Yeah first part is full parsing of the file syntax, resolving All in all, not optimal so I might have to rethink it. But that would probably also involve rethinking the tokenizer (just category codes?) and as a result the whole grammar. I also still have to make special cases for commands like |
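As a rough illustration of the verbatim concern raised in this exchange, the value stage could decide per field whether commands are parsed at all. This is only a sketch, not how the parser in this thread works: `VERBATIM_FIELDS`, `COMMANDS`, `parse_value` and `map_entry` are hypothetical names, and the field set and the tiny command table are assumptions made for the example.

```python
from typing import Dict

# Assumption: fields whose raw text must survive untouched.
VERBATIM_FIELDS = {"url", "doi", "file"}

# Assumption: a tiny stand-in for a real command -> unicode table.
COMMANDS = {
    "\\&": "&",       # escaped ampersand
    "~": "\u00a0",    # TeX tie becomes a non-breaking space
}


def parse_value(field: str, raw: str) -> str:
    """Second stage: parse one field value, skipping verbatim fields."""
    if field in VERBATIM_FIELDS:
        return raw
    for command, replacement in COMMANDS.items():
        raw = raw.replace(command, replacement)
    return raw


def map_entry(fields: Dict[str, str]) -> Dict[str, str]:
    """Apply value parsing to the raw field strings of one parsed entry."""
    return {name: parse_value(name, raw) for name, raw in fields.items()}


entry = map_entry({
    "title": "Cats \\& Dogs, Vol.~2",
    "url": "https://example.org/~user/paper",
})
```

Here the `~` in the title becomes a non-breaking space, while the `~` in the URL is left alone. The cost is the one noted in this thread: whichever stage applies the commands has to know which fields are verbatim.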
But literal lists have the same issue as name lists. With "list fields" do you mean "literal lists"? Why would Sentence casing has to know about bracketing, and commands, because the brackets and the placement of commands within them influence the result in fairly intricate ways. If just the end-result is sentence-cased after all else is done, the results won't be right. |
I mean any list field. Brackets are kept to the second stage, and while I did implement top-level " and " splitting, so to speak, I didn't realize that needed to be done for name fields as well until after posting the comment.
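For reference, top-level " and " splitting with brackets kept intact could be sketched like this. It is a simplified illustration rather than the actual implementation: `split_top_level` is a hypothetical name, and the sketch assumes a case-sensitive separator with single spaces, which is stricter than what BibTeX accepts.

```python
from typing import List


def split_top_level(value: str, separator: str = " and ") -> List[str]:
    """Split a list field on `separator`, ignoring occurrences inside braces."""
    parts = []
    depth = 0   # current brace nesting level
    start = 0   # start index of the item being collected
    i = 0
    while i < len(value):
        char = value[i]
        if char == "\\" and i + 1 < len(value):
            i += 2  # skip an escaped character such as \{ or \}
            continue
        if char == "{":
            depth += 1
        elif char == "}":
            depth = max(depth - 1, 0)
        elif depth == 0 and value.startswith(separator, i):
            parts.append(value[start:i])
            i += len(separator)
            start = i
            continue
        i += 1
    parts.append(value[start:])
    return [part.strip() for part in parts]


names = split_top_level("Donald E. Knuth and {Barnes and Noble} and Leslie Lamport")
```

The braces stay in each item, so the second stage can still see that `{Barnes and Noble}` is a single protected unit. The same function would apply to name lists and literal lists alike.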
Because I currently don't have any argument commands yet, apart from formatting which I do special-case (and would like to keep doing so). So I'll probably use formatting commands, symbol commands and "other" commands implemented as functions in tandem.
The brackets would be translated to |
I guess it could be corrected if enough information travels along? It gets complicated fairly quickly though
If you don't parse with arguments, how do you parse |
Some of the command-and-brace interaction is described at https://retorque.re/zotero-better-bibtex/support/faq/#why-the-double-braces
Right, diacritics are a special case too. I see the BibLaTeX manual also documents their algorithm for sentence casing (pages 253-255). However, it does not say exactly when the algorithm is applied: it seems to be part of specific citation styles.
Styles decide for themselves whether to apply sentence casing, but given the complexities of sentence casing in bib(la)tex, I can't imagine that each style actually implements the sentence casing itself.
No, BibLaTeX has a helper for that ( |
Ah yes, that is true, but it remains that bib(la)tex expects the input to be title-cased, while CSL expects it in sentence case.
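To make the title-case to sentence-case step concrete, here is a minimal sketch of why the conversion needs to see the braces. It is heavily simplified and makes several assumptions: `to_sentence_case` is a hypothetical name, commands and diacritics are not handled at all, and braces are simply dropped from the output after protecting their contents.

```python
def to_sentence_case(title: str) -> str:
    """Lowercase a title-cased value, keeping brace-protected text as written."""
    out = []
    depth = 0                  # current brace nesting level
    first_letter_seen = False  # the first letter keeps its original case
    for char in title:
        if char == "{":
            depth += 1
        elif char == "}":
            depth = max(depth - 1, 0)
        elif depth > 0:
            out.append(char)   # protected by braces: leave the case alone
            first_letter_seen = first_letter_seen or char.isalpha()
        elif char.isalpha() and not first_letter_seen:
            out.append(char)   # first unprotected letter is kept as is
            first_letter_seen = True
        else:
            out.append(char.lower())
    return "".join(out)


protected = to_sentence_case("The {NASA} Guide to {B}ayesian Inference")
unprotected = to_sentence_case("The NASA Guide to Bayesian Inference")
```

With the braces available, the result is `The NASA guide to Bayesian inference`. Once they have been stripped, the same function yields `The nasa guide to bayesian inference`, which is the failure mode of sentence-casing only the end result.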
So an update for the interested:
|
Going through the issues. Checked
TODO list:
|
One big problem is the question of what should be parsed when parsing syntax, and what should be parsed when mapping to CSL. Consider also that Bib.TXT should be able to use the same mapping.
- `verbatim` or `url` in the specification should not have commands parsed, and then the syntax parser has to know about all the different fields.
- `crossref`: should be done when mapping

Less crucial things, maybe: