How to create a bilingual dictionary entry #44
Here's an example I created specially for you. I hope it's not too late.
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE xdxf SYSTEM "xdxf_strict.dtd">
<xdxf revision="034">
<meta_info>
<languages>
<from xml:lang="en"/>
<to xml:lang="eo"/>
<to xml:lang="es-AR"/>
<to xml:lang="es-ES"/>
<to xml:lang="zh-cmn-Hant-TW"/>
</languages>
<title>Multilingual dictionary</title>
<full_title>Example of a multilingual dictionary</full_title>
<description>
This dictionary shows how to compile a dictionary with one-to-many language translation.
The "k" tag is in English and is translated into Esperanto, Spanish (Spain), Spanish (Argentina) and Mandarin Chinese (Traditional script, Taiwan).
</description>
<file_ver>v1.1b</file_ver>
<creation_date>15-01-2023</creation_date>
<last_edited_date>15-01-2023</last_edited_date>
</meta_info>
<lexicon>
<ar>
<k xml:lang="en">cell phone</k>
<def>
<def xml:lang="es-ES">
<deftext>móvil</deftext>
</def>
<def xml:lang="es-AR">
<deftext>celular</deftext>
</def>
<def xml:lang="eo">
<deftext>poŝtelefono</deftext>
</def>
<def xml:lang="zh-cmn-Hant-TW">
<deftext>手機</deftext>
</def>
</def>
</ar>
</lexicon>
</xdxf>
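To check that an entry like the one above really carries one translation per target language, it can help to read it back programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `translations_by_language` and the trimmed `ARTICLE` string are illustrative assumptions, not part of XDXF or of any existing tool.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ElementTree exposes the xml:lang attribute under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Trimmed copy of the <ar> entry from the example above.
ARTICLE = """
<ar>
  <k xml:lang="en">cell phone</k>
  <def>
    <def xml:lang="es-ES"><deftext>móvil</deftext></def>
    <def xml:lang="es-AR"><deftext>celular</deftext></def>
    <def xml:lang="eo"><deftext>poŝtelefono</deftext></def>
    <def xml:lang="zh-cmn-Hant-TW"><deftext>手機</deftext></def>
  </def>
</ar>
"""


def translations_by_language(ar):
    """Map each target language tag to the list of its <deftext> strings."""
    result = defaultdict(list)
    for definition in ar.iter("def"):
        lang = definition.get(XML_LANG)
        if lang is None:
            continue  # the outer grouping <def> carries no language
        for deftext in definition.findall("deftext"):
            result[lang].append(deftext.text)
    return dict(result)


ar = ET.fromstring(ARTICLE)
headword = ar.find("k").text
translations = translations_by_language(ar)
print(headword, translations)
```

The outer `<def>` is skipped because it has no `xml:lang`; only the nested, language-tagged `<def>` elements contribute translations.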
I noticed that you have figured out how to create such entries in your dictionary. Tell me if there are cases that my scheme doesn't handle or describe well.
Yes, thank you very much! It is generally working very well. You noticed I added a couple of extra tags to the DTD. This was really for my use case: I needed to uniquely identify definitions and examples for use with Anki flash cards, and it has also been useful for importing into a Neo4j graph database. So I have a uuid field for each definition. For example (from French to English): verres: glasses; lunettes: glasses. OK, this is maybe a dumb example; it's what my brain came up with right now. But in the language I am working with there are a lot of situations where this many-to-one relationship happens. Maybe I could have just used the [...] The same goes for examples, where the same example phrase may be used with the definitions of several words contained in it. This enabled me to create a question for each time the example phrase occurs in the dictionary, giving a different 'hint' each time for the phrase meaning.
You may also have noticed that I am working with a language in West Africa. This is Jula, which is not very well documented, and there are many spelling variations using either phonetics or French phonemes. In addition to the spelling variations, this is a tonal language, so it has been challenging to keep the 'headword' in the local language unique. (I know I can use a comment for that too.) But I was tempted to also add a uuid to the ar or k tag. I have made extensive use of kref/spv, but now I am dealing with many situations where multiple words share the same kref/spv. This is totally an issue with my use case, but I have awk scripts that search existing docs and add HTTP links to the definitions of words. They also search the spv, but may then link back to the wrong definition. This is not at all an XDXF issue; I just want to let you know my challenges with this.
The one item that I might have found useful is source and author tags for the ar (as you have for examples). Again, that would not be so useful for well-documented languages. In my case I have noted separately where I 'heard' a certain word.
And finally, since XML is difficult to work with, especially for non-technical people, I have been working with YAML and converting to XML (and sometimes back). But the conversions (especially back to YAML) are not perfect. I'd love to get a conversion going directly from XDXF to Neo4j. It's possible. But the easy route for me right now is to do an XSLT to CSV and then import that. I will post back what I come up with. Thanks!
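As an alternative to the XSLT-to-CSV step mentioned above, the flattening could also be sketched in Python. This is only an illustration under stated assumptions: the `LEXICON` entries are hypothetical ones modelled on the verres/lunettes remark, and the column names and the function `lexicon_to_csv` are made up for this sketch rather than taken from the author's scripts.

```python
import csv
import io
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical French-to-English entries modelled on the verres/lunettes remark.
LEXICON = """
<lexicon>
  <ar><k xml:lang="fr">verres</k><def xml:lang="en"><deftext>glasses</deftext></def></ar>
  <ar><k xml:lang="fr">lunettes</k><def xml:lang="en"><deftext>glasses</deftext></def></ar>
</lexicon>
"""


def lexicon_to_csv(lexicon_xml):
    """Return CSV text with one row per (headword, definition text) pair."""
    root = ET.fromstring(lexicon_xml)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["headword", "headword_lang", "definition", "definition_lang"])
    for ar in root.iter("ar"):
        k = ar.find("k")
        for definition in ar.iter("def"):
            for deftext in definition.findall("deftext"):
                writer.writerow([
                    k.text,
                    k.get(XML_LANG, ""),
                    deftext.text,
                    definition.get(XML_LANG, ""),
                ])
    return buffer.getvalue()


csv_text = lexicon_to_csv(LEXICON)
print(csv_text)
```

Both headwords end up with the same `definition` value, which is the many-to-one relationship described above; a graph import could then merge rows on that column.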
According to the DTD, it is possible to assign IDs to both
In the DTD it's not possible to add IDs currently. Would you be so kind to point to specific use cases in Gitlab (like this |
It's been a while... I remember now that I had tried to use that id attribute, but I believe it wouldn't accept a UUID as a valid value (which I was already using for my Anki cards). But I'd actually like to use that ID. I'll get back to you on the exact error.
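One possible cause, not confirmed in this thread: if the id attribute is declared with the XML `ID` type, its value has to be a valid XML name, and XML names cannot start with a digit, while many UUIDs do. The sketch below assumes that is the problem and shows a prefixing workaround; the regular expression is a simplified ASCII-only approximation of the XML name rule, and `uuid_to_xml_id` is a hypothetical helper.

```python
import re
import uuid

# Simplified ASCII-only approximation of the XML Name rule: the first
# character must be a letter, "_" or ":"; digits, "-" and "." may follow.
ASCII_XML_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*\Z")


def is_ascii_xml_name(value):
    """Return True if value looks like a valid (ASCII) XML name."""
    return ASCII_XML_NAME.match(value) is not None


def uuid_to_xml_id(value, prefix="id-"):
    """Prefix a UUID so the result always starts with a letter."""
    return prefix + str(value)


raw = "1b4e28ba-2fa1-11d2-883f-0016d3cca427"  # arbitrary sample UUID
raw_is_valid = is_ascii_xml_name(raw)  # starts with a digit
prefixed_is_valid = is_ascii_xml_name(uuid_to_xml_id(raw))
fresh_id = uuid_to_xml_id(uuid.uuid4())
print(raw_is_valid, prefixed_is_valid, fresh_id)
```

With a fixed prefix, the original UUID can still be recovered for the Anki cards by stripping the prefix again.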
Yes, there are lots of examples. But I don't think this is a limitation of the spec. It's really just how I am using the data for question-and-answer flash cards. I have an example phrase, "It's not yours!", used in two different definitions: the definition of the word 'not', and also that of the word 'yours'. I need to keep them unique so the phrase is presented to the user twice: on one card with a hint providing the meaning of the word 'yours' and on another card with a hint for the word 'not'. The uuid produced the required results.
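The "same phrase, two cards" idea can be sketched as follows. Note the assumptions: the thread does not say how the UUIDs are generated, so the deterministic `uuid.uuid5` scheme here is a substitution chosen so that IDs stay stable across regenerations; the namespace URL, the `entries` structure and the `meaning` texts are hypothetical.

```python
import uuid

# Hypothetical namespace for this dictionary; any fixed UUID would work.
DICTIONARY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/my-dictionary")


def card_id(headword, phrase):
    """Stable ID for one (headword, example phrase) pairing."""
    return uuid.uuid5(DICTIONARY_NAMESPACE, headword + "\x1f" + phrase)


def build_cards(entries):
    """Create one card per occurrence of an example phrase under a headword."""
    cards = []
    for entry in entries:
        for phrase in entry["examples"]:
            cards.append({
                "id": str(card_id(entry["headword"], phrase)),
                "question": phrase,
                "hint": f'{entry["headword"]}: {entry["meaning"]}',
            })
    return cards


entries = [
    {"headword": "not", "meaning": "negation", "examples": ["It's not yours!"]},
    {"headword": "yours", "meaning": "belonging to you", "examples": ["It's not yours!"]},
]
cards = build_cards(entries)
for card in cards:
    print(card)
```

Because the ID is derived from both the headword and the phrase, the two cards share a question but never collide, and rebuilding the deck yields the same IDs.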
On an unrelated note, some comments on your XDXF file:
Thanks! You are really keeping me on the ball. Actually, since I maintain the dictionary in YAML and convert to XML on every change, I had temporarily commented out those lines while I was doing some rearranging. I have put them back, but I was validating against the DTD via script anyway.
For the dates, yes, I was not actually paying too much attention. I will have to automate inserting the current date every time I save or convert my file. You were also asking about the change I made to categ in the DTD. I am using the categ element for 'tags'. I noted that categ related to Wikipedia or something, which I thought could be repurposed for my use... The change was so that I could use it as a list element to tag or 'categorize' definitions (People; Calendar; Work; etc.). Again, this was not so much for the dictionary as for the Anki cards I produce from the file. Just as an FYI, I maintain all these scripts in a CI on GitLab, so when I make a change to the YAML file it converts to XDXF and updates the dictionary, quiz and Anki cards in one shot! (My Neo4j project is temporarily offline. I have to get back to making some adjustments there.)
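Stamping the date during conversion could look roughly like the sketch below. It assumes the DD-MM-YYYY pattern seen in the example earlier in this thread; `stamp_last_edited` is a hypothetical helper, and in a real pipeline the `today` argument would simply be left out so the current date is used.

```python
import datetime
import xml.etree.ElementTree as ET

# Minimal stand-in for the <meta_info> block of an XDXF file.
META = """
<meta_info>
  <creation_date>15-01-2023</creation_date>
  <last_edited_date>15-01-2023</last_edited_date>
</meta_info>
"""


def stamp_last_edited(meta_info, today=None):
    """Set <last_edited_date> to the given (or current) date as DD-MM-YYYY."""
    today = today or datetime.date.today()
    element = meta_info.find("last_edited_date")
    if element is None:
        # Create the element if the file does not have one yet.
        element = ET.SubElement(meta_info, "last_edited_date")
    element.text = today.strftime("%d-%m-%Y")
    return meta_info


# A fixed date is passed here only to keep the example reproducible.
meta = stamp_last_edited(ET.fromstring(META), datetime.date(2024, 3, 9))
print(meta.find("last_edited_date").text)
```

The creation date is left untouched, so only the last-edited field changes on each conversion run.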
Hi, thanks for an amazing project. I am interested in this, but I can't see how to make a bilingual entry. In the Rev34.xml file there are 'to' and 'from' language elements in the meta_info, and they indicate translation between the languages en and lv. However, in the ar entries I don't see anything identified by lv. And there does seem to be a translation of 'Home', but this appears to be in Russian, with no language specified. Can you point to any other example? Thanks!