Mnemocards CI pipeline status Mnemocards contributors Mnemocards total downloads Mnemocards downloads per month Mnemocards MIT license
Generate Anki cards from text files.


🤔 What is this?

Text files are easily maintainable, apkg files are not. You can easily store text files in a version control system like git, so you can easily keep track of changes and collaborate with others.

Mnemocards comes with some predesigned formats:

  • Language cards (first row of the last picture): Specially designed for learning a language. There are two types of language cards:
    • Vocabulary cards (right): Cards displayed in 2 languages, your native language and the language you are learning. This type of card gives you the possibility to auto-generate pronunciation audios directly from Google Translator. Also, if you are learning Japanese you can use ふりがな (furigana, the small hiragana characters on top of the Kanji)!
    • Expression cards (left): Once you already know a language and want to master it, translating into your own language is no longer enough. These cards let you write sentences in the language you are learning, with their explanations in that same language.
  • Markdown cards (second row): Cards generated from *.cards files. This file format has been created specifically for writing cards with Mnemocards. Apart from a pair of start-of-card and end-of-card markers, the syntax of these files is pure Markdown. You can use images, LaTeX and math in this kind of card.
  • Autogenerated cards (picture below): These cards are generated fully automatically from a simple *.txt file with a list of words (or phrases) in the language you are learning. Each word is translated to your own language using Google Translate and then formatted using the Vocabulary cards template. Each card has everything you would see on the Google Translate page: the word itself, the main translation, additional synonym translations, pronunciation, and definitions (in the language of the original word, with example sentences). For some words (and most phrases) Google Translate lacks some of these items and only offers one main translation. Additionally, you can generate pronunciation audio. This works regardless of whether the word has a text pronunciation on the Google Translate page, and it works for phrases too. And if you create autogenerated cards from Japanese words and phrases, you can auto-generate furigana too!

Requirements

  • PyAudio, one of the Python dependencies, requires the PortAudio development package. On Ubuntu-like systems (bionic) install it with apt install portaudio19-dev=19.6.0-1build1. Other versions of the package may also work, but that is the one I'm using without problems. If you have any problem, remove the version pin and try the latest one.
  • Python 3 and the dependencies specified in pyproject.toml. You need at least pip>=19.
  • If you want to automatically import the generated apkgs, you should have Anki installed.
  • If you want to generate cards from your repositories or gists, you should have Git installed. Install it in Ubuntu-like systems with apt install git. Also, in order to use the GitHub API you should have a file with an API key with gists/repository permissions. The repository permission is only needed for private repositories.

Installation

Using PyPi package with pip:

pip install -U pip  # pip >= 19 is needed
pip install mnemocards

Mnemocards is using Poetry for packaging and dependency management, so if you want to generate a .whl file from the source code all you have to do is:

poetry build

Then you should be able to install the wheel file in any Python env with pip>=19:

pip install dist/*.whl

If you want to contribute or develop, use Poetry as in any other Python project.

⚠️ Remember to have at least version 19 of pip.

Consider using Docker if you do not want to install the package and set up the whole environment. Read the Docker section of this README to learn more about it.

Generate cards

Move into the examples/ directory and execute the next line to generate all the *.apkg files in this directory and all the subdirectories.

$ mnemocards generate -r .
Building ./computer_science/cards_config.json
        Building deck:  Computer Science  ID: 3778079933
Building ./japanese/cards_config.json
Creating audio file ./japanese/media/hiragana/10091493812914340822.mp3
Creating audio file ./japanese/media/hiragana/7304217433350980427.mp3
Creating audio file ./japanese/media/hiragana/3595385396154511079.mp3
Creating audio file ./japanese/media/hiragana/5000408949304965326.mp3
Creating audio file ./japanese/media/hiragana/2088116759824648408.mp3
Creating audio file ./japanese/media/katakana/13050069045466478331.mp3
Creating audio file ./japanese/media/katakana/4834734646036555229.mp3
Creating audio file ./japanese/media/katakana/4275246117432970461.mp3
Creating audio file ./japanese/media/katakana/8563378359496393897.mp3
Creating audio file ./japanese/media/katakana/10683512746176998599.mp3
        Building deck:  Japanese Scripts  ID: 46741143
Building ./english/cards_config.json
        Building deck:  English  ID: 376026414
Building ./gtrans_generated/from_words/cards_config.json
Creating audio file ./gtrans_generated/from_words/.media/16602724546385148562.mp3
Creating audio file ./gtrans_generated/from_words/.media/5309234024643684554.mp3
Creating audio file ./gtrans_generated/from_words/.media/16025915470194576015.mp3
Creating audio file ./gtrans_generated/from_words/.media/3362678031558458662.mp3
        Building deck:  English-Spanish googletranslated  ID: 2639255077
Building ./gtrans_generated/from_tsv/cards_config.json
Creating audio file ./gtrans_generated/from_tsv/.media/9906647659348914100.mp3
Creating audio file ./gtrans_generated/from_tsv/.media/93519578040143984.mp3
Creating audio file ./gtrans_generated/from_tsv/.media/17049437411425550802.mp3
Creating audio file ./gtrans_generated/from_tsv/.media/8896291257694557922.mp3
Creating audio file ./gtrans_generated/from_tsv/.media/933277714999002525.mp3
        Building deck:  Japanese-English googletranslated  ID: 2732900318
Writing packages to a file...
$ ls
computer_science  cs.apkg  english  english.apkg  gtrans_generated  gtrans_generated_from_tsv.apkg  gtrans_generated_from_words.apkg  japanese  japanese.apkg

Now you have 5 *.apkg files in this directory that you can import to Anki manually or using Mnemocards (see the import section). During the generation process 10 audio files have been created for the Japanese decks, 4 for the English-Spanish deck generated from the words file and another 5 for the Japanese-English deck generated from the TSV file. These audio files come from Google Translator. If you run the command again, no audio is downloaded again, so adding new words to a vocabulary is faster.

Mnemocards commands come with documentation that you can read adding --help to any command. For example, if you want to see all the options you can use with the generate command just execute:

$ mnemocards generate --help
usage: mnemocards generate [-h] [--config-file CONFIG_FILE] [--recursive]
                           [--output-dir OUTPUT_DIR]
                           DATA_DIR

positional arguments:
  DATA_DIR              Directory with the configuration and text data to use
                        for generating the Anki cards.

optional arguments:
  -h, --help            show this help message and exit
  --config-file CONFIG_FILE, -f CONFIG_FILE
                        Configuration file to search in the DATA_DIR.
  --recursive, -r       Search recursively for configuration files in the
                        given DATA_DIR.
  --output-dir OUTPUT_DIR, -o OUTPUT_DIR
                        Output directory where the packages are going to be
                        saved. Current directory by default.

The process of generating Anki *.apkg files is based on configuration files. By default, the configuration file is called cards_config.json. There are five different cards_config.json files in the examples, one in each directory (computer_science/cards_config.json, english/cards_config.json, japanese/cards_config.json, gtrans_generated/from_tsv/cards_config.json and gtrans_generated/from_words/cards_config.json).

The -r option used in the generate command tells Mnemocards to search for those configuration files recursively. If you want to generate only japanese.apkg, use mnemocards generate japanese, or move into examples/japanese and execute mnemocards generate . there.

Configuration files cards_config.json

Configuration files contain how many packages to build, the number of decks, deck configurations and the input source of the data (TSV files and Markdown files).

The most basic configuration file is:

{
    "packages": [
        {
            "name": "APKG_filename",
            "decks": [
                {
                    "id": "ad054cdc-160b-4b77-a8a5-4da79fe5d8a5",
                    "name": "Deck name",
                    "src": [
                        {
                            "type": "markdown",
                            "file": "my_file.cards"
                        }
                    ]
                }
            ]
        }
    ]
}

Each configuration file can generate one or more *.apkg packages. Each package can contain one or more decks. Each deck can consist of one or more source text files.

It is recommended to specify a deck ID. Otherwise a hash of the deck name is used, which means that if the name changes Anki will treat the deck as a new one, losing any learning progress.
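
If you prefer to script the first version of your configuration file, the following is a minimal Python sketch that builds the basic structure shown above with a freshly generated deck ID. The helper name minimal_config is hypothetical (it is not part of Mnemocards); only the JSON keys come from this README.

```python
import json
import uuid


def minimal_config(package_name, deck_name, cards_file):
    """Build the most basic configuration with a freshly generated deck ID.

    Hypothetical helper for illustration; only the keys mirror the README.
    """
    return {
        "packages": [
            {
                "name": package_name,
                "decks": [
                    {
                        # A random UUID stays the same when the deck is renamed.
                        "id": str(uuid.uuid4()),
                        "name": deck_name,
                        "src": [{"type": "markdown", "file": cards_file}],
                    }
                ],
            }
        ]
    }


config = minimal_config("APKG_filename", "Deck name", "my_file.cards")
print(json.dumps(config, indent=4))
```

Generate the ID once and keep it in the file. Regenerating it on every run would defeat its purpose, because Anki would see a new deck each time.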

Apart from the deck ID, name and source files, you can specify a deck config. Look at this example:

{
    "id": "e9a0b7ba-641a-4af6-8631-be9854a4e9d8",
    "name": "My deck name",
    "config": {
        "id": "65bcc65b-b4de-4ce4-b5c1-a73a2f64b82d",
        "name": "My deck name (Configuration)",
        "timer": 1,  # Active timer
        "maxTaken": 30,  # Max seconds taken by the timer
        "new": {
            "bury": true,  # Bury related new cards
            "initialFactor": 1500,  # Initial ease factor
            "perDay": 5,  # Number of new cards per day
            "delays": [1, 10, 1440, 4320, 10080]  # Learning steps in minutes
        },
        "lapse": {
            "leechAction": 1  # Mark leech cards. Set to 0 to suspend.
        }
    }
}

Note that the comments added to the right of some properties are not valid JSON syntax; they are added here only for this tutorial. You can read about all the deck config options you can use in the AnkiDroid documentation.
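
If you want to keep such comments next to your settings, one option is to write the deck as a Python dictionary and let Python produce the JSON. This is only a sketch of that idea, not something Mnemocards requires; the values are the ones from the example above.

```python
import json

# The same deck as above, written as a Python dict so comments are allowed.
deck = {
    "id": "e9a0b7ba-641a-4af6-8631-be9854a4e9d8",
    "name": "My deck name",
    "config": {
        "id": "65bcc65b-b4de-4ce4-b5c1-a73a2f64b82d",
        "name": "My deck name (Configuration)",
        "timer": 1,  # Active timer
        "maxTaken": 30,  # Max seconds taken by the timer
        "new": {
            "bury": True,  # Bury related new cards
            "initialFactor": 1500,  # Initial ease factor
            "perDay": 5,  # Number of new cards per day
            "delays": [1, 10, 1440, 4320, 10080],  # Learning steps in minutes
        },
        "lapse": {
            "leechAction": 1,  # Mark leech cards. Set to 0 to suspend.
        },
    },
}

# json.dumps emits strict JSON: no comments and no trailing commas.
deck_json = json.dumps(deck, indent=4)
print(deck_json)
```

The printed text can be pasted into the decks array of a cards_config.json file.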

The src property should have at least one file in order to generate some cards for that deck. type and file are the two required properties. Depending on the type you can add more properties.

markdown type

{
    "type": "markdown",
    "file": "math.cards",
    "show_tags": true,
    "card_properties": {
        "tags": ["math"]
    }
}

Apart from type and file you can add:

  • show_tags. When this flag is set to true, each card shows its tags at the top, both those from cards_config.json and those from the tags property in the Markdown card header (as described in the *.cards file format section).
    By default false.
  • card_properties. Properties that are applied to all the cards in this file. For example, using this property you avoid setting tags in every card of that file.
    • tags. The tags property is the only one available at the moment. It is an array of tags. Even if you only want to specify one tag, you should use an array with one element.

vocabulary type

{
    "type": "vocabulary",
    "file": "hiragana.tsv",
    "header": true,
    "pronunciation_in_reverse": false,
    "card_color": "#33AA33",
    "furigana": false,
    "audio": {
        "lang": "ja",
        "media_dir": "media/hiragana"
    },
    "card_properties": {
        "tags": ["japanese", "hiragana"]
    }
}
  • header. The first line of the TSV file is a header line, so it will be skipped.
  • pronunciation_in_reverse. By default, when the vocabulary card is shown in reverse the pronunciation is not shown. Set this option to true if you want the pronunciation. It will be shown once you press the Show answer button.
  • card_color. Card background color in hexadecimal.
  • furigana. If you are learning Japanese, maybe you want to use furigana (small hiragana characters over Kanji) in your cards. Set this flag to true if you want to use them, by default false. In your TSV files your furigana must be written between brackets and with a space before the Kanji. For example, 日[に] 本[ほん] 語[ご].
  • audio. If you want to generate an audio file in the language you are learning, specify the language here.
    • lang. The language used to generate those files, given as a two-letter language code such as ja (ISO 639-1). You can find a table with the codes for all the languages in Wikipedia. If the pronunciation is not available in Google Translator this is not going to work.
    • media_dir. Directory where the audio files are stored. After generating the package for the first time, this folder will be created and filled with all the audio files. If you don't delete this folder, the next time Mnemocards will be much faster because it already has all the audio files generated.
  • card_properties has the same meaning as in Markdown cards.
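
To make the furigana notation concrete, here is a small Python sketch that turns 日[に] 本[ほん] 語[ご] into HTML ruby markup and into plain kanji. It only illustrates the bracket notation described above; how Mnemocards and Anki actually render it is not shown in this README, and the function names are hypothetical.

```python
import re

# One run of characters followed by its reading in brackets,
# optionally preceded by the separating space.
FURIGANA = re.compile(r" ?([^ \[\]]+)\[([^\]]+)\]")


def to_ruby(text):
    """Render the bracket notation as HTML <ruby> elements."""
    return FURIGANA.sub(r"<ruby>\1<rt>\2</rt></ruby>", text)


def strip_furigana(text):
    """Drop the readings and the separating spaces, keeping the base text."""
    return FURIGANA.sub(r"\1", text)


word = "日[に] 本[ほん] 語[ご]"
print(to_ruby(word))
print(strip_furigana(word))
```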

expression type

{
    "type": "expression",
    "file": "expressions.tsv",
    "header": true,
    "card_color": "#AA3333",
    "card_properties": {
        "tags": ["english", "expressions"]
    }
}

header, card_color and card_properties have the same meaning as in vocabulary cards.

autogenerate type

{
    "type": "autogenerate",
    "file": "words.txt",
    "pronunciation_in_reverse": true,
    "card_color": "#f5f5f5",
    "lang": {
        "original": "en",
        "translation": "es"
    },
    "one_translation": false,
    "audio": true,
    "furigana": false,
    "furigana_type": "hepburn",
    "card_properties": {
        "tags": ["english", "spanish", "autogenerated"]
    }
}
  • file.
    This type is for generating cards automatically from a list of words in a *.txt file. The file should contain one word or phrase per line without separator such as . or , at the end of the line.
    This type can also be used to generate cards from automatically generated *.tsv files. For instructions on how to generate TSV files, read the section Autogenerate TSV files with maketsv command.
  • lang.
    You should specify here the languages for automatic translation.
    List of language codes is available here
    • original. Language code for language of words in the file.
    • translation. Language code for your language.
  • one_translation.
    Use this flag if you want to generate cards with only one translation for each word. Otherwise cards will feature alternative translations (if available in Google Translate.)
    By default false.
  • furigana.
    This flag works the same as in vocabulary type, but generates furigana automatically.
    This flag only works if lang:original is set to Japanese (ja).
    If furigana is activated, pronunciation from Google Translate will be removed from the card.
    By default false.
  • furigana_type.
    You can choose what type of transliteration to use for furigana. Available values: hira for hiragana (the default if you skip this option), kana for katakana and hepburn for romaji.
  • audio.
    This flag works a little differently than in the vocabulary type: it is a boolean and has to be set to true to work. By default false.
  • pronunciation_in_reverse, card_color and card_properties have the same meaning as in vocabulary cards.

⚠️ Due to limitations of the Google Translate API, the program can download translations for only 25 words every 3 minutes. If you use the autogenerate type on a list of more than 25 words, the program pauses for 3 minutes after every 25 words, so generation can take a long time depending on the total number of words.
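
As a rough way to plan ahead, this Python sketch estimates the total pause time for a word list. It assumes that no pause is needed after the last batch, which this README does not state, so treat the result as an approximation. The function name is hypothetical.

```python
BATCH_SIZE = 25    # words translated before a pause (from the warning above)
PAUSE_MINUTES = 3  # length of each pause (from the warning above)


def estimated_pause_minutes(word_count):
    """Estimate the minutes spent waiting between batches of words.

    Assumption: the program does not pause after the final batch.
    """
    if word_count <= 0:
        return 0
    pauses = (word_count - 1) // BATCH_SIZE
    return pauses * PAUSE_MINUTES


for count in (25, 26, 100):
    print(count, "words:", estimated_pause_minutes(count), "minutes of pauses")
```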

*.cards file format

A card has the following syntax:

<<<
header
===
title
---
body
>>>

The header section contains some metadata about the card (ID and tags), the title is the front part of the card and the body is the hidden part that is shown when you press the Show Answer button on Anki.

The header section uses YAML syntax and the title and body sections use Markdown syntax. Notice that the separators === and --- are legal Markdown syntax for generating headers, so it's recommended to use # and ## instead in your title and body.

The header and the body are not required, so the next example is also a card:

<<<
title
>>>

However, it's highly recommended to give an ID to your cards. If no ID is given, a hash of the title is used as the ID. That means that the card ID will change if the title is changed (titles are prone to change because of typos or future improvements you want to make to your cards). Cards with different IDs are considered different cards by Anki, so you will have duplicates and the new card will lose any progress. Use IDs please. The ID is given in the header and it's recommended to use a GUID:

<<<
id: 07924f36-ccfa-4b72-ac21-11f8b151d42f
===
# Title
---
Body
>>>
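
To see why an explicit ID matters, here is a minimal Python sketch of the fallback behaviour. SHA-1 is only a stand-in, since this README does not say which hash Mnemocards uses, and the names fallback_id and card_id are hypothetical.

```python
import hashlib


def fallback_id(title):
    """Derive an ID from the title.

    Assumption: SHA-1 stands in for whatever hash Mnemocards uses internally.
    """
    return hashlib.sha1(title.encode("utf-8")).hexdigest()


def card_id(title, explicit_id=None):
    """Prefer the ID from the header; fall back to a hash of the title."""
    return explicit_id if explicit_id is not None else fallback_id(title)


guid = "07924f36-ccfa-4b72-ac21-11f8b151d42f"

# Without an ID, fixing a typo in the title yields a different card.
print(card_id("# Titel") == card_id("# Title"))

# With an ID, the card keeps its identity after the title is edited.
print(card_id("# Titel", guid) == card_id("# Title", guid))
```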

Another legal property that you can define in the header is tags. Use a comma-separated list with the names of all the tags you want to assign to that card:

<<<
id: 07924f36-ccfa-4b72-ac21-11f8b151d42f
tags: tag1,tag2,tag3
===
# Title
>>>

Use inline math formulas using a dollar (ex: $x^2$) and a multi-line formula using two dollars (ex: $$\sum_i x_i$$) in any part of the title or body.

You can also add images to the cards using an <img> tag. At the moment the Markdown syntax for images ![alt](url) is not supported. Notice that the image names should be unique over all the images in your Anki decks, so avoid names like 1.png or example.png.
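
The card syntax described in this section can be read with a few lines of Python. The sketch below is not the parser Mnemocards uses; it is a simplified illustration that handles the optional header and body and reads only the id and tags keys by hand instead of using a YAML library. All function names are hypothetical.

```python
def _split_at(lines, marker):
    """Split lines at the first line equal to marker; None if it is absent."""
    for i, line in enumerate(lines):
        if line.strip() == marker:
            return lines[:i], lines[i + 1:]
    return None


def _parse_block(lines):
    """Turn the lines between <<< and >>> into a card dictionary."""
    header, rest = _split_at(lines, "===") or ([], lines)
    title, body = _split_at(rest, "---") or (rest, [])
    card = {"id": None, "tags": []}
    for line in header:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        if key.strip() == "id":
            card["id"] = value.strip()
        elif key.strip() == "tags":
            card["tags"] = [t.strip() for t in value.split(",") if t.strip()]
    card["title"] = "\n".join(title).strip()
    card["body"] = "\n".join(body).strip()
    return card


def parse_cards(text):
    """Return one dictionary per card found in the text of a *.cards file."""
    cards, block = [], None
    for line in text.splitlines():
        marker = line.strip()
        if marker == "<<<":
            block = []
        elif marker == ">>>" and block is not None:
            cards.append(_parse_block(block))
            block = None
        elif block is not None:
            block.append(line)
    return cards


sample = """<<<
id: 07924f36-ccfa-4b72-ac21-11f8b151d42f
tags: tag1,tag2,tag3
===
# Title
---
Body
>>>
<<<
title
>>>
"""
cards = parse_cards(sample)
print(cards)
```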

TSV Vocabulary files

TSV vocabulary files should contain the next columns. At the moment, the columns should be in the given order.

  • ID. Characters that uniquely identify a note. This ID must be unique not only in the file but in the whole collection, which is why we recommend using a UUID (a sequence of alphanumeric characters such as: 64012c71-9aea-4622-aac7-2595d6798737). Having a UUID is necessary to be able to update the cards (make spelling corrections or improve them with extra information) without losing the progress. If you need to generate a UUID for your card when you first compose it, use the command mnemocards id.

  • YourLanguageWord. The word you want to learn but in your mother tongue or in a known language.

  • YourLanguageExplanation. Any extra detail to help you get the word you're looking for. A clear example of use is when you have to explain a word that does not have a direct translation in your language or when the translation in your language is a word that has more than one meaning. For example: in Japanese flat and thin objects use different numbers, so the translation of 一枚 is obviously "one" but to make the reverse translation we need a clarification like "one, when counting flat and thin objects".

  • LanguageYouLearnWord. The word written in the language you are trying to learn.

  • LanguageYouLearnPronunciation. Write here how you can pronounce the word of the language you are learning. If you choose to generate an audio with the pronunciation, the audio is going to be placed here.

  • LanguageYouLearnExplanation. This explanation will always accompany the word in the language you want to learn. It explains in what alternative forms the word can appear as synonyms or variations in writing. Do not give any extra information that reveals the meaning of the word, as it will appear on the front of some cards where your goal will be to make the translation into your language. For example: English "hit, to punch someone" to Spanish "pegar, you can also use 'golpear'".

  • Tags. The tags you write here are added to the tags specified in the cards_config.json.
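
If you build your vocabulary files from another data source, the column order above can be enforced in code. The following Python sketch writes a header line and one row with a fresh UUID using the standard csv module. The function name vocabulary_tsv is hypothetical, and the Tags column is left empty because this README does not specify how several tags are separated in it.

```python
import csv
import io
import uuid

# Column order as listed above.
COLUMNS = [
    "ID",
    "YourLanguageWord",
    "YourLanguageExplanation",
    "LanguageYouLearnWord",
    "LanguageYouLearnPronunciation",
    "LanguageYouLearnExplanation",
    "Tags",
]


def vocabulary_tsv(rows):
    """Return TSV text with a header line and one line per row dictionary."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)  # skipped by Mnemocards when "header" is true
    for row in rows:
        # Each new note gets its own UUID; missing columns stay empty.
        values = [row.get(column, "") for column in COLUMNS[1:]]
        writer.writerow([str(uuid.uuid4())] + values)
    return buffer.getvalue()


tsv_text = vocabulary_tsv([
    {
        "YourLanguageWord": "one",
        "YourLanguageExplanation": "when counting flat and thin objects",
        "LanguageYouLearnWord": "一枚",
        "LanguageYouLearnPronunciation": "いちまい",
    }
])
print(tsv_text)
```

Only generate the UUID when a row is first created. Rows that already exist should keep their ID so that Anki preserves their progress.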

This is how the fields are shown in the cards. Front card format:

YourLanguageWord
YourLanguageExplanation
---                            # After showing answer
LanguageYouLearnWord           # After showing answer
LanguageYouLearnPronunciation  # After showing answer
LanguageYouLearnExplanation    # After showing answer

Reverse card format:

LanguageYouLearnWord
LanguageYouLearnPronunciation  # After showing answer
LanguageYouLearnExplanation
---                            # After showing answer
YourLanguageWord               # After showing answer
YourLanguageExplanation        # After showing answer

For the Japanese language there is a special flag in cards_config.json named furigana. If you set this flag to true, the kanji are shown alone on the front side and with furigana on the back side. This makes the field LanguageYouLearnPronunciation not really required when creating Japanese cards (you can always use romaji here, of course).

Autogenerate TSV files with maketsv command

You can use automatically generated *.tsv files for the autogenerate type of cards instead of *.txt files. Use this option if you want to review automatically generated cards before collecting them in an apkg file.

To create such a file, move to the folder with the *.txt file containing the words you want to turn into cards and execute the command mnemocards maketsv .

By default the command searches for a file named words.txt, but you can point it to any file with the additional argument -w name_of_file.txt.

By default the command assumes that the words in the file are in English and translates them to Spanish. To change the language pair use the argument -l lang_lang, where the first lang value is the language of the words and the second lang value is your language. Use language codes from this page

For example, the command to translate words from Japanese to English, collected in the file japanese.txt, looks like this:

$ mnemocards maketsv -l ja_en -w japanese.txt

The resulting file has a name based on the language pair. In the example case it is ja_en.tsv.

The TSV file has the same structure as the TSV vocabulary file described above. In fact, you can use this TSV file with vocabulary type configs for building decks. But I recommend using it with autogenerate type configs, since the file contains some basic HTML that is used to make the cards look like the Google Translate site.

After the TSV file is generated, you can manually adjust the values of the columns and then use it for generating the deck.

You can access the full help for the maketsv command with mnemocards maketsv -h

Expressions TSV files

Similarly to vocabulary TSV files, the expression TSV files should contain the next columns. At the moment, the columns should be in the given order.

  • ID. Characters that uniquely identify a note. This ID must be unique not only in the file but in the whole collection, which is why we recommend using a UUID (a sequence of alphanumeric characters such as: 64012c71-9aea-4622-aac7-2595d6798737). Having a UUID is necessary to be able to update the cards (make spelling corrections or improve them with extra information) without losing the progress. If you need to generate a UUID for your card when you first compose it, use the command mnemocards id.

  • Expression. Expression that you want to learn.

  • Explanation. Extra explanation of the expression if needed.

  • Meaning. Meaning of the expression.

  • Example. Example sentence of use of the expression.

  • Tags. The tags you write here are added to the tags specified in the cards_config.json.

This type of note only has one card, the front card. Front card format:

Expression
Explanation
---          # After showing answer
Meaning      # After showing answer
Example      # After showing answer

Of course, these types of cards are created for the purpose of learning a new language, but they can be used for any other purpose as long as the fields described here fit your purpose.

Import cards to Anki

Use the command mnemocards import --help to get the instructions about importing *.apkg files.

$ mnemocards import --help
usage: mnemocards import [-h] [--profile-name PROFILE_NAME]
                         [--collection-path COLLECTION_PATH]
                         apkgs [apkgs ...]

positional arguments:
  apkgs                 List of packages to import.

optional arguments:
  -h, --help            show this help message and exit
  --profile-name PROFILE_NAME, -p PROFILE_NAME
                        If your collection is in the default location
                        (`~/.local/share/Anki2/`) you can specify only the
                        profile name. You cannot use this option as the same
                        time as `-c`.
  --collection-path COLLECTION_PATH, -c COLLECTION_PATH
                        Specify the full path of the collection file. If you
                        use this option with `-p` (profile name), the profile
                        name has preference over the full collection path.

To import an *.apkg file you need to close Anki, otherwise the collection file cannot be written. Remember that you need to open Anki and synchronize the collection with Web Anki to see the updated collection in all your devices.

Git utilities

In order to keep my cards safe and centralize my knowledge database, I added a few utilities to mnemocards to clone and push many Git repositories at the same time.

The first step is to know which repositories you want to clone. I like to create a new private repository for every subject I'm learning. For example:

  • Japanese: under my profile I have a repository called learning_japanese.
  • Programming: I have a repository called learning_programming.
  • And so on...

My aim is to clone all these repositories in an easy way and make them very accessible so that any time I think of something I want to remember I don't postpone it because of laziness. As I'm always learning I have a lot of repositories. To automate this task I've created the mnemocards github command.

The result of executing the next command is a ~/.mnemocards file with a list of all the repositories with Anki cards and the local path in my PC where I want them to be cloned.

$ mnemocards github -i "guiferviz/learning_([^ _]*)" -d ~/learning
... some output ...
$ cat ~/.mnemocards
{
    "repos": [
        [
            "git@github.com:guiferviz/learning_japanese.git",
            "~/learning/japanese"
        ],
        [
            "git@github.com:guiferviz/learning_programming.git",
            "~/learning/programming"
        ]
    ]
}

To execute that command you need a file with your GitHub API key with enough permissions to read your repositories. Go to GitHub Tokens and generate a new one with permissions for reading your repositories. If you want to read private repositories select the next permissions:

If you want to use the mnemocards github --gists option, that is, cloning gists instead of repositories, your GitHub API key should have different permissions. I do not use gists because they do not allow committing directories and I want to keep my images well organized.

You can also create the ~/.mnemocards file by hand taking the given example and substituting the URLs and the local paths.
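
If you would rather script the file than type it, this Python sketch builds the same structure from a list of repository names. It assumes that the capture group of the regular expression becomes the local directory name, which is what the example output above suggests but is not documented here, and it uses a full match where the real command may match differently. The function name repos_from_names is hypothetical.

```python
import json
import re


def repos_from_names(full_names, pattern, base_dir):
    """Map "user/repo" names to [ssh_url, local_path] pairs.

    Assumption: group 1 of the pattern is used as the directory name,
    as the example ~/.mnemocards file suggests.
    """
    regex = re.compile(pattern)
    repos = []
    for full_name in full_names:
        match = regex.fullmatch(full_name)
        if match:
            url = f"git@github.com:{full_name}.git"
            repos.append([url, f"{base_dir}/{match.group(1)}"])
    return {"repos": repos}


config = repos_from_names(
    [
        "guiferviz/learning_japanese",
        "guiferviz/learning_programming",
        "guiferviz/mnemocards",  # does not match, so it is left out
    ],
    r"guiferviz/learning_([^ _]*)",
    "~/learning",
)
print(json.dumps(config, indent=4))
```

Save the printed JSON as ~/.mnemocards to use it with the commands described next.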

Once you have your file, created manually or automatically, you can clone all your repos with the next command. If a repo is already cloned, this command also pulls the latest changes from the server.

mnemocards pull

To commit and push all the changes in your repositories use the next command. Everything in your repositories is going to be added and committed, so if you do not want to include all the files, add exclude patterns to your .gitignore or push your repositories manually. Commits are made using a default commit message similar to "Updating repository with mnemocards.".

mnemocards push

Docker

A Docker image is available so that you can generate your packages without having to install Mnemocards in your environment. At the moment you need to have Anki installed locally. The Docker image I've built for you is named guiferviz/mnemocards and it is available in the Mnemocards Docker Hub repository. Read the documentation under the docker/ directory to learn how to execute the image.

If you want to build the Docker image on your own, you will also find all the information in the docker/ directory (Dockerfile and build commands).

As Docker images are auto-generated when a new version tag is pushed to the GitHub repository, using Docker is a very convenient way to switch between different versions of Mnemocards.

VIM users

I'm a die-hard VIM user, for that reason I've created a vim_syntax/cards.vim syntax file. It's not too fancy but it looks better than using the Markdown syntax.

Using Markdown syntax:

Using my own Cards syntax:

Among my UltiSnips snippets I have one that generates a new card with a unique ID, a title and a body.

snippet card "Create a new card" b
<<<
id: `!p if not snip.c: snip.rv = get_uuid()`
===
# ${1}
---
${2}
>>>
endsnippet

The get_uuid function is defined as:

import uuid


def get_uuid():
    """Return a random UUID (version 4) as a string."""
    return str(uuid.uuid4())

I also use the Markdown Preview plugin so I can see what my cards look like without generating the package. It's not perfect for the *.cards format, but it's better than nothing :)