Merge pull request neocl#38 from letuananh/main

jamdict version 0.1a11 ready
alt-romes · May 25, 2021 · 21242da
2 parents 886b2c2 + 1b1b90c
commit 21242da
Showing 22 changed files with 963 additions and 324 deletions.
diff --git a/README.md b/README.md
@@ -16,8 +16,6 @@
 * Fast look up (dictionaries are stored in SQLite databases)
 * Command-line lookup tool [(Example)](#command-line-tools)
 
-Homepage: [https://github.com/neocl/jamdict](https://github.com/neocl/jamdict)
-
 [Contributors](#contributors) are welcome! 🙇. If you want to help, please see [Contributing](https://jamdict.readthedocs.io/en/latest/contributing.html) page.
 
 # Try Jamdict out

diff --git a/docs/api.rst b/docs/api.rst
@@ -1,18 +1,27 @@
+.. _api_index:
+
 jamdict APIs
 ============
 
 An overview of jamdict modules.
 
+.. warning::
+    👉 ⚠️ THIS SECTION IS STILL UNDER CONSTRUCTION ⚠️ Help is much needed.
+
 .. module:: jamdict
 
+.. autoclass:: jamdict.util.Jamdict
+   :members:
+   :member-order: groupwise
+   :exclude-members: get_ne, has_jmne, import_data, jmnedict
+
 .. autoclass:: jamdict.util.LookupResult                     
    :members:
    :member-order: groupwise
 
-.. autoclass:: jamdict.util.Jamdict
+.. autoclass:: jamdict.util.IterLookupResult                     
    :members:
    :member-order: groupwise
-   :exclude-members: get_ne, has_jmne, import_data, jmnedict
 
 .. module:: jamdict.jmdict
 

diff --git a/docs/index.rst b/docs/index.rst
@@ -4,6 +4,16 @@ Jamdict's documentation!
 
 `Jamdict <https://github.com/neocl/jamdict>`_ is a Python 3 library for manipulating Jim Breen's JMdict, KanjiDic2, JMnedict and kanji-radical mappings.
 
+Welcome
+-------
+
+Are you new to this documentation? Here are some useful pages:
+
+- Want to try out the Jamdict package? Try the `Jamdict online demo <https://replit.com/@tuananhle/jamdict-demo>`_.
+- Want some useful code samples? See :ref:`recipes`.
+- Want to look deeper into the package? See :ref:`api_index`.
+- Want to help develop Jamdict? Please visit the :ref:`contributing` page.
+
 Main features
 -------------
 
@@ -27,14 +37,18 @@ If you want to help developing Jamdict, please visit :ref:`contributing` page.
 Installation
 ------------
 
-Jamdict is `available on PyPI <https://pypi.org/project/jamdict/>`_ and
+Jamdict and `jamdict-data <https://pypi.org/project/jamdict-data/>`_ are both `available on PyPI <https://pypi.org/project/jamdict/>`_ and
 can be installed using pip.
 For more information please see :ref:`installpage` page.
 
 .. code:: bash
 
    pip install jamdict jamdict-data
 
+There is also an online Jamdict demo virtual machine to try out on Repl.it:
+
+https://replit.com/@tuananhle/jamdict-demo
+
 Sample jamdict Python code
 --------------------------
 
@@ -125,6 +139,7 @@ Documentation
    recipes
    api
    contributing
+   updates
 
 Other info
 ==========

diff --git a/docs/recipes.rst b/docs/recipes.rst
@@ -3,10 +3,9 @@
 Common Recipes
 ==============
 
-- Search words using wildcards.
-- Searching for kanji characters.
-- Decomposing kanji characters into components, or search kanji characters by components.
-- Search for named entities.
+.. contents::
+    :local: 
+    :depth: 2
 
 .. warning::
     👉 ⚠️ THIS SECTION IS STILL UNDER CONSTRUCTION ⚠️
@@ -20,14 +19,55 @@ High-performance tuning
 -----------------------
 
 When you need to do a lot of queries on the database, it is possible to load the whole database
-into memory to boost up querying performance (This will takes about 400 MB of RAM) by using the ``memory_mode``
+into memory to boost query performance (this takes about 400 MB of RAM) by using the :class:`memory_mode <jamdict.util.Jamdict>`
 keyword argument, like this:
 
 >>> from jamdict import Jamdict
 >>> jam = Jamdict(memory_mode=True)
 
 The first query will be extremely slow (it may take about a minute for the whole database to be loaded into memory)
 but subsequent queries will be much faster.
+
+Iteration search
+----------------
+
+Sometimes you only need to go through a set of search results once, keep the items you want,
+and discard the rest. In these cases :func:`lookup_iter <jamdict.util.Jamdict.lookup_iter>` should be used.
+This function returns an :class:`IterLookupResult <jamdict.util.IterLookupResult>` object immediately after it is called.
+Users may loop through ``result.entries``, ``result.chars``, and ``result.names`` exactly once each
+to find the items that they want. Users have to store the desired word entries, characters, and names
+themselves, since the items are discarded after being yielded.
+
+>>> res = jam.lookup_iter("花見")
+>>> for word in res.entries:
+...     print(word)  # do something with the word
+>>> for c in res.chars:
+...     print(c)
+>>> for name in res.names:
+...     print(name)
+
+Parts of speech and named-entity types
+--------------------------------------
+
+Use :func:`Jamdict.all_pos <jamdict.util.Jamdict.all_pos>` to list all available parts of speech
+and :func:`Jamdict.all_ne_type <jamdict.util.Jamdict.all_ne_type>` to list all named-entity types:
+
+>>> for pos in jam.all_pos():
+...     print(pos)  # pos is a string
+>>> for ne_type in jam.all_ne_type():
+...     print(ne_type)  # ne_type is a string
+
+To filter words by part of speech, use the keyword argument ``pos``
+in the :func:`lookup() <jamdict.util.Jamdict.lookup>` or :func:`lookup_iter() <jamdict.util.Jamdict.lookup_iter>`
+functions.
+
+For example, to look for all "かえる" that are nouns, use:
+
+>>> result = jam.lookup("かえる", pos=["noun (common) (futsuumeishi)"])
+
+To search for all named entities that are "surname", use:
+
+>>> result = jam.lookup("surname")
 
 Kanjis and radical/components (KRAD/RADK mappings)
 --------------------------------------------------

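Note on the iteration-search recipe in `docs/recipes.rst`: the recipe says items are discarded after they are yielded, so each of `result.entries`, `result.chars`, and `result.names` can be looped over only once. The sketch below illustrates that one-pass behaviour with a plain Python generator. It is not jamdict's implementation; `fake_entries` and its word list are hypothetical stand-ins so the example runs without a dictionary database.

```python
from typing import Iterator, List


def fake_entries() -> Iterator[str]:
    """Hypothetical stand-in for ``result.entries``: yields each item once."""
    for word in ["花見", "花見酒", "花見時"]:
        yield word


entries = fake_entries()

# First pass: store the items we want ourselves, because nothing is cached.
kept: List[str] = [word for word in entries if len(word) == 3]

# Second pass: the generator is already exhausted, so nothing is produced.
leftover = list(entries)

print(kept)
print(leftover)
```

If the full result set needs to be revisited, the regular `lookup()` call described in the same recipe file is probably the better fit.
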
diff --git a/docs/updates.rst b/docs/updates.rst
@@ -1,49 +1,65 @@
 .. _updates:
 
-Updates
-=======
+Jamdict Changelog
+=================
 
-2021-04-19
-----------
+jamdict 0.1a11
+--------------
 
--  [Version 0.1a9]
--  Fix data audit query
--  Enhanced Jamdict() constructor. ``Jamdict('/path/to/jamdict.db')``
-   works properly.
--  Code quality review
--  Automated documentation build via
-   `readthedocs.org <https://jamdict.readthedocs.io/en/latest/>`__
+-  2021-05-25
 
-.. _section-1:
+  - Added ``lookup_iter()`` for iteration search
+  - Added ``pos`` filter for filtering words by part of speech
+  - Added ``all_pos()`` and ``all_ne_type()`` to Jamdict to list parts of speech and named-entity types
+  - Better version checking in ``__version__.py``
+  - Improved documentation
 
-2021-04-15
-----------
+jamdict 0.1a10
+--------------
 
--  Make ``lxml`` optional
--  Data package can be installed via PyPI with ``jamdict_data`` package
--  Make configuration file optional as data files can be installed via
-   PyPI.
+-  2021-05-19
 
-.. _section-2:
+  - Added ``memory_mode`` keyword to load database into memory before querying to boost up performance
+  - Improved import performance by using puchikarui's ``buckmode``
+  - Tested with both puchikarui 0.1.* and 0.2.*
 
-2020-05-31
-----------
+jamdict 0.1a9
+-------------
 
--  [Version 0.1a7]
--  Added Japanese Proper Names Dictionary (JMnedict) support
--  Included built-in KRADFILE/RADKFile support
--  Improved command line tools (json, compact mode, etc.)
+-  2021-04-19
 
-.. _section-3:
+  -  Fix data audit query
+  -  Enhanced ``Jamdict()`` constructor. ``Jamdict('/path/to/jamdict.db')``
+     works properly.
+  -  Code quality review
+  -  Automated documentation build via
+     `readthedocs.org <https://jamdict.readthedocs.io/en/latest/>`__
 
-2017-08-18
-----------
+jamdict 0.1a8
+-------------
 
--  Support KanjiDic2 (XML/SQLite formats)
+-  2021-04-15
 
-.. _section-4:
+  -  Make ``lxml`` optional
+  -  Data package can be installed via PyPI with ``jamdict_data`` package
+  -  Make configuration file optional as data files can be installed via PyPI.
 
-2016-11-09
-----------
+jamdict 0.1a7
+-------------
 
--  Release first version to Github
+-  2020-05-31
+
+  -  Added Japanese Proper Names Dictionary (JMnedict) support
+  -  Included built-in KRADFILE/RADKFile support
+  -  Improved command line tools (json, compact mode, etc.)
+
+Older versions
+--------------
+
+- 2017-08-18
+
+  -  Support KanjiDic2 (XML/SQLite formats)
+
+- 2016-11-09
+
+  -  Release first version to GitHub
diff --git a/jamdict/__init__.py b/jamdict/__init__.py
@@ -45,7 +45,7 @@
 from . import __version__ as version_info
 from .__version__ import __author__, __email__, __copyright__, __maintainer__
 from .__version__ import __credits__, __license__, __description__, __url__
-from .__version__ import __version_major__, __version_long__, __version__, __status__
+from .__version__ import __version__, __version_long__, __status__
 
 from .jmdict_sqlite import JMDictSQLite
 from .kanjidic2_sqlite import KanjiDic2SQLite

diff --git a/jamdict/__version__.py b/jamdict/__version__.py
@@ -6,10 +6,30 @@
 __copyright__ = "Copyright (c) 2016, Le Tuan Anh"
 __credits__ = []
 __license__ = "MIT License"
-__description__ = "Python library for manipulating Jim Breen's JMdict, KanjiDic2, KRADFILE and JMnedict"
+__description__ = "Python library for using Japanese dictionaries and resources (Jim Breen's JMdict, KanjiDic2, KRADFILE, JMnedict)"
 __url__ = "https://github.com/neocl/jamdict"
 __maintainer__ = "Le Tuan Anh"
-__version_major__ = "0.1"
-__version__ = "{}a10".format(__version_major__)
-__version_long__ = "{} - Alpha 10".format(__version_major__)
+# ------------------------------------------------------------------------------
+# Version configuration (enforcing PEP 440)
+# ------------------------------------------------------------------------------
 __status__ = "3 - Alpha"
+__version_tuple__ = (0, 1, 0, 11)
+__version_status__ = ''  # a specific value ('rc', 'dev', etc.) or leave blank to be auto-filled
+# ------------------------------------------------------------------------------
+__status_map__ = {'3 - Alpha': 'a', '4 - Beta': 'b', '5 - Production/Stable': '', '6 - Mature': ''}
+if not __version_status__:
+    __version_status__ = __status_map__[__status__]
+if len(__version_tuple__) == 3:
+    __version_build__ = ''
+elif len(__version_tuple__) == 4:
+    __version_build__ = f"{__version_tuple__[3]}"
+elif len(__version_tuple__) == 5:
+    __version_build__ = f"{__version_tuple__[3]}.post{__version_tuple__[4]}"
+else:
+    raise ValueError("Invalid version information")
+if __version_tuple__[2] == 0:
+    __version_main__ = f"{'.'.join(str(n) for n in __version_tuple__[:2])}"
+else:
+    __version_main__ = f"{'.'.join(str(n) for n in __version_tuple__[:3])}"
+__version__ = f"{__version_main__}{__version_status__}{__version_build__}"
+__version_long__ = f"{__version_main__} - {__status__.split('-')[1].strip()} {__version_build__}"
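
Note on the version logic in `jamdict/__version__.py`: the module-level branching above can be exercised more easily when wrapped in a function. The sketch below mirrors the same steps (status letter lookup, build suffix by tuple length, dropping a zero micro component). The function name `build_version` and the constant `STATUS_MAP` are hypothetical and are not part of jamdict; only the mapping values and the branching come from the diff.

```python
from typing import Tuple

# Same mapping as in the diff: development status -> PEP 440 pre-release letter.
STATUS_MAP = {'3 - Alpha': 'a', '4 - Beta': 'b', '5 - Production/Stable': '', '6 - Mature': ''}


def build_version(version_tuple: Tuple[int, ...], status: str,
                  version_status: str = '') -> Tuple[str, str]:
    """Return (version, version_long), following the branching in __version__.py."""
    if not version_status:
        version_status = STATUS_MAP[status]
    if len(version_tuple) == 3:
        build = ''
    elif len(version_tuple) == 4:
        build = f"{version_tuple[3]}"
    elif len(version_tuple) == 5:
        build = f"{version_tuple[3]}.post{version_tuple[4]}"
    else:
        raise ValueError("Invalid version information")
    # Drop a zero micro component: (0, 1, 0, ...) becomes "0.1", not "0.1.0".
    keep = 2 if version_tuple[2] == 0 else 3
    main = '.'.join(str(n) for n in version_tuple[:keep])
    version = f"{main}{version_status}{build}"
    version_long = f"{main} - {status.split('-')[1].strip()} {build}"
    return version, version_long


print(build_version((0, 1, 0, 11), '3 - Alpha'))  # ('0.1a11', '0.1 - Alpha 11')
```

With a three-element tuple the build part is empty, so the long form ends in a trailing space; whether that matters depends on where `__version_long__` is displayed.
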