Skip to content

An easy-to-use library for the ISO 639 language representation standards

License

Notifications You must be signed in to change notification settings

tijmenbaarda/iso639

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

92 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

iso639-lang

PyPI Supported Python versions PyPI - Downloads

iso639-lang is a simple library to handle the ISO 639 series of international standards for language codes.

>>> from iso639 import Lang
>>> Lang("fr")
Lang(name='French', pt1='fr', pt2b='fre', pt2t='fra', pt3='fra', pt5='')

iso639-lang allows you to switch from one language code to another easily. There’s no need to manually download or parse data files, just use the Lang class!

ISO 639-1, ISO 639-2, ISO 639-3 and ISO 639-5 parts are all supported.

Installing iso639-lang and Supported Versions

iso639-lang is available on PyPI:

$ pip install iso639-lang

iso639-lang supports Python 3.6+.

Usage

Handling language codes with iso639-lang is very simple.

Begin by importing the Lang class:

>>> from iso639 import Lang

Lang is instantiable with any ISO 639 language code or name. For example, let’s try to get the ISO 639 codes for French:

>>> lg = Lang("French")
>>> lg.name
'French'
>>> lg.pt1
'fr'
>>> lg.pt2b
'fre'
>>> lg.pt2t
'fra'
>>> lg.pt3
'fra'
>>> lg.pt5
''

You can use the asdict method to return ISO 639 language values as a Python dictionary.

>>> lg.asdict()
{'name': 'French', 'pt1': 'fr', 'pt2b': 'fre', 'pt2t': 'fra', 'pt3': 'fra', 'pt5': ''}

In data structures

Lists of Lang instances are sortable by name.

>>> langs = [Lang("deu"), Lang("eng"), Lang("rus"), Lang("eng")]
>>> [lg.name for lg in sorted(langs)]
['English', 'English', 'German', 'Russian']

As Lang is hashable, Lang instances can be added to a set or used as dictionary keys.

>>> [lg.pt3 for lg in set(langs)]
['eng', 'rus', 'deu']

Iterator

iter_langs() iterates through all possible Lang instances, ordered alphabetically by name.

>>> from iso639 import iter_langs
>>> [lg.name for lg in iter_langs()]
["'Are'are", "'Auhelawa", "A'ou", ... , 'ǂHua', 'ǂUngkue', 'ǃXóõ']

Language Types

The type of a language is accessible thanks to the type method.

>>> lg = Lang("Latin")
>>> lg.type()
'Ancient'

Macrolanguages

You can easily determine whether a language is a macrolanguage or an individual language.

>>> lg = Lang("Arabic")
>>> lg.scope()
'Macrolanguage'

Use the macro method to get the macrolanguage of an individual language.

>>> lg = Lang("Wu Chinese")
>>> lg.macro()
Lang(name='Chinese', pt1='zh', pt2b='chi', pt2t='zho', pt3='zho', pt5='')

Conversely, you can also list all the individual languages that share a common macrolanguage.

>>> lg = Lang("Persian")
>>> lg.individuals()
[Lang(name='Iranian Persian', pt1='', pt2b='', pt2t='', pt3='pes', pt5=''), 
Lang(name='Dari', pt1='', pt2b='', pt2t='', pt3='prs', pt5='')]

Exceptions

When an invalid language value is passed to Lang, an InvalidLanguageValue exception is raised.

>>> from iso639.exceptions import InvalidLanguageValue
>>> try:
...     Lang("foobar")
... except InvalidLanguageValue as e:
...     e.msg
... 
"'foobar' not supported by ISO 639"

When an deprecated language value is passed to Lang, a DeprecatedLanguageValue exception is raised.

>>> from iso639.exceptions import DeprecatedLanguageValue
>>> try:
...     Lang("gsc")
... except DeprecatedLanguageValue as e:
...     lg = Lang(e.change_to)
...     f"{e.name} replaced by {lg.name}"
...
'Gascon replaced by Occitan (post 1500)'

Sources of data used by iso639-lang

As of November 12, 2023, iso639-lang is based on the latest official code tables provided by the ISO 639 registration authorities.

Standard Name Registration Authority
ISO 639-1 Part 1: Alpha-2 code Infoterm
ISO 639-2 Part 2: Alpha-3 code Library of Congress
ISO 639-3 Part 3: Alpha-3 code for comprehensive coverage of languages SIL International
ISO 639-4 Part 4: Implementation guidelines and general principles for language coding (not a list) ISO/TC 37/SC 2
ISO 639-5 Part 5: Alpha-3 code for language families and groups Library of Congress

About

An easy-to-use library for the ISO 639 language representation standards

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%