forked from goodmami/wn

A modern, interlingual wordnet interface for Python


francis-dion/wn

 
 

a Python library for wordnets
Available Wordnets | Documentation | FAQ | Migrating from NLTK | Roadmap


Wn is a Python library for exploring information in wordnets. Install it from PyPI and download some data:

$ pip install wn
$ python -m wn download oewn:2021  # the Open English WordNet 2021

Then start exploring:

>>> import wn
>>> en = wn.Wordnet('oewn:2021')  # Create Wordnet object to query
>>> ss = en.synsets('win')[0]     # Get the first synset for 'win'
>>> ss.definition()               # Get the synset's definition
'be the winner in a contest or competition; be victorious'

Features

Available Wordnets

Any WN-LMF-formatted wordnet can be added to Wn's database from a local file or remote URL. Wn also maintains an index of available projects (see wn/index.toml), similar to a package manager's index for software, to aid in discovering and downloading new wordnets. The projects in this index are listed below.

English Wordnets

There are several English wordnets available. In general it is recommended to use the latest Open English Wordnet, but if you have stricter compatibility needs for, e.g., experiment replicability, you may try the OMW English Wordnet based on WordNet 3.0 (compatible with the Princeton WordNet 3.0 and with the NLTK), or OpenWordnet-EN (for use with the Portuguese wordnet OpenWordnet-PT).

Name Specifier # Synsets Notes
Open English WordNet oewn:2021 120039 Recommended
Open English WordNet ewn:2020 120053
Open English WordNet ewn:2019 117791
OMW English Wordnet based on WordNet 3.0 omw-en:1.4 117659 Included with omw:1.4
OMW English Wordnet based on WordNet 3.1 omw-en31:1.4 117791
OpenWordnet-EN own-en:1.0.0 117659 Included with own:1.0.0
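
Each specifier in the table pairs a lexicon ID with a version, joined by a colon. The sketch below shows one way to split a specifier into those two parts. The `Specifier` type and `parse_specifier` function are hypothetical helpers written for illustration and are not part of Wn, which resolves specifiers itself when you pass them to `wn.Wordnet` or `python -m wn download`.

```python
from typing import NamedTuple, Optional


class Specifier(NamedTuple):
    """A lexicon ID plus an optional version (hypothetical helper type)."""
    lexicon_id: str
    version: Optional[str]


def parse_specifier(spec: str) -> Specifier:
    """Split 'id:version' at the first colon; the version may be absent."""
    lexicon_id, _, version = spec.partition(":")
    if not lexicon_id:
        raise ValueError(f"missing lexicon ID in specifier: {spec!r}")
    # An empty version (no colon, or nothing after it) is reported as None.
    return Specifier(lexicon_id, version or None)


print(parse_specifier("oewn:2021"))
print(parse_specifier("omw-en31:1.4"))
```

Splitting at the first colon only keeps versions such as `1.0.0` intact, since the dots and any later characters stay in the version part.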

Other Wordnets and Collections

These are standalone non-English wordnets and collections. The wordnets of each collection are listed further down.

Name Specifier # Synsets Language
Open Multilingual Wordnet omw:1.4 n/a multiple [mul]
Open German WordNet odenet:1.4 36268 German [de]
Open German WordNet odenet:1.3 36159 German [de]
Open Wordnets for Portuguese and English own:1.0.0 n/a multiple [mul]
KurdNet kurdnet:1.0 2144 Kurdish [ckb]

Open Multilingual Wordnet (OMW) Collection

The Open Multilingual Wordnet collection (omw:1.4) installs the following lexicons, which can also be downloaded and installed independently:

Name Specifier # Synsets Language
Albanet omw-sq:1.4 4675 Albanian [sq]
Arabic WordNet (AWN v2) omw-arb:1.4 9916 Arabic [arb]
BulTreeBank Wordnet (BTB-WN) omw-bg:1.4 4959 Bulgarian [bg]
Chinese Open Wordnet omw-cmn:1.4 42312 Mandarin (Simplified) [cmn-Hans]
Croatian Wordnet omw-hr:1.4 23120 Croatian [hr]
DanNet omw-da:1.4 4476 Danish [da]
FinnWordNet omw-fi:1.4 116763 Finnish [fi]
Greek Wordnet omw-el:1.4 18049 Greek [el]
Hebrew Wordnet omw-he:1.4 5448 Hebrew [he]
IceWordNet omw-is:1.4 4951 Icelandic [is]
Italian Wordnet omw-iwn:1.4 15563 Italian [it]
Japanese Wordnet omw-ja:1.4 57184 Japanese [ja]
Lithuanian WordNet omw-lt:1.4 9462 Lithuanian [lt]
Multilingual Central Repository omw-ca:1.4 45826 Catalan [ca]
Multilingual Central Repository omw-eu:1.4 29413 Basque [eu]
Multilingual Central Repository omw-gl:1.4 19312 Galician [gl]
Multilingual Central Repository omw-es:1.4 38512 Spanish [es]
MultiWordNet omw-it:1.4 35001 Italian [it]
Norwegian Wordnet omw-nb:1.4 4455 Norwegian (Bokmål) [nb]
Norwegian Wordnet omw-nn:1.4 3671 Norwegian (Nynorsk) [nn]
OMW English Wordnet based on WordNet 3.0 omw-en:1.4 117659 English [en]
Open Dutch WordNet omw-nl:1.4 30177 Dutch [nl]
OpenWN-PT omw-pt:1.4 43895 Portuguese [pt]
plWordNet omw-pl:1.4 33826 Polish [pl]
Romanian Wordnet omw-ro:1.4 56026 Romanian [ro]
Slovak WordNet omw-sk:1.4 18507 Slovak [sk]
sloWNet omw-sl:1.4 42583 Slovenian [sl]
Swedish (SALDO) omw-sv:1.4 6796 Swedish [sv]
Thai Wordnet omw-th:1.4 73350 Thai [th]
WOLF (Wordnet Libre du Français) omw-fr:1.4 59091 French [fr]
Wordnet Bahasa omw-id:1.4 38085 Indonesian [id]
Wordnet Bahasa omw-zsm:1.4 36911 Malaysian [zsm]

Open Wordnet (OWN) Collection

The Open Wordnets for Portuguese and English collection (own:1.0.0) installs the following lexicons, which can also be downloaded and installed independently:

Name Specifier # Synsets Language
OpenWordnet-PT own-pt:1.0.0 52670 Portuguese [pt]
OpenWordnet-EN own-en:1.0.0 117659 English [en]

Collaborative Interlingual Index

While not a wordnet, the Collaborative Interlingual Index (CILI) represents the interlingual backbone of many wordnets. Wn will function without CILI loaded, interlingual queries included, but adding it to the database makes available the full list of concepts, their status (active, deprecated, etc.), and their definitions.

Name Specifier # Concepts
Collaborative Interlingual Index cili:1.0 117659

Changes to the Index

ewn → oewn

The 2021 version of the Open English WordNet (oewn:2021) has changed its lexicon ID from ewn to oewn, so the index is updated accordingly. The previous versions are still available as ewn:2019 and ewn:2020.

pwn → omw-en, omw-en31

The wordnet formerly called the Princeton WordNet (pwn:3.0, pwn:3.1) is now called the OMW English Wordnet based on WordNet 3.0 (omw-en) and the OMW English Wordnet based on WordNet 3.1 (omw-en31). This is more accurate, as it is an OMW-produced derivative of the original WordNet data, and it also avoids license or trademark issues.

*wn → omw-* for OMW wordnets

All OMW wordnets have changed their ID scheme from *wn to omw-*, and the version no longer includes +omw (e.g., bulwn:1.3+omw is now omw-bg:1.4).
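
If older scripts still use the previous specifiers, a small lookup table can translate them. The sketch below covers only the renames named in this section, paired with the versions listed in the tables above. `RENAMED` and `update_specifier` are hypothetical names used for illustration and are not part of Wn.

```python
# Old specifier -> current specifier. Limited to the renames described in
# this section; the target versions come from the tables earlier in this
# document. This mapping is an illustrative assumption, not Wn's own data.
RENAMED = {
    "pwn:3.0": "omw-en:1.4",
    "pwn:3.1": "omw-en31:1.4",
    "bulwn:1.3+omw": "omw-bg:1.4",
}


def update_specifier(spec: str) -> str:
    """Return the current specifier, or the input unchanged if not renamed."""
    return RENAMED.get(spec, spec)


for old in ("pwn:3.0", "bulwn:1.3+omw", "ewn:2020"):
    print(old, "->", update_specifier(old))
```

Specifiers that were not renamed, such as ewn:2019 and ewn:2020, pass through unchanged, which matches the note above that those versions remain available under the ewn ID.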
