CLLD Meta

How to cite

If you use these data, please cite this dataset using the DOI of the specific released version you used.

Description

This dataset is licensed under a CC-BY-4.0 license.

Available online at meta.clld.org.

Basic Workflow

Creating the meta database is a three-step process:

1. Download metadata for existing datasets from Zenodo. This will update the metadata in raw/zenodo-metadata.json.

   $ cldfbench clld-meta.updatemd cldfbench_clld_meta.py

2. Download the datasets themselves. They will be downloaded into the raw/datasets/ folder.

   $ cldfbench download cldfbench_clld_meta.py

3. Look through the datasets and create the meta database. This will update the CLDF dataset in cldf/ and also add files that don't contain any CLDF data to etc/not-cldf.csv, so they can be avoided in the future.

   $ cldfbench makecldf cldfbench_clld_meta.py
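The three steps depend on each other, so it can be convenient to run them from one script that stops at the first failure. The following is a minimal sketch, not part of the project: the names STEPS, workflow_commands and run_workflow are hypothetical, and it assumes cldfbench is on the PATH and that the script is started from the repository root.

```python
import subprocess

# Module file passed to every cldfbench call, as in the commands above.
DATASET_MODULE = "cldfbench_clld_meta.py"

# The three subcommands, in the order the workflow requires.
STEPS = ["clld-meta.updatemd", "download", "makecldf"]


def workflow_commands(module=DATASET_MODULE):
    """Return the argument list for each workflow step (hypothetical helper)."""
    return [["cldfbench", step, module] for step in STEPS]


def run_workflow(runner=subprocess.run):
    """Run each step in order; check=True raises on the first failing step."""
    for command in workflow_commands():
        runner(command, check=True)
```

Calling run_workflow() executes the same three commands shown above. The runner parameter exists only so the sequence can be inspected without actually invoking cldfbench.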

Important files

raw/zenodo-metadata.json: contains the metadata downloaded from Zenodo. This file is updated automatically by the updatemd command.
etc/blacklist.csv: contains DOIs for datasets that should be excluded from the meta database (e.g. the CLDF version of Glottolog). This file is meant to be edited manually.
etc/whitelist.csv: contains DOIs for datasets that should explicitly be added to the meta database. This file is meant to be edited manually.
etc/not-cldf.csv: contains a list of dataset files that are known to not contain CLDF. These files will not be downloaded or scanned for CLDF data. This file is updated automatically by the makecldf command.
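To illustrate how the blacklist and whitelist could interact, here is a small sketch written as plain set arithmetic. It is not the project's code: the function select_dois is hypothetical, the DOIs in the usage note are made-up placeholders, and the rule that a blacklisted DOI is dropped even if it is also whitelisted is an assumption, since the description above does not say which list takes precedence.

```python
def select_dois(discovered, whitelist, blacklist):
    """Combine discovered and whitelisted DOIs, then drop blacklisted ones.

    Assumption: the blacklist wins when a DOI appears in both lists.
    """
    return sorted((set(discovered) | set(whitelist)) - set(blacklist))
```

For example, select_dois(["10.0000/a", "10.0000/b"], ["10.0000/c"], ["10.0000/b"]) keeps the DOIs ending in a and c and drops the one ending in b.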

Using a Personal Access Token to access Zenodo

Because this project downloads a lot of data, the updatemd or download commands may hit the rate limits of Zenodo's API.

If you need a higher rate limit, you can set up a Personal Access Token and store it in the CLLD_META_ACCESS_TOKEN environment variable before running cldfbench:

$ export CLLD_META_ACCESS_TOKEN=AbCdEfG[…]
$ cldfbench download cldfbench_clld_meta.py
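As a rough illustration of what happens with such a token, the sketch below reads the variable and turns it into an HTTP header. How this project actually passes the token to Zenodo is not shown here, so treat this as an assumption: the helper zenodo_auth_headers is hypothetical, and the Bearer header is one of the authentication forms Zenodo's REST API accepts.

```python
import os

# Environment variable named in the instructions above.
TOKEN_VARIABLE = "CLLD_META_ACCESS_TOKEN"


def zenodo_auth_headers(environ=os.environ):
    """Return request headers carrying the token, or {} if none is set.

    Hypothetical helper: it only builds the header and sends no request.
    """
    token = environ.get(TOKEN_VARIABLE, "").strip()
    if not token:
        return {}
    return {"Authorization": f"Bearer {token}"}
```

Returning an empty dictionary when the variable is unset keeps anonymous access working, at the default rate limit.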

CLDF Datasets

The following CLDF datasets are available in cldf:

CLDF Generic at cldf/cldf-metadata.json
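A quick way to see what the generated dataset contains is to read the metadata file directly. The sketch below uses only the standard library and relies on the general layout of CLDF metadata (a JSON document with a dc:conformsTo entry and a tables list whose items carry a url). The function name describe_cldf_metadata is hypothetical, and which tables this particular dataset defines is not stated here.

```python
import json
from pathlib import Path


def describe_cldf_metadata(path="cldf/cldf-metadata.json"):
    """Return the CLDF module URI and the table file names from the metadata."""
    metadata = json.loads(Path(path).read_text(encoding="utf-8"))
    module = metadata.get("dc:conformsTo")
    tables = [table.get("url") for table in metadata.get("tables", [])]
    return module, tables
```

For full access to the data, a dedicated CLDF library such as pycldf is likely the better choice. This sketch only inspects the metadata.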
