Commit

adds original corpus by Gertim Alberda, converted to MuseScore 3.6.2
johentsch committed May 27, 2024
1 parent 2d4da8a commit c7d9c9d
Showing 393 changed files with 1,967,391 additions and 61 deletions.
161 changes: 161 additions & 0 deletions .gitignore
@@ -1,3 +1,164 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

*.DS_Store
*.mscx,
*.mscz,
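
Note on the last two patterns: the trailing comma is presumably intentional, since MuseScore 3 writes backup copies whose names end in a comma (for example `.B288.mscx,`), while the actual scores stay tracked. A minimal sketch of that behaviour, assuming Python's `fnmatchcase` is a close enough stand-in for gitignore matching of slash-free patterns against a base name; the file names and the `is_ignored` helper are hypothetical:

```python
from fnmatch import fnmatchcase

# The two MuseScore-related patterns from the end of the .gitignore above.
PATTERNS = ["*.mscx,", "*.mscz,"]


def is_ignored(filename: str) -> bool:
    """Approximate gitignore matching of slash-free patterns against a base name."""
    return any(fnmatchcase(filename, pattern) for pattern in PATTERNS)


# Hypothetical file names, for illustration only.
for name in ["B288.mscx", ".B288.mscx,", "B288.mscz", ".B288.mscz,"]:
    status = "ignored" if is_ignored(name) else "tracked"
    print(f"{name!r}: {status}")
```

For an authoritative answer inside a clone, `git check-ignore -v <path>` reports which pattern, if any, matches a given path.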
60 changes: 10 additions & 50 deletions .zenodo.json
@@ -1,29 +1,7 @@
{
"license": "CC-BY-NC-SA-4.0",
"description": "<p>This corpus of annotated <a href=\"https://musescore.org\">MuseScore</a> files has been created within the <a href=\"https://github.com/DCMLab/dcml_corpora\">DCML corpus initiative</a> and employs the <a href=\"https://github.com/DCMLab/standards\">DCML harmony annotation standard</a>. It is one out of nine similar corpora that have been grouped together to <a href=\"https://doi.org/10.5281/zenodo.7473560\">An Annotated Corpus of Tonal Piano Music from the Long 19th Century</a> which comes with a data report that is currently under review.</p>\n\n<p>The dataset lives on GitHub (link under &quot;Related identifiers&quot;) and is stored on Zenodo purely for conservation and automatic DOI generation for new GitHub releases. For technical reasons, we include only brief, generic instructions on how to use the data. For more detailed documentation, please refer to the dataset&#39;s GitHub page.</p>\n\n<p><strong>What is included</strong></p>\n\n<p>The dataset includes annotated MusicScores <strong>.mscx</strong> files that have been created with <a href=\"https://github.com/musescore/MuseScore/releases/tag/v3.6.2\">MuseScore 3.6.2</a> and can be opened with any MuseScore 3, or later version. Apart from that, the score information (measures, notes, harmony labels) have been extracted in the form of TSV files which can be found respectively in the folders <code>measures</code>, <code>notes</code>, and <code>harmonies</code>. They have been extracted with the Python library <a href=\"https://pypi.org/project/ms3/\">ms3</a> and its documentation has a <a href=\"https://ms3.readthedocs.io/columns\">column glossary for looking up the meaning of a column</a>.</p>\n\n<p><strong>Getting the data</strong></p>\n\n<p>You can download the dataset as a ZIP file from Zenodo or GitHub. Please note that these automatically generated ZIP files do not include submodules, which would appear as empty folders. 
If you need ZIP files, you will need to find the submodule repositories (e.g. via GitHub) and download them individually.</p>\n\n<p>Apart from that, there is the possibility to git-clone the GitHub repository to your disk. This has the advantage that it allows to version-control any changes you want to make to the dataset and to ask for your changes to be included (&quot;merged&quot;) in a future version.</p>",
"contributors": [
{
"orcid": "0000-0002-6329-7492",
"type": "DataCollector",
"name": "Amelia Brey"
},
{
"orcid": "0000-0002-6588-2257",
"type": "DataCollector",
"name": "Davor Krkljus"
},
{
"orcid": "0009-0005-5468-5871",
"type": "DataCollector",
"name": "Ehsan Mohagheghi Fard"
},
{
"orcid": "0000-0002-2105-525X",
"type": "DataCollector",
"name": "Hanné Becker"
}
],
"title": "{{ pretty_repo_name }}",
"description": "<p>This repository has been made possible by Gertim Alberda, \nwho ‘manually’ digitalized all the 389 Bach 4-voice chorales, \nas published by the <a href=\"https://imslp.org/wiki/Special:ReverseLookup/348824\">Breitkopf edition &#39;nr. 3765&#39;</a>. \nFor this he meticulously transcribed all the notes and lyrics of that edition, by using the music notation editor MuseScore (MS). \nHe checked it against both BGA and NBA (Bach-Gesellschaft Ausgabe and Neue Bach-Ausgabe), \nif there was any reasonable doubt for it, and even found and improved a couple of mistakes that way (&lt;10).\nSolely for performance reasons, he also applied many hidden (grey) extras in MS to make the scores and its phrasing \nsound as realistic as possible; hidden fermatas, tempo changes, breath pauses, phrasing, note-cutbacks, etc. \nThis was all done to &#39;humanize&#39; the playback and make it sound like a real choir performance (without words), \nand given the technical limitations at that time (2016-2018). \nAlso see the <a href=\"https://gertim-alberda.com/chorales/info.html\">Info</a> tab on his website.</p>\n<p>The full corpus resides on Gertim Alberda’s <a href=\"https://gertim-alberda.com/chorales\">dedicated website</a>. \nThe scores there have synchronized play back (synthesized and human performances) available \nand can be downloaded in various formats (mscz/xml/midi/mp3/pdf). \n<a href=\"https://gertim-alberda.com/chorales/BachChorales/B288.html\">Here is an example</a> of his playback page. \nHis intention is to have only human performances available (YT videos and/or mp3’s) for all the scores, \nbut that is still an ongoing process \n(see the <a href=\"https://gertim-alberda.com/chorales/changelog_bach_chorales.html\">changelog</a> on his website). \nHe has agreed to share his MS files under a <a href=\"https://creativecommons.org/licenses/by-nc-sa/4.0/\">Creative Commons BY-NC-SA 4.0</a> license. 
\nThe license prohibits the use of the scores for any commercial purpose, including the training of machine learning models for use in commercial products.</p>\n",
"title": "J.S. Bach - 389 Chorale Settings (Choralgesänge)",
"keywords": [
"music research",
"music theory",
@@ -32,21 +10,10 @@
"corpus studies",
"corpora",
"symbolic dataset",
"scores",
"annotated dataset",
"harmony",
"key annotations",
"chord annotations",
"phrase annotations",
"cadence annotations"
],
"grants": [
{
"id": "10.13039/501100001711::105216_182811"
}
"scores"
],
"upload_type": "dataset",
"version": "{{ corpus_release }}",
"version": "v1.0",
"communities": [
{
"identifier": "dcml"
@@ -55,34 +22,27 @@
"identifier": "epfl"
}
],
"publication_date": "2023-09-20",
"publication_date": "2024-05-27",
"creators": [
{
"orcid": "0000-0002-1986-9545",
"affiliation": "\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne",
"name": "Johannes Hentschel"
},
{
"orcid": "0000-0003-1455-5990",
"affiliation": "\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne",
"name": "Yannis Rammos"
"name": "Gertim Alberda"
},
{
"orcid": "0000-0002-4323-7257",
"orcid": "0000-0002-1986-9545",
"affiliation": "\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne",
"name": "Martin Rohrmeier"
"name": "Johannes Hentschel"
}
],
"access_right": "open",
"related_identifiers": [
{
"scheme": "url",
"identifier": "https://github.com/DCMLab/{{ repo_name }}/tree/{{ corpus_release }}",
"identifier": "https://github.com/DCMLab/389_chorale_settings/tree/v1.0",
"relation": "references"
},
{
"scheme": "url",
"identifier": "https://dcmlab.github.io/{{ repo_name }}/",
"identifier": "https://dcmlab.github.io/389_chorale_settings/",
"relation": "isDocumentedBy"
}
]
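
Note on the template fields: this diff replaces the `{{ pretty_repo_name }}`, `{{ corpus_release }}`, and `{{ repo_name }}` placeholders with concrete values. A minimal sketch of how one might check a metadata file for leftover placeholders; the `find_placeholders` helper and the abbreviated `before`/`after` excerpts are illustrative assumptions, not part of the repository:

```python
import json
import re

# Matches template placeholders such as "{{ corpus_release }}".
PLACEHOLDER = re.compile(r"\{\{\s*\w+\s*\}\}")


def find_placeholders(value, path=""):
    """Recursively collect (json_path, placeholder) pairs from parsed JSON."""
    found = []
    if isinstance(value, dict):
        for key, item in value.items():
            found += find_placeholders(item, f"{path}/{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            found += find_placeholders(item, f"{path}/{index}")
    elif isinstance(value, str):
        found += [(path, match) for match in PLACEHOLDER.findall(value)]
    return found


# Abbreviated excerpts of the metadata before and after this commit.
before = json.loads(
    '{"title": "{{ pretty_repo_name }}", "version": "{{ corpus_release }}"}'
)
after = json.loads(
    '{"title": "J.S. Bach - 389 Chorale Settings (Choralgesänge)", "version": "v1.0"}'
)

print(find_placeholders(before))  # two unrendered fields
print(find_placeholders(after))   # nothing left to render
```

To check the real file, load it with `json.load(open(".zenodo.json", encoding="utf-8"))` and pass the result to the same function.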