Select the recording-work pairs from the corpus #1

Open
6 tasks
sertansenturk opened this issue Oct 2, 2015 · 2 comments


@sertansenturk

  • From MusicBrainz fetch the metadata of all recordings, including the work mbid.
  • From the work mbid, get the makam.
  • From the SymbTr-mbid mapping, keep only the work-recording pairs that have a score.
  • Eliminate the work-recording pairs whose SymbTr files are erroneous, as listed in "SymbTr files without any section info" (SymbTr#5) and "SymbTr files with inconsistent lyrics/section information" (SymbTr#6). Some scores also have no section information by nature (e.g. taksim transcriptions, seyirs); it is better to disregard them too. The complete list is [here](SymbTr-metadata-extractor/blob/master/logs/no_section_log.txt).
  • From the remaining pairs, fetch and divide the pairs according to the 20 makams used in the [turkish_makam_recognition_dataset](https://github.com/MTG/turkish_makam_recognition_dataset/tree/master/data).
  • At this point, we will probably observe that there are 25-30 recordings for the makam with fewest pairs. Starting from the makam with the fewest pairs, we will select the recordings by listening. The target is 20 pairs per makam.
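
The filtering and grouping steps above could be sketched roughly as follows. This is only an illustration on already-fetched metadata: the function names, the dictionary layouts and all identifiers such as `rec-1`, `work-a` or `score-a` are hypothetical placeholders, not the actual scripts or real mbids. The final selection by listening is manual and is not modelled here.

```python
from collections import defaultdict


def select_candidate_pairs(recordings, work_makam, work_symbtr,
                           bad_symbtr, target_makams):
    """Group (recording, work) pairs that have a usable score by makam.

    recordings    : iterable of (recording_mbid, work_mbid) tuples
    work_makam    : dict mapping work_mbid -> makam name
    work_symbtr   : dict mapping work_mbid -> SymbTr score name
    bad_symbtr    : set of SymbTr names to discard (erroneous or no sections)
    target_makams : set of makam names to keep
    """
    pairs_by_makam = defaultdict(list)
    for rec_mbid, work_mbid in recordings:
        symbtr = work_symbtr.get(work_mbid)
        if symbtr is None or symbtr in bad_symbtr:
            continue  # the work has no score, or its score is excluded
        makam = work_makam.get(work_mbid)
        if makam not in target_makams:
            continue  # makam unknown or outside the target set
        pairs_by_makam[makam].append((rec_mbid, work_mbid))
    return dict(pairs_by_makam)


def makams_by_scarcity(pairs_by_makam):
    """Order makams from fewest to most pairs (ties broken by name)."""
    return sorted(pairs_by_makam, key=lambda m: (len(pairs_by_makam[m]), m))


# Toy data, purely illustrative.
recordings = [("rec-1", "work-a"), ("rec-2", "work-a"), ("rec-3", "work-b"),
              ("rec-4", "work-c"), ("rec-5", "work-d")]
work_makam = {"work-a": "hicaz", "work-b": "rast",
              "work-c": "hicaz", "work-d": "segah"}
work_symbtr = {"work-a": "score-a", "work-b": "score-b", "work-c": "score-c"}
bad_symbtr = {"score-c"}          # e.g. names read from no_section_log.txt
target_makams = {"hicaz", "rast"}

pairs = select_candidate_pairs(recordings, work_makam, work_symbtr,
                               bad_symbtr, target_makams)
order = makams_by_scarcity(pairs)
```

Listening would then start from the first makam in `order`, since that makam leaves the least room for choice.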
@sertansenturk

We would like to minimize the number of compositions shared among different recordings, maximize the variety of instrumentation/voicing, and try to add "expressive" performances (e.g. performances of grandmasters). There should be challenging pairs so that we can also discuss the limitations of the algorithms. While adding variety in form could be interesting, our core audio-score alignment system is mainly tested on pesrev, sazsemaisi and sarkis, which are the most common forms in classical makam music. Hence it might be better to stick to these forms.
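
As a rough aid for the listening step, the candidates of a makam could be pre-sorted along these lines. This is a minimal sketch under assumptions: the candidate dictionaries, the `form` field and the function name `shortlist` are hypothetical, and the instrumentation and expressiveness criteria still have to be judged by ear.

```python
from collections import Counter

# Forms on which the alignment system has mainly been tested.
PREFERRED_FORMS = {"pesrev", "sazsemaisi", "sarki"}


def shortlist(candidates, already_selected_works=()):
    """Keep candidates in a preferred form, least-used works first.

    candidates             : list of dicts with 'recording', 'work', 'form'
    already_selected_works : work ids of the pairs chosen so far
    """
    used = Counter(already_selected_works)
    kept = [c for c in candidates if c["form"] in PREFERRED_FORMS]
    # sorted() is stable, so candidates with equal counts keep their order.
    return sorted(kept, key=lambda c: used[c["work"]])


# Toy data, purely illustrative.
candidates = [
    {"recording": "rec-1", "work": "work-a", "form": "sarki"},
    {"recording": "rec-2", "work": "work-b", "form": "taksim"},
    {"recording": "rec-3", "work": "work-c", "form": "pesrev"},
]
ranked = shortlist(candidates, already_selected_works=["work-a"])
```

Here `rec-3` is ranked before `rec-1` because `work-a` has already been selected once, and `rec-2` is dropped because its form is outside the preferred set.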

@sertansenturk

@altugkarakurt has coded most of the scripts needed to fetch and process the metadata for turkish_makam_recognition_dataset.
