Hi! This version of the common_voice dataset does not include Swahili (sw).
Instead, use the mozilla-foundation/common_voice_8_0 dataset:
from datasets import load_dataset
common_voice_train = load_dataset("mozilla-foundation/common_voice_8_0", "sw", split="train")
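If it helps, here is a rough sketch of how you could check which dataset ID actually offers a given language config before calling load_dataset. The helper name find_dataset_with_config and its list_configs parameter are made up for illustration; only get_dataset_config_names comes from the datasets library, and its behaviour may differ between library versions. Note too that the mozilla-foundation datasets may require logging in to the Hub and accepting their terms first.

```python
from typing import Callable, Iterable, List, Optional


def _hub_config_names(dataset_id: str) -> List[str]:
    # Lazy import: `datasets` is only needed when querying the Hub for real.
    from datasets import get_dataset_config_names
    return get_dataset_config_names(dataset_id)


def find_dataset_with_config(
    lang: str,
    candidates: Iterable[str],
    list_configs: Callable[[str], List[str]] = _hub_config_names,
) -> Optional[str]:
    """Return the first dataset ID in `candidates` that has a `lang` config.

    Hypothetical helper, not part of the `datasets` library.
    """
    for dataset_id in candidates:
        try:
            configs = list_configs(dataset_id)
        except Exception:
            # Dataset missing, gated, or unreachable: try the next candidate.
            continue
        if lang in configs:
            return dataset_id
    return None
```

Calling find_dataset_with_config("sw", ["common_voice", "mozilla-foundation/common_voice_8_0"]) and passing the result to load_dataset would avoid the BuilderConfig error, assuming one of the candidates really does provide the language.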
Hi guys,
I'm having trouble loading the Swahili dataset from Common Voice in Google Colab. The Swahili data is already available through the Hugging Face datasets package, but I can't load it.
My approach:
!pip install datasets==2.0.0
from datasets import load_dataset
common_voice_train = load_dataset("common_voice", "sw", split="train")
The resulting error:
ValueError: BuilderConfig sw not found. Available: ['ab', 'ar', 'as', 'br', 'ca', 'cnh', 'cs', 'cv', 'cy', 'de', 'dv', 'el', 'en', 'eo', 'es', 'et', 'eu', 'fa', 'fi', 'fr', 'fy-NL', 'ga-IE', 'hi', 'hsb', 'hu', 'ia', 'id', 'it', 'ja', 'ka', 'kab', 'ky', 'lg', 'lt', 'lv', 'mn', 'mt', 'nl', 'or', 'pa-IN', 'pl', 'pt', 'rm-sursilv', 'rm-vallader', 'ro', 'ru', 'rw', 'sah', 'sl', 'sv-SE', 'ta', 'th', 'tr', 'tt', 'uk', 'vi', 'vot', 'zh-CN', 'zh-HK', 'zh-TW']
Any help solving this would be appreciated, as would any other way of loading the Swahili Common Voice data without downloading it manually.
Thank you