Did you know that English tokens represent more than 90% of the training data of generalist LLMs?
We're OpenLLM Europe 🇪🇺, an open source community committed to empowering LLM projects in all European languages, especially medium- and low-resource languages. We aim to build the first multimodal, multilingual European model with partners all over the continent.
- OpenLLM-Europe 🇪🇺
- Discord: https://discord.com/invite/b5UQTWQn
- Contact:[email protected] - https://github.com/OpenLLM-Europe
Our work is 100% open and fits in with ALT-EDIC's mission, which you can discover here: https://language-data-space.ec.europa.eu/related-initiatives/alt-edic_en
- ALT = Alliance for Language Technologies
- EDIC = European Digital Infrastructure Consortium
The mission of the ALT-EDIC is to develop a common European infrastructure in Language Technologies, focussing particularly on Large Language Models. It seeks to improve European competitiveness, increase the availability of European language data and uphold Europe's linguistic diversity and cultural richness. The ALT-EDIC is a multi-country project, run and funded by the Member States that have agreed to join it. By pooling resources, the members should achieve the critical mass of data and other resources needed to create and fine-tune Large Language Models, which any single member would find difficult to do alone.
OpenLLM Europe 🇪🇺 contributes to this effort by identifying national initiatives that create LLMs or training datasets and working to federate them. Our goal is to federate, co-create, and promote open source, sovereign generative AI digital commons.
Here is a list of Open Source projects in AI (mostly LLMs) that we have gathered during our research.
Feel free to use it to build great things together, to amend it, and to add projects that we missed. PRs are welcome! You can also join our Discord server.
- Insat - Contact:[email protected]
- CroAI - https://www.linkedin.com/posts/croai_large-language-models-have-demonstrated-impressive-activity-7167796231417520128-AlDs/
- Czech BERT - Contact:[email protected]
- Danish foundation models - https://www.linkedin.com/in/saattrupdan/
- Danskgpt - Contact:[email protected]
- Going Dutch - Contact:[email protected]
- Stability AI - Multilingual - https://stability.ai/contact
- NOUS Research - Contact:[email protected]
- TartuNLP - Discord: https://discord.gg/tartunlp - Contact:[email protected]
- PORO silogen - Contact:[email protected]
- Le Bon LLM - https://www.linkedin.com/company/le-bon-llm/
- OpenLLM France - Contact:[email protected] - https://www.openllm-france.fr
- LAION - Discord: https://discord.com/invite/laion - Contact: [email protected]
- OpenGPTX - Discord: https://discord.gg/ZmF2dJgJ - Contact: [email protected]
- Fraunhofer IAIS - Contact:[email protected]
- GFOSS - Contact:[email protected]
- Hilanco - Contact:[email protected]
- HUN-REN - Contact:[email protected]
- gaBERT - Discord: https://discord.com/invite/b5UQTWQn - Contact:[email protected]
- Fauno Italian LLM - Contact:[email protected]
- NLP Odyssey - Discord: https://discord.gg/nlpodyssey - Contact:[email protected]
- LVBERT - Contact:[email protected]
- EMBEDDIA - Contact:[email protected]
- Tilde AI-powered language technologies - Contact:https://www.linkedin.com/in/andrejs-vasiljevs/
- Tollef Jørgensen - Contact:[email protected]
- Polbert - Contact:[email protected]
- Sabia - Contact:[email protected]
- LLM for Romanian - Contact:[email protected]
- Beia Consult International - Contact:[email protected]
- Serbian LLM - Serbian 🇷🇸 - https://www.linkedin.com/in/aleksagordic/
- KInit - https://www.linkedin.com/in/juraj-bezdek-6b521346/
- Blip.solution - Contact:[email protected]
- SloBERTa - Contact:[email protected]
- Projecte Aina: Aguila Alpaca - Discord: https://discord.gg/projecte-aina - Contact:[email protected]
- BSC - Barcelona Supercomputing Center - Contact:[email protected]
- Expert AI - Contact:[email protected]
- AI Sweden - Contact:[email protected]
- Satisfied - Discord: https://discord.gg/statisfied - Contact:[email protected]
- HPLT - Contact:[email protected]
- Unbabel - Contact:https://communityonboarding.unbabel.com/signup/step/0
- Occiglot - Contact:brack.cs.tu-darmstadt.de
- TrustLLM - Contact:[email protected]
- Luxembourg Institute of Science and Technology - Luxembourg 🇱🇺 - Contact:[email protected]
- Sosnitskij - https://www.linkedin.com/in/said-azizov-6b5a82256/
- Evidently AI - Multilingual - Discord: https://discord.gg/evidentlyai - Contact:[email protected]
- YugoGPT - 🇷🇸🇭🇷🇧🇦🇲🇰🇽🇰 - Discord: https://discord.gg/yugogpt - https://www.linkedin.com/in/aleksagordic/
- LangFuse - US project using European languages 🇺🇸 - Contact:[email protected]
- Sayhan - Turkish 🇹🇷 - https://www.linkedin.com/in/sayhan-yalva%C3%A7er-0617641b1/
- Sestek - Turkish 🇹🇷 - Contact:[email protected]
- AI Forever - Armenian 🇦🇲 - https://www.linkedin.com/in/said-azizov-6b5a82256/
- Yandex YaLM 100B - Russian and English 🇷🇺🇬🇧 - Contact:[email protected]
- EleutherAI - International collaboration using English - Discord: https://discord.com/invite/zBGx3azzUn - Contact:[email protected]