Gondi-Dialect-Analysis

This paper analyzes 46 Gondi dialects spoken in Central India.
The paper is available as a [pdf](paper/gondi_dialects.pdf) and has been accepted for publication at VarDial 2017.

Bibtex:

@inproceedings{rama-coltekin-sofroniev2017gondi,
  title = {A computational analysis of Gondi dialects},
  author = {Rama, Taraka and \c{C}\"{o}ltekin, \c{C}a\u{g}r\i{} and Sofroniev, Pavel},
  date = {2017},
  booktitle = {Proceedings of the Fourth Workshop on {NLP} for Similar Languages, Varieties and Dialects},
  pages = {(to appear)},
  location = {Valencia, Spain},
}

Cognate clustering and MrBayes

The program online_pmi.py produces the nexus file which can be fed into MrBayes for the purpose of tree building.

The program takes three inputs: a file with a seed list of probable cognates, a data file with the word lists, and the coding option that selects the transcription to process. It computes PMI scores for sound segments and then uses the resulting scoring matrix to cluster the words.
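The PMI step can be illustrated with a minimal sketch. This is not the code in online_pmi.py: the function name `segment_pmi`, the input layout (word pairs already aligned segment by segment), and the lack of gap handling are all assumptions made for illustration. It estimates PMI(a, b) = log(p(a, b) / (p(a) p(b))) from counts of aligned segment pairs.

```python
import math
from collections import Counter


def segment_pmi(alignments):
    """Estimate PMI scores for sound-segment pairs (illustrative sketch).

    alignments: iterable of alignments, each a list of (seg_a, seg_b) tuples
    taken from a pair of probable cognates. Gaps are not handled here.
    """
    pair_counts = Counter()
    seg_counts = Counter()
    for alignment in alignments:
        for a, b in alignment:
            # Count both orders so the resulting scores are symmetric.
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
            seg_counts[a] += 1
            seg_counts[b] += 1

    n_pairs = sum(pair_counts.values())
    n_segs = sum(seg_counts.values())
    pmi = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n_pairs
        p_a = seg_counts[a] / n_segs
        p_b = seg_counts[b] / n_segs
        pmi[(a, b)] = math.log(p_ab / (p_a * p_b))
    return pmi


# Two toy alignments (hypothetical data, not taken from the Gondi word lists).
toy = [
    [("p", "p"), ("a", "a")],
    [("p", "b"), ("a", "a")],
]
scores = segment_pmi(toy)
```

Segment pairs that co-occur more often than chance get positive scores, so a matrix of such scores can serve as the similarity measure when comparing and clustering words.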

For example:

python3 online_pmi.py gondi_ldn_ipa1.txt data/gondi_combined.tsv IPA outputfilename

The program outputs the cognate judgments for each word belonging to a concept and the number of clusters found for each concept.
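To show how per-concept cognate judgments can end up in a nexus file for MrBayes, here is a hedged sketch of one common encoding: each (concept, cluster) pair becomes a binary presence/absence character. The function `cognates_to_nexus`, its input layout, and the choice of `DATATYPE=restriction` are assumptions for illustration; the files written by online_pmi.py may be laid out differently.

```python
def cognates_to_nexus(judgments):
    """Encode cognate clusters as a binary NEXUS data block (sketch).

    judgments maps concept -> {dialect: cluster_id}. Each (concept, cluster)
    pair becomes one binary character; a dialect with no word for a concept
    gets '?' for all of that concept's characters.
    """
    taxa = sorted({d for clusters in judgments.values() for d in clusters})
    rows = {taxon: [] for taxon in taxa}
    for concept in sorted(judgments):
        clusters = judgments[concept]
        for cluster_id in sorted(set(clusters.values())):
            for taxon in taxa:
                if taxon not in clusters:
                    rows[taxon].append("?")
                else:
                    rows[taxon].append("1" if clusters[taxon] == cluster_id else "0")

    n_chars = len(rows[taxa[0]]) if taxa else 0
    width = max((len(t) for t in taxa), default=0)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"DIMENSIONS NTAX={len(taxa)} NCHAR={n_chars};",
        "FORMAT DATATYPE=restriction MISSING=? GAP=-;",
        "MATRIX",
    ]
    lines += [f"{t.ljust(width)} {''.join(rows[t])}" for t in taxa]
    lines += [";", "END;"]
    return "\n".join(lines) + "\n"


# Hypothetical judgments for three dialects A, B, C and two concepts.
example = {
    "hand": {"A": 0, "B": 0, "C": 1},
    "water": {"A": 0, "B": 1},
}
nexus_text = cognates_to_nexus(example)
print(nexus_text)
```

Dialect C has no entry for "water" in this toy input, so its two "water" characters are written as missing data rather than as absences.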

The data folder contains gondi_combined.tsv, which holds the word lists in IPA, ASJP, and SCA formats. A second file, gondi_combined_cognates.csv, contains the cognate information given by Taraka Rama (the lead author of the paper).

Finally, the mrbayes/run.mb file contains the MrBayes commands that produce the consensus tree, which can be visualized using FigTree. The tree is rooted and was inferred using Independent Gamma Branch Rates. The mrbayes folder also provides the .nexus file and the .tre file with the consensus tree from our analysis.

Maps

The maps folder contains the gondi.kml file, which can be used for visualization.
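For readers who want the locations outside a map viewer, the sketch below pulls placemark names and coordinates out of KML text using only the standard library. It assumes the file uses the standard KML 2.2 namespace and stores locations as Point placemarks; the actual structure of gondi.kml has not been checked, and `read_placemarks` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: the file declares the standard KML 2.2 namespace.
KML_URI = "http://www.opengis.net/kml/2.2"
KML_NS = {"kml": KML_URI}


def read_placemarks(kml_text):
    """Return (name, latitude, longitude) for every Point placemark."""
    root = ET.fromstring(kml_text)
    points = []
    for placemark in root.iter(f"{{{KML_URI}}}Placemark"):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext(
            "kml:Point/kml:coordinates", default=None, namespaces=KML_NS
        )
        if coords is None:
            continue  # skip placemarks that are not simple points
        # KML stores coordinates as longitude,latitude[,altitude].
        lon, lat = coords.strip().split(",")[:2]
        points.append((name.strip(), float(lat), float(lon)))
    return points


# Tiny inline document with made-up values, standing in for maps/gondi.kml.
sample = (
    '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
    "<Placemark><name>Site A</name>"
    "<Point><coordinates>80.5,19.25,0</coordinates></Point></Placemark>"
    "</Document></kml>"
)
placemarks = read_placemarks(sample)
```

The same function can be applied to the contents of the real file once it has been read into a string.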

[] Add a pdf to the map

