Update README.md

Updated descriptions for content and style
cldf-datasets · Oct 10, 2024 · d38752f
1 parent 322f974
commit d38752f
Showing 1 changed file with 11 additions and 12 deletions.
diff --git a/README.md b/README.md
@@ -17,30 +17,29 @@ This dataset is licensed under a CC-BY-4.0 license
 
 ### Forms
 
-Words for the concepts 'five' and 'hand' in Austronesian languages have been collected from four datasets
-described in the [ContributionTable](cldf/contributions.csv). Since forms were aggregated on language
+Words for the concepts ‘five’ and ‘hand’ in Austronesian languages have been collected from four datasets,
+described in the [ContributionTable](cldf/contributions.csv). Since some entries in these datasets contain multiple forms for a single concept, and since forms were aggregated at the language
 level (with forms for dialects taken as forms for the parent language) and across datasets, often more
-than one form per language and concept was attested.
-If multiple forms were attested, one was chosen so as to maximize potential for colexification.
+than one form per concept in a given language was attested.
+If multiple forms were attested, one was chosen so as to maximize the potential for finding colexification.
 In other words, the pair of forms selected for a language is the one closest to exhibiting full 
 colexification (or, failing that, partial colexification). The decision was made so as to minimize 
-"false negatives" (i.e., cases where there could appear to be *no* colexification of the two concepts, 
-but only because there are, e.g., two synonyms for 'hand' in a given language and the particular dataset chose the "wrong" one).
+“false negatives” (i.e., cases where there could appear to be *no* colexification of the two concepts, 
+but only because there are, e.g., two synonyms for ‘hand’ in a given language and the particular dataset chose the “wrong” one).
 
 
 ### Features
 
-Based on the words for 'five' and 'hand' collected in the [FormTable](cldf/forms.csv) and the inferred
-replacement events (described below), six features have been coded, with values reported in the 
-[ValueTable](cldf/values.csv). The distribution of values for these features can be investigated 
+Based on the words for ‘five’ and ‘hand’ collected in the [FormTable](cldf/forms.csv), six features have been coded, with values reported in the 
+[ValueTable](cldf/values.csv). A seventh feature details the types of numeral systems found in the languages in this dataset; it derives from Barlow (2023) “Papuan-Austronesian contact and the spread of numeral systems in Melanesia”, updated to reflect the changes in classifications between Glottolog versions 4.7 and 5.0. The distribution of values for these features can be investigated 
 using [geographical maps](maps/README.md).
 
 
 ### Replacement events
 
 Replacement events (i.e., rows in the [replacements table](cldf/replacements.csv)) represent a probable loss of the 
 inherited form for ‘hand’ or ‘five’, whether in the individual history of a single language or in a protolanguage ancestral
-to multiple languages, with Glottolog languoids (i.e. language subgroups or individual languages in the Glottolog 5.0 
+to multiple languages, with Glottolog languoids (i.e., language subgroups or individual languages in the Glottolog 5.0 
 classification of the Austronesian family) serving as proxies. While the replacements table lists the name and Glottocode of this
 languoid, the individual languages in our sample that fall within this designation are linked via the Glottocodes in the
 `Language_IDs` column.
@@ -55,7 +54,7 @@ sqlite> select distinct r.subgroup from languagetable as l, "replacements.csv_la
 ts.csv_cldf_id" = r.cldf_id and l.cldf_name = 'Lenkau';
 South-East Admiralty
 ```
-and if we wanted to see which other languages are subsumed under "South-East Admiralty", we could run
+and if we wanted to see which other languages are subsumed under “South-East Admiralty”, we could run
 ```
 sqlite> select distinct l.cldf_name from languagetable as l, "replacements.csv_languagetable" as rl, "replacements.csv" as r where l.cldf_id == rl.languagetable_cldf_id and rl."replacements.csv_cldf_id" = r.cldf_id and r.subgroup = 'South-East Admiralty';
 Lenkau
@@ -88,4 +87,4 @@ Kela (Papua New Guinea),possibly shared change between Bugawac/Kela
 
 The following CLDF datasets are available in [cldf](cldf):
 
-- CLDF [StructureDataset](https://github.com/cldf/cldf/tree/master/modules/StructureDataset) at [cldf/StructureDataset-metadata.json](cldf/StructureDataset-metadata.json)
+- CLDF [StructureDataset](https://github.com/cldf/cldf/tree/master/modules/StructureDataset) at [cldf/StructureDataset-metadata.json](cldf/StructureDataset-metadata.json)
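
Note on the `sqlite>` queries in the second hunk: the subgroup lookup can also be run from Python with explicit `JOIN`s and a bound parameter. The following is a minimal sketch, not the dataset's own tooling. The table and column names are copied from the queries shown in the diff. The column types, the helper name `languages_in_subgroup`, and every row other than 'Lenkau' / 'South-East Admiralty' are illustrative assumptions, and the in-memory toy database stands in for the real SQLite file.

```python
import sqlite3

# Toy in-memory database. Table/column names follow the queries in the README;
# column types and all "Placeholder" rows are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE languagetable (cldf_id TEXT PRIMARY KEY, cldf_name TEXT);
CREATE TABLE "replacements.csv" (cldf_id TEXT PRIMARY KEY, subgroup TEXT);
CREATE TABLE "replacements.csv_languagetable" (
    "replacements.csv_cldf_id" TEXT,
    languagetable_cldf_id TEXT
);
INSERT INTO languagetable VALUES
    ('lang1', 'Lenkau'),
    ('lang2', 'Placeholder B'),
    ('lang3', 'Placeholder C');
INSERT INTO "replacements.csv" VALUES
    ('r1', 'South-East Admiralty'),
    ('r2', 'Placeholder subgroup');
INSERT INTO "replacements.csv_languagetable" VALUES
    ('r1', 'lang1'),
    ('r1', 'lang2'),
    ('r2', 'lang3');
''')


def languages_in_subgroup(conn, subgroup):
    """Return the sorted names of sample languages linked to a replacement subgroup."""
    rows = conn.execute('''
        SELECT DISTINCT l.cldf_name
        FROM languagetable AS l
        JOIN "replacements.csv_languagetable" AS rl
            ON l.cldf_id = rl.languagetable_cldf_id
        JOIN "replacements.csv" AS r
            ON rl."replacements.csv_cldf_id" = r.cldf_id
        WHERE r.subgroup = ?
        ORDER BY l.cldf_name
    ''', (subgroup,))
    return [name for (name,) in rows]


print(languages_in_subgroup(conn, "South-East Admiralty"))
```

Binding the subgroup name as a parameter avoids quoting problems with names that contain apostrophes or parentheses, and the explicit `JOIN ... ON` form keeps the link through the association table visible.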