Add descriptions for dictionaries.json #350

Open
adelavega opened this issue May 11, 2019 · 2 comments

@adelavega
Member

https://github.com/tyarkoni/pliers/blob/a8d675b15e232b0d9c4dbe39307ecc32aeab8a34/pliers/datasets/dictionaries.json#L11

Some of the predefined dictionaries don't have descriptions, and the dictionaries themselves are buried in the source. Adding descriptions would make them much more useful.

For valence, I found the following, which explains their nomenclature:

Every value is reported three times, one for each dimension, prefixed with V for valence, A for arousal, and D for dominance. For each word, we report the overall mean (Mean.Sum), standard deviation (SD.Sum), and number of contributing ratings (Rat.Sum). We also report these values for group differences, replacing the suffix .Sum with the following (.M = male; .F = female; .O = older; .Y = younger; .H = high education; .L = low education). Words are presented in alphabetical order
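To make the nomenclature above concrete, here is a minimal Python sketch that turns a column name into a readable description. It assumes the columns are spelled as dot-separated triples such as `V.Mean.Sum`, which is inferred from the quoted text and not checked against the actual file. `describe_column` is a hypothetical helper, not part of pliers.

```python
# Lookup tables taken from the quoted nomenclature.
DIMENSIONS = {"V": "valence", "A": "arousal", "D": "dominance"}
STATISTICS = {
    "Mean": "mean rating",
    "SD": "standard deviation of ratings",
    "Rat": "number of contributing ratings",
}
GROUPS = {
    "Sum": "all raters",
    "M": "male raters",
    "F": "female raters",
    "O": "older raters",
    "Y": "younger raters",
    "H": "high-education raters",
    "L": "low-education raters",
}


def describe_column(name):
    """Describe a column such as 'V.Mean.Sum' (assumed dot-separated format)."""
    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError(f"Expected three dot-separated parts, got {name!r}")
    dimension, statistic, group = parts
    try:
        return f"{STATISTICS[statistic]} for {DIMENSIONS[dimension]} ({GROUPS[group]})"
    except KeyError as exc:
        raise ValueError(f"Unrecognized component {exc} in {name!r}") from None


if __name__ == "__main__":
    for column in ("V.Mean.Sum", "A.SD.F", "D.Rat.Y"):
        print(column, "->", describe_column(column))
```

Something along these lines could be used to generate the missing description text automatically instead of writing one per column by hand.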

@adelavega
Member Author

adelavega commented May 11, 2019

For aoa:

For each word, we report the number of times it occurs in the trimmed data (OccurTotal). For most words, the count is about 19. However, for the 10 calibration words and the 52 control words, this amounts to more than 1,900 presentations. Next, we provide the mean AoA rating (in years of age) and the standard deviation (Rating.Mean and Rating.SD). We also present the number of responders that gave numeric ratings to the word, rather than rated it as unknown (OccurNum). This information is useful, because it helps to avoid using unknown words in psychological experiments and indicates the degree of reliability of the mean AoA ratings. Finally, we add word frequency counts from the 50 million SUBTLEX-US corpus (Brysbaert & New, 2009). Words are presented in the decreasing order of frequency of occurrence. The 574 words that were not present in the SUBTLEX-US frequency list were assigned the frequenc

Actually, the description above does not really match the dataset. I'm guessing the variable of interest is AoA_Kup (age of acquisition, Kuperman).

@adelavega
Member Author

adelavega commented May 11, 2019

For concreteness:

  1. The word
  2. Whether it is a single word or a two-word expression
  3. The mean concreteness rating
  4. The standard deviation of the concreteness ratings
  5. The number of persons indicating they did not know the word
  6. The total number of persons who rated the word
  7. Percentage participants who knew the word
  8. The SUBTLEX-US frequency count
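As a rough illustration of how the eight fields listed above could become a single description string for the JSON file, here is a small Python sketch. The `"description"` key and the `build_description` helper are assumptions made for illustration; the actual schema of dictionaries.json may differ.

```python
import json

# Field descriptions in column order, copied from the list above.
CONCRETENESS_FIELDS = [
    "The word",
    "Whether it is a single word or a two-word expression",
    "The mean concreteness rating",
    "The standard deviation of the concreteness ratings",
    "The number of persons indicating they did not know the word",
    "The total number of persons who rated the word",
    "Percentage participants who knew the word",
    "The SUBTLEX-US frequency count",
]


def build_description(fields):
    """Join ordered field descriptions into one numbered sentence."""
    numbered = [f"({i}) {text}" for i, text in enumerate(fields, start=1)]
    return "Columns: " + "; ".join(numbered) + "."


if __name__ == "__main__":
    # "description" is a hypothetical key name, not confirmed against the file.
    entry = {"description": build_description(CONCRETENESS_FIELDS)}
    print(json.dumps(entry, indent=2))
```

Keeping the fields as an ordered list makes it easy to check them against the dataset's actual column order before committing the text.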
