Skip to content

Tools for analysis of classification and subject data from github.com/zooniverse/Panoptes

License

Notifications You must be signed in to change notification settings

vrooje/panoptes_analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

panoptes_analysis

Tools for analysis of classification and subject data from github.com/zooniverse/Panoptes

  • basic_project_stats.py - computes basic statistics from a raw classification export file. Run without any inputs to see usage details. Outputs to screen:

    • Total number of classifications
    • Total number of classifiers (registered and unregistered)
    • Stats on classifications per subject
    • Stats on classifications per user
    • Top 10 most prolific classifiers (note: for your edification only; I strongly recommend against publishing this)
    • Gini coefficient for classifications (more details on this in the code, as comments)

    This pays no attention to separate workflows or versions, so if you want those separated you will need to save a subset of the raw classification exports to a new csv file with the same format.

  • sessions_inproj_byuser.py - computes classification and session statistics for classifiers. Run at the the command line without additional inputs to see the usage. Output columns:

    • n_class: total number of classifications by the classifier
    • n_sessions: total number of sessions by the classifier
    • n_days: number of unique days on which the classifier has classified
    • first_day: date of first classification (YYYY-MM-DD)
    • last_day: date of last classification (YYYY-MM-DD)
    • tdiff_firstlast_hours: time elapsed between first and last classification (hours)
    • time_spent_classifying_total_minutes: total time spent actually classifying, i.e. work effort (minutes)
    • class_per_session_min: minimum number of classifications per session
    • class_per_session_max: maximum number of classifications per session
    • class_per_session_med: median number of classifications per session
    • class_per_session_mean: mean number of classifications per session
    • class_length_mean_overall: mean length of a single classification (minutes), over all sessions
    • class_length_median_overall: median length of a single classification (minutes), over all sessions
    • session_length_mean: mean length of a session (minutes)
    • session_length_median: median length of a session (minutes)
    • session_length_min: length of shortest session (minutes)
    • session_length_max: length of longest session (minutes)
    • which_session_longest: session number of classifier's longest session (by time spent)
    • mean_session_length_first2: mean session length in the classifier's first 2 sessions (minutes)
    • mean_session_length_last2: mean session length in the classifier's last 2 sessions (minutes)
    • mean_class_length_first2: mean classification length in the classifier's first 2 sessions (minutes)
    • mean_class_length_last2: mean classification length in the classifier's last 2 sessions (minutes)
    • class_count_session_list: classification counts in each session, formatted as: [n_class_1; n_class_2; ...]

The mean session and classification lengths in the first 2 and last 2 sessions are only calculated if the user has classified in at least 4 sessions; otherwise the values are 0.

About

Tools for analysis of classification and subject data from github.com/zooniverse/Panoptes

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •  

Languages