data-parser

Parsing the collected submissions on courses.mooc.fi

The output of the data-parser is a .csv file containing only answers to exercises of the DOGS FACTORIAL ANALYSIS SURVEY type. Only answers submitted after 22.05.2023 are included, because the parser relies on the latest submission format. The .csv file uses the semicolon ; as its separator.

Dataset layout

The file contains the columns user_id, name and email, followed by one column per questionLabel existing in the course. Unanswered questions are left as empty fields.

Additionally, the column course_module_name specifies which module the given user's answers belong to. There is also one exercise_name column per exercise (an exercise may contain several exercise tasks), holding a timestamp of when the user completed that exercise.
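As a rough illustration of this layout, the sketch below reads semicolon-separated text with Python's standard csv module and finds users who left a question unanswered. The sample data and the question labels dog_age and dog_name are invented for the example; real files will have the labels defined in your course.

```python
import csv
import io

# Hypothetical sample of parser output; labels and values are invented.
SAMPLE = (
    "user_id;name;email;dog_age;dog_name\n"
    "1;Ada;ada@example.com;4;Rex\n"
    "2;Bo;bo@example.com;;\n"
)


def read_rows(text):
    """Parse semicolon-separated text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


rows = read_rows(SAMPLE)

# Unanswered questions show up as empty strings.
unanswered = [row["user_id"] for row in rows if row["dog_age"] == ""]
print(unanswered)
```

For a real output file, the same DictReader call can be given a file object opened with open(path, newline="") instead of the StringIO wrapper.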

Multiple-choice questions

Multiple-choice questions are an exception to the above format. Each such question is represented in the dataset as one "questionLabel option" column per option that may be selected. The user's answer is then 1 for a chosen option and 0 for an option that was not chosen. If the user has not answered the given question at all, the fields are empty (null).
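A minimal sketch of how these three states could be decoded when reading the dataset. It assumes the cells contain literally "1", "0" or nothing; the function name decode_choice and the column names in the sample row are hypothetical.

```python
def decode_choice(cell):
    """Map a multiple-choice cell to True (chosen), False (not chosen) or None (unanswered)."""
    cell = cell.strip()
    if cell == "":
        return None
    if cell == "1":
        return True
    if cell == "0":
        return False
    raise ValueError(f"unexpected multiple-choice value: {cell!r}")


# Hypothetical row: the question "breed" was answered, the question "size" was not.
row = {"breed husky": "1", "breed poodle": "0", "size small": "", "size large": ""}
decoded = {column: decode_choice(value) for column, value in row.items()}
print(decoded)
```

Keeping None separate from False preserves the difference between "option not chosen" and "question not answered".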

For submissions collected across different language versions, it is advised to label the multiple-choice options in the same manner as the questions. The datasets from the different language courses then share the same column headers, which makes them easier to combine. The format is label ; option text where the text on the left-hand side of the semicolon ; is used as the column header in the resulting dataset, while the text on the right is what is shown to the survey user. Only the first semicolon is used as a separator, so the option text may contain any number of semicolons if needed. If no semicolon is found, the full option text is used as the column header.
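The splitting rule can be sketched as follows. This is not the parser's own code: the function name split_option is hypothetical, and stripping the whitespace around both parts is an assumption.

```python
def split_option(option_text):
    """Return (column_header, shown_text) for a multiple-choice option.

    Only the first semicolon separates the two parts. Without a semicolon,
    the full text serves as both header and shown text.
    Stripping surrounding whitespace is an assumption of this sketch.
    """
    label, separator, shown = option_text.partition(";")
    if not separator:
        whole = option_text.strip()
        return whole, whole
    return label.strip(), shown.strip()


with_label = split_option("yes ; Kyllä; ehdottomasti")
without_label = split_option("Ei")
print(with_label)
print(without_label)
```

Using str.partition rather than str.split keeps everything after the first semicolon intact, including any further semicolons.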

Using the parser

To parse the collected submissions, you need to download the files from the main course management page on courses.mooc.fi. The links to download the files are shown at the bottom of the picture.

The csv file for course instances is not used in the process and may be skipped. The needed files are:

  • submissions
  • user details
  • exercise-tasks
  • research consent form answers

To download the data-parser, go to the GitHub release page https://github.com/rage/factor-analysis-exercise-service/releases/tag/release and choose the executable for your operating system.

The parser expects a folder named data, containing the downloaded .csv files, to be located in the same folder as the parser itself. This is the directory structure:

where the green main is the executable program in question (it will probably be called main-[name of your os]-latest). If the data folder contains several versions of the .csv files, as in the above example, the parser will use the latest ones.

Open a terminal and navigate to the directory containing the executable and the data folder. From that folder, run the parser with
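Before running the parser, it can help to confirm that the data folder is where it is expected. The sketch below is a hypothetical pre-flight check, not part of the parser: it only verifies that a folder named data exists in a given directory and lists the .csv files in it. The function name list_data_csvs is invented for the example.

```python
from pathlib import Path


def list_data_csvs(base_dir):
    """Return the sorted names of the .csv files in base_dir/data.

    Raises FileNotFoundError if base_dir has no folder named 'data'.
    """
    data_dir = Path(base_dir) / "data"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"expected a folder named 'data' in {base_dir}")
    return sorted(path.name for path in data_dir.glob("*.csv"))
```

Calling list_data_csvs(".") from the folder that holds the executable should list the downloaded files for submissions, user details, exercise-tasks and research consent form answers.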

./name-of-executable

The parser will create a parsed-outputs folder with the resulting .csv file:

You may have to give the executable file execution rights with:

chmod +x name-of-executable

Executing on a Cubbli machine using VMware Horizon Client from your browser

Go to https://vdi.helsinki.fi/. Choose VMware Horizon HTML Access:

[Image: vdi.helsinki]

Sign in with your University of Helsinki credentials.

Choose the Cubbli Linux desktop:

[Image: Cubbli Linux desktop]

Download the files and the executable as explained above.

Open a browser inside the VMware client. Remember that you are accessing your Helsinki Cubbli desktop through your own browser, so the keyboard layout may differ from the one you are used to. To change it, search for Keyboard in the menu and set the Layout to the one you want. (For the Finnish layout you may also just run the command setxkbmap fi in the Konsole.)

Choose the main-ubuntu-latest executable from the GitHub release page:

[Image: ubuntu executable]

Open up a Konsole (search for Konsole in the menu). Create a new folder where you are going to work with your files. Move the executable file to the folder. Additionally, create a subfolder named data and move all the downloaded .csv files there. In the Konsole, navigate to the folder with the executable file and the data folder using the cd (change directory) command:

[Image: navigate to the given directory]

The folder in question is named moocdata here. You can see the name of the directory you are in as the last name before the $ sign.

Execute the binary file by running the command

./main-ubuntu-latest

[Image: command flow]

You may need to add execution rights to the executable program:

chmod +x main-ubuntu-latest