Skip to content

Copora for evaluating NLU Services/Platforms such as Dialogflow, LUIS, Watson, Rasa etc.

License

Notifications You must be signed in to change notification settings

Ai-Marshal/NLU-Evaluation-Data

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

README

This project contains natural languageg data for human-robot interaction in home domain which we collected and annotated for evaluating NLU Services/platforms.

If you use the data and publish the results please let us know and cite our IWSDS 2019 paper:

(It is also available at arXiv.)

@InProceedings{XLiu.etal:IWSDS2019,
  author    = {Xingkun Liu, Arash Eshghi, Pawel Swietojanski and Verena Rieser},
  title     = {Benchmarking Natural Language Understanding Services for building Conversational Agents},
  booktitle = {Proceedings of the Tenth International Workshop on Spoken Dialogue Systems Technology (IWSDS)},
  month     = {April},
  year      = {2019},
  address   = {Ortigia, Siracusa (SR), Italy},
  publisher = {Springer},
  pages     = {xxx--xxx},
  url       = {http://www.xx.xx/xx/}
}

License

All data are released under the CC BY-SA 3.0 license.

Content

It contains

  1. Collected-Original-Data (25K): collected original data with normalization for numbers/date etc which contain the pre-designed human-robot interaction questions and the user answers. They are organized in CSV format.

  2. AnnotatedData (25716 Lines): annotated for Intents and Entities, organized in csv format.

    The annotated csv file has following columns: userid, answerid, scenario, intent, status, answer_annotation, notes, suggested_entities, answer_normalised, answer, question

    Most of them come from the original data collection, we keep them here for monitoring of the afterwards processing.

    "answer" contains the original user answers.
    "answer_normalised" were normalised from "answer".
    "notes" was used for the annotators. They put some notes there if they have changed anything.
    "status" was used for annotation and post processing. The utterance will be ignored by the post processing scripts if the column content starts with 'IRR_'.
    "answer_annotation" contains the annotated results, it will be used for generating the train/test datasets, along with "scenario", "intent" and "status".

  3. The 10-fold cross-validation we used (here for reference only)

NB: The CSV file uses Semicolon(;) as the field/column delimiter! It may mess up with the data if using Colon(,).

CrossValidation contains the generated data for different NLU services we used for our evaluations which are uploaded here for reference only as they can be generated from the annotated csv data using our scripts. NB: the script will shuffle the data each time when runing the script, so the generated data may not be exact the same each time.

autoGeneFromRealAnno/: generated trainset and testset from the annotated csv file. The other four subdirectories (out4ApiaiReal, out4LuisReal, out4RasaReal and out4WatsonReal) in CrossValidation/ are the converted NLU service input data for Dialogflow, LUIS, Rasa and Watson respectively. The inside 'merged' directory contains the trainning input data.

Scripts for the Data Preparation and Evaluation

Java for preparing the data, evaluating the performances and Python scripts for querying the Services/Platforms are Here

Contact

Please contact [email protected], if you have any questions

About

Copora for evaluating NLU Services/Platforms such as Dialogflow, LUIS, Watson, Rasa etc.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published