Some data and code regarding direct speech in French novels (ELTeC-fra)

dh-trier/directspeech2022

Direct speech annotation for ELTeC-fra

Sampling

For each of the 100 novels in ELTeC-fra, 10 sentences have been randomly sampled, for a total of 1000 sentences or around 18600 words.
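The sampling step can be sketched as follows. This is a minimal illustration, not the repository's actual script: it assumes each novel is already split into sentences, and the names `sample_sentences`, `novels`, `per_novel` and `seed` are hypothetical. The toy corpus is generated on the fly and stands in for the real ELTeC-fra texts.

```python
import random


def sample_sentences(novels, per_novel=10, seed=42):
    """Draw `per_novel` random sentences from each novel, without replacement.

    `novels` maps a novel identifier to its list of sentences
    (assumed to be pre-segmented; segmentation is not shown here).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for novel_id in sorted(novels):
        sentences = novels[novel_id]
        # Guard against novels shorter than the requested sample size.
        k = min(per_novel, len(sentences))
        for sentence in rng.sample(sentences, k):
            sample.append((novel_id, sentence))
    return sample


# Toy stand-in corpus: 100 "novels" of 50 placeholder sentences each.
toy_corpus = {
    f"novel{i:03d}": [f"Sentence {j} of novel {i}." for j in range(50)]
    for i in range(100)
}
sample = sample_sentences(toy_corpus)
print(len(sample))
```

Sampling a fixed number of sentences per novel, rather than sampling from the pooled corpus, keeps long novels from dominating the sample.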

Annotation

Each sentence has been annotated with one of four speech categories:

  • n = narrator speech, including indirect character speech
  • c = direct character speech
  • x = mixed speech, including character and narrator speech
  • u = undecidable / other, e.g. free indirect speech or thought as well as letters.

The annotation was done by a single annotator, without written annotation guidelines and without inter-annotator agreement checks.

Visualization

  • Overall percentages of the four categories in the sample.
  • Percentages for the four categories by decade.
  • Sentence length by speech category.
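The numbers behind these three views could be computed along the following lines. This is a sketch under assumptions: the record layout `(year, label, sentence)`, the function names, and the whitespace-based word count are all hypothetical, and the four example records are invented for illustration, not taken from the annotated sample. Plotting is left out.

```python
from collections import Counter, defaultdict

LABELS = ("n", "c", "x", "u")  # the four speech categories


def category_percentages(labels):
    """Percentage of each category among the given labels."""
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: 100.0 * counts[label] / total for label in LABELS}


def percentages_by_decade(records):
    """Category percentages grouped by decade of publication."""
    by_decade = defaultdict(list)
    for year, label, _sentence in records:
        by_decade[year // 10 * 10].append(label)
    return {
        decade: category_percentages(labels)
        for decade, labels in sorted(by_decade.items())
    }


def mean_length_by_category(records):
    """Mean sentence length per category, in whitespace-separated tokens."""
    lengths = defaultdict(list)
    for _year, label, sentence in records:
        lengths[label].append(len(sentence.split()))
    return {label: sum(vals) / len(vals) for label, vals in lengths.items()}


# Invented example records: (publication year, category, sentence).
records = [
    (1842, "n", "Il ouvrit la porte."),
    (1847, "c", "Entrez donc !"),
    (1851, "n", "La nuit tombait sur la ville."),
    (1856, "x", "Non, dit-elle, je ne viendrai pas."),
]

overall = category_percentages(label for _, label, _ in records)
by_decade = percentages_by_decade(records)
mean_lengths = mean_length_by_category(records)
print(overall)
print(by_decade)
print(mean_lengths)
```

With only 10 sentences per novel, per-decade percentages rest on small counts, so they are best read as rough indications.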
