The ComiGS corpus v.0.8

The ComiGS corpus is licensed under a Creative Commons Attribution 4.0 International License, with the following exceptions: the script compute_kappa.py is dual-licensed under the MIT and Apache 2.0 licenses, and the images in ./img are subject to local copyright laws. In Germany, the images have been “gemeinfrei” (roughly equivalent to public domain) since January 1, 2015.

Contents

The syntactic, PoS and lemma annotations (for the target hypotheses) were performed using AnnoViewer, the annotation frontend of jwcdg. In contrast to the syntactic and PoS annotations, the lemma annotation is incomplete (approximately 60 lemmas are missing due to missing vocabulary entries in the annotation interface) and was not checked for correctness after the initial annotation. The cda and conll files contain morphological information that was derived automatically as a byproduct of using the annotation interface and was not manually checked.
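
If you want to inspect this unchecked morphology yourself, a minimal sketch along the following lines may help. It assumes the conll files follow the common CoNLL-X layout, with one token per line, blank lines between sentences, the word form in the second and the morphological features in the sixth tab-separated column; adjust the indices if the files differ.

def morph_features(path):
    # Yield (form, features) pairs from a CoNLL-X-style file; the column
    # indices below are assumptions, not verified against the corpus.
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # blank line: sentence boundary
            cols = line.split("\t")
            yield cols[1], cols[5]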

About the picture stories

The picture stories are by Erich Ohser, a German cartoonist. In 1944, Ohser and his friend Erich Knauf were arrested for making anti-Nazi jokes. Ohser committed suicide in prison the day before his trial; Erich Knauf was beheaded after a trial at the Volksgerichtshof.

About calculating Cohen’s κ

We wanted to compute Cohen’s κ not only for the agreement on which tokens to change but also to include the corrections themselves in the computation. As it turns out, this is not a valid computation of Cohen’s κ: the vocabulary is an open set and the annotation task is open-ended, so the categories are never exhaustive. Exhaustiveness, however, is one of the requirements:

The categories of the nominal scale are independent, mutually exclusive, and exhaustive. – Cohen (1960)

Therefore, kappa as calculated by compute_kappa.py is not Cohen’s κ. The problem with calculating Cohen’s κ is that the chance agreement pe has to be estimated, but for an open-ended task there is no straightforward way to approximate pe. We estimated pe as the chance agreement that the annotators performed a change on the same token, i.e. we used the same pe for both kappa and kappa_positions. For kappa, this choice is conservative: it tends to underestimate agreement. Because we assume that whenever both annotators change a token they make the same correction (i.e. 100% agreement on the correction if a token is corrected), pe cannot underestimate chance agreement, and kappa consequently does not overestimate agreement.
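
To make the two computations concrete, here is a minimal sketch (not the actual compute_kappa.py): kappa_positions is a standard Cohen’s κ over the binary per-token decision “changed vs. unchanged”, and kappa reuses the same pe while its observed agreement additionally requires identical corrections. The input representation (one dict per annotator mapping token index to correction string) is an assumption for illustration.

def kappas(ann_a, ann_b, n_tokens):
    # Binary change/no-change decision per token.
    a = [i in ann_a for i in range(n_tokens)]
    b = [i in ann_b for i in range(n_tokens)]

    # Chance agreement pe from the annotators' marginal change rates,
    # shared by kappa_positions and kappa (see above).
    pa, pb = sum(a) / n_tokens, sum(b) / n_tokens
    p_e = pa * pb + (1 - pa) * (1 - pb)

    # kappa_positions: observed agreement on *which* tokens were changed.
    p_o_pos = sum(x == y for x, y in zip(a, b)) / n_tokens
    kappa_positions = (p_o_pos - p_e) / (1 - p_e)

    # kappa: agreement additionally requires the *same* correction
    # (unchanged tokens agree trivially: both .get() calls return None).
    p_o = sum(ann_a.get(i) == ann_b.get(i) for i in range(n_tokens)) / n_tokens
    kappa = (p_o - p_e) / (1 - p_e)

    return kappa_positions, kappa

For example, kappas({3: "dem"}, {3: "dem", 7: "ein"}, 20) has full agreement on token 3, disagreement on token 7 and trivial agreement everywhere else, with both κ values discounted by the same pe.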

All in all, kappa_positions is a Cohen’s κ and kappa is not. Perhaps agreement computations simply should not be boiled down to a single number; compute_kappa.py computes the relevant raw numbers that can be used to illustrate the level of agreement between annotators.

Citing

If you make use of the corpus, please cite Christine Köhn and Arne Köhn: An Annotated Corpus of Picture Stories Retold by Language Learners, which also describes the corpus in detail.

@InProceedings{Koehn2018-comigs,
  author =  "K{\"o}hn, Christine and K{\"o}hn, Arne",
  title =   "An Annotated Corpus of Picture Stories Retold by Language Learners",
  booktitle =   "Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)",
  year =    "2018",
  publisher =   "Association for Computational Linguistics",
  pages =   "121--132",
  address =     "Santa Fe, New Mexico, USA",
  url =     "http://aclweb.org/anthology/W18-4914"
}