INitiative for the Evaluation of XML Retrieval

XML Entity Ranking (XER) 2007


See also INEX 2007

Testing Data

Download the testing data, consisting of 46 topics and their assessments in trec_eval format. The testing data consists of two parts:

  • Topics 60-100 generated as genuine XER topics;
  • Topics 30-59 derived from adhoc 2007 assessments.

Genuine XER Topics

Topics 60-100 are genuine XER topics, in that the participants created these topics specifically for the track, and (almost all) topics have been assessed by the original topic authors.

From the originally proposed topics, we have dropped topic 93 (because it was too similar to topic 94) and topic 68 (because its underlying information need was identical to that of topic 61). Topics 71, 77, 80, 82, 84, 86, 89, 101 and 102 were dropped because their answer sets contained more relevant entities than, or about as many as, the pool depth of 50. Topics 69 and 92 were dropped because their assessments were never finished.

The final set consists of 25 genuine XER topics with assessments, contained in directory xer07.
The set could be expanded to 35 topics should we decide to perform more assessments.

Topics derived from adhoc 2007

Topics 30-59 have been derived from INEX adhoc 2007 topics, in much the same way as the training data were produced. For these topics, the description and narrative fields may not be perfect, but they should be highly similar to those of the training topics.

These topics have been assessed by track organizers (i.e., not by the original topic authors), with pools constructed from the articles that contained relevant information in the INEX adhoc 2007 assessments.

Of the originally proposed set of adhoc derived topics, the topics with numbers 34, 37, 38, 39, 41, 51, and 55 have been dropped because the adhoc pools on which to base the XER assessments did not exist. Topics 42 and 57 have been dropped because their answer sets contained more than 50 relevant entities (and therefore we do not trust the original pools to be sufficiently complete).

The final set consists of 21 adhoc derived entity ranking test topics with assessments, contained in directory adhoc07.
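The count of 21 can be checked against the topic numbers listed above. The following is only a small sketch of that arithmetic; the variable names are illustrative and not part of the distributed data.

```python
# Topic numbers exactly as listed in the text above.
proposed = set(range(30, 60))                   # topics 30-59, i.e. 30 topics
no_adhoc_pool = {34, 37, 38, 39, 41, 51, 55}    # no adhoc pool to base assessments on
too_many_answers = {42, 57}                     # more than 50 relevant entities

final_topics = sorted(proposed - no_adhoc_pool - too_many_answers)
print(len(final_topics))
```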

Training Data

A small training set of topics (based on a selection of INEX 2006 adhoc topics adapted to the entity ranking task) has been kindly made available by INRIA (who developed this data) for participants to develop and train their systems. The relevance assessments have been derived from the articles judged relevant in 2006, limiting the set to the corresponding "entities".

Note that the original title, description and narrative fields have not been updated to reflect the new entity ranking interpretation of the training topics. Consequently, the testing topics may have shorter title fields than those in the training data.

Download the 28 training topics, derived from the INEX 2006 topics and assessments.

Assessments Data Format (Qrels)

Assessments for the entity ranking topics are provided in qrels format, which can be used with trec_eval to evaluate trial runs (the trec_eval program can be downloaded from NIST).
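For readers who want to inspect the assessments outside trec_eval, a qrels file can be read with a few lines of Python. This is a minimal sketch that assumes the files follow the usual four-column TREC qrels layout (topic, iteration, document identifier, relevance); the function name parse_qrels and the sample lines are made up for illustration and do not come from the distributed files.

```python
from collections import defaultdict

def parse_qrels(lines):
    """Return {topic_id: {doc_id: relevance}} from four-column qrels lines."""
    qrels = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, _iteration, doc_id, rel = parts
        qrels[topic][doc_id] = int(rel)
    return dict(qrels)

# Made-up sample lines, only to show the assumed layout.
sample = ["60 0 WP12345 1", "60 0 WP67890 0", "61 0 WP11111 1"]
qrels = parse_qrels(sample)
relevant_for_60 = [d for d, r in qrels["60"].items() if r > 0]
print(relevant_for_60)
```

With a real file, the same function could be called as parse_qrels(open(path)), where path points at one of the downloaded qrels files.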

The identifier for article ###.xml in the collection is given as WP###.
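The mapping from article file name to identifier could be written as follows. This is a sketch of the rule stated above; the helper name entity_id and the example file names are hypothetical.

```python
import os

def entity_id(article_filename):
    """Map an article file name such as '12345.xml' to the identifier 'WP12345'."""
    stem, ext = os.path.splitext(os.path.basename(article_filename))
    if ext != ".xml":
        raise ValueError(f"expected an .xml article file, got {article_filename!r}")
    return "WP" + stem

print(entity_id("12345.xml"))
```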

Use qrels-entity-ranking to evaluate your results on the entity ranking tasks, and qrels-list-completion to evaluate results for the list completion task. The difference between the two files is whether the entity examples are included as relevant answers or left out. Note that your system should not include the given example entities in the answer set when evaluating the list completion task!
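One way to make sure the example entities stay out of a list completion run is to filter them just before writing the run file. The sketch below assumes the standard six-column trec_eval run format (topic, the literal Q0, document identifier, rank, score, run tag); the function name, the run tag and the identifiers are illustrative only.

```python
def list_completion_run(topic_id, ranked_ids, example_ids, run_tag="myrun"):
    """Drop the given example entities, then format trec_eval run lines."""
    examples = set(example_ids)
    kept = [doc_id for doc_id in ranked_ids if doc_id not in examples]
    lines = []
    for rank, doc_id in enumerate(kept, start=1):
        # trec_eval orders results by score, so scores must decrease with rank.
        score = len(kept) - rank + 1
        lines.append(f"{topic_id} Q0 {doc_id} {rank} {score} {run_tag}")
    return lines

# Made-up ranking in which WP2 is one of the topic's example entities.
run_lines = list_completion_run("60", ["WP1", "WP2", "WP3"], ["WP2"])
print("\n".join(run_lines))
```

The resulting lines would then be evaluated against qrels-list-completion; for the entity ranking task the filtering step is skipped and qrels-entity-ranking is used instead.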

Prefixes inex07-xer-training- and inex07-xer-testing- distinguish the training from the testing qrels; for the latter, suffixes -adhoc or -topic identify qrels for adhoc derived and for genuine XER topics. File inex07-xer-testing-qrels contains the qrels for all topics.

Guidelines and results

The results are summarized in the Overview of the INEX 2007 Entity Ranking Track in the INEX 2007 Proceedings.

Guidelines are archived in the original INEX 2007 Entity Ranking guidelines document.

Judging INEX Entity Ranking

See the judging pages.