In this experiment we test several disambiguation strategies for
particular types of resources. First, we test the strategies restricted
to only one type of resource. Secondly, we experiment with the
combination of different strategies when used in an unrestricted
setting.


STRATEGIES 
There are two global variables: which information to present in
order to improve selection and disambiguation of the controlled terms,
and how to present this information.

We will experiment with the following information about terms:
- rdf:type Class (and super classes)
- alternative label
- description (incl. wn:gloss)
- broader term (used transitively)
- narrower term (used transitively)
- related term
- linguistic relations

Visualization
- clustering
    - by property (group values by similar property)
    - by label (group additional information by same label)
- sublabel (more than one sublabel can be used for a term)
- mouse over (indirect method of visualization)

In this experiment we focus on the "important" dimensions Person,
Location, Time and Object. For these dimensions we find several
strategies in existing web applications, such as RKD artists (Person),
GeoNames (Location) and WordNet search (Object/Concept). As I understand
it, experts may currently use these (or similar) tools to find the right
term. We can apply the same strategies to autocompletion results. This
should be done in a restricted setting, in which values from only a
single dimension are allowed.

Persons:
RKD artists shows the birth and death dates and the nationality.

Location:
Location search is very common on the web, for example in Google Maps and
GeoNames. The place hierarchy is very well suited for disambiguation. In
GeoNames we find 1702 records for "paris". Each record shows the country
and district. However, this is not always sufficient, because within a
country the same name can be used for a city and a region. So the place
type is required in this case as well.
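
A minimal sketch of how such a disambiguating sublabel could be built for
location suggestions, assuming each candidate carries a name, a place
type, a country and optionally a district. The Place record, its field
names and the sample records are hypothetical and are not taken from the
GeoNames data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Place:
    """Hypothetical suggestion record; the field names are assumptions."""
    name: str
    place_type: str               # e.g. "city" or "region"
    country: str
    district: Optional[str] = None


def sublabel(place: Place) -> str:
    """Build the text shown next to the name: place type, then hierarchy."""
    hierarchy = ", ".join(p for p in (place.district, place.country) if p)
    return f"{place.place_type}, {hierarchy}"


# Illustrative records only: the same name used for a city and a region
# within one country, plus a town with that name in another country.
candidates = [
    Place("Utrecht", "city", "Netherlands", "Utrecht"),
    Place("Utrecht", "region", "Netherlands"),
    Place("Utrecht", "town", "South Africa", "KwaZulu-Natal"),
]

labels = [f"{p.name} ({sublabel(p)})" for p in candidates]
for label in labels:
    print(label)
```

Country and district alone would leave the first two candidates hard to
tell apart; adding the place type makes every label distinct.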

Time:
I think we should not consider time here.

Object:
In WordNet, "pipe" occurs five times as a noun and four times as a verb.
The online search facility of Princeton WordNet first groups these
results by word type (noun and verb), and then displays the other sense
labels and the WordNet gloss.

I am not sure how we should evaluate these strategies. Should we compare
them against other strategies? How do we choose these?

We also have to deal with a problem that arises from using prefix
autocompletion, which drastically increases the number of hits. Instead
of only the five exact matches on "car" we also find many terms with
longer labels. What do we want to do with this? Should we test different
strategies here as well, or do we fix this as a variable? I think a good
solution would be to find the relations among the search results and
use these relations for grouping. For example, a search on "computer" in
WordNet finds many compound terms, such as computer science, computer
industry and computer operation. All these terms are derived from the
same concept, computer.
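
One possible way to sketch this grouping, assuming that a compound label
counts as derived from the query when its first word equals the query.
This first-word heuristic is only a stand-in for real relations between
terms, which would have to come from the vocabulary itself, and the
function name group_prefix_matches is hypothetical.

```python
from typing import Dict, List


def group_prefix_matches(query: str, labels: List[str]) -> Dict[str, List[str]]:
    """Split prefix matches into exact, derived (compound) and other labels.

    Assumption: a label is "derived" when its first whitespace-separated
    word equals the query. Labels that do not start with the query are
    ignored, since they are not prefix matches.
    """
    q = query.lower()
    groups: Dict[str, List[str]] = {"exact": [], "derived": [], "other": []}
    for label in labels:
        low = label.lower()
        if not low.startswith(q):
            continue
        words = low.split()
        if low == q:
            groups["exact"].append(label)
        elif words and words[0] == q:
            groups["derived"].append(label)
        else:
            groups["other"].append(label)
    return groups


# Labels from the example in the text, plus one longer single word
# ("computerize") added as an illustrative non-compound prefix match.
hits = ["computer", "computer science", "computer industry",
        "computer operation", "computerize"]
groups = group_prefix_matches("computer", hits)
for name, members in groups.items():
    print(name, members)
```

With such a split, the derived terms could be folded under the exact
match, so that only the exact matches need to be disambiguated by sense.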

What I want to achieve is that the disambiguation is only focused on the
different senses of a word, and that all derived terms are captured
within these senses. I am not sure how to do this, nor how to test it.
If we do not manage to solve this we should restrict ourselves to exact
matches.


VISUALIZATION
I want to test when it is useful to group by property and
when to group by word. The former is useful to distinguish between
person and location. However, in the first setup we are in a restricted
setting in which the type is fixed and the location-person distinction
is not required. Any other property can be used for grouping instead;
for example, locations can be grouped by place type or country/continent
and persons by role or nationality.

Grouping by word could be used for a WordNet match. For example, "car"
has five matches. We can show "car" once and list the various senses as
items of this group.
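
Both kinds of grouping can be sketched with a single helper that takes
the grouping key as a parameter. The dictionary-based suggestion records,
the key names (label, place_type, country, sense) and the sample values
are hypothetical; the sense identifiers are placeholders, not WordNet
data.

```python
from collections import defaultdict
from typing import Dict, List


def group_by(items: List[dict], key: str) -> Dict[str, List[dict]]:
    """Group suggestion records by the value of one key, keeping input order."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for item in items:
        groups[item.get(key, "unknown")].append(item)
    return dict(groups)


# Grouping by property: locations clustered by place type (illustrative).
places = [
    {"label": "Paris", "place_type": "city", "country": "France"},
    {"label": "Paris", "place_type": "city", "country": "United States"},
    {"label": "Paris", "place_type": "department", "country": "France"},
]
by_type = group_by(places, "place_type")

# Grouping by word: the label is shown once, its senses become the items.
senses = [{"label": "car", "sense": f"sense {i}"} for i in range(1, 6)]
by_label = group_by(senses, "label")

for group, members in by_type.items():
    print(group, [m["country"] for m in members])
for group, members in by_label.items():
    print(group, [m["sense"] for m in members])
```

Passing the key as a parameter would make it easy to swap the grouping
property (place type, country, role, nationality) between test
conditions without changing the rest of the interface.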


In an unrestricted setting values from all dimensions are suggestion
candidates, so different strategies have to be combined. Grouping
by type seems the most obvious solution. This seems suited to Person
and Location, but for AAT concepts and WordNet terms it is less
straightforward. We can use the facets as class hierarchies; however, the
root nodes in these hierarchies are not well suited for novice users. We
could set up an experiment in which we use different levels in the
hierarchy for grouping. I think this is too tricky to do.