Open Humanities Awards – Maphub final update

*This is the final post in a series of blog posts from Dr Bernhard Haslhofer, one of the recipients of the [DM2E Open Humanities Award](http://openhumanitiesawards.org). The final report is available here.*

Semantic Tagging in Maphub – Final Results and Lessons Learned

Maphub (http://maphub.github.io) is an open source Web application that allows people to annotate digitized historical maps. It pulls maps out of closed environments, adds zooming functionality, and assigns Web URIs so that people can talk about them on the Web. It has been built as a demonstrator for the W3C Open Annotation specification (http://www.w3.org/community/openannotation/), a community effort that is working towards a common, RDF-based specification for annotating digital resources. Here is a screenshot of the prototype application:

A first prototype (http://maphub.herokuapp.com) has been bootstrapped with a set of around 6,000 digitized high-resolution historical maps from the Library of Congress’ Map Division. It allows users to retrieve maps either by browsing or searching over available metadata and user-contributed annotations and tags.

Technical Details

Semantic tagging is part of Maphub’s annotation feature: to create an annotation, users mark up regions on the map with geometric shapes such as polygons or rectangles. Once the area to be annotated is defined, they are asked to tell their stories and contribute their knowledge in the form of textual comments. While users are composing their comments, Maphub periodically suggests tags based on either the text contents or the geographic location of the annotated map region. Suggested tags appear below the annotation text. The user may accept tags they deem relevant to their annotation or reject tags that are not relevant. Unselected tags remain neutral.

The screenshot in the next figure shows an example user annotation created for a region covering the Strait of Gibraltar. While the user entered a free-text comment related to the naming of the area, Maphub queried an instance of Wikipedia Miner (http://wikipedia-miner.cms.waikato.ac.nz/) to perform named entity recognition on the entered text and received a ranked list of Wikipedia resource URIs (e.g., http://en.wikipedia.org/wiki/Mediterranean_sea) in return. URIs should not be exposed to the user, so Maphub displays the corresponding Wikipedia page titles instead (e.g., Mediterranean Sea). Since page titles alone might not carry enough information for the user to disambiguate concepts, Maphub offers additional context information: the short abstract of the corresponding Wikipedia article is shown when the user hovers over a tag.
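The display step described above can be sketched roughly as follows: a ranked list of suggestions is reduced to the labels and hover texts the user sees, while the URIs stay internal. The `Suggestion` fields, the `weight` score, and the sample values are assumptions made for illustration; they are not Wikipedia Miner's actual response format or Maphub's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """Hypothetical shape of one suggested tag (not the real service response)."""
    uri: str       # Wikipedia resource URI, kept internally
    title: str     # page title shown to the user
    abstract: str  # short abstract shown on hover
    weight: float  # assumed relevance score, higher is better


def to_display(suggestions, limit=10):
    """Return (label, hover_text) pairs, best-ranked first, without exposing URIs."""
    ranked = sorted(suggestions, key=lambda s: s.weight, reverse=True)
    return [(s.title, s.abstract) for s in ranked[:limit]]


# Made-up sample data for illustration only.
sample = [
    Suggestion("http://en.wikipedia.org/wiki/Strait_of_Gibraltar",
               "Strait of Gibraltar", "Strait between Europe and Africa.", 0.7),
    Suggestion("http://en.wikipedia.org/wiki/Mediterranean_sea",
               "Mediterranean Sea", "Sea connected to the Atlantic Ocean.", 0.9),
]

for label, hover in to_display(sample):
    print(f"{label}: {hover}")
```

Keeping the URI on the record, even though it is never shown, means the accepted tag can later be stored as a reference to an unambiguous Web resource rather than as a plain label.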

Once tags are displayed, users may mark them as relevant for their annotation by clicking on them once, which turns the labels green. Clicking once more rejects the tags, and clicking again sets them back to their (initial) neutral state. In the previous screenshot, the user accepts five tags and actively prunes two tags that are not relevant in the context of this annotation.
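The click behaviour described above amounts to a three-state cycle. A minimal sketch, assuming state names of our own choosing (`TagState`, `click`, and `decide` are hypothetical and not taken from the Maphub code base):

```python
from enum import Enum


class TagState(Enum):
    NEUTRAL = "neutral"    # initial state, no user decision
    ACCEPTED = "accepted"  # marked as relevant (label turns green)
    REJECTED = "rejected"  # actively pruned by the user


# The state a tag moves to when it is clicked.
_NEXT_STATE = {
    TagState.NEUTRAL: TagState.ACCEPTED,
    TagState.ACCEPTED: TagState.REJECTED,
    TagState.REJECTED: TagState.NEUTRAL,
}


def click(state):
    """Return the state a tag enters after one click."""
    return _NEXT_STATE[state]


def decide(clicks):
    """Return the state of a tag after the given number of clicks."""
    state = TagState.NEUTRAL
    for _ in range(clicks):
        state = click(state)
    return state
```

With this model, one click accepts a tag, two clicks reject it, and three clicks return it to the neutral state.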

Sharing Annotations and Semantic Tags

Sharing collected annotation data in an interoperable way was another major development goal. Maphub is an early adopter of the Open Annotation specification and demonstrates how to apply that model in the context of digitized historical maps and how to expose comments as well as semantic tags. As described in the Maphub API documentation (http://maphub.github.io/api), each annotation becomes a first-class Web resource that is dereferenceable by its URI and therefore easily accessible by any Web client. In that way, while users are annotating maps, Maphub not only consumes data from global data networks; it also contributes data back. The following screenshot shows how the previous annotation could be represented following the Open Annotation specification.
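In text form, such a representation might look roughly like the JSON-LD-style structure built below. This is a sketch under assumptions, not Maphub's actual output: the class and property names (`oa:Annotation`, `oa:hasBody`, `oa:hasTarget`, `oa:SemanticTag`, `oa:SpecificResource`, `oa:FragmentSelector`, `cnt:ContentAsText`) come from the Open Annotation and Content in RDF vocabularies, while the `build_annotation` helper, the `example.org` URIs, the comment text, and the region fragment are placeholders.

```python
import json

# Namespace prefixes. The oa, cnt and rdf namespaces are the published ones;
# every example.org URI used further down is a placeholder.
CONTEXT = {
    "oa": "http://www.w3.org/ns/oa#",
    "cnt": "http://www.w3.org/2011/content#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def build_annotation(annotation_uri, map_uri, region, comment, tag_uris):
    """Build a dict describing one annotation with a comment and semantic tags."""
    # The free-text comment is an embedded textual body.
    bodies = [{"@type": "cnt:ContentAsText", "cnt:chars": comment}]
    # Each accepted tag is a body that points at a Wikipedia resource URI.
    bodies += [{"@id": uri, "@type": "oa:SemanticTag"} for uri in tag_uris]
    return {
        "@context": CONTEXT,
        "@id": annotation_uri,
        "@type": "oa:Annotation",
        "oa:hasBody": bodies,
        # The target is a region of the map, not the whole map image.
        "oa:hasTarget": {
            "@type": "oa:SpecificResource",
            "oa:hasSource": {"@id": map_uri},
            "oa:hasSelector": {"@type": "oa:FragmentSelector", "rdf:value": region},
        },
    }


annotation = build_annotation(
    annotation_uri="http://example.org/annotations/1",  # placeholder
    map_uri="http://example.org/maps/1",                # placeholder
    region="xywh=100,200,300,150",                      # placeholder region
    comment="A comment about the Strait of Gibraltar.",
    tag_uris=["http://en.wikipedia.org/wiki/Mediterranean_sea"],
)
print(json.dumps(annotation, indent=2))
```

Because the tag bodies are URIs rather than strings, a client that dereferences the annotation can follow them to the corresponding Wikipedia resources.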

Tagging Experiments

While we were working on Maphub, its semantic tagging functionality became our core research interest. We conducted an in-lab user study with 26 participants to find out how semantic tagging differs from label-based tagging and found no significant difference in tag production capacity, in the types and categories of tags added, or in overall user task load. Hence, semantic tagging as implemented in Maphub could produce the same result as label-based tagging, with the main difference that semantic tagging gives references to unambiguous Web resources instead of semantically ambiguous labels. More details on the methodology and results of that experiment are described in our report (http://arxiv.org/abs/1304.1636).

Enabling Annotations and Semantic Tagging in other Applications

We found that semantic tagging might be useful for other application scenarios as well. Therefore, with the support we received from the Open Humanities Award, we added a semantic tagging feature to Annotorious (http://annotorious.github.io/), a JavaScript image annotation library that can be used on any website. Annotorious is also compatible with the Open Knowledge Foundation’s Annotator (http://annotatorjs.org/) tool. Our next research and development steps will go in two main directions: (i) providing a more efficient and lightweight (semantic) tag suggestion service, and (ii) improving tag recommendation strategies.