Dr. Dominik Benz – KDE – FB16 – Universität Kassel

Note: Dominik Benz has no longer been employed in this research group since 31 October 2012.

About me

I studied Computer Science with a minor in Psychology at the University of Freiburg. In May 2007, I joined the Knowledge and Data Engineering team in Kassel. I’m a senior developer of BibSonomy, and my scientific interests lie in bridging the gap between the Social and the Semantic Web. In my dissertation, I analyzed methods to capture emergent semantics from social annotation systems.

Among other things, my research in collaboration with several great colleagues has provided insight into the following areas:

A systematic semantic characterization of tag relatedness / abstractness measures in Social Annotation Systems
Evidence for a causal link between tagging pragmatics (i.e., how tags are being used for annotation) and tag semantics (i.e., what tags mean)
Methods to derive hierarchical relationships among collaboratively created tags

Teaching

Most courses I have taught or assisted with were held in German.

Lectures, Seminars

Supervised Theses

Automatische Webdokumentenklassifikation zur Unterstützung einer facettierten Suchmaschine an der Universität Kassel (Marek Bachmann)
Effiziente Analyse von Faktoren entstehender Semantik in Folksonomien (Tobias Gunkel)
Lernen von Ontologien aus kollaborativen Tagging-Systemen (Stefan Stützer)

Activities

Here you can find a summary of my scientific activities related to conferences, workshops and journals.

Co-chaired workshops & conferences
PC memberships
Reviewing

Research

My research focuses on supporting users during knowledge interaction tasks such as retrieval, structuring and collaboration. I am especially interested in Data Mining methods to discover latent semantic information, e.g. in social annotation data (such as data from tagging systems). I see Ontology Learning approaches as viable tools to bridge the gap between the Social and the Semantic Web.

Grants, Awards

Award Summa Cum Laude for my dissertation “Capturing Emergent Semantics from Social Annotation Systems”, 2012
Student Travel Award, Extended Semantic Web Conference (ESWC) 2011, Crete; Paper: One Tag to Bind Them All : Measuring Term Abstractness in Social Metadata
Student Travel Award, World Wide Web Conference (WWW) 2009, Barcelona, Spain; Paper: Evaluating Similarity Measures for Emergent Semantics of Social Tagging
Best Paper Honorable Mention, International Semantic Web Conference (ISWC) 2008, Karlsruhe; Paper: Semantic Analysis of Tag Similarity Measures in Collaborative Tagging Systems

Invited Talks

Literature Databases: Techniques and Tutorial (02/2005) Lab-Seminar, Department of Biology III, Freiburg (Prof. Ad Aertsen)
Self-emerging Semantics from Social Metadata: Factors, Methods and Evaluations of “Taxonomies made by you and me” (02/2011), Competence Center Information Retrieval & Machine Learning, DAI-Labors, TU Berlin (Dr. Ernesto William de Luca)
Web 2.0 – Hype or Revolution for Researchers? (11/2011), International Congress on Communicating Life Science and Technology, Strasbourg (COMM4BIOTECH)

PhD summary

Title: Capturing Emergent Semantics from Social Annotation Systems

The ongoing growth of the World Wide Web, catalyzed by the increasing possibility of ubiquitous access via a variety of devices, continues to strengthen its role as our prevalent information and communication medium. The vision of supporting both humans and machines in knowledge-based activities led to the development of different systems that allow Web resources to be structured by metadata annotations. Interestingly, two major approaches which gained a considerable amount of attention address the problem from nearly opposite directions: On the one hand, the idea of the Semantic Web suggests formalizing the knowledge within a particular domain by means of a “top-down” approach of defining ontologies. On the other hand, Social Annotation systems as part of the so-called Web 2.0 movement implement a “bottom-up” style of categorization using arbitrary keywords.

Experience as well as research into the characteristics of both systems has shown that their strengths and weaknesses seem to be inverse: While Social Annotation suffers from problems such as ambiguity or lack of precision, ontologies were especially designed to eliminate those. Conversely, ontologies suffer from a knowledge acquisition bottleneck, which is successfully overcome by the large user populations of Social Annotation systems. Because the two need not be regarded as competing paradigms, the obvious potential synergies of combining them motivated approaches to “bridge the gap” between them. These were fostered by the evidence of emergent semantics, i.e. the self-organized evolution of implicit conceptual structures, within Social Annotation data. While several techniques to exploit the emergent patterns were proposed, a systematic analysis, especially regarding paradigms from the field of ontology learning, is still largely missing. This also includes a deeper understanding of the circumstances which affect the evolution processes.

This work aims to address this gap by providing an in-depth study of methods and influencing factors to capture emergent semantics from Social Annotation Systems. We focus hereby on the acquisition of lexical semantics from the underlying networks of keywords, users and resources. Structured along different ontology learning tasks, we use a methodology of semantic grounding to characterize and evaluate the semantic relations captured by different methods. In all cases, our studies are based on datasets from several Social Annotation Systems.

Finally, we discuss a set of applications which operationalize our results for enhancing both Social Annotation and semantic systems. In summary, the contributions of this work highlight viable methods and crucial aspects for designing enhanced knowledge-based services of a Social Semantic Web.
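
As a rough illustration of what acquiring lexical semantics from the network of keywords, users and resources can look like, the following Python sketch compares tags by the resources they were assigned to. It is not taken from the dissertation: the toy data, the function names and the choice of cosine similarity over tag-resource counts are assumptions made for this example only.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy folksonomy: (user, tag, resource) tag assignments.
ASSIGNMENTS = [
    ("alice", "python", "r1"),
    ("alice", "programming", "r1"),
    ("bob", "python", "r2"),
    ("bob", "programming", "r2"),
    ("bob", "cooking", "r3"),
    ("carol", "recipes", "r3"),
    ("carol", "cooking", "r4"),
    ("carol", "recipes", "r4"),
]


def tag_resource_vectors(assignments):
    """Describe each tag by how often it was assigned to each resource."""
    vectors = defaultdict(Counter)
    for _user, tag, resource in assignments:
        vectors[tag][resource] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_related(tag, vectors):
    """Return the other tag whose resource vector is closest to `tag`."""
    scored = [
        (cosine(vectors[tag], vec), other)
        for other, vec in vectors.items()
        if other != tag
    ]
    return max(scored)[1]


if __name__ == "__main__":
    vectors = tag_resource_vectors(ASSIGNMENTS)
    print(most_related("python", vectors))
```

In this toy data, "python" and "programming" always annotate the same resources, so they come out as each other's closest neighbour. Whether such a relation counts as semantically meaningful is the kind of question the semantic grounding methodology described above is meant to answer.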

Projects

During my PhD, I have been involved in the following research, industry and software projects:

Research Projects

EveryAware: Enhancing Environmental Awareness through Social Information Technologies (EU project, since 2011)
Webzubi – Ein Web2.0-Netzwerk zur Gestaltung innovativer Berufsausbildung für gewerblich-technische Auszubildende (BMBF project, 2009-2012)
PUMA – Akademisches Publikationsmanagement (DFG project, 2009-2011)
TAGora – Semiotic Dynamics in Online Social Communities (EU project, 2006 – 2009)

Industry projects

Semantic Analysis of Keywords used in an internal Knowledge Base (K+S IT-Services GmbH, since 2011)

Software Projects

BibSonomy – a social bookmark and publication sharing system.

Streams

I participate in several Social Platforms – here you can find my latest activities.

BibSonomy bookmarks

BibSonomy publications

SlideShare presentations

Tweets

Commitments

Beyond my professional activities, I’m involved in a number of projects and events that I find well worth supporting:

Java User Group Nordhessen
There’s no better source for Java expertise and networking around Kassel.
Eclipse Demo Camps
Always stay up to date on my favourite IDE.
Freiwillig in Kassel
Voluntary work in Kassel – great projects, especially during a “volunteer day”.
Wintersause
Best student get-together in Kassel, in January – don’t miss the next event with live bands, DJs and more!
Tag der Technik
“Day of Technical Jobs” – pupils can visit different parts of the University on this day and get an impression of what’s going on there.
Sommercampus Freiburg
People sometimes complain that not enough technical skills are taught at universities. The “Sommercampus” is a great initiative in Freiburg, where students and University staff offer technical courses to counterbalance this issue.
ESG Kassel
I’ve offered a guitar course at the ESG – great team!

Publications

2012


@article{strohmaier2011evaluation,
author = {Strohmaier, Markus and Helic, Denis and Benz, Dominik and Körner, Christian and Kern, Roman},
journal = {Transactions on Intelligent Systems and Technology},
keywords = 2012,
title = {Evaluation of Folksonomy Induction Algorithms},
year = 2012
}
%0 Journal Article
%1 strohmaier2011evaluation
%A Strohmaier, Markus
%A Helic, Denis
%A Benz, Dominik
%A Körner, Christian
%A Kern, Roman
%D 2012
%J Transactions on Intelligent Systems and Technology
%T Evaluation of Folksonomy Induction Algorithms
%U http://tist.acm.org/index.html

2011


@article{martin2011enhancing,
author = {Atzmueller, Martin and Benz, Dominik and Doerfel, Stephan and Hotho, Andreas and Jäschke, Robert and Macek, Bjoern Elmar and Mitzlaff, Folke and Scholz, Christoph and Stumme, Gerd},
booktitle = {it - Information Technology},
journal = {it - Information Technology},
keywords = {conferator},
month = {05},
number = 3,
pages = {101--107},
publisher = {Oldenbourg Wissenschaftsverlag GmbH},
title = {Enhancing Social Interactions at Conferences},
volume = 53,
year = 2011
}
%0 Journal Article
%1 martin2011enhancing
%A Atzmueller, Martin
%A Benz, Dominik
%A Doerfel, Stephan
%A Hotho, Andreas
%A Jäschke, Robert
%A Macek, Bjoern Elmar
%A Mitzlaff, Folke
%A Scholz, Christoph
%A Stumme, Gerd
%B it - Information Technology
%D 2011
%I Oldenbourg Wissenschaftsverlag GmbH
%J it - Information Technology
%N 3
%P 101--107
%R 10.1524/itit.2011.0631
%T Enhancing Social Interactions at Conferences
%U http://dx.doi.org/10.1524/itit.2011.0631
%V 53

@inproceedings{mitzlaff2011community,
author = {Mitzlaff, Folke and Atzmueller, Martin and Benz, Dominik and Hotho, Andreas and Stumme, Gerd},
booktitle = {Analysis of Social Media and Ubiquitous Data},
keywords = {itegpub},
series = {LNAI},
title = {{Community Assessment using Evidence Networks}},
volume = 6904,
year = 2011
}
%0 Conference Paper
%1 mitzlaff2011community
%A Mitzlaff, Folke
%A Atzmueller, Martin
%A Benz, Dominik
%A Hotho, Andreas
%A Stumme, Gerd
%B Analysis of Social Media and Ubiquitous Data
%D 2011
%T {Community Assessment using Evidence Networks}
%V 6904

@inproceedings{atzmueller2011towards,
author = {Atzmüller, Martin and Benz, Dominik and Hotho, Andreas and Stumme, Gerd},
booktitle = {Proceedings of the 4th international workshop on Social Data on the Web (SDoW2011)},
editor = {Passant, Alexandre and Fernández, Sergio and Breslin, John and Bojārs, Uldis},
keywords = 2011,
title = {Towards Mining Semantic Maturity in Social Bookmarking Systems},
year = 2011
}
%0 Conference Paper
%1 atzmueller2011towards
%A Atzmüller, Martin
%A Benz, Dominik
%A Hotho, Andreas
%A Stumme, Gerd
%B Proceedings of the 4th international workshop on Social Data on the Web (SDoW2011)
%D 2011
%E Passant, Alexandre
%E Fernández, Sergio
%E Breslin, John
%E Bojārs, Uldis
%T Towards Mining Semantic Maturity in Social Bookmarking Systems
%U https://www.kde.cs.uni-kassel.de/pub/pdf/atzmueller2011towards.pdf

Recent research has demonstrated how the widespread adoption of collaborative tagging systems yields emergent semantics. In recent years, much has been learned about how to harvest the data produced by taggers for engineering light-weight ontologies. For example, existing measures of tag similarity and tag relatedness have proven crucial step stones for making latent semantic relations in tagging systems explicit. However, little progress has been made on other issues, such as understanding the different levels of tag generality (or tag abstractness), which is essential for, among others, identifying hierarchical relationships between concepts. In this paper we aim to address this gap. Starting from a review of linguistic definitions of word abstractness, we first use several large-scale ontologies and taxonomies as grounded measures of word generality, including Yago, Wordnet, DMOZ and Wikitaxonomy. Then, we introduce and apply several folksonomy-based methods to measure the level of generality of given tags. We evaluate these methods by comparing them with the grounded measures. Our results suggest that the generality of tags in social tagging systems can be approximated with simple measures. Our work has implications for a number of problems related to social tagging systems, including search, tag recommendation, and the acquisition of light-weight ontologies from tagging data.
@inproceedings{benz2011measuring,
abstract = {Recent research has demonstrated how the widespread adoption of collaborative tagging systems yields emergent semantics. In recent years, much has been learned about how to harvest the data produced by taggers for engineering light-weight ontologies. For example, existing measures of tag similarity and tag relatedness have proven crucial step stones for making latent semantic relations in tagging systems explicit. However, little progress has been made on other issues, such as understanding the different levels of tag generality (or tag abstractness), which is essential for, among others, identifying hierarchical relationships between concepts. In this paper we aim to address this gap. Starting from a review of linguistic definitions of word abstractness, we first use several large-scale ontologies and taxonomies as grounded measures of word generality, including Yago, Wordnet, DMOZ and Wikitaxonomy. Then, we introduce and apply several folksonomy-based methods to measure the level of generality of given tags. We evaluate these methods by comparing them with the grounded measures. Our results suggest that the generality of tags in social tagging systems can be approximated with simple measures. Our work has implications for a number of problems related to social tagging systems, including search, tag recommendation, and the acquisition of light-weight ontologies from tagging data.},
address = {Heraklion, Crete},
author = {Benz, Dominik and Körner, Christian and Hotho, Andreas and Stumme, Gerd and Strohmaier, Markus},
booktitle = {Proceedings of the 8th Extended Semantic Web Conference (ESWC 2011)},
editor = {Antoniou, Grigoris and Grobelnik, Marko and Simperl, Elena and Parsia, Bijan and Plexousakis, Dimitris and Pan, Jeff and Leenheer, Pieter De},
keywords = 2011,
month = {05},
title = {One Tag to Bind Them All : Measuring Term Abstractness in Social Metadata},
year = 2011
}
%0 Conference Paper
%1 benz2011measuring
%A Benz, Dominik
%A Körner, Christian
%A Hotho, Andreas
%A Stumme, Gerd
%A Strohmaier, Markus
%B Proceedings of the 8th Extended Semantic Web Conference (ESWC 2011)
%C Heraklion, Crete
%D 2011
%E Antoniou, Grigoris
%E Grobelnik, Marko
%E Simperl, Elena
%E Parsia, Bijan
%E Plexousakis, Dimitris
%E Pan, Jeff
%E Leenheer, Pieter De
%T One Tag to Bind Them All : Measuring Term Abstractness in Social Metadata
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2011measuring.pdf

2010


@inproceedings{mitzlaff2010visit,
address = {Toronto, Canada},
author = {Mitzlaff, Folke and Benz, Dominik and Stumme, Gerd and Hotho, Andreas},
booktitle = {Proceedings of the 21st ACM conference on Hypertext and hypermedia},
keywords = {itegpub},
note = {(to appear)},
title = {Visit me, click me, be my friend: An analysis of evidence networks of user relationships in Bibsonomy},
year = 2010
}
%0 Conference Paper
%1 mitzlaff2010visit
%A Mitzlaff, Folke
%A Benz, Dominik
%A Stumme, Gerd
%A Hotho, Andreas
%B Proceedings of the 21st ACM conference on Hypertext and hypermedia
%C Toronto, Canada
%D 2010
%T Visit me, click me, be my friend: An analysis of evidence networks of user relationships in Bibsonomy
%U https://www.kde.cs.uni-kassel.de/pub/pdf/mitzlaff2010visit.pdf

@inproceedings{koerner2010social,
address = {Riva del Garda Fierecongressi, Trento, Italy},
author = {Körner, Christian and Benz, Dominik and Hotho, Andreas and Strohmaier, Markus and Stumme, Gerd},
booktitle = {Proceedings of INSNA Sunbelt XXX},
keywords = 2010,
title = {Social Bookmarking Systems: Verbosity Improves Semantics},
year = 2010
}
%0 Conference Paper
%1 koerner2010social
%A Körner, Christian
%A Benz, Dominik
%A Hotho, Andreas
%A Strohmaier, Markus
%A Stumme, Gerd
%B Proceedings of INSNA Sunbelt XXX
%C Riva del Garda Fierecongressi, Trento, Italy
%D 2010
%T Social Bookmarking Systems: Verbosity Improves Semantics

Social resource sharing systems are central elements of the Web 2.0 and use the same kind of lightweight knowledge representation, called folksonomy. Their large user communities and ever-growing networks of user-generated content have made them an attractive object of investigation for researchers from different disciplines like Social Network Analysis, Data Mining, Information Retrieval or Knowledge Discovery. In this paper, we summarize and extend our work on different aspects of this branch of Web 2.0 research, demonstrated and evaluated within our own social bookmark and publication sharing system BibSonomy, which is currently among the three most popular systems of its kind. We structure this presentation along the different interaction phases of a user with our system, coupling the relevant research questions of each phase with the corresponding implementation issues. This approach reveals in a systematic fashion important aspects and results of the broad bandwidth of folksonomy research like capturing of emergent semantics, spam detection, ranking algorithms, analogies to search engine log data, personalized tag recommendations and information extraction techniques. We conclude that when integrating a real-life application like BibSonomy into research, certain constraints have to be considered; but in general, the tight interplay between our scientific work and the running system has made BibSonomy a valuable platform for demonstrating and evaluating Web 2.0 research.
@article{benz2010social,
abstract = {Social resource sharing systems are central elements of the Web 2.0 and use the same kind of lightweight knowledge representation, called folksonomy. Their large user communities and ever-growing networks of user-generated content have made them an attractive object of investigation for researchers from different disciplines like Social Network Analysis, Data Mining, Information Retrieval or Knowledge Discovery. In this paper, we summarize and extend our work on different aspects of this branch of Web 2.0 research, demonstrated and evaluated within our own social bookmark and publication sharing system BibSonomy, which is currently among the three most popular systems of its kind. We structure this presentation along the different interaction phases of a user with our system, coupling the relevant research questions of each phase with the corresponding implementation issues. This approach reveals in a systematic fashion important aspects and results of the broad bandwidth of folksonomy research like capturing of emergent semantics, spam detection, ranking algorithms, analogies to search engine log data, personalized tag recommendations and information extraction techniques. We conclude that when integrating a real-life application like BibSonomy into research, certain constraints have to be considered; but in general, the tight interplay between our scientific work and the running system has made BibSonomy a valuable platform for demonstrating and evaluating Web 2.0 research.},
address = {Berlin / Heidelberg},
author = {Benz, Dominik and Hotho, Andreas and Jäschke, Robert and Krause, Beate and Mitzlaff, Folke and Schmitz, Christoph and Stumme, Gerd},
journal = {The VLDB Journal},
keywords = {bibsonomy},
pages = {849-875},
publisher = {Springer},
title = {The social bookmark and publication management system bibsonomy},
volume = 19,
year = 2010
}
%0 Journal Article
%1 benz2010social
%A Benz, Dominik
%A Hotho, Andreas
%A Jäschke, Robert
%A Krause, Beate
%A Mitzlaff, Folke
%A Schmitz, Christoph
%A Stumme, Gerd
%C Berlin / Heidelberg
%D 2010
%I Springer
%J The VLDB Journal
%P 849-875
%R 10.1007/s00778-010-0208-4
%T The social bookmark and publication management system bibsonomy
%U http://dx.doi.org/10.1007/s00778-010-0208-4
%V 19

@inproceedings{benz2010semantics,
address = {Raleigh, NC, USA},
author = {Benz, Dominik and Hotho, Andreas and Stützer, Stefan and Stumme, Gerd},
booktitle = {Proceedings of the 2nd Web Science Conference (WebSci10)},
keywords = {itegpub},
title = {Semantics made by you and me: Self-emerging ontologies can capture the diversity of shared knowledge},
year = 2010
}
%0 Conference Paper
%1 benz2010semantics
%A Benz, Dominik
%A Hotho, Andreas
%A Stützer, Stefan
%A Stumme, Gerd
%B Proceedings of the 2nd Web Science Conference (WebSci10)
%C Raleigh, NC, USA
%D 2010
%T Semantics made by you and me: Self-emerging ontologies can capture the diversity of shared knowledge
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2010semantics.pdf

@proceedings{atzmueller2010proceedings,
editor = {Atzmueller, Martin and Benz, Dominik and Hotho, Andreas and Stumme, Gerd},
keywords = 2010,
publisher = {Department of Electrical Engineering/Computer Science, Kassel University},
series = {Technical report (KIS), 2010-10},
title = {{Proceedings of the LWA 2010 - Lernen, Wissen, Adaptivität}},
year = 2010
}
%0 Conference Proceedings
%1 atzmueller2010proceedings
%B Technical report (KIS), 2010-10
%D 2010
%E Atzmueller, Martin
%E Benz, Dominik
%E Hotho, Andreas
%E Stumme, Gerd
%I Department of Electrical Engineering/Computer Science, Kassel University
%T {Proceedings of the LWA 2010 - Lernen, Wissen, Adaptivität}
%U https://www.kde.cs.uni-kassel.de/pub/pdf/atzmueller2010proceedings.pdf

Community mining is a prominent approach for identifying (user) communities in social and ubiquitous contexts. While there are a variety of methods for community mining and detection, the effective evaluation and validation of the mined communities is usually non-trivial. Often there is no evaluation data at hand in order to validate the discovered groups. This paper proposes evidence networks using implicit information for the evaluation of communities. The presented evaluation approach is based on the idea of reconstructing existing social structures for the assessment and evaluation of a given clustering. We analyze and compare the presented evidence networks using user data from the real-world social bookmarking application BibSonomy. The results indicate that the evidence networks reflect the relative rating of the explicit ones very well.
@inproceedings{mitzlaff2010community,
abstract = {Community mining is a prominent approach for identifying (user) communities in social and ubiquitous contexts. While there are a variety of methods for community mining and detection, the effective evaluation and validation of the mined communities is usually non-trivial. Often there is no evaluation data at hand in order to validate the discovered groups. This paper proposes evidence networks using implicit information for the evaluation of communities. The presented evaluation approach is based on the idea of reconstructing existing social structures for the assessment and evaluation of a given clustering. We analyze and compare the presented evidence networks using user data from the real-world social bookmarking application BibSonomy. The results indicate that the evidence networks reflect the relative rating of the explicit ones very well.},
address = {Barcelona, Spain},
author = {Mitzlaff, Folke and Atzmüller, Martin and Benz, Dominik and Hotho, Andreas and Stumme, Gerd},
booktitle = {Proceedings of the Workshop on Mining Ubiquitous and Social Environments (MUSE2010)},
keywords = {evidence_networks},
title = {Community Assessment using Evidence Networks},
year = 2010
}
%0 Conference Paper
%1 mitzlaff2010community
%A Mitzlaff, Folke
%A Atzmüller, Martin
%A Benz, Dominik
%A Hotho, Andreas
%A Stumme, Gerd
%B Proceedings of the Workshop on Mining Ubiquitous and Social Environments (MUSE2010)
%C Barcelona, Spain
%D 2010
%T Community Assessment using Evidence Networks
%U https://www.kde.cs.uni-kassel.de/pub/pdf/mitzlaff2010community.pdf

Recent research provides evidence for the presence of emergent semantics in collaborative tagging systems. While several methods have been proposed, little is known about the factors that influence the evolution of semantic structures in these systems. A natural hypothesis is that the quality of the emergent semantics depends on the pragmatics of tagging: Users with certain usage patterns might contribute more to the resulting semantics than others. In this work, we propose several measures which enable a pragmatic differentiation of taggers by their degree of contribution to emerging semantic structures. We distinguish between categorizers, who typically use a small set of tags as a replacement for hierarchical classification schemes, and describers, who are annotating resources with a wealth of freely associated, descriptive keywords. To study our hypothesis, we apply semantic similarity measures to 64 different partitions of a real-world and large-scale folksonomy containing different ratios of categorizers and describers. Our results not only show that ‘verbose’ taggers are most useful for the emergence of tag semantics, but also that a subset containing only 40% of the most ‘verbose’ taggers can produce results that match and even outperform the semantic precision obtained from the whole dataset. Moreover, the results suggest that there exists a causal link between the pragmatics of tagging and resulting emergent semantics. This work is relevant for designers and analysts of tagging systems interested (i) in fostering the semantic development of their platforms, (ii) in identifying users introducing “semantic noise”, and (iii) in learning ontologies.
@inproceedings{koerner2010stop,
abstract = {Recent research provides evidence for the presence of emergent semantics in collaborative tagging systems. While several methods have been proposed, little is known about the factors that influence the evolution of semantic structures in these systems. A natural hypothesis is that the quality of the emergent semantics depends on the pragmatics of tagging: Users with certain usage patterns might contribute more to the resulting semantics than others. In this work, we propose several measures which enable a pragmatic differentiation of taggers by their degree of contribution to emerging semantic structures. We distinguish between categorizers, who typically use a small set of tags as a replacement for hierarchical classification schemes, and describers, who are annotating resources with a wealth of freely associated, descriptive keywords. To study our hypothesis, we apply semantic similarity measures to 64 different partitions of a real-world and large-scale folksonomy containing different ratios of categorizers and describers. Our results not only show that ‘verbose’ taggers are most useful for the emergence of tag semantics, but also that a subset containing only 40% of the most ‘verbose’ taggers can produce results that match and even outperform the semantic precision obtained from the whole dataset. Moreover, the results suggest that there exists a causal link between the pragmatics of tagging and resulting emergent semantics. This work is relevant for designers and analysts of tagging systems interested (i) in fostering the semantic development of their platforms, (ii) in identifying users introducing “semantic noise”, and (iii) in learning ontologies.},
address = {Raleigh, NC, USA},
author = {Körner, Christian and Benz, Dominik and Strohmaier, Markus and Hotho, Andreas and Stumme, Gerd},
booktitle = {Proceedings of the 19th International World Wide Web Conference (WWW 2010)},
keywords = {collaborative_verbosity},
month = {04},
publisher = {ACM},
title = {Stop Thinking, start Tagging - Tag Semantics emerge from Collaborative Verbosity},
year = 2010
}
%0 Conference Paper
%1 koerner2010stop
%A Körner, Christian
%A Benz, Dominik
%A Strohmaier, Markus
%A Hotho, Andreas
%A Stumme, Gerd
%B Proceedings of the 19th International World Wide Web Conference (WWW 2010)
%C Raleigh, NC, USA
%D 2010
%I ACM
%T Stop Thinking, start Tagging - Tag Semantics emerge from Collaborative Verbosity
%U https://www.kde.cs.uni-kassel.de/pub/pdf/koerner2010stop.pdf
%X Recent research provides evidence for the presence of emergent semantics in collaborative tagging systems. While several methods have been proposed, little is known about the factors that influence the evolution of semantic structures in these systems. A natural hypothesis is that the quality of the emergent semantics depends on the pragmatics of tagging: Users with certain usage patterns might contribute more to the resulting semantics than others. In this work, we propose several measures which enable a pragmatic differentiation of taggers by their degree of contribution to emerging semantic structures. We distinguish between categorizers, who typically use a small set of tags as a replacement for hierarchical classification schemes, and describers, who are annotating resources with a wealth of freely associated, descriptive keywords. To study our hypothesis, we apply semantic similarity measures to 64 different partitions of a real-world and large-scale folksonomy containing different ratios of categorizers and describers. Our results not only show that ‘verbose’ taggers are most useful for the emergence of tag semantics, but also that a subset containing only 40% of the most ‘verbose’ taggers can produce results that match and even outperform the semantic precision obtained from the whole dataset. Moreover, the results suggest that there exists a causal link between the pragmatics of tagging and resulting emergent semantics. This work is relevant for designers and analysts of tagging systems interested (i) in fostering the semantic development of their platforms, (ii) in identifying users introducing “semantic noise”, and (iii) in learning ontologies.

Query logs provide a valuable resource for preference information in search. A user clicking on a specific resource after submitting a query indicates that the resource has some relevance with respect to the query. To leverage the information of query logs, one can relate submitted queries from specific users to their clicked resources and build a tripartite graph of users, resources and queries. This graph resembles the folksonomy structure of social bookmarking systems, where users add tags to resources. In this article, we summarize our work on building folksonomies from query log files. The focus is on three comparative studies of the system’s content, structure and semantics. Our results show that query logs incorporate typical folksonomy properties and that approaches to leverage the inherent semantics of folksonomies can be applied to query logs as well.
@article{benz2010query,
abstract = {Query logs provide a valuable resource for preference information in search. A user clicking on a specific resource after submitting a query indicates that the resource has some relevance with respect to the query. To leverage the information of query logs, one can relate submitted queries from specific users to their clicked resources and build a tripartite graph of users, resources and queries. This graph resembles the folksonomy structure of social bookmarking systems, where users add tags to resources. In this article, we summarize our work on building folksonomies from query log files. The focus is on three comparative studies of the system’s content, structure and semantics. Our results show that query logs incorporate typical folksonomy properties and that approaches to leverage the inherent semantics of folksonomies can be applied to query logs as well.},
author = {Benz, Dominik and Hotho, Andreas and Jäschke, Robert and Krause, Beate and Stumme, Gerd},
journal = {Datenbank-Spektrum},
keywords = {itegpub},
month = {06},
number = 1,
pages = {15--24},
title = {Query Logs as Folksonomies},
volume = 10,
year = 2010
}
%0 Journal Article
%1 benz2010query
%A Benz, Dominik
%A Hotho, Andreas
%A Jäschke, Robert
%A Krause, Beate
%A Stumme, Gerd
%D 2010
%J Datenbank-Spektrum
%N 1
%P 15--24
%T Query Logs as Folksonomies
%U http://dx.doi.org/10.1007/s13222-010-0004-8
%V 10
%X Query logs provide a valuable resource for preference information in search. A user clicking on a specific resource after submitting a query indicates that the resource has some relevance with respect to the query. To leverage the information of query logs, one can relate submitted queries from specific users to their clicked resources and build a tripartite graph of users, resources and queries. This graph resembles the folksonomy structure of social bookmarking systems, where users add tags to resources. In this article, we summarize our work on building folksonomies from query log files. The focus is on three comparative studies of the system’s content, structure and semantics. Our results show that query logs incorporate typical folksonomy properties and that approaches to leverage the inherent semantics of folksonomies can be applied to query logs as well.

The PUMA project fosters the Open Access movement and aims at a better support of the researcher’s publication work. PUMA stands for an integrated solution, where the upload of a publication results automatically in an update of both the personal and institutional homepage, the creation of an entry in a social bookmarking system like BibSonomy, an entry in the academic reporting system of the university, and its publication in the institutional repository. In this poster, we present the main features of our solution.
@inproceedings{benz2010academic,
abstract = {The PUMA project fosters the Open Access movement and aims at a better support of the researcher’s publication work. PUMA stands for an integrated solution, where the upload of a publication results automatically in an update of both the personal and institutional homepage, the creation of an entry in a social bookmarking system like BibSonomy, an entry in the academic reporting system of the university, and its publication in the institutional repository. In this poster, we present the main features of our solution.},
address = {Berlin/Heidelberg},
author = {Benz, Dominik and Hotho, Andreas and Jäschke, Robert and Stumme, Gerd and Halle, Axel and Lima, Angela Gerlach Sanches and Steenweg, Helge and Stefani, Sven},
booktitle = {Proceedings of the European Conference on Research and Advanced Technology for Digital Libraries (ECDL) 2010},
editor = {Lalmas, M. and Jose, J. and Rauber, A. and Sebastiani, F. and Frommholz, I.},
keywords = {puma},
pages = {417--420},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
title = {Academic Publication Management with PUMA - collect, organize and share publications},
volume = 6273,
year = 2010
}
%0 Conference Paper
%1 benz2010academic
%A Benz, Dominik
%A Hotho, Andreas
%A Jäschke, Robert
%A Stumme, Gerd
%A Halle, Axel
%A Lima, Angela Gerlach Sanches
%A Steenweg, Helge
%A Stefani, Sven
%B Proceedings of the European Conference on Research and Advanced Technology for Digital Libraries (ECDL) 2010
%C Berlin/Heidelberg
%D 2010
%E Lalmas, M.
%E Jose, J.
%E Rauber, A.
%E Sebastiani, F.
%E Frommholz, I.
%I Springer
%P 417--420
%T Academic Publication Management with PUMA - collect, organize and share publications
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2010academic.pdf
%V 6273
%X The PUMA project fosters the Open Access movement and aims at a better support of the researcher’s publication work. PUMA stands for an integrated solution, where the upload of a publication results automatically in an update of both the personal and institutional homepage, the creation of an entry in a social bookmarking system like BibSonomy, an entry in the academic reporting system of the university, and its publication in the institutional repository. In this poster, we present the main features of our solution.

2009


@inproceedings{benz2009characterizing,
address = {Bled, Slovenia},
author = {Benz, Dominik and Krause, Beate and Kumar, G. Praveen and Hotho, Andreas and Stumme, Gerd},
booktitle = {Proceedings of the 1st Workshop on Explorative Analytics of Information Networks (EIN2009)},
keywords = {itegpub},
month = {09},
title = {Characterizing Semantic Relatedness of Search Query Terms},
year = 2009
}
%0 Conference Paper
%1 benz2009characterizing
%A Benz, Dominik
%A Krause, Beate
%A Kumar, G. Praveen
%A Hotho, Andreas
%A Stumme, Gerd
%B Proceedings of the 1st Workshop on Explorative Analytics of Information Networks (EIN2009)
%C Bled, Slovenia
%D 2009
%T Characterizing Semantic Relatedness of Search Query Terms
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2009characterizing.pdf

In this demo we present BibSonomy, a social bookmark and publication sharing system.
@inproceedings{benz2009managing,
abstract = {In this demo we present BibSonomy, a social bookmark and publication sharing system.},
address = {New York, NY, USA},
author = {Benz, Dominik and Eisterlehner, Folke and Hotho, Andreas and Jäschke, Robert and Krause, Beate and Stumme, Gerd},
booktitle = {HT '09: Proceedings of the 20th ACM Conference on Hypertext and Hypermedia},
editor = {Cattuto, Ciro and Ruffo, Giancarlo and Menczer, Filippo},
keywords = {itegpub},
month = {06},
pages = {323--324},
publisher = {ACM},
title = {Managing publications and bookmarks with BibSonomy},
year = 2009
}
%0 Conference Paper
%1 benz2009managing
%A Benz, Dominik
%A Eisterlehner, Folke
%A Hotho, Andreas
%A Jäschke, Robert
%A Krause, Beate
%A Stumme, Gerd
%B HT '09: Proceedings of the 20th ACM Conference on Hypertext and Hypermedia
%C New York, NY, USA
%D 2009
%E Cattuto, Ciro
%E Ruffo, Giancarlo
%E Menczer, Filippo
%I ACM
%P 323--324
%R 10.1145/1557914.1557969
%T Managing publications and bookmarks with BibSonomy
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2009managing.pdf
%X In this demo we present BibSonomy, a social bookmark and publication sharing system.
%@ 978-1-60558-486-7

Social bookmarking systems and their emergent information structures, known as folksonomies, are increasingly important data sources for Semantic Web applications. A key question for harvesting semantics from these systems is how to extend and adapt traditional notions of similarity to folksonomies, and which measures are best suited for applications such as navigation support, semantic search, and ontology learning. Here we build an evaluation framework to compare various general folksonomy-based similarity measures derived from established information-theoretic, statistical, and practical measures. Our framework deals generally and symmetrically with users, tags, and resources. For evaluation purposes we focus on similarity among tags and resources, considering different ways to aggregate annotations across users. After comparing how tag similarity measures predict user-created tag relations, we provide an external grounding by user-validated semantic proxies based on WordNet and the Open Directory. We also investigate the issue of scalability. We find that mutual information with distributional micro-aggregation across users yields the highest accuracy, but is not scalable; per-user projection with collaborative aggregation provides the best scalable approach via incremental computations. The results are consistent across resource and tag similarity.
@inproceedings{markines2009evaluating,
abstract = {Social bookmarking systems and their emergent information structures, known as folksonomies, are increasingly important data sources for Semantic Web applications. A key question for harvesting semantics from these systems is how to extend and adapt traditional notions of similarity to folksonomies, and which measures are best suited for applications such as navigation support, semantic search, and ontology learning. Here we build an evaluation framework to compare various general folksonomy-based similarity measures derived from established information-theoretic, statistical, and practical measures. Our framework deals generally and symmetrically with users, tags, and resources. For evaluation purposes we focus on similarity among tags and resources, considering different ways to aggregate annotations across users. After comparing how tag similarity measures predict user-created tag relations, we provide an external grounding by user-validated semantic proxies based on WordNet and the Open Directory. We also investigate the issue of scalability. We find that mutual information with distributional micro-aggregation across users yields the highest accuracy, but is not scalable; per-user projection with collaborative aggregation provides the best scalable approach via incremental computations. The results are consistent across resource and tag similarity.},
author = {Markines, Benjamin and Cattuto, Ciro and Menczer, Filippo and Benz, Dominik and Hotho, Andreas and Stumme, Gerd},
booktitle = {18th International World Wide Web Conference},
keywords = {itegpub},
month = {04},
pages = {641--641},
title = {Evaluating Similarity Measures for Emergent Semantics of Social Tagging},
year = 2009
}
%0 Conference Paper
%1 markines2009evaluating
%A Markines, Benjamin
%A Cattuto, Ciro
%A Menczer, Filippo
%A Benz, Dominik
%A Hotho, Andreas
%A Stumme, Gerd
%B 18th International World Wide Web Conference
%D 2009
%P 641--641
%T Evaluating Similarity Measures for Emergent Semantics of Social Tagging
%U https://www.kde.cs.uni-kassel.de/pub/pdf/markines2009evaluating.pdf
%X Social bookmarking systems and their emergent information structures, known as folksonomies, are increasingly important data sources for Semantic Web applications. A key question for harvesting semantics from these systems is how to extend and adapt traditional notions of similarity to folksonomies, and which measures are best suited for applications such as navigation support, semantic search, and ontology learning. Here we build an evaluation framework to compare various general folksonomy-based similarity measures derived from established information-theoretic, statistical, and practical measures. Our framework deals generally and symmetrically with users, tags, and resources. For evaluation purposes we focus on similarity among tags and resources, considering different ways to aggregate annotations across users. After comparing how tag similarity measures predict user-created tag relations, we provide an external grounding by user-validated semantic proxies based on WordNet and the Open Directory. We also investigate the issue of scalability. We find that mutual information with distributional micro-aggregation across users yields the highest accuracy, but is not scalable; per-user projection with collaborative aggregation provides the best scalable approach via incremental computations. The results are consistent across resource and tag similarity.

BibSonomy is a collaborative tagging system (social bookmarking system) operated by the Knowledge and Data Engineering group at the University of Kassel. It allows users to store and organize web bookmarks and metadata for scientific publications. In this chapter we describe the functionality BibSonomy provides, the architecture behind it, and the underlying data model. We also discuss example applications and cover methods for analyzing the data contained in BibSonomy and similar systems.
@incollection{hotho2009social,
abstract = {BibSonomy is a collaborative tagging system (social bookmarking system) operated by the Knowledge and Data Engineering group at the University of Kassel. It allows users to store and organize web bookmarks and metadata for scientific publications. In this chapter we describe the functionality BibSonomy provides, the architecture behind it, and the underlying data model. We also discuss example applications and cover methods for analyzing the data contained in BibSonomy and similar systems.},
address = {Berlin, Heidelberg},
author = {Hotho, Andreas and Jäschke, Robert and Benz, Dominik and Grahl, Miranda and Krause, Beate and Schmitz, Christoph and Stumme, Gerd},
booktitle = {Social Semantic Web},
chapter = 18,
editor = {Blumauer, Andreas and Pellegrini, Tassilo},
keywords = {itegpub},
pages = {363--391},
publisher = {Springer},
series = {X.media.press},
title = {Social Bookmarking am Beispiel BibSonomy},
year = 2009
}
%0 Book Section
%1 hotho2009social
%A Hotho, Andreas
%A Jäschke, Robert
%A Benz, Dominik
%A Grahl, Miranda
%A Krause, Beate
%A Schmitz, Christoph
%A Stumme, Gerd
%B Social Semantic Web
%C Berlin, Heidelberg
%D 2009
%E Blumauer, Andreas
%E Pellegrini, Tassilo
%I Springer
%P 363--391
%R 10.1007/978-3-540-72216-8
%T Social Bookmarking am Beispiel BibSonomy
%U https://www.kde.cs.uni-kassel.de/pub/pdf/hotho2009social.pdf
%X BibSonomy is a collaborative tagging system (social bookmarking system) operated by the Knowledge and Data Engineering group at the University of Kassel. It allows users to store and organize web bookmarks and metadata for scientific publications. In this chapter we describe the functionality BibSonomy provides, the architecture behind it, and the underlying data model. We also discuss example applications and cover methods for analyzing the data contained in BibSonomy and similar systems.
%& 18
%@ 978-3-540-72215-1

2008


@book{hotho2008challenge,
editor = {Hotho, Andreas and Benz, Dominik and Jäschke, Robert and Krause, Beate},
keywords = 2008,
publisher = {Workshop at 18th Europ. Conf. on Machine Learning (ECML'08) / 11th Europ. Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD'08)},
title = {ECML PKDD Discovery Challenge 2008 (RSDC'08)},
year = 2008
}
%0 Book
%1 hotho2008challenge
%D 2008
%E Hotho, Andreas
%E Benz, Dominik
%E Jäschke, Robert
%E Krause, Beate
%I Workshop at 18th Europ. Conf. on Machine Learning (ECML'08) / 11th Europ. Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD'08)
%T ECML PKDD Discovery Challenge 2008 (RSDC'08)
%U https://www.kde.cs.uni-kassel.de/ws/rsdc08/pdf/all_rsdc_v2.pdf

The objective of our group was to exploit state-of-the-art Information Retrieval methods for finding associations and dependencies between tags, capturing and representing differences in tagging behavior and vocabulary of various folksonomies, with the overall aim to better understand the semantics of tags and the tagging process. Therefore we analyze the semantic content of tags in the Flickr and Delicious folksonomies. We find that: tag context similarity leads to meaningful results in Flickr, despite its narrow folksonomy character; the comparison of tags across Flickr and Delicious shows little semantic overlap, being tags in Flickr associated more to visual aspects rather than technological as it seems to be in Delicious; there are regions in the tag-tag space, provided with the cosine similarity metric, that are characterized by high density; the order of tags inside a post has a semantic relevance.
@inproceedings{benz2008analyzing,
abstract = {The objective of our group was to exploit state-of-the-art Information Retrieval methods for finding associations and dependencies between tags, capturing and representing differences in tagging behavior and vocabulary of various folksonomies, with the overall aim to better understand the semantics of tags and the tagging process. Therefore we analyze the semantic content of tags in the Flickr and Delicious folksonomies. We find that: tag context similarity leads to meaningful results in Flickr, despite its narrow folksonomy character; the comparison of tags across Flickr and Delicious shows little semantic overlap, being tags in Flickr associated more to visual aspects rather than technological as it seems to be in Delicious; there are regions in the tag-tag space, provided with the cosine similarity metric, that are characterized by high density; the order of tags inside a post has a semantic relevance.},
author = {Benz, Dominik and Grobelnik, Marko and Hotho, Andreas and Jäschke, Robert and Mladenic, Dunja and Servedio, Vito D. P. and Sizov, Sergej and Szomszor, Martin},
booktitle = {Proceedings of the Dagstuhl Seminar on Social Web Communities},
editor = {Alani, Harith and Staab, Steffen and Stumme, Gerd},
keywords = {itegpub},
number = {08391},
title = {Analyzing Tag Semantics Across Collaborative Tagging Systems},
year = 2008
}
%0 Conference Paper
%1 benz2008analyzing
%A Benz, Dominik
%A Grobelnik, Marko
%A Hotho, Andreas
%A Jäschke, Robert
%A Mladenic, Dunja
%A Servedio, Vito D. P.
%A Sizov, Sergej
%A Szomszor, Martin
%B Proceedings of the Dagstuhl Seminar on Social Web Communities
%D 2008
%E Alani, Harith
%E Staab, Steffen
%E Stumme, Gerd
%N 08391
%T Analyzing Tag Semantics Across Collaborative Tagging Systems
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2008analyzing.pdf
%X The objective of our group was to exploit state-of-the-art Information Retrieval methods for finding associations and dependencies between tags, capturing and representing differences in tagging behavior and vocabulary of various folksonomies, with the overall aim to better understand the semantics of tags and the tagging process. Therefore we analyze the semantic content of tags in the Flickr and Delicious folksonomies. We find that: tag context similarity leads to meaningful results in Flickr, despite its narrow folksonomy character; the comparison of tags across Flickr and Delicious shows little semantic overlap, being tags in Flickr associated more to visual aspects rather than technological as it seems to be in Delicious; there are regions in the tag-tag space, provided with the cosine similarity metric, that are characterized by high density; the order of tags inside a post has a semantic relevance.

Social bookmarking systems allow users to organise collections of resources on the Web in a collaborative fashion. The increasing popularity of these systems as well as first insights into their emergent semantics have made them relevant to disciplines like knowledge extraction and ontology learning. The problem of devising methods to measure the semantic relatedness between tags and characterizing it semantically is still largely open. Here we analyze three measures of tag relatedness: tag co-occurrence, cosine similarity of co-occurrence distributions, and FolkRank, an adaptation of the PageRank algorithm to folksonomies. Each measure is computed on tags from a large-scale dataset crawled from the social bookmarking system del.icio.us. To provide a semantic grounding of our findings, a connection to WordNet (a semantic lexicon for the English language) is established by mapping tags into synonym sets of WordNet, and applying there well-known metrics of semantic similarity. Our results clearly expose different characteristics of the selected measures of relatedness, making them applicable to different subtasks of knowledge extraction such as synonym detection or discovery of concept hierarchies.
@inproceedings{cattuto2008semantic,
abstract = {Social bookmarking systems allow users to organise collections of resources on the Web in a collaborative fashion. The increasing popularity of these systems as well as first insights into their emergent semantics have made them relevant to disciplines like knowledge extraction and ontology learning. The problem of devising methods to measure the semantic relatedness between tags and characterizing it semantically is still largely open. Here we analyze three measures of tag relatedness: tag co-occurrence, cosine similarity of co-occurrence distributions, and FolkRank, an adaptation of the PageRank algorithm to folksonomies. Each measure is computed on tags from a large-scale dataset crawled from the social bookmarking system del.icio.us. To provide a semantic grounding of our findings, a connection to WordNet (a semantic lexicon for the English language) is established by mapping tags into synonym sets of WordNet, and applying there well-known metrics of semantic similarity. Our results clearly expose different characteristics of the selected measures of relatedness, making them applicable to different subtasks of knowledge extraction such as synonym detection or discovery of concept hierarchies.},
address = {Patras, Greece},
author = {Cattuto, Ciro and Benz, Dominik and Hotho, Andreas and Stumme, Gerd},
booktitle = {Proceedings of the 3rd Workshop on Ontology Learning and Population (OLP3)},
keywords = {itegpub},
month = {07},
note = {ISBN 978-960-89282-6-8},
pages = {39--43},
title = {Semantic Analysis of Tag Similarity Measures in Collaborative Tagging Systems},
year = 2008
}
%0 Conference Paper
%1 cattuto2008semantic
%A Cattuto, Ciro
%A Benz, Dominik
%A Hotho, Andreas
%A Stumme, Gerd
%B Proceedings of the 3rd Workshop on Ontology Learning and Population (OLP3)
%C Patras, Greece
%D 2008
%P 39--43
%T Semantic Analysis of Tag Similarity Measures in Collaborative Tagging Systems
%U https://www.kde.cs.uni-kassel.de/pub/pdf/cattuto2008semantic.pdf
%X Social bookmarking systems allow users to organise collections of resources on the Web in a collaborative fashion. The increasing popularity of these systems as well as first insights into their emergent semantics have made them relevant to disciplines like knowledge extraction and ontology learning. The problem of devising methods to measure the semantic relatedness between tags and characterizing it semantically is still largely open. Here we analyze three measures of tag relatedness: tag co-occurrence, cosine similarity of co-occurrence distributions, and FolkRank, an adaptation of the PageRank algorithm to folksonomies. Each measure is computed on tags from a large-scale dataset crawled from the social bookmarking system del.icio.us. To provide a semantic grounding of our findings, a connection to WordNet (a semantic lexicon for the English language) is established by mapping tags into synonym sets of WordNet, and applying there well-known metrics of semantic similarity. Our results clearly expose different characteristics of the selected measures of relatedness, making them applicable to different subtasks of knowledge extraction such as synonym detection or discovery of concept hierarchies.
%@ 978-960-89282-6-8

Several learning tasks comprise hierarchies. Comparison with a "gold standard" is often performed to evaluate the quality of a learned hierarchy. We assembled various similarity metrics that have been proposed in different disciplines and compared them in a unified interdisciplinary framework for hierarchical evaluation which is based on the distinction of three fundamental dimensions. Identifying deficiencies for measuring structural similarity, we suggest three new measures for this purpose, either extending existing ones or based on new ideas. Experiments with an artificial dataset were performed to compare the different measures. As shown by our results, the measures vary greatly in their properties.
@inproceedings{bade2008evaluation,
abstract = {Several learning tasks comprise hierarchies. Comparison with a "gold standard" is often performed to evaluate the quality of a learned hierarchy. We assembled various similarity metrics that have been proposed in different disciplines and compared them in a unified interdisciplinary framework for hierarchical evaluation which is based on the distinction of three fundamental dimensions. Identifying deficiencies for measuring structural similarity, we suggest three new measures for this purpose, either extending existing ones or based on new ideas. Experiments with an artificial dataset were performed to compare the different measures. As shown by our results, the measures vary greatly in their properties.},
address = {Berlin-Heidelberg},
author = {Bade, Korinna and Benz, Dominik},
booktitle = {Proceedings of the 32nd Annual Conference of the German Classification Society - Advances in Data Analysis, Data Handling and Business Intelligence (GfKl 2008)},
keywords = {itegpub},
note = {in press},
publisher = {Springer},
series = {Studies in Classification, Data Analysis, and Knowledge Organization},
title = {Evaluation Strategies for Learning Algorithms of Hierarchical Structures},
year = 2008
}
%0 Conference Paper
%1 bade2008evaluation
%A Bade, Korinna
%A Benz, Dominik
%B Proceedings of the 32nd Annual Conference of the German Classification Society - Advances in Data Analysis, Data Handling and Business Intelligence (GfKl 2008)
%C Berlin-Heidelberg
%D 2008
%I Springer
%T Evaluation Strategies for Learning Algorithms of Hierarchical Structures
%U https://www.kde.cs.uni-kassel.de/pub/pdf/bade2008evaluation.pdf
%X Several learning tasks comprise hierarchies. Comparison with a "gold standard" is often performed to evaluate the quality of a learned hierarchy. We assembled various similarity metrics that have been proposed in different disciplines and compared them in a unified interdisciplinary framework for hierarchical evaluation which is based on the distinction of three fundamental dimensions. Identifying deficiencies for measuring structural similarity, we suggest three new measures for this purpose, either extending existing ones or based on new ideas. Experiments with an artificial dataset were performed to compare the different measures. As shown by our results, the measures vary greatly in their properties.

Collaborative tagging systems have nowadays become important data sources for populating semantic web applications. For tasks like synonym detection and discovery of concept hierarchies, many researchers introduced measures of tag similarity. Even though most of these measures appear very natural, their design often seems to be rather ad hoc, and the underlying assumptions on the notion of similarity are not made explicit. A more systematic characterization and validation of tag similarity in terms of formal representations of knowledge is still lacking. Here we address this issue and analyze several measures of tag similarity: Each measure is computed on data from the social bookmarking system del.icio.us and a semantic grounding is provided by mapping pairs of similar tags in the folksonomy to pairs of synsets in Wordnet, where we use validated measures of semantic distance to characterize the semantic relation between the mapped tags. This exposes important features of the investigated similarity measures and indicates which ones are better suited in the context of a given semantic application.
@inproceedings{cattuto2008semantica,
abstract = {Collaborative tagging systems have nowadays become important data sources for populating semantic web applications. For tasks like synonym detection and discovery of concept hierarchies, many researchers introduced measures of tag similarity. Even though most of these measures appear very natural, their design often seems to be rather ad hoc, and the underlying assumptions on the notion of similarity are not made explicit. A more systematic characterization and validation of tag similarity in terms of formal representations of knowledge is still lacking. Here we address this issue and analyze several measures of tag similarity: Each measure is computed on data from the social bookmarking system del.icio.us and a semantic grounding is provided by mapping pairs of similar tags in the folksonomy to pairs of synsets in Wordnet, where we use validated measures of semantic distance to characterize the semantic relation between the mapped tags. This exposes important features of the investigated similarity measures and indicates which ones are better suited in the context of a given semantic application.},
address = {Heidelberg},
author = {Cattuto, Ciro and Benz, Dominik and Hotho, Andreas and Stumme, Gerd},
booktitle = {The Semantic Web -- ISWC 2008, Proc. Intl. Semantic Web Conference 2008},
editor = {Sheth, Amit P. and Staab, Steffen and Dean, Mike and Paolucci, Massimo and Maynard, Diana and Finin, Timothy W. and Thirunarayan, Krishnaprasad},
keywords = {methods_concepthierarchy},
pages = {615--631},
publisher = {Springer},
series = {LNAI},
title = {Semantic Grounding of Tag Relatedness in Social Bookmarking Systems},
volume = 5318,
year = 2008
}
%0 Conference Paper
%1 cattuto2008semantica
%A Cattuto, Ciro
%A Benz, Dominik
%A Hotho, Andreas
%A Stumme, Gerd
%B The Semantic Web -- ISWC 2008, Proc. Intl. Semantic Web Conference 2008
%C Heidelberg
%D 2008
%E Sheth, Amit P.
%E Staab, Steffen
%E Dean, Mike
%E Paolucci, Massimo
%E Maynard, Diana
%E Finin, Timothy W.
%E Thirunarayan, Krishnaprasad
%I Springer
%P 615--631
%R http://dx.doi.org/10.1007/978-3-540-88564-1_39
%T Semantic Grounding of Tag Relatedness in Social Bookmarking Systems
%U https://www.kde.cs.uni-kassel.de/pub/pdf/cattuto2008semantica.pdf
%V 5318
%X Collaborative tagging systems have nowadays become important data sources for populating semantic web applications. For tasks like synonym detection and discovery of concept hierarchies, many researchers introduced measures of tag similarity. Even though most of these measures appear very natural, their design often seems to be rather ad hoc, and the underlying assumptions on the notion of similarity are not made explicit. A more systematic characterization and validation of tag similarity in terms of formal representations of knowledge is still lacking. Here we address this issue and analyze several measures of tag similarity: Each measure is computed on data from the social bookmarking system del.icio.us and a semantic grounding is provided by mapping pairs of similar tags in the folksonomy to pairs of synsets in WordNet, where we use validated measures of semantic distance to characterize the semantic relation between the mapped tags. This exposes important features of the investigated similarity measures and indicates which ones are better suited in the context of a given semantic application.

2007


The emergence of collaborative tagging systems with their underlying flat and uncontrolled resource organization paradigm has led to a large number of research activities focussing on a formal description and analysis of the resulting “folksonomies”. An interesting outcome is that the characteristic qualities of these systems seem to be inverse to more traditional knowledge structuring approaches like taxonomies or ontologies: The latter provide rich and precise semantics, but suffer - amongst others - from a knowledge acquisition bottleneck. An important step towards exploiting the possible synergies by bridging the gap between both paradigms is the automatic extraction of relations between tags in a folksonomy. This position paper presents preliminary results of ongoing work to induce hierarchical relationships among tags by analyzing the aggregated data of collaborative tagging systems as a basis for an ontology learning procedure.
@inproceedings{benz2007position,
abstract = {The emergence of collaborative tagging systems with their underlying flat and uncontrolled resource organization paradigm has led to a large number of research activities focussing on a formal description and analysis of the resulting “folksonomies”. An interesting outcome is that the characteristic qualities of these systems seem to be inverse to more traditional knowledge structuring approaches like taxonomies or ontologies: The latter provide rich and precise semantics, but suffer - amongst others - from a knowledge acquisition bottleneck. An important step towards exploiting the possible synergies by bridging the gap between both paradigms is the automatic extraction of relations between tags in a folksonomy. This position paper presents preliminary results of ongoing work to induce hierarchical relationships among tags by analyzing the aggregated data of collaborative tagging systems as a basis for an ontology learning procedure.},
author = {Benz, Dominik and Hotho, Andreas},
booktitle = {Workshop Proceedings of Lernen - Wissensentdeckung - Adaptivität (LWA 2007)},
editor = {Hinneburg, Alexander},
keywords = {itegpub},
month = {09},
note = {http://lwa07.informatik.uni-halle.de/kdml07/kdml07.htm},
pages = {109--112},
publisher = {Martin-Luther-Universität Halle-Wittenberg},
title = {Position Paper: Ontology Learning from Folksonomies},
year = 2007
}
%0 Conference Paper
%1 benz2007position
%A Benz, Dominik
%A Hotho, Andreas
%B Workshop Proceedings of Lernen - Wissensentdeckung - Adaptivität (LWA 2007)
%D 2007
%E Hinneburg, Alexander
%I Martin-Luther-Universität Halle-Wittenberg
%P 109--112
%T Position Paper: Ontology Learning from Folksonomies
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2007position.pdf
%X The emergence of collaborative tagging systems with their underlying flat and uncontrolled resource organization paradigm has led to a large number of research activities focussing on a formal description and analysis of the resulting “folksonomies”. An interesting outcome is that the characteristic qualities of these systems seem to be inverse to more traditional knowledge structuring approaches like taxonomies or ontologies: The latter provide rich and precise semantics, but suffer - amongst others - from a knowledge acquisition bottleneck. An important step towards exploiting the possible synergies by bridging the gap between both paradigms is the automatic extraction of relations between tags in a folksonomy. This position paper presents preliminary results of ongoing work to induce hierarchical relationships among tags by analyzing the aggregated data of collaborative tagging systems as a basis for an ontology learning procedure.
%@ 978-3-86010-907-6

@inproceedings{eckert2007interactive,
author = {Eckert, Kai and Stuckenschmidt, Heiner and Pfeffer, Magnus},
booktitle = {{Proceedings of The Fourth International Conference on Knowledge Capture (K-CAP 2007), Whistler, Canada}},
keywords = {myown},
title = {{Interactive Thesaurus Assessment for Automatic Document Annotation}},
year = 2007
}
%0 Conference Paper
%1 eckert2007interactive
%A Eckert, Kai
%A Stuckenschmidt, Heiner
%A Pfeffer, Magnus
%B Proceedings of The Fourth International Conference on Knowledge Capture (K-CAP 2007), Whistler, Canada
%D 2007
%T Interactive Thesaurus Assessment for Automatic Document Annotation

Bookmarks (or favorites, hotlists) are popular strategies to relocate interesting websites on the WWW by creating a personalized URL repository. Most current browsers offer a facility to locally store and manage bookmarks in a hierarchy of folders; though, with growing size, users reportedly have trouble to create and maintain a stable organization structure. This paper presents a novel collaborative approach to ease bookmark management, especially the “classification” of new bookmarks into a folder. We propose a methodology to realize the collaborative classification idea of considering how similar users have classified a bookmark. A combination of nearest-neighbor-classifiers is used to derive a recommendation from similar users on where to store a new bookmark. A prototype system called CariBo has been implemented as a plugin for the central bookmark server software SiteBar. All findings have been evaluated on a reasonably large scale, real user dataset with promising results, and possible implications for shared and social bookmarking systems are discussed.
@article{benz2007supporting,
abstract = {Bookmarks (or favorites, hotlists) are popular strategies to relocate interesting websites on the WWW by creating a personalized URL repository. Most current browsers offer a facility to locally store and manage bookmarks in a hierarchy of folders; though, with growing size, users reportedly have trouble to create and maintain a stable organization structure. This paper presents a novel collaborative approach to ease bookmark management, especially the “classification” of new bookmarks into a folder. We propose a methodology to realize the collaborative classification idea of considering how similar users have classified a bookmark. A combination of nearest-neighbor-classifiers is used to derive a recommendation from similar users on where to store a new bookmark. A prototype system called CariBo has been implemented as a plugin for the central bookmark server software SiteBar. All findings have been evaluated on a reasonably large scale, real user dataset with promising results, and possible implications for shared and social bookmarking systems are discussed.},
author = {Benz, Dominik and Tso, Karen H. L. and Schmidt-Thieme, Lars},
journal = {Special Issue of the Computer Networks journal on Innovations in Web Communications Infrastructure},
keywords = {itegpub},
number = 16,
pages = {4574--4585},
title = {Supporting Collaborative Hierarchical Classification: Bookmarks as an Example},
volume = 51,
year = 2007
}
%0 Journal Article
%1 benz2007supporting
%A Benz, Dominik
%A Tso, Karen H. L.
%A Schmidt-Thieme, Lars
%D 2007
%J Special Issue of the Computer Networks journal on Innovations in Web Communications Infrastructure
%N 16
%P 4574--4585
%R http://dx.doi.org/10.1016/j.comnet.2007.06.014
%T Supporting Collaborative Hierarchical Classification: Bookmarks as an Example
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2007supporting.pdf
%V 51
%X Bookmarks (or favorites, hotlists) are popular strategies to relocate interesting websites on the WWW by creating a personalized URL repository. Most current browsers offer a facility to locally store and manage bookmarks in a hierarchy of folders; though, with growing size, users reportedly have trouble to create and maintain a stable organization structure. This paper presents a novel collaborative approach to ease bookmark management, especially the “classification” of new bookmarks into a folder. We propose a methodology to realize the collaborative classification idea of considering how similar users have classified a bookmark. A combination of nearest-neighbor-classifiers is used to derive a recommendation from similar users on where to store a new bookmark. A prototype system called CariBo has been implemented as a plugin for the central bookmark server software SiteBar. All findings have been evaluated on a reasonably large scale, real user dataset with promising results, and possible implications for shared and social bookmarking systems are discussed.

2006


Bookmarks (or Favorites, Hotlists) are a popular strategy to relocate interesting websites on the WWW by creating a personalized local URL repository. Most current browsers offer a facility to store and manage bookmarks in a hierarchy of folders; though, with growing size, users reportedly have trouble to create and maintain a stable taxonomy. This paper presents a novel collaborative approach to ease bookmark management, especially the “classification” of new bookmarks into a folder. We propose a methodology to realize the collaborative classification idea of considering how similar users have classified a bookmark. A combination of nearest-neighbour-classifiers is used to derive a recommendation from similar users on where to store a new bookmark. Additionally, a procedure to generate keyword recommendations is proposed to ease the annotation of new bookmarks. A prototype system called CariBo has been implemented as a plugin of the central bookmark server software SiteBar. A case study conducted with real user data supports the validity of the approach.
@inproceedings{benz2006automatic,
abstract = {Bookmarks (or Favorites, Hotlists) are a popular strategy to relocate interesting websites on the WWW by creating a personalized local URL repository. Most current browsers offer a facility to store and manage bookmarks in a hierarchy of folders; though, with growing size, users reportedly have trouble to create and maintain a stable taxonomy. This paper presents a novel collaborative approach to ease bookmark management, especially the “classification” of new bookmarks into a folder. We propose a methodology to realize the collaborative classification idea of considering how similar users have classified a bookmark. A combination of nearest-neighbour-classifiers is used to derive a recommendation from similar users on where to store a new bookmark. Additionally, a procedure to generate keyword recommendations is proposed to ease the annotation of new bookmarks. A prototype system called CariBo has been implemented as a plugin of the central bookmark server software SiteBar. A case study conducted with real user data supports the validity of the approach.},
address = {Edinburgh, Scotland},
author = {Benz, Dominik and Tso, Karen H. L. and Schmidt-Thieme, Lars},
booktitle = {Proceedings of the 2nd Workshop in Innovations in Web Infrastructure (IWI2) at WWW2006},
keywords = {itegpub},
month = {05},
isbn = {085432853X},
title = {Automatic Bookmark Classification - A Collaborative Approach},
year = 2006
}
%0 Conference Paper
%1 benz2006automatic
%A Benz, Dominik
%A Tso, Karen H. L.
%A Schmidt-Thieme, Lars
%B Proceedings of the 2nd Workshop in Innovations in Web Infrastructure (IWI2) at WWW2006
%C Edinburgh, Scotland
%D 2006
%T Automatic Bookmark Classification - A Collaborative Approach
%U https://www.kde.cs.uni-kassel.de/pub/pdf/benz2006automatic.pdf
%X Bookmarks (or Favorites, Hotlists) are a popular strategy to relocate interesting websites on the WWW by creating a personalized local URL repository. Most current browsers offer a facility to store and manage bookmarks in a hierarchy of folders; though, with growing size, users reportedly have trouble to create and maintain a stable taxonomy. This paper presents a novel collaborative approach to ease bookmark management, especially the “classification” of new bookmarks into a folder. We propose a methodology to realize the collaborative classification idea of considering how similar users have classified a bookmark. A combination of nearest-neighbour-classifiers is used to derive a recommendation from similar users on where to store a new bookmark. Additionally, a procedure to generate keyword recommendations is proposed to ease the annotation of new bookmarks. A prototype system called CariBo has been implemented as a plugin of the central bookmark server software SiteBar. A case study conducted with real user data supports the validity of the approach.