Me
Simone Paolo Ponzetto




I joined the Data and Web Science Group in February 2013 as a Juniorprofessor and, since February 2016, hold the Chair of Information Systems III (Enterprise Data Analysis) as a full professor (W3) at the University of Mannheim, where I lead the Natural Language Processing and Information Retrieval group.







News
Computational Linguistics paper accepted

We have a paper on Watset - a meta-algorithm for fuzzy graph clustering that can be applied to a wide range of computational semantics tasks on graphs of linguistic data - accepted for publication in the Computational Linguistics (CL) journal, the most prestigious journal in the field of NLP.

You can find a pre-print here on arXiv.

The paper is part of a collaboration with our friends and colleagues at the University of Hamburg in the context of the JOIN-T project.
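For readers curious about the general idea, below is a minimal, self-contained sketch of sense-based fuzzy graph clustering in the spirit of Watset. It is not the reference implementation: connected components stand in for the pluggable hard clustering algorithms used in the paper, and the toy graph, function name, and library choice (networkx) are illustrative assumptions only.

```python
# A toy illustration of sense-based fuzzy graph clustering in the spirit of
# Watset (NOT the reference implementation): nodes are first split into
# "senses" by hard-clustering their ego-networks, the resulting sense graph
# is hard-clustered, and the clusters are mapped back to (possibly overlapping)
# groups of original nodes. Connected components stand in for the pluggable
# hard clustering algorithms described in the paper.
import networkx as nx

def toy_fuzzy_clustering(graph: nx.Graph):
    # Local step: induce senses by clustering each node's neighbourhood.
    senses = {}  # (node, sense_id) -> set of neighbours forming that sense
    for node in graph:
        ego = graph.subgraph(graph[node])  # neighbourhood without the node itself
        for i, component in enumerate(nx.connected_components(ego)):
            senses[(node, i)] = set(component)

    # Build the disambiguated sense graph: connect two sense nodes if each
    # word occurs in the other's sense context.
    sense_graph = nx.Graph()
    sense_graph.add_nodes_from(senses)
    for (word, i), context in senses.items():
        for (other, j), other_context in senses.items():
            if word != other and other in context and word in other_context:
                sense_graph.add_edge((word, i), (other, j))

    # Global step: hard-cluster the sense graph; an original node may appear
    # in several clusters, which makes the final clustering fuzzy.
    return [{word for word, _ in component}
            for component in nx.connected_components(sense_graph)]

if __name__ == "__main__":
    g = nx.Graph()
    g.add_edges_from([("bank", "money"), ("bank", "finance"), ("money", "finance"),
                      ("bank", "river"), ("bank", "shore"), ("river", "shore")])
    print(toy_fuzzy_clustering(g))
    # "bank" ends up in two clusters: {bank, money, finance} and {bank, river, shore}
```

In this toy example the ambiguous node "bank" receives one sense per neighbourhood component, so it is assigned to both the financial and the riverside cluster.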
DFG funds second phase of the JOIN-T project

The Deutsche Forschungsgemeinschaft (DFG) accepted our proposal to extend a joint research project on hybrid semantic representations together with our friends and colleagues at the Language Technology Group of the University of Hamburg.

The project, titled "Joining graph- and vector-based sense representations for semantic end-user information access" (JOIN-T 2), builds upon our JOIN-T project (also funded by the DFG) and aims at taking it one step further. Our vision for the next three years is to explore ways to produce semantic representations that combine the interpretability of manually crafted resources and sparse representations with the accuracy and high coverage of dense neural embeddings.

Stay tuned for forthcoming research papers and resources!
Paper accepted at ACL 2018

Together with our friends and colleagues at the Universities of Hamburg and Oslo, we have a paper on "Unsupervised Semantic Frame Induction using Triclustering" accepted at the forthcoming 56th Annual Meeting of the Association for Computational Linguistics (ACL), the top conference in the field of Natural Language Processing.

You can find the paper as usual on the ACL Anthology here.

Best Paper Award at JCDL 2018

Our paper on "Entity-Aspect Linking: Providing Fine-Grained Semantics of Entities in Context" won the Best Paper Award at the 2018 edition of the Joint Conference on Digital Libraries (JCDL), the top conference in the field of digital libraries. This work is part of a collaboration between our group and Prof. Laura Dietz at the University of New Hampshire in the context of an Elite Post-Doc grant of the Baden-Württemberg Stiftung recently awarded to Laura. The paper is available here.

Paper accepted at IJCAI 2018

Together with colleagues at the Sapienza University of Rome, we have a paper accepted at the 27th International Joint Conference on Artificial Intelligence (IJCAI), the top conference in the field of Artificial Intelligence.

Stefano Faralli, Irene Finocchi, Simone Paolo Ponzetto and Paola Velardi: Efficient Pruning of Large Knowledge Graphs.

Paper accepted at SIGIR 2018

Together with Ivan Vulic of the University of Cambridge, we have a paper on "Unsupervised Cross-Lingual Information Retrieval using Monolingual Data Only" accepted at the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), the top conference in the field of Information Retrieval.

Paper accepted at JCDL 2018

We have a paper on "Entity-Aspect Linking: Providing Fine-Grained Semantics of Entities in Context" that has been accepted at the 2018 edition of the Joint Conference on Digital Libraries (JCDL), the top conference in the field of digital libraries.

This work is part of a collaboration between our group and Prof. Laura Dietz at the University of New Hampshire in the context of an Elite Post-Doc grant of the Baden-Württemberg Stiftung recently awarded to Laura.

DepCC corpus released

Together with our colleagues at the Language Technology Group of the University of Hamburg, we released a new web-scale dependency-parsed corpus based on CommonCrawl. DepCC is a large linguistically analyzed corpus of English comprising 365 million documents, composed of 252 billion tokens and 7.5 billion named entity occurrences in 14.3 billion sentences from a web-scale crawl.

You can find the corpus here. A description is available in this paper.
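As a rough idea of how such a corpus can be consumed, here is a minimal sketch that streams sentences from a CoNLL-style tab-separated file. The file name and the assumption of a CoNLL-2006-like column layout (with the surface form in the second column) are illustrative only; please refer to the corpus documentation and the paper for the actual format.

```python
# A minimal sketch for iterating over a dependency-parsed corpus distributed
# in a CoNLL-style tab-separated format. The shard name and the assumed
# column layout are illustrative assumptions, not the official DepCC schema.
import gzip
from typing import Iterator, List

def read_conll_sentences(path: str) -> Iterator[List[List[str]]]:
    """Yield sentences as lists of token rows (one list of columns per token)."""
    opener = gzip.open if path.endswith(".gz") else open
    sentence: List[List[str]] = []
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith("#"):        # metadata/comment lines
                continue
            if not line:                    # a blank line ends a sentence
                if sentence:
                    yield sentence
                    sentence = []
                continue
            sentence.append(line.split("\t"))
        if sentence:                        # last sentence without a trailing blank line
            yield sentence

if __name__ == "__main__":
    # Hypothetical shard name; replace with an actual corpus file.
    for tokens in read_conll_sentences("depcc-part-00000.conll.gz"):
        forms = [columns[1] for columns in tokens]  # assumes FORM in the second column
        print(" ".join(forms))
        break
```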

Older news can be found here.