Me
Simone Paolo Ponzetto




I joined the Data and Web Science Group in February 2013 as Juniorprofessor. Since February 2016 I have held the Chair of Information Systems III (Enterprise Data Analysis) as full professor (W3) at the University of Mannheim, where I lead the Natural Language Processing and Information Retrieval group.







News
Paper accepted at IJCAI 2018

Together with colleagues from the Sapienza University of Rome, we have a paper accepted at the 27th International Joint Conference on Artificial Intelligence (IJCAI), the top conference in the field of Artificial Intelligence.

Stefano Faralli, Irene Finocchi, Simone Paolo Ponzetto and Paola Velardi: Efficient Pruning of Large Knowledge Graphs.

Paper accepted at SIGIR 2018

Together with Ivan Vulic of the University of Cambridge, we have a paper on "Unsupervised Cross-Lingual Information Retrieval using Monolingual Data Only" that has been accepted at the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), the top conference in the field of Information Retrieval.

Paper accepted at JCDL 2018

We have a paper on "Entity-Aspect Linking: Providing Fine-Grained Semantics of Entities in Context" that has been accepted at the 2018 edition of the Joint Conference on Digital Libraries (JCDL), the top conference in the field of digital libraries.

This work is part of a collaboration between our group and Prof. Laura Dietz at the University of New Hampshire in the context of an Elite Post-Doc grant of the Baden-Württemberg Stiftung recently awarded to Laura.

DepCC corpus released

Together with our colleagues from the Language Technology Group of the University of Hamburg, we released a new web-scale dependency-parsed corpus based on CommonCrawl. DepCC is a large, linguistically analyzed English corpus of 365 million documents, comprising 252 billion tokens and 7.5 billion named entity occurrences in 14.3 billion sentences from a web-scale crawl.

You can find the corpus here. A description is available in this paper.

Papers accepted at LREC 2018

We have a few papers accepted at the 11th edition of the Language Resources and Evaluation Conference (LREC).

Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto and Chris Biemann. Building a Web-Scale Dependency-Parsed Corpus from CommonCrawl.

Alexander Panchenko, Dmitry Ustalov, Stefano Faralli, Simone Paolo Ponzetto and Chris Biemann. Improving Hypernymy Extraction with Distributional Semantic Classes.

Stefano Faralli, Els Lefever and Simone Paolo Ponzetto. MIsA: Multilingual "IsA" Extraction from Corpora.

Stefano Faralli, Alexander Panchenko, Chris Biemann and Simone Paolo Ponzetto. LOaDing: Adding Distributional-Semantics Features to Framester.

Sanja Štajner, Marc Franco-Salvador, Paolo Rosso and Simone Paolo Ponzetto. CATS: A Tool for Customised Alignment of Text Simplification Corpora.

JOIN-T journal paper accepted!

We have a new journal paper summarizing the findings of the first part of our DFG JOIN-T (Joining Ontologies and semantics INduced from Text) project with our colleagues from the Language Technology Group of the University of Hamburg.

Chris Biemann, Stefano Faralli, Alexander Panchenko and Simone Paolo Ponzetto: A framework for enriching lexical semantic resources with distributional semantics. To appear in the Journal of Natural Language Engineering. DOI: 10.1017/S135132491700047X. A pre-print version is available here.

You can find the project homepage here.

Some recent journal papers

We have a few papers accepted in several journals:

Goran Glavaš, Marc Franco-Salvador, Simone P. Ponzetto and Paolo Rosso. A resource-light method for cross-lingual semantic textual similarity. Knowledge-Based Systems, volume 143, pages 1-9. DOI: 10.1016/j.knosys.2017.11.041.

Federico Nanni, Laura Dietz and Simone Paolo Ponzetto. Toward a computational history of universities: Evaluating text mining methods for interdisciplinarity detection from PhD dissertation abstracts. To appear in Digital Scholarship in the Humanities. DOI: 10.1093/llc/fqx062 (available with a free-access article link here).

EMNLP 2017 Outstanding Paper

Our paper on Topic-Based Agreement and Disagreement in US Electoral Manifestos, coauthored with the DH Group at FBK Trento, was selected as an Outstanding Paper at EMNLP 2017, one of the premier conferences in the field of NLP.

The paper presents a topic-based analysis of agreement and disagreement in US political manifestos, which relies on a new method for topic detection based on key concept clustering. Data and software can be found here.

Artificial Intelligence Journal 2017 Prominent Paper Award

The paper on BabelNet that I co-authored back in 2012 with my former colleague Roberto Navigli won the 2017 Prominent Paper Award of the Artificial Intelligence journal, the most prestigious journal in the field of AI.

The award recognizes outstanding papers published not more than seven years ago in the AI Journal that are exceptional in their significance and impact. You can find the official announcement here.

Nomination for the Best Student Paper Award at JCDL 2017

Our paper on Building Entity-Centric Event Collections was nominated for the Best Student Paper Award at the 2017 edition of the Joint Conference on Digital Libraries (JCDL), the top conference in the field of digital libraries. This work is part of a collaboration between our group and Prof. Laura Dietz at the University of New Hampshire in the context of an Elite Post-Doc grant of the Baden-Württemberg Stiftung recently awarded to Laura. The paper is available here.

IJCAI-17 paper accepted

We have a paper accepted at the 26th International Joint Conference on Artificial Intelligence (IJCAI), the premier conference in the field of AI.

Sanja Štajner, Simone Paolo Ponzetto and Heiner Stuckenschmidt: Automatic Assessment of Absolute Sentence Complexity.

This work is part of an ongoing DFG-project in the context of our Collaborative Research Center (SFB) 884 on the "Political Economy of Reforms". You can find the paper here.

JOIN-T papers accepted!

A few papers centered on our DFG JOIN-T (Joining Ontologies and semantics INduced from Text) project with the Language Technology Group of the University of Hamburg have been accepted at major conferences in NLP and Semantic Web.


Stefano Faralli, Alexander Panchenko, Chris Biemann and Simone Paolo Ponzetto: The ContrastMedium Algorithm: Taxonomy Induction From Noisy Knowledge Graphs With Just A Few Links. To appear in the proceedings of EACL 2017.

Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto and Chris Biemann: Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation. To appear in the proceedings of EACL 2017.

Stefano Faralli, Alexander Panchenko, Chris Biemann and Simone Paolo Ponzetto: Linked Disambiguated Distributional Semantic Networks. In proceedings of ISWC 2016.

WebIsA Database released

We are releasing a very large database of hypernymy relations extracted from the Common Crawl. You can find it here. For details please have a look at the LREC paper describing it (cite it if you find our work useful).

Our WebIsA Database has so far been used as a component in TAXI, our top-performing SemEval TExEval-2 system, as well as in our system participating in the ESWC 2016 Open Knowledge Extraction Challenge, for which we received a nomination for best challenge paper!

JOIN-T team wins SemEval TExEval-2

A system developed as part of a collaboration between the NLP group of DWS and the Language Technology group of TU Darmstadt has been ranked first in an upcoming SemEval challenge on Taxonomy Extraction Evaluation (TExEval-2).

The results of the challenge can be found here. SemEval is the premier evaluation forum of the computational semantics community. This work is part of an ongoing DFG project (JOIN-T) collaboration between the two groups.

Honorable mention for the best paper award at ICSC 2017

Our paper on Domain Adaptation for Automatic Detection of Speculative Sentences received an honorable mention for the best paper award at the 11th IEEE International Conference on Semantic Computing. This work is part of an ongoing DFG-project in the context of our Collaborative Research Center (SFB) 884 on the "Political Economy of Reforms".

Older news can be found here.