Stefano Faralli, University of Mannheim, Germany

Irene Finocchi, Sapienza, University of Rome, Italy

Simone Paolo Ponzetto, University of Mannheim, Germany

Paola Velardi, Sapienza, University of Rome, Italy

We present CrumbTrail, an algorithm to clean large and dense knowledge graphs. CrumbTrail removes cycles, out-of-domain nodes and non-essential nodes, i.e., those that can be safely removed without breaking the knowledge graph’s connectivity. It achieves this through a bottom-up topological pruning on the basis of a set of input concepts that, for instance, a user can select in order to identify a domain of interest. Our technique can be applied to both noisy hypernymy graphs – typically generated by ontology learning algorithms as intermediate representations – as well as crowdsourced resources like Wikipedia, in order to obtain clean, domainfocused concept hierarchies. CrumbTrail overcomes the time and space complexity limitations of current state-of-art algorithms. | |

| |

We provide 6 examples of the application of CrumbTrail to toy noisy graphs: | |

example 1, example 2, example 3, example 4, example 5, example 6. | |

Once you open an example, click on a row of the table (on the right side of the screen) to visualize the corresponding steps of the algorithm. | |

| |

NetBeans Java application project |

## - License | ||

All the datasets and software are licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License: https://creativecommons.org/licenses/by/4.0/. |

DFG project JOIN-T