## What is ABSTAT about?

ABSTAT is a framework that provides a better understanding of big and complex linked data sets.

A user who wants to know which data set best fits their needs should be able to answer questions such as: What types of resources are described in each data set? What properties are used to describe the resources? What types of resources are linked, and by means of what properties? How many resources have a certain type, and how frequently is a given property used?

Linked data sets are numerous and large, and the ontologies they use to describe the semantics of their data can be large too. Answering the questions above by only looking at the ontologies or by running exploratory queries can therefore be too hard. Exploring a data set with ABSTAT is more efficient.

ABSTAT produces a summary of a linked data set that is correct and complete with respect to the assertions of the data set, and whose size scales well with respect to the size of the ontology and of the data set.

The key feature of a summary is the use of minimal type patterns to represent an abstraction of the data set. A minimal type pattern is a triple (C, P, D) that represents the occurrence of assertions `<a,P,b>` in the RDF data, such that C is a minimal type of the subject a and D is a minimal type of the object b according to a terminology graph, which is introduced to represent the data ontology. By considering patterns that are based on minimal types we are able to exclude several redundant patterns from the summary. As a consequence, summaries based on our model are rich enough to represent the whole data set adequately, and small enough to avoid redundant information.

Finally, a summary also includes statistics about the occurrence of minimal type patterns and types in the data.
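The idea of minimal type patterns and their occurrence counts can be illustrated with a minimal Python sketch. It is not ABSTAT's implementation: the terminology graph `SUBTYPE_OF`, the sample data, and the helper names `ancestors`, `minimal_types`, and `summarize` are all hypothetical, and the sketch assumes that every resource has at least one asserted type (untyped resources and literals are simply skipped).

```python
from collections import Counter

# Hypothetical terminology graph: each type maps to its direct supertypes.
SUBTYPE_OF = {
    "Footballer": {"Athlete"},
    "Athlete": {"Person"},
    "Person": set(),
    "City": {"Place"},
    "Place": set(),
}

# Hypothetical typing assertions: resource -> asserted types.
TYPES = {
    "messi": {"Footballer", "Athlete", "Person"},
    "maradona": {"Footballer", "Person"},
    "rosario": {"City", "Place"},
    "lanus": {"City"},
}

# Hypothetical relational assertions <a, P, b>.
TRIPLES = [
    ("messi", "birthPlace", "rosario"),
    ("maradona", "birthPlace", "lanus"),
]


def ancestors(t):
    """Return all strict supertypes of t reachable in the terminology graph."""
    seen = set()
    stack = list(SUBTYPE_OF.get(t, ()))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(SUBTYPE_OF.get(s, ()))
    return seen


def minimal_types(types):
    """Keep only the types that have no strict subtype among `types`."""
    return {
        t for t in types
        if not any(t in ancestors(other) for other in types if other != t)
    }


def summarize(type_assertions, relational_assertions):
    """Count occurrences of each minimal type pattern (C, P, D)."""
    patterns = Counter()
    for a, p, b in relational_assertions:
        # Resources without asserted types yield no pattern in this sketch.
        for c in minimal_types(type_assertions.get(a, set())):
            for d in minimal_types(type_assertions.get(b, set())):
                patterns[(c, p, d)] += 1
    return patterns


print(summarize(TYPES, TRIPLES))
# → Counter({('Footballer', 'birthPlace', 'City'): 2})
```

In this toy data, patterns such as (Person, birthPlace, Place) are never emitted, because they can be inferred from (Footballer, birthPlace, City) together with the terminology graph. This is the kind of redundancy that minimal types remove from the summary.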

## Publications

* *ABSTAT-HD: a Scalable Tool for Profiling Very Large Knowledge Graphs.* Renzo Alva Principe, Andrea Maurino, Matteo Palmonari, Michele Ciavotta, Blerina Spahiu - The VLDB Journal 2021

* *ABSTAT 1.0: Compute, Manage and Share Semantic Profiles of RDF Knowledge Graphs.* Renzo Alva Principe, Blerina Spahiu, Matteo Palmonari, Anisa Rula, Flavio De Paoli, Andrea Maurino - ESWC (Satellite Events) 2018: 170-175

* *Ontology-based Linked Data Summarization in Semantics-aware Recommender Systems.* Vito Walter Anelli, Tommaso Di Noia, Andrea Maurino, Matteo Palmonari, Anisa Rula - SEBD 2018

* **(BEST PAPER AWARD)** *Towards Improving the Quality of Knowledge Graphs with Data-driven Ontology Patterns and SHACL.* Blerina Spahiu, Andrea Maurino, Matteo Palmonari - ISWC (Best Workshop Papers) 2018: 103-117

* *Schema-aware Feature Selection in Linked Data-based Recommender Systems.* Corrado Magarelli, Azzurra Ragone, Paolo Tomeo, Tommaso Di Noia, Matteo Palmonari, Andrea Maurino, Eugenio Di Sciascio - IIR 2017: 67-71

* *Schema-summarization in linked-data-based feature selection for recommender systems.* Azzurra Ragone, Paolo Tomeo, Corrado Magarelli, Tommaso Di Noia, Matteo Palmonari, Andrea Maurino, Eugenio Di Sciascio - SAC 2017: 330-335

* **(BEST PAPER AWARD)** *ABSTAT: Ontology-Driven Linked Data Summaries with Pattern Minimalization.* Blerina Spahiu, Riccardo Porrini, Matteo Palmonari, Anisa Rula, Andrea Maurino - ESWC (Satellite Events) 2016: 381-395

* *ABSTAT: Linked Data Summaries with ABstraction and STATistics.* Matteo Palmonari, Anisa Rula, Riccardo Porrini, Andrea Maurino, Blerina Spahiu, Vincenzo Ferme - ESWC (Satellite Events) 2015: 128-132