Disk graph calc

4/14/2023

Graphs are of growing importance in modeling complex structures such as chemical compounds, proteins, images, and program dependence. Given a query graph Q, the subgraph isomorphism problem is to find the set of graphs containing Q from a graph database, which is NP-complete. Recently, there have been a lot of research efforts to solve the subgraph isomorphism problem for a large graph database by utilizing graph indexes. By using a graph index as a filter, we prune graphs that are not real answers at an inexpensive cost. Then, we need to use expensive subgraph isomorphism tests to verify the filtered candidates only. This way, the number of disk I/Os and subgraph isomorphism tests can be significantly reduced.

The current practice for experiments in graph indexing techniques is that the author of a newly proposed technique does not implement existing indexes on their own code base, but instead uses the original authors' binary executables and reports only the wall clock time. However, we observe that this practice may result in several problems. In order to address these problems, we have made significant efforts in implementing all representative indexing methods on a common framework called iGraph. Unlike existing implementations, which either use (full or partial) in-memory representations or rely on the OS file system cache without guaranteeing real disk I/Os, we have implemented these indexes on top of a storage engine that guarantees real disk I/Os. Through extensive experiments using many synthetic and real datasets, we also provide new empirical findings on the performance of the full disk-based implementations of these methods.

Composing queries is evidently a tedious task. This is particularly true of graph queries as they are typically complex and prone to errors, compounded by the fact that graph schemas can be missing or too loose to be helpful for query formulation. Despite the great success of query formulation aids, in particular automatic query completion, graph query autocompletion has received much less research attention. In this paper, we propose a novel framework for subgraph query autocompletion (called AutoG). Given an initial query q and a user’s preference as input, AutoG returns ranked query suggestions \(Q'\) as output. Users may choose a query from \(Q'\) and iteratively apply AutoG to compose their queries. The novelties of AutoG are as follows: First, we formalize query composition. Second, we propose to increment a query with the logical units called c-prime features that are (i) frequent subgraphs and (ii) constructed from smaller c-prime features in no more than c ways. Third, we propose algorithms to rank candidate suggestions. Fourth, we propose a novel index called feature DAG (FDag) to optimize the ranking. We study the query suggestion quality with simulations and real users and conduct an extensive performance evaluation. The results show that the query suggestions are useful (saved roughly 40% of users’ mouse clicks), and AutoG returns suggestions quickly under a large variety of parameter settings.

We propose a new way of indexing a large database of graphs and processing exact subgraph matching (or subgraph isomorphism) and approximate (full) graph matching queries. We are particularly interested in small and medium-sized data graphs that arise in domains such as chemical informatics and proteomics. Rather than decomposing a graph into smaller units (e.g., paths, trees, graphs) for indexing purposes, we represent each graph in the database by its graph signature, which is essentially a multiset, and each signature is then indexed. During query processing, a query graph is also mapped into its signature. We show that exact subgraph matching and approximate (full) graph matching queries can be processed by performing operations such as intersection and union over the data and query graph signatures. We call this holistic query processing, where a query is processed as a whole without decomposing it into smaller units. To improve the precision of exact subgraph matching, we propose a new method based on the concept of line graphs. We study the benefits and tradeoffs of this method for a variety of datasets.
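The filter-then-verify idea and the multiset signatures described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not the method of iGraph or of the signature index: the signature here is simply the multiset of label pairs on each edge, the helper names (signature, may_contain, similarity, is_subgraph, search) are hypothetical, and verification is a brute-force search that is only practical for tiny graphs. The signatures are also kept in memory, so the sketch says nothing about disk I/O.

```python
from collections import Counter
from itertools import permutations


def signature(labels, edges):
    """Toy signature (assumption): multiset of sorted label pairs, one per edge."""
    return Counter(tuple(sorted((labels[u], labels[v]))) for u, v in edges)


def may_contain(data_sig, query_sig):
    """Filter step: the query multiset must be contained in the data multiset."""
    return (data_sig & query_sig) == query_sig


def similarity(sig_a, sig_b):
    """Multiset Jaccard score: size of intersection over size of union."""
    union = sum((sig_a | sig_b).values())
    return sum((sig_a & sig_b).values()) / union if union else 1.0


def is_subgraph(q_labels, q_edges, g_labels, g_edges):
    """Verify step: brute-force search for an injective, label-preserving
    mapping that sends every query edge to a data edge (exponential time)."""
    g_adj = {frozenset(e) for e in g_edges}
    q_nodes = list(q_labels)
    for image in permutations(g_labels, len(q_nodes)):
        m = dict(zip(q_nodes, image))
        labels_ok = all(q_labels[v] == g_labels[m[v]] for v in q_nodes)
        if labels_ok and all(frozenset((m[u], m[v])) in g_adj for u, v in q_edges):
            return True
    return False


def search(database, q_labels, q_edges):
    """Prune with signatures first, then run the expensive test on survivors."""
    q_sig = signature(q_labels, q_edges)
    candidates = [name for name, (labels, edges) in database.items()
                  if may_contain(signature(labels, edges), q_sig)]
    answers = [name for name in candidates
               if is_subgraph(q_labels, q_edges, *database[name])]
    return candidates, answers


# Made-up example data: three small labeled graphs and a C-C-O path query.
database = {
    "g1": ({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)]),          # path C-C-O
    "g2": ({1: "C", 2: "O", 3: "C", 4: "C"}, [(1, 2), (3, 4)]),  # C-O and C-C, disconnected
    "g3": ({1: "C", 2: "N"}, [(1, 2)]),                          # C-N
}
query_labels = {"a": "C", "b": "C", "c": "O"}
query_edges = [("a", "b"), ("b", "c")]

candidates, answers = search(database, query_labels, query_edges)
print(candidates, answers)
```

In this example the filter discards g3 without running any isomorphism test, while g2 passes the filter but fails verification. That is the usual tradeoff: the filter never drops a true answer, but it can let false positives through, and a more precise signature would shrink the candidate set at the cost of a larger index.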
