Skip to content
Giacomo Bergami edited this page Jan 28, 2024 · 11 revisions

Generalised Graph Grammars

Each graph rule is semi-colon separated, where the last match shall not terminate with a semicolon. The grammar requires at least one graph rule (centralmatch)

Alt Text

Each graph grammar comes with a rule name (OTHERS) and is introduced by an equivalence sign (EQ). It requires providing a node match with a name, potential ingoing or outgoing edges (many_edges) and some joining edges not necessarily referring to the ego-net of the matching node (edge_joining). Data matching conditions on the nodes or the edges being considered are introduced by a where non-terminal. The rewriting is optional: if not specified, both the node and the descendant nodes are unchanged. Otherwise, we can selectively update the nodes in the graph being matched (rewrite_to). As the matching node might be replaced by a newly created node specified in the rewriting phase, this last one must also be specified explicitly at the end of the rewriting.

node match

This statement refers to a generic node match. When appearing at the beginning of a pattern match query, it refers to the entry-point of the graph query. Node declarations are bounded by rounded parentheses (L/RPAR), and might be associated to one or more variables ||-separated across the patterns (multiple_labels), so to determine the order of the graph grammar's rules application depending on the relationship between such nodes.

The following parts of this declaration are only expressed by the query language at this stage but are left to future work for implementation:

  • STAR semantics: expressing the optionality/multi-node match without necessarily nesting the morphism.
  • specifying an optional match referring to the node's label (COL (:) OTHERS). At these stage, these are only supported via subsequent data conditions (see WHERE), and not by pruning the retrieved edges within the morphism.

many_edges match

This matches many optional single edges arriving or departing from the query entry point. These edges can refer to an outgoing edge (#1), an ongoing edge (#2) or a hook (#3). When the node is morphism-aggregated, this refers to an edge connecting all the nodes being grouped by, but not necessarily forming a clique.

We now discuss the edgelabel not-terminal component: edges are introduced by squared brackets (Q/PPAR): the question mark remarks an edge being optionally matched (QM), while the ∀ symbol (FORALL) indicates that the target node of the current directed edge will be nested within a nested morphism. Edges can be optionally referred to by a variable name (OTHERS) that should always be followed by a colon (:). We can there express the possible options allowed for the edge's label via multiple_labels.

Last, we can specify free edges to connect the already-described nodes through multiple optional edge_joining edges within the same pattern.

Data conditions (WHERE)

The data conditions are expressed as a further refinement of the current graph pattern and are optionally introduced by a where token. We can then exploit straightforward equality or inequality conditions (=,≠,<,≤) between expressions identified in rewrite_expr, express a conjunction (∧) or disjunction (∨) against the previous boolean predicates, to use an advanced data predicate being expressed in Script v3.0 syntax in double-quotes. We can also further determine whether to keep an entire morphism from the current pattern as valid if one of the nodes or edges associated to a variable OTHER₁ do (not) appear as part of a morphism of the pattern OTHER₂ associated to a variable OTHER₃.

The data condition can be either a simple non double-quoted string (OTHER) or some specific data retrieved through the morphism (test_expr_side), which can be the following:

  • The OTHERS-th value (𝜉) associated to an object or variable rewrite_expr
  • The OTHERS-th label (ℓ) associated to an object or variable rewrite_expr
  • A key or property rewrite_expr₁ belonging to an object or variable rewrite_expr₂
  • A node-id accessible through an outgoing edge (or GSM containment) of rewrite_expr₂ with label (or containment name) rewrite_expr₁
  • The edge label for an edge associated with a variable rewrite_expr
  • The source (src) or an edge associated with a variable rewrite_expr
  • The target (dst) of an edge associated with a variable rewrite_expr
  • An if-then-else condition, where the alternative condition is optional
  • Yet again a boolean interpretation of a Script v3.0 expression within double quotes.

Graph Rewriting operations

The graph rewriting operations are optional and are introduced by a ↪ sign (REWRITE_TO). Please observe that matching a graph without necessarily restructuring the graph will, in this query language, return the same graph as initially loaded and will not return a set of subgraphs of the original graph; to obtain the latter, you also have to specify which edges or nodes have to be deleted (del). new will temporally instantiate a variable OTHERS holding a newly created object, while set will update the values in rewrite_expr₁ with the ones determined in rewrite_expr₂. If the current entry-point query has been removed and we want to replace such a node with a newly created one, we can then express our intent to return a specific node associated with a given variable with an optional final node statement.

Full Grammar

The full Syntax Diagram generated from this ANTLR4 grammar provides the full specification of this language.

Script v3.0

[TODO]

The complete Syntax Diagram generated from this ANTLR4 grammar provides the full specification of this language.

The most recent description of this scripting language acting as a MetaModel for GSM is described in our seminal paper;

G. Bergami, W. Zegadło. 2023. "Towards a Generalised Semistructured Data Model and Query Language" SIGWEB Newsl. 2023, Summer, Article 4 (Summer 2023), 22 pages.

Still, the current syntax has been improved to easily be parsed with an LL(1) grammar, thus allowing a more efficient code analysis at run time. Therefore, the specific syntax from that paper was used in this, another version of our project.

Clone this wiki locally