The GQL project draws on multiple sources or inputs, notably existing industrial languages and a new section of the SQL standard. In preparatory discussions within WG3 surveys of the history and comparative content of some of these inputs were presented. GQL is a declarative language with its own distinct syntax, playing a similar role to SQL in the building of a database application. Other graph query languages have been defined which offer direct procedural features such as branching and looping (Apache Tinkerpop's
Gremlin), and GSQL, The GQL project coordinates closely with the SQL/PGQ "project split" of (extension to) ISO 9075 SQL, and the technical working groups in the U.S. (INCITS DM32) and at the international level (SC32/WG3) have several expert contributors who work on both projects.
Cypher Cypher is a language originally designed by Andrés Taylor and colleagues at Neo4j Inc., and first implemented by that company in 2011. Since 2015 it has been made available as an open source language description with grammar tooling, a
JVM front-end that parses Cypher queries, and a Technology Compatibility Kit (TCK) of over 2000 test scenarios, using
Cucumber for implementation language portability. The TCK reflects the language description and an enhancement for temporal datatypes and functions documented in a Cypher Improvement Proposal. Cypher allows creation, reading, updating and deleting of graph elements, and is a language that can therefore be used for analytics engines and transactional databases.
Querying with visual path patterns Cypher uses compact fixed- and variable-length patterns which combine visual representations of node and relationship (edge) topologies, with label existence and property value predicates. (These patterns are usually referred to as "
ASCII art" patterns, and arose originally as a way of commenting programs which used a lower-level graph API.
PGQL PGQL is a language designed and implemented by Oracle Inc., but made available as an open source specification, along with JVM parsing software. PGQL combines familiar SQL SELECT syntax including SQL expressions and result ordering and aggregation with a pattern matching language very similar to that of Cypher. It allows the specification of the graph to be queried, and includes a facility for macros to capture "pattern views", or named sub-patterns. It does not support insertion or updating operations, having been designed primarily for an analytics environment, such as Oracle's PGX product. PGQL has also been implemented in Oracle Big Data Spatial and Graph, and in a research project, PGX.D/Async.
G-CORE G-CORE is a research language designed by a group of academic and industrial researchers and language designers which draws on features of Cypher, PGQL and
SPARQL. The project was conducted under the auspices of the Linked Data Benchmark Council (LDBC), starting with the formation of a Graph Query Language task force in late 2015, with the bulk of the work of paper writing occurring in 2017. G-CORE is a composable language which is closed over graphs: graph inputs are processed to create a graph output, using graph projections and graph set operations to construct the new graph. G-CORE queries are pure functions over graphs, having no side effects, which mean that the language does not define operations which mutate (update or delete) stored data. G-CORE introduces views (named queries). It also incorporates paths as elements in a graph ("paths as first class citizens"), which can be queried independently of projected paths (which are computed at query time over node and edge elements). G-CORE has been partially implemented in open-source research projects in the LDBC GitHub organization.
GSQL GSQL is a language designed for
TigerGraph Inc.'s proprietary graph database. Since October 2018 TigerGraph language designers have been promoting and working on the GQL project. GSQL is a
Turing-complete language that incorporates procedural flow control and iteration, and a facility for gathering and modifying computed values associated with a program execution for the whole graph or for elements of a graph called accumulators. These features are designed to enable iterative graph computations to be combined with data exploration and retrieval. GSQL graphs must be described by a schema of vertexes and edges, which constrains all insertions and updates. This schema therefore has the closed world property of an SQL schema, and this aspect of GSQL (also reflected in design proposals deriving from the Morpheus project) is proposed as an important optional feature of GSQL. Vertexes and edges are named schema objects which contain data but also define an imputed type, much as
SQL tables are data containers, with an associated implicit row type. GSQL graphs are then composed from these vertex and edge sets, and multiple named graphs can include the same vertex or edge set. GSQL has developed new features since its release in September 2017, most notably introducing variable-length edge pattern matching using a syntax related to that seen in Cypher, PGQL and SQL/PGQ, but also close in style to the fixed-length patterns offered by
Microsoft SQL/Server Graph GSQL also supports the concept of Multigraphs which allow subsets of a graph to have role-based access control. Multigraphs are important for enterprise-scale graphs that need fine-grain access control for different users.
Morpheus: multiple graphs and composable graph queries in Apache Spark The opencypher Morpheus project The Morpheus project acted as a testbed for extensions to Cypher (known as "Cypher 10") in the two areas of graph DDL and query language extensions. Graph DDL features include • definition of property graph views over
JDBC-connected SQL tables and Spark DataFrames • definition of graph schemas or types defined by assembling node type and edge type patterns, with subtyping • constraining the content of a graph by a closed or fixed schema • creating catalog entries for multiple named graphs in a hierarchically organized catalog • graph data sources to form a federated, heterogeneous catalog • creating catalog entries for named queries (views) Graph query language extensions include • graph union • projection of graphs computed from the results of pattern matches on multiple input graphs • support for tables (Spark DataFrames) as inputs to queries ("driving tables") • views which accept named or projected graphs as parameters. These features have been proposed as inputs to the standardization of property graph query languages in the GQL project. == Limitations ==