Polypheny extends the traditional CQL Specification by introducing several new features and adjustments to enhance query detail and clarity. This documentation provides an overview of the custom CQL implementation, including the absence of prefix assignments and search-term-only filters, and the introduction of additional keywords.
Polypheny implements namespaces to function as Context Sets, containing tables, which themselves host columns. The roles of these entities align with the “index” definition in the standard CQL specification. Operations performed on columns are referred to as ‘Filters’, while the outcomes of combined operations (such as Joins or Set Operations) on tables are designated as ‘Relations’.
Information: The integration of CQL into Polypheny was primarily executed as part of Vishal Dalwadi’s Google Summer of Code 2021 project, entitled Support for Contextual Query Language.
CQL can be executed using the query console provided by the Polypheny-UI. Additionally, Polypheny also supports an HTTP-Interface, which is able to handle CQL queries. After deploying the interface, CQL queries can be submitted via HTTP POST request to the specified port (default: 13137) and the corresponding route (
/cql). The query needs to be placed in the body of the POST request. The result is returned as JSON.
Polypheny’s CQL implementation uses fully qualified names instead of indices. For example, column names like
public.emps.emp and table names like
Polypheny’s CQL Implementation consists of four major parts: Filters, Relations, Sort Specifications and Projections. The format of each of these is discussed later. The basic syntax of a query consisting of these four parts is shown below. The parser is case-insensitive when it comes to keywords (such as modifiers, boolean operators or comparison operators) but is case-sensitive when it comes to names and literals (such column names, table names or literal values in filters).
CQL Query: ( Filters | relation Relation | Filters relation Relation ) [ sortby SortSpecification ] [ project Projection ] Filters: ( Filters ) | Filter BooleanOp Filter | Filter Relation: Table Combiner Relation | Table SortSpecification: Column Modifiers Projection: Column Modifiers Filter: Column ComparisonOp ( Column | Literal value with or without double quotes ) BooleanOp: (AND | OR | NOT | PROX) Modifiers Combiner: (AND | OR) Modifiers Column: NamespaceName.TableName.ColumnName Modifiers: ( Modifier )* Modifier: / ModifierName [ ComparisonOp ModifierValue ] ComparisonOp: (= | == | <> | < | > | <= | >= | NamedComparator) Modifiers NamedComparator: String containing only alphabets. Table: NamespaceName.TableName NamespaceName, TableName, ColumnName: String containing only alphabets or underscores (_). ModifierName: String without double quotes, white spaces, escaped double quote, parenthesis, =, <, > or /. ModifierValue: String value with or without double quotes.
Filters are used to do comparisons on a column in the
Relation. They are similar to SQL’s WHERE clause.
Relation is specified, the column must be from the relation specified. Comparisons can be between literal
values or columns; however, support for comparisons with columns is still underway. Comparisons like equals,
not equal, less than, greater than, less than or equals and greater than or equals can be done with support for
more (between, any, all, etc.) to be added later. Multiple filters can be separated by AND, OR, NOT or PROX boolean
operators; however, support for PROX is still underway. Boolean operators’ precedence depend on their position; i.e.
First occurring boolean operator has higher precedence than those occurring later. However, this can be change by
Relation is the final table that the query would be executed on. So the relation is a combination of
multiple tables. The combination operation can be a join, union, intersection, set-difference, etc; however,
implementation of set operations as combiners is still underway. The actual combiner keyword used is AND or OR.
For joins, AND means a INNER join, whereas OR means a FULL, LEFT or RIGHT join. The combiner also takes modifiers
used to modify its execution. The two modifiers currently supported are:
null modifier is to be used with OR combiner to specify which of the rows can be null. Possible values of
null modifier are:
both (FULL join),
right (LEFT join) or
left (RIGHT join).
on modifier is used to specify the columns to join on. It only works if the column(s) belong to both the
tables. Possible values of
on modifier are:
all (finds the common columns between two tables; Default for AND),
none (Default for OR), comma-separated list of column names (for example, ‘name,id’).
Sort specification is used to specify a space separated column list on which to sort the query output. These columns must be projected if a projection clause is specified.
Projection is used to specify the columns for the result. It is also used for aggregations and grouping.
Consider a namespace “public” with tables “employee” and “dept” defined as follows:
CREATE TABLE public.dept( deptno TINYINT NOT NULL, deptname VARCHAR(30) NOT NULL, PRIMARY KEY (deptno) ); CREATE TABLE public.employee( empno INTEGER NOT NULL, empname VARCHAR(20) NOT NULL, salary REAL NOT NULL, deptno TINYINT NOT NULL, married BOOLEAN NOT NULL, dob DATE NOT NULL, joining_date DATE NOT NULL, PRIMARY KEY (empno) );
Then the following CQL queries can be executed on the namespace.
Find employee named “Loki”:
public.employee.empname == "Loki"
Find all employees in the HR department that are married:
public.dept.deptname == "HR" and public.employee.married == TRUE
Find all employees from the HR or IT department:
public.dept.deptname == "HR" or public.dept.deptname == "IT" relation public.employee and public.dept
Find all employees from all departments except HR:
public.employee.empno >= 1 NOT public.dept.deptname == "HR"
Count the number of employees:
relation public.employee project public.employee.empno/count
Get all the employee names sorted by date of birth:
relation public.employee sortby public.employee.dob project public.employee.empname
Count the number of employees in each department:
relation public.employee project public.employee.empno/count public.employee.deptno
Looking ahead, there are several promising enhancements and features that will further refine the functionality and user experience of Polypheny’s CQL:
getCommonColumnsOptimization: By creating a cache, the performance of this method can be substantially improved.
- Column Filters Support: This will provide a more versatile and refined filtering mechanism.
- Set Operations Support: This will allow for more complex query operations.
- Proximity Boolean Operator Support: This will enable advanced, proximity-based search operations.
- Querying the Result of a Query: This will allow for nested querying, leading to more complex and comprehensive data extraction.
- Modifiers for Sorting, Projection, and Filtering: These will allow for further customization and fine-tuning of data operations.
- Enhanced Test Coverage: To ensure robustness and reliability of the CQL, comprehensive testing will be essential.