Polypheny extends the traditional CQL Specification by introducing several new features and adjustments to enhance query detail and clarity. This documentation provides an overview of the custom CQL implementation, including the absence of prefix assignments and search-term-only filters, and the introduction of additional keywords.
Polypheny implements namespaces to function as Context Sets, containing tables, which themselves host columns. The roles of these entities align with the “index” definition in the standard CQL specification. Operations performed on columns are referred to as ‘Filters’, while the outcomes of combined operations (such as Joins or Set Operations) on tables are designated as ‘Relations’.
Information: The integration of CQL into Polypheny was primarily executed as part of Vishal Dalwadi’s Google Summer of Code 2021 project, entitled Support for Contextual Query Language.
Basics
CQL can be executed using the query console provided by the Polypheny-UI. Additionally, Polypheny also supports an HTTP-Interface, which is able to handle CQL queries. After deploying the interface, CQL queries can be submitted via HTTP POST request to the specified port (default: 13137) and the corresponding route (/cql
). The query needs to be placed in the body of the POST request. The result is returned as JSON.
Polypheny’s CQL implementation uses fully qualified names instead of indices. For example, column names like public.emps.emp
and table names like public.emps
.
Syntax
Polypheny’s CQL Implementation consists of four major parts: Filters, Relations, Sort Specifications and Projections. The format of each of these is discussed later. The basic syntax of a query consisting of these four parts is shown below. The parser is case-insensitive when it comes to keywords (such as modifiers, boolean operators or comparison operators) but is case-sensitive when it comes to names and literals (such column names, table names or literal values in filters).
CQL Query:
(
Filters
|
relation Relation
|
Filters relation Relation
)
[ sortby SortSpecification ]
[ project Projection ]
Filters:
( Filters ) | Filter BooleanOp Filter | Filter
Relation:
Table Combiner Relation | Table
SortSpecification:
Column Modifiers
Projection:
Column Modifiers
Filter:
Column ComparisonOp ( Column | Literal value with or without double quotes )
BooleanOp:
(AND | OR | NOT | PROX)
Modifiers
Combiner:
(AND | OR)
Modifiers
Column:
NamespaceName.TableName.ColumnName
Modifiers:
( Modifier )*
Modifier:
/ ModifierName
[ ComparisonOp ModifierValue ]
ComparisonOp:
(= | == | <> | < | > | <= | >= | NamedComparator)
Modifiers
NamedComparator:
String containing only alphabets.
Table:
NamespaceName.TableName
NamespaceName, TableName, ColumnName:
String containing only alphabets or underscores (_).
ModifierName:
String without double quotes, white spaces, escaped double quote, parenthesis, =, <, > or /.
ModifierValue:
String value with or without double quotes.
Filters
Filters are used to do comparisons on a column in the Relation
. They are similar to SQL’s WHERE clause.
If Relation
is specified, the column must be from the relation specified. Comparisons can be between literal
values or columns; however, support for comparisons with columns is still underway. Comparisons like equals,
not equal, less than, greater than, less than or equals and greater than or equals can be done with support for
more (between, any, all, etc.) to be added later. Multiple filters can be separated by AND, OR, NOT or PROX boolean
operators; however, support for PROX is still underway. Boolean operators’ precedence depend on their position; i.e.
First occurring boolean operator has higher precedence than those occurring later. However, this can be change by
using parenthesis.
Relation
Relation is the final table that the query would be executed on. So the relation is a combination of
multiple tables. The combination operation can be a join, union, intersection, set-difference, etc; however,
implementation of set operations as combiners is still underway. The actual combiner keyword used is AND or OR.
For joins, AND means a INNER join, whereas OR means a FULL, LEFT or RIGHT join. The combiner also takes modifiers
used to modify its execution. The two modifiers currently supported are: null
and on
.
The null
modifier is to be used with OR combiner to specify which of the rows can be null. Possible values of
null
modifier are: both
(FULL join), right
(LEFT join) or left
(RIGHT join).
The on
modifier is used to specify the columns to join on. It only works if the column(s) belong to both the
tables. Possible values of on
modifier are: all
(finds the common columns between two tables; Default for AND),
none
(Default for OR), comma-separated list of column names (for example, ‘name,id’).
Sort Specification
Sort specification is used to specify a space separated column list on which to sort the query output. These columns must be projected if a projection clause is specified.
Projection
Projection is used to specify the columns for the result. It is also used for aggregations and grouping.
Example Queries
Consider a namespace “public” with tables “employee” and “dept” defined as follows:
CREATE TABLE public.dept(
deptno TINYINT NOT NULL,
deptname VARCHAR(30) NOT NULL,
PRIMARY KEY (deptno) );
CREATE TABLE public.employee(
empno INTEGER NOT NULL,
empname VARCHAR(20) NOT NULL,
salary REAL NOT NULL,
deptno TINYINT NOT NULL,
married BOOLEAN NOT NULL,
dob DATE NOT NULL,
joining_date DATE NOT NULL,
PRIMARY KEY (empno) );
Then the following CQL queries can be executed on the namespace.
Find employee named “Loki”:
public.employee.empname == "Loki"
Find all employees in the HR department that are married:
public.dept.deptname == "HR" and public.employee.married == TRUE
Find all employees from the HR or IT department:
public.dept.deptname == "HR" or public.dept.deptname == "IT" relation public.employee and public.dept
Find all employees from all departments except HR:
public.employee.empno >= 1 NOT public.dept.deptname == "HR"
Count the number of employees:
relation public.employee project public.employee.empno/count
Get all the employee names sorted by date of birth:
relation public.employee sortby public.employee.dob project public.employee.empname
Count the number of employees in each department:
relation public.employee project public.employee.empno/count public.employee.deptno
Future Directions and Desirable Features
Looking ahead, there are several promising enhancements and features that will further refine the functionality and user experience of Polypheny’s CQL:
- Combiner’s
getCommonColumns
Optimization: By creating a cache, the performance of this method can be substantially improved. - Column Filters Support: This will provide a more versatile and refined filtering mechanism.
- Set Operations Support: This will allow for more complex query operations.
- Proximity Boolean Operator Support: This will enable advanced, proximity-based search operations.
- Querying the Result of a Query: This will allow for nested querying, leading to more complex and comprehensive data extraction.
- Modifiers for Sorting, Projection, and Filtering: These will allow for further customization and fine-tuning of data operations.
- Enhanced Test Coverage: To ensure robustness and reliability of the CQL, comprehensive testing will be essential.