Data aggregation in Cypher works similarly to aggregation in SQL, with a variety of functions that allow you to group and summarize your data. In this section, we’ll cover how to collect values, count data, and use other aggregation functions.
Collecting Values
The COLLECT function is used to create a list of values. It can be useful when you want to aggregate nodes, relationships, or properties into a list.
MATCH (p:Person)
RETURN p.name, COLLECT(p.age) AS ages
This query returns one row per distinct name, along with a list of the ages found under that name. Because p.name is not aggregated, it acts as the grouping key: if there are multiple Person nodes with the same name, all of their ages are collected into one list.
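The grouping key and the collected value can also be swapped. As a minimal sketch, assuming the same Person nodes with name and age properties, the query below groups by age and gathers the names of everyone of that age. The alias names is only illustrative.

```cypher
// p.age is not aggregated, so it becomes the grouping key;
// COLLECT gathers the names of all Person nodes sharing that age.
MATCH (p:Person)
RETURN p.age, COLLECT(p.name) AS names
```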
Counting
The COUNT function is used to count the number of rows, nodes, relationships, or properties that match a given pattern.
MATCH (p:Person)
RETURN COUNT(p)
This query counts the number of Person nodes in the database.
To count distinct values, use COUNT(DISTINCT ...). For example, to count the number of distinct ages among Person nodes:
MATCH (p:Person)
RETURN COUNT(DISTINCT p.age)
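In standard Cypher, COUNT(*) counts every matched row, whereas COUNT applied to an expression skips rows where that expression is null. The sketch below contrasts the two, assuming some Person nodes lack an age property. The aliases people and peopleWithAge are illustrative, and it is worth checking the null handling against your Polypheny version.

```cypher
// COUNT(*) counts all matched rows; COUNT(p.age) counts only
// the rows where p.age is not null (per standard Cypher semantics).
MATCH (p:Person)
RETURN COUNT(*) AS people, COUNT(p.age) AS peopleWithAge
```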
Other Aggregation Functions
Cypher includes a variety of other aggregation functions for summarizing your data:
SUM: Calculates the sum of numeric values.
MATCH (p:Person) RETURN SUM(p.age)
AVG: Calculates the average of numeric values.
MATCH (p:Person) RETURN AVG(p.age)
MIN and MAX: Find the minimum or maximum of numeric or temporal values.
MATCH (p:Person) RETURN MIN(p.age), MAX(p.age)
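STDEV and STDEVP: Calculate the standard deviation of a set of numeric values. In standard Cypher, STDEV treats the values as a sample, while STDEVP treats them as the entire population.
MATCH (p:Person) RETURN STDEV(p.age)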
Remember that all aggregation functions operate on the current set of rows, as defined by the MATCH clause and any WHERE clauses. Cypher has no explicit equivalent of SQL's GROUP BY clause. Instead, the non-aggregated expressions that appear alongside an aggregation function in a WITH or RETURN clause define the groups.
For example, to get the average age of Person nodes, grouped by their name:
MATCH (p:Person)
RETURN p.name, AVG(p.age)
This query will return each distinct name, along with the average age of Person nodes with that name.
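Grouping with WITH rather than RETURN makes it possible to filter on the aggregated result, much like HAVING in SQL. The following is a sketch in standard Cypher, not a confirmed Polypheny feature list. The thresholds 18 and 30 and the aliases name and avgAge are hypothetical values chosen for illustration.

```cypher
// 1. WHERE after MATCH filters individual rows before aggregation.
// 2. WITH groups by name and computes the average age per group.
// 3. WHERE after WITH filters whole groups (comparable to SQL HAVING).
MATCH (p:Person)
WHERE p.age >= 18
WITH p.name AS name, AVG(p.age) AS avgAge
WHERE avgAge > 30
RETURN name, avgAge
```

The first WHERE decides which rows feed into the aggregation, and the second decides which groups are returned.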
Understanding how to aggregate data will allow you to write more powerful and flexible queries, enabling you to derive insights from your graph data in Polypheny.