Aggregating Data

Data aggregation in Cypher works similarly to aggregation in SQL, with a variety of functions that allow you to group and summarize your data. In this section, we’ll cover how to collect values, count data, and use other aggregation functions.

Collecting Values

The COLLECT function gathers the values from multiple rows into a single list. It is useful when you want to aggregate nodes, relationships, or properties into a list.

MATCH (p:Person)
RETURN p.name, COLLECT(p.age) AS ages

This query returns one row per distinct name, together with a list of the ages of all Person nodes that share that name. Because p.name is not wrapped in an aggregation function, it acts as the grouping key. If every name is unique, each list contains at most one age.
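The grouping can also run in the other direction. As a minimal sketch, assuming the same Person nodes with name and age properties as in the example above, the following query groups by age and collects the names; the aliases age and names are illustrative choices.

```cypher
// Group Person nodes by age and collect the names in each group.
// Assumes Person nodes carry name and age properties, as in the example above.
MATCH (p:Person)
RETURN p.age AS age, COLLECT(p.name) AS names
```

Each result row holds one distinct age and a list of the names of the people with that age.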

Counting

The COUNT function is used to count the number of rows, nodes, relationships, or properties that match a given pattern.

MATCH (p:Person)
RETURN COUNT(p)

This query counts the number of Person nodes in the database.

To count distinct values, use COUNT(DISTINCT ...). For example, to count the number of distinct ages among Person nodes:

MATCH (p:Person)
RETURN COUNT(DISTINCT p.age)
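
Both forms can appear in the same RETURN clause. A short sketch, with people and distinctAges as illustrative aliases:

```cypher
// Total number of Person nodes next to the number of distinct ages.
// The aliases people and distinctAges are illustrative.
MATCH (p:Person)
RETURN COUNT(p) AS people, COUNT(DISTINCT p.age) AS distinctAges
```

Because no non-aggregated expression is returned, the query produces a single row.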

Other Aggregation Functions

Cypher includes a variety of other aggregation functions for summarizing your data:

  • SUM: Calculates the sum of numeric values.
    MATCH (p:Person)
    RETURN SUM(p.age)
    
  • AVG: Calculates the average of numeric values.
    MATCH (p:Person)
    RETURN AVG(p.age)
    
  • MIN and MAX: Find the minimum or maximum of numeric or temporal values.
    MATCH (p:Person)
    RETURN MIN(p.age), MAX(p.age)
    
  • STDEV and STDEVP: Calculate the standard deviation of a set of numeric values. STDEV treats the values as a sample, while STDEVP treats them as the entire population.
    MATCH (p:Person)
    RETURN STDEV(p.age)
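
The functions listed above can be combined in a single RETURN clause to produce a compact summary. The following is a sketch only, assuming that age is stored as a numeric property; all aliases are illustrative.

```cypher
// Several aggregates computed in one query over all Person nodes.
// The aliases are illustrative; age is assumed to be numeric.
MATCH (p:Person)
RETURN COUNT(p) AS people,
       MIN(p.age) AS youngest,
       MAX(p.age) AS oldest,
       AVG(p.age) AS averageAge,
       SUM(p.age) AS totalAge
```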
    

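Aggregation functions operate on the rows produced by the preceding clauses, such as MATCH and WHERE. Cypher has no explicit GROUP BY clause. Instead, every non-aggregated expression that appears alongside an aggregation function in a RETURN or WITH clause acts as a grouping key, and the aggregate is computed once per distinct combination of those keys. If there are no such expressions, the aggregate is computed over all rows.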

For example, to get the average age of Person nodes, grouped by their name:

MATCH (p:Person)
RETURN p.name, AVG(p.age)

This query will return each distinct name, along with the average age of Person nodes with that name.
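
WITH can be used in the same way when the aggregated result needs further processing. The following is a minimal sketch, assuming that WITH followed by WHERE behaves as in standard Cypher; the aliases name, people, and averageAge are illustrative.

```cypher
// Group by name in WITH, then keep only names shared by more than one node.
// Assumes standard Cypher semantics for WITH ... WHERE; aliases are illustrative.
MATCH (p:Person)
WITH p.name AS name, COUNT(p) AS people, AVG(p.age) AS averageAge
WHERE people > 1
RETURN name, people, averageAge
```

This plays the role of HAVING in SQL: the filter is applied to the groups after aggregation rather than to individual nodes.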

Understanding how to aggregate data will allow you to write more powerful and flexible queries, enabling you to derive insights from your graph data in Polypheny.

© Polypheny GmbH. All Rights Reserved.