YMatrix
Quick Start
Connecting
Benchmarks
Deployment
Data Usage
Manage Clusters
Upgrade
Global Maintenance
Expansion
Monitoring
Security
Best Practice
Technical Principles
Data Type
Storage Engine
Execution Engine
Streaming Engine(Domino)
MARS3 Index
Extension
Advanced Features
Advanced Query
Federal Query
Grafana
Backup and Restore
Disaster Recovery
Graph Database
Introduction
Clauses
Functions
Advanced
Guide
Performance Tuning
Troubleshooting
Tools
Configuration Parameters
SQL Reference
Aggregation functions — aggr(expr) — operate over all rows matching a given aggregation key. Keys are compared for equivalence.
In standard aggregation (i.e., aggr(expr)), the list of values passed to the aggregate is derived from the candidate values after removing all NULLs.
Note!
In YMatrix AGE, aggregation results are not ordered. To obtain an ordered result set from aggregated data, applyORDER BYoutside thecypher()function — in the outer SQL query. AnORDER BYinside the Cypher clause has no effect on global ordering; it only guarantees ordering per segment.
Example:
SELECT * FROM cypher('graph_name', $$ /* aggr(expr) */ $$) AS (grouping_key agtype) ORDER BY grouping_key;
SELECT * FROM cypher('graph_name', $$
CREATE (a:Person {name: 'A', age: 13}),
(b:Person {name: 'B', age: 33, eyes: "blue"}),
(c:Person {name: 'C', age: 44, eyes: "blue"}),
(d1:Person {name: 'D', eyes: "brown"}),
(d2:Person {name: 'D'}),
(a)-[:KNOWS]->(b),
(a)-[:KNOWS]->(c),
(a)-[:KNOWS]->(d1),
(b)-[:KNOWS]->(d2),
(c)-[:KNOWS]->(d2)
$$) AS (a agtype);
Cypher provides aggregation capabilities analogous to SQL’s GROUP BY. Aggregation functions accept a set of values and compute a single summary value — for example, avg() computes the arithmetic mean of numeric values, and min() returns the smallest numeric or lexicographic value in the set.
When we say an aggregation function operates “over a set of values”, we mean the inner expression (e.g., n.age) is evaluated for every record within the same aggregation group.
Aggregations may be computed over the entire matched result set, or further subdivided using grouping keys. Grouping keys are non-aggregate expressions used to partition input rows before applying the aggregation function.
Consider the following RETURN statement:
SELECT * FROM cypher('graph_name', $$
MATCH (v:Person)
RETURN v.name, count(*)
$$) AS (grouping_key agtype, count agtype);
| count | key |
|---|---|
"A" |
1 |
"B" |
1 |
"C" |
1 |
"D" |
2 |
| (1 row) |
There are two return expressions: v.name and count(*). The first is not an aggregate, so it becomes the implicit grouping key. The second is an aggregate. Matched records are grouped by the value of v.name, and count(*) is computed once per group.
In DISTINCT aggregation (i.e., aggr(DISTINCT expr)), the list of candidate values is first filtered to remove all NULLs. Then, among equivalent values, only one instance is retained — duplicates (under equivalence) are eliminated.
The DISTINCT operator is used with aggregations to ensure uniqueness before the aggregation function processes the values.
SELECT *
FROM cypher('graph_name', $$
MATCH (v:Person)
RETURN count(DISTINCT v.eyes), count(v.eyes)
$$) AS (distinct_eyes agtype, eyes agtype);
| distinct_eyes | eyes |
|---|---|
2 |
3 |
| (1 row) |
Cypher’s automatic grouping behavior — where users need not explicitly declare grouping keys — can lead to ambiguity.
SELECT * FROM cypher('graph_name', $$
CREATE (:L {a: 1, b: 2, c: 3}),
(:L {a: 2, b: 3, c: 1}),
(:L {a: 3, b: 1, c: 2})
$$) AS (a agtype);
AGE resolves ambiguity by requiring that any variable referenced outside an aggregate function must either:
WITH or RETURN clause, or SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
RETURN x.a + count(*) + x.b + count(*) + x.c
$$) AS (a agtype);
ERROR: "x" must be either part of an explicitly listed key or used inside an aggregate function
LINE 3: RETURN x.a + count(*) + x.b + count(*) + x.c
In AGE, any non-aggregate column in a WITH or RETURN clause is treated as a grouping key.
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
RETURN (x.a + x.b + x.c) + count(*) + count(*), x.a + x.b + x.c
$$) AS (count agtype, key agtype);
Here, x.a + x.b + x.c serves as the grouping key. Parentheses are required to form a compound key.
| count | key |
|---|---|
12 |
6 |
| (1 row) |
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
RETURN x.a + count(*) + x.b + count(*) + x.c, x.a, x.b, x.c
$$) AS (count agtype, a agtype, b agtype, c agtype);
Here, x.a, x.b, and x.c are each treated as independent grouping keys.
| count | a | b | c |
|---|---|---|---|
8 |
3 |
1 |
2 |
8 |
2 |
3 |
1 |
8 |
1 |
2 |
3 |
| (3 rows) |
A vertex or edge itself may serve as a grouping key. When doing so, any of its properties may be referenced directly in the same WITH or RETURN clause without explicit declaration.
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
RETURN count(*) + count(*) + x.a + x.b + x.c, x
$$) AS (count agtype, key agtype);
The result is grouped by x. Because vertices are identity-based, their properties need not be enumerated for unambiguous grouping.
If a grouping key is needed for aggregation but should not appear in the final output, perform the aggregation in a WITH clause and pass only the desired result to RETURN.
SELECT * FROM cypher('graph_name', $$
MATCH (x:L)
WITH count(*) + count(*) + x.a + x.b + x.c AS column, x
RETURN column
$$) AS (a agtype);
| a |
|---|
8 |
8 |
8 |
| (3 rows) |