Subqueries in Expressions
1.7 SQL Functions
1.7.6 Window Functions
Syntax
<window_fuction> ::= <window_function_type> OVER ( [ PARTITION BY <expression>[{,
<expression>}...] <window_order_by_clause>[{, <window_order_by_clause>}...] ) <window_function_type> ::= RANK() | DENSE_RANK() | ROW_NUMBER() | PERCENT_RANK() | CUME_DIST()
<ntile_function> ::= NTILE( <number_of_buckets> )
<lead_lag_function> ::= { LEAD | LAG } ( <expression> [, <offset> [,
<default_expression>]] )
<first_last_value_function> ::= { FIRST_VALUE | LAST_VALUE } ( <expression> ) <nth_value_function> ::= NTH_VALUE ( <expression> , <n> )
<window_aggregate_function> ::= COUNT(*) | <agg_name> ( <expression> ) <agg_name> ::= COUNT | MIN | MAX | SUM | AVG | STDDEV | VAR
<window_order_by_clause> ::= ORDER BY {<window_order_by_expression>}
<window_order_by_expression> ::= <expression> [ASC | DESC] [NULLS FIRST | NULLS LAST ]
Description
The window function let users divide result sets of a query (or a logical partition of a query) into groups of rows called window partition.
A window partition is specified by one or more expressions in the OVER clause.
Window functions other than aggregation functions must have ORDER BY clause in the OVER clause.
Result sets are first partitioned as specified by PARTITION BY clause, and then sorted by ORDER BY clause specification within the window partition.
Finally window functions are applied to each row within window partition boundaries.
Note that ORDER BY clause in OVER clause is only used to evaluate window function so that the order of resulting rows is non-deterministic if not specified by ORDER BY for SELECT.
The default window frame of the window function depends on whether or not a window orderby clause is specified.
If window orderby clause is specified, the default window frame becomes 'between UNBOUNDED PRECEDING and CURRENT ROW', that is, the window function computes on rows preceding or peer with current row.
As a result, the function returns cumulative values.
If window orderby clause is not specified, the default window frame becomes 'between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING', that is the window function computes on the whole window partition.
As a result, the function returns the same value within a window partition regardless of current row.
RANK()
Returns rank of a row within a partition, starting from 1. Returns duplicate values in the ranking sequence when there are ties between values.
DENSE_RANK()
Returns ranking values without gaps.
ROW_NUMBER()
Returns unique row number within a partition.
The output of ROW_NUMBER function can be non-deterministic among tie values.
PERCENT_RANK()
Returns relative rank of a row.
The relative rank of a row R is defined as (RANK()-1)/(NR-1), where NR is defined to be the number of rows in the window partition of R.
CUME_DIST()
Returns relative rank of a row.
The relative rank of a row R is defined as NP/NR, where NP is defined to be the number of rows preceding or peer with R inthe window ordering of the window partition of R and NR is defined to be the number of rows in the window ordering of the window.
LEAD ( <expression> [, <offset> [, <default_expression>]] )
Returns value of the 'offset' rows after current row. The 'offset' should be non-negative and default is 1.
If the offset crosses boundaries of the partition 'default' value is returned. If the 'default' is not specified null value is returned.
The 'offset' and 'default' are evaluated at current row.
The output of LEAD function can be non-deterministic among tie values.
LAG ( <expression> [, <offset> [, <default_expression>]] )
Returns value of the 'offset' rows before current row. The 'offset' should be non-negative and default is 1.
If the offset crosses boundaries of the partition 'default' value is returned. If the 'default' is not specified null value is returned.
The 'offset' and 'default' are evaluated at current row.
The output of LAG function can be non-deterministic among tie values.
NTILE( <number_of_buckets> )
Distributes rows into a specified 'number of buckets' and assigns the bucket number starting from 1 to each row in the bucket.
The output of NTILE function can be non-deterministic among tie values.
FIRST_VALUE ( <expression> )
Returns the value of 'expression' evaluated at the first row of the window frame.
The output of FIRST_VALUE function can be non-deterministic among tie values.
LAST_VALUE ( <expression> )
Returns the value of 'expression' evaluated at the last row of the window frame.
The output of LAST_VALUE function can be non-deterministic among tie values.
NTH_VALUE ( <expression> , <n> )
Returns the value of 'expression' evaluated at the 'n'-th row from the first row of the window frame.
The output of NTH_VALUE function can be non-deterministic among tie values.
Window Aggregate Function
See Aggregate Expressions for aggregate functions. Window aggregation does not support DISTINCT.
Example
Here is an example table T.
create table T (class char(10), val int, offset int);
insert into T values('A', 1, 1);
insert into T values('A', 3, 3);
insert into T values('A', 5, null);
insert into T values('A', 5, 2);
insert into T values('A', 10, 0);
insert into T values('B', 1, 3);
insert into T values('B', 1, 1);
insert into T values('B', 7, 1);
Example : Ranking functions select class, val,
ROW_NUMBER() over (partition by class order by val) as row_num, RANK() over (partition by class order by val) as rank,
DENSE_RANK() over (partition by class order by val) as dense_rank from T;
The example result of the query is shown below. Note that the result of ROW_NUMBER can be non-deterministic.
class val row_num rank dense_rank
A 1 1 1 1 select class, val, offset,
LEAD(val) over (partition by class order by val) as lead,
LEAD(val,offset,-val) over (partition by class order by val) as lead2, LAG(val) over (partition by class order by val) as lag,
LAG(val,offset,-val) over (partition by class order by val) as lag2 from T;
The example result of the query is shown below. Note that the result of LEAD/LAG can be non-deterministic.
class val offset lead lead2 lag lag2
A 1 1 3 3 null -1
B 1 1 7 7 1 1
B 7 1 null -7 1 1
Example : NTILE/FIRST_VALUE/LAST_VALUE/NTH_VALUE functions select class, val,
NTILE(3) over (partition by class order by val) as nt,
FIRST_VALUE(val) over (partition by class order by val) as first, LAST_VALUE(val) over (partition by class order by val) as last, NTH_VALUE(val, 4) over (partition by class order by val) as nth from T;
The example result of the query is shown below. Note that the result of NTILE/FIRST_VALUE/LAST_VALUE/
NTH_VALUE can be non-deterministic.
class val nt first last nth
A 1 1 1 1 null select class, val, offset,
COUNT(*) over (partition by class) as c1, COUNT(offset) over (partition by class) as c2,
COUNT(*) over (partition by class order by val) as c3, COUNT(offset) over (partition by class order by val) as c4, MAX(val) over (partition by class) as m1,
MAX(val) over (partition by class order by val) as m2 from T;
The result of the query is shown below.
class val offset C1 C2 C3 C4 M1 M2
A 1 1 5 4 1 1 10 1
A 3 3 5 4 2 2 10 3
A 5 null 5 4 4 3 10 5
A 5 2 5 4 4 3 10 5
A 10 0 5 4 5 4 10 10
B 1 3 3 3 2 2 7 1
B 1 1 3 3 2 2 7 1
B 7 1 3 3 3 3 7 7