windowing_functions

Window function syntax

<window_function>() OVER (
    [PARTITION BY <column>]
    [ORDER BY <column> [ASC|DESC]]
    [ROWS|RANGE BETWEEN <frame_specification>]
)

Interviewers often ask candidates to solve problems that involve:

Calculating running totals or moving averages
Ranking or assigning row numbers within groups
Comparing values to previous/next rows
Finding top N per group
Calculating percentiles or distributions

SQL window functions perform calculations across a set of rows related to the current row, which are often used with the OVER clause. Here’s an exhaustive list of SQL window functions, categorized by their functionality:

1. Ranking Functions

ROW_NUMBER(): Assigns a unique number to each row within the partition of a result set.
RANK(): Assigns a rank to each row within the partition of a result set, with gaps in ranking when there are ties.
DENSE_RANK(): Similar to RANK(), but without gaps in the ranking.
NTILE(n): Divides the result set into n roughly equal parts and assigns a bucket number to each row.

Example 1

Example 2

2. Aggregate Functions

Where_we_can_use_aggregate_functions
COUNT(): Counts the number of rows in a partition.
SUM(): Computes the sum of values in a partition.
AVG(): Calculates the average of values in a partition.
MIN(): Finds the minimum value in a partition.
MAX(): Finds the maximum value in a partition.

3. Value Functions

FIRST_VALUE(): Returns the first value in an ordered partition.
LAST_VALUE(): Returns the last value in an ordered partition.
NTH_VALUE(): Returns the nth value in an ordered partition.

4. Analytic Functions

LEAD(): Accesses the next row’s value in the partition, often used to compare current and subsequent rows.
LAG(): Accesses the previous row’s value in the partition, often used to compare current and prior rows.
CUME_DIST(): Computes the cumulative distribution of a value in a partition.
PERCENT_RANK(): Calculates the relative rank of a row within a partition, expressed as a percentage.
PERCENTILE_CONT(): Calculates the continuous percentile of a value in a partition.
PERCENTILE_DISC(): Calculates the discrete percentile of a value in a partition.
FLOOR(): Returns the largest integer less than or equal to a specified value.
CEIL(): Returns the smallest integer greater than or equal to a specified value.

5. Offset Functions

LEAD(): Retrieves the value of a subsequent row.
LAG(): Retrieves the value of a preceding row.

6. Windowing Functions

OVER(): Defines the window for the window function. It can be combined with partitioning (PARTITION BY) and ordering (ORDER BY).
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Specifies a window range, starting from the first row in the partition to the current row.
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING: Specifies a window range, starting from the current row to the last row in the partition.
ROWS BETWEEN <start> AND <end>: Specifies the start and end rows within a window.

7. Miscellaneous Functions

NTH_VALUE(): Returns the nth value of a specific column from a window.
GROUPS: Used in conjunction with GROUPING SETS or CUBE to compute additional rows.

These functions help to perform various types of analysis and calculations over sets of rows related to each row in the dataset. They can be used for advanced queries like running totals, moving averages, and complex rankings.

In SQL, window functions allow you to perform calculations across a specific range of rows relative to the current row. The RANGE clause in a window function specifies a range of rows based on a logical value (like the value of a column), not the physical position of rows. This makes it different from the ROWS clause, which considers the physical row position.

Syntax

<window_function>() OVER (
    PARTITION BY <partition_column>
    ORDER BY <order_column>
    RANGE BETWEEN <range_start> AND <range_end>
)

Key Points:

Logical Values: RANGE works with the values of the ORDER BY column.
Supported Keywords:
- UNBOUNDED PRECEDING: Includes all rows from the start of the partition.
- UNBOUNDED FOLLOWING: Includes all rows to the end of the partition.
- CURRENT ROW: Includes the current row only.
- <n> PRECEDING: Includes rows within n units before the current row.
- <n> FOLLOWING: Includes rows within n units after the current row.

Example Scenarios

1. Cumulative Sum Up to Current Row

SELECT 
    customer_id,
    order_date,
    total_amount,
    SUM(total_amount) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date 
        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS cumulative_sum
FROM orders;

Explanation: This calculates a running total of total_amount for each customer_id based on the order_date.

2. Sums Over a Range of Values

SELECT 
    product_id,
    sales_date,
    daily_sales,
    SUM(daily_sales) OVER (
        ORDER BY sales_date 
        RANGE BETWEEN 3 PRECEDING AND 3 FOLLOWING
    ) AS seven_day_window_sum
FROM sales;

Explanation: The SUM is calculated for a range of 7 days centered around the current row (3 days before to 3 days after the current row).

Difference Between `RANGE` and `ROWS`

Clause	Behavior
`RANGE`	Considers all rows with the same value in the `ORDER BY` column.
`ROWS`	Considers a specific number of physical rows, regardless of column value.

Example of `RANGE` vs `ROWS`:

For a table where ORDER BY column values are not unique:

-- RANGE example
SUM(salary) OVER (ORDER BY department RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING)
 
-- ROWS example
SUM(salary) OVER (ORDER BY department ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)

RANGE: Includes all rows where the department value is within ±1000 of the current value.
ROWS: Includes the two rows physically before and after the current row.

Important Notes

RANGE requires the ORDER BY clause.
RANGE is typically used with numerical or date columns.
If rows have the same value in the ORDER BY column, all such rows are included in the calculation.

Would you like further clarification or examples?

Srinath's Blog

Explorer

windowing_functions

Window function syntax

1. Ranking Functions

Example 1

Example 2

2. Aggregate Functions

3. Value Functions

4. Analytic Functions

5. Offset Functions

6. Windowing Functions

7. Miscellaneous Functions

Syntax

Key Points:

Example Scenarios

1. Cumulative Sum Up to Current Row

2. Sums Over a Range of Values

Difference Between `RANGE` and `ROWS`

Example of `RANGE` vs `ROWS`:

Important Notes

Graph View

Table of Contents

Backlinks

Srinath's Blog

Explorer

windowing_functions

Window function syntax

1. Ranking Functions

Example 1

Example 2

2. Aggregate Functions

3. Value Functions

4. Analytic Functions

5. Offset Functions

6. Windowing Functions

7. Miscellaneous Functions

Syntax

Key Points:

Example Scenarios

1. Cumulative Sum Up to Current Row

2. Sums Over a Range of Values

Difference Between RANGE and ROWS

Example of RANGE vs ROWS:

Important Notes

Graph View

Table of Contents

Backlinks

Difference Between `RANGE` and `ROWS`

Example of `RANGE` vs `ROWS`: