Neon Twin: Dev/preview/test on Neon, ship to prod on RDS - Read how here
Postgres guides/Functions/Aggregate functions

Postgres COUNT() function

Count rows or non-null values in a result set

The Postgres COUNT() function counts the number of rows in a result set or the number of non-null values in a specific column.

It's useful for data analysis, reporting, and understanding the size and composition of your datasets. Some common use cases include calculating the total number of records in a table, finding the number of distinct values in a column, or determining how many rows meet certain conditions.

Try it on Neon!

Neon is Serverless Postgres built for the cloud. Explore Postgres features and functions in our user-friendly SQL editor. Sign up for a free account to get started.

Sign Up

Function signatures

The COUNT() function has two main forms:

COUNT(*) -> bigint
  • Counts the total number of rows in the result set.
COUNT([DISTINCT] expression) -> bigint
  • Counts the number of rows where the input expression is not NULL.
  • DISTINCT is an optional keyword, that removes duplicate values before counting.

Example usage

Consider an orders table that tracks orders placed by customers of an online store. It has columns order_id, customer_id, product_id, and order_date. We'll use the COUNT() function to analyze this data.

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    product_id INTEGER,
    order_amount DECIMAL(10, 2) NOT NULL,
    order_date TIMESTAMP NOT NULL
);

INSERT INTO orders (customer_id, product_id, order_amount, order_date)
VALUES
    (1, 101, 150.00, '2023-01-15 10:30:00'),
    (2, 102, 75.50, '2023-01-16 11:45:00'),
    (1, 103, 200.00, '2023-02-01 09:15:00'),
    (3, 104, 50.25, '2023-02-10 14:20:00'),
    (2, 105, 125.75, '2023-03-05 16:30:00'),
    (4, NULL, 90.00, '2023-03-10 13:00:00'),
    (1, 106, 180.50, '2023-04-02 11:10:00'),
    (3, 107, 60.25, '2023-04-15 10:45:00'),
    (5, 108, 110.00, '2023-05-01 15:20:00'),
    (2, 109, 95.75, '2023-05-20 12:30:00');

Count all rows

To get the total number of orders, you can use COUNT(*):

SELECT COUNT(*) AS total_orders
FROM orders;

This query will return the total number of rows in the orders table.

total_orders
--------------
           10
(1 row)

Count non-null values

To count how many orders have a product_id (assuming some orders might not have a product associated):

SELECT COUNT(product_id) AS orders_with_product
FROM orders;

This query will return the number of orders where product_id is not NULL.

orders_with_product
---------------------
                   9
(1 row)

Count distinct values

To find out how many unique customers have placed orders:

SELECT COUNT(DISTINCT customer_id) AS unique_customers
FROM orders;

This query will return the number of distinct customer_id values in the orders table.

unique_customers
------------------
                5
(1 row)

Advanced examples

We use the orders table created in the previous section to demonstrate more use cases of the COUNT() function.

Combine COUNT() with GROUP BY

You can use COUNT() with GROUP BY to get counts for different categories:

SELECT
  DATE_TRUNC('month', order_date) AS month,
  COUNT(*) AS orders_per_month
FROM orders
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

This query counts the number of orders for each month.

month        | orders_per_month
---------------------+------------------
 2023-01-01 00:00:00 |                2
 2023-02-01 00:00:00 |                2
 2023-03-01 00:00:00 |                2
 2023-04-01 00:00:00 |                2
 2023-05-01 00:00:00 |                2
(5 rows)

Use COUNT() in a subquery

You can use COUNT() in a subquery to filter based on counts:

SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > (
  SELECT AVG(order_count)
  FROM (
    SELECT COUNT(*) AS order_count
    FROM orders
    GROUP BY customer_id
  ) AS customer_order_counts
);

This query finds customers who have placed more orders than the average number of orders per customer.

customer_id | order_count
-------------+-------------
           2 |           3
           1 |           3
(2 rows)

Combine COUNT() with CASE

You can use COUNT() with CASE statements to only count rows that meet specific conditions:

SELECT
  COUNT(*) AS total_orders,
  COUNT(CASE WHEN order_amount > 100 THEN 1 END) AS high_value_orders,
  COUNT(CASE WHEN order_amount <= 100 THEN 1 END) AS low_value_orders
FROM orders;

This query counts the total number of orders, as well as the number of high-value and low-value orders.

total_orders | high_value_orders | low_value_orders
--------------+-------------------+------------------
           10 |                 5 |                5
(1 row)

Use COUNT() with FILTER clause

Postgres also allows using a FILTER clause with aggregate functions, which can be more readable than CASE statements:

SELECT
  COUNT(*) AS total_orders,
  COUNT(*) FILTER (WHERE order_date >= '2023-04-01') AS recent_orders
FROM orders;

This query counts the total number of orders, as well as the number of orders placed after April 1, 2023.

total_orders | recent_orders
--------------+---------------
           10 |             4
(1 row)

Additional considerations

Performance implications

COUNT(*) is generally faster than COUNT(column) or COUNT(DISTINCT column) because it doesn't need to check for NULL values or uniqueness. However, on very large tables, even COUNT(*) can be slow if it needs to scan the entire table.

For frequently used counts, consider maintaining a separate counter table or using materialized views to improve performance.

NULL handling

Both COUNT(column) and COUNT(DISTINCT column) expressions do not count NULL values. If you need to include NULL values in your count, use COUNT(*) or COUNT(COALESCE(column, 0)).

Alternative approaches

  • For approximate counts of distinct values in very large datasets, consider using the pg_stat_statements extension or the HyperLogLog algorithm (available through extensions like postgresql-hll).
  • For faster counts on large tables, consider using estimate counts based on table statistics with pg_class.reltuples.

Resources

Last updated on

Was this page helpful?