Skip to content Skip to sidebar Skip to footer

Count Rows After Joining Three Tables In Postgresql

Suppose I have three tables in PostgreSQL: table1 - id1, a_id, updated_by_id table2 - id2, a_id, updated_by_id Users - id, display_name Suppose I am using the using the following

Solution 1:

To simplify your logic, aggregate first, join later.

Guessing missing details, this query would give you the exact count, how many times each user was referenced in table1 and table2 respectively for all users:

SELECT*FROM   users u
LEFTJOIN (
   SELECT updated_by_id AS id, count(*) AS t1_ct
   FROM   table1
   GROUPBY1
   ) t1 USING (id)
LEFTJOIN (
   SELECT updated_by_id AS id, count(*) AS t2_ct
   FROM   table2
   GROUPBY1
   ) t2 USING (id);

In particular, avoid multiple 1-n relationships multiplying each other when joined together:

To retrieve a single or few users only, LATERAL joins will be faster (Postgres 9.3+):

SELECT*FROM   users u
LEFTJOINLATERAL (
   SELECTcount(*) AS t1_ct
   FROM   table1
   WHERE  updated_by_id = u.id
   ) ONtrueLEFTJOINLATERAL (
   SELECTcount(*) AS t2_ct
   FROM   table2
   WHERE  updated_by_id = u.id
   ) ONtrueWHERE  u.id =100;

Explain perceived difference

The particular mismatch you report is due to the specifics of a FULL OUTER JOIN:

First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in columns of T2. Also, for each row of T2 that does not satisfy the join condition with any row in T1, a joined row with null values in the columns of T1 is added.

So you get NULL values appended on the respective other side for missing matches. count() does not count NULL values. So you can get a different result depending on whether you filter on u1.id=100 or u2.id=100.

This is just to explain, you don't need a FULL JOIN here. Use the presented alternatives instead.

Post a Comment for "Count Rows After Joining Three Tables In Postgresql"