How To Calculate Retention Month Over Month Using SQL

I'm trying to get a basic table that shows retention from one month to the next: if someone buys something in one month and buys again the following month, that gets counted. The result should have columns such as month and num_trans.

Solution 1:

Given the following test table (which you should have provided):

CREATE TEMP TABLE transaction (buyer_id int, tstamp timestamp);
INSERT INTO transaction VALUES
 (1,'2012-01-03 20:00')
,(1,'2012-01-05 20:00')
,(1,'2012-01-07 20:00')  -- multiple transactions this month
,(1,'2012-02-03 20:00')  -- next month
,(1,'2012-03-05 20:00')  -- next month
,(2,'2012-01-07 20:00')
,(2,'2012-03-07 20:00')  -- not next month
,(3,'2012-01-07 20:00')  -- just once
,(4,'2012-02-07 20:00'); -- just once

The table auth_user is not relevant to the problem. I use tstamp as the column name because I don't use base type names (such as timestamp) as identifiers.

I am going to use the window function lag() to identify repeated buyers. To keep it short I combine aggregate and window functions in one query level. Bear in mind that window functions are applied after aggregate functions.

WITH t AS (
   SELECT buyer_id
         ,date_trunc('month', tstamp) AS month
         ,count(*) AS item_transactions
         ,lag(date_trunc('month', tstamp)) OVER (PARTITION BY buyer_id
                                           ORDER BY date_trunc('month', tstamp))
          = date_trunc('month', tstamp) - interval '1 month'
            OR NULL AS repeat_transaction
   FROM   transaction
   WHERE  tstamp >= '2012-01-01'::date
   AND    tstamp <  '2012-05-01'::date -- time range of interest.
   GROUP  BY 1, 2
   )
SELECT month
      ,sum(item_transactions) AS num_trans
      ,count(*) AS num_buyers
      ,count(repeat_transaction) AS repeat_buyers
      ,round(
          CASE WHEN sum(item_transactions) > 0
             THEN count(repeat_transaction) / sum(item_transactions) * 100
             ELSE 0
          END, 2) AS buyer_retention
FROM   t
GROUP  BY 1
ORDER  BY 1;

Result:

  month  | num_trans | num_buyers | repeat_buyers | buyer_retention_pct
---------+-----------+------------+---------------+--------------------
 2012-01 |         5 |          3 |             0 |                0.00
 2012-02 |         2 |          2 |             1 |               50.00
 2012-03 |         2 |          2 |             1 |               50.00
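
To see the evaluation order used in the CTE above in isolation, here is a minimal sketch that is not part of the original answer. It runs against three inline rows instead of the test table; the VALUES list and the prev_month alias are only for illustration. Because the window function is applied after GROUP BY, lag() sees one row per buyer and month, not one row per transaction:

```sql
-- Illustrative only: three made-up transactions for buyer 1,
-- two in January 2012 and one in February 2012.
SELECT buyer_id
      ,date_trunc('month', tstamp) AS month
      ,count(*) AS item_transactions              -- aggregate: evaluated first
      ,lag(date_trunc('month', tstamp)) OVER (PARTITION BY buyer_id
                                        ORDER BY date_trunc('month', tstamp))
       AS prev_month                              -- window function: runs on the grouped rows
FROM  (VALUES (1, timestamp '2012-01-03 20:00')
             ,(1, timestamp '2012-01-05 20:00')
             ,(1, timestamp '2012-02-03 20:00')) AS v(buyer_id, tstamp)
GROUP  BY 1, 2
ORDER  BY 1, 2;

-- January:  item_transactions = 2, prev_month is NULL
-- February: item_transactions = 1, prev_month = 2012-01-01 00:00:00
```

The two January transactions collapse into a single row before lag() runs, so February's previous row is January, which is exactly what the comparison with "month minus 1 month" relies on.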

I extended your question to provide for the difference between the number of transactions and the number of buyers.

The OR NULL for repeat_transaction serves to convert FALSE to NULL, so those values do not get counted by count() in the next step.
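
A quick way to convince yourself of that, as a standalone sketch with made-up inline values (the flag column is hypothetical and not part of the test table):

```sql
-- count(expr) counts every non-NULL value, so FALSE is counted too.
-- "flag OR NULL" yields TRUE for TRUE and NULL for FALSE or NULL,
-- so only the TRUE rows remain countable.
SELECT count(flag)         AS all_non_null   -- 3: true, false, false
      ,count(flag OR NULL) AS true_only      -- 1: only the true row
FROM  (VALUES (true), (false), (false), (NULL::boolean)) AS v(flag);
```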

-> SQLfiddle.

Solution 2:

This uses CASE and EXISTS to get repeated transactions:

SELECT
    *,
    CASE
        WHEN num_transactions = 0
        THEN 0
        ELSE round(100.0 * repeat_transactions / num_transactions, 2)
    END AS retention
FROM
    (
        SELECT
            to_char(timestamp, 'YYYY-MM') AS month,
            count(*) AS num_transactions,
            sum(CASE
                WHEN EXISTS (
                    SELECT 1
                    FROM transaction AS t
                    JOIN auth_user AS u
                    ON t.buyer_id = u.id
                    WHERE
                        date_trunc('month', transaction.timestamp)
                            + interval '1 month'
                            = date_trunc('month', t.timestamp)
                        AND auth_user.email = u.email
                )
                THEN 1
                ELSE 0
            END) AS repeat_transactions
        FROM
            transaction
            JOIN auth_user
            ON transaction.buyer_id = auth_user.id
        GROUP BY 1
    ) AS summary
ORDER BY 1;

EDIT: Changed from minus 1 month to plus 1 month after reading the question again. My understanding now is that if someone buys something in 2012-02 and then buys something again in 2012-03, their transactions in 2012-02 are counted as retention for that month.
