How To Calculate Retention Month Over Month Using SQL
Solution 1:
Given the following test table (which you should have provided):
CREATE TEMP TABLE transaction (buyer_id int, tstamp timestamp);
INSERT INTO transaction VALUES
(1,'2012-01-03 20:00')
,(1,'2012-01-05 20:00')
,(1,'2012-01-07 20:00') -- multiple transactions this month
,(1,'2012-02-03 20:00') -- next month
,(1,'2012-03-05 20:00') -- next month
,(2,'2012-01-07 20:00')
,(2,'2012-03-07 20:00') -- not next month
,(3,'2012-01-07 20:00') -- just once
,(4,'2012-02-07 20:00'); -- just once
Table auth_user is not relevant to the problem.
Using tstamp as column name since I don't use base types as identifiers.
I am going to use the window function lag() to identify repeated buyers. To keep it short I combine aggregate and window functions in one query level. Bear in mind that window functions are applied after aggregate functions.
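As a minimal sketch of that evaluation order, the following standalone query runs lag() over an aggregate. The inlined rows and the names tx and prev_month_transactions are made up for illustration and are not part of the test table above.

```sql
-- Hypothetical mini data set, inlined so the example runs on its own.
WITH tx (buyer_id, tstamp) AS (
   VALUES (1, timestamp '2012-01-03 20:00')
        , (1, timestamp '2012-01-05 20:00')
        , (1, timestamp '2012-02-03 20:00')
   )
SELECT buyer_id
     , date_trunc('month', tstamp) AS month
       -- Aggregate: computed first, once per (buyer_id, month) group.
     , count(*) AS item_transactions
       -- Window function: evaluated afterwards, over the grouped rows,
       -- so lag() returns the previous month's count for the same buyer.
     , lag(count(*)) OVER (PARTITION BY buyer_id
                           ORDER BY date_trunc('month', tstamp)) AS prev_month_transactions
FROM   tx
GROUP  BY 1, 2
ORDER  BY 1, 2;
```

January has no earlier row for this buyer, so prev_month_transactions is NULL there; for February it is 2, the January count. The query for the actual problem relies on the same ordering: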
WITH t AS (
   SELECT buyer_id
         ,date_trunc('month', tstamp) AS month
         ,count(*) AS item_transactions
         ,lag(date_trunc('month', tstamp)) OVER (PARTITION BY buyer_id
                                                 ORDER BY date_trunc('month', tstamp))
          = date_trunc('month', tstamp) - interval '1 month' OR NULL AS repeat_transaction
   FROM   transaction
   WHERE  tstamp >= '2012-01-01'::date
   AND    tstamp <  '2012-05-01'::date -- time range of interest.
   GROUP  BY 1, 2
   )
SELECT month
      ,sum(item_transactions) AS num_trans
      ,count(*) AS num_buyers
      ,count(repeat_transaction) AS repeat_buyers
      ,round(
          CASE WHEN sum(item_transactions) > 0
             THEN count(repeat_transaction) / sum(item_transactions) * 100
             ELSE 0
          END, 2) AS buyer_retention
FROM   t
GROUP  BY 1
ORDER  BY 1;
Result:
  month  | num_trans | num_buyers | repeat_buyers | buyer_retention_pct
---------+-----------+------------+---------------+--------------------
 2012-01 |         5 |          3 |             0 |                0.00
 2012-02 |         2 |          2 |             1 |               50.00
 2012-03 |         2 |          2 |             1 |               50.00
I extended your question to provide for the difference between the number of transactions and the number of buyers.
The OR NULL for repeat_transaction serves to convert FALSE to NULL, so those values do not get counted by count() in the next step.
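A small standalone check of that behavior, using three made-up boolean values (the names v, flag and the two column aliases are only for illustration):

```sql
-- count() skips only NULL, so a plain count(flag) includes FALSE rows.
-- FALSE OR NULL evaluates to NULL, TRUE OR NULL stays TRUE.
SELECT count(flag)         AS true_and_false  -- 2
     , count(flag OR NULL) AS true_only       -- 1
FROM  (VALUES (true), (false), (NULL::boolean)) AS v(flag);
```

Without the OR NULL, every buyer whose previous purchase was not in the preceding month would still be counted as a repeat buyer, because the comparison returns FALSE rather than NULL for them.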
Solution 2:
This uses CASE and EXISTS to get repeated transactions:
SELECT
    *,
    CASE WHEN num_transactions = 0
        THEN 0
        ELSE round(100.0 * repeat_transactions / num_transactions, 2)
    END AS retention
FROM
    (
        SELECT
            to_char(timestamp, 'YYYY-MM') AS month,
            count(*) AS num_transactions,
            sum(CASE WHEN EXISTS (
                    SELECT 1
                    FROM transaction AS t
                    JOIN auth_user AS u
                        ON t.buyer_id = u.id
                    WHERE
                        date_trunc('month', transaction.timestamp)
                            + interval '1 month'
                            = date_trunc('month', t.timestamp)
                        AND auth_user.email = u.email
                )
                THEN 1
                ELSE 0
            END) AS repeat_transactions
        FROM
            transaction
            JOIN auth_user
                ON transaction.buyer_id = auth_user.id
        GROUP BY 1
    ) AS summary
ORDER BY 1;
EDIT: Changed from minus 1 month to plus 1 month after reading the question again. My understanding now is that if someone buys something in 2012-02 and then buys something again in 2012-03, his or her transactions in 2012-02 count toward retention for that month.
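To make the "plus 1 month" comparison concrete, here is a one-row sketch of the condition inside the EXISTS, with two illustrative timestamps standing in for transaction.timestamp and t.timestamp (the alias bought_again_next_month is made up):

```sql
-- A purchase on 2012-02-07 is truncated to 2012-02-01 and shifted forward one month,
-- then compared with the truncated month of a later purchase on 2012-03-05.
SELECT date_trunc('month', timestamp '2012-02-07 20:00') + interval '1 month'
     = date_trunc('month', timestamp '2012-03-05 20:00') AS bought_again_next_month;  -- true
```

With minus 1 month instead, the same pair would credit the March transaction rather than the February one.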