Chapter 3 · Product Analyst
3. SQL for product analysts
~10 min read
Product analyst SQL has a distinct flavor: it is mostly about events, timestamped user actions, and the cohort, funnel, and retention questions you ask of them. This chapter focuses on the patterns that come up daily and in interviews.
3.1 The events table mental model#
Most product data is a long, narrow events table: one row per action, with a user id, an event name, a timestamp, and properties. Almost every product question becomes 'filter to the right events, group by the right key, and count or window over time.'
sql
-- events(user_id, event_name, ts, platform, properties)
-- Daily active users by platform, last 30 days
SELECT DATE_TRUNC('day', ts) AS day,
platform,
COUNT(DISTINCT user_id) AS dau
FROM events
WHERE ts >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY 1, 2
ORDER BY day;3.2 Window functions, your power tool#
The two you'll reach for most are ROW_NUMBER() (first/last event per user) and LAG() (gap between a user's actions).
sql
-- Each user's FIRST purchase (the activation event)
WITH ranked AS (
SELECT user_id, ts, amount,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY ts) AS rn
FROM events WHERE event_name = 'purchase'
)
SELECT user_id, ts AS first_purchase_at, amount
FROM ranked WHERE rn = 1;3.3 Cohort retention in SQL#
sql
-- Day-7 retention by signup-week cohort
WITH signups AS (
SELECT user_id, DATE_TRUNC('week', MIN(ts)) AS cohort
FROM events GROUP BY user_id
),
activity AS (
SELECT DISTINCT user_id, DATE_TRUNC('day', ts) AS active_day
FROM events
)
SELECT s.cohort,
COUNT(DISTINCT s.user_id) AS cohort_size,
COUNT(DISTINCT CASE WHEN a.active_day = s.cohort + INTERVAL '7 days'
THEN a.user_id END) AS retained_d7
FROM signups s
LEFT JOIN activity a ON a.user_id = s.user_id
GROUP BY s.cohort ORDER BY s.cohort;3.4 Funnels in SQL#
sql
-- View -> Add to cart -> Purchase funnel
SELECT
COUNT(*) AS viewed,
COUNT(cart_ts) AS added,
COUNT(CASE WHEN buy_ts > cart_ts THEN 1 END) AS bought
FROM (
SELECT user_id,
MIN(CASE WHEN event_name='view' THEN ts END) AS view_ts,
MIN(CASE WHEN event_name='cart' THEN ts END) AS cart_ts,
MIN(CASE WHEN event_name='buy' THEN ts END) AS buy_ts
FROM events GROUP BY user_id
) u;3.5 Sessionization#
sql
WITH gaps AS (
SELECT user_id, ts,
CASE WHEN ts - LAG(ts) OVER (PARTITION BY user_id ORDER BY ts)
> INTERVAL '30 minutes' THEN 1 ELSE 0 END AS new_session
FROM events
)
SELECT user_id, ts,
SUM(new_session) OVER (PARTITION BY user_id ORDER BY ts) AS session_id
FROM gaps;Get the next chapter and weekly interview tips by email
One short email per week. Skim in a minute. Unsubscribe anytime.
