Level 3: Loyalty Illusion
Problem 3.1: The VIP Program Request
Ting ting! Not the monthly paycheck notification :(, but an email from Kelly, the Marketing Director.
From: Kelly (Marketing Director)
To: Data Team
Subject: VIP List for Coupon Campaign
"Hi Team! 🚀
We want to launch a 'VIP Loyalty Program' to reward our best users.
I need a list of all Repeat Customers (anyone who has placed more than 1 order with us).
Please give me the count of how many customers qualify. Thanks!
This time, use updated duckdb file: https://public.lelouvincx.com/brazilian_ecommerce.duckdb
Total Repeat Customers is: 0
Logic / Approach:
-
A "repeat customer" means a customer who comes back and places more than one order
-
In the dataset, each row in orders represents one order, and customers are identified by customer_id
-
Therefore: repeat customer = number of orders per customer > 1.
-
To solve this, we:
- Group by customer_id
- Count orders for each customer: COUNT(order_id)
- Keep customers with COUNT(order_id) > 1
WITH repeat_customers AS (
SELECT
customer_id
FROM orders
GROUP BY customer_id
HAVING COUNT(order_id) > 1 )
SELECT
COUNT(*) AS repeat_customer_count
FROM repeat_customers;

Problem 3.2: Check
I hope before writing a single line of code, you decided to audit the customers table to see how it's structured.
Run this query to look at the first few rows of the customers table:
SELECT *
FROM customers
LIMIT 5;
What do you see?
You should see 2 different ID columns:
customer_idcustomer_unique_id
Why would a table have two IDs for the same person?
Problem 3.3
Can you answer the problem 3.1 again?