You are working as a Data Engineer for a SaaS platform. The product team wants to measure user engagement by tracking Monthly Active Users (MAU) . You are given a table called user_activity with the following structure: activity_id (INTEGER): Unique identifier for each activity user_id (INTEGER): Identifier for the user activity_date (DATE): Date when the activity occurred activity_type (TEXT): Type of activity (e.g., 'login', 'upload', 'download') Task: Write a SQL query to calculate the number of distinct active users per month . A user is considered "active" in a month if they have at least one activity in that month. Expected Output: month (truncate date to the first day of the month) active_users (count of distinct users in that month) The result should be sorted by month in ascending order.
Table Setup (SQL Schema)
-- DDL: Create table
CREATE TABLE user_activity (
activity_id INTEGER PRIMARY KEY,
user_id INTEGER,
activity_date DATE,
activity_type TEXT
);
-- DML: Insert sample data
INSERT INTO user_activity (activity_id, user_id, activity_date, activity_type) VALUES
(1, 101, '2024-01-05', 'login'),
(2, 102, '2024-01-10', 'upload'),
(3, 101, '2024-01-15', 'download'),
(4, 103, '2024-01-20', 'login'),
(5, 104, '2024-02-01', 'login'),
(6, 101, '2024-02-05', 'upload'),
(7, 102, '2024-02-10', 'login'),
(8, 105, '2024-02-12', 'download'),
(9, 106, '2024-03-01', 'login'),
(10, 101, '2024-03-03', 'download'),
(11, 104, '2024-03-05', 'upload'),
(12, 107, '2024-03-07', 'login');