IBM db2 Product Analytics IBM SQL Interview Question

IBM is analyzing how its employees use the Db2 database by tracking the SQL queries they execute. The objective is to generate data for a histogram that shows the number of unique queries run by employees during the third quarter of 2023 (July to September). The data should also count the employees who did not run any queries during this period.

Display the number of unique queries as histogram categories, along with the count of employees who executed that number of unique queries.

Schema:

Column Name	Type	Description
employee_id	integer	The ID of the employee who executed the query.
query_id	integer	The unique identifier for each query (Primary Key).
query_starttime	datetime	The timestamp when the query started.
execution_time	integer	The duration of the query execution in seconds.

Example Input:

Assume that the table below displays all queries made from July 1, 2023 to July 31, 2023:

employee_id	query_id	query_starttime	execution_time
226	856987	07/01/2023 01:04:43	2698
132	286115	07/01/2023 03:25:12	2705
221	33683	07/01/2023 04:34:38	91
240	17745	07/01/2023 14:33:47	2093
110	413477	07/02/2023 10:55:14	470

Schema:

Column Name	Type	Description
employee_id	integer	The ID of the employee who executed the query.
full_name	string	The full name of the employee.
gender	string	The gender of the employee.

Example Input:

Assume that the table below displays all employees:

employee_id	full_name	gender
1	Judas Beardon	Male
2	Lainey Franciotti	Female
3	Ashbey Strahan	Male

Example Output:

unique_queries	employee_count
0	191
1	46
2	12
3	1

The output indicates that 191 employees did not run any queries, 46 employees ran exactly 1 unique query, 12 employees ran 2 unique queries, and so on.

The dataset you are querying against may have different input & output - this is just an example!

Solution

Step 1: Identify Queries in the Third Quarter of 2023

First, we need to filter the queries table to include only those queries executed between July 1, 2023, and September 30, 2023. We use these dates because they correspond to the third quarter of the year.

Using the COUNT function with DISTINCT, we also count the number of unique queries per employee within this time frame.
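A minimal sketch of this step, assuming PostgreSQL-style SQL and a table named queries (the table name is not shown in the text above and is an assumption); the column names come from the schema:

```sql
-- Assumption: the query log lives in a table called "queries".
-- Count each employee's distinct queries within Q3 2023.
SELECT
  employee_id,
  COUNT(DISTINCT query_id) AS unique_queries
FROM queries
WHERE query_starttime >= '2023-07-01'   -- start of Q3, inclusive
  AND query_starttime <  '2023-10-01'   -- start of Q4, exclusive
GROUP BY employee_id;
```

Since query_id is described as the primary key, DISTINCT is not strictly required here, but it keeps the intent explicit.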

💡Why Filter Up to October 1st?

The date range specified in the filter (from July 1st up to, but not including, October 1st) ensures we include all queries executed from the start of July to the end of September. By using a strict less-than comparison against October 1st, we capture all timestamps up to, but not including, October 1st. This is standard practice to include the entire last day of September without accidentally including any part of October.
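To see the difference concretely, the sketch below compares a BETWEEN filter ending on September 30 with the half-open range ending before October 1. It assumes PostgreSQL-style SQL, and the two timestamps are hypothetical values chosen only for illustration:

```sql
-- Hypothetical timestamps, not taken from the dataset.
SELECT
  ts,
  -- '2023-09-30' is read as midnight, so the rest of that day is excluded.
  ts BETWEEN TIMESTAMP '2023-07-01' AND TIMESTAMP '2023-09-30' AS in_between_sept_30,
  -- Half-open range: keeps all of September 30, nothing from October.
  (ts >= TIMESTAMP '2023-07-01' AND ts < TIMESTAMP '2023-10-01') AS in_before_oct_1
FROM (VALUES
  (TIMESTAMP '2023-09-30 15:42:10'),
  (TIMESTAMP '2023-10-01 00:00:00')
) AS t(ts);
```

The afternoon timestamp on September 30 fails the BETWEEN test but passes the half-open test, while midnight on October 1 is excluded by both.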

Step 2: Ensure All Employees are Counted

Next, we need to ensure that all employees are included, even if they did not run any queries during the third quarter. Using the same query in Step 1, we'll use a LEFT JOIN to achieve this, combining the employees table with the results from Step 1.

We'll also use the COALESCE function to set the query count to 0 for employees with no queries.

We'll wrap this query in a CTE (common table expression) so that the main query can reference it.
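One way this step could look, assuming PostgreSQL-style SQL. The table names employees and queries and the CTE name employee_query_counts are assumptions made for illustration, since the original names are not shown above:

```sql
-- Assumed names: tables "employees" and "queries", CTE "employee_query_counts".
WITH employee_query_counts AS (
  SELECT
    e.employee_id,
    -- Employees without a match get NULL from the join; turn it into 0.
    COALESCE(q.unique_queries, 0) AS unique_queries
  FROM employees AS e
  LEFT JOIN (
    -- Step 1: distinct queries per employee in Q3 2023.
    SELECT
      employee_id,
      COUNT(DISTINCT query_id) AS unique_queries
    FROM queries
    WHERE query_starttime >= '2023-07-01'
      AND query_starttime <  '2023-10-01'
    GROUP BY employee_id
  ) AS q
    ON q.employee_id = e.employee_id
)
SELECT employee_id, unique_queries
FROM employee_query_counts;
```

Keeping the date filter inside the subquery matters: applying it in a WHERE clause after the LEFT JOIN would discard the NULL rows and drop the employees with no queries.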

Step 3: Count Employees by Unique Query Count

Finally, in the main query, we group the results by the number of unique queries and count how many employees fall into each group. We also order the results by the number of unique queries to generate the histogram.
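Putting the three steps together, a full query might look like the sketch below. It assumes PostgreSQL-style SQL; the table names employees and queries and the CTE name employee_query_counts are assumptions, while the output columns match the example output:

```sql
-- Assumed names: tables "employees" and "queries", CTE "employee_query_counts".
WITH employee_query_counts AS (
  SELECT
    e.employee_id,
    COALESCE(q.unique_queries, 0) AS unique_queries
  FROM employees AS e
  LEFT JOIN (
    SELECT
      employee_id,
      COUNT(DISTINCT query_id) AS unique_queries
    FROM queries
    WHERE query_starttime >= '2023-07-01'
      AND query_starttime <  '2023-10-01'
    GROUP BY employee_id
  ) AS q
    ON q.employee_id = e.employee_id
)
-- One row per histogram bucket: how many employees ran that many queries.
SELECT
  unique_queries,
  COUNT(employee_id) AS employee_count
FROM employee_query_counts
GROUP BY unique_queries
ORDER BY unique_queries;
```

Buckets that no employee falls into simply do not appear in the result; the query does not generate empty rows for them.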

Sourced from: IBM

Difficulty: Easy
