logo

8 CD Projekt SQL Interview Questions (Updated 2024)

Updated on

March 3, 2024

At CD Projekt, SQL is used day-to-day for analyzing gaming user data to enhance gameplay experience, and managing large databases for effective game development. Because of this, CD Projekt typically asks SQL coding questions in interviews for Data Science, Analytics, and & Data Engineering jobs.

As such, to help you ace the CD Projekt SQL interview, this blog covers 8 CD Projekt SQL interview questions – able to answer them all?

8 CD Projekt SQL Interview Questions

SQL Question 1: Average Game Ratings Per Month

Given a database of user review data for various games produced by CD Projekt, can you write a SQL query that calculates the average rating of each game for each month? The average rating should be rounded to two decimal places.

The table contains the following columns:

  • review_id is a unique identifier for each review
  • user_id is the id of the reviewer
  • submit_date is the date the review was submitted
  • product_id is a unique identifier for each game
  • stars is a rating from 1 to 5
Example Input:
review_iduser_idsubmit_dateproduct_idstars
11232022-01-1010014
24562022-01-1510023
37892022-01-2010015
41232022-02-1010014
54562022-02-1510012
67892022-02-2010024

Answer:


This query uses a window function to calculate the average stars per product per month. The EXTRACT function gets the month from the submit_date, and it groups the data first by the month then by the product_id. The AVG function calculates the average stars. The results are ordered by month then by product_id for easier readability.

Example Output:
monthproduct_idavg_rating
110014.50
110023.00
210013.00
210024.00

p.s. Window functions show up super often during SQL interviews, so practice the 27+ window function questions on DataLemur

DataLemur SQL Questions

SQL Question 2: Identify Active Customers and their Favorite Games

CD Projekt, a renowned video game publisher, is conducting analysis on its customers' usage patterns. They would like to identify their active customers and figure out which games are their favorites. An "active customer" is defined by CD Projekt as a customer who has purchased more than three games in the last six months. A customer’s favorite game is the game that they've played the most, based on the total number of play hours.

Write a SQL query that retrieves all active customers and their favorite games. Include the total play hours for the favorite game.

Table Example Input:
customer_idname
101John Doe
102Jane Doe
103Jim Brown
104Jill White
Table Example Input:
purchase_idcustomer_idgame_idpurchase_date
2011013012022-01-01
2021013022022-02-01
2031013032022-03-01
2041023012022-01-01
2051033042022-10-01
2061043052022-05-01
2071013062022-04-01
Table Example Input:
gameplay_idpurchase_idhours_played
50120150
50220230
50320370
50420410
50520590
50620640
50720720

Answer:


This query first joins the customers, purchases, and gameplay tables. It filters out the purchases that were made in the last six months. It then groups the results by customer name and filters customers that made more than three purchases (thus, active customers). Finally, it extracts the maximum hours played for each customer, which corresponds to their favorite game, and orders the results by play hours in descending order. Note that if a customer has multiple games with the same highest play hours, this query will only return one of them. To return all such games, we would need a more complex query.

SQL Question 3: What does do in a SQL query?

In SQL, the operator combines the result of two or more SELECT statements into a single result set. Note that each statement within the UNION must have the same number of columns and the columns must have similar data types. The operator removes duplicates from the final result set. If you want to include duplicates in the final result set, you can use the operator instead.

Here's an example of using the UNION operator to combine the results of two SELECT statements that retrieve data from tables of CD Projekt's Facebook ads and their Google ads:


This statement would return a combined result set of both Facebook and Google ads that have a spend greater than $500.

CD Projekt SQL Interview Questions

SQL Question 4: Find Game Titles Similar to User Input

As a part of the data team at CD Projekt, you are tasked with a feature that recommends similar game titles to users based on their input. The input will be a string provided by the user. You need to write an SQL query to find all game titles in the database that are similar to the user's input. Assume your database is in PostgreSQL and uses the keyword for pattern matching.

Please note that the similar titles should:

  • Start with the same first letter as the user's input
  • At least one word in the title should start with the same first letter
Example Input:
game_idtitlerelease_dateprice
101The Witcher 3: Wild Hunt05/19/201539.99
202Cyberpunk 207712/10/202059.99
303Thronebreaker: The Witcher Tales10/23/201819.99
404The Witcher 2: Assassins of Kings05/17/201119.99
505The Witcher: Enhanced Edition10/26/20079.99

User input: "The Wandering Knight"

Example Output:
game_idtitle
101The Witcher 3: Wild Hunt
303Thronebreaker: The Witcher Tales
404The Witcher 2: Assassins of Kings
505The Witcher: Enhanced Edition

Answer:


The SQL command above selects and from the table where begins with 'T' and also has at least one word in the title starting with 'W'. This would fetch all games that have similar titles to the user's input.

SQL Question 5: Can you explain what / SQL commands do?

The / operator is used to remove to return all rows from the first SELECT statement that are not returned by the second SELECT statement.

Note that is available in PostgreSQL and SQL Server, while MINUS is available in MySQL and Oracle (but don't stress about knowing which DBMS supports what exact commands since the interviewers at CD Projekt should be lenient!).

Here's a PostgreSQL example of using EXCEPT to find all of CD Projekt's Facebook video ads with more than 10k views that aren't also being run on YouTube:


If you want to retain duplicates, you can use the EXCEPT ALL operator instead of EXCEPT. The EXCEPT ALL operator will return all rows, including duplicates.

SQL Question 6: Calculation of Weighted Ratios

In CD Projekt, game analysts want to evaluate player engagement for each game in order to understand the impact of different features. The user engagement is taken from a ratings system, where users rate their interaction with each feature on a scale of 1 to 5 (5 is the best), and these ratings are incorporated into an engagement score. The function is used to amplify the impact of higher ratings, function is used to understand the distribution of ratings, is used to measure distance from the highest rating, and is to format the final result. Calculate the weighted engagement score with the following formula:

Using the table, create a view organising information by , and , round the to 2 decimal points.

Example Input:
feature_idgame_nameratingrating_count
1001Cyberpunk 2077512000
2002Cyberpunk 207748000
3003The Witcher 337000
4004The Witcher 323500
5005Cyberpunk 207712000
Example Output:
game_namefeature_idengagement_score
Cyberpunk 207710015.00
Cyberpunk 207720022.50
The Witcher 330031.43
The Witcher 340041.14
Cyberpunk 207750051.00

Answer:


In this query, each of the key math functions (POWER, MOD, ABS) are used in the calculation of the engagement score, which is the central part of the question. The helps to normalize the engagement considering the users' opinions. This engagement score is then rounded to 2 decimal points for easier reading. The results are presented in a new view ordered by and .

To practice a very similar question try this interactive Amazon Average Review Ratings Question which is similar for dealing with ratings and averages or this Alibaba Compressed Mean Question which is similar for its use of mean calculations.

SQL Question 7: What is the difference between the and window functions?

In SQL, both and are used to rank rows within a result set. The key difference between the two functions is how deal with two or more rows having the same value in the ranked column (aka how the break ties).

RANK(): When there's a tie, leaves a gap in the ranking. For example, if three rows are tied for 2nd place, the RANK() function will assign a rank of 2 to the first of these rows, a rank of 3 to the second row, and a rank of 4 to the third row.

DENSE_RANK(): For ties, does not leave a gap in the ranking. Instead, it assigns the same rank to all tied rows, and then makes the next row 1 bigger. Confusing, I know, but here's an example to make it more clear: if three rows are tied for 3rd place, the function will assign a rank of 3 to all three rows, and then assign a rank of 4 to the next row.

SQL Question 8: Retrieve Total Sales Per Month for The Witcher 3: Wild Hunt

CD Projekt is known in the gaming world for its highly successful game "The Witcher 3: Wild Hunt". Suppose you were given a sales dataset with date information. Your task is to write a SQL query that retrieves the total game sales per month for "The Witcher 3: Wild Hunt".

Sample Input:

sale_idgame_idsale_dateunits_sold
1301/10/2020200
2201/15/2020300
3302/12/2020250
4102/20/2020300
5303/05/2020150
6303/22/2020200

Sample Input:

game_idgame_title
1Cyberpunk 2077
2The Witcher 2: Assassins of Kings
3The Witcher 3: Wild Hunt

Example Output:

mthyeargame_titletotal_units_sold
12020The Witcher 3: Wild Hunt200
22020The Witcher 3: Wild Hunt250
32020The Witcher 3: Wild Hunt350

Answer:


This SQL query extracts the MONTH and YEAR from the and applies a group by clause on them along with the game title, thus giving us a total sales count for each month and year. It joins the sale table with the game table to fetch the title and filters the records for the game "The Witcher 3: Wild Hunt". The results are ordered by year and month.

CD Projekt SQL Interview Tips

Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the CD Projekt SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the earlier CD Projekt SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Netflix, Airbnb, and Amazon. DataLemur SQL Interview Questions

Each exercise has hints to guide you, fully explained answers along with a discussion board to see how others solved it and best of all, there is an online SQL coding environment so you can right online code up your SQL query answer and have it graded.

To prep for the CD Projekt SQL interview you can also be useful to practice SQL questions from other tech companies like:

But if your SQL skills are weak, don't worry about going right into solving questions – go learn SQL with this SQL interview tutorial.

SQL tutorial for Data Scientists & Analysts

This tutorial covers things like aggregate window functions and aggregate functions – both of which show up routinely during SQL interviews at CD Projekt.

CD Projekt Data Science Interview Tips

What Do CD Projekt Data Science Interviews Cover?

In addition to SQL interview questions, the other types of problems to practice for the CD Projekt Data Science Interview are:

  • Probability & Stats Questions
  • Python or R Programming Questions
  • Analytics and Product-Metrics Questions
  • Machine Learning and Predictive Modeling Questions
  • Resume-Based Behavioral Questions

CD Projekt Data Scientist

How To Prepare for CD Projekt Data Science Interviews?

To prepare for CD Projekt Data Science interviews read the book Ace the Data Science Interview because it's got:

  • 201 interview questions sourced from FAANG & startups
  • a crash course on SQL, Product-Sense & ML
  • over 900+ reviews on Amazon & 4.5-star rating

Ace the DS Interview