At Embracer Group, SQL is used across the company for analyzing gaming user behavior and optimizing the performance of their diverse portfolio of game titles. That's the reason behind why Embracer Group often tests SQL questions in interviews for Data Science, Analytics, and & Data Engineering jobs.
Thus, to help you study for the Embracer Group SQL interview, we'll cover 10 Embracer Group SQL interview questions – able to answer them all?
As a data analyst at Embracer Group, which operates in the video games industry, your task is to analyze the customer review data to find out the average rating each product has received on a monthly basis. You are given a dataset contained in a "reviews" table and you need to construct a SQL query to achieve the desired outcome.
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2022-06-08 | 1 | 4 |
7802 | 265 | 2022-06-10 | 2 | 4 |
5293 | 362 | 2022-06-18 | 1 | 3 |
6352 | 192 | 2022-07-26 | 2 | 3 |
4517 | 981 | 2022-07-05 | 2 | 2 |
Your task is to calculate the average rating for each for each month in the dataset. Here, represents the ID for each of the products involved in the reviews and represents the rating each user gave to a product on a scale of 1 to 5.
month | product_id | avg_stars |
---|---|---|
06 | 1 | 3.50 |
06 | 2 | 4.00 |
07 | 2 | 2.50 |
In this PostgreSQL query, we have used the function to extract the month part from the . Then, we have grouped by and to calculate the average for each product per month. The function averages the ratings across each group defined by the clause. The average is then casted to float with 2 decimal points for readability. The clause sorts the rows by month and product in ascending order.
The result of this query will be a list of the monthly average star rating for all products in the dataset.
Pro Tip: Window functions are a popular SQL interview topic, so practice all the window function problems on DataLemur
Given the table for Embracer Group, an international gaming company, that has records of customers and their respective game purchases, write a SQL query to filter out customers who have purchased games developed by 'Tarsier Studios' and belongs to a genre 'Adventure', OR customers who made any purchase within the 'Post-apocalypse' game category, regardless of the game developer.
Here's what a sample of the data you're working with might look like:
customer_id | name | game_id | |
---|---|---|---|
3712 | Alice | alice@gmail.com | 1030 |
7804 | Bob | bob@gmail.com | 2045 |
5296 | Charlie | charlie@gmail.com | 1186 |
6355 | Dave | dave@gmail.com | 2045 |
4518 | Eve | eve@gmail.com | 1552 |
game_id | game_name | developer | genre | category |
---|---|---|---|---|
1030 | Little Nightmares | Tarsier Studios | Adventure | Platformer |
2045 | RAGE 2 | id Software | Shooter | Post-apocalypse |
1186 | Uniform | ENUU GAMES | RPG | Fantasy |
1552 | The Secret of Monkey Island | Lucasfilm Games | Adventure | Point and click |
customer_id | name | game_id | |
---|---|---|---|
3712 | Alice | alice@gmail.com | 1030 |
7804 | Bob | bob@gmail.com | 2045 |
6355 | Dave | dave@gmail.com | 2045 |
This query uses a operation to combine information from the and tables based on a common . It then uses a clause to include only customers who purchased a game from a specific developer and genre or belong to a certain category.
A foreign key is a column or group of columns in a table that refers to the primary key in another table. The foreign key constraint helps maintain referential integrity between the two tables. The table with the foreign key is called the child table, while the table with the candidate key is called the parent or referenced table.
For example, consider a database with two tables: and customer_idcustomer_idcustomer_id` column in Embracer Group customers table.
Embracer Group is a parent company that owns many video game development studios. A typical data point that they might be interested in is the average rating that each of their games receives monthly. Assume they have a table with the following structure:
review_id | user_id | submit_date | game_id | stars |
---|---|---|---|---|
6251 | 323 | 06/28/2022 00:00:00 | 87021 | 4 |
7823 | 452 | 07/10/2022 00:00:00 | 43285 | 5 |
5291 | 612 | 08/19/2022 00:00:00 | 87021 | 3 |
6372 | 892 | 09/26/2022 00:00:00 | 43285 | 2 |
6521 | 781 | 10/05/2022 00:00:00 | 87021 | 4 |
Given this table, write a SQL query to output the average game rating for each game per month. For simplicity, assume all dates are within the same year. Your output should look like this:
mth | game_id | avg_stars |
---|---|---|
6 | 87021 | 4.00 |
7 | 43285 | 5.00 |
8 | 87021 | 3.00 |
9 | 43285 | 2.00 |
10 | 87021 | 4.00 |
Each row in the output table represents a single game in a single month, and the average star rating given to that game in that month.
In this query, we make use of the function to retrieve the month from . However, is in a timestamp format, so we can aggregate the reviews by month. Then we aggregate the number of stars by taking the average, using the function. We cast the result to retain two decimal places for readability. Finally, the results are ordered by month and game ID for better organization.
A self-join is a type of join in which a table is joined to itself. To perform a self-join, you need to specify the table name twice in the clause, and give each instance of the table a different alias. You can then join the two instances of the table using a clause, and use a clause to specify the relationship between the rows.
For example, say you were doing an HR analytics project and needed to analyze how much all Embracer Group employees in the same department interact with each other. Here's a self-join query you could use to retrieve all pairs of Embracer Group employees who work in the same department:
This query returns all pairs of Embracer Group employees who work in the same department, and excludes pairs where the employee's id is the same (since this would represent the same Embracer Group employee being paired with themselves).
You are given the 'customers' table for Embracer Group, a company that owns, manages, and develops game franchises. You are required to identify and filter down customer records where the customer's email domain is 'gmail.com'.
customer_id | first_name | last_name | date_created | |
---|---|---|---|---|
1 | John | Doe | john.doe@gmail.com | 05/10/2020 |
2 | Jane | Smith | jane.smith@yahoo.com | 05/12/2020 |
3 | Harry | Styles | harry.styles@gmail.com | 02/20/2021 |
4 | Taylor | Swift | taylor.swift@hotmail.com | 03/30/2021 |
5 | Michael | Jackson | michael.jackson@gmail.com | 04/01/2022 |
This SQL query will filter and return all records from the 'customers' table where the email address ends with '@gmail.com'. The '%' symbol in SQL LIKE clause is a wildcard character that can match any sequence of characters. In this case, it indicates that any series of characters followed by '@gmail.com' should be matched and returned. With this query, we can identify all customers who are using Gmail as their email service.
A is a column or set of columns in a table that references the primary key of another table. It is used to link the two tables together, and to ensure that the data in the foreign key column is valid.
The constraint helps to ensure the integrity of the data in the database by preventing the insertion of rows in the foreign key table that do not have corresponding entries in the primary key table. It also helps to enforce the relationship between the two tables, and can be used to ensure that data is not deleted from the primary key table if there are still references to it in the foreign key table.
For example, if you have a table of Embracer Group customers and an orders table, the customer_id column in the orders table could be a that references the id column (which is the primary key) in the Embracer Group customers table.
As a data analyst at Embracer Group, your task is to analyze the customer's gaming data. You have access to two tables: and . The table lists all the customers with their details. The table lists the games purchased by the customers along with the purchase date.
Please write a SQL query to find out the total revenue generated from each game by joining these tables. Also, include the number of unique customers who purchased each game.
customer_id | first_name | last_name | phone | |
---|---|---|---|---|
123 | John | Doe | john.doe@example.com | 123-456-7890 |
456 | Jane | Smith | jane.smith@example.com | 098-765-4321 |
789 | Bob | Johnson | bob.johnson@example.com | 111-222-3333 |
147 | Alice | Williams | alice.williams@example.com | 444-555-6666 |
purchase_id | customer_id | game_name | purchase_date | price |
---|---|---|---|---|
1 | 123 | Skyrim | 2020-01-01 | 19.99 |
2 | 123 | Borderlands 3 | 2020-02-01 | 59.99 |
3 | 456 | Skyrim | 2020-03-01 | 19.99 |
4 | 789 | Borderlands 3 | 2020-04-01 | 59.99 |
5 | 147 | Witcher 3 | 2020-05-01 | 29.99 |
This query joins the two tables on the field. It then groups the result by . For each game, it counts the number of unique customer IDs (to get the number of unique customers who purchased each game) and sums up the purchase price (to calculate the total revenue from each game).
Because joins come up so often during SQL interviews, try this interactive Snapchat Join SQL question:
Embracer Group wants to compute monthly revenue, cost, and profit percentage for each game they sell. They have two tables: and . The table records every sale made, including the game_id, the sale price (), and the sale date (). The table contains information about each game including the game_id, title, and cost to produce ().
Sample table:
Sample table:
Your task is to write a PostgreSQL query that will create a table with columns: month, title, total revenue, total cost, and profit percentage. The revenue and cost should be rounded to the nearest integer. The profit percentage should be computed as ((revenue - cost) / cost) * 100, rounded to 2 decimal places. Profit percentage should be represented as a fractional number (not a percentage).
This query performs a join on the and tables based on the field, then aggregates the revenue, costs, and profit percentage by month and title. It uses PostgreSQL's TO_CHAR function to extract the month from the . The math functions, ROUND and SUM, are used in combination with the arithmetic operators to calculate the revenues, costs and profit percentages. The output will be sorted by month, then by title, helping the company to analyze its financial performance per game each month.
To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for calculating maximum revenue per category or this Wayfair Y-on-Y Growth Rate Question which is similar for calculating growth rate based on sales.
Database normalization is the process of breaking down a table into smaller and more specific tables and defining relationships between them via foreign keys. This minimizes redundancy, and creates a database that's more flexible, scalable, and easier to maintain. It also helps to ensure the integrity of the data by minimizing the risk of data inconsistencies and anomalies.
The best way to prepare for a Embracer Group SQL interview is to practice, practice, practice. Beyond just solving the above Embracer Group SQL interview questions, you should also solve the 200+ SQL questions on DataLemur which come from companies like Netflix, Google, and Amazon.
Each DataLemur SQL question has hints to guide you, fully explained answers along with a discussion board to see how others solved it and crucially, there is an interactive SQL code editor so you can right online code up your SQL query answer and have it executed.
To prep for the Embracer Group SQL interview you can also be useful to practice interview questions from other tech companies like:
In case your SQL query skills are weak, forget about diving straight into solving questions – go learn SQL with this SQL tutorial for Data Analytics.
This tutorial covers SQL topics like ordering data and LEAD/LAG window functions – both of these show up frequently during SQL job interviews at Embracer Group.
In addition to SQL query questions, the other question categories covered in the Embracer Group Data Science Interview are:
To prepare for Embracer Group Data Science interviews read the book Ace the Data Science Interview because it's got: