At Nintendo, SQL is often used to analyze user gameplay data for insights into player behavior, and to manage the transactional database systems that track digital and physical product sales. That's why Nintendo frequently asks SQL problems during interviews for Data Analytics, Data Science, and Data Engineering jobs.
As such, to help you practice for the Nintendo SQL interview, we've curated 10 Nintendo SQL interview questions – can you solve them?
You work as a data analyst at Nintendo. Recently, your manager asked you to analyze the monthly average review score for each game sold by the company. The dataset available to you consists of a table named "reviews". The "reviews" table has 5 columns: "review_id" (a unique identifier for each review), "user_id" (the ID of the user who wrote the review), "submit_date" (the date the review was submitted), "product_id" (the ID of the product being reviewed), and "stars" (the number of stars the product was given, ranging from 1 to 5).
You need to aggregate this data to find the monthly average review score for each product_id for the year 2022. The score should be rounded to two decimal places.
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2022-06-08 | 50001 | 4 |
7802 | 265 | 2022-06-10 | 69852 | 4 |
5293 | 362 | 2022-06-18 | 50001 | 3 |
6352 | 192 | 2022-07-26 | 69852 | 3 |
4517 | 981 | 2022-07-05 | 69852 | 2 |
month | product_id | avg_stars |
---|---|---|
6 | 50001 | 3.50 |
6 | 69852 | 4.00 |
7 | 69852 | 2.50 |
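A possible PostgreSQL solution, using the reviews table described above:

```sql
SELECT
  EXTRACT(MONTH FROM submit_date) AS month,
  product_id,
  ROUND(AVG(stars), 2) AS avg_stars
FROM reviews
WHERE EXTRACT(YEAR FROM submit_date) = 2022
GROUP BY month, product_id
ORDER BY month, product_id;
```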
In this SQL query, we use PostgreSQL's EXTRACT function to grab the month and year from the submit_date column. We calculate the average using the AVG function and round it to two decimal places with ROUND. We GROUP BY the derived month and the product_id to get the average score per month for each product. The resulting data is sorted by month and product_id.
p.s. SQL Window functions show up super often during SQL interviews, so practice the 27+ window function questions on DataLemur
Given a database that contains information on the sales of games developed by Nintendo, the database comprises three tables: one for games, one for sales records, and one for regions (sample data below).
Design SQL queries to find the top 5 selling games of 2020, and to break down each of those games' sales by region.
Sample database:
game_id | game_name | release_date |
---|---|---|
001 | Super Mario Odyssey | 10/27/2017 |
002 | The Legend of Zelda: Breath of the Wild | 03/03/2017 |
003 | Pokemon Let's Go | 11/16/2018 |
004 | Luigi's Mansion 3 | 10/31/2019 |
005 | Animal Crossing: New Horizons | 03/20/2020 |
record_id | game_id | units_sold | sale_date | region_id |
---|---|---|---|---|
1001 | 001 | 50000 | 01/03/2020 | 01 |
2002 | 002 | 75000 | 02/10/2020 | 02 |
3003 | 003 | 120000 | 04/10/2020 | 02 |
4004 | 004 | 85000 | 06/20/2020 | 03 |
5005 | 005 | 265000 | 03/30/2020 | 01 |
region_id | region_name | population |
---|---|---|
01 | North America | 579000000 |
02 | Europe | 747600000 |
03 | Asia | 4601375000 |
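One possible approach in PostgreSQL, assuming the three tables are named games, sales, and regions (these names are inferred from the sample data, and sale_date is assumed to be stored as a DATE):

```sql
-- Query 1: the top 5 selling games in 2020
SELECT g.game_name,
       SUM(s.units_sold) AS total_units_sold
FROM sales s
JOIN games g ON g.game_id = s.game_id
WHERE EXTRACT(YEAR FROM s.sale_date) = 2020
GROUP BY g.game_name
ORDER BY total_units_sold DESC
LIMIT 5;

-- Query 2: regional breakdown for those same top 5 games
WITH top_games AS (
  SELECT s.game_id
  FROM sales s
  WHERE EXTRACT(YEAR FROM s.sale_date) = 2020
  GROUP BY s.game_id
  ORDER BY SUM(s.units_sold) DESC
  LIMIT 5
)
SELECT g.game_name,
       r.region_name,
       SUM(s.units_sold) AS units_sold_in_region
FROM sales s
JOIN top_games t ON t.game_id = s.game_id
JOIN games g ON g.game_id = s.game_id
JOIN regions r ON r.region_id = s.region_id
WHERE EXTRACT(YEAR FROM s.sale_date) = 2020
GROUP BY g.game_name, r.region_name
ORDER BY g.game_name, r.region_name;
```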
Together, these queries show how Nintendo games performed in 2020: the first returns the top 5 selling games of the year, and the second gives a breakdown by region for each of those top 5 games.
The COALESCE() function returns the first non-NULL value from a list of values. This function is often used to replace a NULL with some default value, so that you can then take an AVG() or SUM() of some column without NULLs messing things up.
For example, suppose you ran a customer satisfaction survey for Nintendo and had statements like "I'd buy from Nintendo again". In the survey, customers would then answer how strongly they agreed with a statement on a scale of 1 to 5 (strongly disagree, disagree, neutral, agree, strongly agree).
Because attention spans are short, many customers skipped many of the questions, and thus our survey data might be filled with NULLs:
customer_id | question_id | agree_scale |
---|---|---|
101 | 1 | 4 |
101 | 2 | 5 |
202 | 1 | 4 |
202 | 2 | NULL |
303 | 1 | 5 |
303 | 2 | NULL |
Before doing further analytics on this customer survey data, you could replace the NULLs in the agree_scale column with the value 3 (because that corresponds to the default 'neutral' answer) using the COALESCE() function:
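A sketch of that query, assuming the survey results live in a table called survey_responses (the table name is illustrative):

```sql
SELECT
  customer_id,
  question_id,
  COALESCE(agree_scale, 3) AS agree_scale
FROM survey_responses;
```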
This would result in the following:
customer_id | question_id | agree_scale |
---|---|---|
101 | 1 | 4 |
101 | 2 | 5 |
202 | 1 | 4 |
202 | 2 | 3 |
303 | 1 | 5 |
303 | 2 | 3 |
Nintendo needs to filter down their customer records to obtain a list of users who have purchased their products, specifically gamers who play "Super Mario" and "Legend of Zelda". Additionally, they want to obtain the total spend of these gamers based on all their purchases.
For this exercise, let's suppose we have two tables – purchases and customers. The purchases table contains information about each purchase, including the user ID, product name, and cost, and the customers table stores information about each user, such as user ID and user name.
user_id | product_name | cost |
---|---|---|
1001 | Super Mario | 59.99 |
1023 | Legend of Zelda | 59.99 |
1001 | Animal Crossing | 59.99 |
1045 | Super Mario | 59.99 |
1001 | Legend of Zelda | 59.99 |
user_id | user_name |
---|---|
1001 | John Doe |
1023 | Jane Doe |
1045 | Harry Potter |
We want to filter down this data to a list of users who bought "Super Mario" or "Legend of Zelda" and the summation of their total purchases.
The expected output should resemble:
user_name | total_spent |
---|---|
John Doe | 179.97 |
Jane Doe | 59.99 |
Harry Potter | 59.99 |
We could extract the desired information using a combination of SQL JOIN, WHERE, and GROUP BY:
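One way to write it in PostgreSQL (the subquery finds users who bought either title, while the outer query sums all of those users' purchases, which is what the expected output reflects):

```sql
SELECT c.user_name,
       SUM(p.cost) AS total_spent
FROM purchases p
JOIN customers c ON c.user_id = p.user_id
WHERE p.user_id IN (
  SELECT user_id
  FROM purchases
  WHERE product_name IN ('Super Mario', 'Legend of Zelda')
)
GROUP BY c.user_name;
```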
This query joins the purchases and customers tables on user ID, restricts the results to users who purchased "Super Mario" or "Legend of Zelda", then groups all of those users' purchases by user name. The SUM() function calculates the total amount spent by each user.
A foreign key is a field in a database table that links to the primary key of another table, establishing a connection between the two tables. For example, let's analyze Nintendo's Google Ads campaigns data:
```
+-------+-------------+-----------------------+-------------+
| ad_id | campaign_id | keyword               | click_count |
+-------+-------------+-----------------------+-------------+
| 1     | 201         | Nintendo reviews      | 120         |
| 2     | 202         | Nintendo pricing      | 150         |
| 3     | 101         | buy Nintendo          | 65          |
| 4     | 101         | Nintendo alternatives | 135         |
+-------+-------------+-----------------------+-------------+
```
Here, campaign_id is a foreign key that references the primary key of the corresponding Google Ads campaign. This establishes a relationship between the ads and their campaigns, enabling easy querying to find which ads belong to a specific campaign or which campaign a specific ad belongs to.
The ads table may also have multiple foreign keys that reference primary keys in other tables. For example, ad_group_id and account_id foreign keys could be used to link each ad to its ad group and to the Google Ads account that the campaigns belong to, respectively.
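As a rough sketch of what such a schema could look like in PostgreSQL (the table and column names here are illustrative, not Nintendo's actual schema):

```sql
CREATE TABLE campaigns (
  campaign_id   INT PRIMARY KEY,
  campaign_name TEXT
);

CREATE TABLE google_ads (
  ad_id       INT PRIMARY KEY,
  -- foreign key referencing the campaigns table's primary key
  campaign_id INT REFERENCES campaigns (campaign_id),
  keyword     TEXT,
  click_count INT
);
```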
Nintendo, a renowned video game company, releases digital ads for its video games. For each of these ads, Nintendo tracks the number of times an ad was served (impressions) and the number of times it was clicked (clicks). They want to calculate the Click-Through-Rate (CTR) for each game ad, which is the ratio of clicks to impressions.
Calculate the CTR for each game's ads for the month of August.
ad_id | game_id | ad_served_date | impressions | clicks |
---|---|---|---|---|
101 | 3001 | 08/01/2022 00:00:00 | 5000 | 450 |
102 | 3002 | 08/02/2022 00:00:00 | 4500 | 300 |
103 | 3001 | 08/03/2022 00:00:00 | 5200 | 500 |
104 | 3003 | 08/04/2022 00:00:00 | 3000 | 250 |
105 | 3002 | 08/05/2022 00:00:00 | 4800 | 350 |
game_id | ctr |
---|---|
3001 | 0.093137255 |
3002 | 0.069892473 |
3003 | 0.083333333 |
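A possible PostgreSQL query, assuming the ad data lives in a table named game_ads (the table name is an assumption):

```sql
SELECT
  game_id,
  SUM(clicks)::DECIMAL / SUM(impressions) AS ctr
FROM game_ads
WHERE EXTRACT(MONTH FROM ad_served_date) = 8
GROUP BY game_id;
```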
This query first filters for rows where the ad_served_date falls in August. Then, for each game (indicated by game_id), it sums up the total clicks and total impressions and calculates the ratio between them, which represents the CTR. This calculation is done within each group of rows that share the same game_id, as specified by the GROUP BY clause. The ::DECIMAL cast makes sure the division is not integer division.
To solve a related SQL problem on DataLemur's free online SQL code editor, try this Facebook SQL Interview question:
Database views are virtual tables based on the results of a SQL statement. You query them just like regular tables, but they don't store data themselves; instead, views let you present simplified versions of tables or hide sensitive data from certain users.
In PostgreSQL, you can create a view by using the CREATE VIEW command.
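For example, here's a hypothetical view built on the reviews table from earlier, exposing only high ratings (the view name and filter are illustrative):

```sql
CREATE VIEW top_rated_reviews AS
SELECT review_id, product_id, submit_date, stars
FROM reviews
WHERE stars >= 4;
```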
As a Data Analyst for Nintendo, you have been tasked with analyzing the customer database. Specifically, you are asked to find all of the active customers from the United States who are Nintendo Switch users. Customers are considered active if their status field is set to 'active'. We know that the info field records the main console used by each customer, and for Nintendo Switch users it contains the string 'Switch'.
Use the SQL LIKE keyword to perform your data filter and return the relevant customer fields.
customer_id | name | status | country | info |
---|---|---|---|---|
1001 | John Doe | active | United States | Switch user since 2018 |
1002 | Jane Smith | inactive | United States | Switch user since 2019 |
1003 | James Johnson | active | Canada | Wii user |
1004 | Patricia Williams | active | United States | Gamecube user |
1005 | Robert Jones | active | United States | Switch user since 2020 |
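A possible query, assuming the table is named customers and that the fields to return are customer_id, name, and country (both assumptions, since the problem doesn't name them explicitly):

```sql
SELECT customer_id, name, country
FROM customers
WHERE status = 'active'
  AND country = 'United States'
  AND info LIKE '%Switch%';
```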
This query selects the relevant fields from the customer table where the status field is 'active', the country field is 'United States', and the info field contains the string 'Switch'. The % wildcard matches any sequence of characters, so the LIKE pattern matches even if there are other characters before or after 'Switch' in the info field.
Given a table that stores data about Nintendo's game play sessions, with columns session_id, user_id, game_id, and play_time; and a games table with columns game_id, game_name, and original_price. Nintendo decides to offer users a discount on games based on their play time. The discount is calculated as follows: if a user's total play time on a game is X hours, the discount for that user is SQRT(X)% of the original game price. Write a PostgreSQL query to calculate the final price for each user for each game.
session_id | user_id | game_id | play_time |
---|---|---|---|
201 | 54 | 1001 | 5.2 |
202 | 54 | 1002 | 3.6 |
203 | 78 | 1003 | 8.1 |
204 | 92 | 1003 | 4.5 |
205 | 92 | 1002 | 6.5 |
game_id | game_name | original_price |
---|---|---|
1001 | Super Mario Odyssey | 59.99 |
1002 | The Legend of Zelda: Breath of the Wild | 69.99 |
1003 | Animal Crossing: New Horizons | 59.99 |
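A sketch of one PostgreSQL solution, assuming the two tables are named sessions and games (the names are inferred from the sample data):

```sql
SELECT
  s.user_id,
  g.game_name,
  ROUND(
    (g.original_price * (1 - SQRT(SUM(s.play_time)) / 100))::NUMERIC,
    2
  ) AS final_price
FROM sessions s
JOIN games g ON g.game_id = s.game_id
GROUP BY s.user_id, g.game_id, g.game_name, g.original_price;
```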
This query first joins the sessions and games tables on the game_id column. It then groups by user_id and game_id to calculate each user's total play_time per game. The discount is calculated by taking the square root of that summed play_time, and the final price is calculated by subtracting that percentage discount from the original price of each game for each user. The final price is rounded to two decimal places (cents).
To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for calculating based on quantities or this Stripe Repeated Payments Question which is similar for dealing with repeated transactions.
A cross join is a type of JOIN operation in SQL that creates a new table by combining each row from the first table with every row from the second table. It is also known as a cartesian join.
For example, say you worked on the Marketing Analytics team at Nintendo, and needed to understand what advertising copy (text) goes best with what advertising creative (the photo that gets posted with the advertisement copy).
Here is an example of a cross-join between a table of ad_copy and a table of ad_creative:
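A minimal sketch (the column names here are hypothetical, since the tables' schemas aren't given):

```sql
SELECT ad_copy.copy_text,
       ad_creative.image_url
FROM ad_copy
CROSS JOIN ad_creative;
```

Every piece of copy gets paired with every creative, so the result has (rows in ad_copy) × (rows in ad_creative) rows.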
A natural join, on the other hand, is a type of JOIN that combines rows from two or more tables based on their common columns. It is called a "natural" join because it is based on the natural relationship that exists between the common columns in the tables being joined.
For an example of each one, say you had sales data exported from Nintendo's Salesforce CRM stored in a data warehouse, which had two tables: sales and customers.
An INNER JOIN on the common customer_id column (the join a NATURAL JOIN would perform automatically here) combines the two tables:
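For instance (assuming both tables share a customer_id column):

```sql
SELECT *
FROM sales
INNER JOIN customers
  ON sales.customer_id = customers.customer_id;
```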
This query will return only the rows from the sales and customers tables that have matching customer_id values.
One main difference between cross joins and natural joins is that cross joins do not require any common columns between the tables being joined, while natural joins do. Another difference is that cross joins can create very large tables if the input tables have a large number of rows, while natural joins will only return a table with the number of rows equal to the number of matching rows in the input tables.
The key to acing a Nintendo SQL interview is to practice, practice, and then practice some more! In addition to solving the above Nintendo SQL interview questions, you should also solve the 200+ SQL questions on DataLemur which come from companies like Amazon, Microsoft, Meta, and smaller tech companies.
Each problem on DataLemur has multiple hints, step-by-step solutions, and best of all, there's an interactive SQL code editor so you can write your query right in the browser and have it graded.
To prep for the Nintendo SQL interview, it can also be useful to practice SQL problems from other tech companies like:
In case your SQL foundations are weak, don't worry about diving straight into solving questions – strengthen your SQL foundations with this interactive SQL tutorial.
This tutorial covers SQL topics like finding NULLs and WHERE vs. HAVING – both of which show up frequently during Nintendo SQL assessments.
Besides SQL interview questions, the other types of problems to prepare for the Nintendo Data Science Interview are:
To prepare for Nintendo Data Science interviews, read the book Ace the Data Science Interview because it's got: