
13 Google SQL Interview Questions (Updated 2024)

Updated on January 22, 2024

Google almost always asks SQL problems during interviews for Data Analytics, Data Science, and Data Engineering jobs. To help you land your dream job in data at Google, practice these 13 REAL Google SQL interview questions which we've curated and solved for you.

While these problems are a grind to practice, preparing for Google is SO worth it, and I'm speaking from experience – I interned as a Google Data Engineer in 2016 and it was a BLAST:

Nick Singh Google Nest Data Engineer Intern

13 Google SQL Interview Questions

SQL Question 1: Identify Most Active Google Search Users

The Google Search team wants to identify their 'power users', or VIP users, who perform a lot of searches. Power users are those who have conducted more than 500 searches in the past month. Write a SQL query to find the user IDs and number of searches in the last month for these power users.

We assume two tables: one listing users and one listing their searches.

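Example Input:

| user_id | user_name |
|---------|-----------|
| 123     | Alice     |
| 265     | Bob       |
| 362     | Charlie   |
| 192     | David     |
| 981     | Eve       |

Example Input:

| search_id | user_id | search_date         |
|-----------|---------|---------------------|
| 001       | 123     | 06/08/2024 00:00:00 |
| 002       | 265     | 06/10/2024 00:00:00 |
| 003       | 362     | 06/18/2024 00:00:00 |
| 004       | 192     | 07/26/2024 00:00:00 |
| 005       | 981     | 07/05/2024 00:00:00 |
| 006       | 123     | 06/08/2024 00:00:01 |
| 007       | 192     | 06/18/2024 00:00:01 |
| ...       | ...     | ...                 |
| 600       | 123     | 07/28/2024 00:00:00 |
| 601       | 123     | 07/29/2024 00:00:00 |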

In the above example, assume user '123' has made over 500 searches in the time period indicated, so we'd return that user along with their number of searches.

Answer:


This query joins the users and searches tables on user_id. It filters the search records to only include those made in the last month. It then groups by user, counting the number of searches. The HAVING clause is used to filter the grouped records, showing only users who have made more than 500 searches.
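Here is a minimal sketch of that approach, run against an in-memory SQLite database from Python so you can try it locally. The table names `users` and `searches`, the synthetic rows, and the reading of "last month" as "the trailing month from now" are all assumptions for illustration; date functions differ in PostgreSQL and other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE searches (
        search_id INTEGER PRIMARY KEY,
        user_id INTEGER,
        search_date TEXT
    );
    INSERT INTO users VALUES (123, 'Alice'), (265, 'Bob');
""")

# Synthetic data (hypothetical): 501 searches for Alice, 3 for Bob,
# all stamped one day ago so they fall inside the trailing month.
rows = [(123,)] * 501 + [(265,)] * 3
conn.executemany(
    "INSERT INTO searches (user_id, search_date) "
    "VALUES (?, datetime('now', '-1 day'))",
    rows,
)

POWER_USERS_SQL = """
    SELECT u.user_id, COUNT(s.search_id) AS num_searches
    FROM users AS u
    JOIN searches AS s ON s.user_id = u.user_id
    WHERE s.search_date >= datetime('now', '-1 month')
    GROUP BY u.user_id
    HAVING COUNT(s.search_id) > 500
"""

power_users = conn.execute(POWER_USERS_SQL).fetchall()
print(power_users)  # [(123, 501)]
```

Note that the threshold lives in HAVING rather than WHERE because it applies to the aggregated count, not to individual rows.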

If this problem seemed easy, just know that it's more of a warm-up type problem!

SQL Question 2: Odd & Even Measurements

Assume you're given a table with measurement values obtained from a Google sensor over multiple days with measurements taken multiple times within each day.

Write a query to calculate the sum of odd-numbered and even-numbered measurements separately for a particular day and display the results in two different columns.

Example Input:

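| measurement_id | measurement_value | measurement_time    |
|----------------|-------------------|---------------------|
| 131233         | 1109.51           | 07/10/2024 09:00:00 |
| 135211         | 1662.74           | 07/10/2024 11:00:00 |
| 523542         | 1246.24           | 07/10/2024 13:15:00 |
| 143562         | 1124.50           | 07/11/2024 15:00:00 |
| 346462         | 1234.14           | 07/11/2024 16:45:00 |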

Example Output:

| measurement_day     | odd_sum | even_sum |
|---------------------|---------|----------|
| 07/10/2024 00:00:00 | 2355.75 | 1662.74  |
| 07/11/2024 00:00:00 | 1124.50 | 1234.14  |

Example Explanation

Based on the results,

  • On 07/10/2024, the sum of the odd-numbered measurements is 2355.75, while the sum of the even-numbered measurements is 1662.74.
  • On 07/11/2024, there are only two measurements available. The sum of the odd-numbered measurements is 1124.50, and the sum of the even-numbered measurements is 1234.14.

Before we reveal the solution, try this Google SQL interview question yourself on DataLemur.

Answer:
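One possible approach, sketched here with SQLite through Python: number each day's measurements by time with ROW_NUMBER(), then sum the odd and even positions with conditional aggregation. The table name `measurements` is an assumption, and the timestamps are stored in ISO format so SQLite's `date()` can read them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurements (
        measurement_id INTEGER,
        measurement_value REAL,
        measurement_time TEXT
    );
    INSERT INTO measurements VALUES
        (131233, 1109.51, '2024-07-10 09:00:00'),
        (135211, 1662.74, '2024-07-10 11:00:00'),
        (523542, 1246.24, '2024-07-10 13:15:00'),
        (143562, 1124.50, '2024-07-11 15:00:00'),
        (346462, 1234.14, '2024-07-11 16:45:00');
""")

ODD_EVEN_SQL = """
    WITH ranked AS (
        SELECT
            date(measurement_time) AS measurement_day,
            measurement_value,
            -- Position of each measurement within its day, ordered by time.
            ROW_NUMBER() OVER (
                PARTITION BY date(measurement_time)
                ORDER BY measurement_time
            ) AS measurement_num
        FROM measurements
    )
    SELECT
        measurement_day,
        ROUND(SUM(CASE WHEN measurement_num % 2 = 1
                       THEN measurement_value ELSE 0 END), 2) AS odd_sum,
        ROUND(SUM(CASE WHEN measurement_num % 2 = 0
                       THEN measurement_value ELSE 0 END), 2) AS even_sum
    FROM ranked
    GROUP BY measurement_day
    ORDER BY measurement_day
"""

odd_even = conn.execute(ODD_EVEN_SQL).fetchall()
for row in odd_even:
    print(row)
```

"Odd" and "even" here refer to the order of the measurement within the day (1st, 2nd, 3rd), not to the measurement_id or the value itself.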


SQL Question 3: Google Maps Flagged UGC

As a Data Analyst on the Google Maps User Generated Content team, you and your Product Manager are investigating user-generated content (UGC) – photos and reviews that independent users upload to Google Maps.

Write a query to determine which type of place (place_category) attracts the most UGC tagged as "off-topic". In the case of a tie, show the output in ascending order of place_category.

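Example Input:

| place_id | place_name | place_category |
|----------|------------|----------------|
| 1        | Baar Baar  | Restaurant     |
| 2        | Rubirosa   | Restaurant     |
| 3        | Mr. Purple | Bar            |
| 4        | La Caverna | Bar            |

Example Input:

| content_id | place_id | content_tag    |
|------------|----------|----------------|
| 101        | 1        | Off-topic      |
| 110        | 2        | Misinformation |
| 153        | 2        | Off-topic      |
| 176        | 3        | Harassment     |
| 190        | 3        | Off-topic      |

Example Output:

| off_topic_places |
|------------------|
| Restaurant       |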

The restaurants (Baar Baar and Rubirosa) have a total of 2 UGC posts tagged as "off-topic". The bars only have 1. Restaurant is shown here because it's the type of place with the most UGC tagged as "off-topic".

Answer:


DataLemur Premium users can unlock multiple hints and the full solution to this Google SQL interview problem (Google Maps UGC).
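If you'd like something runnable to compare against, here is one way the problem could be sketched in SQLite via Python. The table names `places` and `maps_ugc` are assumptions, and this is only one of several reasonable approaches (a RANK() window function would work too). It counts off-topic posts per category and keeps every category that ties for the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE places (place_id INTEGER, place_name TEXT, place_category TEXT);
    CREATE TABLE maps_ugc (content_id INTEGER, place_id INTEGER, content_tag TEXT);
    INSERT INTO places VALUES
        (1, 'Baar Baar', 'Restaurant'),
        (2, 'Rubirosa', 'Restaurant'),
        (3, 'Mr. Purple', 'Bar'),
        (4, 'La Caverna', 'Bar');
    INSERT INTO maps_ugc VALUES
        (101, 1, 'Off-topic'),
        (110, 2, 'Misinformation'),
        (153, 2, 'Off-topic'),
        (176, 3, 'Harassment'),
        (190, 3, 'Off-topic');
""")

OFF_TOPIC_SQL = """
    WITH off_topic AS (
        SELECT p.place_category, COUNT(*) AS num_posts
        FROM places AS p
        JOIN maps_ugc AS u ON u.place_id = p.place_id
        WHERE LOWER(u.content_tag) = 'off-topic'
        GROUP BY p.place_category
    )
    SELECT place_category AS off_topic_places
    FROM off_topic
    -- Keep every category tied for the highest count.
    WHERE num_posts = (SELECT MAX(num_posts) FROM off_topic)
    ORDER BY place_category
"""

top_categories = conn.execute(OFF_TOPIC_SQL).fetchall()
print(top_categories)  # [('Restaurant',)]
```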

SQL Question 4: Determine the Most Popular Google Search Category

For this scenario, assume that Google wants to analyze the top searched categories on their platform to optimize their search results. We have two tables: one with information about each search, and one where every category ID is associated with a category name.

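The table of searches has the following structure:

Example Input:

| search_id | user_id | search_date         | category_id | query             |
|-----------|---------|---------------------|-------------|-------------------|
| 1001      | 7654    | 06/01/2024 00:00:00 | 3001        | "chicken recipe"  |
| 1002      | 2346    | 06/02/2024 00:00:00 | 3001        | "vegan meal prep" |
| 1003      | 8765    | 06/03/2024 00:00:00 | 2001        | "google stocks"   |
| 1004      | 9871    | 07/01/2024 00:00:00 | 1001        | "python tutorial" |
| 1005      | 8760    | 07/02/2024 00:00:00 | 2001        | "tesla stocks"    |

The table of categories has the following structure:

Example Input:

| category_id | category_name           |
|-------------|-------------------------|
| 1001        | "Programming Tutorials" |
| 2001        | "Stock Market"          |
| 3001        | "Recipes"               |
| 4001        | "Sports News"           |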

The question is: Can you write a SQL query that gives the total count of searches made in each category by month for the available data in the year 2024?

Example Output:

| category_name           | month | total_searches |
|-------------------------|-------|----------------|
| "Programming Tutorials" | 07    | 1              |
| "Stock Market"          | 06    | 1              |
| "Stock Market"          | 07    | 1              |
| "Recipes"               | 06    | 2              |

Answer:

You can use a SQL window function to solve this:


This query first joins the two tables, and then partitions the results by category and the month of the search_date. The windowed COUNT counts the number of searches in each partition. The WHERE filter ensures the data used is from the year 2024.

Finally, we order the results by the search count in descending order to get the categories with the most searches by month at the top.
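A sketch of that window-function approach in SQLite via Python follows. The table names `searches` and `categories` are assumptions, the query-text column is left out because the count doesn't need it, and dates are stored in ISO format so `strftime` can extract the month and year. A plain GROUP BY on category and month would return the same rows; the windowed version is shown because it is the one described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE searches (
        search_id INTEGER, user_id INTEGER,
        search_date TEXT, category_id INTEGER
    );
    CREATE TABLE categories (category_id INTEGER, category_name TEXT);
    INSERT INTO searches VALUES
        (1001, 7654, '2024-06-01 00:00:00', 3001),
        (1002, 2346, '2024-06-02 00:00:00', 3001),
        (1003, 8765, '2024-06-03 00:00:00', 2001),
        (1004, 9871, '2024-07-01 00:00:00', 1001),
        (1005, 8760, '2024-07-02 00:00:00', 2001);
    INSERT INTO categories VALUES
        (1001, 'Programming Tutorials'),
        (2001, 'Stock Market'),
        (3001, 'Recipes'),
        (4001, 'Sports News');
""")

MONTHLY_SQL = """
    SELECT DISTINCT
        c.category_name,
        strftime('%m', s.search_date) AS month,
        -- Count of searches sharing this row's category and month.
        COUNT(*) OVER (
            PARTITION BY c.category_name, strftime('%m', s.search_date)
        ) AS total_searches
    FROM searches AS s
    JOIN categories AS c ON c.category_id = s.category_id
    WHERE strftime('%Y', s.search_date) = '2024'
    ORDER BY total_searches DESC, c.category_name, month
"""

monthly_counts = conn.execute(MONTHLY_SQL).fetchall()
for row in monthly_counts:
    print(row)
# ('Recipes', '06', 2)
# ('Programming Tutorials', '07', 1)
# ('Stock Market', '06', 1)
# ('Stock Market', '07', 1)
```

The DISTINCT is needed because a window function keeps one output row per input row, so without it "Recipes" would appear twice.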

SQL Question 5: What is database denormalization?

Denormalization is the process of modifying a database schema in a way that deviates from the typical rules of normalization (1NF, 2NF, 3NF, etc.).

Denormalization is often used to improve the performance of a database, particularly when it is being used for reporting and analytical purposes (rather than as an Online Transaction Processing (OLTP) system).

By duplicating data, denormalization can reduce the number of expensive joins required to retrieve data, which can improve query performance. However, denormalization can also cause problems such as increased data redundancy and the need for more complex update and delete operations.
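To make the trade-off concrete, here is a small illustrative sketch using SQLite from Python. Every table and column name in it (`customers`, `orders`, `orders_report`) is hypothetical. It shows a denormalized reporting table answering the same question without a join, and then the price you pay when the duplicated value changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: each customer name is stored exactly once.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);

    -- Denormalized copy: customer_name is duplicated onto every order row.
    CREATE TABLE orders_report AS
    SELECT o.order_id, o.customer_id, c.customer_name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id;
""")

with_join = conn.execute("""
    SELECT c.customer_name, SUM(o.amount)
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.customer_name
    ORDER BY c.customer_name
""").fetchall()

without_join = conn.execute("""
    SELECT customer_name, SUM(amount)
    FROM orders_report
    GROUP BY customer_name
    ORDER BY customer_name
""").fetchall()

print(with_join == without_join)  # True

# The cost: a rename in the source table leaves the duplicated rows stale
# until every one of them is updated as well.
conn.execute("UPDATE customers SET customer_name = 'Ada L.' WHERE customer_id = 1")
stale_rows = conn.execute(
    "SELECT COUNT(*) FROM orders_report WHERE customer_name = 'Ada'"
).fetchone()[0]
print(stale_rows)  # 2
```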

SQL Question 6: Filter Google Ads by Relevant Details

As a data analyst at Google, you are tasked with examining the Google Ads data for better ad placement and customer targeting. You are asked to retrieve all records of ads from the database that meet the following conditions:

  1. The 'status' of the ad is 'active'.
  2. The 'impressions' is greater than 500,000.
  3. The ad was 'last_updated' in the year 2024.

Your task is to write a query to filter down the records following these conditions.

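Example Input:

| ad_id | name          | status   | impressions | last_updated        |
|-------|---------------|----------|-------------|---------------------|
| 1234  | Google Phone  | active   | 600000      | 06/25/2024 12:00:00 |
| 5678  | Google Laptop | inactive | 800000      | 05/18/2024 12:00:00 |
| 9012  | Google App    | active   | 300000      | 04/02/2024 12:00:00 |
| 3456  | Google Cloud  | active   | 700000      | 08/12/2024 12:00:00 |
| 7890  | Google Mail   | inactive | 550000      | 09/03/2024 12:00:00 |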

Answer:


The query first identifies rows where the status value is 'active', then further filters the result to show only those rows with 'impressions' greater than 500,000, and finally filters on the year of last_updated.
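The three conditions map directly onto a WHERE clause joined with AND. Below is a sketch in SQLite via Python; the table name `ads` is an assumption, and the year check uses SQLite's `strftime` on ISO-formatted dates (in PostgreSQL you would more likely reach for EXTRACT or a date range).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads (
        ad_id INTEGER, name TEXT, status TEXT,
        impressions INTEGER, last_updated TEXT
    );
    INSERT INTO ads VALUES
        (1234, 'Google Phone',  'active',   600000, '2024-06-25 12:00:00'),
        (5678, 'Google Laptop', 'inactive', 800000, '2024-05-18 12:00:00'),
        (9012, 'Google App',    'active',   300000, '2024-04-02 12:00:00'),
        (3456, 'Google Cloud',  'active',   700000, '2024-08-12 12:00:00'),
        (7890, 'Google Mail',   'inactive', 550000, '2024-09-03 12:00:00');
""")

FILTER_SQL = """
    SELECT ad_id, name, status, impressions, last_updated
    FROM ads
    WHERE status = 'active'
      AND impressions > 500000
      AND strftime('%Y', last_updated) = '2024'
    ORDER BY ad_id
"""

matching_ads = conn.execute(FILTER_SQL).fetchall()
print([row[1] for row in matching_ads])  # ['Google Phone', 'Google Cloud']
```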

SQL Question 7: What do stored procedures do?

Stored procedures are a lot like functions in programming. They're used to encapsulate and organize business logic into one unit of code, and they can accept multiple input parameters and return multiple output values.

For example, if you were a Data Analyst at Google working on an HR analytics project, you might create a stored procedure to calculate the average salary for a given department:


To call this stored procedure and find the average salary for the Data Analytics department you'd write the following query:


SQL Question 8: Median Google Search Frequency

Google's marketing team is making a Superbowl commercial and needs a simple statistic to put on their TV ad: the median number of searches a person made last year.

However, at Google scale, querying the 2 trillion searches is too costly. Luckily, you have access to the summary table which tells you the number of searches made last year and how many Google users fall into that bucket.

Write a query to report the median of searches made by a user. Round the median to one decimal point.


Before we reveal the answer, please try coding up the Google Median Frequency problem yourself on DataLemur's SQL Interview platform.

SQL Question 9: What do the EXCEPT / MINUS operators do, and can you give an example?

The EXCEPT / MINUS operator is used to return all rows from the first SELECT statement that are not returned by the second SELECT statement.

Note that EXCEPT is available in PostgreSQL and SQL Server (and in MySQL from version 8.0.31), while MINUS is Oracle's name for the same operation (but don't stress about knowing which DBMS supports what exact commands since the interviewers at Google should be lenient!).
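As a quick illustration, here is EXCEPT running in SQLite through Python. The `signups` and `purchasers` tables are made up for this example; the query returns users who signed up but never purchased.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables for illustration only.
    CREATE TABLE signups (user_id INTEGER);
    CREATE TABLE purchasers (user_id INTEGER);
    INSERT INTO signups VALUES (1), (2), (3), (4);
    INSERT INTO purchasers VALUES (2), (4);
""")

EXCEPT_SQL = """
    SELECT user_id FROM signups
    EXCEPT
    SELECT user_id FROM purchasers
    ORDER BY user_id
"""

never_purchased = conn.execute(EXCEPT_SQL).fetchall()
print(never_purchased)  # [(1,), (3,)]
```

Like UNION, EXCEPT compares whole rows and removes duplicates, so both SELECT statements need the same number of columns with compatible types.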

SQL Question 10: Assessing Google Ad Click-Through and Conversion Rates

As a data analyst on Google Shopping, one of your tasks is to monitor the efficiency of various Google Shopping ads. Specifically, you are interested in the click-through rate (CTR) and conversion rate (each click that results in placing an item into the shopping cart).

Given two data tables, one recording ad clicks and one recording cart additions, write a SQL query to assess the click-through rate (CTR) and conversion rate for each ad.

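Example Input:

| ad_id | user_id | click_date |
|-------|---------|------------|
| 1001  | 123     | 06/08/2024 |
| 1002  | 265     | 06/10/2024 |
| 1001  | 362     | 06/18/2024 |
| 1003  | 192     | 07/26/2024 |
| 1002  | 981     | 07/05/2024 |

Example Input:

| ad_id | user_id | cart_date  |
|-------|---------|------------|
| 1001  | 123     | 06/08/2024 |
| 1003  | 192     | 07/26/2024 |
| 1002  | 265     | 06/11/2024 |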

Answer:


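Above we join the clicks and cart tables on ad_id and user_id, group by ad_id, and compute the counts of distinct user_ids in both tables.

The conversion rate is calculated as the ratio of users who added an item to their cart to users who clicked the ad!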
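Here is how the conversion-rate part could be sketched in SQLite via Python. The table names `ad_clicks` and `cart_adds` are assumptions. The sample data contains no impression counts, so this sketch computes only the conversion rate; a true CTR (clicks divided by impressions) would need an impressions source that isn't shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ad_clicks (ad_id INTEGER, user_id INTEGER, click_date TEXT);
    CREATE TABLE cart_adds (ad_id INTEGER, user_id INTEGER, cart_date TEXT);
    INSERT INTO ad_clicks VALUES
        (1001, 123, '2024-06-08'),
        (1002, 265, '2024-06-10'),
        (1001, 362, '2024-06-18'),
        (1003, 192, '2024-07-26'),
        (1002, 981, '2024-07-05');
    INSERT INTO cart_adds VALUES
        (1001, 123, '2024-06-08'),
        (1003, 192, '2024-07-26'),
        (1002, 265, '2024-06-11');
""")

CONVERSION_SQL = """
    SELECT
        c.ad_id,
        COUNT(DISTINCT c.user_id) AS clickers,
        COUNT(DISTINCT a.user_id) AS cart_adders,
        -- Multiply by 1.0 to force real (not integer) division.
        ROUND(1.0 * COUNT(DISTINCT a.user_id)
                  / COUNT(DISTINCT c.user_id), 2) AS conversion_rate
    FROM ad_clicks AS c
    LEFT JOIN cart_adds AS a
        ON a.ad_id = c.ad_id AND a.user_id = c.user_id
    GROUP BY c.ad_id
    ORDER BY c.ad_id
"""

conversion = conn.execute(CONVERSION_SQL).fetchall()
for row in conversion:
    print(row)
# (1001, 2, 1, 0.5)
# (1002, 2, 1, 0.5)
# (1003, 1, 1, 1.0)
```

The LEFT JOIN matters: an inner join would drop clicks that never led to a cart addition and inflate every conversion rate to 100%.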

In case this marketing problem was interesting, check out this blog on how marketing analysts use SQL.

SQL Question 11: Google Ad Campaign Performance

As a data analyst on the advertiser solutions team at Google, your task is to analyze the performance of various ad campaigns running on Google AdWords for a F500 client. You were asked to find the average cost per click (CPC) for each campaign and each ad group within those campaigns for the previous month. CPC is calculated as the total cost of all clicks divided by the number of clicks.

For this task, you have been given access to a table that stores data about each click on the ads.

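Example Input:

| click_id | date       | campaign_id | ad_group_id | clicks | cost   |
|----------|------------|-------------|-------------|--------|--------|
| 4325     | 06/08/2024 | 1302        | 2001        | 50     | 100.00 |
| 4637     | 06/10/2024 | 1403        | 2002        | 65     | 130.00 |
| 4876     | 06/18/2024 | 1302        | 2001        | 70     | 140.00 |
| 4531     | 07/05/2024 | 1604        | 3001        | 80     | 200.00 |
| 4749     | 07/05/2024 | 1604        | 2002        | 75     | 180.00 |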

You'd like to return an output table in the following format:

Example Output:

| campaign_id | ad_group_id | avg_CPC |
|-------------|-------------|---------|
| 1302        | 2001        | 2.0     |
| 1403        | 2002        | 2.0     |
| 1604        | 3001        | 2.50    |
| 1604        | 2002        | 2.4     |

Answer:


This query returns the average cost per click (avg_CPC) for each campaign and ad group by grouping the table by campaign_id and ad_group_id and then calculating the average CPC as per the formula given, i.e., total cost divided by total number of clicks.
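A sketch of that grouping in SQLite via Python is below. The table name `ad_clicks` and the column name `click_date` (standing in for the `date` column) are assumptions, and the "previous month" filter is left out so that every sample row contributes. As a sanity check on the formula, campaign 1302 / ad group 2001 has total cost 100.00 + 140.00 = 240.00 over 50 + 70 = 120 clicks, so its CPC is 2.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ad_clicks (
        click_id INTEGER, click_date TEXT, campaign_id INTEGER,
        ad_group_id INTEGER, clicks INTEGER, cost REAL
    );
    INSERT INTO ad_clicks VALUES
        (4325, '2024-06-08', 1302, 2001, 50, 100.00),
        (4637, '2024-06-10', 1403, 2002, 65, 130.00),
        (4876, '2024-06-18', 1302, 2001, 70, 140.00),
        (4531, '2024-07-05', 1604, 3001, 80, 200.00),
        (4749, '2024-07-05', 1604, 2002, 75, 180.00);
""")

CPC_SQL = """
    SELECT
        campaign_id,
        ad_group_id,
        -- Total cost divided by total clicks for the group.
        ROUND(SUM(cost) / SUM(clicks), 2) AS avg_cpc
    FROM ad_clicks
    GROUP BY campaign_id, ad_group_id
    ORDER BY campaign_id, ad_group_id
"""

cpc_rows = conn.execute(CPC_SQL).fetchall()
for row in cpc_rows:
    print(row)
# (1302, 2001, 2.0)
# (1403, 2002, 2.0)
# (1604, 2002, 2.4)
# (1604, 3001, 2.5)
```

To restrict the result to the previous month, you could add a WHERE clause on the date column before grouping; the exact date arithmetic depends on your SQL dialect.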

p.s. If you have DataLemur Premium, you'll also get access to a similar Google Ad Campaign ROAS problem.

SQL Question 12: In database design, what do foreign keys do?

A foreign key is a column or group of columns in a table that refers to the primary key in another table. The foreign key constraint helps maintain referential integrity between the two tables.

Let's examine employee data from Google's HR database:

+-------------+------------+-----------+------------+
| employee_id | first_name | last_name | manager_id |
+-------------+------------+-----------+------------+
| 1           | Aubrey     | Graham    | 3          |
| 2           | Marshal    | Mathers   | 3          |
| 3           | Dwayne     | Carter    | 4          |
| 4           | Shawn      | Carter    |            |
+-------------+------------+-----------+------------+

In this table, employee_id serves as the primary key and manager_id functions as a foreign key because it links to the employee_id of the employee's manager. This establishes a relationship between Google employees and their managers, allowing for easy querying to find an employee's manager or see which employees report to a specific manager.

The table may also have multiple foreign keys that reference primary keys in other tables. For example, department and location foreign keys could be used to connect each employee to their respective department and location.
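To see the constraint in action, here is a small sketch in SQLite via Python. The table name `employees` is an assumption. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is a quirk of that engine rather than of foreign keys in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        -- Self-referencing foreign key: a manager is also an employee.
        manager_id INTEGER REFERENCES employees (employee_id)
    )
""")

# Managers are inserted before their reports so each reference already exists.
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", [
    (4, "Shawn", "Carter", None),
    (3, "Dwayne", "Carter", 4),
    (1, "Aubrey", "Graham", 3),
    (2, "Marshal", "Mathers", 3),
])

# Self-join through the foreign key to pair each employee with their manager.
reports = conn.execute("""
    SELECT e.first_name, m.first_name
    FROM employees AS e
    JOIN employees AS m ON m.employee_id = e.manager_id
    ORDER BY e.employee_id
""").fetchall()
print(reports)  # [('Aubrey', 'Dwayne'), ('Marshal', 'Dwayne'), ('Dwayne', 'Shawn')]

# Referential integrity: a manager_id that matches no employee is rejected.
try:
    conn.execute("INSERT INTO employees VALUES (5, 'Test', 'User', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```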

SQL Question 13: Analyze Android In-App Purchases

As a data analyst at Google on the Android PlayStore team, you are tasked with providing insights into in-app purchases made via the PlayStore.

Write a SQL query to get a list of customers along with their last purchase. The result should contain customer_id, first name, last name, product, and latest purchase date.


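Table:

| customer_id | first_name | last_name | app          |
|-------------|------------|-----------|--------------|
| 1           | John       | Doe       | Tinder       |
| 2           | Jane       | Smith     | CandyCrush   |
| 3           | Jack       | Brown     | Fortnite     |
| 4           | Emily      | Johnson   | Uber         |
| 5           | Jake       | Kenny     | Google Music |

Table:

| purchase_id | customer_id | price  | date       |
|-------------|-------------|--------|------------|
| 101         | 1           | 79.99  | 2024-02-23 |
| 102         | 2           | 49.99  | 2024-03-18 |
| 103         | 3           | 89.99  | 2024-06-08 |
| 104         | 4           | 119.99 | 2024-07-05 |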

Answer:


The query joins the Customers and Google PlayStore Purchases tables on the "customer_id" field. It then groups by the customer's details to show one row per customer. In addition, the MAX function is used to find the latest (most recent) purchase date for each customer.
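The join, grouping, and MAX described above might look like the following sketch in SQLite via Python. The table names `customers` and `purchases` and the column name `purchase_date` (standing in for `date`) are assumptions; the `app` column is returned as the product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER, first_name TEXT, last_name TEXT, app TEXT
    );
    CREATE TABLE purchases (
        purchase_id INTEGER, customer_id INTEGER, price REAL, purchase_date TEXT
    );
    INSERT INTO customers VALUES
        (1, 'John', 'Doe', 'Tinder'),
        (2, 'Jane', 'Smith', 'CandyCrush'),
        (3, 'Jack', 'Brown', 'Fortnite'),
        (4, 'Emily', 'Johnson', 'Uber'),
        (5, 'Jake', 'Kenny', 'Google Music');
    INSERT INTO purchases VALUES
        (101, 1, 79.99, '2024-02-23'),
        (102, 2, 49.99, '2024-03-18'),
        (103, 3, 89.99, '2024-06-08'),
        (104, 4, 119.99, '2024-07-05');
""")

LAST_PURCHASE_SQL = """
    SELECT
        c.customer_id,
        c.first_name,
        c.last_name,
        c.app AS product,
        MAX(p.purchase_date) AS latest_purchase_date
    FROM customers AS c
    JOIN purchases AS p ON p.customer_id = c.customer_id
    GROUP BY c.customer_id, c.first_name, c.last_name, c.app
    ORDER BY c.customer_id
"""

last_purchases = conn.execute(LAST_PURCHASE_SQL).fetchall()
for row in last_purchases:
    print(row)
# (1, 'John', 'Doe', 'Tinder', '2024-02-23')
# (2, 'Jane', 'Smith', 'CandyCrush', '2024-03-18')
# (3, 'Jack', 'Brown', 'Fortnite', '2024-06-08')
# (4, 'Emily', 'Johnson', 'Uber', '2024-07-05')
```

Customer 5 has no purchases, so the inner join leaves them out; switch to a LEFT JOIN if you want such customers listed with an empty purchase date.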

Because joins come up frequently during SQL interviews, try this interactive Snapchat JOIN SQL interview question.

Tips To Prepare for the Google SQL Interview

The Google Data Analytics Certificate isn't enough to land a job at Google. The key to crushing Google SQL interview questions is to practice, practice, and then practice some more! Besides solving the earlier Google SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions, which come from FAANG tech companies and tech startups.


Each SQL question has hints to guide you, step-by-step solutions and, crucially, an interactive coding environment so you can code up your query right online and have it checked. Please, for the love of god, don't just read the questions & solutions. Actually DO the damn problems, otherwise when the Google interviewer asks you a question you'll be like:

Google SQL Interview Meme

To prep for the Google SQL interview it's also wise to solve interview questions from other FAANG tech companies.

But if your SQL coding skills are weak, forget about going right into solving questions – improve your SQL foundations with this SQL tutorial for Data Analytics.


This tutorial covers SQL topics like grouping by multiple columns and WHERE with AND/OR/NOT – both of these come up frequently in Google interviews.

Google Data Science Interview Tips

What Do Google Data Science Interviews Cover?

For the Google Data Science Interview, in addition to SQL query questions, the other types of questions to prepare for are:

If this sounds like a ton of topics... it is! Google can afford to be very thorough, which is why candidates often have to do 5 or more interviews to land a Google Data Science job.

Google Data Science Interview Meme

How To Prepare for Google Data Science Interviews?

To prepare for Google Data Science interviews read the book Ace the Data Science Interview because it's got:

  • 201 interview questions sourced from tech companies like Netflix, Google, & Airbnb
  • a refresher covering SQL, AB Testing & ML
  • 900+ 5-star reviews on Amazon

Ace the Data Science Interview by Nick Singh Kevin Huo