11 Presight.ai SQL Interview Questions (Updated 2024)

Updated on

March 3, 2024

At Presight.ai SQL is used day-to-day for cleaning the training data that feeds the different AI models that Presight deploys for it's customers. That's why Presight almost always asks SQL query questions during interviews for Data Science, Analytics, and & Data Engineering jobs.

To help you ace the Presight SQL interview, we've curated 11 Presight SQL interview questions – able to answer them all?

11 Presight.ai SQL Interview Questions

SQL Question 1: Identify High Value "Whale" Users for Presight AI

Presight AI is a company specializing in artificial intelligence and machine learning tools. Assume they have two tables: , which logs each activity by a user with a timestamp, and , which contains information about each user.

The business wants to identify "whale" users, i.e., users who use their tools extensively and frequently. For simplicity assume a "whale" user is defined as a user who has performed more than 100 activities in the last month.

Write a SQL query to identify these "whale" users.

Example Input:
110107/26/2022 00:00:00'analyze'
210207/25/2022 00:00:00'analyze'
310107/24/2022 00:00:00'simulate'
410307/23/2022 00:00:00'simulate'
510107/22/2022 00:00:00'analyze'
Example Input:
101'Sam'01/01/2022 00:00:00'sam@example.com'
102'Amy'02/02/2022 00:00:00'amy@example.com'
103'Leo'03/03/2022 00:00:00'leo@example.com'


This query selects and from the table and the count of activities from the table. The INNER JOIN combines these two tables based on the column. The WHERE clause filters out activities that are older than one month. The GROUP BY clause groups the selected columns by and . The HAVING clause makes sure to select only the users who have performed more than 100 activities in the last month.

To work on another SQL customer analytics question where you can solve it interactively and have your SQL solution instantly executed, try this Walmart SQL Interview Question: Walmart SQL Interview Question

SQL Question 2: Compute the cumulative number of unique users and total revenue per product per month.

Presight AI would be interested in knowing the total number of unique users who have used a product per month and the total revenue made per product per month.

Let's say we have a table that stores every transaction made by every user:

Example Input:
Example Output:


Using a window function in SQL, we can gain insights about the number of unique users per product per month, and the total revenue from each product, every month.

In this query, we are partitioning our data by and by the month of . For each partition, we are calculating the count of distinct and sum . This will give us the total number of unique users and total revenue for each product per month.

To practice another window function question on DataLemur's free interactive coding environment, try this Amazon SQL Interview Question: Amazon Window Function SQL Interview Problem

SQL Question 3: What are SQL constraints, and can you give some examples?

Constraints are used to specify the rules concerning data in the table. It can be applied for single or multiple fields in an SQL table during the creation of the table or after creating using the ALTER TABLE command. The constraints are:

Say you were storing sales analytyics data from Presight AI's CRM inside a database. Here's some example constraints you could use:

  • PRIMARY KEY constraint: You might use a PRIMARY KEY constraint to ensure that each record in the database has a unique identifier. For example, you could use the "opportunity_id" field as the primary key in the "opportunities" table.

  • FOREIGN KEY constraint: You might use a FOREIGN KEY constraint to link the data in one table to the data in another table. For example, you could use a foreign key field in the "opportunities" table to reference the "account_id" field in the "accounts" table.

  • NOT NULL constraint: You might use a NOT NULL constraint to ensure that a field cannot contain a NULL value. For example, you could use a NOT NULL constraint on the "opportunity_name" field in the "opportunities" table to ensure that each opportunity has a name.

  • UNIQUE constraint: You might use a UNIQUE constraint to ensure that the data in a field is unique across the entire table. For example, you could use a UNIQUE constraint on the "email" field in the "contacts" table to ensure that each contact has a unique email address.

  • CHECK constraint: You might use a CHECK constraint to ensure that the data in a field meets certain conditions. For example, you could use a CHECK constraint to ensure that the "deal_probability" field in the "opportunities" table is a value between 0 and 100.

  • DEFAULT constraint: You might use a DEFAULT constraint to specify a default value for a field. For example, you could use a DEFAULT constraint on the "stage" field in the "opportunities" table to set the default value to "prospecting"

Presight AI SQL Interview Questions

SQL Question 4: Data Analysis for Presight AI

Presight AI's main product is an AI solution that predicts and prevents incidents in real-time in the industrial sector. They have many clients that use their services. They need to track and analyze the number of incidents encountered by each of their clients, as well as the response times of their AI solution in diagnosing and responding to these incidents.

Using this requirement, we need to design and implement a relational database schema in SQL for this business model.


  1. Design the necessary tables that would store this data.
  2. Populate these tables with some sample data. Then, write a SQL query to:
    • Retrieve the total number of incidents each client had.
    • Retrieve the average response time of the AI solution for each client.

The database schema could contain the following tables:

Clients Table
Client IDClient Name
1Client A
2Client B
3Client C
4Client D
Incidents Table
Incident IDClient IDIncident Occurrence
112021-01-01 09:10:00
212021-01-02 10:20:00
322021-01-02 11:30:00
432021-01-03 12:40:00
AI_Responses Table
AI Response IDIncident IDResponse Timestamp
112021-01-01 09:11:00
222021-01-02 10:23:00
332021-01-02 11:36:00
442021-01-03 12:42:00

Now, using this data, we can write a PostgreSQL query to solve the problem as follows:


The first query will give you the total number of incidents encountered by each client. The second query gives the average response time (in minutes) of the AI solution for each client. We use EXTRACT with EPOCH to get the difference in seconds and then divide by 60 to convert to minutes.

SQL Question 5: Does a typically give the same results as a ?

No, in almost all cases, and for all practical purposes, and do NOT produce the same result.

While both are similar, in that they combine two tables, you can think of joins as increasing the width of the resulting table (you'll have more columns in the result set for a left/inner/right join), whereas a union is used to combine rows which increases the height of the result set but keeps the column count the same.

SQL Question 6: Average Prediction Accuracy by Model

For Presight AI, which is a Predictive Analytics company, we might be interested in the average accuracy of different models. Given a table containing a model's ID, the prediction it made, and the actual outcome, write a SQL query that calculates the average prediction accuracy (the percentage of predictions that were correct) for each model.

Example Input:
Example Output:


This SQL query first uses a CASE WHEN statement to create a column of 1s and 0s, where 1 indicates that the prediction was correct (i.e., the matches the ) and 0 indicates it was incorrect. We then find the average of this column (which is essentially the proportion of correct predictions) for each model_id. The result is the average prediction accuracy for each model.

To practice a very similar question try this interactive Facebook App Click-through Rate (CTR) Question which is similar for calculating rates using SQL or this Stripe Repeated Payments Question which is similar for identifying specific instances within a given timeframe.

SQL Question 7: What sets a cross join apart from a natural join?

A cross join is a type of JOIN operation in SQL that creates a new table by combining each row from the first table with every row from the second table. It is also known as a cartesian join.

For example, say you worked on the Marketing Analytics team at Presight AI, and needed to understand what advertising copy (text) goes best with what advertising creative (the photo that gets posted with the advertisement copy).

Here is an example of a cross-join between a table of ad_copy and a table of ad_creative:

A natural join, on the other hand, is a type of JOIN that combines rows from two or more tables based on their common columns. It is called a "natural" join because it is based on the natural relationship that exists between the common columns in the tables being joined.

For an example of each one, say you had sales data exported from Presight AI's Salesforce CRM stored in a datawarehouse which had two tables: and .

An (which is a type of natural join) combines the two tables on the common

This query will return rows from the sales and tables that have matching customer id values. Only rows with matching values will be included in the results.

One main difference between cross joins and natural joins is that cross joins do not require any common columns between the tables being joined, while natural joins do. Another difference is that cross joins can create very large tables if the input tables have a large number of rows, while natural joins will only return a table with the number of rows equal to the number of matching rows in the input tables.

SQL Question 8: Calculating the Average Efficiency by Algorithm Type

Assume you work for Presight AI, a company that offers AI-based solutions. Currently, the company tracks the performance of different AI algorithms in its database. Your task is to calculate the average efficiency (measured in percentage) for each algorithm type based on the operational data.

Consider the following table representing the operational data for a given period:

Example Input:
1001Neural Network2022-06-0787.5
1002Random Forest2022-06-1085.6
1003Neural Network2022-06-1590.1
1004Random Forest2022-07-0186.3
1005Neural Network2022-07-0791.4
1006Random Forest2022-07-0884.9
1007Random Forest2022-07-1588.2

Your task is to calculate the average efficiency per month for each algorithm type.

Example Output:
6Neural Network88.80
6Random Forest85.60
7Neural Network91.40
7Random Forest86.47


The PostgreSQL query to achieve this is as follows:

The result of this query will provide the average efficiency for each algorithm type grouped by month. Each row in the result represents a particular month and an algorithm type, associated with the calculated average efficiency. The efficiency values are rounded to two decimal places. This provides insights into which AI algorithms perform best in different periods, which can be useful for determining resource allocation and future development strategies.

SQL Question 9: Filter Customer Details Based on Company Name

In Presight AI's customer records database, filter the customer records to find all entries wherein the field contains 'AI' in its name. Assume the database has the following schema:

  • table with fields: (integer), (string), (string), (string).
Example Input:
101Jane Doejane.doe@example.comTech AI Ltd.
102John Smithjohn.smith@example.comData Corp
103Emma Brownemma.brown@example.comAI Innovations
104Liam Johnsonliam.johnson@example.comAI Advance Co
105Olivia Williamsolivia.williams@example.comSmart Tech Systems


You can accomplish this task by executing the following SQL command:


This SQL query filters the records from the table where the field contains the string 'AI'. The '%' before and after 'AI' are wildcards that match any sequence of characters. The query will return all records of customers who are associated with a company whose name includes 'AI'.

SQL Question 10: What's the difference between and ?

The clause is used to filter the groups created by the clause. It's similar to the clause, but it is used to specify conditions on the groups created by the clause, rather than on the individual rows of the table.

For example, say you were analyzing salaries for analytics employees at Presight AI:

This query retrieves the total salary for each Analytics department at Presight AI and groups the rows by the specific department (i.e. "Marketing Analytics", "Business Analytics", "Sales Analytics" teams).

The clause then filters the groups to include only Presight AI departments where the total salary is greater than $1 million

SQL Question 11: Analyze Customers and Orders at Presight AI

As an interviewee, you are provided with two tables. One table contains the customer details in Presight AI. It includes , , , and . The second table contains orders information, including , , , and .

You are tasked to write a SQL query to join both tables and list all orders each customer placed along with their details. Make sure you bring out all the records from both tables, including customers who did not place any orders and orders which do not have any customer information.

Here are the sample markdown-formatted tables:

Example Input:
1John DoeUSASoftware
2Jane SmithCanadaHospitality
3Mohamed AliEgyptAgriculture
4Li WeiChinaManufacturing
Example Input:
10101/01/20221AI Model
10202/02/20221Training Data
10303/03/20222AI Model
10404/04/2022nullTraining Data


You can solve this problem by using the SQL command. The query is as follows:

This SQL command will return a table that contains customer details along with their respective order details, if any. This allows Presight AI to get a comprehensive view of each customer and their orders, without losing any customers who did not place any orders and orders not associated with any customers.

Because joins come up frequently during SQL interviews, practice this Spotify JOIN SQL question: SQL join question from Spotify

Presight.ai SQL Interview Tips

The best way to prepare for a Presight AI SQL interview is to practice, practice, practice. Beyond just solving the above Presight AI SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like FAANG and tech startups. DataLemur SQL Interview Questions

Each interview question has multiple hints, full answers and most importantly, there's an interactive SQL code editor so you can right online code up your SQL query and have it executed.

To prep for the Presight AI SQL interview you can also be wise to practice SQL problems from other tech companies like:

However, if your SQL coding skills are weak, don't worry about diving straight into solving questions – refresh your SQL knowledge with this DataLemur SQL Tutorial.

Interactive SQL tutorial

This tutorial covers SQL concepts such as aggreage functions like MIN()/MAX() and sorting data with ORDER BY – both of which come up often in Presight AI SQL assessments.

Presight Data Science Interview Tips

What Do Presight.ai Data Science Interviews Cover?

In addition to SQL query questions, the other types of problems tested in the Presight AI Data Science Interview are:

  • Statistics and Probability Questions
  • Python or R Programming Questions
  • Product Data Science Interview Questions
  • Machine Learning Questions
  • Resume-Based Behavioral Questions

Presight AI Data Scientist

How To Prepare for Presight AI Data Science Interviews?

The best way to prepare for Presight AI Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from Facebook, Google & startups
  • A Refresher on Product Analytics, SQL & ML
  • Great Reviews (900+ reviews, 4.5-star rating)

Acing Data Science Interview