Databases are dumb β they can't infer what data you need in order to conduct an analysis for your boss. That's why we use SQL SELECT queries to exactly tell the database what data we need to fetch and display.
The most basic Query contains two essential SQL keywords: SELECT followed by FROM.
tells the database that you want to output data, and is followed by the specific names of the columns you want to have the database print out.
tells the database which specific table contains that data we want to output.
Suppose you had access to Amazon's database, and there was a table called reviews which has data related to Amazon product reviews. Let's start by selecting a couple of columns from the reviews table!
To help spreadsheet nerds visualize this Amazon example, here's what the reviews table looks like if it were opened in Excel or Google Sheets:
We're trying to have the database show us something similar, using the below SQL SELECT query:
The result of this query will be a two-dimensional set of rows and columns, effectively a copy of the reviews table, but only with the columns that we requested.
To see the output yourself, copy and paste the above SQL snippet into the Amazon SQL Interview Question and hit "Run Code".
No, seriously, run the damn code β our tutorials are interactive and RELY on you actually writing and running code β not just passively reading. Plus, you don't even need to sign-in or create a free account to run code!
Now that you've run the Amazon SQL code (you did that riiiiiight??), let's explain how it works, line by line:
The first line of SQL code is telling the database to output (SELECT) the data from three different columns: the review_id, submit_date, and stars columns.
Line #2 is telling the database that the rows and columns to print out are specifically found in the reviews table. At a large tech company like Amazon, there are thousands of different tables, which is why the database needs us to be precise and specify the exact table we want to retrieve data from.
When you run this query, it'll print out a set of rows β one for each product review β that shows the unique ID of the review, when the review was submitted, and how many stars the customer gave the product that was reviewed.
In a SQL query, whenever you select multiple columns, they must be separated by commas, but you should NOT include a trailing comma after the last column name. If you accidentally put a comma after the last column you select, you'll get a SQL query syntax error.
In the above SQL query, there is a trailing comma after "stars" which is why we got a syntax error and no output.
In SQL, to select each and every column in a table, you can use * instead of manually typing out all the column names:
The comes in handy because in the real world, there might be hundreds of columns in the reviews table, used to represent the hundreds of attributes associated with a single Amazon review. To output all the columns in one go, you'd run this SELECT * query:
We often run a query when starting to solve SQL interview questions because it's an easy way to inspect the input data and get our bearings.
Real SQL interview questions are quite complex β check out this Microsoft Data Science SQL Interview question for example. It's totally okay if you don't have any clue how to solve this β we've literally just started this tutorial!
I linked to the above problem, because in almost all SQL interview questions, the first step is to start exploring the data with which is the inspiration for the much simpler exercise I want you to tackle in this lesson.
Click below, to work on the practice question where you simply select and output all data on Microsoft Azure products:
In this tutorial, we showed you how to get all the columns in a table (with ), or only get certain columns. But what about rows?
How do we only get certain specific rows, rather than outputting every single row in a table?
That's where the next tutorial on the SQL WHERE clause comes in!
Next Lesson
SQL WHERE π