Lame AF tutorials start out with some ChatGPT-written fluff like "SQL serves as a fundamental tool for analytics because it facilitates the retrieval and analysis of data, which enables synergies across business units", but I'm a straight shooter, so here's the low-down.
SQL is THE SHIT. Not like... it's shit... I mean it's THE SHIT... it's amazing.
Let me tell you in simple terms:
First up: why SQL is amazing!
I ❤️ SQL's power and speed compared to Excel. I love that SQL is the PERFECT place for a beginner without much coding experience to start their Data Analytics & Data Engineering journey.
Sure, programming languages like Python & R have their place in the data analysis tech stack, but you'll get the highest return on investment (ROI) on your time if you start by learning SQL.
Speaking of financial terms like ROI, I ❤️ SQL because I love money. More importantly, I love helping other people earn more money too, and snagging a high-paying job in Data Analytics or Data Science is a great path towards that.
That's exactly why I wrote the best-selling book Ace the Data Science Interview, which has now helped thousands of people land 6-figure Data Science & Data Analytics jobs at FAANG tech companies.
I love helping people upskill into data careers, and learning SQL is the gateway to high-paying jobs in Data Science, Data Engineering, and Data Analytics. I'm excited to help you get started down this path!
TL;DR - learn SQL, analyze data, interview well, get paid!
The 30+ free SQL lessons are meant to get you from SQL zero to SQL hero. We divided the tutorial into intro, intermediate, and advanced sections, because each has a different style & teaching philosophy.
The basic SQL lessons are meant for SQL newbies, who might have used Excel but don't have formal coding experience.
Each lesson comes with SQL exercises which you can run directly in the browser, so you don't need to install any software to run SQL code!
Don't believe me?
Here's an example for your skeptical ass:
Click the card above, and hit "Run Code". Right in the browser you can execute a SQL query, and if you log in, you can submit your solution, which we'll grade for correctness!
The intermediate SQL tutorial is meant for those who have played around with SQL, might know some pandas or dplyr, and want to become productive at SQL for data analysis.
We still cover some SQL syntax, but by the end, we start to cover problem-solving with SQL, and you'll tackle multiple real SQL interview questions from companies like Google, Tesla, LinkedIn, & Amazon.
The advanced SQL tutorial is where we really shine: we train you to think like a Data Scientist, and run through real-world data workflows, including a full Instacart SQL Data Analytics case study.
If you want to start with your first SQL command, feel free to jump to Lesson #1: SQL SELECT!
But, if you're a total beginner, here's some background on what SQL is, and why it's so damn important!
SQL, which is pronounced "Sequel", NOT "S.Q.L.", stands for "Structured Query Language". It's used to manage & query data stored in a relational database management system (RDBMS). For example, if you were a Data Analyst at Amazon, you might write the following query to compute the average rating of different products:
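Here's a minimal sketch of what a query like that could look like. Heads up: the `reviews` table and its `product_id` and `stars` columns are assumptions for illustration, and the `WITH` block at the top just fakes a tiny table so the query runs on its own in PostgreSQL. The real table at Amazon (or in the linked question) may use different names.

```sql
-- Hypothetical sample data: a tiny stand-in for a real reviews table.
-- Ignore this WITH block for now; it only exists so the query is runnable as-is.
WITH reviews (product_id, stars) AS (
  VALUES
    (50001, 4),
    (50001, 3),
    (69852, 5)
)
-- The actual analysis: one row per product, with its average star rating.
SELECT
  product_id,
  ROUND(AVG(stars), 2) AS avg_stars
FROM reviews
GROUP BY product_id
ORDER BY product_id;
```

Read it out loud: select each product and its average stars, from the reviews, grouped by product. That's pretty much the whole trick.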
p.s. if you want to try to run this SQL query, copy-paste the code into this Amazon SQL Interview Question!
Think of a Relational Database Management System (RDBMS) as a vast collection of Excel workbooks. Each workbook (a "database") contains different sheets (in database terms, we call these "tables"), and on each sheet, you have rows and columns of data. The tables are organized in a way that the data can be quickly searched, updated, inserted, or deleted.
Anyone who's used Excel or Google Sheets knows how slow and laggy it can get when you get past 10,000 rows and a dozen columns. An RDBMS offers a better way to store large datasets with millions, sometimes even billions of rows. Using SQL, you can then retrieve and analyze this big data from the RDBMS in mere seconds, even for workloads that would instantly crash Excel or Google Sheets.
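To make "searched, updated, inserted, or deleted" concrete, here's a rough sketch in PostgreSQL. The `products` table and everything in it is made up for illustration, so don't go looking for it in a real database.

```sql
-- Hypothetical table: think of it as one sheet with three columns.
CREATE TABLE products (
  product_id INTEGER PRIMARY KEY,
  name       TEXT NOT NULL,
  price      NUMERIC(10, 2)
);

-- Insert: add two rows to the sheet.
INSERT INTO products (product_id, name, price)
VALUES
  (1, 'Keyboard', 49.99),
  (2, 'Mouse', 19.99);

-- Search: find every product cheaper than $30.
SELECT name, price
FROM products
WHERE price < 30;

-- Update: change the price of one specific row.
UPDATE products
SET price = 17.99
WHERE product_id = 2;

-- Delete: remove one specific row.
DELETE FROM products
WHERE product_id = 1;
```

Don't sweat the syntax yet. The point is just that each of those four verbs maps to one short statement.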
Plus, multi-user access is terrible in Excel. If two people try to edit a workbook at the same time, you run into issues. But databases are designed so that many users can read and write data simultaneously without stepping on each other.
Then there's the issue of data integrity. In Excel, you can easily overwrite or delete data by accident. Databases have features, like constraints and transactions, that protect data integrity and consistency, meaning they help prevent unwanted changes or deletions.
Most importantly for Data Analysts and Data Scientists, an RDBMS + SQL lets you query and analyze advanced relationships. In Excel you might use VLOOKUP to get data from one sheet based on data in another. In databases, tables can be "related" in complex ways, and SQL lets you query across these relationships easily.
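Here's a small sketch of what those integrity features can look like in PostgreSQL. The `customers` and `orders` tables are hypothetical, invented just for this example.

```sql
-- Hypothetical tables for illustration.
CREATE TABLE customers (
  customer_id INTEGER PRIMARY KEY,      -- no duplicate or missing IDs
  email       TEXT NOT NULL UNIQUE      -- every customer needs a distinct email
);

CREATE TABLE orders (
  order_id    INTEGER PRIMARY KEY,
  customer_id INTEGER NOT NULL REFERENCES customers (customer_id),  -- must point at a real customer
  quantity    INTEGER NOT NULL CHECK (quantity > 0)                 -- no zero or negative orders
);

INSERT INTO customers (customer_id, email) VALUES (1, 'ada@example.com');
INSERT INTO orders (order_id, customer_id, quantity) VALUES (100, 1, 2);

-- Each statement below would be rejected with an error if you uncommented it:
-- INSERT INTO orders (order_id, customer_id, quantity) VALUES (101, 1, 0);    -- breaks the CHECK rule
-- INSERT INTO orders (order_id, customer_id, quantity) VALUES (102, 999, 1);  -- customer 999 doesn't exist
-- DELETE FROM customers WHERE customer_id = 1;                                -- order 100 still references this customer
```

In Excel, nothing stops you from typing a negative quantity or deleting a customer that other sheets depend on. Here, the database itself says no.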
For example, see all the tables music-streaming app Spotify would have, and their complicated relationships! Good luck trying to represent and analyze this in Excel!
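To get a feel for querying across related tables, here's a tiny sketch in PostgreSQL. The `artists` and `songs` tables are hypothetical stand-ins, not Spotify's actual schema.

```sql
-- Hypothetical tables: each song belongs to exactly one artist.
CREATE TABLE artists (
  artist_id INTEGER PRIMARY KEY,
  name      TEXT NOT NULL
);

CREATE TABLE songs (
  song_id   INTEGER PRIMARY KEY,
  artist_id INTEGER NOT NULL REFERENCES artists (artist_id),
  title     TEXT NOT NULL
);

INSERT INTO artists (artist_id, name)
VALUES (1, 'Artist A'), (2, 'Artist B');

INSERT INTO songs (song_id, artist_id, title)
VALUES (10, 1, 'Song One'), (11, 1, 'Song Two'), (12, 2, 'Song Three');

-- The JOIN plays the role of VLOOKUP: it matches each song to its artist
-- using the shared artist_id column, then we count songs per artist.
SELECT
  a.name AS artist,
  COUNT(s.song_id) AS song_count
FROM artists AS a
JOIN songs AS s
  ON s.artist_id = a.artist_id
GROUP BY a.name
ORDER BY song_count DESC;
```

Two tables is the easy case. The same `JOIN` pattern keeps working when you chain five or ten tables together, which is exactly where VLOOKUP falls apart.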
In summary, while Excel is a fantastic tool for a range of tasks, it struggles with large datasets, data integrity, and concurrent access. It's no surprise that databases paired with SQL are the industry-standard solution.
SQL was designed to look a LOT like plain English, which is why we love it so much compared to more confusing languages like Python and R!
But don't let SQL's simplicity and similarity to English fool you... there's a reason tricky FAANG SQL interview questions exist!
People casually use "SQL" interchangeably with "MySQL" and "PostgreSQL". While that's not technically correct, in most cases for beginners in the field, it doesn't matter too much unless you want to be pedantic.
But, if we're trying to be precise, SQL is a general, high-level language for querying and manipulating relational databases. MySQL, Postgres, SQLite, and SQL Server are all RDBMSs (relational database management systems). You use slightly different flavors of SQL syntax to query each RDBMS. For example, to query Postgres, you write PostgreSQL. To query MySQL... you write MySQL. Confusing naming, I know!
Here's the good news: the syntax for MySQL, PostgreSQL, SQL Server, etc. isn't radically different from one to the next. That means, if you do complete our SQL tutorial (which is taught with the PostgreSQL dialect), you should be able to adapt to MySQL or SQL Server pretty effortlessly!
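Here's one small sketch of how minor these dialect differences tend to be. The `players` table is hypothetical; the runnable part is PostgreSQL, and the SQL Server version is left as a comment for comparison.

```sql
-- Hypothetical table for illustration.
CREATE TABLE players (
  name  TEXT,
  score INTEGER
);

INSERT INTO players (name, score)
VALUES ('Ana', 90), ('Ben', 75), ('Cy', 82);

-- PostgreSQL (MySQL and SQLite accept the same LIMIT clause):
-- grab the top 2 players by score.
SELECT name, score
FROM players
ORDER BY score DESC
LIMIT 2;

-- SQL Server spells the same idea with TOP instead of LIMIT:
-- SELECT TOP 2 name, score
-- FROM players
-- ORDER BY score DESC;
```

Same logic, one keyword moved around. Most dialect differences you'll hit as a beginner are about this size.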
We LOVE using PostgreSQL, and the associated Postgres RDBMS, for a few reasons: