Chris Castiglione Teacher at One Month. Faculty at Columbia University where I teach Digital Literacy. I write about coding, the internet, and social impact.

Python vs SQL – What’s the Difference?

4 min read

Python vs SQL - What’s the Difference?

Python vs. SQL | Pros and Cons

Approximately twenty years ago, there were only a handful of programming languages that a software engineer would need to know well. Even back then, Structured Query Language, or SQL, was the go-to language when you needed to gain quick insight on some data, fetch records, and then draw preliminary conclusions that might, eventually, lead to a report or to writing an application. 

Nowadays, data comes in many shapes and formats, not necessarily synonymous with relational databases any longer. Data exists within CSV files, plain text, and on the web, as well as in many other different formats. It is in this maze of data that Python’s broad toolset of libraries shine.

With a vast set of helper libraries and associated platforms, Python is an excellent programming language for quick, and iterative data exploration. Python’s set of libraries include everything from data visualization to statistical analysis, making it convenient for developers to jump into data analysis and begin identifying patterns.

Use cases for SQL and Python

SQL is designed to query and extract data from tables within a database. SQL is good at allowing you as a developer, to seamlessly join (or merge) several data together. Combining data from multiple tables is a key strength.

However, SQL isn’t designed for manipulating or transforming data into other formats. Types of higher-level data manipulation that are common within data science, such as statistical analysis, regression tests, and time-series data manipulation, are is very difficult to achieve using SQL exclusively.

Python, on the other hand, has a well-known data analysis Library called Pandas, which has been specially designed for data analysis and manipulation.

Python is particularly well suited for structured (tabular) data which can be fetched using SQL and then require farther manipulation, which might be challenging to achieve using SQL alone. This leads to the following question: When should you use Python vs. SQL?

SQL and Python Differences

Use SQL to retrieve the essential data required for the analysis, and then use specialized Python libraries to process it.

Let’s take a look at an example using SQL: 

Say we have the following database table with hotel information called hotels. This is a small sample dataset of the information contained within the table. This table could include thousands or millions of rows (entries), each one representing a hotel.

NameCityStreetPriceCurrencyStarsPhone
1 Via dei ValeriRomeVia Dei Valeri 154

USD

5

108 Minutes HostelMoscowMalaya Ordynka Street 5/6 Bld.4 App.324

USD

1

7(495) 795-4259

1st Arbat Hostel at NovinskyMoscowNovinsky Bulvar 1/2 App 6426

USD

1

7(967)0130047

A Plus HostelPragueNa Florenci 14 13/3314

USD

3

Aalborg Hotel AmsterdamAmsterdamSarphatipark 10639

USD

3

106 – 1073

Adagio Access Vanves Porte de ChatillonParis5 rue Aristide Briand,53

USD

3,5

33140955600

Advisor Hotel HanoiHanoi51 Nguyen Sieu Street Hoan Kiem13

USD

2

39935766

Afrian House HotelCairo15 Emad El Din Street11

USD

1

Agon Aldea Hotel BerlinBerlinBulowstrasse 19-2222

USD

3

Agustina Serviced Apartments Quezon CityManilaNo. 9 18th St. Brgy Kalusugan (New Manila)27

USD

3

2098647-25

Let’s compare how we could query this dataset using SQL versus Python using Pandas.

Type of ActionSQLPython using Pandas
Retrieve all the hotelsSELECT * FROM hotels  hotels
Retrieve the first 3 hotelsSELECT * FROM hotels LIMIT 3 hotels.head(3)
Get the names of hotels in ParisSELECT Name FROM hotels WHERE City = ‘Parishotels[hotels.City == ‘Paris‘].Name

As we can see, so far, both are quite similar. However, things start to become more interesting when we need to do more complex queries.

Examples when Python is easier to use than SQL

Let’s say we did some preliminary data querying, and now have a dataset table called hotels_by_city, that contains the number of hotels per city, which looks as follows:

CityHotel_Count
Paris567
Moscow482

With this new dataset, let’s say we want to order results by Hotel _Count and only select the top ten cities with the largest count. We could do this as follows:

SQLPython using Pandas
SELECT City FROM hotels_by_city ORDER BY Hotel_Count DESC LIMIT 10hotels_by_city.nlargest(10, columns=’Hotel_Count‘)

You can start to see that the Pandas syntax is becoming easier to grasp and write, as the examples of the data being retrieved, become a bit more complicated.

Let’s add a layer of complexity, so we now need to determine using this new dataset table, which is the next ten hotels, after the top ten.

SQLPython using Pandas
SELECT City FROM hotels_by_city ORDER  BY Hotel_Count DESC LIMIT 10 OFFSET 10hotels_by_city.nlargest(20, columns=’Hotel_Count‘).tail(10)

Notice how the Pandas syntax remains almost unaltered as complexity increases, whereas the SQL syntax becomes more complex to read.

Another way to think about differences between Python and SQL is that SQL provides the setup and Python takes you from there. SQL allows you to start with a dataset which is structured into tables. Once you have the tables, you use Python to branch off different analyses in various directions.

Types of SQL

MySQL (pronounced MY Sequel) is the most popular type of SQL database. The super-geek way to refer to this type of SQL database as a relational database management system (or RDBMS). More or less, that means it’s a database with many tables that link together in some way. All WordPress installations by default are using MySQL.

PostgreSQL (often just referred to as “Postgres”) is similar to MySQL but is known to be more durable. However, setting up Postgres for newbies can often be difficult, so if you are new to this, MySQL is probably you’re best bet.

SQLite (pronounced Sequel Lite) is a very portable, and compact database. SQLite is usually best used for testing and very, very simple applications.

When to choose Python vs. SQL?

SQL contains a much simpler and narrow set of commands compared to Python. In SQL, queries almost exclusively use some combination of JOINS, aggregate functions, and subqueries functions.

Python, by contrast, is like a collection of specialized Lego sets, each with a specific purpose. Each library has custom pieces for building a particular type of a particular thing: Pandas for data analysis, PyPDF2 for PDF manipulation, Scikit-learn for machine learning, and much more.

Which language to learn first Python or SQL?

We think that the best place to start is by learning  SQL. SQL is an essential tool for any kind of data retrieval from relational databases, even if you’re primary job has little or nothing to do with data analysis.

Even if the SQL query is ten times longer then the equivalent Python script, it feels easier to do then doing the equivalent in Python because it reads like English. Remember, learning is more laborious than typing, and takes more time.

Once you have a reasonable knowledge of SQL, that you can at least write a query that joins two tables together, attempt to rewrite some of that same logic in Python using Pandas.

Comparison Chart

PythonSQL
Founded Created in 1991 by Guido Van Rossum Created by Donald D. Chamberlin and Raymond F. Boyce in 1974
ProsA versatile, dynamic, easy, broad scopeA very easy learning curve, English-like syntax.
Cons Steeper learning curve then SQLMostly applicable only in relational databases (with a few exceptions)
Web Libraries A library for almost anythingNone, SQL is just SQL.
Types  Python 2 and Python 3 are the two most common types of PythonMySQL, SQL Server, PostgreSQL, SQLite
Usage Google

YouTube

DropBox

Almost every high-tech company.

Furthermore, Python is widely used in many organizations for data-science and exploration.

Nearly every website on the internet uses a SQL database for its back-end.

Thanks to Eduardo Freitas for his help co-writing and proofing this piece! 

Learn to Code Comment Avatar
Chris Castiglione Teacher at One Month. Faculty at Columbia University where I teach Digital Literacy. I write about coding, the internet, and social impact.

3 Replies to “Python vs SQL – What’s the Difference?”

  1. Nice Info
    On varies b/w python and SQL, I totally agree with this article.
    I discovered this which exceptionally utilizes full.
    Extraordinary article and data continue sharing more!
    I thought that it was exceptionally helpful.
    Thank you for sharing your great knowledge with us.
    Looking forward to your further tips on similar topics….. Heap Of Thanks

  2. I have been thinking a lot what language should I start learning first. Thanks for your piece of advice!

Leave a Reply

Your email address will not be published. Required fields are marked *