4 Types of SQL JOIN Every Data Scientist Should Know: Visual Representation | by Zoumana Keita | Aug, 2022

Complete guide to creating tables, and running all four types of SQL JOINS.

Photo by Caspar Camille Rubin on Unsplash

Relational databases try to model real-life entities and their relationships. However, as a data practitioner, you will be dealing with more than one table when interacting with those databases. Efficient interaction with those tables requires a better understanding of the joint statements because each one gives a different result.

This article will first build your understanding of each JOIN clause before walking you through hands-on practice. Before that, we will be creating from scratch the relevant tables.

To better understand the process, we will need the following two tables created. This section aims to create those tables and populate them with some sample data.

Create Tables

Creating a table is straightforward, and both tables are created as follows in our database. This section is not mandatory to understanding the rest of the article.

Student Table

  • Each row StudentTable will represent a single student’s information.
SQL script to Create Student Table

Teaching Assistant Table

Each row of theTeachingAssistantTablewill represent a single teacher’s information.

SQL script to Create Teaching Assistant Table

Populate Tables

Now that our tables are created, we can finally populate them with new data using the INSERT INTO [table name] VALUES statement.

Student Table

SQL script to upload data into Student Table

Teaching Assistant Table

Below are the final representations of the two tables.

Student (StudentTable) and TA (TeachingAssistantTable) tables (Image by Author)

If you have been on the SQL journey, you might have already heard of Joins, but why do we use them? TheJOINstatement is an important clause in relational databases such as Oracle, Microsoft SQL, MySQL, PostgreSQL, just to name a few. It is used to create a single table that is the combination of two or more tables, based on some logical relationships between those tables.

Now that you know what it is used for, let’s explore the [X] types of JOIN.

1. Inner Join

Let’s start with this simple example. We want to know which Students are also Teachers. In other words, where are the matching rows in both Student and Teacher tables, meaning their intersections. We can observe that both of our tables contain Ibrahim, Mamadou, and Fatim.

Here is whereINNER JOIN comes in handy. It only looks at the intersections between two tables based on the column specified in theONclause. For instance, the following instruction gets all the intersected rows based on the Email column.

SELECT * 
FROM StudentTable st
JOIN TeachingAssistantTable tat ON st.Email = tat.Email;
  • SELECT* means “get all the columns” from all the tables.
  • FROM StudentTable st creates st as an alias of the StudentTable, meaning that whenever we want to use TeacherTable, can use st instead. The same applies to TeachingAssistantTable tat
  • JOIN TeachingAssistantTable tat ON st.Email = tat.Email means only get the rows having the same Email from both tables.

This is the graphical result we get from the previous SQL command.

INNER JOIN Applied to StudentTable and TeachingAssistantTable (Image by Author)

The join does not only apply to key columns as we can see for Email but any column the user decides to use in the ONclause. For instance, it could be:ON st.Student = tat.Teacher which would generate the table where student name equals teacher’s name.

2. Left Outer Join / Left Join

This type of join deals with joining two types of table: the primary table and the secondary table. The primary table is the one on the LEFT of the equality sign. It works like this:

  • Identify the column specified in the ON clause.
  • Take all the rows from the primary table, without any distinction.
  • All the rows in the secondary table that do not match the primary table based on the column in the ON clause are discarded (replaced with NULL).
SELECT * 
FROM StudentTable st LEFT JOIN TeachingAssistantTable tat
ON st.Email = tat.Email;

This is the graphical result we get from the LEFT JOIN.

LEFT JOIN Applied to StudentTable and TeachingAssistantTable (Image by Author)

3. Right Outer Join / Right Join

This join is very similar to the previous one. The only difference is that it considers the table on the right to be the primary table, and the one on the left to be the secondary table.

SELECT * 
FROM StudentTable st RIGHT JOIN TeachingAssistantTable tat
ON st.Email = tat.Email;

This is the graphical result we get from the RIGHT JOIN.

RIGHT JOIN Applied to StudentTable and TeachingAssistantTable (Image by Author)

4. Full outer join

A full outer join is the combination of the left and right joins. In this statement, there is no room for discarding rows from any of the two tables, they are all taken into consideration. It works as follows:

  • Perform a right outer join on the original tables and consider the result as temporary table 1.
  • Run a left outer join on the original tables and consider the result as temporary table 2.
  • Concatenate temporary tables 1 and 2 as the final result.
SELECT * 
FROM StudentTable st FULL OUTER JOIN TeachingAssistantTable tat
ON st.Email = tat.Email;
FULL OUTER JOIN Applied to StudentTable and TeachingAssistantTable (Image by Author)

From all the previous final results, we notice that we selected all the columns from all the tables, which creates duplicate values for columns such as Email, Student, and Teacher. However, we can specify in the SELECT clause the columns we want in the final result. For instance, the following instruction returns all the columns in the Student Name, its Email, Gender, Age, and Degree.

SELECT st.Student, st.Gender, st.Email, st.Age, tat.Degree
FROM StudentTable st INNER JOIN TeachingAssistantTable tat
ON st.Email = tat.Email;
INNER JOIN Applied to StudentTable and TeachingAssistantTable with column selection (Image by Author)

This article has covered the four main join cases in SQL. The versatility of SQL can provide you with the skills to assist businesses in analyzing data and taking smart decisions.

If you like reading my stories and wish to support my writing, consider becoming a Medium member to unlock unlimited access to stories on Medium.

Feel free to follow me on Medium, Twitter, or say Hi on LinkedIn. It is always a pleasure to discuss AI, ML, Data Science, NLP, and MLOps stuff!

Leave a Reply

Your email address will not be published.