Guide to SQL joins: Intuition and Implementation
Introduction
The purpose of a join clause is to combine rows from different tables, based on a related column between them which is known as the “Key”.
The various types joins can be intuitively understood with the help of Venn diagrams mentioned below, we will be talking about each of these joins separately and we will also see how to implement them using SQL.
The dataset that has been used to illustrate the examples in this article is publicly available and can be downloaded from this link
The below image gives us an un idea about the tables that we will be using in this article:
Left Inclusive
The first join that we will be talking is the Left inclusive join, here our final goal is to obtain a table that contains all the records from the left table mapped to the corresponding row for the right table.
1 2 3 4 5 6 |
create table LIJ as select A.Customer_ID,State,Age from Customer_state A left join Customer_Age B on A.Customer_ID = B.Customer_ID; |
Let us check out the number of records in this table:
1 |
select count(*) from LIJ; |
As you can notice above that there are certain nulls in the Age column, these are those records that are present in the left table but not in the right table. Also the count of records in the final table is 113 which is the same as our left table.
Left Exclusive
In the Left exclusive join, our final goal is to obtain a table that contains all the records that are present in the left table but not in the right table.
1 2 3 4 5 6 7 |
create table LEJ as select A.Customer_ID,State,Age from Customer_state A left join Customer_Age B on A.Customer_ID = B.Customer_ID where B.Customer_ID is NULL; |
Let us check out the number of records in this table:
1 |
select count(*) from LEJ; |
As we can notice the output that all the rows in the Age column are null, all these records are present in the left table but not in the right table. Also the count of records in the final table is 20 which means that the left table has 20 records that are not present in the right table.
Full Outer Inclusive
In the Full Outer inclusive join, our final goal is to obtain a table that contains all the records that are present in the both the tables as is represented in the Venn diagram.
1 2 3 4 5 6 7 8 |
create table FOI as select A.Customer_ID as A_Customer_ID, B.Customer_ID as B_Customer_ID, State,Age from Customer_state A left join Customer_Age B on A.Customer_ID = B.Customer_ID union select A.Customer_ID as A_Customer_ID, B.Customer_ID as B_Customer_ID, State,Age from Customer_state A right join Customer_Age B on A.Customer_ID = B.Customer_ID; |
Let us check out the number of records in this table:
1 |
select count(*) from FOI; |
We can conclude from the above results that in total there are 125 distinct Customer_ID’s available in both these datasets.
Inner Join
We perform the inner join to obtain the records that are common to both the left and the right table.
1 2 3 4 |
Create table IJ as select A.Customer_ID as A_Customer_ID, B.Customer_ID as B_Customer_ID, State,Age from Customer_state A inner join Customer_Age B on A.Customer_ID = B.Customer_ID; |
Count of records:
1 |
select count(*) from IJ; |
We can conclude from the above results that both the tables contain 93 records common, we can also notice that there are no nulls in the State and Age columns as these Customer_IDs are present in both the tables.
Right Inclusive
Right Inclusive join is similar to the left inclusive join here the only difference is that the final output must contain all the records of the right table and the corresponding row from the left table.
1 2 3 4 5 6 |
create table RIJ as select B.Customer_ID,State,Age from Customer_state A right join Customer_Age B on A.Customer_ID = B.Customer_ID; |
1 |
select count(*) from RIJ; |
We can conclude from the outputs that there are a total of 105 Customer_IDs in the right table.
Right Exclusive Join
Our main goal while performing the right exclusive file is to obtain a table that contains all the records that are present in the right table but not in the left table.
1 2 3 4 5 6 7 |
create table REJ as select B.Customer_ID,State,Age from Customer_state A right join Customer_Age B on A.Customer_ID = B.Customer_ID where A.Customer_ID is NULL; |
Counts:
1 |
select count(*) from REJ; |
We can conclude from the above results that there are 12 Customer_IDs that are present in the right table but not in the left table.
Full Outer Exclusive
The full Outer Exclusive join is performed to obtain all the records from both the tables excluding the common records as is illustrated in the Venn Diagram.
1 2 3 4 5 6 7 8 9 10 |
create table FOE as select A.Customer_ID as A_Customer_ID, B.Customer_ID as B_Customer_ID, State,Age from Customer_state A left join Customer_Age B on A.Customer_ID = B.Customer_ID where A.Customer_ID is NULL or B.Customer_ID is NULL union select A.Customer_ID as A_Customer_ID, B.Customer_ID as B_Customer_ID, State,Age from Customer_state A right join Customer_Age B on A.Customer_ID = B.Customer_ID where A.Customer_ID is NULL or B.Customer_ID is NULL; |
1 |
select count(*) from FOE; |
The above results tell us that there are 32 records present across both the tables that are not common.
We have now successfully seen how and when to use the SQL joins and this knowledge can now be applied to real-world data.