Up to this point, all the SQL statements we’ve been querying have been using a single table. However, when we can join the data in tables together is where databases really start to shine. Joins are done when you combine data from two or more tables linking them through common attributes. There are many different types of joins including natural joins, equijoin, outer join, cross join and self joins that we will work through in upcoming tutorials.
Let us first create two tables for representatives and departments. The common attribute between the two tables is the representative_id from the representative table and the manager_id in the manager table. Notice that in this case, the two column names are different from one another but linked through the primary key.
CREATE TABLE representative ( representative_id INT PRIMARY KEY, first_name VARCHAR (30) NOT NULL, last_name VARCHAR (30) NOT NULL );
CREATE TABLE department ( department_id INT PRIMARY KEY, department_name VARCHAR (100) NOT NULL, manager_id INT, constraint fk_manager FOREIGN KEY (manager_id) REFERENCES representative (representative_id) );
INSERT INTO representative (representative_id, first_name, last_name) VALUES (1, 'Bob','Evans'), (2, 'Tango','Rushmore'), (3, 'Danika','Arkane'), (4, 'Mac','Anderson');
INSERT INTO department (department_id, department_name,manager_id) VALUES (1, 'Sales', 1), (2, 'Marketing', 3), (3, 'IT', 4), (4, 'Finance', null), (5, 'Support', null);
In looking at the data, we have Tango Rushmore that is not a manager, and the Finance and Support departments not having managers yet.
We can represent the data through the use of a Venn diagram where the left circle represents the data from the Representative table and the right table contains the data from the department table:
Individually, the data looks like the following:
Note that because we do not have the same name for a column in the two tables, we cannot complete a natural join. However, this natural join works in the same way that an inner join does. However, the difference is that the column name is identified with the inner join. The basic inner join between the two tables should look like the following:
SELECT * FROM representative JOIN department ON representative.representative_id = department.manager_id;
We will get into the specifics of the syntax in later tutorials. With this scenario of this inner join, we are finding the data in the representative_id in the representative table that matches the department’s manager_id. There should be 3 that match:
The Venn diagram below illustrates the rest from the inner join:
This type of join is the most common type of join as typically we want to be able to identify where data matches between two or more tables.
There are other joins as well including the OUTER join. The OUTER join allows us to get data that may not fully match between the two tables:
We may want to use a left join to find data that exists in both tables but also data in the left table that does not match. This Venn diagram illustrates this left join.
We could also want to find data that exists in the left table but does not match with data on the right table. This would be considered a left outer join:
We can get the data that exists in both tables along with the data that does not match from the right table using a right join:
We could also do the same to just have data in the right table only using a right outer join: