A natural join links tables together by choosing the rows with common values using their common attribute. Typically, a natural join is a result of three stages.
The first stage creates the product of the two tables together.
Then the next stage has it such that it takes the output from the prior stage and only displays the rows in which the common attributes are equal with one another.
Then in the last stage, a PROJECT is performed on the results to have on a single copy of each attribute which removes the duplicate column.
The final result based on the natural join basically provides a result set that only has the copies of the matches between the two tables. It does not include any unmatched pairs from either table.
The syntax of a natural join looks like the following:
SELECT <column_list>
FROM <table1>
NATURAL [INNER,] JOIN <table2>;
Note that with a natural join, the default type of join is an inner join but it can also perform a left or right join that we’ll discuss in upcoming tutorials. Instead of a column list, we can also use the asterisk (*). This will include all columns from both tables including those that have the same name. However, it will only list the column that it is joining by once.
Let us build a simple set of tables of product and categories:
CREATE TABLE category ( category_id serial PRIMARY KEY, category_name VARCHAR (100) NOT NULL );
CREATE TABLE product ( product_id serial PRIMARY KEY, product_name VARCHAR (100) NOT NULL, category_id INT NOT NULL, FOREIGN KEY (category_id) REFERENCES category (category_id) );
INSERT INTO category (category_name)
VALUES ('Game'), ('Movie'), ('CD');
INSERT INTO product (product_name, category_id)
VALUES ('Call of Duty', 1), ('Final Fantasy', 1), ('Wizard of Oz', 2), ('Jaws', 2), ('Great Hits', 3), ('Journey', 3);
To join them using a natural join, we can set it as:
SELECT *
FROM category
NATURAL JOIN product;
Note that we could also swap the two tables in the query:
SELECT *
FROM product
NATURAL JOIN category;
You should notice that in both cases, the result set will first display the common column first which is the category_id. Then it will display all of the columns from the first table (other than the common column) and then all of the columns from the second table (other than the common column).
However, there are issues with a natural join as sometimes they can create unexpected results. Let’s recreate the tables but instead of having product_name and category_name, we may have them just called as “name” in both tables:
CREATE TABLE category ( category_id serial PRIMARY KEY, name VARCHAR (100) NOT NULL );
CREATE TABLE product ( product_id serial PRIMARY KEY, name VARCHAR (100) NOT NULL, category_id INT NOT NULL, FOREIGN KEY (category_id) REFERENCES category (category_id) );
INSERT INTO category (name)
VALUES ('Game'), ('Movie'), ('CD');
INSERT INTO product (name, category_id)
VALUES ('Call of Duty', 1), ('Final Fantasy', 1), ('Wizard of Oz', 2), ('Jaws', 2), ('Great Hits', 3), ('Journey', 3);
If we now tried to run the natural join:
SELECT *
FROM product
NATURAL JOIN category;
We get the following result:
So although we had the category_id in both tables, we also have names in both tables. This common column has different meanings for each table separately and being that the names between the two do not match, no rows are returned.