As we discussed in the prior tutorial, natural joins are not ideal to use as they will join all common columns between the two tables. Although in some cases this may not be an issue, there can be common columns for a variety of purposes.
Let us take a look at the set of product and category tables that had an issue with the natural join:
CREATE TABLE category ( category_id serial PRIMARY KEY, name VARCHAR (100) NOT NULL ); CREATE TABLE product ( product_id serial PRIMARY KEY, name VARCHAR (100) NOT NULL, category_id INT NOT NULL, FOREIGN KEY (category_id) REFERENCES category (category_id) ); INSERT INTO category (name) VALUES ('Game'), ('Movie'), ('CD'); INSERT INTO product (name, category_id) VALUES ('Call of Duty', 1), ('Final Fantasy', 1), ('Wizard of Oz', 2), ('Jaws', 2), ('Great Hits', 3), ('Journey', 3);
When we tried to run a natural join:
SELECT * FROM product NATURAL JOIN category;
We get the following result:
This was due to the fact that the tables had two common attributes. However, one of them, name, didn't have values that match between the two tables. The JOIN and USING clauses will help with this. One of the key requirements is that it does depend on the two tables having a common attribute, but we can pick which attribute if there is more than one common one to join on.
The syntax for the command looks like this:
SELECT <columnlist> FROM <table1> INNER JOIN <table2> USING (<commonattribute>);
In our set of tables, we can use this to only use category_id:
SELECT * FROM product INNER JOIN category USING (category_id);
Since name also exists in both tables, we can use name:
SELECT * FROM product INNER JOIN category USING (name);
However, as you see, since none of the names between the tables matched, no rows were returned.
Let us go back to some of our tables like the album and artist table.
Right now, we have only focused on each individual table. We don’t know which album has which artist other than the artist_id unless we look it up. Since the artist_id exists in both tables, we can use the JOIN and USING clause to join the tables on the artist_id:
SELECT * FROM album INNER JOIN artist USING (artist_id);
Now our data is starting to make a bit more sense as we join the data together.
We can add more than just two tables together by adding additional INNER JOIN statements with USING. If we take our prior example of the artist and album, we can also identify the tracks on each:
SELECT * FROM track INNER JOIN album USING (album_id) INNER JOIN artist USING (artist_id);
However, as you see above, as we start to join more tables, by using the *, we may have too many columns being returned. We can specify in the SELECT clause which columns should be returned. It is a best practice to use the format <tablename>.<columnname> when we list the columns. Otherwise, if a column name exists in more than one table, the database does not know which column you want to display and returns an ambiguous error. For example, the “name” exists in the track and artist table. If we tried to simply include the name column in the SELECT clause:
SELECT name FROM track INNER JOIN album USING (album_id) INNER JOIN artist USING (artist_id);
We will get the following error message:
Note with an INNER JOIN, you can add in the artist_id, album_id, and track_id as the result set only has one item. However, this is not a good practice.
SELECT album_id, artist_id, track_id, album.title, artist.name, track.name FROM track INNER JOIN album USING (album_id) INNER JOIN artist USING (artist_id) ORDER BY album_id, artist_id, track_id;
It would be a better practice to use the same labeling as the following:
SELECT album.album_id, artist.artist_id, track.track_id, album.title, artist.name, track.name FROM track INNER JOIN album USING (album_id) INNER JOIN artist USING (artist_id) ORDER BY album.album_id, artist.artist_id, track.track_id;
Source: Authored by Vincent Tran