A reference table is frequently also called a lookup table. Any database is bound to have at least a few of these types of tables. These reference tables generally will have at least two columns set up as key-value pairs where the first column is the key while the second column (or more columns) would have the value. These reference tables are used when there is no need to store the history for the reference data. This is often the case when we have data that is not going to change or will rarely change. Common instances that are used could be product codes, stock exchange symbols, international currency codes, country abbreviations, or US state code abbreviations. They would be either integers or short string codes.
There is no standard difference between the lookup and the reference table. The accepted difference between a lookup table and a reference table is that the reference table is meant to have a long value that’s stored outside of the main table like a customerID. The lookup table generally means that it has a list of allowable values for a certain field. However, in both cases, they are very similar to one another.
The primary key of the reference table is either an auto-increment value or an abbreviation while the value column will have the full reference. By using this lookup table, we can ensure that the values that are being referenced will come from a set number of items. We can add to this if needed but this will be rare. For example, with our movie database, we have the Genre table as a reference table. The GenreName may contain values like Action, Adventure, Comedy, Drama, Horror, Romance, Science Fiction, and so forth. If we had the genre field being a free form without a lookup table, a user could enter in anything they wanted like “Action”, “action”, “Action Film”, ”Action Genre”, etc. There would be quite a number of potential entries that would make it quite difficult to manage so these lookup and reference tables will help eliminate a lot of the data redundancies. Having a lookup table like this, using foreign keys, we can ensure the set of values will only be from a certain list. This also helps increase the performance for some queries since we have fewer values to search on.
In looking at our tables in PostgreSQL, we can see these lookup tables being used for the genre:
We will also see it for media_type:
In both cases, we will have a set number of rows in each that won’t change. They could also be called reference tables as the media_type_id and genre_id are used in the track table in place of the full text in the name for each of the tables. One of the benefits of using these values is the flexibility for data modification. If you needed to update a media type name, rather than change it in every referenced value, it can be done in the media_type table. There may be some questions about what types of genres that we offer. Rather than querying the track table to find out, it’s much faster to go directly to the lookup table to find those values. This approach helps to centralize the data that’s about the key.