Online College Courses for Credit

Query Processing Bottlenecks

Query Processing Bottlenecks

Author: Sophia Tutorial

Recall the five components that can affect performance due to traffic loads.

See More

what's covered
This tutorial explores some of the query processing bottlenecks in two parts:
  1. Introduction
  2. Types of Bottlenecks

1. Introduction

The various query processing bottlenecks could occur in a variety of scenarios. It does depend on the purpose of the database whether it is a transactional or analytical database, the number of concurrent users, and the database selection itself. The most common type of bottleneck generally comes from different transactions competing for the same data rows. The issue here would come from shared buffers and locks on the data that could generate a variety of issues that would generate bottlenecks. There can be many other criteria that may be specific to the hardware due to memory, CPU, or I/O.

2. Types of Bottlenecks

When it comes to memory bottlenecks, they can be due to insufficient physical memory on the system that is available to the database. They can also be due to the database activities currently taking up the available memory. Typically, if we are seeing longer query execution times, excessive input/output on the system, or even out-of-memory messages in the logs, this would be a good indicator that there is an issue. Typically to avoid those memory-related bottlenecks, we want to run the query optimizers to improve the queries by tuning the queries or potentially adding more physical memory. Without enough memory/RAM, all the running processes on the system would have to compete for this resource as it is shared among all processes.

If there isn’t enough storage on the system to support the database, we can see a lot of input/output types of bottlenecks. In those situations, we will see longer response times from the database or time-outs on queries. To help avoid these types of issues, it would be useful to set up monitoring on the database to identify which activities will use a lot of storage and figure out the best ways to fix those problems. In addition, the storage/hard disks will also depend on the speed and transfer rates of the disk. The storage is not only used for the end-user data but also for the rest of the system data as well. The OS if there is not enough physical memory can use the hard disk for virtual memory.

With CPU bottlenecks, the most common issue is due to insufficient hardware resources. We can generally identify these issues by looking at the logs to see if the database is heavily using the CPU. Generally, to help avoid CPU-related issues, it can be beneficial to have only the database running on the server and running all the other software on a separate machine. It can help to identify the CPU hogs and then tune the issues accordingly. If the CPU has this high utilization, the database would be very slow for work that is performed and would affect all of the other applications and processes on the system.

Network issues can also be a bottleneck although the issues are typically external to the database itself. However, this component helps to connect the database server to the clients. If we have the network nodes accessing the database at the same time, there can be a limited amount of bandwidth that has to be shared among all clients.

Application code is also another common issue that generally isn’t caused by hardware, but the limitation of the application code can be improved with code optimization, query optimization as long as the underlying database is well created.

Query processing bottlenecks can be caused by issues internal to the database as well as external.

Source: Authored by Vincent Tran