Sunday 8 March 2015

Query Optimization

A query is a request for information from a database.Query optimization is a function of  relational database management systems. The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.the query optimizer cannot be accessed directly by users: once queries are submitted to database server, and parsed by the parser, they are then passed to the query optimizer where optimization occurs

A query is processed in four general steps:

  1. Scanning and Parsing
  2. Query Optimization or planning the execution strategy
  3. Query Code Generator (interpreted or compiled)
  4. Execution in the runtime database processor

1. Scanning and Parsing

  • When a query is first submitted (via an applications program), it must be scanned and parsed to determine if the query consists of appropriate syntax.
  • Scanning is the process of converting the query text into a tokenized representation.
  • The tokenized representation is more compact and is suitable for processing by the parser.
  • This representation may be in a tree form.
  • The Parser checks the tokenized representation for correct syntax.
  • In this stage, checks are made to determine if columns and tables identified in the query exist in the database and if the query has been formed correctly with the appropriate keywords and structure.
  • If the query passes the parsing checks, then it is passed on to the Query Optimizer.

2. Query Optimization or Planning the Execution Strategy

  • For any given query, there may be a number of different ways to execute it.
  • Each operation in the query (SELECT, JOIN, etc.) can be implemented using one or more different Access Routines.
  • For example, an access routine that employs an index to retrieve some rows would be more efficient that an access routine that performs a full table scan.
  • The goal of the query optimizer is to find a reasonably efficient strategy for executing the query (not quite what the name implies) using the access routines.
  • Optimization typically takes one of two forms: Heuristic Optimization or Cost Based Optimization
  • In Heuristic Optimization, the query execution is refined based on heuristic rules for reordering the individual operations.
  • With Cost Based Optimization, the overall cost of executing the query is systematically reduced by estimating the costs of executing several different execution plans.

3. Query Code Generator (interpreted or compiled)

  • Once the query optimizer has determined the execution plan (the specific ordering of access routines), the code generator writes out the actual access routines to be executed.
  • With an interactive session, the query code is interpreted and passed directly to the runtime database processor for execution.
  • It is also possible to compile the access routines and store them for later execution.

4. Execution in the runtime database processor


  • At this point, the query has been scanned, parsed, planned and (possibly) compiled.
  • The runtime database processor then executes the access routines against the database.
  • The results are returned to the application that made the query in the first place.
  • Any runtime errors are also returned.

Query Optimization

We divide the query optimization into two types: Heuristic (sometimes called Rule based) and Systematic (Cost based).

Heuristic Query Optimization

  • Oracle calls this Rule Based optimization.
  • A query can be represented as a tree data structure. Operations are at the interior nodes and data items (tables, columns) are at the leaves.
  • The query is evaluated in a depth-first pattern.
  • Consider this query from Elmasri/Navathe.

Systematic (Cost based) Query Optimization

  • Access cost to secondary storage (hard disk)
  • Storage Cost for intermediate result sets
  • Computation costs: CPU, memory transfers, etc. for performing in-memory operations.
  • Communications Costs to ship data around a network. e.g., in a distributed or client/server database.

No comments:

Post a Comment