Wednesday, September 2, 2009

DELETE vs. TRUNCATE

“If both the TRUNCATE and DELETE commands delete all the rows of a table, then what is the difference between them?”

Any interviewer may ask this question. The most commonly expected replies are:

  1. DELETE is a DML statement, while TRUNCATE is a DDL statement.
  2. DELETE is less drastic, in that a deletion can be rolled back whereas a truncation cannot.
  3. DELETE is also more controllable, in that it is possible to choose which rows to delete, whereas a truncation always affects the whole table.
  4. DELETE is, however, a lot slower on large tables and can place a lot of strain on the database. TRUNCATE is virtually instantaneous and effortless.

(For more such answers, look through the Geek Interviews page. I take no responsibility for the irrelevant answers :))

Alright! But how does one prove that TRUNCATE is virtually instantaneous and effortless, and that it really performs better than DELETE? How does it work internally?

As said in the first point, TRUNCATE is a DDL command and it operates within the data dictionary and affects the structure of the table, not the contents of the table. However, the change it makes to the structure has the side effect of destroying all the rows in the table.

An insight into it…

The data dictionary holds the definition of the data and also the table’s physical location. When a table is created, it is allocated a single area of space (of fixed size) in the database’s data files. This is known as an extent, and it is initially empty. Then, as rows are inserted into the table, the extent fills up. Once an extent is full, more extents are allocated to the table automatically. Therefore, a table may consist of one or more extents which hold the rows. Along with tracking the extent allocation, the data dictionary also tracks how much of the space allocated to the table has been used. This is done with the high water mark. The high water mark is the last position in the last extent that has been used; all space below the high water mark has been used for rows at one time or another, and none of the space above the high water mark has been used yet.
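
The bookkeeping described above can be pictured with a small Python sketch. It is only a toy model, not Oracle's actual storage code: the name `ToyTable`, the constant `EXTENT_SIZE`, and the idea of counting the high water mark in row slots are all assumptions made for illustration.

```python
# Toy model of extent allocation and the high water mark (HWM).
# EXTENT_SIZE and all class/method names are illustrative assumptions,
# not Oracle internals.

EXTENT_SIZE = 4  # slots per extent (hypothetical fixed size)


class ToyTable:
    def __init__(self):
        # A new table starts with a single, empty extent.
        self.extents = [[None] * EXTENT_SIZE]
        # Number of slots that have ever been used.
        self.hwm = 0

    def insert(self, row):
        # Once every allocated extent is full, allocate another one.
        if self.hwm == len(self.extents) * EXTENT_SIZE:
            self.extents.append([None] * EXTENT_SIZE)
        extent_no, slot = divmod(self.hwm, EXTENT_SIZE)
        self.extents[extent_no][slot] = row
        # Inserting into fresh space pushes the high water mark up.
        self.hwm += 1


table = ToyTable()
for i in range(6):
    table.insert(f"row-{i}")

print(len(table.extents))  # 2: the first extent is full, the second half used
print(table.hwm)           # 6: six slots lie below the high water mark
```

In this sketch the table grows one extent at a time, and the high water mark simply records how far into the allocated space rows have ever been written.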

Note that there may be plenty of space below the high water mark that is not in use at the moment, because rows have been removed with a DELETE command. Inserting rows into a table pushes the high water mark up. Deleting them leaves the high water mark where it is; the space they occupied remains assigned to the table but is freed up for inserting more rows.

Truncating a table resets the high water mark. That is, within the data dictionary, the recorded position of the high water mark is moved to the beginning of the table’s first extent. As Oracle assumes that there can be no rows above the high water mark, this has the effect of removing every row from the table. The table is emptied and remains empty until subsequent insertions begin to push the high water mark back up again. In this manner, one DDL command, which does little more than make an update in the data dictionary, can annihilate billions of rows in a table.
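
To make the contrast concrete, here is a rough Python sketch of the two behaviours. It is a simplified model under stated assumptions, not how Oracle is implemented: `ToyTable`, `delete_all`, `truncate`, and the `row_writes` counter are hypothetical names, and real details such as undo, redo, and extent deallocation are left out.

```python
# Toy comparison of DELETE and TRUNCATE against a high water mark (HWM).
# All names are illustrative assumptions; real Oracle storage is more involved.


class ToyTable:
    def __init__(self):
        self.slots = []       # space assigned to the table
        self.hwm = 0          # slots below this index have been used at some point
        self.row_writes = 0   # per-row work done by delete_all, for comparison

    def insert(self, row):
        # Reuse a freed slot below the HWM if there is one...
        for i in range(self.hwm):
            if self.slots[i] is None:
                self.slots[i] = row
                return
        # ...otherwise use fresh space and push the HWM up.
        if self.hwm == len(self.slots):
            self.slots.append(None)
        self.slots[self.hwm] = row
        self.hwm += 1

    def delete_all(self):
        # DELETE visits every row; the HWM stays where it is.
        for i in range(self.hwm):
            if self.slots[i] is not None:
                self.slots[i] = None
                self.row_writes += 1

    def truncate(self):
        # TRUNCATE only moves the recorded HWM; no row is touched.
        self.hwm = 0

    def rows(self):
        # A scan only reads space below the HWM.
        return [r for r in self.slots[:self.hwm] if r is not None]


deleted = ToyTable()
truncated = ToyTable()
for i in range(1000):
    deleted.insert(i)
    truncated.insert(i)

deleted.delete_all()
truncated.truncate()

print(deleted.rows(), deleted.hwm, deleted.row_writes)        # [] 1000 1000
print(truncated.rows(), truncated.hwm, truncated.row_writes)  # [] 0 0
```

Both tables end up empty, but the delete touched all 1000 rows and left the high water mark at 1000, while the truncate did a single update and touched none. A later insert into the deleted table reuses freed space below the mark, whereas the truncated table starts pushing the mark up again from zero.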
