Database Transaction and Concurrency Control

Transaction

A transaction is an action or series of actions that are being performed by a single user or application program by a single user or application program, which reads or updates the contents of the database.

Transaction execution process:

· Start transaction

· Execute DML

· Check error in transaction execution

· If error occur than Rollback transaction

· Otherwise commit transaction

Transaction Management:

Transaction management handles all transactions property in DBMS. Database transactions are the events or activities such as series of data read or write operations on data objects stored in database system.

A transaction in a database system must maintain ACID property:

· Atomicity

· Consistency

· Isolation

· Durability

These properties ensure that a database remains stable state after the transaction is executed. It ensure accuracy, completeness and data integrity.

Atomicity

The atomicity property identifies that the transaction is atomic. This property ensures that either all the operations of a transaction reflect in database of none. Any updates that a transaction might effect on a system are completed in their entity. If for any reason an error occurs and the transaction is unable to compute all of its steps, then the system is returned to the state it was in before the transaction was started. Atomicity is also known as the ‘All or nothing rule’.

It involves following two operations:

Abort: if a transaction aborts, changes made to database are not visible.

Commit: if a transaction commits, changes made are visible.

The transaction should fail without executing any of the operation or it should process both the operations, the atomicity property ensures that. Not returning the system to its original state known as ROLLBACK.

Consistency

This means that integrity constraints must be maintained so that the database is consistent before and after the transaction. It refers to the correctness of a database. A transaction enforces that at the end of any transaction the system is in a valid state.

Isolation

In a database system where more than one transaction are being executed simultaneously and in parallel, the property of isolation states that all the transaction will be carried out and executed as if it is the only transaction in the system. No transaction will affect the existence of any other transaction. For every pair of transactions, one transaction should start executing only when the other finished.

Durability

Once a transaction completes successfully, the changes it has made into the database should be permanent even if there is a system failure. The recovery management component of database system ensures the durability of transaction.

Once a transaction commits, the system must guarantee that the results of the operations will never be lost, in spite of subsequent failures.

Concurrency control:

It is a process to ensure that data is updated correctly and appropriately when multiple transactions concurrently executed in DBMS.

When there is multiple transactions executing at the same time on same data, it may affect the result of the transaction. Hence it is necessary to maintain the order of execution of those transactions. In addition, it should not alter the ACID property of a transaction.

In order to maintain the concurrent access of transactions different protocols are introduced:

· Lock based protocol

· Time-stamp protocol

· Optimistic method of concurrency control

Lock based protocol

Lock is in other words called an access. In this type of protocol any transaction will not be processed until the transaction gets the lock on the record. That means any transaction will not retrieve, insert update or delete the data unless it gets the access to that particular data.

These are further classified into:

Lock Granularity

A database is basically represented as a collection of named data items. The size of the data item chosen as the unit of protection by a concurrency control program is called Granularity.

It can take place at the following level.

· Database level

· Table level

· Page level

· Row (Tuple) level

· Attributes (Field) level

Lock Types:

The DBMS mainly uses following types of locking techniques:

Binary locking

In binary lock, data can either be locked or unlocked. It will have only these two states. If locked than it can be locked for retrieve, insert, update or delete the data. If unlocked than it can be unlocked for not using the data.

Shared locking

Shared lock is placed when we are reading the data, multiple shared locks can be placed on the data but, when a shared lock is placed no exclusive lock can be placed.

Exclusive locking

Exclusive lock is placed when we want to read and write the data. This lock allows both the read and write operation, once this lock is placed on the data no other lock can be placed on the data until exclusive lock is released.

Deadlocks

A deadlock is a condition wherein two or more tasks are waiting for each other in order to be finished but none of the task is willing to give up the resources that other task needs. In this situation no task ever gets finished and is in waiting state forever.

Coffman conditions

Coffman stated four conditions for a deadlock occurrence:

Mutual exclusion condition: there must be at least one resource that cannot be used by more than one process at a time.

Hold and wait condition: A process that is holding a resources can request for additional resource that are being held by other process in the system.

No pre-emption condition: A resource can not be forcibly taken from a process. Only the process can release a resource that is being held by it.

Circular wait condition: A condition where one process in waiting for a resource that is being held by second process. And the second process is waiting for third process and so on, and the last process is waiting for the first process. Thus making a circular chain of waiting.

Deadlock handling techniques:

1. Deadlock prevention

We have learnt that if all the four Coffman conditions hold true then a deadlock occurs so the preventing one or more of them could prevent the deadlock.

Removing mutual exclusion

All resources must be sharable that means at a time more than one process can get a hold of the resources. That approach is practically impossible.

Removing hold and wait condition

This can be removed if the process acquires all the resources that are needed before starting out. Another way to remove this is to enforce a rule of requesting resource when there are none in held by the process.

Preemption of resources

This can result in rollback and thus this needs to be avoided in order to maintain the consistency and stability of the system.

Avoid circular wait condition

This can be avoided if the resources are maintained in a hierarchy and process can hold the resources in increasing order of precedence. This avoid circular wait. Another way of doing this is to force one resource per process rule. A process can request for a resource once it releases the resource currently being held by it. This avoids circular wait.

2. Deadlock detection

This technique allows deadlock to occur, but then, it detects and solve it. Here, a database periodically checked for deadlocks. If a deadlock is detected, one of the transaction continue their execution. An aborted transaction is rollback and restarted.

3. Deadlock avoidance

Deadlock can be avoided if resources are allocated in such a way that it avoids the deadlock occurrence. There are two algorithms:

Wait/die

Wound/wait

Time-stamp method for concurrency control:

The most commonly used concurrency protocol is the timestamp based protocol. This protocol uses either system time or logical counter as a timestamp.

The lock-based protocols manage the order between the conflicting pairs among transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created. Every transaction has a time stamp associated with it, and the ordering is determined by the age of the transaction. In addition, every data is given the latest read and write timestamp. This lets the system know when the last ‘read and write’ operation was performed on the data item.

Optimistic method of concurrency control:

The optimistic approach is based on the assumption that the majority of the database operations do not conflict. An optimistic concurrency control method is also known as validation or certification methods. The optimistic method approach requires neither locking nor time-stamping techniques. Instead, a transaction is executed without restriction until it is committed.

In this method, each transaction moves through the following phases:

Read phase:

During the read phase, the transaction reads values of committed data from the database, execute the needed computations and makes the updates to a private copy of the database value. All update operations of the transaction are recorded in a temporary update file, which is not accessed by the remaining transactions.

Validation or certification phase:

In a validation phase, the transaction is validated to assure that the changes made will not affect the integrity and consistency of the database. If the validation test is positive, the transaction goes to the write phase. If the validation test is negative, the transaction is restarted and the changes are discarded. Thus, in this phase the list of granules is checked for conflicts. If conflicts are detected in this phase, the transaction is aborted and restarted. The validation algorithm must check that the transaction has seen all modifications of transactions committed after it starts.

Write phase:

In this phase, the changes are permanently applied to the database and the updated granules are made public. Otherwise, the updates are discarded and the transaction is restarted. This phase is only for the Read-Write transactions and not for Read-only trannsactions.

My learning by experience

Search This Blog

Featured Post

Database Transaction and Concurrency Control

Labels

Comments

Post a Comment