MySQL transactions

I. Reasons for using transactions

In many cases, we need to execute a set of operations that either all fail or all succeed. For example, in a bank transfer, one user transfers money to another, deducting funds from the first user’s account and increasing the second user’s balance. Imagine if the MySQL server crashes after the transfer, or if for some other reason the second user’s balance doesn’t increase—that would be a disaster. This is where the role of transactions becomes apparent.

Example:

// Prepare the test table 
drop table if exists accout;

create table accout(
id int primary key auto_increment,
name varchar(20) comment 'Account Name',
money decimal(11,2) comment 'Amount'
);

insert into accout(name, money) values
​​('Cat', 5000),
('Dog', 1000);

For example, the Dog stole 2,000 yuan from Cat’s account.

-- Cat account reduced by 2000
update accout set money=money-2000 where name = 'Cat';

-- Dog account increased by 2000
update accout set money=money+2000 where name = 'Dog';

For example, if a network error occurs or the database crashes when executing the first SQL statement above, Cat’s account will be deducted 2,000 yuan, but the Dog’ account will not receive any additional funds.

Solution: Use transactions to control the process, ensuring that either both SQL statements succeed or both fail.

start transaction;
-- Cat account reduced by 2000
update accout set money=money-2000 where name = 'Cat';

-- Dog account increased by 2000
update accout set money=money+2000 where name = 'Dog';
commit;

II. Rollback Mechanism

We’ve learned that MySQL transactions ensure a group of SQL statements either all succeed or all fail. How does it ensure this? This leads to the rollback mechanism. In short, after a transaction is committed, if any statement in the group fails, a rollback will occur, restoring the database to its state before the transaction was executed.

To implement the rollback mechanism, the database records logs when executing transactions. Once the transactions are completed without any errors, these records can be discarded. However, if problems occur during the execution of a transaction, MySQL can use the records in the logs to perform recovery operations.

1) If an addition operation was performed previously, then delete the data.

2) If a deletion operation was performed previously, then add the data back.

3) If modifications were made previously, revert the data.

4) The previous operation was a query, so it has no impact and requires no response.

III. Basic Characteristics of Transactions

3.1 Atomicity

In summary, we know that a transaction is essentially packaging multiple SQL statements into a single “whole” that is guaranteed to either complete all executions or none at all, preventing any intermediate execution. This characteristic is known as the atomicity of transactions.

3.2 Consistency

The data before and after a transaction is executed must be consistent (there will be no discrepancies). This is also related to rollback; once a rollback is triggered, the rolled-back data must be correct. If the transaction executes successfully without triggering a rollback, the data must still meet the requirements. In other words, it’s a promise of “data correctness.”

3.3 Persistence

Durability means storing data on disk so that the data persists even after the program or host restarts. Transaction durability means that modifications made to the database by executing a transaction are persistently saved on disk and remain after a restart.

3.4 Isolation

 Before discussing the isolation of MySQL transactions, let’s first talk about the potential problems that MySQL transactions may encounter in concurrent scenarios. These mainly include the following issues.

3.4.1 Dirty Read Problem

Consider the following scenario: two transactions, A and B, execute concurrently. Transaction A modifies data in a table. Simultaneously, transaction B reads data from the same table, accessing temporary data from before transaction A’s commit. After transaction B reads this data, transaction A modifies it again. Thus, the data read by transaction B is temporary data, or “dirty data.” The key to solving the dirty read problem lies in using locks for write operations, preventing simultaneous reading and writing of data.

3.4.2 Non-repeatable read problem

There are three transactions, A, B, and C. Transaction A modifies data in a certain table. After transaction A commits, transaction B starts reading data from the same table. While transaction B is reading, transaction C modifies the data again. This results in transaction B reading different data initially than later. To solve this problem, a lock can be added to the read operations to prevent write operations from occurring while data is being read.

3.4.3 Phantom Reading Problem

Consider three transactions, A, B, and C. Transaction A modifies data in a certain table. After transaction A commits, transaction B begins reading data from the same table. While transaction B is reading, transaction C performs insert operations on the same table. Thus, although transaction B reads the same data before and after, the result sets are different. Solving the phantom read problem requires completely serial execution of transactions, essentially sacrificing concurrency.

MySQL offers four transaction levels, which can be configured via configuration files.

1) Read uncommitted: Allows reading data that has not been committed by other transactions (dirty read + non-repeatable read + phantom read; highest concurrency, lowest isolation).

2) Red committed: Can only read data committed by other transactions (solves dirty read problem, but has non-repeatable read and phantom read problems; reduces concurrency, improves isolation).

3) Repeatable red: Locks are added for both read and write operations (solves the problems of dirty reads and non-repeatable reads, which can lead to phantom reads, while reducing concurrency and improving isolation).

4) Serializable: All transactions are executed serially (solves dirty read, non-repeatable read, and phantom read problems; concurrency is virtually nonexistent; and isolation is the highest).