Lazy Loading and Second-Level Caching in MyBatis

I. Lazy Loading

1. Core Definition

Lazy loading is MyBatis's optimization mechanism for related queries: when the main object is queried, its related objects (such as the accounts in a one-to-many relationship between users and accounts) are not queried immediately. The SQL for the related query is executed only when the related object is actually used (its getter method is called).

2. Core Comparison (Lazy Loading vs. Immediate Loading)

Loading method	Execution timing	Applicable scenarios	Advantages	Disadvantages
Lazy loading	When the `getter` of the related object is called	Related data is used infrequently; related tables contain a large amount of data.	Avoids unnecessary queries and improves performance.	May trigger the N+1 problem (when related objects are accessed in a loop).
Immediate (eager) loading	When the main object is queried (e.g., `LEFT JOIN` with the related tables)	Related data is always needed and is small in volume.	Completes the query in one go, avoiding multiple database interactions.	Loads related data that may never be used, which costs performance.

3. Implementation Principle

Based on dynamic proxies: when the main object is queried, MyBatis returns a proxy instance of the main object (generated at runtime as a subclass of the entity class, by Javassist by default). When the getter of a related object is called, the proxy intercepts the call, executes the related query SQL, assigns the result to the property, and returns it.
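
The interception idea can be sketched in plain Java. This is only an illustration, not MyBatis's implementation: it uses the JDK's java.lang.reflect.Proxy, which requires an interface, whereas MyBatis proxies the entity class itself. The names LazyUser, LazyLoadingSketch, and proxyFor are hypothetical, and the "related query" is simulated by a Supplier.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical interface: java.lang.reflect.Proxy can only proxy interfaces,
// whereas MyBatis proxies the entity class itself.
interface LazyUser {
    String getUsername();
    List<String> getAccounts();
}

class LazyLoadingSketch {
    static int relatedQueries = 0; // how many times the simulated related query ran

    static LazyUser proxyFor(String username, Supplier<List<String>> relatedQuery) {
        InvocationHandler handler = new InvocationHandler() {
            private List<String> accounts; // stays null until the getter is first called
            private boolean loaded;

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                switch (method.getName()) {
                    case "getUsername":
                        return username;           // already filled by the main query
                    case "getAccounts":
                        if (!loaded) {             // getter intercepted: load on first use
                            accounts = relatedQuery.get();
                            loaded = true;
                        }
                        return accounts;
                    case "hashCode":
                        return System.identityHashCode(proxy);
                    case "equals":
                        return proxy == args[0];
                    default:
                        return "LazyUser(" + username + ")"; // toString
                }
            }
        };
        return (LazyUser) Proxy.newProxyInstance(
                LazyUser.class.getClassLoader(), new Class<?>[] {LazyUser.class}, handler);
    }

    public static void main(String[] args) {
        LazyUser user = proxyFor("alice", () -> {
            relatedQueries++; // stands in for executing the related SQL
            return Arrays.asList("acc-1", "acc-2");
        });
        System.out.println(user.getUsername() + ", related queries so far: " + relatedQueries);
        System.out.println(user.getAccounts() + ", related queries so far: " + relatedQueries);
        System.out.println(user.getAccounts() + ", related queries so far: " + relatedQueries);
    }
}
```

In this sketch the counter stays at 0 after getUsername() and reaches 1 on the first getAccounts() call; the second getAccounts() call reuses the loaded value.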

4. Configuration Requirements (Globally Enabled)

Configure this in the <settings> element under the <configuration> root tag of SqlMapConfig.xml (MyBatis disables lazy loading by default):

<settings> 
    <!-- Enable lazy loading globally (core) --> 
    <setting name="lazyLoadingEnabled" value="true"/> 
    <!-- Disable aggressive loading (default false in versions after MyBatis 3.4.1; set it explicitly for 3.4.1 and earlier)
         so that calling any method does not load all lazy properties at once -->
    <setting name="aggressiveLazyLoading" value="false"/>
    <!-- Optional: methods of Object that, when called, trigger loading of all lazy properties
         (default: equals, clone, hashCode, toString) -->
    <setting name="lazyLoadTriggerMethods" value="equals,clone,hashCode,toString"/>
</settings>

5. Related Query Configuration (XML Example)

One-to-many (user -> account)

<!-- UserMapper.xml --> 
<resultMap id="userAccountMap" type="cn.tx.domain.User"> 
    <id property="id" column="id"/> 
    <result property="username" column="username"/> 
    <!-- <collection> configures the one-to-many relationship; select specifies the related query:
         property = relationship attribute name in the main object
         ofType   = element type of the related collection
         column   = relationship condition (primary key of the main table -> foreign key of the subordinate table)
         select   = statement id of the related query SQL -->
    <collection
        property="accounts"
        ofType="cn.tx.domain.Account"
        column="id"
        select="cn.tx.mapper.AccountMapper.findAccountsByUid"/>
</resultMap> 
 
<!-- Main query: only query users, do not load accounts --> 
<select id="findUserById" resultMap="userAccountMap"> 
    SELECT id, username FROM user WHERE id = #{id} 
</select>

One-to-one (User -> ID card)

<resultMap id="userIdCardMap" type="cn.tx.domain.User">
    <id property="id" column="id"/>
    <result property="username" column="username"/>
    <!-- <association> configures the one-to-one relationship;
         javaType specifies the type of the single related object -->
    <association
        property="idCard"
        javaType="cn.tx.domain.IdCard"
        column="id"
        select="cn.tx.mapper.IdCardMapper.findByIdCardByUid"/>
</resultMap>

6. Key Considerations

SqlSession liveness requirements: The related query runs when the getter is called, so the safest approach is to call the getter before the SqlSession is closed. Loading after the session is closed is not guaranteed to work and may fail with an error such as PersistenceException.
Avoiding the N+1 problem: When iterating over multiple main objects in a loop, avoid triggering the related queries one by one (1 main query + N related queries); for large amounts of data, it is recommended to use LEFT JOIN immediate loading.
There are no special requirements for entity classes: they do not need to implement any interface; the related attributes only need getter/setter methods (the proxy object triggers loading through the getter).
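
The N+1 count can be made concrete with a small simulation. This is a sketch with an in-memory map standing in for the database; the class NPlusOneSketch and its methods are hypothetical, and each method call that would be a round trip simply increments a counter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NPlusOneSketch {
    static int queries = 0; // simulated database round trips

    // Hypothetical data: user id -> account names
    static final Map<Integer, List<String>> ACCOUNTS = new LinkedHashMap<>();
    static {
        ACCOUNTS.put(1, Arrays.asList("acc-1a", "acc-1b"));
        ACCOUNTS.put(2, Arrays.asList("acc-2a"));
        ACCOUNTS.put(3, Arrays.asList("acc-3a"));
    }

    // Main query: one round trip for all users
    static List<Integer> findAllUserIds() {
        queries++;
        return new ArrayList<>(ACCOUNTS.keySet());
    }

    // Related query: one round trip per user (what a lazy getter would trigger)
    static List<String> findAccountsByUid(int uid) {
        queries++;
        return ACCOUNTS.get(uid);
    }

    // Joined query: users and accounts in a single round trip
    static Map<Integer, List<String>> findUsersWithAccountsJoined() {
        queries++;
        return ACCOUNTS;
    }

    // Lazy loading inside a loop: 1 main query + N related queries
    static int countLazyLoop() {
        queries = 0;
        for (int uid : findAllUserIds()) {
            findAccountsByUid(uid);
        }
        return queries;
    }

    // Immediate loading with a join: 1 query in total
    static int countJoined() {
        queries = 0;
        findUsersWithAccountsJoined();
        return queries;
    }

    public static void main(String[] args) {
        System.out.println("lazy loop queries: " + countLazyLoop());
        System.out.println("joined queries: " + countJoined());
    }
}
```

With the three hypothetical users above, the lazy loop issues 4 simulated queries (1 + 3) and the joined variant issues 1.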

II. MyBatis Caching Mechanism (First-Level Cache + Second-Level Cache)

The core purpose of MyBatis caching is to reduce the number of database queries and improve performance. It is divided into first-level cache (local cache) and second-level cache (global cache).

1. First-level cache (Local Cache)

(1) Core definition

Scope of application: Within a single SqlSession (session-level cache); different SqlSessions do not share the cache.
Storage medium: memory (HashMap), enabled by default, no additional configuration required.

(2) Work process

When the same query (same SQL + parameters) is executed within the same SqlSession:
- First query: query the database and store the result in the first-level cache.
- Second query: return the result directly from the cache without executing SQL.
Scenarios that trigger cache clearing:
- An insert/update/delete operation is performed (the first-level cache of the current SqlSession is cleared automatically to ensure data consistency).
- sqlSession.clearCache() is called to clear the cache manually.
- The SqlSession is committed, rolled back, or closed (cache invalidation).
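
The process above can be modeled with a plain HashMap. This is a simplified sketch, not MyBatis code: LocalCacheSketch is a hypothetical stand-in for one SqlSession, the cache key is reduced to statement id plus parameter, and the database is simulated by a Function.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a single SqlSession and its first-level cache.
class LocalCacheSketch {
    private final Map<String, Object> localCache = new HashMap<>();
    private final Function<Object, Object> database; // simulated database lookup
    int databaseHits = 0;

    LocalCacheSketch(Function<Object, Object> database) {
        this.database = database;
    }

    Object selectOne(String statementId, Object parameter) {
        String key = statementId + "|" + parameter; // same SQL + same parameters
        if (localCache.containsKey(key)) {
            return localCache.get(key);             // second query: served from the cache
        }
        databaseHits++;                             // first query: goes to the database
        Object row = database.apply(parameter);
        localCache.put(key, row);
        return row;
    }

    // Any insert/update/delete clears the whole session cache.
    void update() {
        localCache.clear();
    }

    // Manual clearing, analogous to sqlSession.clearCache().
    void clearCache() {
        localCache.clear();
    }

    public static void main(String[] args) {
        LocalCacheSketch session = new LocalCacheSketch(id -> "user-" + id);
        session.selectOne("findUserById", 1);
        session.selectOne("findUserById", 1);
        System.out.println("hits after two identical selects: " + session.databaseHits);
        session.update();
        session.selectOne("findUserById", 1);
        System.out.println("hits after update + select: " + session.databaseHits);
    }
}
```

In this model two identical selects cost one database hit, and the select after update() costs a second one.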

(3) Configuration instructions

Enabled by default, but its scope can be adjusted with the localCacheScope setting (global configuration):

<settings>
    <!-- SESSION (default): the cache applies to the entire SqlSession -->
    <!-- STATEMENT: the cache applies only to the current statement and is cleared immediately after execution -->
    <setting name="localCacheScope" value="SESSION"/>
</settings>

2. Second Level Cache

(1) Core definition

Scope: Globally shared (across SqlSessions), isolated by Mapper namespace (statements within the same namespace share one cache).
Storage medium: Memory by default (HashMap); can be integrated with third-party caches such as Redis/Ehcache (which can persist data).
Dependency requirements: The cached entity class must implement the Serializable interface (with the default readOnly="false", cached objects are stored in serialized form).

(2) Work process

When the second-level cache is enabled, query results held by a SqlSession are written to the second-level cache when that SqlSession is committed or closed.
When the same query (same namespace + same SQL + parameters) is executed again in another SqlSession:
- The second-level cache is checked first; if a match is found, the cached result is returned.
- If a cache miss occurs, the query falls through to the first-level cache and then the database, and the result is stored in the first-level cache of the SqlSession. After the SqlSession is committed or closed, the result is synchronized to the second-level cache.
Scenarios that trigger cache clearing:
- An insert/update/delete operation is performed within the same namespace (the second-level cache of that namespace is cleared automatically).
- flushInterval is configured for automatic refresh (e.g., every 60 seconds).
- sqlSessionFactory.getConfiguration().getCache(namespace).clear() is invoked manually.
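
The write-on-close behavior and the lookup order can be sketched with two maps per session and one shared map per namespace. This is a deliberately simplified model under stated assumptions: TwoLevelCacheSketch and Session are hypothetical names, the key is a plain string, and commit is folded into close().

```java
import java.util.HashMap;
import java.util.Map;

class TwoLevelCacheSketch {
    // namespace -> shared (second-level) cache
    static final Map<String, Map<String, Object>> SECOND_LEVEL = new HashMap<>();
    static int databaseHits = 0;

    // Hypothetical stand-in for a SqlSession bound to one namespace.
    static class Session implements AutoCloseable {
        private final Map<String, Object> shared;                   // second-level cache
        private final Map<String, Object> local = new HashMap<>();  // first-level cache
        private final Map<String, Object> staged = new HashMap<>(); // waiting for close
        private boolean clearSharedOnClose;

        Session(String namespace) {
            shared = SECOND_LEVEL.computeIfAbsent(namespace, n -> new HashMap<>());
        }

        Object select(String key) {
            // Lookup order: second-level cache, then first-level cache, then database.
            if (!clearSharedOnClose && shared.containsKey(key)) {
                return shared.get(key);
            }
            if (local.containsKey(key)) {
                return local.get(key);
            }
            databaseHits++;
            Object row = "row:" + key; // simulated database result
            local.put(key, row);
            staged.put(key, row);      // not visible to other sessions yet
            return row;
        }

        // insert/update/delete: drop local data and mark the namespace cache for clearing.
        void update() {
            local.clear();
            staged.clear();
            clearSharedOnClose = true;
        }

        @Override
        public void close() {
            if (clearSharedOnClose) {
                shared.clear();
            }
            shared.putAll(staged);     // results become visible to other sessions
        }
    }

    public static void main(String[] args) {
        try (Session first = new Session("cn.tx.mapper.UserMapper")) {
            first.select("findUserById:1");
        }
        try (Session second = new Session("cn.tx.mapper.UserMapper")) {
            second.select("findUserById:1");
        }
        System.out.println("database hits: " + databaseHits);
    }
}
```

In main, the second session is served from the shared map because the first session was closed before it started, so only one simulated database hit occurs.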

(3) Complete configuration steps

Step 1: Implement Serializable in the entity class

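import java.io.Serializable;
import java.util.List;

// Must implement Serializable; otherwise storing the object in a read/write cache fails
public class User implements Serializable {
    private static final long serialVersionUID = 1L;
    private Integer id;
    private String username;
    private List<Account> accounts; // Account must also implement Serializable
    // getter/setter/toString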
}
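
Why Serializable matters can be shown with standard Java serialization: a read/write cache hands each caller its own copy, and one common way to produce such a copy is a serialize/deserialize round trip. The sketch below uses only the JDK; SerializedCopySketch, CachedUser, and copyOf are hypothetical names and this is not MyBatis's own code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializedCopySketch {
    // Hypothetical cached entity
    static class CachedUser implements Serializable {
        private static final long serialVersionUID = 1L;
        String username;

        CachedUser(String username) {
            this.username = username;
        }
    }

    // Deep copy through a serialization round trip.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copyOf(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original); // throws NotSerializableException for non-serializable fields
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not copy cached object", e);
        }
    }

    public static void main(String[] args) {
        CachedUser cached = new CachedUser("alice");
        CachedUser copy = copyOf(cached);
        copy.username = "changed by caller";
        // The cached instance is unaffected by changes to the copy.
        System.out.println(cached.username);
    }
}
```

Because each caller mutates only its own copy, the cached instance stays intact; the cost is one serialization round trip per read.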

Step 2: Enable second-level caching globally (SqlMapConfig.xml)

<settings>
    <!-- Global switch for the second-level cache (default true) -->
    <setting name="cacheEnabled" value="true"/>
    <!-- Note: MyBatis has no global flush-interval setting; the refresh time is set per mapper
         with the flushInterval attribute of the <cache> tag (see Step 3) -->
</settings>

Step 3: Enable caching for the Mapper (XML method)

Add a <cache> tag under the root <mapper> tag of the Mapper XML:

<!-- UserMapper.xml --> 
<mapper namespace="cn.tx.mapper.UserMapper"> 
    <!-- Enable second-level caching for the current namespace:
         eviction      = cache eviction strategy (default LRU)
         flushInterval = automatic refresh every 60 seconds
         size          = maximum number of cached objects (default 1024)
         readOnly      = false: modifiable (returns a copy); true: read-only (returns the original object, higher performance) -->
    <cache
        eviction="LRU"
        flushInterval="60000"
        size="1024"
        readOnly="false"/>
 
    <!-- Query method: default useCache="true" (enables second-level caching) --> 
    <select id="findUserById" resultMap="userAccountMap" useCache="true"> 
        SELECT id, username FROM user WHERE id = #{id} 
    </select> 
 
    <!-- insert/update/delete statements: default flushCache="true" (clears the cache), can be omitted -->
    <update id="updateUser" flushCache="true"> 
        UPDATE user SET username = #{username} WHERE id = #{id} 
    </update> 
</mapper>

Step 4: Annotation Configuration (Optional)

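// UserMapper.java (enables second-level caching via annotations)
// The statements are declared here with @Select/@Update, so the same ids must not also be defined in UserMapper.xml.
import org.apache.ibatis.annotations.CacheNamespace;
import org.apache.ibatis.annotations.Options;
import org.apache.ibatis.annotations.Param;
import org.apache.ibatis.annotations.Select;
import org.apache.ibatis.annotations.Update;
import org.apache.ibatis.cache.decorators.LruCache;
import org.apache.ibatis.cache.impl.PerpetualCache;

@CacheNamespace(
    implementation = PerpetualCache.class, // Cache implementation class (default)
    eviction = LruCache.class,             // Eviction strategy (default)
    flushInterval = 60000,
    size = 1024,
    readWrite = true                       // Equivalent to readOnly="false"
)
public interface UserMapper {
    @Select("SELECT id, username FROM user WHERE id = #{id}")
    @Options(useCache = true) // Use the second-level cache (default for selects)
    User findUserById(@Param("id") Integer id);

    @Update("UPDATE user SET username = #{username} WHERE id = #{id}")
    @Options(flushCache = Options.FlushCachePolicy.TRUE) // Clear the cache (default for updates)
    void updateUser(User user);
}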

(4) Core attributes of the <cache> tag

Attribute	Description
`eviction`	Cache eviction strategy (4 options). LRU (default): Least Recently Used, removes the object that has gone unused the longest. FIFO: First In First Out, removes objects in the order they were added. SOFT: soft reference, removes objects when memory is insufficient. WEAK: weak reference, removes objects during garbage collection.
`flushInterval`	Automatic refresh time (milliseconds); by default there is no automatic refresh (the cache is cleared only by insert/update/delete operations).
`size`	The maximum number of cached objects (default 1024); adjust it based on available memory.
`readOnly`	`false` (default): the cached object is modifiable (a deserialized copy is returned); `true`: read-only (the original object is returned, resulting in higher performance).
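
The LRU strategy with a size limit can be sketched using the JDK's LinkedHashMap in access order. This is a minimal illustration of the eviction idea under the assumption of a tiny limit of 2 entries; LruSketch is a hypothetical class, not the MyBatis implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU map: evicts the least recently used entry once the limit is exceeded.
class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries; // plays the role of the size attribute

    LruSketch(int maxEntries) {
        super(16, 0.75f, true);   // true = iterate in access order, not insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // called after each put
    }

    public static void main(String[] args) {
        LruSketch<String, String> cache = new LruSketch<>(2);
        cache.put("user:1", "alice");
        cache.put("user:2", "bob");
        cache.get("user:1");          // user:1 becomes the most recently used
        cache.put("user:3", "carol"); // exceeds the limit, so user:2 is evicted
        System.out.println(cache.keySet());
    }
}
```

A FIFO variant would differ only in passing false as the third constructor argument, so that reads do not change the eviction order.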

3. Core Comparison of First-Level and Second-Level Caches

Feature	First-level cache (local cache)	Second-level cache
Scope of application	Single SqlSession	Global (across SqlSessions), isolated by namespace
How to enable	Enabled by default, no configuration required.	Global switch + enabled separately for each Mapper
Storage medium	Memory (HashMap)	Memory by default; third-party caches can be integrated.
Serialization requirements	None	Entity classes must implement Serializable
Data consistency	Consistent within the session (cleared automatically)	Consistent across sessions (cleared automatically after insert/update/delete operations)
Applicable scenarios	Repeated queries within a single session	Data shared across sessions (such as dictionaries and static data)

4. Caching Usage Considerations

Cache consistency for related queries: If a query in Mapper A loads data that belongs to Mapper B, it is recommended that Mapper B also enable a second-level cache (or that the two mappers share one cache via <cache-ref>) to avoid inconsistent cached data.
Disabling the cache for individual queries: For data with high real-time requirements (such as order status), set useCache="false" on the statement to bypass the second-level cache.

<select id="findRealTimeOrder" resultType="Order" useCache="false">
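    <!-- order is a reserved word in SQL, so the table name is quoted (MySQL-style backticks shown) -->
    SELECT * FROM `order` WHERE id = #{id}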
</select>

Third-party cache integration: In production environments, it is recommended to use Redis/Ehcache instead of the default in-memory cache (the default cache is lost when the application restarts). You need to add the corresponding dependency (e.g. mybatis-redis) and configure the cache implementation class.
Avoiding dirty cached data: Second-level caching is not recommended for multi-table joins that span namespaces, because an update to one table may not clear the cache of the other namespaces, leaving stale (dirty) data in them.

III. The Collaborative Working of Lazy Loading and Caching

Lazy-loaded related queries also use the cache: after the getter triggers the related query for the first time, the result is stored in the first-level cache of the SqlSession, and after the SqlSession is committed or closed, it is synchronized to the second-level cache.
Cache priority: Second-level cache > First-level cache > Database (a query first checks the second-level cache, then the first-level cache, and finally the database).
Collaborative optimization scenario: When querying the main object (such as a user), use lazy loading (to avoid loading unneeded related data) and enable second-level caching for both the main object and the related objects (to reduce repeated queries). This suits scenarios where "the main data does not change frequently and related data is loaded on demand" (such as user information + historical orders).