This chapter describes some of the available performance tuning features for EclipseLink, an open-source persistence framework used with Oracle TopLink. The chapter includes the following topics:
Oracle TopLink includes the open source EclipseLink as the Java Persistence API (JPA) implementation. Oracle TopLink extends EclipseLink with advanced integration into the Oracle Application Server.
The information in this chapter assumes that you are familiar with the basic functionality of EclipseLink. Before you begin tuning, consider reviewing the following introductory information:
"Understanding Queries" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/queries.htm#CHDGGCJB
"Understanding Caching" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/general004.htm#CHDEEBFG
"Understanding Mappings" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/mappingintro.htm#CHDFEJIJ
For more information on Oracle TopLink, see the TopLink page on the Oracle Technology Network (OTN).]
Note:
This chapter serves as a 'Quick Start' guide to performance tuning JPA in the context of a Java EE environment. While the chapter provides common performance tuning considerations and related documentation resources, it is not meant to be comprehensive list of areas to tune.The following tuning recommendations are applicable to most deployments. Always consult your own usecase scenarios before implementing any of these configurations.
This section covers using efficient SQL statements and SQL querying. Table 10-1 and Table 10-2 show tuning parameters and performance recommendations related to SQL statements and querying.
Table 10-1 EJB/JPA Using Efficient SQL Statements and Querying
| Tuning Parameter | Description | Performance Notes | 
|---|---|---|
| Parameterized SQL Binding | Using parameterized SQL and prepared statement caching, you can improve performance by reducing the number of times the database SQL engine parses and prepares SQL for a frequently called query. EclipseLink enables parameterized SQL by default. However, not all databases and JDBC drivers support these options. Note that the Oracle JDBC driver bundled with Oracle Application Server does support this option. The persistence property in persistence.xml "eclipselink.jdbc.bind-parameters" is used to configure this. See Also: "Understanding Caching" at  Default Value: PERSISTENCE_UNIT_DEFAULT (which is true by default) | Leave parameterized SQL binding enabled for selected databases and JDBC drivers that support these options. | 
| JDBC Statement Caching | Statement caching is used to lower the performance impact of repeated cursor creation and repeated statement parsing and creation; this can improve performance for applications using a database. Note: For Java EE applications, use the data source's statement caching (and do not use EclipseLink Statement Caching for EJB3.0/JPA, for example:  Set this option in an Oracle Weblogic data-source by setting  See also "Increasing Performance with the Statement Cache" in Administering JDBC Data Sources for Oracle WebLogic Server. Default Value: The Oracle Weblogic Server data source default statement cache size is 10 statements per connection. | You should always enable statement caching if your JDBC driver supports this option. The Oracle JDBC driver supports this option. | 
| Fetch Size | The JDBC fetch size gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed. For large queries that return a large number of objects, you can configure the row fetch size used in the query to improve performance by reducing the number database hits required to satisfy the selection criteria. Most JDBC drivers use a default fetch size of 10. If you are reading 1000 objects, increasing the fetch size to 256 can significantly reduce the time required to fetch the query's results. Note: The default value means use the JDBC driver default value, which is typically 10 rows for the Oracle JDBC driver. To configure this, use query hint  Default Value: 0 | The optimal fetch size is not always obvious. Usually, a fetch size of one half or one quarter of the total expected result size is optimal. Note that if you are unsure of the result set size, incorrectly setting a fetch size too large or too small can decrease performance. | 
| Batch Writing | Batch writing can improve database performance by sending groups of  The persistence property in persistence.xml  Default Value: Off | Enable for the persistence unit. | 
| Change Tracking | This is an optimization feature that lets you tune the way EclipseLink detects changes in an Entity. Default Value: AttributeLevel if using weaving (Java EE default), otherwise Deferred. | Leave at default AttributeLevel for best performance. | 
| Weaving | Can disable through persistence.xml properties " Default Value: On | Leave on for best performance. | 
| Read Only | Setting an EJB3.0 JPA Entity to read-only ensures that the entity cannot be modified and enables EclipseLink to optimize unit of work performance. Set through query hint "eclipselink.read-only". Can also be set at entity level using  Default Value: False | For optimal performance use read-only on any query where the resulting objects are not changed. | 
| firstResult and maxRows | These are JPA query properties that are used for paging large queries. Typically, these properties can be used when the entire result set of a query returning a large number of rows is not needed. For example, when a user scans the result set (a page at a time) looking for a particular result and then discards the rest of the data after the record is found. | Use on queries that can have a large result set and only a subset of the objects is needed. | 
| Sequence number pre-allocation | Sequence number pre-allocation enables a batch of ids to be queried from the database simultaneously in order to avoid accessing the database for an id on every insert. Default Value: 50 | Always use sequence number pre-allocation for best performance for inserts.  | 
Table 10-2 shows the Entity relationship query parameters for performance tuning.
Table 10-2 EJB3.0 Entity Relationship Query Performance Options
| Tuning Parameter | Description | Performance Notes | 
|---|---|---|
| Batch Fetching | The eclipselink.batch hint supplies EclipseLink with batching information so subsequent queries of related objects can be optimized in batches instead of being retrieved one-by-one or in one large joined read. Batch fetching has three types: JOIN, EXISTS and IN. The type is set through the query hint "eclipselink.batch.type" Note that batching is only allowed on queries that have a single object in their select clause. The query hint to configure this is "eclipselink.batch". Batch fetching can also be set using the @BatchFetch annotation. Default Value: Off | Use for queries of tables with columns mappings to table data you need.You should only use either batch fetching or joining if you know that you are going to access all of the data; if you do not intend to access the relationships, then just let indirection defer their loading. Batch fetching is more efficient than joining because it avoids reading duplicate data; therefore for best performance for queries where batch fetching is supported, consider using batch fetching instead of join reading. | 
| Join Fetching | Join fetching is a query optimization feature that enables a single query for a class to return the data to build the instances of that class and its related objects. Use this feature to improve query performance by reducing database access. By default, relationships are not join-read: each relationship is fetched separately when accessed if you are using lazy-loading, or as a separate database query if you are not using lazy-loading. You can specify the use of join in  Joining is part of the JPA specification, whereas batch fetching is not. And, joining works on queries that not work with batch fetching. For example, joining works on queries with multiple objects in the select clause, queries with a single result, and for cursors and first/max results, whereas batch fetching does not. See Also: "Join Fetching" at  Default Value: Not Used | Use for queries of tables with columns mappings to table data you need.You should only use either batch fetching or joining if you know that you are going to access all of the data; if you do not intend to access the relationships, then just let indirection defer their loading.For the best performance of selects, where batch fetching is not supported, a join is recommended | 
| Lazy loading | Without lazy loading on, when EclipseLink retrieves a persistent object, it retrieves all of the dependent objects to which it refers. When you configure lazy reading (also known as indirection, lazy loading, or just-in-time reading) for an attribute mapped with a relationship mapping, EclipseLink uses an indirection object as a place holder for the referenced object. EclipseLink defers reading the dependent object until you access that specific attribute. This can result in a significant performance improvement, especially if the application is interested only in the contents of the retrieved object, rather than the objects to which it is related. See Also: "Using Lazy Loading" at  Default Value: On for collection mapping (ToMany mappings, @OneToMany, @ManyToMany) Default Value: Off for reference (ToOne mappings, @OneToOne, @ManyToOne) (Note that setting lazy loading On for @OneToOne, @ManyToOne requires weaving, which is On by default for Java EE.) | Use lazy loading for all mappings. Using lazy loading and querying the referenced objects using batch fetching or Join is more efficient than Eager loading. You may also consider using optimized loading with LoadGroups which allows a query to force instantiation of relationships. | 
This section describes tuning the default internal cache that is provided by EclipseLink. Oracle Toplink/EclipseLink can also be integrated with Oracle Coherence. For information on configuring and tuning an EclipseLink Entity Cache using Oracle Coherence, see Section 10.3.1, "Integrating with Oracle Coherence".
The default settings for EJB3.0/JPA used with the EclipseLink persistence manager and cache are no locking, no cache refresh, and cache-usage DoNotCheckCache. To ensure that your application uses the cache and does not read stale data from the cache (when you do not have exclusive access), you must configure these and other isolation related settings appropriately. Table 10-3 shows the cache configuration options.
For more information on cache configuration, see "Understanding Caching" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/cache.htm#CDEFHHEH.
Note:
By default, EclipseLink assumes that your application has exclusive access to the data it is using (that is, there are no external, non-EclipseLink, applications modifying the data). If your application does not have exclusive access to the data, then you must change some of the defaults from Table 10-3.Table 10-3 EJB3.0 JPA Entities and Cache Configuration Options
| Tuning Parameter | Description | Performance Notes | 
|---|---|---|
| Object Cache | EclipseLink sessions provide an object cache. EJB3.0 JPA applications that use the EclipseLink persistence manager create EclipseLink sessions that by default use this cache. This cache, known as the session cache, retains information about objects that are read from or written to the database, and is a key element for improving the performance of an EclipseLink application. Typically, a server session's object cache is shared by all client sessions acquired from it. Isolated sessions provide their own session cache isolated from the shared object cache. The annotation type @Cacheable specifies whether an entity should be cached. Caching is enabled when the value of the persistence.xml caching element is ENABLE_SELECTIVE or DISABLE_SELECTIVE. The value of the Cacheable annotation is inherited by subclasses; it can be overridden by specifying Cacheable on a subclass. Cacheable(false) means that the entity and its state must not be cached by the provider. Default Value: Enabled (shared is True) | Generally it is recommended that you leave caching enabled. If you have an object that is always read from the database, as in a pessimistic locked object, then the cache for that entity should be disabled. Also, consider disabling the cache for infrequently accessed entities | 
| Query Result Set Cache | In addition to the object cache in EclipseLink, EclipseLink also supports a query cache: 
 The query hints for a query cache are: 
 
 
 See Also: "Understanding Caching" at  Default Value: Not Used | Use for frequently executed non-primary key queries with infrequently changing result sets.Use with a cache invalidation time out to refresh as needed. | 
| Cache Size | Cache size can be configured through persistence properties:  " 
 See Also: "About the Persistence Unit" at  Default Value: Type  | Set the cache size relative to how much memory you have available, how many instances of the class you have, the frequency the entities are accessed, and how much caching you want based on your tolerance for stale data. Consider creating larger cache sizes for entities that have many instances that are frequently accessed and stale data is not a big issue. Consider using smaller cache sizes or no cache for frequently updated entities that must always have fresh data, or infrequently accessed entities. | 
| Locking | Oracle supports the locking policies shown in Table 10-4: no locking, optimistic, pessimistic, and read-only. Locking is set through JPA @Version annotation, eclipselink.read-only See "Descriptors and Locking" at  Default Value: No Locking | For entities that can be updated concurrently, consider using the locking policy to prevent a user from writing over another users changes. To optimize performance for read-only entities, consider defining the entity as read-only or use a read-only query hint. | 
| Cache Usage | By default, all query types search the database first and then synchronize with the cache. Unless refresh has been set on the query, the cached objects can be returned without being refreshed from the database. You can specify whether a given query runs against the in-memory cache, the database, or both. To get performance gains by avoiding the database lookup for objects already in the cache, you can configure that the search attempts to retrieve the required object from the cache first, and then search the data source only if the object is not in the cache. For a query that looks for a single object based on a primary key, this is done by setting the query hint " Default Value:  | For faster performance on primary key queries, where the data is typically in the cache and does not require a lot of refreshing, it is recommended to check the cache first on these queries (using  This avoids the default behavior of retrieving the object from the database first and then for objects already in the cache, returning the cached values (not updated from the database access, unless refresh has been set on the query). | 
| Isolation | There is not a single tuning parameter that sets a particular database transaction isolation level in a JPA application that uses EclipseLink. In a typical EJB3.0 JPA application, a variety of factors affect when database transaction isolation levels apply and to what extent a particular database transaction isolation can be achieved, including the following: 
 See Also: "Isolated Cache" at  | |
| Cache Refreshing | By default, EclipseLink caches objects read from a data source. Subsequent queries for these objects access the cache and thus improve performance by reducing data source access and avoiding the cost of rebuilding object's and their relationships. Even if a query accesses the data source, if the objects corresponding to the records returned are in the cache, EclipseLink uses the cached objects. This default caching policy can lead to stale data in the application. Refreshing can be enabled at the entity level ( For more information see: Section 10.2.2.1, "Cache Refreshing Scenarios" See Also: "Understanding Caching" at  Default Value: No Cache Refreshing | Try to avoid entity level cache refresh and instead, consider configuring the following: 
 | 
There are a few scenarios to consider for data refreshing in the cache, all with performance implications:
In the case where you never want cached data and always want fresh data, consider using an isolated cache (Shared=False). This is the case when certain data in the application changes so frequently that it is desirable to always refresh the data, instead of only refreshing the data when a conflict is detected.
In the case when you want to avoid stale data, but getting stale data is not a major issue, then using a cache expiry policy would be the recommended solution. In this case you should also use optimistic locking, which automatically refresh stale objects when a locking error occurs. If using optimistic locking, you could also enable the entity @Cache attributes alwaysRefresh and refreshOnlyIfNewer to allow queries that access the database to refresh any stale objects returned, and avoid refreshing invalid objects when unchanged. You may also want to enable refreshing on certain query operations when you know you want refreshed data, or even provide the option of refreshing something from the client that would call a refreshing query.
In the case when you are not concerned about stale data, you should use optimistic locking; this automatically refresh stale objects in the cache on locking errors.
The locking modes, as shown in Table 10-4, along with EclipseLink cache-usage and query refreshing options, ensures data consistency for EJB entities using JPA. The different combinations have both functional and performance implications, but often the functional requirements for up-to-date data and data consistency lead to the settings for these options, even when it may be at the expense of performance.
For more information, see "Descriptors and Locking" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/descriptors002.htm#CHEEEIEA.
Table 10-4 Locking Mode Policies
| Locking Option | Description | Performance Notes | 
|---|---|---|
| No Locking | The application does not prevent users overwriting each other's changes. This is the default locking mode. Use this mode if the Entity is never updated concurrently or concurrent reads and updates to the same rows with read-committed semantics is sufficient. Default Value: No Locking | In general, no locking is faster, but may not meet your needs for data consistency. | 
| Optimistic | All users have read access to the data. When a user attempts to make a change, the application checks to ensure the data has not changed since the user read the data. See Also: "Using Optimistic Locking" at  | If infrequent concurrent updates to the same rows are expected, then optimistic locking may provide the best performance while providing data consistency guarantees. | 
| Pessimistic | The first user who accesses the data with the purpose of updating it locks the data until completing the update. | If frequent concurrent updates to the same rows are expected, pessimistic locking may be faster than optimistic locking that is getting a lot of concurrent access exceptions and retries. When using pessimistic locking at the entity level, it is recommended that you use it with an isolated cache (Shared=False) for best performance. | 
| Read Only | Setting an EJB3.0 JPA Entity to read-only ensures that the entity cannot be modified and enables EclipseLink to optimize unit of work performance. Set at the entity level using @ReadOnly class annotation. Can also be set at the query level through query hint " | Defining an entity as read-only can perform better than an entity that is not defined as read-only, yet does no inserts, updates, or deletes, since it enables EclipseLink to optimize the unit of work performance. Always use read-only for all read-only operations | 
EclipseLink can transform data between an object representation and a representation specific to a data source. This transformation is called mapping and it is the core of a EclipseLink project.
A mapping corresponds to a single data member of a domain object. It associates the object data member with its data source representation and defines the means of performing the two-way conversion between object and data source.
For information on Mapping see, "Mapping and Descriptors" at http://www.eclipse.org/eclipselink/documentation/2.5/solutions/performance002.htm#sthref153.
EclipseLink allows you to configure data partitioning using the @Partitioned annotation. Partitioning enables an application to scale information across multiple databases; including clustered databases. For more information on using @Partitioned and other partitioning policy annotations, see "Partitioning Annotations" at http://www.eclipse.org/eclipselink/documentation/2.5/jpa/extensions/annotations_ref.htm#CACHIHIB.
After you have performed the modifications recommended in the previous section, you can make additional changes that are specific to your deployment. Consider carefully whether the recommendations in this section are appropriate for your environment.
Oracle Toplink can be integrated with Oracle Coherence. This integration is provided through the Oracle TopLink Grid feature. With TopLink Grid, there are several types of integration with EclipseLink JPA features.
For example:
Replace the default EclipseLink L2 cache with Coherence. This provides support for very large L2 caches that span cluster nodes. EclipseLink's default L2 cache improves performance for multi-threaded and Java EE server hosted applications running in a single JVM, and requires configuring special cache coordination features if used across a cluster.
Configure entities to execute queries in the Coherence data grid instead of the database. This allows clustered application deployments to scale beyond database-bound operations.
For more information on using EclipseLink JPA with a Coherence Cache, see "Grid Cache Configuration" in Integrating Oracle Coherence.
For more information on Oracle Toplink integration with Oracle Coherence, see "Introduction to Toplink Grid with Oracle Coherence" in Integrating Oracle Coherence.
This section lists a few features in EclipseLink that can help you analyze your JPA application performance:
For form monitoring performance, see "Performance Monitoring" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/monitoring003.htm#BABJABIH. Note that this tool is intended to profile and monitor information in a multithreaded server environment.
For profiling performance, see "Task 1: Measure EclipseLink Performance with the EclipseLink Profiler" at http://www.eclipse.org/eclipselink/documentation/2.5/solutions/performance002.htm#CHDIAFJI. Note that this tool is intended for use with single-threaded finite use cases.
For debugging performance issues and testing, you can view the SQL generated from EclipseLink. To view the SQL, increase the logging level to "FINE" by using the EclipseLink JPA extensions for logging.
For best performance, remember to restore the logging levels to the default levels when you are done profiling or debugging.