Kyle Brown, et al.


Related Topics: Java Developer Magazine, WebSphere


Enterprise Java Programming with IBM WebSphere 2nd Edition

Excerpts from Chapter 28: Transactions in WebSphere 5.0

The topic of transactions is one that most Java programmers would rather ignore than try to understand. And, in fact, in most cases you can ignore them - the default settings of WebSphere Application Server (WAS) and WebSphere Studio Application Developer work well enough in most situations that many programmers can build large and complex applications without having to know the details of how transactions work. Unfortunately, at some point this blissful ignorance must come to an end.

Then you have to hunker down and learn how transactions operate in order to solve problems that have ramifications all the way up and down your architecture. In this article, based on Chapter 28 of my new book, I'll examine how transactions operate in WAS.

Advice on Using Transactions
The best advice about EJB transactions that I've come across is a set of simple rules from Keys Botzum (of the IBM Software Services for WebSphere group) that covers the 90 percent case for dealing with transactions. Keys' rules of thumb are:

  • Always assume you're going to use container-demarcated transactions when using EJBs. It's complicated and difficult to use the JTA API to do your own transaction demarcation, and not worth it in most circumstances.
  • If you need transactions with a servlet (i.e., outside the EJB container), use the JTA API to demarcate the transaction. The beauty of this rule and the previous one is that you will not have to write your database code one way [using setAutoCommit()] to work within the EJB container and another way to work outside it. In fact, you should try not to mess with Connection.setAutoCommit() at all; just assume that the container will handle transaction commit/rollback for you.
  • Assume that the container will handle the appropriate magic of managing local versus global transactions based on the number of participants. Also, assume that it will perform automatic one-phase commit (1-PC) optimizations where appropriate. Thus, there is no penalty for using global transactions.
  • Use XA-enabled resources in the following situations:
    - If there may be more than one participant in a transaction (this could be two JDBC databases, or a database and a JMS connection or EIS connection, or any other combination). This allows the container to do any appropriate optimizations if there is only one participant, but to handle XA correctly if there are two or more.
    - If an EJB needs to access another EJB deployed in a different EJB container, then both containers should use XA resource managers. Sometimes one of your application's EJBs will need to use a utility EJB that provides some service to you. The only way to tie the two EJBs together into a single transaction is to use XA resources in both EJBs. This is an example of a distributed transaction, which is relatively rare but also requires the use of XA resources.
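
The first two rules can be illustrated with a minimal sketch of explicit demarcation as a servlet might do it. The real JTA interface is javax.transaction.UserTransaction, which needs an application server to run, so this sketch assumes a hypothetical stand-in interface, Tx, that borrows only its begin(), commit(), and rollback() method names. Work and RecordingTx are also hypothetical and exist only so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for javax.transaction.UserTransaction (same method names only). */
interface Tx {
    void begin() throws Exception;
    void commit() throws Exception;
    void rollback() throws Exception;
}

/** Hypothetical unit of work; real code would make JDBC calls here without touching setAutoCommit(). */
interface Work {
    void run() throws Exception;
}

final class TxDemarcation {
    /** Runs the work inside one transaction: commit on success, roll back on any failure. */
    static void inTransaction(Tx tx, Work work) throws Exception {
        tx.begin();
        try {
            work.run();
        } catch (Exception e) {
            tx.rollback(); // undo everything done since begin()
            throw e;
        }
        tx.commit();
    }
}

/** Records the calls it receives so the sketch can run without an application server. */
final class RecordingTx implements Tx {
    final List<String> calls = new ArrayList<>();

    @Override
    public void begin() { calls.add("begin"); }

    @Override
    public void commit() { calls.add("commit"); }

    @Override
    public void rollback() { calls.add("rollback"); }
}
```

Because the database code inside the unit of work never commits or rolls back on its own, the same code could run unchanged under container-demarcated transactions in an EJB.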
These rules will work for most situations, but a few situations will require you to go beyond them. In particular, we need to look at some of the differences between the EJB 1.1 and EJB 2.0 specs with regard to local transactions. Let's look at the following passage from Section 6.5.7 of the EJB 1.1 spec:

"A session Bean's newInstance, setSessionContext, ejbCreate, ejbRemove, ejbPassivate, ejbActivate, and afterCompletion methods are called outside of the client's global transaction.

"For example, it would be wrong to perform database operations within a session Bean's ejbCreate or ejbRemove method and to assume that the operations are executed under the protection of a global transaction. The ejbCreate and ejbRemove methods are not controlled by a transaction attribute because handling rollbacks in these methods would greatly complicate the session instance's state diagram (see next section)."

This statement was modified a bit in the EJB 2.0 specification to make things less confusing.

The EJB 1.1 wording said that these operations shouldn't be controlled by the transaction attribute of the bean, but it didn't specify how they should behave in relation to any ongoing global transaction. In particular, it didn't give the vendors much guidance as to how SQL statements in these methods should be handled. Should each statement be its own transaction (i.e., should it be as if the connection were in auto-commit mode), or should the method be a single transaction scope? So let's examine how this statement changed in EJB 2.0 (the following quote is from Section 7.5.7 of the EJB 2.0 specification):

"A session bean's newInstance, setSessionContext, ejbCreate, ejbRemove, ejbPassivate, ejbActivate, and afterCompletion methods are called with an unspecified transaction context. Refer to Subsection 17.6.5 for how the Container executes methods with an unspecified transaction context."

What that subsection says is that it is up to the Container vendor to determine how methods in the unspecified transaction context operate. Now, in addition, you should turn your attention back to the table referenced previously. When an EJB's transaction attribute is Never or NotSupported (or Supports without an outer transaction context), the business methods also run within an unspecified transaction context. It is important to understand exactly what that means in WebSphere, and how to know what the behavior of methods running in the unspecified transaction context will be.

In WebSphere 5.0, there are extended transactional attributes that apply to the unspecified transactional context. The three settings we have to understand are:

  • Boundary (Bean_Method or Activity_Session)
  • Resolver (Application or Container_At_Boundary)
  • Unresolved action (Commit or Rollback)
Let's leave aside the issue of activity sessions for the moment and discuss what happens when you set Boundary to Bean_Method. Basically, this means that all resource manager local transactions (RMLTs; we'll call them local transactions from now on) must be committed within the same enterprise bean method in which they are started.

What does Resolver mean? In short, Resolver specifies resolution control and determines who is responsible for handling the commitment of statements that are left hanging by being called within an unspecified transaction context. The two options for Resolver are Application, which means that your program is responsible for forcing commitment [either by using setAutoCommit(true) or by using LocalTransaction.begin() and LocalTransaction.commit()], and Container_At_Boundary, which means the container is responsible for committing the local transaction.

If you set Resolver to Container_At_Boundary (and set the Unresolved Action to Commit), then the bean's method will act the same as it would if you had set the bean's transaction attribute to RequiresNew. That is, the container will begin a local transaction when a connection is first used, and the local transaction will commit automatically at the end of the method.

The difference is that this method will execute in a local transaction context, meaning it won't tie together two different data sources into a single 2-PC transaction within the same method. Likewise, you can't carry a connection over into a method that is being used in this way. You must obtain the connection, use it, and close it all within the same method. Any attempt to pass a connection carried over from another transaction context into a method set up in this way will throw an exception.

Things are a bit more complicated if you choose to set Resolver to Application.

Now the behavior of the local transaction depends upon what your code does. If your code specifies the behavior of each local resource [for example, if you use setAutoCommit(true) in JDBC], then each statement will run in its own local transaction. Another option is to delineate the transaction yourself by using javax.resource.cci.LocalTransaction.begin() and javax.resource.cci.LocalTransaction.commit() [or rollback()]. The interesting bit occurs if you do neither of these things and leave a transaction open or hanging. This could happen in one of two ways: either you could use LocalTransaction.begin() without a corresponding LocalTransaction.commit() at the end of the method, or you could use setAutoCommit(false) after obtaining your JDBC connection and not add any code to control the transaction. As you can see, the state of the transaction at the end of the method is now ambiguous.

To resolve that ambiguity, there is the Unresolved Action option. If Unresolved Action is set to Commit, then open local transactions will commit at the end of the bean method; if it is set to Rollback, they will roll back.
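
The rule for the Application resolver can be written down as a tiny decision function. This is only a toy model of the behavior described above, not WebSphere code; the class, enum, and method names are all made up for illustration, and the Container_At_Boundary case is deliberately left out.

```java
/** Toy model of the Unresolved Action rule for Resolver = Application; not WebSphere code. */
final class UnresolvedActionModel {
    enum UnresolvedAction { COMMIT, ROLLBACK }

    enum Outcome { RESOLVED_BY_APPLICATION, COMMITTED_BY_CONTAINER, ROLLED_BACK_BY_CONTAINER }

    /**
     * Decides what happens to a local transaction when the bean method ends.
     *
     * @param action   the configured Unresolved Action
     * @param leftOpen true if the method returned with a local transaction still open
     */
    static Outcome atMethodEnd(UnresolvedAction action, boolean leftOpen) {
        if (!leftOpen) {
            // The application already committed or rolled back, so nothing is ambiguous.
            return Outcome.RESOLVED_BY_APPLICATION;
        }
        return action == UnresolvedAction.COMMIT
                ? Outcome.COMMITTED_BY_CONTAINER
                : Outcome.ROLLED_BACK_BY_CONTAINER;
    }
}
```

In other words, the Unresolved Action setting only matters for work that your code left hanging; code that always commits or rolls back explicitly never reaches the second branch.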

28.12 Dealing with Concurrency
Every application that uses a database in any form will at some point have to face an age-old question: How do I keep two different users from stepping on each other while they update their data? That is, if Tammy in the graphics department is updating our catalog to reflect the new look of our summer items, while Bob in accounting is also updating the catalog to reflect the new price list, how do we keep Bob's updates from overwriting Tammy's and vice versa? This comes down to the issue of managing concurrency and there are two general approaches: optimistic and pessimistic concurrency management.

Pessimistic concurrency management, probably the easiest to understand, is the idea of using a lock on a database record to keep more than one application from updating the database at the same time. So, at the beginning of Tammy's transaction, she obtains a lock on the catalog row. When Bob comes along, he may be restricted from reading the catalog row (if Tammy's lock was a lock on read) and forced to wait until Tammy is done. Another option would be a lock on write, meaning that Bob can read the original data, but he's restricted from writing new data to the row until Tammy's update completes (this would ensure that Bob's updates are additive to Tammy's).
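
In hand-written JDBC, one common way to take such a lock (in databases that support the syntax) is a SELECT ... FOR UPDATE. The following is a minimal sketch only: the catalog table, its item_id and price columns, and the class and method names are assumptions made up for illustration. The method deliberately does not touch auto-commit or commit anything, on the assumption that the caller or the container owns the transaction.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class CatalogLocking {
    // Hypothetical table and column names, for illustration only.
    static final String LOCKING_SELECT =
            "SELECT price FROM catalog WHERE item_id = ? FOR UPDATE";

    /**
     * Reads a price and asks the database to lock the row for update.
     * The lock lasts until the surrounding transaction ends, so this only
     * makes sense when the caller (or the container) controls that transaction.
     * Returns null if no row matches.
     */
    static BigDecimal readPriceForUpdate(Connection con, int itemId) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(LOCKING_SELECT)) {
            ps.setInt(1, itemId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBigDecimal("price") : null;
            }
        }
    }
}
```

In the scenario above, Tammy's transaction would call something like readPriceForUpdate first, and Bob's attempt to lock or update the same row would then wait until Tammy's transaction completes.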

The main problem with the pessimistic approach is the waiting. If several readers are kept from reading a row while another holds a lock that might result in an update, then this may lead to unacceptable runtime performance. In this case, the readers are needlessly waiting for a write that might never occur. To avoid this, another option is the idea of optimistic concurrency. This involves two things: (1) not obtaining locks, thus allowing for maximum concurrency in reading, and (2) performing a read immediately before a write to ensure that the data has not changed in the interim. If the data has changed, the writer will abort the writing process.

In our scenario, Bob would read his row at the beginning of his transaction, getting the original row without Tammy's updates. At the end of his transaction, under most circumstances, he would read the row again and discover it had not changed, and then complete the update. In some cases, he might read the row, discover Tammy's update, and abort his attempt to write the row since he would overwrite Tammy's intervening update in the process.

Detecting whether a row has changed requires one of two approaches. Bob could detect Tammy's update either by using a time stamp, which is applied at the end of each update, or by using an overqualified update, which is where you use the originally read value of every column in the table as part of the WHERE clause of your update statement. In this case, if there are any mismatches (due, for instance, to Tammy's update), the WHERE clause of the SQL UPDATE statement will no longer match the row, so the statement updates nothing and the write fails. The major advantage of optimistic concurrency control is that since it doesn't require locking, it allows for much better throughput, at the cost of some number of aborted updates when collisions occur.
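
The overqualified update can be mimicked in memory to show the mechanics. This is a sketch under stated assumptions: CatalogRow is a hypothetical stand-in for a single database row, and its overqualifiedUpdate method imitates an UPDATE whose WHERE clause lists every originally read column value, returning the number of rows changed.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for one database row, to show the overqualified-update check. */
final class CatalogRow {
    private final Map<String, String> columns = new HashMap<>();

    CatalogRow(Map<String, String> initial) {
        columns.putAll(initial);
    }

    /** Returns a snapshot copy, like reading the row at the start of a transaction. */
    synchronized Map<String, String> read() {
        return new HashMap<>(columns);
    }

    /**
     * Mimics UPDATE ... SET ... WHERE every column still equals its originally read value.
     * Returns the number of rows updated: 1 if nothing changed in the interim, otherwise 0.
     */
    synchronized int overqualifiedUpdate(Map<String, String> originallyRead,
                                         Map<String, String> changes) {
        if (!columns.equals(originallyRead)) {
            return 0; // someone else updated the row, so the "WHERE clause" no longer matches
        }
        columns.putAll(changes);
        return 1;
    }
}
```

If Bob and Tammy both read the row and Tammy writes first, Bob's call returns 0 and leaves Tammy's change intact. Bob can then re-read the row and retry, at which point his update is applied on top of hers.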

28.12.1 Concurrency and EJBs
How is all this managed in WebSphere and WSAD 5.0? WebSphere 5.0 simplified the process by combining all this into one setting now called access intent. You now define one or more access intent policies to apply to a set of entity EJB methods that will control both the concurrency scheme used (optimistic or pessimistic) and the locking strength used in a pessimistic scheme.

The cool thing about the new access intent approach in WebSphere 5.0 is that it also abstracts away the details of picking the right isolation level for each particular database; because different databases have different locking semantics, the access intent setting allows the container to pick the right isolation level based on a general hint.

There are seven different settings for Access Intent in WebSphere 5.0. To begin with, we have the two possible optimistic settings:

  • wsOptimisticUpdate: Use this when you want to allow a method (or group of methods) to perform updates but use optimistic concurrency. This will not perform any locking on select statements, and will perform an overqualified update at the end of the transaction if any set() methods are used during the transaction.
  • wsOptimisticRead: Use this when you want to allow one or more methods to only read from a database. If you attempt to perform any updates [e.g., if you call a set() method] during the execution of the transaction, a PersistenceManagerException will be thrown.
In addition, there are the five pessimistic settings: wsPessimisticRead, wsPessimisticUpdate-Exclusive, wsPessimisticUpdate-NoCollision, wsPessimisticUpdate-WeakestLockAtLoad, and wsPessimisticUpdate. The setting wsPessimisticRead is nearly identical to wsOptimisticRead, except that the underlying database isolation level is set differently (wsPessimisticRead sets the isolation level to RepeatableRead instead of the ReadCommitted used by wsOptimisticRead). As is the case with wsOptimisticRead, if an update is attempted, the container will throw a PersistenceManagerException. Note that the isolation level for this setting is different in Oracle.

The four choices for pessimistic updates differ in both the isolation level used and in the approach taken to using the FOR UPDATE clause on select statements. Here are the different choices for pessimistic updates:

  • wsPessimisticUpdate-Exclusive: Exclusive in this case means that your application needs exclusive access to the database rows it is using. This setting uses the FOR UPDATE clause and sets the isolation level to Serializable. As a result, you will not encounter either phantom reads or nonrepeatable reads, and the deadlock that is possible with the TRANSACTION_SERIALIZABLE level alone (without the use of a FOR UPDATE clause) will not occur. This is terribly expensive: it forces every transaction to wait in line to acquire a write lock at the beginning of the transaction, and it holds up all other transactions until this transaction completes.
  • wsPessimisticUpdate-NoCollision: No collision means that the application should be designed such that no concurrent transactions are expected to access the same database rows. This setting (like the previous one) uses the FOR UPDATE clause but sets the isolation level to ReadCommitted.
  • wsPessimisticUpdate-WeakestLockAtLoad: The default setting for WebSphere. WeakestLockAtLoad is applicable only to those databases that support both read locks and write locks. If the database supports them both, a read lock is acquired when a row is accessed and the lock is escalated (promoted) to a write lock if an update is performed on the bean. This setting uses an isolation level of RepeatableRead, but does not use a FOR UPDATE clause. It will work pretty well with nearly every database except Oracle. We discuss that further below.
  • wsPessimisticUpdate: This setting uses a FOR UPDATE clause on finder methods and sets the isolation level to RepeatableRead (as in WeakestLockAtLoad), except in Oracle.
Why is Oracle special? Because of the way that it implements its locking mechanism, Oracle does not support the TRANSACTION_REPEATABLE_READ isolation level in the same way as other databases (for example, DB2 or SQL Server). So, wherever the server would have used TRANSACTION_REPEATABLE_READ with other databases, it instead has to use TRANSACTION_READ_COMMITTED with Oracle.

In addition, Oracle doesn't use locks in the same way as other databases. In most databases, there is a difference between read locks and write locks. A read lock is shared; multiple processes or threads can read an item simultaneously. A write lock is exclusive; only a single transaction holds the lock on the item. In Oracle, the weakest lock is an update lock. This becomes interesting when you consider using wsPessimisticUpdate-WeakestLockAtLoad in Oracle. As mentioned earlier, for Oracle the server has to use TRANSACTION_READ_COMMITTED for this setting. What's more, in order to maintain the semantics of the setting, the server must also use a SELECT ... FOR UPDATE. This is only true of Oracle; no other database requires the server to use a FOR UPDATE clause for this setting. The problem is that, in many cases, using this access intent with Oracle will result in a runtime exception, because the back-end datastore does not support the SQL statement that the access intent needs: certain types of SQL statements (for instance, multiple-table joins) cannot use the FOR UPDATE clause.
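
The isolation levels named in this section can be summarized as a small lookup. This is a sketch that restates only what the text above says, using the standard java.sql.Connection isolation constants; the class and method names are hypothetical, and wsOptimisticUpdate is left out because the text does not state its isolation level.

```java
import java.sql.Connection;

final class AccessIntentIsolation {
    /**
     * Returns the JDBC isolation level that the text associates with an access intent.
     *
     * @param accessIntent the access intent policy name
     * @param oracle       true if the target database is Oracle
     */
    static int isolationFor(String accessIntent, boolean oracle) {
        int level;
        switch (accessIntent) {
            case "wsOptimisticRead":
            case "wsPessimisticUpdate-NoCollision":
                level = Connection.TRANSACTION_READ_COMMITTED;
                break;
            case "wsPessimisticRead":
            case "wsPessimisticUpdate":
            case "wsPessimisticUpdate-WeakestLockAtLoad":
                level = Connection.TRANSACTION_REPEATABLE_READ;
                break;
            case "wsPessimisticUpdate-Exclusive":
                level = Connection.TRANSACTION_SERIALIZABLE;
                break;
            default:
                throw new IllegalArgumentException("Not covered by this sketch: " + accessIntent);
        }
        // Per the text, Oracle gets READ_COMMITTED wherever REPEATABLE_READ would be used.
        if (oracle && level == Connection.TRANSACTION_REPEATABLE_READ) {
            return Connection.TRANSACTION_READ_COMMITTED;
        }
        return level;
    }
}
```

The Oracle branch is what makes wsPessimisticUpdate-WeakestLockAtLoad depend on FOR UPDATE there: with only ReadCommitted isolation, the lock has to come from the select statement itself.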

28.12.2 Choosing the Right Access Intent
Given all of this, which access intent should you use for your applications? If the question were simple to answer, then WebSphere wouldn't have included so many choices. The fact is this is a complicated question, often motivated by differences in the behavior of the databases themselves. So, let's work through some best practices, some decision points, and some recommendations to guide you.

First, what's the easiest route for access intent? In many cases, you would expect that to be wsOptimisticUpdate. With that setting (as you'll remember), if an application writes to a database and few collisions are expected, then readers and writers stay entirely out of each other's way, and writers will only cause each other problems once in a blue moon. What's more, it's even easy to write your domain logic so that, if a transaction fails in this way, you can restart the entire process, provided you captured the original data the user was operating on (perhaps using the Command pattern).

However, optimistic locking is not appropriate in all cases. Sometimes you have to use one of the wsPessimisticUpdate variants. wsPessimisticUpdate-WeakestLockAtLoad will work in most cases with nearly every database; that's why it's the default. However, if you are using Oracle, it will fail in joins and wherever the DISTINCT keyword is used, so you would have to fall back to wsPessimisticUpdate-NoCollision. That policy, though, doesn't ensure data integrity. Since it doesn't hold locks, concurrent transactions can step on each other and overwrite each other's data. So, either you can live with that option (maybe by ensuring that you don't get simultaneous transactions against a row through some other approach) or choose to live with wsPessimisticUpdate-Exclusive, which would serialize access to each row for both readers and writers. In some applications, this would be a significant performance problem. In others, it wouldn't. Your mileage may vary.
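
The decision points in the last two paragraphs can be condensed into a small helper. This is just one reading of the guidance above, not an official rule: the class, method, and parameter names are invented for illustration, and real applications will weigh more factors than three booleans.

```java
/** One possible reading of the guidance above, expressed as a decision helper. */
final class AccessIntentChooser {
    /**
     * @param collisionsRare                 few concurrent updates to the same rows are expected
     * @param oracleWithJoinsOrDistinct      the database is Oracle and the queries use joins or DISTINCT
     * @param rowConflictsPreventedElsewhere something else already keeps two transactions off the same row
     */
    static String choose(boolean collisionsRare,
                         boolean oracleWithJoinsOrDistinct,
                         boolean rowConflictsPreventedElsewhere) {
        if (collisionsRare) {
            return "wsOptimisticUpdate";
        }
        if (!oracleWithJoinsOrDistinct) {
            return "wsPessimisticUpdate-WeakestLockAtLoad"; // the default
        }
        // WeakestLockAtLoad is ruled out here, so trade integrity against throughput.
        return rowConflictsPreventedElsewhere
                ? "wsPessimisticUpdate-NoCollision"
                : "wsPessimisticUpdate-Exclusive";
    }
}
```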

Finally, there is a difference in how you set up the optimistic predicate in WebSphere 5.0.0 and 5.0.1. In WebSphere 5.0.0, by default, all non-binary columns were added to the predicate (as we saw was true in the 4.0.2 version as well). The problem was that this is slow, since typically not all the columns will be indexed. It can also lead to problems because the predicate is too constrained. In version 5.0.1, the default is instead not to add any columns to the predicate, meaning that with optimistic locking no collision check takes place by default. Instead, you need to manually set which columns are part of the predicate by selecting the mapping of a column in the overview of the WebSphere Studio map editor and then setting the OptimisticPredicate property to true.
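
To see why the choice of predicate columns matters, it helps to look at the shape of the UPDATE statement involved. The sketch below is an illustration only: the SQL that WebSphere actually generates is not shown in this article, and the class, method, table, and column names are all assumptions.

```java
import java.util.List;

final class OptimisticPredicateSql {
    /**
     * Builds an UPDATE whose WHERE clause contains the key plus one
     * "column = ?" test for each column chosen as part of the optimistic predicate.
     */
    static String buildUpdate(String table, String setColumn, String keyColumn,
                              List<String> predicateColumns) {
        StringBuilder sql = new StringBuilder()
                .append("UPDATE ").append(table)
                .append(" SET ").append(setColumn).append(" = ?")
                .append(" WHERE ").append(keyColumn).append(" = ?");
        for (String column : predicateColumns) {
            // Each predicate column is compared with the value read at the start of the transaction.
            sql.append(" AND ").append(column).append(" = ?");
        }
        return sql.toString();
    }
}
```

With predicate columns price and last_changed on a hypothetical catalog table keyed by item_id, this yields `UPDATE catalog SET price = ? WHERE item_id = ? AND price = ? AND last_changed = ?`. With an empty list, the WHERE clause contains only the key, so a concurrent change to the row goes undetected, which is the 5.0.1 default behavior described above.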

Setting up access intents for an EJB in WebSphere Studio is actually very simple. We covered this (in another context) in Chapter 25. The process is the same.

About The Book
Enterprise Java Programming with IBM WebSphere, 2nd Edition by Kyle Brown et al.
ISBN: 032118579X
Publisher: Addison-Wesley
List Price: $59.99

Reprinted with the permission of Addison-Wesley.
