Java Institute DreamsCity 2000
Welcome to DreamsCity
Return to Java Institute

The tip has three parts:
* Why Use Threads? * Protecting Shared Resources with Synchronized Blocks * Minimizing the Overhead of Synchronized Blocks This tip was developed using Java(tm) 2 SDK, Standard Edition, v 1.2.2. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


WHY USE THREADS?

Support for concurrent programming is a central feature of the 
Java (tm) programming language. The java.lang package includes the 
Thread class, which allows you to start multiple threads of
execution in your program. On a multi-cpu machine, threads increase 
performance by allowing your code to use all of the available 
processors. Even on a uniprocessor box, there are compelling 
reasons to use threads. Tasks should be placed on separate threads 
if they take a long time to finish, or if they force the processor 
to wait for disk or network access. Dividing tasks into these 
"worker" threads has two advantages. First, the code runs faster 
because the cpu does not waste time idly waiting for resources. 
Second, the code is more responsive, because the user interface 
can have its own thread which handles user input even while the 
"worker" threads are busy.  

Of course, this power comes with a price. If more than one thread 
can access a piece of data, then you must guarantee that 
simultaneous access does not corrupt that data. The second part of 
this tip addresses this problem, showing you how to use 
synchronized blocks to protect data shared between threads. The 
third part of the tip presents some ideas for minimizing the 
overhead of synchronized blocks in your code.  
 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


PROTECTING SHARED RESOURCES
WITH SYNCHRONIZED BLOCKS

The example code for this tip manages mutual funds. When the 
program begins, all of the money in the fund is equally divided 
among several accounts. The fund is run by a group of fund 
managers, who may at any time decide to shift money from one 
account to another. The fund managers act independently of
one another; to represent that fact, each fund manager has 
a thread dedicated to it. For simplicity, and to make it easy to 
detect concurrency problems when they occur, the total account 
value never changes--money simply moves from one account to 
another.  
Here's the code: /** * The FundConstants interface is a container for constants * that are shared by all classes. Any class that needs * unqualified access to these names implements the interface */ interface FundConstants { static final int noOfStocks = 3; static final int noOfManagers = 150; static final int testDuration = 20; static final int randomTransfers = 1000; static final int initialBalance = 10000; static final int totalMoneyInUniverse = noOfStocks * initialBalance; } /** * Data class representing a desired stock transfer */ class Transfer implements FundConstants { final public int fundFrom; final public int fundTo; final public int amount; public Transfer() { fundFrom = (int)(Math.random() * noOfStocks); fundTo = (int)(Math.random() * noOfStocks); amount = (int)(Math.random() * 1000); } } /** * Contains the current value of each stock, plus methods to * transfer between stocks and verify that the total value is * correct. */ class Stocks implements FundConstants { static int[] balances = new int[noOfStocks]; static { for (int n=0; n < noOfStocks; n++) { balances[n] = 10000; } } static void transfer(Transfer t) { balances[t.fundFrom] -= t.amount; balances[t.fundTo] += t.amount; } static void checkSystem() { int actual = 0; for (int n=0; n < noOfStocks; n++) { actual += balances[n]; } System.out.println("Actual balance is " + actual); } } /** * The FundManager class simulates fund managers who * sleep a lot and move money around at random. */ class FundManager implements Runnable, FundConstants { final String name; long throughput; public FundManager(String name) { this.name = name; } public void run() { int next = 0; while (true) { Stocks.transfer(TestFundManagers.transfers[next++]); throughput++; if (next == randomTransfers) next = 0; try { Thread.sleep(1); } catch (InterruptedException ie) { file://System.out.println(name + " is closing for the day"); return; } } } } /** * This is the main class of the program. It gives each fund manager * a thread and then starts it. The thread alternates between * sleeping for a millisecond and making a random stock transfer. */ public class TestFundManagers implements FundConstants { public static Transfer transfers[] = new Transfer[randomTransfers]; static { for (int n=0; n < randomTransfers; n++) { transfers[n] = new Transfer(); } } public static void main(String [] args) { Thread[] threads = new Thread[noOfManagers]; FundManager[] mgrs = new FundManager[noOfManagers]; for (int n=0; n < noOfManagers; n++) { mgrs[n] = new FundManager("Manager " + n); threads[n] = new Thread(mgrs[n]); file://WARNING! You would rarely call setPriority in production code threads[n].setPriority(1 + (int)(Math.random() * 4)); threads[n].start(); } for (int n=0; n < testDuration; n++) { try { Thread.sleep(1000); Stocks.checkSystem(); } catch (InterruptedException ie) {} } System.out.println(); for (int n=0; n < noOfManagers; n++) { threads[n].interrupt(); } for (int n=0; n < noOfManagers; n++) { try { threads[n].join(); } catch (InterruptedException ie) {} } long throughput = 0; for (int n=0; n < noOfManagers; n++) { throughput += mgrs[n].throughput; } Stocks.checkSystem(); System.out.println("Total transactions " + throughput); } } As you can see from the FundConstants interface, the mutual fund is divided among 3 stocks, each with an initial balance of 10000. The fund is co-managed by 150 different managers. The TestFundManagers class is the application main class. It begins by creating a thread for each manager. The code for each thread is specified by a constructor parameter of type Runnable. When Thread.start() is called, Runnable's run method executes on a new thread. In this example, FundManager is a subclass of Runnable, and the run method simply alternates between two tasks: (1) calling Sleep to put the thread to sleep for 1 millisecond, and (2) making a random change to the stock mix. After firing off the other threads, the application main thread goes to sleep, waking up once a second to verify that the total value of the stocks is still 30000. This should always be true because each fund manager can only move money around, it cannot create or destroy it. Try running the program, and see if the total value stays at 30000 after each iteration. Note that for all the programs in this tip, you should turn off Just-In-Time (JIT) compilation as follows: java -Djava.compiler=NONE TestFundManagers With a little bad luck, you will see the value drift away from 30000 sometime during the test. This value drift problem can occur with or without JIT compilation, but in this example the absence of JIT compilation increases the chance of the problem occurring. Even without JIT compilation, you might have to experiment in order to see the problem. This is because the behavior is sensitive to several factors, such as your CPU speed, your cache, your Java(tm) Virtual Machine*, and other applications and services that are currently running. If you do not see the problem the first time, try various combinations of the following: (1) Change the noOfManagers constant (2) Change the duration of Thread.sleep in the run() method (3) Eliminate the -Djava.compiler=NONE flag Some combination of these changes will manifest the problem on almost any Java Virtual Machine. The problem is with the FundManager object's call to Stocks.transfer. To see what goes wrong, you will need to take a look at the compiled byte code for the transfer method. (You can generate a bytecode listing by using the "javap -c Stocks" command.) Here is a partial listing that shows the trouble spot: Compiled byte code for the line "balances[t.fundFrom] -= t.amount;" 0 getstatic #14 <field int balances[]> 3 aload_0 4 getfield #15 <field int fundFrom> 7 dup2 8 iaload 9 aload_0 10 getfield #12 <field int amount> 13 isub 14 iastore It is not important that you understand exactly what the bytecodes do. The important thing to understand is that your one line of code translates into multiple lines of bytecode. In particular, the loading and storing of the entry in the balances array happens in two separate steps (numbered 8 and 14 above). On most Java(tm) implementations, a thread may be context-switched at any time. This means that the current state of the thread is saved, and another thread is allowed a chance to execute. What happens if a thread gets context-switched between steps 8 and 14, and another thread comes along and changes the value of the stock's balance? That is, what if the following happens: Thread 1 loads balance 2000 Thread 1 switched out, thread 2 allowed to run Thread 2 loads balance 2000 Thread 2 adds 500 Thread 2 stores new balance 2500 Thread 2 eventually switches out, thread 1 allowed to run Thread 1 subtracts 100 (from 2000, not 2500!) Thread 1 stores 1900 The change made by Thread 2 is permanently lost, and money has been seemingly destroyed. Investors are not happy. You might not see the problem behavior on your machines, which suggests that the problem occurs with low probability. However, though it might first appear counterintutive, bugs that manifest rarely are worse than those that occur frequently. By this standard, data corruption in a threaded program is a bug of the nastiest sort: it is hard to spot the problem by reading the code, it occurs only rarely in the field, and it may disappear entirely when you run the code in a debugger! Use a synchronized block to avoid basic data corruption problems in threaded Java programs. The synchronized block looks like this: synchronized (someObject) { file://code to be protected } Each object in the Java language is associated with its own concurrency protection object, called a monitor. When a thread encounters a synchronized block, it must acquire the monitor associated with someObject before proceeding into the block. When the thread exits a synchronized block, it releases the monitor. Only one thread at a time may acquire a monitor, which guarantees that there will never be more than one thread at a time inside the block. If several blocks of code synchronize on the same object, then only one thread can be in the entire set of blocks. So, to protect data, you need to (1) identify which pieces of data might be touched by multiple threads, and (2) protect that data in a synchronized block. In the Stocks example, the endangered data is the balances array. The balances array is accessed from both the transfer and checkSystem methods, so both methods will need synchronized blocks: static void transfer(Transfer t) { synchronized (balances) { balances[t.fundFrom] -= t.amount; balances[t.fundTo] += t.amount; } } static void checkSystem() { int actual = 0; synchronized (balances) { for (int n=0; n < noOfStocks; n++) { actual += balances[n]; } } System.out.println("Actual balance is " + actual); } Try running the program again. This time the balance will be correct throughout the progam's run. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


MINIMIZING THE OVERHEAD OF
SYNCHRONIZED BLOCKS

The synchronized keyword works by blocking threads. In other 
words, if one thread already has the monitor, then a second thread 
is blocked, and cannot run until the first thread releases the 
monitor. This makes the code run slower. An indicator of this 
slowdown is the throughput variable in the TestFundManager; the
throughput variable counts the total number of transactions on all 
threads. Try re-running the program with and without the 
synchronized blocks. You might find that the synchronized version 
runs slower than the unsynchronized version, perhaps as much as 
30 percent slower. But you get correct results. 
There are several ways to minimize the overhead of synchronized blocks. Perhaps the most important is to avoid using them except where necessary. For example, take a look at the throughput field in the FundManager class. Each throughput field is accessed by two different threads: the thread assigned to that FundManager, and the application main thread. However, there is no need to synchronize the throughput field because the application design guarantees that those threads will never access the data at the same time. The application thread does not look at these values until after all the other threads have returned from their run methods. Another way to reduce overhead is to minimize the lines of code inside the block. Examine the checkSystem method. The synchronized block does not include the entire method. The call to System.out.println does not access the balances array, so it does not need any protection. If you are willing to make your code more complex, you could gain a performance benefit from fine-grained locking. The lock on the balances field is a coarse-grained lock. In order to modify any part of the array, you lock the entire array. To implement a fine grained strategy, take out a separate lock for each position in the array: class Stocks implements FundConstants { static int[] balances = new int[noOfStocks]; static Object[] locks = new Object[noOfStocks]; static { for (int n=0; n < noOfStocks; n++) { balances[n] = 10000; locks[n] = new Object(); } } static void transfer(Transfer t) { int lo, hi; if (t.fundFrom < t.fundTo) { lo = t.fundFrom; hi = t.fundTo; } else { lo = t.fundTo; hi = t.fundFrom; } synchronized (locks[lo]) { synchronized (locks[hi]) { balances[t.fundFrom] -= t.amount; balances[t.fundTo] += t.amount; } } } static int sumHelper (int next) { synchronized (locks[next]) { if (next == (noOfStocks-1)) { return balances[next]; } else { return balances[next] + sumHelper(next+1); } } } static void checkSystem() { int actual = 0; actual = sumHelper(0); System.out.println("Actual balance is " + actual); } } This code is much more complex than the coarse-grained version. First, notice that you cannot lock on an integer type, so it is impossible to take a direct lock on the items in the balances array. Instead, you have to create an array of Object (here named locks). These objects have no value themselves, you just want them for their associated monitors. Second, notice that where you formerly entered one monitor in the transfer method, you must now enter two monitors. In the checkSystem method, the situation is even worse. You must simultaneously enter all the monitors because you are traversing all of the items in the array. This requires a helper method, sumHelper, which holds each monitor as it recursively takes the next one. As soon as you have more than one monitor, an even more insidious problem appears. What happens when two threads, each holding a monitor, attempt to acquire the monitor held by the other thread? Here is an example: Thread 1 enters locks[4] Thread 2 enters locks[6] Thread 2 requests locks[4] and blocks waiting for Thread 1 Thread 1 requests locks[6] and blocks waiting for Thread 2 Neither thread can move forward, but neither can reach the end of the synchronized block to let the other move forward. Both threads will wait forever. This situation is called deadlock. If you control all the threads and all the monitors, then you can avoid deadlock by following this simple rule: All threads must request monitors in the same order. If every thread requests monitors in the same order, then there is no way for two threads to be stuck waiting on each other. The code above follows this rule by always acquiring the locks in number order. If you want to see deadlock, try changing the order by changing the direction of the '<' in the transfer method. This causes the transfer method to acquire the monitors in decreasing numeric order, while the sumHelper method is still using increasing numeric order. If you try running the code this way, it will quickly deadlock. After all this work, you deserve to see a performance benefit. In this case, you probably will. Go back and fix the code so that it doesn't deadlock. Now run the code, and compare it with the results from the code in the first tip. Test runs produced the following results: No locking version: throughput around 780000 Coarse-grained locking: throughput around 560000 Fine-grained locking: throughput around 775000 It appears the fine-grained locking was worth the trouble in this case, since the fine-grained version is running almost as fast as the version with no locking at all. But don't jump to the conclusion that fine-grained locking is always faster. In another situation (or on different hardware), it might actually be slower. You will need to test your own code to know for sure. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Threads allow you to take advantage of multiple CPUs, to increase performance on single CPUs, and to increase the responsiveness of your user interface. The ability to write threaded programs correctly will be increasingly important, as multi-processor machines become more widespread and user demands for performance and responsiveness increase. If you write concurrent code in the Java(tm) language, be sure to use synchronized blocks to protect shared data. Be aware that most concurrent data problems have more than one correct solution, and that you might want to investigate both coarse and fine-grained locking schemes to maximize the performance of your applications.

Any comments? email to:
richard@dreamscity.net