Java Institute

DreamsCity 2000

The tip has three parts:


         * Why Use Threads?
         * Protecting Shared Resources with Synchronized Blocks
         * Minimizing the Overhead of Synchronized Blocks
         
This tip was developed using Java(tm) 2 SDK, Standard Edition, 
v 1.2.2.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

WHY USE THREADS?

Support for concurrent programming is a central feature of the 
Java (tm) programming language. The java.lang package includes the 
Thread class, which allows you to start multiple threads of
execution in your program. On a multi-cpu machine, threads increase 
performance by allowing your code to use all of the available 
processors. Even on a uniprocessor box, there are compelling 
reasons to use threads. Tasks should be placed on separate threads 
if they take a long time to finish, or if they force the processor 
to wait for disk or network access. Dividing tasks into these 
"worker" threads has two advantages. First, the code runs faster 
because the cpu does not waste time idly waiting for resources. 
Second, the code is more responsive, because the user interface 
can have its own thread which handles user input even while the 
"worker" threads are busy.  

Of course, this power comes with a price. If more than one thread 
can access a piece of data, then you must guarantee that 
simultaneous access does not corrupt that data. The second part of 
this tip addresses this problem, showing you how to use 
synchronized blocks to protect data shared between threads. The 
third part of the tip presents some ideas for minimizing the 
overhead of synchronized blocks in your code.  
 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

PROTECTING SHARED RESOURCES
WITH SYNCHRONIZED BLOCKS

The example code for this tip manages mutual funds. When the 
program begins, all of the money in the fund is equally divided 
among several accounts. The fund is run by a group of fund 
managers, who may at any time decide to shift money from one 
account to another. The fund managers act independently of
one another; to represent that fact, each fund manager has 
a thread dedicated to it. For simplicity, and to make it easy to 
detect concurrency problems when they occur, the total account 
value never changes--money simply moves from one account to 
another.  


Here's the code: 

/** 
 * The FundConstants interface is a container for constants 
 * that are shared by all classes.  Any class that needs 
 * unqualified access to these names implements the interface
 */
interface FundConstants {
    static final int noOfStocks = 3;
    static final int noOfManagers = 150;
    static final int testDuration = 20;
    static final int randomTransfers = 1000;
    static final int initialBalance = 10000;
    static final int totalMoneyInUniverse = noOfStocks * initialBalance;
}

/**
 * Data class representing a desired stock transfer
 */
class Transfer implements FundConstants {
    final public int fundFrom;
    final public int fundTo;
    final public int amount;
    
    public Transfer() {
        fundFrom = (int)(Math.random() * noOfStocks);
        fundTo = (int)(Math.random() * noOfStocks);
        amount = (int)(Math.random() * 1000);
    }
}

/**
 * Contains the current value of each stock, plus methods to 
 * transfer between stocks and verify that the total value is 
 * correct.
 */
class Stocks implements FundConstants {
    static int[] balances = new int[noOfStocks];
    static 
    {
        for (int n=0; n < noOfStocks; n++) {
            balances[n] = 10000;    
        }
    }
    
    static void transfer(Transfer t) {
        balances[t.fundFrom] -= t.amount;
        balances[t.fundTo] += t.amount;
    }
    static void checkSystem() {
        int actual = 0;
        for (int n=0; n < noOfStocks; n++) {
            actual += balances[n];
        }
        System.out.println("Actual balance is " + actual);    
    }
}

/**
 * The FundManager class simulates fund managers who
 * sleep a lot and move money around at random. 
 */
class FundManager implements Runnable, FundConstants {
    
    final String name;
    long throughput;
    
    public FundManager(String name) {
        this.name = name;    
    }

    public void run() {
        int next = 0;
        while (true) {
            Stocks.transfer(TestFundManagers.transfers[next++]);
            throughput++;   
            if (next == randomTransfers)
                next = 0;
            try {
                Thread.sleep(1);
            }
            catch (InterruptedException ie) {
                file://System.out.println(name + " is closing for the day");
                return;
            }
        }
    }
}

/** 
 * This is the main class of the program. It gives each fund manager 
 * a thread and then starts it. The thread alternates between
 * sleeping for a millisecond and making a random stock transfer.
 */
public class TestFundManagers implements FundConstants {

    public static Transfer transfers[] = 
new Transfer[randomTransfers];
    static {
        for (int n=0; n < randomTransfers; n++) {
            transfers[n] = new Transfer();
        }
    }

    public static void main(String [] args) {
        Thread[] threads = new Thread[noOfManagers];
        FundManager[] mgrs = new FundManager[noOfManagers];
        for (int n=0; n < noOfManagers; n++) {
            mgrs[n] = new FundManager("Manager " + n);
            threads[n] = new Thread(mgrs[n]);
            file://WARNING! You would rarely call setPriority in production code
            threads[n].setPriority(1 + (int)(Math.random() * 4));
            threads[n].start();
        }
        for (int n=0; n < testDuration; n++) {
            try {
                Thread.sleep(1000);
                Stocks.checkSystem();
            }
            catch (InterruptedException ie) {}
        }
        System.out.println();
        for (int n=0; n < noOfManagers; n++) {
            threads[n].interrupt();
        }
        for (int n=0; n < noOfManagers; n++) {
            try {
                threads[n].join();
            } catch (InterruptedException ie) {}
        }
        long throughput = 0;
        for (int n=0; n < noOfManagers; n++) {
            throughput += mgrs[n].throughput;
        }
        Stocks.checkSystem();
        System.out.println("Total transactions " + throughput);
    }
}

As you can see from the FundConstants interface, the mutual fund 
is divided among 3 stocks, each with an initial balance of 
10000. The fund is co-managed by 150 different managers. The 
TestFundManagers class is the application main class. It begins 
by creating a thread for each manager. The code for each thread  
is specified by a constructor parameter of type Runnable. When 
Thread.start() is called, Runnable's run method executes on 
a new thread. In this example, FundManager is a subclass of 
Runnable, and the run method simply alternates between two tasks: 
(1) calling Sleep to put the thread to sleep for 1 millisecond, 
and (2) making a random change to the stock mix.  

After firing off the other threads, the application main thread 
goes to sleep, waking up once a second to verify that the total 
value of the stocks is still 30000. This should always be true 
because each fund manager can only move money around, it cannot 
create or destroy it. Try running the program, and see if the 
total value stays at 30000 after each iteration. Note that for all 
the programs in this tip, you should turn off Just-In-Time (JIT) 
compilation as follows:
  
java -Djava.compiler=NONE TestFundManagers
  
With a little bad luck, you will see the value drift away from 
30000 sometime during the test. This value drift problem can
occur with or without JIT compilation, but in this example the 
absence of JIT compilation increases the chance of the problem 
occurring. Even without JIT compilation, you might have to 
experiment in order to see the problem. This is because the 
behavior is sensitive to several factors, such as your CPU speed, 
your cache, your Java(tm) Virtual Machine*, and other 
applications and services that are currently running. If you
do not see the problem the first time, try various combinations 
of the following: 

(1) Change the noOfManagers constant
(2) Change the duration of Thread.sleep in the run() 
    method
(3) Eliminate the -Djava.compiler=NONE flag

Some combination of these changes will manifest the problem on 
almost any Java Virtual Machine.  

The problem is with the FundManager object's call to 
Stocks.transfer. To see what goes wrong, you will need to take 
a look at the compiled byte code for the transfer method. (You can 
generate a bytecode listing by using the "javap -c Stocks" 
command.) Here is a partial listing that shows the trouble spot: 

   Compiled byte code for the line "balances[t.fundFrom] -= t.amount;"
   
   0 getstatic #14 <field int balances[]>
   3 aload_0
   4 getfield #15 <field int fundFrom>
   7 dup2
   8 iaload
   9 aload_0
  10 getfield #12 <field int amount>
  13 isub
  14 iastore

It is not important that you understand exactly what the bytecodes 
do. The important thing to understand is that your one line of code 
translates into multiple lines of bytecode. In particular, the 
loading and storing of the entry in the balances array happens in 
two separate steps (numbered 8 and 14 above). On most 
Java(tm) implementations, a thread may be context-switched at any 
time. This means that the current state of the thread is saved, and 
another thread is allowed a chance to execute. What happens if a 
thread gets context-switched between steps 8 and 14, and another 
thread comes along and changes the value of the stock's balance? 
That is, what if the following happens: 

Thread 1 loads balance 2000
Thread 1 switched out, thread 2 allowed to run
Thread 2 loads balance 2000
Thread 2 adds 500
Thread 2 stores new balance 2500
Thread 2 eventually switches out, thread 1 allowed to run
Thread 1 subtracts 100 (from 2000, not 2500!)
Thread 1 stores 1900

The change made by Thread 2 is permanently lost, and money has 
been seemingly destroyed.  Investors are not happy.  

You might not see the problem behavior on your machines, 
which suggests that the problem occurs with low probability.   
However, though it might first appear counterintutive, bugs that 
manifest rarely are worse than those that occur frequently. By 
this standard, data corruption in a threaded program is a bug of 
the nastiest sort: it is hard to spot the problem by reading the 
code, it occurs only rarely in the field, and it may disappear 
entirely when you run the code in a debugger!  

Use a synchronized block to avoid basic data corruption problems 
in threaded Java programs. The synchronized block looks like 
this: 

synchronized (someObject) {
file://code to be protected 
    }

Each object in the Java language is associated with its own 
concurrency protection object, called a monitor. When a thread 
encounters a synchronized block, it must acquire the monitor 
associated with someObject before proceeding into the block. 
When the thread exits a synchronized block, it releases the 
monitor. Only one thread at a time may acquire a monitor, which 
guarantees that there will never be more than one thread at a time 
inside the block. If several blocks of code synchronize on the 
same object, then only one thread can be in the entire set of 
blocks.  

So, to protect data, you need to (1) identify which pieces of data 
might be touched by multiple threads, and (2) protect that data in 
a synchronized block. In the Stocks example, the endangered data 
is the balances array. The balances array is accessed from both 
the transfer and checkSystem methods, so both methods will need 
synchronized blocks: 

    static void transfer(Transfer t) {
        synchronized (balances) {
            balances[t.fundFrom] -= t.amount;
            balances[t.fundTo] += t.amount;
        }
    }
    static void checkSystem() {
        int actual = 0;
        synchronized (balances) {
            for (int n=0; n < noOfStocks; n++) {
                actual += balances[n];
            }
        }
        System.out.println("Actual balance is " + actual);    
    }

Try running the program again. This time the balance will be 
correct throughout the progam's run.  

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

MINIMIZING THE OVERHEAD OF
SYNCHRONIZED BLOCKS

The synchronized keyword works by blocking threads. In other 
words, if one thread already has the monitor, then a second thread 
is blocked, and cannot run until the first thread releases the 
monitor. This makes the code run slower. An indicator of this 
slowdown is the throughput variable in the TestFundManager; the
throughput variable counts the total number of transactions on all 
threads. Try re-running the program with and without the 
synchronized blocks. You might find that the synchronized version 
runs slower than the unsynchronized version, perhaps as much as 
30 percent slower. But you get correct results. 
 

There are several ways to minimize the overhead of synchronized 
blocks. Perhaps the most important is to avoid using them except 
where necessary. For example, take a look at the throughput field 
in the FundManager class. Each throughput field is accessed by two 
different threads: the thread assigned to that FundManager, and 
the application main thread. However, there is no need to 
synchronize the throughput field because the application design 
guarantees that those threads will never access the data at the 
same time. The application thread does not look at these values 
until after all the other threads have returned from their run 
methods.  

Another way to reduce overhead is to minimize the lines of code 
inside the block. Examine the checkSystem method. The synchronized 
block does not include the entire method. The call to 
System.out.println does not access the balances array, so it does 
not need any protection.  

If you are willing to make your code more complex, you could gain 
a performance benefit from fine-grained locking. The lock on the 
balances field is a coarse-grained lock. In order to modify any 
part of the array, you lock the entire array. To implement a 
fine grained strategy, take out a separate lock for each position 
in the array: 

class Stocks implements FundConstants {
    static int[] balances = new int[noOfStocks];
    static Object[] locks = new Object[noOfStocks];
    static 
    {
        for (int n=0; n < noOfStocks; n++) {
            balances[n] = 10000;    
            locks[n] = new Object();
        }
    }
    
    static void transfer(Transfer t) {
        int lo, hi;
        if (t.fundFrom < t.fundTo) {
            lo = t.fundFrom;
            hi = t.fundTo;
        } else {
            lo = t.fundTo;
            hi = t.fundFrom;
        }            
        synchronized (locks[lo]) {
            synchronized (locks[hi]) {
                balances[t.fundFrom] -= t.amount;
                balances[t.fundTo] += t.amount;
            }
        }
    }
    
    static int sumHelper (int next) {
        synchronized (locks[next]) {
            if (next == (noOfStocks-1)) {
                return balances[next];
            } else {
                return balances[next] + sumHelper(next+1);
            }
        }
    }
    
    static void checkSystem() {
        int actual = 0;
        actual = sumHelper(0);
        System.out.println("Actual balance is " + actual);    
    }
}

This code is much more complex than the coarse-grained version.  
First, notice that you cannot lock on an integer type, so it is 
impossible to take a direct lock on the items in the balances 
array. Instead, you have to create an array of Object (here named 
locks). These objects have no value themselves, you just want them 
for their associated monitors. Second, notice that where you 
formerly entered one monitor in the transfer method, you must now 
enter two monitors. In the checkSystem method, the situation is 
even worse. You must simultaneously enter all the monitors because 
you are traversing all of the items in the array. This requires 
a helper method, sumHelper, which holds each monitor as it 
recursively takes the next one.  

As soon as you have more than one monitor, an even more insidious 
problem appears. What happens when two threads, each holding a 
monitor, attempt to acquire the monitor held by the other thread?  
Here is an example: 

Thread 1 enters locks[4]
Thread 2 enters locks[6]
Thread 2 requests locks[4] and blocks waiting for Thread 1
Thread 1 requests locks[6] and blocks waiting for Thread 2

Neither thread can move forward, but neither can reach the end of 
the synchronized block to let the other move forward. Both threads 
will wait forever. This situation is called deadlock. If you 
control all the threads and all the monitors, then you can avoid 
deadlock by following this simple rule: All threads must request 
monitors in the same order. If every thread requests monitors in 
the same order, then there is no way for two threads to be stuck 
waiting on each other. The code above follows this rule by always 
acquiring the locks in number order. If you want to see deadlock, 
try changing the order by changing the direction of the '<' in the 
transfer method. This causes the transfer method to acquire the 
monitors in decreasing numeric order, while the sumHelper method
is still using increasing numeric order. If you try running the 
code this way, it will quickly deadlock.  

After all this work, you deserve to see a performance benefit. In 
this case, you probably will. Go back and fix the code so that it 
doesn't deadlock. Now run the code, and compare it with the 
results from the code in the first tip. Test runs produced the 
following results: 

No locking version: throughput around 780000 
Coarse-grained locking: throughput around 560000
Fine-grained locking: throughput around 775000

It appears the fine-grained locking was worth the trouble in this 
case, since the fine-grained version is running almost as fast as 
the version with no locking at all. But don't jump to the 
conclusion that fine-grained locking is always faster. In another 
situation (or on different hardware), it might actually be slower.  
You will need to test your own code to know for sure.  

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Threads allow you to take advantage of multiple CPUs, to increase 
performance on single CPUs, and to increase the responsiveness of 
your user interface. The ability to write threaded programs 
correctly will be increasingly important, as multi-processor 
machines become more widespread and user demands for performance 
and responsiveness increase. If you write concurrent code in the 
Java(tm) language, be sure to use synchronized blocks to protect 
shared data. Be aware that most concurrent data problems have more 
than one correct solution, and that you might want to investigate 
both coarse and fine-grained locking schemes to maximize the 
performance of your applications.

Any comments? email to:
richard@dreamscity.net

WHY USE THREADS?

PROTECTING SHARED RESOURCES WITH SYNCHRONIZED BLOCKS

MINIMIZING THE OVERHEAD OF SYNCHRONIZED BLOCKS

PROTECTING SHARED RESOURCES
WITH SYNCHRONIZED BLOCKS

MINIMIZING THE OVERHEAD OF
SYNCHRONIZED BLOCKS