The tip has three parts:
* Why Use Threads?
* Protecting Shared Resources with Synchronized Blocks
* Minimizing the Overhead of Synchronized Blocks
This tip was developed using Java(tm) 2 SDK, Standard Edition,
v 1.2.2.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
WHY USE THREADS?
Support for concurrent programming is a central feature of the
Java (tm) programming language. The java.lang package includes the
Thread class, which allows you to start multiple threads of
execution in your program. On a multi-cpu machine, threads increase
performance by allowing your code to use all of the available
processors. Even on a uniprocessor box, there are compelling
reasons to use threads. Tasks should be placed on separate threads
if they take a long time to finish, or if they force the processor
to wait for disk or network access. Dividing tasks into these
"worker" threads has two advantages. First, the code runs faster
because the cpu does not waste time idly waiting for resources.
Second, the code is more responsive, because the user interface
can have its own thread which handles user input even while the
"worker" threads are busy.
Of course, this power comes with a price. If more than one thread
can access a piece of data, then you must guarantee that
simultaneous access does not corrupt that data. The second part of
this tip addresses this problem, showing you how to use
synchronized blocks to protect data shared between threads. The
third part of the tip presents some ideas for minimizing the
overhead of synchronized blocks in your code.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
PROTECTING SHARED RESOURCES
WITH SYNCHRONIZED BLOCKS
The example code for this tip manages mutual funds. When the
program begins, all of the money in the fund is equally divided
among several accounts. The fund is run by a group of fund
managers, who may at any time decide to shift money from one
account to another. The fund managers act independently of
one another; to represent that fact, each fund manager has
a thread dedicated to it. For simplicity, and to make it easy to
detect concurrency problems when they occur, the total account
value never changes--money simply moves from one account to
another.
Here's the code:
/**
* The FundConstants interface is a container for constants
* that are shared by all classes. Any class that needs
* unqualified access to these names implements the interface
*/
interface FundConstants {
static final int noOfStocks = 3;
static final int noOfManagers = 150;
static final int testDuration = 20;
static final int randomTransfers = 1000;
static final int initialBalance = 10000;
static final int totalMoneyInUniverse = noOfStocks * initialBalance;
}
/**
* Data class representing a desired stock transfer
*/
class Transfer implements FundConstants {
final public int fundFrom;
final public int fundTo;
final public int amount;
public Transfer() {
fundFrom = (int)(Math.random() * noOfStocks);
fundTo = (int)(Math.random() * noOfStocks);
amount = (int)(Math.random() * 1000);
}
}
/**
* Contains the current value of each stock, plus methods to
* transfer between stocks and verify that the total value is
* correct.
*/
class Stocks implements FundConstants {
static int[] balances = new int[noOfStocks];
static
{
for (int n=0; n < noOfStocks; n++) {
balances[n] = 10000;
}
}
static void transfer(Transfer t) {
balances[t.fundFrom] -= t.amount;
balances[t.fundTo] += t.amount;
}
static void checkSystem() {
int actual = 0;
for (int n=0; n < noOfStocks; n++) {
actual += balances[n];
}
System.out.println("Actual balance is " + actual);
}
}
/**
* The FundManager class simulates fund managers who
* sleep a lot and move money around at random.
*/
class FundManager implements Runnable, FundConstants {
final String name;
long throughput;
public FundManager(String name) {
this.name = name;
}
public void run() {
int next = 0;
while (true) {
Stocks.transfer(TestFundManagers.transfers[next++]);
throughput++;
if (next == randomTransfers)
next = 0;
try {
Thread.sleep(1);
}
catch (InterruptedException ie) {
file://System.out.println(name + " is closing for the day");
return;
}
}
}
}
/**
* This is the main class of the program. It gives each fund manager
* a thread and then starts it. The thread alternates between
* sleeping for a millisecond and making a random stock transfer.
*/
public class TestFundManagers implements FundConstants {
public static Transfer transfers[] =
new Transfer[randomTransfers];
static {
for (int n=0; n < randomTransfers; n++) {
transfers[n] = new Transfer();
}
}
public static void main(String [] args) {
Thread[] threads = new Thread[noOfManagers];
FundManager[] mgrs = new FundManager[noOfManagers];
for (int n=0; n < noOfManagers; n++) {
mgrs[n] = new FundManager("Manager " + n);
threads[n] = new Thread(mgrs[n]);
file://WARNING! You would rarely call setPriority in production code
threads[n].setPriority(1 + (int)(Math.random() * 4));
threads[n].start();
}
for (int n=0; n < testDuration; n++) {
try {
Thread.sleep(1000);
Stocks.checkSystem();
}
catch (InterruptedException ie) {}
}
System.out.println();
for (int n=0; n < noOfManagers; n++) {
threads[n].interrupt();
}
for (int n=0; n < noOfManagers; n++) {
try {
threads[n].join();
} catch (InterruptedException ie) {}
}
long throughput = 0;
for (int n=0; n < noOfManagers; n++) {
throughput += mgrs[n].throughput;
}
Stocks.checkSystem();
System.out.println("Total transactions " + throughput);
}
}
As you can see from the FundConstants interface, the mutual fund
is divided among 3 stocks, each with an initial balance of
10000. The fund is co-managed by 150 different managers. The
TestFundManagers class is the application main class. It begins
by creating a thread for each manager. The code for each thread
is specified by a constructor parameter of type Runnable. When
Thread.start() is called, Runnable's run method executes on
a new thread. In this example, FundManager is a subclass of
Runnable, and the run method simply alternates between two tasks:
(1) calling Sleep to put the thread to sleep for 1 millisecond,
and (2) making a random change to the stock mix.
After firing off the other threads, the application main thread
goes to sleep, waking up once a second to verify that the total
value of the stocks is still 30000. This should always be true
because each fund manager can only move money around, it cannot
create or destroy it. Try running the program, and see if the
total value stays at 30000 after each iteration. Note that for all
the programs in this tip, you should turn off Just-In-Time (JIT)
compilation as follows:
java -Djava.compiler=NONE TestFundManagers
With a little bad luck, you will see the value drift away from
30000 sometime during the test. This value drift problem can
occur with or without JIT compilation, but in this example the
absence of JIT compilation increases the chance of the problem
occurring. Even without JIT compilation, you might have to
experiment in order to see the problem. This is because the
behavior is sensitive to several factors, such as your CPU speed,
your cache, your Java(tm) Virtual Machine*, and other
applications and services that are currently running. If you
do not see the problem the first time, try various combinations
of the following:
(1) Change the noOfManagers constant
(2) Change the duration of Thread.sleep in the run()
method
(3) Eliminate the -Djava.compiler=NONE flag
Some combination of these changes will manifest the problem on
almost any Java Virtual Machine.
The problem is with the FundManager object's call to
Stocks.transfer. To see what goes wrong, you will need to take
a look at the compiled byte code for the transfer method. (You can
generate a bytecode listing by using the "javap -c Stocks"
command.) Here is a partial listing that shows the trouble spot:
Compiled byte code for the line "balances[t.fundFrom] -= t.amount;"
0 getstatic #14 <field int balances[]>
3 aload_0
4 getfield #15 <field int fundFrom>
7 dup2
8 iaload
9 aload_0
10 getfield #12 <field int amount>
13 isub
14 iastore
It is not important that you understand exactly what the bytecodes
do. The important thing to understand is that your one line of code
translates into multiple lines of bytecode. In particular, the
loading and storing of the entry in the balances array happens in
two separate steps (numbered 8 and 14 above). On most
Java(tm) implementations, a thread may be context-switched at any
time. This means that the current state of the thread is saved, and
another thread is allowed a chance to execute. What happens if a
thread gets context-switched between steps 8 and 14, and another
thread comes along and changes the value of the stock's balance?
That is, what if the following happens:
Thread 1 loads balance 2000
Thread 1 switched out, thread 2 allowed to run
Thread 2 loads balance 2000
Thread 2 adds 500
Thread 2 stores new balance 2500
Thread 2 eventually switches out, thread 1 allowed to run
Thread 1 subtracts 100 (from 2000, not 2500!)
Thread 1 stores 1900
The change made by Thread 2 is permanently lost, and money has
been seemingly destroyed. Investors are not happy.
You might not see the problem behavior on your machines,
which suggests that the problem occurs with low probability.
However, though it might first appear counterintutive, bugs that
manifest rarely are worse than those that occur frequently. By
this standard, data corruption in a threaded program is a bug of
the nastiest sort: it is hard to spot the problem by reading the
code, it occurs only rarely in the field, and it may disappear
entirely when you run the code in a debugger!
Use a synchronized block to avoid basic data corruption problems
in threaded Java programs. The synchronized block looks like
this:
synchronized (someObject) {
file://code to be protected
}
Each object in the Java language is associated with its own
concurrency protection object, called a monitor. When a thread
encounters a synchronized block, it must acquire the monitor
associated with someObject before proceeding into the block.
When the thread exits a synchronized block, it releases the
monitor. Only one thread at a time may acquire a monitor, which
guarantees that there will never be more than one thread at a time
inside the block. If several blocks of code synchronize on the
same object, then only one thread can be in the entire set of
blocks.
So, to protect data, you need to (1) identify which pieces of data
might be touched by multiple threads, and (2) protect that data in
a synchronized block. In the Stocks example, the endangered data
is the balances array. The balances array is accessed from both
the transfer and checkSystem methods, so both methods will need
synchronized blocks:
static void transfer(Transfer t) {
synchronized (balances) {
balances[t.fundFrom] -= t.amount;
balances[t.fundTo] += t.amount;
}
}
static void checkSystem() {
int actual = 0;
synchronized (balances) {
for (int n=0; n < noOfStocks; n++) {
actual += balances[n];
}
}
System.out.println("Actual balance is " + actual);
}
Try running the program again. This time the balance will be
correct throughout the progam's run.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
MINIMIZING THE OVERHEAD OF
SYNCHRONIZED BLOCKS
The synchronized keyword works by blocking threads. In other
words, if one thread already has the monitor, then a second thread
is blocked, and cannot run until the first thread releases the
monitor. This makes the code run slower. An indicator of this
slowdown is the throughput variable in the TestFundManager; the
throughput variable counts the total number of transactions on all
threads. Try re-running the program with and without the
synchronized blocks. You might find that the synchronized version
runs slower than the unsynchronized version, perhaps as much as
30 percent slower. But you get correct results.
There are several ways to minimize the overhead of synchronized
blocks. Perhaps the most important is to avoid using them except
where necessary. For example, take a look at the throughput field
in the FundManager class. Each throughput field is accessed by two
different threads: the thread assigned to that FundManager, and
the application main thread. However, there is no need to
synchronize the throughput field because the application design
guarantees that those threads will never access the data at the
same time. The application thread does not look at these values
until after all the other threads have returned from their run
methods.
Another way to reduce overhead is to minimize the lines of code
inside the block. Examine the checkSystem method. The synchronized
block does not include the entire method. The call to
System.out.println does not access the balances array, so it does
not need any protection.
If you are willing to make your code more complex, you could gain
a performance benefit from fine-grained locking. The lock on the
balances field is a coarse-grained lock. In order to modify any
part of the array, you lock the entire array. To implement a
fine grained strategy, take out a separate lock for each position
in the array:
class Stocks implements FundConstants {
static int[] balances = new int[noOfStocks];
static Object[] locks = new Object[noOfStocks];
static
{
for (int n=0; n < noOfStocks; n++) {
balances[n] = 10000;
locks[n] = new Object();
}
}
static void transfer(Transfer t) {
int lo, hi;
if (t.fundFrom < t.fundTo) {
lo = t.fundFrom;
hi = t.fundTo;
} else {
lo = t.fundTo;
hi = t.fundFrom;
}
synchronized (locks[lo]) {
synchronized (locks[hi]) {
balances[t.fundFrom] -= t.amount;
balances[t.fundTo] += t.amount;
}
}
}
static int sumHelper (int next) {
synchronized (locks[next]) {
if (next == (noOfStocks-1)) {
return balances[next];
} else {
return balances[next] + sumHelper(next+1);
}
}
}
static void checkSystem() {
int actual = 0;
actual = sumHelper(0);
System.out.println("Actual balance is " + actual);
}
}
This code is much more complex than the coarse-grained version.
First, notice that you cannot lock on an integer type, so it is
impossible to take a direct lock on the items in the balances
array. Instead, you have to create an array of Object (here named
locks). These objects have no value themselves, you just want them
for their associated monitors. Second, notice that where you
formerly entered one monitor in the transfer method, you must now
enter two monitors. In the checkSystem method, the situation is
even worse. You must simultaneously enter all the monitors because
you are traversing all of the items in the array. This requires
a helper method, sumHelper, which holds each monitor as it
recursively takes the next one.
As soon as you have more than one monitor, an even more insidious
problem appears. What happens when two threads, each holding a
monitor, attempt to acquire the monitor held by the other thread?
Here is an example:
Thread 1 enters locks[4]
Thread 2 enters locks[6]
Thread 2 requests locks[4] and blocks waiting for Thread 1
Thread 1 requests locks[6] and blocks waiting for Thread 2
Neither thread can move forward, but neither can reach the end of
the synchronized block to let the other move forward. Both threads
will wait forever. This situation is called deadlock. If you
control all the threads and all the monitors, then you can avoid
deadlock by following this simple rule: All threads must request
monitors in the same order. If every thread requests monitors in
the same order, then there is no way for two threads to be stuck
waiting on each other. The code above follows this rule by always
acquiring the locks in number order. If you want to see deadlock,
try changing the order by changing the direction of the '<' in the
transfer method. This causes the transfer method to acquire the
monitors in decreasing numeric order, while the sumHelper method
is still using increasing numeric order. If you try running the
code this way, it will quickly deadlock.
After all this work, you deserve to see a performance benefit. In
this case, you probably will. Go back and fix the code so that it
doesn't deadlock. Now run the code, and compare it with the
results from the code in the first tip. Test runs produced the
following results:
No locking version: throughput around 780000
Coarse-grained locking: throughput around 560000
Fine-grained locking: throughput around 775000
It appears the fine-grained locking was worth the trouble in this
case, since the fine-grained version is running almost as fast as
the version with no locking at all. But don't jump to the
conclusion that fine-grained locking is always faster. In another
situation (or on different hardware), it might actually be slower.
You will need to test your own code to know for sure.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Threads allow you to take advantage of multiple CPUs, to increase
performance on single CPUs, and to increase the responsiveness of
your user interface. The ability to write threaded programs
correctly will be increasingly important, as multi-processor
machines become more widespread and user demands for performance
and responsiveness increase. If you write concurrent code in the
Java(tm) language, be sure to use synchronized blocks to protect
shared data. Be aware that most concurrent data problems have more
than one correct solution, and that you might want to investigate
both coarse and fine-grained locking schemes to maximize the
performance of your applications.
|