Building Distributed Applications with Aglet*

This project is listed on the IBM Tokyo Research Lab's aglet homepage.
-----

Chong Xu Dongbin Tao

-----


Introduction

Existing paradigms for building distributed applications fall into two major classes. The first class includes traditional RPC and, more recently, its object-oriented cousins RMI and CORBA. In this class, the functionality of an application is partitioned among the participating nodes, which use message passing to coordinate the distributed computation. Because the computation itself is partitioned, participants exchange intermediate results and other synchronization information. In the second class, computation migrates toward the resources. This class is especially useful for applications that must react immediately to incoming streams of real-time data and for distributed applications that are very tightly coupled. In this project, we experiment with one paradigm of the second class: Aglet. The name is shorthand for agent plus applet. Aglet gives us an infrastructure for building distributed applications of the second class.

The rest of the paper is divided into four sections. The first describes Aglet's concepts and architecture. The second defines the application. The third gives the details of our design and implementation of a distributed stock information system. Finally, we summarize the common working mechanism of Aglet-based applications.


Aglet Concepts and Architecture



An aglet is a Java-based Internet agent. A few phrases characterize it: written in pure Java, lightweight object migration, built-in persistence support, and event-driven. Java is necessary for a WAN application to be viable in today's heterogeneous networking environment: besides platform independence, it provides sandbox security that protects hosts against malicious attacks from alien applications.

Aglet differs from distributed object models in that the computation itself is transmitted, whereas distributed object models transmit requests for remote methods. Condor migrates computation too, but it migrates the whole process, including its stack. This is costly: even on a LAN the delay can be expected to be a few minutes. Aglet takes another approach: it uses a technique called serialization to transmit the data on the heap, and it migrates the interpretable bytecode. An aglet has a well-defined entry point from which it restarts computation after it arrives.

Aglet also comes with support for persistence. By calling the appropriate base-class functions, we can temporarily store an aglet in secondary storage and activate it later.
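The idea of carrying heap state and restarting from an entry point can be sketched with plain Java serialization. This is only an illustration under our own assumptions: CounterAgent, MigrationSketch and hop are hypothetical names, not part of the Aglet API, and the "network" is just an in-memory byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a mobile agent; this is not the Aglet base class.
class CounterAgent implements Serializable {
    private static final long serialVersionUID = 1L;

    // Heap state: serialized and carried to the next host.
    int hostsVisited = 0;

    // Transient state: skipped by serialization, so it does not travel.
    transient String scratch = "local only";

    // Entry point: called afresh on every host; no stack is carried over.
    void run() {
        hostsVisited++;
    }
}

class MigrationSketch {
    // Simulates one hop: serialize on the sender, deserialize on the receiver.
    static CounterAgent hop(CounterAgent agent) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(wire)) {
                out.writeObject(agent);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(wire.toByteArray()))) {
                return (CounterAgent) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("simulated migration failed", e);
        }
    }

    public static void main(String[] args) {
        CounterAgent agent = new CounterAgent();
        agent.run();                        // work on the first host
        CounterAgent arrived = hop(agent);  // "migrate" to the second host
        arrived.run();                      // restart from the entry point
        System.out.println(arrived.hostsVisited); // prints 2
        System.out.println(arrived.scratch);      // prints null
    }
}
```

The counter survives the hop because it is ordinary object state, while the transient field comes back as null. A process-migration system such as Condor would instead have to move the running stack as well.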



The big picture of the Aglet world has the following components: an aglet viewer such as Tahiti, aglet servers, and the aglets themselves. The aglet viewer is in many senses an applet viewer. In addition, it allows you to create, retract, activate, deactivate, and dispatch aglets. It is the client-side control center.

Aglet servers are powerful machines that can host a large number of aglets and typically hold large amounts of data or computing resources. Aglets live in the context of hosts. Each host enforces its security policy by configuring its security manager. When dispatched, each aglet carries an itinerary and follows it to choose its own route. In its lifetime, an aglet visits several aglet hosts, performs computation tasks on the host machines, and finally carries the results back.
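The itinerary idea can be sketched in plain Java as follows. This is a simulation under our own assumptions, not Aglet code: ExchangeHost, ItineraryAgent and ItinerarySketch are hypothetical names, hosts are entries in a map rather than remote machines, and the prices are made-up sample values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical host: a reachability flag plus local data.
class ExchangeHost {
    final boolean reachable;
    final Map<String, Double> prices;

    ExchangeHost(boolean reachable, Map<String, Double> prices) {
        this.reachable = reachable;
        this.prices = prices;
    }
}

// Hypothetical agent that carries its itinerary and its results with it.
class ItineraryAgent {
    private final List<String> itinerary;
    private final String symbol;
    final Map<String, Double> results = new LinkedHashMap<>();
    final List<String> skipped = new ArrayList<>();

    ItineraryAgent(List<String> itinerary, String symbol) {
        this.itinerary = itinerary;
        this.symbol = symbol;
    }

    // Visits each host in itinerary order and does the work locally there.
    void travel(Map<String, ExchangeHost> network) {
        for (String hostName : itinerary) {
            ExchangeHost host = network.get(hostName);
            if (host == null || !host.reachable) {
                skipped.add(hostName);   // remember the stop we could not make
                continue;
            }
            Double price = host.prices.get(symbol);
            if (price != null) {
                results.put(hostName, price);
            }
        }
    }
}

class ItinerarySketch {
    // Builds a small simulated network with made-up sample data.
    static Map<String, ExchangeHost> demoNetwork() {
        Map<String, Double> tokyo = new HashMap<>();
        tokyo.put("IBM", 101.5);
        Map<String, Double> newYork = new HashMap<>();
        newYork.put("IBM", 102.0);

        Map<String, ExchangeHost> network = new HashMap<>();
        network.put("tokyo", new ExchangeHost(true, tokyo));
        network.put("london", new ExchangeHost(false, new HashMap<>()));
        network.put("newyork", new ExchangeHost(true, newYork));
        return network;
    }

    public static void main(String[] args) {
        ItineraryAgent agent = new ItineraryAgent(
                Arrays.asList("tokyo", "london", "newyork"), "IBM");
        agent.travel(demoNetwork());
        System.out.println("results: " + agent.results);
        System.out.println("skipped: " + agent.skipped);
    }
}
```

The agent, not the client, decides what to do at each stop, and it comes home with only the collected results and the list of stops it could not make.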


Problem Definition



In this project, we use the aglet infrastructure to build a distributed application for a WAN. We believe this is the best way to understand the Aglet mechanism. The application we chose is a distributed stock information system. It retrieves stock information from stock exchange sites around the world (pseudo sites in our project). We assume that each exchange site has its own database and provides an aglet server. The client side of the application dispatches aglets to those aglet servers and fetches the information requested by end users.

The application is simple but complete. It exercises most of the mechanisms under the hood of Aglet. It also illustrates a model that is well suited to Aglet-based distributed computing. For example, we rely on the aglet to do intensive remote I/O; for complicated user requests, we ask the aglet to process and filter intermediate results on our behalf; and a user's request may only be satisfied by collecting data from several databases, some of which may be temporarily unavailable because of broken links. This scenario poses great challenges to application developers who work in a traditional RPC-like framework.


Design And Implementation

The server side of our application consists of aglet hosts and the databases that store all the stock information. On the client side, Stock is the major aglet. Stock is a static aglet: at the user's request, it dispatches slave aglets to the server side to fetch stock information.

We downloaded a popular freeware database engine, mysql-3.20.16-beta, and configured a SQL server on magpie. We built a separate database for each simulated exchange site. Conceptually, these databases are geographically distributed all over the world. We generated scripts to load the initial data into our databases automatically.

We use gweMysqlJDBC V0.9.2 as our JDBC driver to connect to our mysql databases.

Most of the application's work is done on the client side. We first describe the visual components of the client-side user interface.


Stock is the aglet that plays the role of the user's agent. It also controls the StockWindow, which manages interaction with end users. At the top of the StockWindow is the canvas object, which visualizes the final query result. Two instances of the TextArea class display intermediate results and report aglet-related information. At the bottom of the StockWindow is a panel. In this panel, three Choice components collect the user's request. There are also three buttons: one dispatches the aglet to the remote site, another refreshes the contents of the Choice components, and the third quits.

When the user presses the go button, the Stock aglet creates a slave called StockSlave, which is subclassed from ibm.aglets.patterns.Slave, and passes it the destination and the arguments of the query. StockSlave is the worker aglet that actually travels to the remote site. Upon arrival at the remote site, the doJob member function of StockSlave is called. doJob does the real work we assign to the slave. Here, it extracts the arguments and connects to the database. It then issues a SQL statement through JDBC to query the stock information the user is interested in. If there is more than one destination in its itinerary, it may dispatch itself to the next destination after it finishes work at one site. After finishing the itinerary, the slave brings back the query results in an instance of the TransferInfo class, which implements the Serializable interface.

When the StockSlave returns to the original site, the callback member function of the static master aglet Stock is activated. The result is passed as an argument to callback. The Stock aglet extracts the result and displays it on the canvas.
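The hand-off between master and slave can be sketched in plain Java. This is not our Stock or StockSlave code and it does not use the ibm.aglets.patterns classes: SketchMaster, SketchSlave, QueryResult and ResultCallback are hypothetical names, and the dispatch, the remote query and the trip home are collapsed into ordinary method calls.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical result carrier, similar in role to TransferInfo.
class QueryResult implements Serializable {
    private static final long serialVersionUID = 1L;
    final List<String> rows = new ArrayList<>();
}

// Hypothetical contract: the master is called back with the result.
interface ResultCallback {
    void callback(QueryResult result);
}

// Hypothetical slave that is given a destination and query arguments.
class SketchSlave {
    private final String destination;
    private final String symbol;
    private final ResultCallback master;

    SketchSlave(String destination, String symbol, ResultCallback master) {
        this.destination = destination;
        this.symbol = symbol;
        this.master = master;
    }

    // Stand-in for the remote work; a real slave would run a JDBC query here.
    QueryResult doJob() {
        QueryResult result = new QueryResult();
        result.rows.add(symbol + "@" + destination);
        return result;
    }

    // Simulates dispatch, remote work, and the return home in one call.
    void dispatchAndReturn() {
        master.callback(doJob());
    }
}

// Hypothetical static master that stays on the client side.
class SketchMaster implements ResultCallback {
    final List<String> displayed = new ArrayList<>();

    // Called when the user presses the go button.
    void onGoPressed(String destination, String symbol) {
        new SketchSlave(destination, symbol, this).dispatchAndReturn();
    }

    // Receives the result and "displays" it by recording the rows.
    @Override
    public void callback(QueryResult result) {
        displayed.addAll(result.rows);
    }
}

class MasterSlaveSketch {
    public static void main(String[] args) {
        SketchMaster master = new SketchMaster();
        master.onGoPressed("tokyo", "IBM");
        System.out.println(master.displayed);
    }
}
```

The master never touches the database itself. It only hands over the arguments and waits to be called back with a serializable result.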

While implementing our application, we encountered a few problems. The first was the caching of bytecode. We modify our stock database by inserting new records into the tables. The insertions all work fine from a stand-alone Java application. But when we used an aglet to do the insertion for us, we got a duplicate key error from the SQL server.

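Example:
	String query = "INSERT INTO jdbc VALUES (17, 20000, 8000000, "
	             + "'This is a variable string')";
	// JDBC specifies executeUpdate for INSERT statements;
	// executeQuery is meant for statements that return rows.
	int rows = stmt.executeUpdate(query);

	The first field, 17, is the primary key. If we change it to another key
	and perform the same insertion through the aglet, we still get a
	"duplicated primary key error".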

The reason is rather complicated. An aglet is composed of two kinds of data: bytecode and state. Not all of an aglet's state is serializable; only the state on the heap can be carried from site to site. If the query string is a local variable inside the doJob member function of StockSlave, it is part of the bytecode.

One possible cause of the problem is that the checksums of the bytecode of the modified aglet and of the previous aglet are the same. In that case, the remote aglet server reuses the cached bytecode, so we execute the old SQL statement again and get the error message from the SQL server.

To circumvent this problem, we define query as a member variable of the aglet. It then becomes part of the aglet's state on the heap and is thus visible at the remote site.

Other problems are related to alpha and beta releases. Since the entire project is based on alpha or beta releases of shareware, we had to spend a substantial amount of effort appreciating the special flavors of the downloaded software.
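The suspected behavior and the workaround can be illustrated with a toy model. This models only our hypothesis about the cache, not how the aglet server is actually implemented. CachingHostModel and CacheSketch are hypothetical names, a string stands in for the compiled bytecode, and the short "INSERT key ..." strings are placeholders for the real SQL.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a host that keeps the first copy of the code it sees
// for a given class name (an assumption that mirrors the hypothesis).
class CachingHostModel {
    private final Map<String, String> codeCache = new HashMap<>();

    // literalInCode models a string constant compiled into the class file.
    // queryInState models a member variable carried in the serialized
    // state; it is null when the agent keeps the query only as a literal.
    String queryExecuted(String className, String literalInCode, String queryInState) {
        // First arrival stores the code; later arrivals reuse the cached copy.
        String cachedLiteral = codeCache.computeIfAbsent(className, k -> literalInCode);
        return queryInState != null ? queryInState : cachedLiteral;
    }
}

class CacheSketch {
    public static void main(String[] args) {
        CachingHostModel host = new CachingHostModel();

        // First run: the literal with key 17 is cached along with the code.
        System.out.println(host.queryExecuted("StockSlave", "INSERT key 17", null));

        // Modified literal, same class name: the stale cached code runs again.
        System.out.println(host.queryExecuted("StockSlave", "INSERT key 18", null));

        // Workaround: the query travels as state, so the new value is used.
        System.out.println(host.queryExecuted("StockSlave", "INSERT key 18", "INSERT key 18"));
    }
}
```

In this model the second call still executes the key 17 statement, which matches the duplicate key error we observed, while the third call executes the new statement because the value no longer depends on the cached code.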


Summary

In this project, we built a simple aglet application for stock information retrieval. Here we summarize our experience with Aglet-based applications. First, we believe Aglet is suitable only for a specific set of applications, such as those that need intensive remote computation, remote decision making, or remote real-time interaction. Second, the aglet infrastructure greatly eases development for that class of applications. Third, Aglet's popularity will depend strongly on the availability of public aglet hosts. Finally, we feel that the management of mobile aglets will be a critical problem. Since aglets are users' agents, they may be cached on multiple sites and are vulnerable to attacks by hackers.


Any Comments?

Please send your comments to xuchong@yahoo.com