Architectural Trends

As this is my personal take on emerging trends in technology, I'm going to indulge in a somewhat meandering musing. These last couple of years in general, and the one just past in particular, have been difficult and confusing. We experienced the technology melt-down followed by even more trying circumstances, known to us all and so not requiring mention. The industry seems to be both converging and diverging, and people are understandably concerned with direction. I'm going to address a number of different issues in this editorial and it's going to be somewhat technical at times, so I beg your forbearance.

As I've mentioned before, we've seen a lot of new frameworks and technologies emerge in the last little while and some of them are going to power enterprises over the next decade at least. What's important is to discern their suitability for a particular task or tasks. There's no single "white knight" like Java on the horizon, ready to address all our requirements, but a rather confusing combination of standards and technologies. The choices have become more complex and yet this is not altogether different from how the industry has worked for decades.

So let's investigate some of the emerging trends and try to place them in their correct historical perspective. First of all, the Y2K situation came about as the result of a conflict between a pressure and an expectation. Storage, whether in memory, on tape or on disk, was expensive. You could store many more records if you didn't record the century, for example. That was considered redundant, and these systems would never live to see the 21st century anyway, would they? Don't forget, applications written in COBOL in the '50s and '60s were hardly expected to have a life expectancy of 30, 40 or even 50 years.

Storage costs remained high until fairly recently. When the Radio Shack TRS-80 came out, a 5 MB hard drive cost about $5,000. Now we can purchase 40 GB drives for around $100. So we've gone from $1,000/MB to about $0.0025/MB, roughly a 400,000-fold reduction. Now take a look at the cost of memory. A 4 MB Itel memory box for an IBM mainframe could have cost you well over $100,000 in the late '70s/early '80s. I can now purchase a PC-133 128 MB card for around $40. So we went from around $25,000/MB to approximately $0.30/MB.

The overarching concern in the early days was to conserve space any way possible. The IBM 360 architecture, with its 24-bit address, could only directly access 16 MB of main memory. IBM even used "binary-coded decimal", storing two decimal digits in a single byte since the digits 0-9 can be represented in 4 bits. Some incredibly powerful systems were developed on this architecture, some of which remain with us to this day (like CICS). Low bit densities on magnetic media like tape (556 bits/inch on 7-track tape) and disk also required techniques which could store the greatest amount of data in the least space. No effort was spared in compressing information to the bare minimum required.

Communication links of the mainframe era were typically lines leased from the telephone companies and running protocols such as SDLC (Synchronous Data Link Control). Even so, at speeds of 19.2 Kbps the aim was to send as little data as possible in order to perform a particular task. Later, in the days of the mini-computers (like the Digital Equipment Corporation PDP-11s), we had remote terminals connecting to the hosts using dial-up modems at the blistering speed of 300 baud. The original consoles for the DEC machines were ASR-33 teletypes which could only achieve 110 baud. That's partially why the UNIX operating system is so terse, with one, two or three letter command names: communication speed, even to the console, was limited.

Fast forward to today and many people consider dial-up to be ancient history. Even if you can get a 53 Kbps dial-up connection (not common), it's nothing compared to the speeds of xDSL or cable modem. My connection runs at about 2 Mbps on the downstream side, more than 100 times faster than a 19.2 Kbps leased line which could have supported an entire bank branch in days gone by. Now that we're not bandwidth constrained, it no longer makes sense to use compressed and cryptic standards such as EDI. Now that we've got the Internet, we no longer have to exchange magnetic tapes and worry about format and compatibility issues. And that's where XML enters the picture.

SGML (Standard Generalized Markup Language) has been around for a long time, used internally in products such as IBM's DCF (Document Composition Facility). DCF tags were typically short and cryptic since you still had to worry about the size of files on disk. HTML borrowed from SGML and now controls the presentation of what we view in our web browsers. As its name implies, XML is an extensible markup language and the definition of tags is unlimited. We still require a grammar to understand an XML document, and that's where DTD (Document Type Definition) and XML Schema files come into play. XML Schema is a powerful mechanism which fully describes the document format as well as the tag contents.

The huge advantage of XML documents is that they're text-only as well as self-describing. As such they can be transmitted over any transport protocol currently in existence. Fields are tag-delimited, and the ability to escape the tag start marker within content has been addressed. The handling of white space characters is defined within the various tag content types, and even normalization (essential for digital signatures) has been codified. While the namespace support is a bit unwieldy, there is a method to their madness. While necessarily more verbose than other document encoding schemes, technological advances have mostly served to ameliorate that objection.

We also have a supporting cast of tools: DOM and SAX parsers to manipulate information in an XML document and validate the document against a schema. There's also XSLT which enables the reformatting of an XML document. So how do they all work together? Below is a diagram showing a hypothetical application. An external source generates an XML document which is transmitted to the consumer. A validating DOM parser checks the document against the schema and passes the document to XSLT which uses a stylesheet to transform the document into a proprietary internal format.
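To make the transformation step concrete, here's a minimal sketch using the XSLT support bundled with later JDKs (javax.xml.transform); the document, the stylesheet and the "internal format" are all invented for illustration:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltDemo {
    // Transform an incoming XML document into a hypothetical internal
    // format using a stylesheet. A validating DOM parse against the
    // schema would normally precede this step; it's omitted for brevity.
    public static String transform(String xml, String xslt) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<patient><name>Jane Doe</name></patient>";
        // A stylesheet that flattens the document to a plain-text line,
        // standing in for a proprietary internal format.
        String xslt =
            "<xsl:stylesheet version='1.0' " +
            "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
            "<xsl:output method='text'/>" +
            "<xsl:template match='/patient'>" +
            "NAME=<xsl:value-of select='name'/>" +
            "</xsl:template>" +
            "</xsl:stylesheet>";
        System.out.println(transform(xml, xslt));  // NAME=Jane Doe
    }
}
```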

But that's more of a batch-oriented process. Let's look at a more typical flow. E-mail (RFC-822) is another text-only format so things like binary attachments have to be encoded into a text representation. We typically use base64 to convert binary information into text format. The output from this process can just as easily be embedded in an XML document. So let's assume that we need to extract information from an XML document (an x-ray image perhaps) and insert it into a database. The flow might look something like this:

In this case the application gives the document and possibly the schema (it might be derived from the document itself) to the parser. The document object returned can then be processed and the necessary information extracted and passed to the base64 decoder. The output could then be written to a local database. Note that the arrow in this case runs from the application box, not the base64 application. The base64 decoder is a simple process which reads encoded input and writes decoded output. It is up to the application to contain database interface code like JDBC (Java DataBase Connectivity).
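A rough sketch of that extraction in Java, using the JDK's bundled DOM parser and the Base64 codec that later JDKs include (the <record>/<image> element names are made up, and the JDBC write to the database is omitted):

```java
import java.io.StringReader;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XrayExtract {
    // Pull the base64-encoded payload out of a hypothetical <image>
    // element and decode it back to the original bytes. Writing the
    // result to a database via JDBC is left to the application.
    public static byte[] decodeImage(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        String encoded = doc.getElementsByTagName("image")
                .item(0).getTextContent().trim();
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) throws Exception {
        String payload = Base64.getEncoder()
                .encodeToString("not really an x-ray".getBytes("UTF-8"));
        String xml = "<record><image>" + payload + "</image></record>";
        System.out.println(new String(decodeImage(xml), "UTF-8"));
    }
}
```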

Now let's widen the focus even more. What if we want to expose this process to the web? Our application could be rewritten as a servlet and contained in a servlet container which could communicate with a web server. In this case, we're going to assume an Apache web server and the Tomcat servlet container. No doubt there will be some who would ask "why not just use a J2EE server?" What I'm trying to demonstrate is incrementally increasing complexity. Depending on the individual customer's needs, this approach might be all that is required. Also note that Apache and Tomcat are both free and run on Linux, which is also free. So now we have something which looks like this:

So now the picture is getting a bit crowded. We're also going to have to start to address some real-world concerns like security and availability. That's why I introduced the x-ray scenario. We can't have just anybody inserting an x-ray into a patient's file. We also have to be wary of eavesdroppers while the data is in transit. Fortunately there's already a solution to that particular aspect in the form of Secure Sockets Layer (SSL), included with most web server platforms. With the easing of cryptography export laws and the expiration of RSA patents, robust SSL implementations are available at no cost.

While SSL addresses encryption of the data in transit, we still need to be able to validate the credentials of the sender. We also need to ensure that the data has not been modified in transit by a "man in the middle" type of attack. We can achieve both of these goals quite elegantly with a single enhancement: content signing using public key cryptography. For those unfamiliar with this concept, here's the gist. A message digest (typically MD5) is generated for the normalized document and then signed with the private key of the sender. Decrypting the encrypted digest with the public key of the sender should produce the same result as digesting the document at the destination. If they match, it proves that the private key of the claimed sender was used to encrypt the digest and that the content was not modified while in transit.
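Here's a minimal sketch of that digest-and-sign round trip using the JDK's java.security classes; the freshly generated key pair stands in for the sender's real credentials, which in practice would come from a certificate:

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SigningSketch {
    // Sign a document with the sender's private key and verify it with
    // the matching public key. "MD5withRSA" digests the input with MD5
    // and encrypts the digest with the RSA private key in a single step.
    public static boolean signAndVerify(byte[] document) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair keys = gen.generateKeyPair();

        Signature signer = Signature.getInstance("MD5withRSA");
        signer.initSign(keys.getPrivate());
        signer.update(document);
        byte[] sig = signer.sign();

        // The receiver recomputes the digest and compares it with the
        // digest recovered from the signature via the sender's public key.
        Signature verifier = Signature.getInstance("MD5withRSA");
        verifier.initVerify(keys.getPublic());
        verifier.update(document);
        return verifier.verify(sig);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(signAndVerify("<xray>...</xray>".getBytes("UTF-8")));
    }
}
```

If the document is altered by even one byte in transit, verify() returns false.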

Granted, the last paragraph was very technical. What I'm attempting to convey is that elements like encryption and non-repudiation are going to gain increasing importance. If we're going to start sending highly sensitive and/or personal information over the 'net, then cryptography needs to be addressed. There's a significant learning curve involved here, and some elements of the overall picture are still works-in-progress (ensuring the validity of a public key, for example), but ignoring the issues isn't going to make them disappear. They will become even more pressing with the emergence of wireless applications.

The importance of availability is relative; there are few hard-and-fast rules save for certain essential services like police departments. For the rest, you have to weigh the costs against the benefits. A company doing millions of dollars of on-line business every hour simply cannot afford service interruptions. E-commerce on the web is slightly different, however. Customers are now used to intermittent outages and scheduled maintenance. My bank, for example, doesn't make on-line banking available during their maintenance window on Sunday evenings. I'm annoyed if I can't obtain my balance due to an unscheduled outage but it's not generally a matter of life and death. If I can't use my debit Visa card to make a purchase due to network outages then I'm likely to be more upset.

There's an entire spectrum of possibilities available today. Single servers might need nothing more than a battery UPS and off-site backups. Others might want duplicate servers with an intelligent load-balancing switch connecting them to the 'net. Still others might require a high-availability cluster to host the database, separate from the webservers. Then there are RAID-5 dual-ported disk arrays, multiple Internet links from multiple vendors on separate cable bundles, on-line backups to disk via gigabit ethernet, multiple sites with real-time database mirroring...well, you get the picture. I always recommend engaging the services of a professional when planning your approach.

And this discussion could not be complete without addressing disaster recovery. As the events of two years ago demonstrated, this oft-overlooked facet of systems management can be vitally important. It doesn't matter that your offices aren't located on a flood plain (witness Europe last summer) or on a fault line. Fire, civil unrest, terrorism and a myriad of other possibilities cannot be safely ignored. Many companies have been forced out of business simply because they didn't have a plan in place for the inconceivable. Having a plan and testing it regularly is still the best advice I can give in this area. Given inevitable staff turnover, it's as important to know the members of your disaster recovery team as the floor fire marshals.

We can start to address some of these issues as we make the example architecture more complex. The database in the previous models has been portrayed in the traditional way, as a single, homogeneous data store. Modern relational databases have become remarkably robust and can hide much of their internal complexity. Their locking capabilities prevent multiple clients from simultaneously attempting to alter the same data. Stored procedures and/or triggers can encapsulate business logic, but these tend to be proprietary. As such, a customer can become wedded to a particular vendor simply because the cost and complexity of migration would be onerous.

This obviously runs counter to current thinking, which is that everything should be as portable as possible. We're seeking true plug-and-play compatibility but how do we achieve that goal while still incorporating that layer of business logic? Moving up the line from portable database access with JDBC, Sun Microsystems has introduced J2EE (Java 2 Enterprise Edition) to an eager market. Incorporating webserver, servlet engine, JNDI (Java Naming and Directory Interface), EJBs (Enterprise Java Beans) and transaction management, J2EE represents an ambitious attempt to provide an all-encompassing framework for business applications.

Since the webserver and servlet engine architectures have been implemented in other products, where's the differentiation in J2EE? That would be EJBs and the transactional server model. EJBs come in two flavours, each with two variants. Entity beans can be thought of as one step up from JDBC, a wrapper for database table or view rows. Objectifying the data if you will. Instead of extracting data from a ResultSet with something like surname = rs.getString( 12 ) you could use the much more descriptive surname = bean.getSurname(). While that doesn't at first appear to offer much in the way of advantage, what happens if the database schema is altered at some time in the future with additional columns inserted? Every instance of the first code sample might have to be modified while the second would continue to function without change.

That's a tough nut to crack without some concrete examples so I'll have to throw in some code snippets to demonstrate my meaning. Suppose an American company creates a database table thusly:

	ZIP		CHAR(5),
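Only the ZIP column appears in the fragment above; a full script consistent with the column positions used in the outcomes below (ZIP at position 5, LASTNAME at 7, DATEOFBIRTH at 8) might look like this, with the table name and the other column names purely illustrative:

```sql
CREATE TABLE CUSTOMER (
	ID		INTEGER,
	FIRSTNAME	CHAR(20),
	STREET		CHAR(30),
	CITY		CHAR(20),
	ZIP		CHAR(5),
	PHONE		CHAR(10),
	LASTNAME	CHAR(20),
	DATEOFBIRTH	DATE
);
```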
The code to extract the date of birth from the record might look something like this:
	ResultSet	rs;
	// perform a select using Statement.executeQuery and
	// put results into the result set
	Date	dob = rs.getDate( 8 );
The real Java programmers out there would be quite correct in suggesting the following line of code instead:
	Date	dob = rs.getDate( "DATEOFBIRTH" );
That alternate approach would still fail with the wrinkle I'm about to introduce. Suppose that the company extends their operations to Canada. In Canada they use postal codes, not ZIP codes, and they're six characters long, not five. And we'll also need a country code in the table to differentiate between the two. Now the create script looks something like this:
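A sketch of that revised script, consistent with the outcomes below (again with illustrative names, and the COUNTRY column's width an assumption): POSTCODE stays at position 5 while everything after the new COUNTRY column shifts down by one.

```sql
CREATE TABLE CUSTOMER (
	ID		INTEGER,
	FIRSTNAME	CHAR(20),
	STREET		CHAR(30),
	CITY		CHAR(20),
	POSTCODE	CHAR(6),
	COUNTRY		CHAR(2),
	PHONE		CHAR(10),
	LASTNAME	CHAR(20),
	DATEOFBIRTH	DATE
);
```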
Now, the DBAs are going to rail against this modification. They might suggest that one never inserts a column into a table but merely adds it to the end. But the architect here has decided that the country code belongs at this position in the table. Not only that, he's changed the name of the field which used to be ZIP to POSTCODE and changed its length. Now let's look at some outcomes.

  1. Retrieval of DATEOFBIRTH by field position:
    	Date	dob = rs.getDate( 8 );
    This will break when the data in the LASTNAME field cannot be parsed as a Date field.
  2. Retrieval of DATEOFBIRTH by field name:
    	Date	dob = rs.getDate( "DATEOFBIRTH" );
    This will work.
  3. Retrieval of ZIP by field position:
    	String	zip = rs.getString( 5 );
    This will work too.
  4. Retrieval of ZIP by field name:
    	String	zip = rs.getString( "ZIP" );
    This will fail since there is no longer a field by the name of ZIP.

So how would we deal with these problems using entity EJBs? The first one is dead simple: we'd still use something like Date dob = bean.getDateOfBirth(). The second one is trickier but still easy to solve. All we would do is add a method to the entity bean named getZip() which would return exactly the same data as that returned by getPostCode(). In fact, we could have getZip() simply return the output from the invocation of getPostCode(). We'd have to do the equivalent for the setZip() method.
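That delegation might be sketched like this on a hypothetical Customer bean (reduced to a plain class here; a real entity bean would also implement the EJB interfaces):

```java
public class Customer {
    private String postCode;

    public String getPostCode() { return postCode; }
    public void setPostCode(String postCode) { this.postCode = postCode; }

    // Legacy accessors kept after the ZIP -> POSTCODE rename: they
    // simply delegate, so existing callers continue to work unchanged.
    public String getZip() { return getPostCode(); }
    public void setZip(String zip) { setPostCode(zip); }
}
```

Callers written against getZip()/setZip() never notice that the underlying column was renamed and widened.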

Now some are going to complain that their carefully crafted ZIP code validation routines are going to fail on Canadian postal codes. That's quite true and leads us up the chain to session EJBs.

First off, I should mention that there are two types of entity beans: container-managed persistence (CMP) and bean-managed persistence (BMP). The nuances are far too complex to deal with here, so I'll just suggest you search on Sun's web site or purchase one of the many excellent books on the subject. Similarly, session beans come in two varieties: stateful and stateless. I'm only going to address stateless session beans here, but there are many additional sources of information, both on the web and in your local bookstore.

A stateless session bean, as its name implies, does not maintain state between method invocations. Everything necessary for the completion of a task must be provided as arguments. Session beans implement the business logic of an organization, just like stored procedures. They can access databases directly via JDBC and/or make use of other beans, entity or session, local or remote. And it's here that the beauty of the J2EE framework really shines. Unlike CORBA where facilities like transaction and security services are optional extras, they are incorporated directly into any J2EE-compliant server.

This advantage becomes clear when you consider the nature of complex transactions involving multiple data sources. The server manages the transactional and security contexts of its clients. As with Java itself, exceptions are passed up the chain until caught. In the case of a session EJB which invokes a sequence of methods on a variety of targets, an uncaught exception will flow up to the container and the entire transaction will be rolled back without any effort on the part of the coder! Credentials are also managed by the container, and each element of a chain can examine the credentials and either permit or deny access to data and/or methods based on that information. Security exceptions thrown and caught by the container similarly cause transaction rollbacks.

In any case, the J2EE environment is complex (because it has to be) yet simpler than the CORBA equivalent. It's a powerful thoroughbred which offers an elegant approach to providing programmers with the services they require for industrial-strength applications. As previously discussed, not every customer has the need for this level of complexity. It should also be noted that J2EE servers are expensive and should only be deployed in those situations where their capabilities are truly required. Here's what our complete example architecture might look like:

Some might ask "why not put the base64 decoding in the session bean?" The reason is that base64 is merely an external data representation and therefore belongs at the presentation layer. One of the fundamental concepts of bean design is reusability. We might utilize different channels in the future, but we should be able to reuse our x-ray session bean without modification. This becomes even more important as systems grow larger and more complex. We might have a client which uses the name services component of the J2EE server to access the home interface of the session bean and create a new instance directly. Requiring that the data be in base64 format for the method invocations would introduce additional complexity for the client.

Finally, this discussion would be incomplete without mention of Web Services. Web Services can be thought of as a form of universal RPC (Remote Procedure Call) mechanism. In the most common form it uses SOAP (Simple Object Access Protocol) over HTTP to invoke a method on an object class instance. It certainly has a lot of potential in many situations, but it's not the ultimate solution any more than was DCE (Distributed Computing Environment). But since our example was based on an XML document being provided as the source, here's how we would extend it to incorporate Web Services:

The UDDI (Universal Description, Discovery and Integration) transactions are not depicted here since they're outside the purview of a single request. Note that the only difference in this case is that the XML document has been wrapped in a SOAP envelope. To be fair, SOAP is designed to also be transported over SMTP (not surprising as it is merely XML wrapped in XML) and to interface to JMS.
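For illustration, an x-ray document wrapped in a SOAP 1.1 envelope might look like this (the <record> and <image> element names are invented):

```xml
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <record>
      <image><!-- base64-encoded payload --></image>
    </record>
  </soap:Body>
</soap:Envelope>
```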

And there you have it. We're moving to a less proprietary model and some very powerful and proven technologies. I can understand that some companies don't feel the need to adopt some of these models. Web Services in particular is not exactly the Holy Grail. What should be clear is that now is the perfect time to gain familiarity with the options. For many organizations which have decided to hold off on new projects, there are opportunities to significantly enhance the delivery of services and contribute to the all-important bottom line. Starting soon could be a prudent move as I expect the balance of power to shift again within the next two quarters.

Update: July 16th, 2003

Copyright © 2002, 2003 by Phil Selby