Developing Clouds

What I mean is deploying applications to the cloud. I recently found myself on the AWS (Amazon Web Services) site doing some research on cloud computing. A client of mine had a very aggressive schedule for deployment of a garden-variety J2EE application. When I say "garden-variety" I mean the typically complex J2EE application, with a JSF front-end, EJBs, DAOs, etc. It was designed (by me) to be database-agnostic but I tend to like MySQL for development and light-duty production. Plus it doesn't require a $35,000 license fee like some other databases. As with any properly-designed application, this one can be deployed on any standards-compliant J2EE server. I like Glassfish because it's free as well as being the reference implementation.

I was merely absorbing information on the site when the phrase "free usage tier" caught my eye. It provides up to 750 hours of usage of a wide variety of services. Sure, you have to provide a credit card so that they can charge you if you exceed the usage permitted, but that's not unreasonable. A few keystrokes and a verification telephone call later and I was ready to create my first instance. They have comprehensive documentation available on the site so I'm not going to duplicate it here. Suffice it to say, it didn't take long to create a new t1.micro instance. They generate a private key so that you can ssh and scp into the instance. The operating system is Amazon's flavour of Linux but most contemporary versions are (thankfully!) remarkably similar.

So the first thing to do is to log in to your new instance. It's fairly standard:

$ ssh -X -i keyFile ec2-user@hostName

where keyFile is the name of the file in which you stored the provided PEM private key and hostName is the hostname. So how do you know the hostname? Go to https://console.aws.amazon.com/console/home and click on EC2. In the Navigation box, select Instances. Click on your instance and the box below will be populated with the details. The default tab is Description. Scroll down until you get to the Public DNS field. The value is the hostname. It maps directly to an IP address so if the hostname is ec2-a-b-c-d.compute-1.amazonaws.com then the IP address is a.b.c.d.
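If you'd rather not do the hostname-to-IP translation in your head, it can be scripted. This is just a minimal sketch that assumes the ec2-a-b-c-d naming pattern described above; the function name ec2_host_to_ip and the sample hostname are my own inventions for illustration.

```shell
#!/bin/sh
# Derive the IPv4 address embedded in an EC2 public DNS name.
# Assumes the ec2-a-b-c-d.<zone>.amazonaws.com pattern described above.
ec2_host_to_ip() {
    # Keep only the first label (ec2-a-b-c-d), drop the "ec2-" prefix,
    # then turn the remaining dashes into dots.
    printf '%s\n' "$1" | sed -e 's/\..*$//' -e 's/^ec2-//' -e 'y/-/./'
}

# Sample hostname (a documentation address, not a real instance).
ec2_host_to_ip ec2-203-0-113-10.compute-1.amazonaws.com
```

If the hostname doesn't follow that pattern the result will be nonsense, so treat it as a convenience rather than a lookup.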

The default user is ec2-user but it can run commands as the superuser by simply prefixing everything with sudo. That's fine for perhaps most people but when I want to perform a number of steps as root then I find it more convenient to su and exit when I've done all that I intended. It's just the way I've gotten used to doing things, after decades of experience with various *NIX systems. It's easy enough to change the root password:

$ sudo passwd root

You'll be prompted to enter and verify the new root password. Now we need to start installing the "extras" we're going to need. It's straightforward:

# yum install xauth
# yum install ant
# yum install mysql
# yum install mysql-server
# yum install xterm

You may wonder at the requirement for xauth. When you install glassfish, it pops up a window with the license agreement which you'll have to accept before the installation continues. If you don't have xauth then the window can't be displayed, even though X11 forwarding is configured by default. The last one isn't required but I use Cygwin/X when I'm on a certain unnamed platform and the default window you get when you fire it up has a really small text size. I prefer to get a more readable one by using:

$ xterm -fn 10x20 &

Yes, I know that you can change the font size in the xterm window but I just prefer the look of the 10x20 font. It's a personal preference and YMMV.

Although the MySQL server is now installed, it doesn't start automatically upon installation so you have to do this:

# /etc/rc.d/init.d/mysqld start

At this point you can do any required configuration such as creating users and databases and creating and populating the schema. There is no default password for the root user which is just fine by me. BTW, if you need to transfer a script to perform the schema generation and table population just follow the directions here.
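For what it's worth, here's roughly how the database and user creation might be scripted. It's a sketch, not the exact setup I used: the function name make_setup_sql and the database, user, and password values are placeholders, and it only prints the SQL so that you can review it before feeding it to mysql.

```shell
#!/bin/sh
# Print the SQL needed to create a database and an application user.
# All three arguments are placeholders chosen for this example.
make_setup_sql() {
    db=$1 user=$2 pass=$3
    cat <<EOF
CREATE DATABASE IF NOT EXISTS \`$db\`;
CREATE USER '$user'@'localhost' IDENTIFIED BY '$pass';
GRANT ALL PRIVILEGES ON \`$db\`.* TO '$user'@'localhost';
EOF
}

# Review the generated statements before running them.
make_setup_sql appdb appuser changeme
```

Once you're happy with the output, pipe it into the client as the MySQL root user, e.g. make_setup_sql appdb appuser changeme | mysql -u root.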

We need to download a couple of products required by the application. Obviously we need the Glassfish server but the application also requires OpenOffice in order to perform some document conversions. The basic operating system doesn't come with a browser, or at least none I could find. I tried installing Firefox but it started to look like one of those trips down the rabbit hole so I abandoned my efforts and went back to the old stalwart: wget. Many (most?) of you will have never needed to use this, or possibly even heard of it, but it's been around since the mid-'90s. Ancient history, to be sure (at least by 'net standards), but it fits the bill nicely when you don't need a full browser.

I always like to have a standard directory for downloads. Depending on the nature of the system, it's either ~/downloads or /opt/downloads. I just like to have a standard repository for downloaded artifacts in case I ever have to reload applications. Here's what I did:

$ cd
$ mkdir downloads
$ cd downloads
$ wget http://download.java.net/javaee5/v2.1_branch/promoted/Linux/glassfish-installer-v2.1-b60e-linux.jar
$ wget -O Apache_OpenOffice_incubating_3.4.1_Linux_x86-64_install-rpm_en-US.tar.gz http://sourceforge.net/projects/openofficeorg.mirror/files/stable/3.4.1/Apache_OpenOffice_incubating_3.4.1_Linux_x86-64_install-rpm_en-US.tar.gz/download

Some might be curious as to why I'm not downloading v2.1.1 or v3 of Glassfish. I tried deploying the application under 2.1.1 and ran into some mysterious issues. I didn't have time to diagnose the problems and don't have time to test under v3 so I'm happy with v2.1. It's been very stable, in my personal experience, and provides everything we need.

We have to jump through a couple of hoops to install OpenOffice:

# gunzip *.gz
# tar xvf *.tar
# cd en-US/RPMS
# rpm --install *.rpm
# cd ../..
# rm -fr en-US

I like to clean up after the installation; we can always regenerate the RPMS directory later by untarring the gunzip'd file.

As with the downloads directory, I usually create the glassfish directory in either my home directory or /opt. For this test I'll just build it in my home directory:

$ cd
$ mkdir glassfish
$ java -Xmx256m -jar downloads/glassfish*.jar
$ cd glassfish
$ ant -f setup.xml

This is fairly standard and generates the default domain with an admin password of adminadmin. You'll probably want to change this, even though the console will only be available for very short periods of time.

You'll have to create a script in /etc/rc.d/init.d for starting and stopping glassfish, i.e. asadmin [ start-domain | stop-domain ]. Since I want OpenOffice to run as a daemon, I need to create another script in the same directory to start OpenOffice with the following command:

/opt/openoffice.org3/program/soffice -headless -nofirststartwizard -accept="socket,host=localhost,port=8100;urp;StarOffice.Service"
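To give you an idea of what the glassfish script might look like, here's a bare-bones sketch. The install path and the ASADMIN and DOMAIN variables are my assumptions (glassfish in the ec2-user home directory, default domain1), so adjust to taste. The OpenOffice script would follow the same shape, wrapping the soffice command above.

```shell
#!/bin/sh
# chkconfig: 3 90 10
# description: Starts and stops the GlassFish default domain.
#
# Sketch of /etc/rc.d/init.d/glassfish. The install path and domain name
# below are assumptions; adjust them to match your setup.
ASADMIN=${ASADMIN:-/home/ec2-user/glassfish/bin/asadmin}
DOMAIN=${DOMAIN:-domain1}

glassfish_ctl() {
    case "$1" in
        start)   "$ASADMIN" start-domain "$DOMAIN" ;;
        stop)    "$ASADMIN" stop-domain "$DOMAIN" ;;
        restart) "$ASADMIN" stop-domain "$DOMAIN"
                 "$ASADMIN" start-domain "$DOMAIN" ;;
        *)       echo "Usage: $0 {start|stop|restart}" >&2
                 return 2 ;;
    esac
}

# Run only when an action was passed, as init does with start or stop.
if [ "$#" -gt 0 ]; then
    glassfish_ctl "$1"
fi
```

The chkconfig comment lines are there so that chkconfig --add glassfish should be able to manage the runlevel links for you, but creating them by hand works just as well.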

You'll have to create S and K links to these scripts in /etc/rc.d/rc3.d (runlevel 3 is the default multi-user mode for Amazon Linux). There's another modification I like to make to glassfish. By default, it listens on port 8080. There's nothing wrong with this since port numbers < 1024 (like the default HTTP port of 80) require root permission to bind to them. But since the startup scripts are being run as root anyway, I prefer to use port 80 so you don't have to specify the port number in the URL. Edit $GLASSFISH_HOME/domains/domain1/config/domain.xml and change the port number in the http-listener element. Or you could just search for the string 8080.
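If you'd rather not open an editor, the port change can be done with sed. This is a sketch that assumes the listener is written as port="8080" in domain.xml, which is how it appeared in my copy; the function name set_http_port_80 is made up for the example. It keeps a backup so you can back out if the result isn't what you expected.

```shell
#!/bin/sh
# Change the HTTP listener port in a GlassFish domain.xml from 8080 to 80,
# keeping a backup copy next to the original file.
set_http_port_80() {
    config=$1
    cp "$config" "$config.bak" || return 1
    # Rewrite from the backup so the original file keeps its ownership and mode.
    sed 's/port="8080"/port="80"/' "$config.bak" > "$config"
}

# Example call; the path assumes GLASSFISH_HOME points at the install directory.
# set_http_port_80 "$GLASSFISH_HOME/domains/domain1/config/domain.xml"
```

Make the change while the domain is stopped, then compare the file against the .bak copy before starting glassfish again.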

Now we need to permit external access to port numbers 80 and 4848 (glassfish admin console). Go back to the EC2 console (directions above) and select Security Groups from the Navigation panel. I used the defaults when I created the instance so the quick-start-1 is the one to select in the Security Groups panel. In the panel below, click on the Inbound tab. We're going to create a new custom TCP rule so enter 4848 in the port range field and click the Add Rule button. Now go back and select HTTP in the Create a new rule menu and click the Add Rule button. Finally, click the Apply Rule Changes button.
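I did all of this through the web console, but the same rules can presumably be added from the command line. The sketch below assumes you have the AWS command line interface (the aws command) installed and configured, which isn't something I covered above; the function name open_tcp_port and the sample address are placeholders.

```shell
#!/bin/sh
# Open a TCP port in an EC2 security group using the AWS CLI.
# AWS can be overridden (for a dry run, say) and defaults to "aws".
AWS=${AWS:-aws}

open_tcp_port() {
    group=$1 port=$2 cidr=$3
    "$AWS" ec2 authorize-security-group-ingress \
        --group-name "$group" --protocol tcp --port "$port" --cidr "$cidr"
}

# Example calls; 203.0.113.25/32 is a placeholder for your own address.
# open_tcp_port quick-start-1 80 0.0.0.0/0
# open_tcp_port quick-start-1 4848 203.0.113.25/32
```

Restricting 4848 to your own address, rather than opening it to the world, seems like the prudent choice for an admin console.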

You should now be able to access the glassfish admin console from the outside world, using the hostname or IP address you previously determined. When you go to deploy a new application with glassfish, the default selection on the Deploy Enterprise Applications/Modules panel is Packaged file to be uploaded to the server. That's fine for a local connection, where you're running at 100 Mb/s or 1 Gb/s on the LAN, but here file uploads are going to occur over your 'net connection. I'm not suggesting that file uploads over HTTP are necessarily inefficient, just that there is (IMHO) a better way. From your local system, use the following command:

$ scp -i keyFile earFile ec2-user@hostName:/tmp

Now you can select Local packaged file or directory that is accessible from the Application Server and install your ear from the /tmp directory. Once deployed, you should be able to access your application the usual way. You can use the IP address directly or make an entry in your hosts file to provide easier access. If you have access to your DNS servers, you can even create an entry (A record) so that you can access it by name within your domain namespace.

Caveats

One thing you have to be aware of is that if you stop the instance from the EC2 management console and then start it at some later time, the IP address is going to change. Not a huge inconvenience, especially if you're trying to maximize the free hours, just something to keep in mind. Another thing to consider is the glassfish admin console accessibility. Call me paranoid, but I don't like to have it available all the time, even if the admin password has been changed from the default. You can go into Security Groups -> Inbound and delete the rule for port 4848. You can always recreate it at a later time, as needs arise.
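Closing the admin port can be scripted as well. Again, this is only a sketch that assumes the AWS command line interface is installed and configured; close_admin_port is a name I made up, and the address must match the one on the existing rule exactly or the rule won't be removed.

```shell
#!/bin/sh
# Remove the inbound rule for the GlassFish admin console (port 4848).
# AWS can be overridden (for a dry run, say) and defaults to "aws".
AWS=${AWS:-aws}

close_admin_port() {
    group=$1 cidr=$2
    "$AWS" ec2 revoke-security-group-ingress \
        --group-name "$group" --protocol tcp --port 4848 --cidr "$cidr"
}

# Example call; use the same address range the rule was created with.
# close_admin_port quick-start-1 0.0.0.0/0
```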

Summary

This was a surprisingly pleasant experience. I say surprisingly since I didn't really know in advance what to expect. Some processes are very convoluted but this went swimmingly. Granted, it helps if you're skilled with *NIX (the "user vicious" operating system!) but at least you don't have to worry about gnarly details like iptables configuration. Amazon has done a remarkable job and my testing to date suggests that they have a solid platform. While I haven't yet tested all the features (like elastic load balancing), I'm hoping that they turn out to be just as robust as EC2.