I've read a number of discussions recently debating whether cloud computing will supplant grid computing as the preferred deployment environment for complex computation needs. Notwithstanding the clear value that cloud computing environments provide, I firmly believe that for various reasons (architecture, security, bandwidth etc.), enterprises will not move all of their infrastructure to cloud environments any time soon. Rather, they're likely to create hybrid or "partly cloudy" environments in which some computations will remain local and others will be sent to the cloud.
In fact, we're working with a client at this very moment on creating such a hybrid environment. The client wants to retain some core applications and computation within their local environment, but wants to leverage Amazon's Elastic Compute Cloud (EC2) for their larger calculation needs. In this case, however, simply having the raw compute resources available isn't sufficient, because they also need to manage and coordinate the calculation tasks that are submitted to the cloud. The EC2 APIs and utilities enable you to create, manage and interact with virtual machine images, but they do not provide facilities for tying these images together into a coordinated computation platform.
This is where the Grid comes in. By deploying virtual machine images containing Grid nodes, the cloud becomes an extension of the local compute grid that can be activated and deactivated on an as-needed basis. This hybrid approach enables enterprises to retain tight control of the management, policy and security-related components of the grid by running them on local infrastructure, while at the same time letting the Grid farm out compute tasks to the grid nodes running in the cloud.
The exciting part is that this scenario works today. Our client is building a partly cloudy deployment environment using GridServer and Amazon EC2, and these products work seamlessly right out of the box. The Amazon Machine Images (AMIs) are GridServer engine nodes. GridServer even enables these images to declare on which EC2 Instance Type the node is running so that you can create policies that select the appropriate instance type (i.e. size) for any given compute task. Once the AMIs are created, and the required NAT and security configurations are defined, the GridServer broker can assign tasks to any of the cloud images just as it would for local images.
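To make the instance-type policy idea concrete, here is a minimal sketch in Python. It is not GridServer code and uses no GridServer API: the `Engine` and `Task` classes, the `eligible_engines` function, the `local_memory_gb` default and the memory figures in the table are all assumptions made up for illustration. It only shows the general shape of a policy that matches a task to engines based on the instance type each engine declares.

```python
from dataclasses import dataclass

# Illustrative memory sizes in GB per EC2 instance type (assumed values;
# check the EC2 documentation for real figures).
INSTANCE_MEMORY_GB = {"m1.small": 1.7, "m1.large": 7.5, "m1.xlarge": 15.0}


@dataclass(frozen=True)
class Engine:
    """Hypothetical engine node that declares where it is running."""
    name: str
    instance_type: str  # an EC2 instance type, or "local" for on-premises


@dataclass(frozen=True)
class Task:
    """Hypothetical compute task with a minimum memory requirement."""
    name: str
    min_memory_gb: float


def eligible_engines(task, engines, local_memory_gb=4.0):
    """Return engines with enough memory for the task, smallest first."""
    def memory(engine):
        if engine.instance_type == "local":
            return local_memory_gb
        # Unknown instance types are treated as having no memory.
        return INSTANCE_MEMORY_GB.get(engine.instance_type, 0.0)

    fits = [e for e in engines if memory(e) >= task.min_memory_gb]
    return sorted(fits, key=memory)
```

Sorting smallest first means a small task lands on the cheapest engine that can handle it, leaving the larger instances free for the jobs that actually need them.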
Taking this a step further, by using GridServer extension mechanisms and the EC2 API, the grid can be configured to register, unregister, run and terminate AMIs automatically as needed. In doing so, the client will be able to let the grid decide when new AMIs are needed rather than preallocating EC2 instances (and paying for them!) in anticipation of need. This policy-based management of cloud instances is one of the key features that make this partly cloudy approach exciting. It's hard to believe I'm promoting a partly cloudy forecast, especially given the weather here in Boston lately. But in this case I say bring on the Clouds!
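For readers who want a feel for what "letting the grid decide" might look like, here is a rough sketch of the decision step in Python. It is not how GridServer does it: the function name, its parameters and the defaults (`tasks_per_instance`, `max_cloud`) are hypothetical. It assumes local engines absorb work first and only the overflow goes to the cloud, and it returns how many instances to start or stop. The actual EC2 run and terminate calls are deliberately left out.

```python
import math


def plan_cloud_instances(pending_tasks, local_capacity, running_cloud,
                         tasks_per_instance=4, max_cloud=20):
    """Return the change in cloud instance count.

    A positive result means launch that many instances, a negative
    result means terminate that many, and zero means do nothing.
    """
    # Work that the local grid cannot absorb on its own.
    overflow = max(0, pending_tasks - local_capacity)
    # Instances needed for the overflow, capped to keep costs bounded.
    desired = min(max_cloud, math.ceil(overflow / tasks_per_instance))
    return desired - running_cloud
```

With 50 pending tasks, room for 16 locally and 2 cloud instances already running, `plan_cloud_instances(50, 16, 2)` asks for 7 more. Once the queue drains below local capacity, the same call returns a negative number and the idle instances can be shut down instead of sitting there on the bill.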
