The SOA quickstep ; the dance explained

Here's the skinny on the many sessions I've held about SOA lately.

When I started working for the library about two and a half years ago, I came in as a senior Java developer and whinged about the infrastructure and development methodologies. I then quickly jumped to another branch that does more "web things": XML, XSLT, PHP, HTML, JavaScript, Apache stuff and a whole slew of other technologies. More importantly, I got to do more design work and project management, and we did things more in isolation from the normal grind of applications. Which is good. Truly.

A very recent project threw me back into my old world, and I got a taste of battling with the infrastructure and development methodologies once again, and I got a bit ... hmmm, frustrated? Since then (this was about 6 months ago) I've been making secret plans with a couple of colleagues to fix a lot of our development problems. In an environment such as ours we can't let slow processes, mammoth specs, silo mentality and molasses project management kill off any chance to innovate and be creative. But the more I found out, the more stuff I felt I needed to fix.

Back when I was working for Bekk Consulting (the most amazing bunch of smart people I've ever had the pleasure of working with in my life; I grew from baby to man in my three years there) we were slowly bringing in web services and those things that have slowly evolved into what we today roughly know as SOA (Service Oriented Architecture). I felt an urge to bring this to the library.

I reckon that the more you work with enterprise architecture, the more SOA makes sense. After you've had a few stabs at single-sign-on (with session and user management thrown in), distributed and load-balanced databases, serious MVC application development, code reuse (which normally means library development [JAR/WAR, DLLs, etc.] with lots of copy and paste thrown in) ... after a while you start to think that the overhead you pay for things being converted down to XML (either RESTfully or through SOAP and the WS-* stack, over HTTP or otherwise) and back again is really worth it. Here's what SOA means to me:

Thinner applications: you don't need to include lots of libraries and wrappers to add functionality, which means that when the application is deployed, memory is taken up by actual data rather than by repeated code. You can spawn as many application deployment environments as you need, and proxy all web services into a funnel. Also, since each service is more focused, you don't get a lot of overhead dealing with pesky things that aren't in the best interest of that service; each one can be trimmed down to do exactly what it says on the tin, and won't suffer from featuritis.

Sharing with partners: This enables us to create external services for anyone who wants to play with our data, such as our NBD/OPAC/catalogue data, our various smaller semantically modelled sites, our thesauri and our plethora of news and event items ... and blogging. Did I mention that we're trying to get folks here to blog? (See our implementation of Confluence further down; every person in the library will have a blog, and aggregation channels will be opened to the public as well.)

Technology agnosticism: All applications and services can be written in whatever technology you prefer, be it Perl, Java, SQL/PL, PHP, XSLT, Ruby, C/C++, LISP, FORTRAN ... the only requirement is that the technology can throw XML over (in our case) HTTP. We can pick the best tool for the task instead of forcing one hammer onto nails, screws and nuts. You don't need to think of specific technologies as strategic direction. You don't have to invest in more Java developers just for the sake of the infrastructure; you can pick developers with more diverse skills, and hire in short-term developers to convert legacy parts into replaceable parts.

Performance using tried channels: XML over HTTP means you can create and divert services through proxies, routers and other channels used for normal web traffic, use HTTP loggers and traffic analysis tools, do security easily over HTTPS, and test and debug a service from any browser. Our organisation has a lot more expertise in HTTP than in most other protocols, and finding people with such skills is also a lot simpler. Almost any developer today knows XML and HTTP, and so does almost any technology (I'd be hard-pressed to find one that doesn't).
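
A rough sketch of what "XML over HTTP" can look like in practice, in Python using only the standard library. The thesaurus data, the /terms/<name> URL scheme and every function name are invented for illustration; nothing here describes the library's actual services.

```python
import http.client
import threading
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data standing in for a real thesaurus back end.
TERMS = {"soa": ["service oriented architecture", "web services"]}


def build_term_xml(term):
    """Return an XML document (as bytes) listing the related terms for `term`."""
    root = ET.Element("term", name=term)
    for related in TERMS.get(term, []):
        ET.SubElement(root, "related").text = related
    return ET.tostring(root, encoding="utf-8")


def parse_related(xml_bytes):
    """Extract the related terms from a document built by build_term_xml."""
    root = ET.fromstring(xml_bytes)
    return [node.text for node in root.findall("related")]


class ThesaurusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumed URL scheme: the last path segment is the term to look up.
        term = self.path.rstrip("/").split("/")[-1].lower()
        body = build_term_xml(term)
        self.send_response(200)
        self.send_header("Content-Type", "application/xml; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet; a real service would log requests


def fetch_related(host, port, term):
    """Client side: a plain HTTP GET followed by XML parsing."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", f"/terms/{term}")
        return parse_related(conn.getresponse().read())
    finally:
        conn.close()


def demo():
    """Start the service on a free local port, call it once, shut it down."""
    server = HTTPServer(("127.0.0.1", 0), ThesaurusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_related("127.0.0.1", server.server_address[1], "soa")
    finally:
        server.shutdown()
        server.server_close()
```

The client knows nothing about how the service is written; it only needs an HTTP GET and an XML parser, so the same URL can be opened in a browser or routed through an ordinary web proxy.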

Better business: Thinking about services rather than simply functions (remember: SOA is not RPC [Remote Procedure Call], even if you do that too!) can lead to better support of the business areas and help them develop their business. Functional requirements will be based on use cases more than on application structure. High-level ontology sessions can spark understanding of the services, and it will be easier to get the overall picture and a better understanding of what we can do, where we should travel, and what to kill off. (It's easier to handle all of this on a decoupled level than if we're dealing with mammoth applications!)

Innovation: Once you've stopped thinking of application design as an exercise in constraints, you'll start seeing how these different services can be joined together to form new services and applications. This wasn't impossible in the past; what makes it viable now is that you can prototype your idea in a day or so instead of a month or so. Some developer may have written a Java application that does something interesting, but when asked to turn it into an abstract class we can reuse as a JAR file (which also means documentation, testing, packaging, etc.), the cost usually stops it in its tracks right there. If the original application were exposed through a web service, we could prototype that idea quicker than the developer could estimate the original project! If we are to look to innovation, we need to focus on all those things we don't have to do.

I guess that last point was a bit longer than the others, but that is perhaps because it is where I see the biggest benefit, and it is the area that I enjoy the most. But of course, before we can feast on the flexibility a SOA gives us, we need to start building it.

Our first* step is single sign-on: a service for authentication, and a service for profile and session handling. We will start by using the OSUser module from OpenSymphony for user management and authentication (yes, we know it's a stale project and that AtlassianUser is the next generation, but that isn't available yet; we need to get started, and OSUser works well for what it is supposed to do), and have a database for profiles and sessions (basically, a session follows an authenticated user's profile, meaning we can also share session info and profile info across services and applications). This means we're creating a ticket-based system for users, the same as for JIRA and Confluence from Atlassian, and integrating against these systems has worked really well.

* It's not really our first step; we've done several other services in our lab that will become more official as we go along: the APAIS thesaurus, a Lucene-based version of our OPAC (with tagging, comments, clustering and more!), a resource sharing database, the before-mentioned E-Resources application, and an upcoming harvester as well.
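
The ticket-based sign-on idea above can be sketched roughly as follows. This is plain Python with in-memory dictionaries, not OSUser (which is a Java library) and not the actual design; the class names, method names and the 30-minute ticket lifetime are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
import time


class AuthService:
    """Hypothetical authentication service: checks passwords, issues tickets."""

    def __init__(self, ticket_lifetime_seconds=1800):
        self._users = {}    # username -> (salt, password hash)
        self._tickets = {}  # ticket -> (username, expiry timestamp)
        self._lifetime = ticket_lifetime_seconds

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

    def add_user(self, username, password):
        salt = secrets.token_bytes(16)
        self._users[username] = (salt, self._hash(password, salt))

    def login(self, username, password, now=None):
        """Return a fresh opaque ticket, or None if the credentials are wrong."""
        now = time.time() if now is None else now
        record = self._users.get(username)
        if record is None:
            return None
        salt, expected = record
        if not hmac.compare_digest(expected, self._hash(password, salt)):
            return None
        ticket = secrets.token_urlsafe(32)
        self._tickets[ticket] = (username, now + self._lifetime)
        return ticket

    def validate(self, ticket, now=None):
        """Return the username behind a ticket, or None if unknown or expired."""
        now = time.time() if now is None else now
        entry = self._tickets.get(ticket)
        if entry is None:
            return None
        username, expires = entry
        if now >= expires:
            del self._tickets[ticket]
            return None
        return username


class ProfileService:
    """Hypothetical profile/session store keyed by user, not by application."""

    def __init__(self, auth):
        self._auth = auth
        self._sessions = {}  # username -> session data shared across services

    def session_for(self, ticket, now=None):
        username = self._auth.validate(ticket, now)
        if username is None:
            raise PermissionError("invalid or expired ticket")
        return self._sessions.setdefault(username, {})
```

Because the session hangs off the user rather than off any one application, every service that is handed the same ticket sees the same session data; applications only ever pass the ticket around, never the password.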

Watch this space; I'll try to blog what we find through this process, and write about gotchas and successes, especially as they relate to libraries.

ShelterIt - My digital think-tank

2 June 2006
