ShelterIt - My digital think-tank: May 2006

26 May 2006

The epistemological implications of Topic Maps for librarians

Topic Maps in the library world

Quite often I'm asked about the link between libraries and Topic Maps, given that the latter is something that I've tried to specialise in. For example, I was recently invited to join a panel at LITA's Nashville conference 2006 as a Topic Maps "expert" (meaning; someone who knows a little more than the rest). Sadly I couldn't attend, which is a shame as I had an exciting Topic Maps paper accepted, although since it touches on the topic of this post you'll get some gist of it from here.

I wrote an introductory article about Topic Maps some time ago, and quite a number of librarians (or people in the library business) have since asked numerous questions about it: how well does it fit into the library world, and isn't it fun doing all that Topic Maps work?

Lots of people in the library world have got the "metadata map" part of it somewhat right, but few seem to understand what Topic Maps is really all about. Yes, it's mostly about metadata, but no, it doesn't support a single metadata standard as such; it's a general data model in which you can fit whatever metadata you wish. Some folks get confused by the "map" part of Topic Maps, and understandably so; "map" gives us certain associations with something visual, but that is quite misleading; the "map" refers to modelling.

First of all, "data modelling" is most often hijacked by relational database folks as a term to explain how they design their databases, document them, do their normalisation and optimisation of the model, and so forth. The reason I stuck "epistemological" in the title of this post is to separate myself a bit from the RDBMS (Relational DataBase Management System) guys for a minute, and talk about philosophy.

Epistemology

There are a number of epistemological (and notice that Wikipedia URL; 'Justified_true_belief'!) things that apply to data modelling, such as "What is a piece of knowledge?", "What is information?" and "What is representation?" These are good questions; how can we think we do knowledge management if we don't know what knowledge is? How can we create information systems without knowing what information is? How can we represent our knowledge and information if we don't know what that representation means?

I'll let you know that I'm in the representationalism camp in these regards; anything outside the workings of my own mind is observed by proxy, even other people's knowledge. I need to find ways to fold new perceptions into my own knowledge to gain new knowledge. My tea in front of me is represented by the visual cup, the smell of aroma, and the taste of tea, bit of sugar, pinch of milk; observations that make up a context.

This context can be represented by "something", and this is in all simplicity all that we information folks do; we try to come up with models that best represent the information for us and for the computer, for reuse, for knowledge creation, and for archiving.

In a Topic Map, this context is conceptualised through a Topic, contextualised through Associations, and turned into information through Occurrences, but there are hundreds of other ways to do it, in relational databases, in XML, in binary formats, with paper and pen, with facial expressions, through music and dance and art and ...
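To make those three building blocks a little more concrete, here is a minimal sketch in Python. It is not the Topic Maps standard data model nor any real Topic Maps engine; the class names, the `related` helper and the tea-and-cup data are all made up for illustration, and a real model carries a lot more (scopes, subject identifiers, typed names and so on).

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject: the thing we want to say something about."""
    topic_id: str
    name: str
    # Occurrences: pieces of information about the subject, keyed by type.
    occurrences: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Association:
    """A typed relationship in which each member topic plays a named role."""
    assoc_type: str
    roles: tuple  # ((role_name, topic_id), ...)


class TopicMap:
    """Hypothetical, heavily simplified container for topics and associations."""

    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, topic_id, name, **occurrences):
        self.topics[topic_id] = Topic(topic_id, name, dict(occurrences))
        return self.topics[topic_id]

    def associate(self, assoc_type, **roles):
        # roles maps a role name to a topic id, e.g. container="cup"
        for topic_id in roles.values():
            if topic_id not in self.topics:
                raise KeyError(f"unknown topic: {topic_id}")
        assoc = Association(assoc_type, tuple(sorted(roles.items())))
        self.associations.append(assoc)
        return assoc

    def related(self, topic_id):
        """(association type, topic name) for every topic associated with topic_id."""
        found = []
        for assoc in self.associations:
            members = [tid for _, tid in assoc.roles]
            if topic_id in members:
                for tid in members:
                    if tid != topic_id:
                        found.append((assoc.assoc_type, self.topics[tid].name))
        return found


tm = TopicMap()
tm.add_topic("tea", "My tea", taste="bit of sugar, pinch of milk")
tm.add_topic("cup", "The cup")
tm.associate("served-in", drink="tea", container="cup")
print(tm.related("tea"))
```

The point of the sketch is only the shape: the topic stands in for the thing itself, the association puts it in context, and the occurrence is where the actual information hangs.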

Ontology

It's all about expressions of something. With computer systems we have a tendency to think in very technological ways about these things, but as any long-time database modeller knows, there are people who are good at normalising, and people who suck at it! My theory here is that the people who are good at normalising understand epistemology (knowingly or not). The same goes for people who are good at creating XML schemas, or good at design, good with visual design, good at writing, good at presenting. In fact, I'd stress that an understanding of epistemology is crucial to any form of quality representation of an expression.

Let's take a step sideways into ontologies for a second; ontology asks "What actually exists?" and goes on to define a model in which we can represent that which we think actually exists.

In modern information sciences, ontology work is what we refer to when we try to explain "things" through a more formal network of definition, so that "X is a Y of type Z" and "X is the opposite of D" and "D is a class of U"; given enough such statements, computers (or humans if you're patient) can infer "knowledge" (basically; hidden or not explicitly stated information). Of course, you need to have a lot of these statements, and they must all be true, and probably authoritative, which for many is the very reason they don't believe in the Semantic Web (of which I'm such a sceptic myself).
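As a rough illustration of how enough such statements let a computer infer things nobody wrote down, here is a minimal Python sketch. It assumes the only relationship is a transitive "is a"; the function name `infer_is_a` and the trout/fish/animal statements are invented for the example, and real ontology reasoners handle far more than this.

```python
def infer_is_a(statements):
    """Return the (x, y) pairs implied by chaining 'x is a y' statements.

    Only pairs that were NOT explicitly stated are returned, i.e. the
    "hidden" information.
    """
    known = set(statements)
    changed = True
    while changed:
        changed = False
        for a, b in list(known):
            for c, d in list(known):
                # a is a b, and b is a d, therefore a is a d
                if b == c and (a, d) not in known:
                    known.add((a, d))
                    changed = True
    return known - set(statements)


stated = {
    ("trout", "fish"),
    ("fish", "animal"),
}
print(infer_is_a(stated))
```

Note that the result is only as good as the input; feed it one false statement and everything chained through it is false too, which is exactly the worry with doing this on an open web.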

In a closed system though, I have much belief in ontological models and information systems, and libraries have a lot of closed systems in which openness to the hidden information could provide some seriously good applications for it. For example, a lot of what librarians care about is in collections of sorts, and a collection and the metadata about it can well be mined for some rich information not explicitly stated.

Collections in a Topic Map

I've done a few experiments with collections in Topic Maps with some pretty good results. For example, there's the "Fish Trout, you're out" children's folklore in our oral history project; I got all the MARC records that belong to the collection, converted them to a Topic Map, and lots of interesting things happened; I learned more about the collection, knew more about what type of information was within it, I could browse through it through various facets, I could ask the Topic Map for items that had complex relationships ... basically, I could do a bucketload of things that no OPAC could ever dream of being able to do, yet we both had the same basic MARC records to work with.

The recent National Treasures exhibition was designed with my XPF framework, an XSLT-based wrapper and query language for Topic Maps, so all the data items in that collection sit in a Topic Map; every picture, every comment, every text, every page, every theme and every note. Yet, the actual site looks pretty much like most other sites out there, so where's the juice? Well, internally we've created a couple of alternative interfaces to the Topic Map with dramatically different results, and although they are not public (and probably never will be, although we thought about creating an interface for kids!) they showed us again what rich hidden information we could get out of data we already had. And that's an important key to why Topic Maps are so important!

Another collection I've plodded with in a Topic Map is the Mauritius Collection, over 2000 items with a great variety of semantics and types. One of the problems with a lot of these collections is maintaining them, and getting an overview of the collection is often quite difficult; people spend years trying to get the full picture, especially if the collection is somewhat fluid (items coming and going from it). The Mauritius Collection is hard to get an overview of, yet in a Topic Map - a model which is designed from the ground up to handle complex relationships - it seemed almost too simple to browse around the collection, looking for things or simply exploring stuff that's there and learning stuff in the process.

And I've yet to talk about books in this context, but most other people are fixated on books and cover them quite well. To me, life and everything I do isn't based on books, but all of the collection wonderment mentioned for items can equally be applied to books. Personally, if I was given the opportunity, I'd give our maps collection a go next!

Epistemological implications of Topic Maps for librarians

So what are these implications? Well, there are a few paradigms that differ from the usual information technology set of RDBMS, databases, OPAC, fielded search and NBD (National Bibliographic Database; another library term for a large database).

First of all, librarians know about thesaurus and taxonomy work. In the former there are notions such as "broader term", "narrower term", "related term", "use instead", and so forth; these all make up the ontology of the thesaurus; they explain what things might be, and in a thesaurus, in a very loose and general way (mostly). In a taxonomy, most of the relationships between items (and hence the ontology) are explained through the structure itself; this item is above this one, meaning "X is a Y" or, in more complex taxonomies, "X is an instance of class Y which is super-class of Z".
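For those who think better in code, the thesaurus relationships can be sketched in a few lines of Python. This is a toy, assuming a thesaurus is just a dictionary of terms with "BT" (broader term), "NT" (narrower term), "RT" (related term) and "USE" (use instead) entries; the terms and the function names `preferred` and `broader_chain` are invented for the example.

```python
# A hypothetical mini-thesaurus; every term and relationship here is made up.
THESAURUS = {
    "trout": {"BT": ["fish"], "RT": ["angling"]},
    "fish": {"BT": ["animals"], "NT": ["trout"]},
    "animals": {"NT": ["fish"]},
    "angling": {"RT": ["trout"]},
    "fishes": {"USE": "fish"},  # non-preferred term: "use instead"
}


def preferred(term):
    """Follow a 'use instead' pointer, if the term has one."""
    return THESAURUS.get(term, {}).get("USE", term)


def broader_chain(term):
    """Walk the first 'broader term' link upwards until there is none left."""
    chain = []
    current = preferred(term)
    seen = {current}
    while True:
        parents = THESAURUS.get(current, {}).get("BT", [])
        if not parents or parents[0] in seen:  # stop at the top, or on a loop
            break
        current = parents[0]
        seen.add(current)
        chain.append(current)
    return chain


print(broader_chain("fishes"))
print(broader_chain("trout"))
```

Walking the "BT" links upwards is where a loose thesaurus starts to behave like a taxonomy; the structure itself is doing the explaining.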

Topic Maps takes this a few steps further; in the same Topic Map, you can have a thesaurus, a taxonomy, a faceted classification system, LCSH (Library of Congress Subject Headings), MARC records and an ontology, all working in unison. This has implications for how we can use the information in single applications, but there are synergetic implications as well, in revealing hidden information that's not explicitly stated.

Secondly, Topic Maps is based around the notion of atomic nodes on which you hang various information, such as metadata and relationships, and this is quite unlike a record in a database, of which MARC is a good example. But what's important to understand is that we're not talking about taking the data out of MARC or converting MARC to MODS to XOBIS to Dublin Core to whatever; no, MARC stays as MARC, but Topic Maps lays a layer of "semantics" (we can stretch or implode the meaning of "semantics" here, I think; it all depends on what you want to do, how much energy you're prepared to waste and resources you've got allocated) on top. This is why it's a Map; a map to guide you through your information soup.
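A small Python sketch of that layering idea follows. It assumes MARC records have been simplified to plain dictionaries keyed by field tag (001 for the control number, 245 for the title, 650 for topical subjects); the records, titles and function names are invented for illustration, and real MARC fields have indicators and subfields that are ignored here. The records themselves are never changed; the layer only points at them.

```python
from collections import defaultdict

# Hypothetical, heavily simplified records; real MARC is far richer than this.
RECORDS = [
    {"001": "r1", "245": "Skipping rhymes", "650": ["Folklore", "Children"]},
    {"001": "r2", "245": "Counting-out games", "650": ["Games", "Children"]},
    {"001": "r3", "245": "Sea shanties", "650": ["Folklore", "Music"]},
]


def build_subject_layer(records):
    """Map each subject to the control numbers of the records that carry it."""
    layer = defaultdict(set)
    for rec in records:
        for subject in rec.get("650", []):
            layer[subject].add(rec["001"])
    return layer


def co_occurring(layer, subject):
    """Subjects sharing at least one record with the given subject.

    Nobody catalogued these links; they fall out of the layer.
    """
    ids = layer.get(subject, set())
    return sorted(
        other for other, other_ids in layer.items()
        if other != subject and ids & other_ids
    )


layer = build_subject_layer(RECORDS)
print(sorted(layer["Children"]))
print(co_occurring(layer, "Children"))
```

Swapping the layer for a different one (a thesaurus, a taxonomy, tags) leaves the MARC underneath exactly as it was, which is the whole point.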

And thirdly, soup with added Topic Maps makes a dang fine stew. I love stew.

We knew that

A lot of librarians (and others who might read this) already knew all this; why am I telling you this, then?

Because to truly understand Topic Maps and why I'm so keen on them, you need to understand how Topic Maps and its data model are closer to human cognition and epistemological ideals than what we're currently immersed in, such as the relational database, the notion of a "record", the notion of collections that don't overlap (Hah! I dare you to show me one!), the idea of a book being atomic (the folks who are into FRBR know all about this one), the idea of marshalled viewpoints of information (guides vs. the reference librarian), taxonomic classification schemes (this one is heavily disputed, but in the classical form it certainly causes problems, although it might be more a human problem than a technological one; for example, can we mix and match LCSH, tagsonomies and thesauri? You can in a Topic Map with relative ease.) and so forth.

In the end, how do we know that what we're doing aids our goals? Is our technology working for us, alongside us, behind us, against us? The goal must be to preserve and encourage knowledge, right? For libraries this is of course on the borderline between the collection mentality and the education mentality; some librarians have only one of these, some both (and a few rare exceptions neither!) and then various mixes of the two. In my view, there is no two; they are the same thing.

How do we know that we're delivering systems that actually help our users in whatever quest they have? Right now I feel we're second-guessing on every level on that question; we design systems with a specific set of features in the hopes that we help at least a given percentage of users. I'd stress that we really need to work it the other way around, as usability has shown us time and time again that guessing what the user wants will always fail; we need completely open systems where the user narrows the features until the goal is reached! This is what we humans are about, isn't it? First we read the table of contents or the index (from both of which you gain a sense of overview), then jump to the right chapter for the details, and from there make decisions on where to go next.

Let's not design more applications; let's design systems.

23 May 2006

It depends

All afternoon I've written in bits and chunks on this post; I've had a number of enquiries and meetings of late where, after I'd said my piece, I was asked more detailed questions. All of my answers started with "well, it depends on ..." so after a few too many of these answers I told myself not to say those dreaded words anymore, as they somewhat lead to vagueness. I wanted to write down a few thoughts on why I should stop saying it, but I soon realised that the right answer was to explain why "it depends" itself is the right answer.

It was therefore smack-down timely that superstar Donna Maurer - who's busy presenting over at our New Zealand buddies at Webstock - wrote a quite similar piece called Black, white or grey:

But the more projects I do, the more I realise neat black & white answers don't fit any sort of real world, which means I end up talking more about the context, and feeling like I'm disappointing people and being vague. Oh well, at least I try to explain what 'it depends' on, or what the implications are for different contexts.

What surprises me is that people quite often want a black or white answer, and given the audience I'm referring to here (me; colleagues and other IT professionals, and Donna; who I assume was presenting to IT professionals as well) it's a bit shocking that they would even think the answer is in binary format.

How come? Are the processes in which we do our work so streamlined that there is no room for variables and fuzzy answers? Is our communication these days so pop-cultured that sentences longer than a given size don't register? Is our brain so filled up with other stuff that whenever we try to jam more info in there, we prefer it to be black and white because we think it takes up less space? Are we afraid our brains will explode if we pop one more nugget in there?

I remember back when I wanted to know more about human categorisation, in which I jumped on 3 books (that at the time made sense to read; can't remember how I came to those, though) and read them cover to cover before continuing; Bertrand Russell's "The Problems of Philosophy" (I knew from past remembrance the epistemological implications Bertrand has on categorisations, besides I had long wanted to read the book, not for it being the latest thinking but because it's a standard work; I never preempt optimisations ... :) ), Lakoff's "Women, Fire, and Dangerous Things", and Tom Stafford's "Mind Hacks". Lakoff was the one that I found the most interesting in terms of categorisations, and from there I at least had a spring-board for further research. The point of this little sidestory is that I simply know there are no easy answers; I need a lot more context to understand the answer, I need to read more and understand more around my problem in order to understand it.

Another example is what everybody, it seems, uses to say that something is logical or just makes sense, in a mathematical sense; 2 + 2 = 4. We use this in a way to say "look, this is 2+2=4, ok? It's so simple!" The thing is that it ain't so simple. One day, while it was raining and we were taking shelter under the roof of a kiosk that was closed for the season, my best friend Magnus, a mathematical genius, explained to me the basics of axioms; basic rules from which all mathematics is derived. He explained for example that there are two major axiomatic systems that are used by us normal folks and mathematicians alike; the Euclidean system (which we all should know from primary and secondary school mathematics; ah, those pesky PI's!) in which two parallel lines can never touch, and the one (don't know its name) that was somewhat adopted after it was established that the universe was concavely shaped and always expanding, where two parallel lines at some point must meet because of the shape of the universe. (This is more complex than this common-folksie summary, of course, but you get the idea.) In other words, I had an epiphany and asked if one could define a set of axioms in which 2 + 2 = 5. The answer is yes, you can do that. All logic, all mathematics, every concept, every observation, every thought, every darn little frigging thing we think we know, has context ... context so huge and chopped into so many little bits that it's mind-boggling trying to visualise it! Our brains are so frigging amazingly clever!

Our brain is an incredible tool, and if you think for a minute that a tool such as this, a tool designed for making some kind of sense out of massive amounts of context - specialised for context sorting! - requires a black and white answer to fit things in, you're doing yourself a disservice; you're dumbing down, not smartening up. (This contextual inputting is why, for example, most of the time experience [you're in the context] teaches you better than a text-book [explaining some context].)

So, the next time you ask someone a question and the person answers with an "it depends" and goes off on a tangent referring constantly to a bucketload of books, thoughts and ideas of others, you just might be getting the right answer.

Joe Clark : To Hell with WCAG 2

Joe Clark, one of the all-time experts in web accessibility - he's wonderfully down-to-earth and he lives in the real world, as compared to the metaphysical one of many so-called "experts" - has just published his review of the final draft for WCAG2, called To Hell with WCAG 2:

The proposed new WCAG 2.0 is the result of five long years’ work by a Web Accessibility Initiative (WAI) committee that never quite got its act together. In an effort to be all things to all web content, the fundamentals of WCAG 2 are nearly impossible for a working standards-compliant developer to understand. WCAG 2 backtracks on basics of responsible web development that are well accepted by standardistas. WCAG 2 is not enough of an improvement and was not worth the wait.

I've had my struggles through the years trying to be a good WCAG1 citizen, which for levels 1 and 2 was achievable, but these new WCAG2 guidelines are just plain out-of-this-world and wrong. Let's hope the grassroots movement - in which a lot of real disabled people live - can help overcome this one.

22 May 2006

Salut Baroque!

Once again the wonderful gang of Salut Baroque! came to Canberra (Friday 19th of May, 2006; for the record), this time with a concert called "Music, Men and Manners", based on the works of the extraordinary Charles Burney, to whom musicologists everywhere owe a lot; he wrote a lot of musical history, including much on the contemporary 18th-century scene, as he personally visited CPE Bach, Quantz, Hasse, Handel, Gluck and many more, and wrote about their manners, music and performances.

As to this concert, it was quite different from many of the previous ones I've been to. For one, both old-timers Hans-Dieter Michatz (recorder; an amazingly energetic and vibrant player) and Sally Melhuish (also recorder; founder of the ensemble) were absent, although the replacement was a bit of a stunning surprise;

Melissa Farrow; baroque flute. Wow, I think I'm in love; she's absolutely gorgeous, and plays in an excitingly warm and sensuous way. (If she was doing cross-over, she'd be famous!) She was oozing control and direction, with a deliciously warm tone; no wishy-washy energetic swaying to trick you away from the quality of the music. Wonderful, just wonderful. If she sold posters of herself, I'd buy one.

This was also the first time in a long while I've really enjoyed Tim Blomfield (bass violin; musical director) who was truly present, playing with energy I didn't know he had. It was great to hear him push his way through the music, taking a stand; more!

As always the beautiful Monika Kornel (harpsichord, and I suspect she's Polish, but I've never been able to find out) ran over the keyboard with precision and style. I really like her; low on showing off, high on control of that soft lovely type, not too much ornamentation nor too little. Nice.

The usual suspects of Myee Clohessy (Baroque Violin; she's lived in Norway for a few years and probably knows a good friend of mine, Anna Helgadottir), Nicole Forsyth (Baroque Viola), and Valmai Coggins (Baroque Viola) provided quality as usual.

New to me were first and second 'soloists' (bit hard to stretch that term in most of this music as it was mostly for flute / recorder soloists) Kirra Thomas (Baroque Violin) and one other lady whose name escapes me (stand-in for bassoonist Kate Walpole, maybe? Oh, I'd love to see more of Kate! She rocks!), both very good, although a bit fuzzy around the edges this night.

The music selection was as always interesting and good, lots of stuff most people haven't got an intimate familiarity with, and as always, superb inner notes in the program from Tim! I want them published for all to see and share, damnit!

Thank you again for taking the time to visit the Baroque musically-challenged city of Canberra, and I'm already looking forward to the next visit.

Single-sign-on (SSO) and Service Oriented Architecture (SOA)

Today I'll share a few thoughts and design issues when dealing with Single-sign-on (SSO) and Service Oriented Architecture (SOA) (or read John Reynolds's SOA Elevator Pitch). SSO is a pipe-dream that's been around since the dawn of computing, where you sign into one service, and if you need to enter other services and those services are under the domain of where you first logged in, you're already logged in and your session is with you. (If any of this didn't make sense I fear this post is not for you; it is rather technical :)

Our problem

We're a big organisation with a diverse set of operating systems, servers and skilled people. We've got Solaris (servers), MacOS X, Windows and Linux (servers and users), carefully spread across both servers and users, although most users are on Windows, some on MacOS X. We have bucketloads of different services, from backend servers and logging stuff, budget applications, HR systems, issue tracking tools, wikis, time reporting tools, staff directories ... too many to count. I spend a significant part of a week logging into systems, some of them with different usernames and passwords.

For many years, vendors have pushed their various SSO solutions on us, most complicated and fragile, some better but with a lot of work, and a few reasonable ones. We've created a few minor and sub-par ones ourselves. They all are pretty expensive systems though, not necessarily from a purchase angle alone, but certainly from an implementation stand-point; lots of work needs to be put in to configure, implement and maintain these systems. Lots of people in the IT industry deal with SSO as their prime job.

SSO systems usually try to handle the problem of user identity, or co-operate with other systems, such as LDAP and X.500, or pure authentication such as Radius or even Kerberos ticketing systems. Then applications themselves store bits of stuff in their local session, some user information in their local database, synchronise some of that out, but mostly keep it to themselves. There are lots of problems here, so let's talk about what I'd like to see them do.

A better system

Here's what I would want from a better system:

Web Services API

User identity management

Roles and groups management

Profile management

Session handling

Most a) SSO, b) user management and c) session management systems are either just one of these three, or are too tied to some technology (Windows only, or Java only, or LDAP only, etc). We need one that does all of this, elegantly and simply, and through web services, and notice that web services is the first point on that list; if it ain't web services, it's not a solution.

A design I'm considering with my colleagues is a simple database system with users and groups, a default profile attached to each user, a default session data blob, a timer mechanism, and the ability to add application-specific data blobs over time (using the same XML schemas). The only interface into these goodies is through a web service; REST or SOAP in, and a generic XML schema (Topic Maps based) out (or embedded in SOAP out).
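To give a feel for how small the core of such a design could be, here is a rough Python sketch of the storage side only: users with groups and a profile, a session with a default data blob, a timer, and per-application blobs. It is not our actual design; the class and method names are invented, everything lives in memory instead of a database, blobs are plain dictionaries rather than XML, authentication (checking a password) is deliberately left out, and the web service front (REST or SOAP) would sit on top of this.

```python
import secrets
import time


class SessionService:
    """Hypothetical in-memory stand-in for the users/groups/session database."""

    def __init__(self, timeout_seconds=1800, clock=time.time):
        self.timeout = timeout_seconds
        self.clock = clock      # injectable so the timer can be tested
        self.users = {}         # username -> {"groups": set, "profile": dict}
        self.sessions = {}      # token -> {"user", "expires", "data"}

    def add_user(self, username, groups=(), profile=None):
        self.users[username] = {
            "groups": set(groups),
            "profile": dict(profile or {}),
        }

    def open_session(self, username):
        """Start a session for a known user and hand back an opaque token."""
        if username not in self.users:
            raise KeyError(f"unknown user: {username}")
        token = secrets.token_hex(16)
        self.sessions[token] = {
            "user": username,
            "expires": self.clock() + self.timeout,
            "data": {"default": {}},  # the default session data blob
        }
        return token

    def lookup(self, token):
        """Return the live session for a token, or None if unknown or timed out."""
        session = self.sessions.get(token)
        if session is None:
            return None
        if self.clock() >= session["expires"]:
            del self.sessions[token]
            return None
        # Any use of the session restarts the timer.
        session["expires"] = self.clock() + self.timeout
        return session

    def set_app_data(self, token, app_name, blob):
        """Attach an application-specific blob to a live session."""
        session = self.lookup(token)
        if session is None:
            return False
        session["data"][app_name] = blob
        return True


service = SessionService(timeout_seconds=1800)
service.add_user("alex", groups=["staff"], profile={"language": "en"})
token = service.open_session("alex")
service.set_app_data(token, "wiki", {"last_page": "FrontPage"})
print(service.lookup(token)["data"])
```

Any application, in whatever language, would then only need to hold on to the token and ask the service who it belongs to.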

By doing it this way, any system in the future is technology-agnostic outside the world of web services; we're not tied to Java, Windows, LDAP, whatever. It's very easy to implement into existing applications (even applications that were never thought to be part of a larger system such as this), partly by removing complex code (code that does either user management, session handling, and possibly some degree of SSO; out with it, and replace it with web services instead) but also because all of our platforms know XML in such a basic form.

Now, since this is SOA, it becomes apparent that there's a great lot of opportunities for innovation here, especially within rapid prototyping and testing out various functionality, mixing in experimental services and so forth; we can create simpler PHP scripts to try out an idea, hack some Perl to discover some new semantics, or use Ruby to put up exciting new applications, or chuck stuff into Lucene without worrying about what technology the data is coming from. It's also good for dealing with scalability and performance issues; smaller bits are easier to move around than large ones, and these issues can now be handled on the network level instead of within your chosen development technology (instead of designing an application to handle distributed transactions, you split the transaction further up the pipe and design your application simpler; less complex code to worry about).

Finally, we're looking at reusing OSUser from Atlassian (they're working on the next generation of their user-management module, called AtlassianUser, but they're difficult to squeeze info out of; will it be open-source, will it be available to others, when is it due, etc?), but if you know of alternatives, please let me know.

19 May 2006

The importance of user-interfaces

All through my computer-infused life I've struggled with user-interfaces, and I'm pretty darn sure I'm not the only one.

The funny part about talking about the importance of a good user-interface is that we all know the importance of a good user-interface, yet when it comes down to it, it is the part of our systems that gets the least priority! From start to finish we talk about business requirements, functional prototypes and user acceptance testing (the "user" here not being an end-user, but usually the owner of the project). Only if money is left over or we've got too much time on our hands are things like information architecture, usability testing, persona pathways, participation and interaction design sparsely interspersed, usually by the wrong people, in a haphazard way. Why is that?

I'm a technologist and a geek; I've been doing technical and functional stuff all my life, yet I'd fight you to the death to do user-centred design, usability testing in iterative lumps and be a bit creative about the information architecture before you try to nail it down in a strict taxonomy!

Ease up, people! We humans have a great sense of order in things; we classify, sort and think about their placement. On the other hand, we are also forgiving fuzzy creatures. Why are we mostly developing systems that adhere to the first group while ignoring the second at the same time? Because it is that crazy combination of the two that makes us humans work the way we work! In other words, why are we creating systems that work against human nature?

I'm annoyingly baffled, in an unsurprised way.

17 May 2006

Not this time

As some of you might know, I was opting for moving from Canberra, Australia back to Oslo, Norway in or around August time this year; I've had interest from two successful consultancy companies (the only two I've bothered contacting, too). It seems now that both have resulted in nothing;

The smaller company can't overcome their meeting-in-person interviewing practice (meaning, I need to go to Norway for two interviews before they'll truly hire me, but I'm currently a public servant and as such do not have that kind of jet-set budget) and the second company, my old company in fact, hasn't replied to me (nor to my chase mails) for a couple of months after proclaiming serious interest. Not sure what happened there, but maybe I jogged their memory too hard? :)

I've also had a few other good leads to jobs elsewhere, but somehow each and every one of them has resulted in interest but not any practical solutions (some can't pay "enough" [family to feed], some are abroad and there are visa issues, some have found closer similar skillsets, etc). Maybe I need to address larger companies who can afford me?

I guess I'm stuck in Canberra for now, in a place where 80% of all jobs are inaccessible to me because I'm not an Australian citizen, unless I go contracting ... which I don't feel comfortable doing as I'm the sole provider for my family and my network here is poor (only been here 2 years).

Now, it's not that my current job is so bad, but I feel the time to move on and do great things has come. Yes, there's the option of doing great things where I'm at, but I've struggled with this place's idea of "innovation" and resource priorities of late. Oh well, I'll grit my teeth and we'll see what happens next.

15 May 2006

Wiki as a KM and PM tool

I posted a comment to Denham Grey's blog about capturing corporate knowledge, from which two people have asked me to say more. Two people asking for that in my book qualifies for a blog post, so here goes;

First, the two acronyms; KM (Knowledge Management) is a cauldron which contains many things (processes, methods, systems, software, etc) that tries to manage (meaning; collecting, storing, finding, repurposing and changing) "knowledge". PM (Project Management) is that crazy category of "things we do to do things on time and within budget."

Right then, a Wiki is basically a really simplified web-based page-driven "anyone can make edits to a page" system, but instead of wasting my time rewriting what's been said before, here's the world's best Wiki explaining itself. I've been doing wikis since 1997 (two years after they were 'invented', so I've been doing them for quite a while now, seeing them grow and flourish).

Knowledge What?

First of all, let me just state that I don't believe in Knowledge Management. I do have some hope in Knowledge Representation Management at least, but the difference between the two is the realisation that "knowledge" is a human thing that computers don't have, don't handle right now, not in the next few years, possibly not until Quantum Computing and serious AI systems take off, possibly long after I've passed away, and that the only thing they can do and do well, is representation of little bits of information. The current thinking that we're on to that golden path of "Knowledge" in computers is what brought us all that ontology noise and semantic web porn, but I'll leave that rant for another day.

The goal of KM is a worthwhile thing though, and we use a variety of systems, methods, tricks, software and sanity to trick ourselves into believing we've got a good grasp on the concept of "we're doing KM." For most people it involves some kind of intranet in the shape of a Content Management System, possibly with a few KM features bolted on, and perhaps some records management and customer relation management system. So, we can list a few good acronyms here; CMS, CRM, KMS, RMS. You can google them if you like; hours of fun reading, if you're a masochist.

In short, most of these systems are huge databases with an underlying data-model that tries to do what they state on their respective tin. A popular game with enterprise management is to buy one system for each component of your enterprise, so one for taxes, one for the website, one for the intranet, one for customer relations, one for finance, one for leave and pay, another for filling in your hours, one for the helpdesk, one for systems support, one for deployment and / or configuration, one for holding your hand, another for wiping your bottom, etc, and so forth.

So the first obvious problem with all this is of course that there are many of them! And all sporting their own unique way of doing things! With their own unique user-interface! Most of them use some proprietary user-management module, which results in you having to have about 5 usernames and passwords just to get you through a normal week.

One can argue that all these systems combined surely hold a bit of the corporate knowledge, and quite possibly, if you merged all those data-models and interfaces and methods and ways of reporting, we might have a pretty good Knowledge Representation System ... provided, of course, that you knew all those data-models by heart, the user-interface was far smarter than you, and everybody in the world was working towards making you a happy human being in tune with the universe.

I've seen some pretty complex enterprise setups in my life, and I'll swear that no one - no one! - has ever come close to capturing knowledge (in representation-form or otherwise) with this one-system-to-every-part nonsense. The proof is in the pudding, and I've yet to find a pudding that tastes wonderful, is good in shape and form, looks pleasing and leaves me feeling satisfied after use.

What's a document?

It's a good question; what's a document? A Word document? A meeting invitation in Outlook? A mail? A picture? A diagram? A todo list? Meeting minutes? A draft of a specification? A combination of many things? An atomic unit?

People's notion of what a "document" is varies quite a bit; do you mean a document on a company, by the company, for the company, is it about fish, a todo list for fishermen, a complaint on our smell of fish, a fishy document ... what is it? In my book, chasing the "document" paradigm is not a worthwhile thing to do, because a "document" is often taken to be some finished work, a piece we can fit into our KM machinery. (In rare circumstances we refer to draft documents, which really are drafts, before we treat them again as a produced document.)

Instead, let's work with something that has proven itself to work quite well; a web page. It has proven itself over the last decade to be a very good spot for information, especially for changing information. Web pages change all the time.

The Wiki way

The Wiki is a changing web page about something, anything. So instead of creating a document about "Fisheries" you make a Wiki page about "Fisheries". Instead of using a special tool (like a word processor), you use the browser directly. Instead of saving it locally first through drafts (my_doc_v1.doc, my_doc_v2.doc), sharing it over email (my_doc_v1.doc, my_doc_v2.doc, my_doc_v3.doc ... uh, who's making changes to what document?!), getting it back and doing more edits (my_doc_v5.doc, my_doc_v6.doc, oops! my_doc_v5.5.doc, my_doc_v7.doc), and uploading it through the intranet thingy (my_doc_v2.html, using the most abysmal HTML known to man) ... instead of all that, you simply go to the page, click an edit button, make some changes, click the save button, and you're done. Everybody can edit and save all pages; no need to share it around as it is naturally shareable.

Ok, so let's assume we all know the simplicity of this model. What's stopping us from dealing with almost all of those KM tools in a Wiki way? What stops you from setting up a page about yourself with a picture of you, your contact details, where you fit in the organisation, what you do, how you do it, what your hobbies are and what other extraordinary skills you've got? What stops you from setting up a page about a project? With links to documentation of various kinds? What stops you from setting up a page with your hours on it?

The answer to a lot of those questions is mostly "you can't mine and reuse the data for other purposes", again referring us back to the KM machinery. But that's just where things are about to change, and in big ways. Do you really need everything to be in a highly-structured database? I mean, seriously, I know you want to use that data, mine it, sort it and report on it, but do we really do it? And if we really do it, does it matter if the data comes from a database of fields or a database of pages?

Most good Wiki engines support different ways of taking your input and converting it into something more useful for computer processing, either through crude file export or through more sophisticated Web Services APIs. The latter is what I've done with huge success.

Web Services

A page in a Wiki system is usually stored internally in a loosely structured way, often in something known as Wiki markup; it consists of plain vanilla text that is given some special meaning, so that "it was *the dog* who ate it" is displayed as "it was the dog who ate it" with "the dog" in bold.

Here's a better example, a page called "DimwitProject_HoursWeek52_AlexJ" ;

Dimwit Project
--------------
Application design : 20h
Usability study : 9h
XML schema work : 3h

It isn't hard to get or write a little parser (a lot of Wikis have lots of these already out of the box) that can convert the above to ;

<hours>
<title>Dimwit Project</title>
<item type="Application design" duration="20h" />
<item type="Usability study" duration="9h" />
<item type="XML schema work" duration="3h" />
</hours>
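
As a rough illustration of how little such a parser needs to do, here is a minimal sketch in Python. The function name `hours_page_to_xml` is made up for this example, and it assumes the page always looks like the one above: a title line, a dashed underline, then one "task : duration" line per entry. It is not the parser shipped with any particular Wiki engine.

```python
import xml.etree.ElementTree as ET


def hours_page_to_xml(page_text):
    """Convert a 'title / dashes / task : duration' page into an <hours> XML string."""
    # Ignore blank lines and surrounding whitespace.
    lines = [line.strip() for line in page_text.splitlines() if line.strip()]
    root = ET.Element("hours")
    # Assumption: the first non-empty line is the project title.
    ET.SubElement(root, "title").text = lines[0]
    for line in lines[1:]:
        # Skip the dashed underline beneath the title.
        if set(line) == {"-"}:
            continue
        # Split on the last colon so that task names may contain colons.
        task, _, duration = line.rpartition(":")
        if not task:
            continue  # not a "task : duration" line
        ET.SubElement(root, "item", {"type": task.strip(), "duration": duration.strip()})
    return ET.tostring(root, encoding="unicode")
```

Feeding it the text of the "DimwitProject_HoursWeek52_AlexJ" page should give one `<item>` element per line of hours, with the task in `type` and the hours in `duration`.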

The road from the Wiki to another part of the KM machinery is a lot shorter now that these systems utilise web services more and more; in fact, you can use the Wiki as the interface to almost all of it.

At work these days we're using an enterprise Wiki system called Confluence that has both SOAP and XML-RPC web services available, and I've created parsers and scripts that basically allow me to use Confluence as a Wiki interface into a number of services. What happens then?
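
To give a feel for what talking to the Wiki over XML-RPC can look like, here is a small Python sketch (not the actual scripts described above). It assumes the `confluence1.login`, `confluence1.getPage` and `confluence1.logout` methods of Confluence's older remote API and a page struct with a `content` field; check these against the documentation for your version before relying on them. The endpoint URL, user, password and space key in the usage note are all hypothetical.

```python
import xmlrpc.client


def connect(url):
    """Create an XML-RPC proxy; no network traffic happens until a method is called."""
    return xmlrpc.client.ServerProxy(url)


def fetch_page_content(server, username, password, space_key, title):
    """Log in, fetch one page, and return its raw Wiki markup."""
    # Assumed remote API: login returns a session token used by later calls.
    token = server.confluence1.login(username, password)
    try:
        page = server.confluence1.getPage(token, space_key, title)
        # Assumption: the returned struct carries the markup under "content".
        return page["content"]
    finally:
        # Always release the session, even if the page lookup fails.
        server.confluence1.logout(token)
```

Usage would be something like `fetch_page_content(connect("http://wiki.example.com/rpc/xmlrpc"), "alexj", "secret", "PROJ", "DimwitProject_HoursWeek52_AlexJ")`, after which the returned markup can be handed to whatever parser you like.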

Well, first of all you get one point of origin for most of your normal processes, the very pipe-dream that portal systems dream of, only in portals you're at the whim of developers creating user-interfaces and systems in particular ways for it to really work. In the Wiki, you're already familiar with the interface, and, perhaps more importantly, it is within the page-paradigm, which is easy to bookmark, easy to reference, easy to modify, easy to remember, and easy to search. And if there are some things portal systems suck at, it is pretty much all those things listed. And the Wiki has a distinct Google advantage; free-text parsing and linking that can convey importance much better than most metadata can! (I know; bold statement, solely based on the tremendous success Google has shown us in this area.)

Second, because of the simplicity of adding and editing data, the information stays fresher. If you only allow people to use the Wiki for most things, the Wiki's content will be fresher still. Instead of John using Word to write down the minutes of a meeting, make him do it straight in the Wiki. Instead of letting Doreen write a draft letter to the fish caterer in Word, let her do it in the Wiki. Instead of adding a meeting to Sonja's schedule in Outlook (where perhaps a few know how to properly use that information), just put it on her Wiki page.

Third, and this bit is a bit philosophical and possibly psychological, an open space for all to work in a) helps people understand what others are doing and what they're working on, b) helps generate an atmosphere of less secrecy, and c) promotes a less rigid structure for live information (there are no longer just draft and published documents; they are all living, changing all the time).

Here's what I do

First of all, I created a really simple URL for people to go to, such as wiki.work.com or work.com/wiki. (I'd recommend that you create a few shortcuts as well, so that wiki.work.com/project/MyProject is the same as wiki.work.com/projectMyProject, as this gives the impression of structured data.)

For project management, every project has a starting Wiki page. On it, I first and foremost write a) what the project is about and for (divided into 1. the problem and 2. the solution), b) where we're up to, and c) where you can find some current representation of what we're doing (an application in test, a document, a draft design guide, whatever we can prove our existence through). Then I write who's involved in the project, stakeholders, developers, watchers, and all these have their own pages which are Wiki pages themselves. Finally I have a separate documentation page with links to all our various documentation, all Wiki pages.

If we have a Word document, it will immediately be Wikified and deleted from the offender's PC. This is important; delete all Word (or other proprietary format) documents as soon as you possibly can; if the Wiki is to work for us, we must work with it. This is probably the hardest transition for most people at first, but after a short while they'll never look back. :)

Once a day I update the front page with status information. I usually do it at a specific time every day, like 1pm. For every important bit of information I might add a comment on progress and possible resolution. Once in a while I create a GANTT chart (because some people can't live without them) which I'll attach to the Wiki page and link to. If I can't give people a good overview of where we're up to on that front page, I doubt any other PM software would do a better job.

Every piece of documentation is a separate page which you link to from the documentation page. It doesn't matter if this documentation page gets long; group the links reasonably well and label the links, and people will have no problem finding them. If I need to write a report, I create a page that's a sub-page of the project reports page. You can almost never have enough pages, and you certainly will never run out of them.

Some Wikis support structured pages (and our Confluence does just that) where you can create sub-pages of a page, and where that structure can automatically be called upon in terms of navigation, display and organisation. Use this wisely. Some Wikis also support things like tags, blogging, WYSIWYG editors, sub-Wikis etc, and all this will help you out in creating a good intranet.

Some pages are worth republishing, and this is done by taking the page name and pushing it through a simple PHP script I've got that fetches the page content through web services and displays it on our various other webs. Over time this will probably run the whole website, but currently there's an assortment of pages done this way, and I'm working on getting all news / newsletters done this way, repurposing bits of news. (Our Confluence supports various blogging paradigms, and creating and reusing newsfeeds from pages / Wiki blogs is easy.)

Some pages are reports by themselves, sporting simple Wiki macros that take information from various places and create a summary page (which is the report itself). If your Wiki markup is well-structured, creating quite sophisticated reports is easy. For example, I can create an automatic page that is a monthly and / or yearly summary of all my hours spent, using the Wiki markup I described earlier.
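
The hours summary is easy to picture outside the Wiki as well. The following Python sketch is not a Wiki macro, just an illustration of the aggregation step; the function name `summarise_hours` is invented here, and it assumes every entry is a "task : Nh" line with whole hours, as in the earlier example page.

```python
from collections import defaultdict


def summarise_hours(pages):
    """Sum the 'task : Nh' lines of several weekly pages into per-task totals."""
    totals = defaultdict(int)
    for page_text in pages:
        for line in page_text.splitlines():
            # Split on the last colon; title and underline lines have no colon.
            task, sep, duration = line.rpartition(":")
            duration = duration.strip()
            # Only lines shaped like "Some task : 20h" count as hour entries.
            if sep and duration.endswith("h") and duration[:-1].isdigit():
                totals[task.strip()] += int(duration[:-1])
    return dict(totals)
```

Given the markup of all the weekly pages for a month, the returned dictionary maps each task to its total hours, which is all a summary page needs to print.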

Hmm, I do do a lot more, and I might update this post as I'm doing them.

Conclusions

I suppose there are a lot of straw-men in this post, ungrounded facts and dubious claims, so I suggest not to take my word for it but simply try it out yourself. Start simple; install some Wiki engine, start documenting your own projects, and invite a select few to participate. See what happens. There are a number of things to be said about whether an organisation is 'ripe' for a Wiki approach or not. I personally have witnessed conservative technologically-challenged folks use Wikis with ease and pleasure, but I'm sure there are counter-stories to be told as well.

Often people think of Wikis as another tool for their toolbox, but in my experience Wikis tend to work best when you remove some of those other tools; it seems to be worth the re-training and initial frustration and scare. Just because everybody uses Office doesn't mean it is the best tool for the job, nor does it mean you should even use the darn thing; we have a tendency to use Office for all the wrong things as well, just because it is there. Time to rethink.

Finally; spreadsheets. Yeah. Wikis can't compete with them. Sorry. :) There are however smart ways to link into the information within them (again, think web services) and reuse that information for all your Wiki pleasures.

Enjoy.

A few thoughts on my online communication

As you may have noticed, I have been on a rather extended hiatus in regards to blogging. I thought I should quickly summarise my absence ;

1. I was rambling too much about my miseries, too much for my own comfort. Yes, work has been trying, so here's the quick summary of that; the public service is a slow bitch when you come from a commercial high-flying world. I've had to learn to deal with this better, because, in this town, 80% of all jobs are government, and for the moment I seem to be stuck here. I also thought that there would be companies out there who would see my wonderful CV and snatch me up before anybody else would, but there were errors in my plan (see next point).

2. Everyone is a frigging expert while I'm starting to sound a bit like a wannabe jerk. Not a good sound at all, so my writing will surely be toned down a notch, and my expertise adjusted. I feel a bit stupid, really, but still haven't figured out if I'm behind or ahead of current thinking on a number of issues. Report at 11.

3. Nothing is surprising anymore, and everyone is a blogger. Everyone writes about a number of "new" things, but seriously, none of them really are. Everything is a regurgitation of other ideas, and I simply don't get surprised anymore. I don't feel there is anything important to write about, no matter how wrong that might be. (See point 2 above.)

4. Communication was failing me; after 17 years in the IT industry I came to a stand-still in communication, be it with friends, family or people I know around the world. I kept responding to things, but the number of replies was decreasing. I've felt the dreaded "am I missing emails?" syndrome, another bad state to be in, when emails and blog comments hold higher importance than real life. I've adjusted the importance I give the online world accordingly and don't rely on online friendships for self-realisation.

So I feel a bit fresher and wiser, and I'm being a bit humbler, I feel properly embarrassed, and will probably be a bit gentler in my online approach.

12 May 2006

Knee-deep in SOA

Lately at work I've been knee-deep in SOA; Service-Oriented Architecture, a concept that for me has roots in web services but extends beyond that in that it has made my holistic thinking possible.

There is more to application design than solving the problem at hand, far beyond the scope of any requirements document; it's more about supporting the infrastructure than whatever else you think you're doing. Maybe this requires some explanation;

Business analysts create requirement documents to solve some business problem. However, unless that business analyst is especially sharp and holistic, there are so many undercurrents, twists and turns to the final solution that more often than not we shouldn't even attempt to solve problems. A lot of places employ solution / systems / information architects to try to rectify this problem, but often it simply ain't enough; detachment from technical solutions seems to be a huge problem for understanding business problems.

A lot of us are technically inclined people; we try to take the business requirements and make technical solutions of them. Anyone with a speck of experience in these areas knows how dreadfully wrong it can go, and we say "oh, you need to employ the right people to make it work." Most of the time, that's true; with really super people these things will work much better. How often do we have the luxury of only working with top-dogs? I'll leave that an open question, of course.

Enter the SOA; think of your business requirements as services, tiny and disjointed or large and intertwined, and open up Interfaces to them, and what happens? Well, notice that bold word; interface. Not application. Not program. Not even requirement. It's a service, a service that programs, applications and requirements can use to solve their problems.

We all know web services by now; most of the time it means either a SOAP or REST call where XML is the carrier of various bits of information. Because of the openness of these technologies we can quickly cook up various other applications and programs from a smørgåsbord (yeah, look it up) of services that might address your issues or wants. They are completely disjointed from the applications that use them, meaning a clear separation of business logic, application frameworks and user-interfaces. If played correctly, it can have an amazing synergetic effect on everything you do.

If you're a geek like me, the prospect of this is great, but over the next little while I'd like to talk about all those things it affects in better business management, usability, information architecture, user-interface design, application design, application scalability and performance enhancements, and more.

To sign off though, I'd like to talk a little about what I've done so far. First, I've created a hub from which all web services come, something like http://ws.example.com/ which works either as an application context for your servlets/scripts/etc that are plain web services, or as a proxy for external services. This hub has a wrapper so that it doesn't matter if you want to use SOAP or REST or even partly RSS/Atom feeds of stuff.
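
The idea of a hub with a format-agnostic wrapper can be sketched in a few lines of Python. This is only an in-process toy, not the hub described above: the class name `ServiceHub` and its methods are invented for the illustration, JSON and plain XML stand in for the real wire formats (SOAP, REST, feeds), and it assumes every service returns a flat dictionary whose keys are valid XML element names.

```python
import json
import xml.etree.ElementTree as ET


class ServiceHub:
    """Toy registry: one place to call services, whatever format the caller wants."""

    def __init__(self):
        self._services = {}

    def register(self, name, handler):
        # A handler is any callable taking keyword arguments and returning a flat dict.
        self._services[name] = handler

    def call(self, name, fmt="json", **params):
        if name not in self._services:
            raise KeyError(f"unknown service: {name}")
        result = self._services[name](**params)
        # The wrapper: the same result rendered in whichever format was asked for.
        if fmt == "json":
            return json.dumps(result, sort_keys=True)
        if fmt == "xml":
            root = ET.Element("response", {"service": name})
            for key, value in sorted(result.items()):
                ET.SubElement(root, key).text = str(value)
            return ET.tostring(root, encoding="unicode")
        raise ValueError(f"unsupported format: {fmt}")
```

The point of the design is that a service is written once and registered once; callers choose the format, and the service itself never has to know.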

Next, put some good services in there, like user-authentication, and you're half-way there to the single-sign-on pipe-dream. I've implemented it with some thesaurus services, authentication, an OPAC service and a wrapper for Amazon.com.

I can now, through PHP, create a completely new application on top of a few services that were written in Java and Perl. I can pass bits of information in to our OPAC and do more complex searches. As a test, we recently created a Lucene database prototype of about 11.4 million MARC records. The user-interface looked terrible, but through the SOA hub we split the requests in two (at random, fire your web service request at two different servers for load balancing), took the first record from the service and fed it to another lookup-service, fed the subject headings from this request to the thesaurus, did a third search in the Lucene prototype for the subjects that had thesaurus entries, and voilà! We could present a new application with a good user-interface, all in two days from start to finish. And we really didn't break a sweat, either.
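
The chain of calls in that test is simple enough to sketch. The Python below is an illustration only, not the PHP that was actually written; every name in it is hypothetical, the four services are passed in as plain functions (in real life they would be web service calls through the hub), and the thesaurus is assumed to be something you can test membership against.

```python
import random


def pick_server(servers, rng=random):
    """Crude load balancing: send each request to a randomly chosen server."""
    return rng.choice(servers)


def related_records(query, servers, search_opac, lookup_record, thesaurus, search_index):
    """Chain the services: search, look up the first hit, filter subjects, search again."""
    server = pick_server(servers)
    hits = search_opac(server, query)
    if not hits:
        return []
    # Feed only the first record to the lookup service to get its full details.
    record = lookup_record(hits[0])
    # Keep only the subjects that have an entry in the thesaurus.
    subjects = [s for s in record["subjects"] if s in thesaurus]
    # Final search for those subjects (the Lucene prototype, in the story above).
    return search_index(subjects)
```

Because each step is just a call to some service, swapping one out (a different thesaurus, a different index) means changing one argument, not rewriting the application.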

That's what I want to talk about; what happens once you've got a basic SOA in place; synergy!

I'm baaaaack!

Ok, so I've reconsidered my life and blogging, and I'm back to fiddle some more. This is just a warning. Oh, and I've changed to Blogger to handle it; I just couldn't bother with all that silly code anymore.

I'm redirecting my old feed http://shelter.nu/shelter.rss to http://shelter.nu/blog/atom.xml. If any of this means anything to you, you know what to do. If not, don't worry.

Also, I'll be reposting some recent mails just in case. And all my old posts and comments will still be available as is, but I'll probably just shoot in a little message at the front gate to have you go to http://shelter.nu/blog/ from now on.

Wish me luck!