8 December 2009

I ain't dead!

Right, so I'm still here, in the rubble of my mind, trying to work something out. I haven't blogged in the last few weeks because, again, I'm lost to the infinite machine of just too bloody much to blog about. Some of these things are somewhat secret stuff, but a lot of it I should yell out for all to see, and I'm sure with a bit of patience and Macedonian Oil I just might.

But not now. "Now" is just a futuristic recap of things I've blogged about in the near future, using cheesy book titles ;

  • The end is far, far away : Studies in Cosmology, thoughts on the flat universe model and evolutionary natural selection, and how timing is everything
  • Ontology schmology bolony : Everything I know about ontologies, linked data and inference, and just what a bloody mess it all is (and the possible ways through it, as far as I can see)
  • Library end-times : This is what they were, this is what they are, this is what they'll become
  • Evolution as a driver for moral philosophy : Philosophical greats had good questions that now make for redundant answers
  • Have you heard this?! : Science as a language of beauty, art and transcendence
  • Functionally complete impotence : Programming languages that mean well, but are ugly, smell bad and won't make you light up after having sex with them
  • Atheism and agnosticism : A transsexual ploy to power (or, a Tale of two Ditties)
  • Software : Sitting comfortably? How sitting in front of a computer makes you a terrible programmer
  • Books I've read : The good stuff (and where to go next)
  • Books I've read : Time I've wasted (and the reasons for it)
  • Lingua Panga! : How language poisons everything (the big problems of humanity blamed on humans talking too much)

Right, so that should leave some clues as to where I am and where my mind is. My bedside table is brimming over with books and notes, and I've got a few half-written articles (novels, is more like it) sitting around waiting for me to retire so I can bloody well finish them.

I also have some real articles in the oven, about service-oriented architecture (SOA) perils and solutions in a time of cloud hysteria, parallel processing mania and key-value minimalist thinking as a way to leverage, er, something or other. I'm sure it'll be great once I figure out what I'm writing.

Oh, and it's hot here now. We've been to the beach rather often; being a dad of three crazy kids I don't get to go in the water much, but I enjoy helping Sam dismantle the beach with a shovel and bucket. It's also end-of-year stuff with school, Lilje playing in some musical number or two, and generally for Grace and Lilje to say goodbye to Minnamurra Primary as we're moving them over to Shellharbour Anglican College next year following Julie's new job there. We're also moving houses in about 10 days (closer to the beach, so it must be good, although I'll miss the close proximity to all my coffee-shops), so that's going to be crazy time.

Ok, time to go and treat kids for head-lice which has rampaged through their school of late. And then, dinner. Wish me luck.

27 October 2009

On identity

What are you talking about?

   We're always talking about something, but have you wondered why we humans are so good at it? It's not because we're smart, that our brain has got some amazing capacity for language, or even that we've evolved a great sense of logic and inference so we can break sentences up into compartments, parse it and make some sense of it. No, it's because we've got a tremendous imagination!

   And it seems that our frontal lobe is to blame; it is linked to a number of cognitively important things, like dreaming (preparing the brain for situations and trauma; did you know that no matter the trauma you will be over it [as in, able to move on] within 7 months?), Déjà vu (the frontal lobe is always a few milliseconds ahead of you), intuition (simulating possibilities, feeding you with probables), and in this context, filling in the gaps as best it can.

   And boy is it good at it. Remember that meme that was floating around some time ago, about how researchers have found that if you removed some of the letters from words in a text, the brain is still able to fill in the gaps so that you can make sense of it? The brain will fill in whatever gap there is, and this is also being heavily linked to religion and why people believe in rather bizarre things, from ghosts to conspiracies to "alternative medicine" ("You know what they call alternative medicine which is proven to work? Medicine." -- Tim Minchin). But I'm not going to get into what they believe here, only how they believe in the same bizarre things as their peers.

   But first some background. My recent adventures in library-land are about trying to get some traction on identity management, which I have tried to explain there for the last two or three years with little to no success. I'm not even sure why the library world - full of people who should know a thing or two about epistemology - doesn't seem to grasp the basics of epistemology. (Maybe it's another one of those gaps the brain fills in with rubbish?) How do we know that we're talking about the same thing?

   If I have a book A in my collection and Bob has a book B in his collection, how can we determine if these two books share some common properties or, if we're really lucky, are written by the same author, have the same title, and are the same edition, published by the same publisher? We're trying to establish some form of identity. Now, we humans are good at this stuff because we're all fuzzy and have got this brain which fills in the gaps for us, but when we make systems of it we need other ways to denote identity.

   The library world has a setup which is based around the title and the author, so for example we get "Dune" by Frank Herbert (1920-1986), or if we are to cite it, something like this (from NLA's catalog) ;

  • APA Citation:  Herbert, Frank,  1972  Dune  Chilton Book Co., Philadelphia :
  • MLA Citation:   Herbert, Frank,  Dune  Chilton Book Co., Philadelphia :  1972
  • Australian Citation:  Herbert, Frank,  1972,  Dune  Chilton Book Co., Philadelphia :
   Never mind that when you look at the record itself it lists Herbert as "Herbert, Frank, 1920-" confusing a lot of automata by not knowing he died over 20 years ago. So we've got several ways of citing the book, several ways of denoting the author ... what to do?


   The library world is doing a lot of match and merge (on human prose, no less!), where since you know that a lot of authors have died since their records were last updated, you can parse the author field and try to match "sub-fields" within it to match on that. However, this quickly becomes problematic ;

  • Herbert, Frank (1920-)
  • Herbert, Frank (1921-1986)
  • Herbert, Francis (-1986)
  • Herbert, Franklin (1920-)
  • Herbert, Franklin Patrick Jr (1919-)
  • Herbert, Francis (1030-)
  • Herbert, Frank Morris (1920-)

   Which of these is the real Frank Herbert who wrote the book "Dune"? Four of them, actually. Now, if you're a human you can do some searching and probably find out which ones they are, but if you're a computer you have Buckley's chance of figuring these things out, no matter how well you parse and analyse the author's individual "sub-fields". People make mistakes and enter imprecise or outright wrong information into the meta data (for a variety of reasons), so we need some other method that's a bit better than this. However, do note that this is the way it's currently being done. Add internationalization to the mix, and you'll have loads of fun trying to make sense of your authority records, as they are called.
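   To make the problem concrete, here is a minimal sketch in Python of the match-and-merge approach, run over the seven headings listed above. The regular expression and the `same_person` rule are my own assumptions for illustration, not how any real library system does it.

```python
import re

# Headings copied from the list above.
HEADINGS = [
    "Herbert, Frank (1920-)",
    "Herbert, Frank (1921-1986)",
    "Herbert, Francis (-1986)",
    "Herbert, Franklin (1920-)",
    "Herbert, Franklin Patrick Jr (1919-)",
    "Herbert, Francis (1030-)",
    "Herbert, Frank Morris (1920-)",
]

# Assumed layout: "Family, Given (born-died)", where either year may be missing.
HEADING_RE = re.compile(
    r"^(?P<family>[^,]+),\s*(?P<given>[^(]+?)\s*\((?P<born>\d*)-(?P<died>\d*)\)$"
)


def parse_heading(heading):
    """Split one heading into its "sub-fields"."""
    m = HEADING_RE.match(heading)
    if m is None:
        raise ValueError(f"cannot parse heading: {heading!r}")
    return {
        "family": m["family"],
        "given": m["given"],
        "born": int(m["born"]) if m["born"] else None,
        "died": int(m["died"]) if m["died"] else None,
    }


def same_person(a, b):
    """Naive rule: identical names and no conflicting years."""
    if a["family"] != b["family"] or a["given"] != b["given"]:
        return False
    for key in ("born", "died"):
        if a[key] is not None and b[key] is not None and a[key] != b[key]:
            return False
    return True


records = [parse_heading(h) for h in HEADINGS]
reference = records[0]  # "Herbert, Frank (1920-)"
matches = [h for h, r in zip(HEADINGS, records) if same_person(reference, r)]
print(matches)
```

   With this strict rule the first heading matches nothing but itself. Loosen the rule (say, compare only the first four letters of the given name) and you start merging people who may be different. Either way the parser is guessing.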

   Now, my book A just happened to be "Dune" by Frank Herbert, so I sent a mail to Bob with the following link and asked if that happened to be the same book ;
http://en.wikipedia.org/wiki/Dune_(novel)
   Did you notice what just happened? I used a URI as an identifier for a subject. If you pop that URI into your browser, it will take you to WikiPedia's article on the book and provide a lot of info there in human prose about this book, and this would make it rather easy for Bob to say that, yes indeed, that's the same book I've got. So now we've got me and Bob agreeing that we have the same book.

   How can our computer systems do the same? They cannot read English, certainly not to any capacity to reason or infer the identity of the subject noted on that WikiPedia page. But here's the thing; that URI is two things ;

  1. An HTTP URI which a browser can resolve, get a web page back for, and display to a human to read.
  2. A series of characters and letters in a string.

   It's the second point which is interesting for us when computers need to find identity. It is a string that represents something. It isn't the web page itself, just an identifier for that page, just a representation of a particular subject. This brings us back to epistemology, and more specifically representationalism; we've created a symbol, a string of letters, that doesn't need to be read or understood when the strings are put together, but is simply a pattern, a shape, a symbol, an icon, a token, whatever. It's not a URI anymore, but simply a token. And because it's a string of characters, it's easy to compare one token against the other. "http://bingo.com" and "http://bingo.com" have the same equivalence as "abc" and "abc", that is, they are the same. Those symbols, those tokens, are equal.

   So now we can say that the URI http://en.wikipedia.org/wiki/Dune_(novel) is simply a token and a URI at the same time. This is deliberate, and bloody brilliant at the same time; it means that we can compare a host of them for equality as well as being resolvable in case we want to have a look at what they are. This becomes a mechanism for both human understanding of what's on the other end of the URI, and for doing computational comparisons.
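   A small Python sketch of that dual nature, using the Dune URI from above. The function names are mine, made up for illustration; the only library call is the standard `urllib.parse.urlsplit`.

```python
from urllib.parse import urlsplit

DUNE = "http://en.wikipedia.org/wiki/Dune_(novel)"


def same_subject(token_a, token_b):
    """Identity check: the URIs are compared as plain strings, nothing more."""
    return token_a == token_b


def is_resolvable(token):
    """Rough check that the token also looks like something a browser could fetch."""
    parts = urlsplit(token)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


my_book = DUNE
bobs_book = "http://en.wikipedia.org/wiki/Dune_(novel)"

print(same_subject(my_book, bobs_book))  # the computer's view: equal tokens
print(is_resolvable(my_book))            # the human's view: something to look at
```

   Note that the comparison never touches the network. Resolving the URI is an optional extra for the humans in the loop.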

   So are we to use a URI for each of the variations of Frank Herbert's name? No, that would bring us back to square one. No, the idea is for sharing these URIs (but more on URIs for multiple names in a minute) in a reasonable fashion, but this is where it gets slightly complex because when you talk to Semantic Web people it's all about established ontologies and shared data. When you talk to people, it's all about resolvable URIs. But there's a bit that's missing ;
I love http://en.wikipedia.org/wiki/Semantic_Web
   That's a classic statement, but what am I saying? Do I love the Semantic Web (the subject), or do I love that web page article at WikiPedia explaining the Semantic Web (a resource)?

   Incidentally, my classic statement is known as a value statement in the RDF world, and as a triplet (because it's got three parts, the three words / notions). Whenever we're working with RDF, we're working with URIs. Every single entity is translated into its URI form like so ;
I [http://shelter.nu/me.html]
love [http://en.wikipedia.org/wiki/Love#Interpersonal_love]
Semantic Web [http://en.wikipedia.org/wiki/Semantic_Web]
   I need to talk a bit about namespaces at this point. If you're not familiar with them, they're basically a shorthand for mostly the first part of an URI, like a representation that can be reused, and then glued together by the means of the magical colon : character, so for example I have many things to say about me and my universe, which each will get translated into a URI ;
me [http://shelter.nu/me.html]
topic maps [http://shelter.nu/tm.html]
fields of interest [http://shelter.nu/foi.html]
blog [http://shelter.nu/blog/]
Writing out the URI for each thing is tedious, and also is prone to errors, so what we do is to create a namespace as such ;
alex = http://shelter.nu/
Now we can use that namespace with a colon to write all those URIs in a faster, less error-prone way ;
me [alex:me.html]
topic maps [alex:tm.html]
fields of interest [alex:foi.html]
blog [alex:blog/]
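   The shorthand is nothing more than string substitution, which a few lines of Python can sketch. The `alex` prefix is the one defined above; the function names `expand` and `shorten` are my own, for illustration only.

```python
# Prefix table: shorthand name -> the first part of the URI it stands for.
NAMESPACES = {"alex": "http://shelter.nu/"}


def expand(curie, namespaces=NAMESPACES):
    """Turn 'alex:me.html' back into the full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"unknown namespace prefix in {curie!r}")
    return namespaces[prefix] + local


def shorten(uri, namespaces=NAMESPACES):
    """Turn a full URI into prefix:local form if a known namespace fits."""
    for prefix, base in namespaces.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return uri  # no namespace applies, leave it alone


print(expand("alex:me.html"))
print(shorten("http://shelter.nu/tm.html"))
```

   Since the prefix is only a local abbreviation, two people can use different prefixes for the same namespace and still end up with identical full URIs, and it is the full URIs that get compared.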
   Namespaces are also a good way to modularize and more easily extend existing stuff, and they help us organize and care for our various bits and bobs. Well, so the theory goes. But when you muck around with lots of data from many places, it quickly becomes a situation that I call name-despaced, where there are just too many namespaces around. When it gets complex like that with hundreds of namespaces around, we're pretty much back to having non-semantic markup again and no one really wants that. This all is of course the result (but not end result) of the organic way information and people organize stuff. Some namespaces will die, while others will be popular and live on, and we're still in early days.

   Anyway, back to solving our identity management problems. The issue here is that just sharing the data doesn't give us semantics (meaning), nor does sharing our ontologies. We need both human comprehension and computational logic in order to pull it all off, and the reason we care about this these days is that the amount of data is growing beyond our wildest imaginations and will continue to grow. The computational part is reading in ontologies and sorting data thereafter. The human part is creating the ontologies.

   So what are these ontologies? Well, they're just models, really, an abstract representation of something in reality, so when FRBR spends its time in prose and blogs and articles and debate, it's really trying to make us all agree on a specific way of modeling said domain. When we formalize this effort, mostly into XML schemas or RDF / OWL statements, we are creating an ontology. It's like a meta language in which we can describe our models further. This is usually modularized from the most abstract into the most concrete way of thinking, so from what's known as an upper ontology (pie-in-the-sky) through various layers (all called many different things, of course, like middle, reason, core, manifest, etc.) down to the most concrete.


   Karen Coyle (a voice of reason on the future of the library world) recently "debated" with me on these things, and I pointed her to "Curing the web's identity crisis", an article by Steve Pepper (fellow Topic Mapper like me) which more people really should read and make an effort at understanding. Now I think there's some confusion as to what is being explained (well, I never got a reply, so I don't know, to be honest. It's probably me. :), and also as to why we (us terrible representationalists) keep bringing this up, but I'm kinda back to where I started in this blog post, trying to argue the case for creating identity of things through more layers than are currently being used.

   We (both RDF and Topic Maps) use URIs as tokens for identity. But in the RDF world there is no distinction between subject identity and resource identity, and I suspect this is where Karen's confusion kicks in. In the Topic Maps world we make this distinction quite clear, in addition to the resource-specific identities as well (so URIs for internal Topic Map identity, external subject identity, and external resource identity), and this is vitally important to understand!
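   Here's a rough Python sketch of that three-way distinction. The field names are borrowed from the Topic Maps Data Model (item identifier, subject identifier, subject locator), but the class and the merge rule are a simplified illustration of my own, not an implementation of the standard.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    # Internal identity: only meaningful inside this one topic map.
    item_identifier: str
    # External subject identity: URIs that stand for the subject itself.
    subject_identifiers: set = field(default_factory=set)
    # External resource identity: URIs of resources that *are* the subject.
    subject_locators: set = field(default_factory=set)


def same_subject(a, b):
    """Two topics are about the same thing if they share an identifier of the same kind."""
    return bool(
        a.subject_identifiers & b.subject_identifiers
        or a.subject_locators & b.subject_locators
    )


URI = "http://en.wikipedia.org/wiki/Semantic_Web"

the_idea = Topic("#semantic-web", subject_identifiers={URI})   # the Semantic Web itself
my_idea = Topic("#semweb", subject_identifiers={URI})          # same subject, other map
the_page = Topic("#wikipedia-article", subject_locators={URI})  # the web page as a thing

print(same_subject(the_idea, my_idea))   # same subject
print(same_subject(the_idea, the_page))  # same URI, different kind of identity
```

   The same string of characters is used in both roles, but because the role is recorded, "I love the Semantic Web" and "I love that web page" never get merged by accident.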

Let me exemplify with how I would like to see future library cataloging being done ;

I have a resource of sorts at hand, it could be a book or a link or a CD or something. Doesn't matter, but for the example it's written by Frank Herbert, apparently, and is called "Dune Genesis." It's an eBook. I pop "Frank Herbert" into a textbox of sorts, the system automatically does some searching, and finds 5 URIs that match that name. One of those URIs is from WikiPedia and another is from The Library of Congress. That means LoC has verified that whatever explains the subject of "Frank Herbert" is at the WikiPedia URI, and that there is a reasonable equality between the two; one WikiPedia page, one authority record at LoC. The other URIs more or less confirm it (and this speaks to trust and government). I choose to accept the LoC URI as an author subject URI. Nothing more needs to be entered, no dates, no names, no nothing. Just one URI.

   Now I pop the name "Dune Genesis" into my tool, and it does its magic, but it returns only a WikiPedia URI, and because it's tradition not to "trust" WikiPedia it means I have a "new" record I need to catalog. However, the WikiPedia URI contains RDFa, so my tool asks if I want to try and auto-populate meta data, and I choose yes. Fields get populated, and I go over them, checking that they are good, add some, edit some, delete some, and hit save.

   Two things now happen. First, the system automatically creates a URI for me, a subject identity URI that, if resolved, will point to a page somewhere on our webserver with our meta data. Second, that URI is fed back into whatever loop that tool uses for federated URIs; it could be library custom-made (see EATS below, or look to the brilliant www.subj3ct.com website for federated identity management) or something as simple as Google (for example, I use Ontopedia a lot, so if I search for "Alexander Johannesen Ontopedia", I will get as a first result a page representing a URI I can use for talking about me). This creates a dual system of identity, one for the subject, one for the meta data about the book, both using the same URI.
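   The workflow just described could be sketched roughly like this in Python. Everything here is hypothetical: the registry contents, the example.org URIs, the source labels and the function name are placeholders I made up, standing in for whatever federated lookup the tool would really use.

```python
# Assumption: the cataloguer trusts Library of Congress URIs and nothing else.
TRUSTED_SOURCES = ("loc",)

# Hypothetical federated registry: name -> list of (source, URI). Placeholder data.
REGISTRY = {
    "Frank Herbert": [
        ("wikipedia", "http://example.org/wikipedia/frank-herbert"),
        ("loc", "http://example.org/loc/frank-herbert"),
    ],
    "Dune Genesis": [
        ("wikipedia", "http://example.org/wikipedia/dune-genesis"),
    ],
}


def pick_subject_uri(name, registry, mint_base):
    """Return (uri, minted): a trusted URI if one exists, otherwise a newly minted one."""
    for source, uri in registry.get(name, []):
        if source in TRUSTED_SOURCES:
            return uri, False
    # Nothing trusted was found: create our own subject identity URI ...
    uri = mint_base + "-".join(name.lower().split())
    # ... and feed it back into the loop so others can find and reuse it.
    registry.setdefault(name, []).append(("local", uri))
    return uri, True


BASE = "http://example.org/library/subject/"  # placeholder for "our webserver"
print(pick_subject_uri("Frank Herbert", REGISTRY, BASE))
print(pick_subject_uri("Dune Genesis", REGISTRY, BASE))
```

   The cataloguer never types a date or a name variant. They either accept a URI somebody they trust has already vouched for, or they publish a new one for the next person to find.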

   Do you dig it? Can you see it? Can you see the library world slowly using such a simple mechanism for totally ruling the meta data and identity management boulevard, or what? I pointed to Conal Tuohy's EATS system. Make him give it to you, collaborate to make this just work, open-source it and make it a tool for librarians to automatically create, use, harvest and share identities and resources using the same URIs, and you've got what you need.

   This is complex stuff, and I think I need a drink now. A nice hot tea will do, and I'll try to clarify more in the coming days. Until then, ponder "what the heck you are talking about."

21 October 2009

Old post, as good as new

I just realized that I wrote this ages ago but never posted it. It has a few gems in it ;
Criticism is mostly about rocking the boat. Sure, there's positive criticism, like "you're not ugly, just beautiful-impaired!", but aren't we over this silly overly political correctness by now? Criticism is to tell it straight, that what someone else has done is not up to scratch, that surely there must be some improvement that could be done. But the library world doesn't work like that. Criticism in the library world uses a different word; approval.

15 October 2009

Ontological Ponderings

The last few months have been interesting for me in a philosophical sense. My job is on an architectural level in using ontologies in software development, both in the process (development, deployment, documentation), the infra-structure (SOA, servers, clusters) and the end result of it (business applications). So needless to say, I've been going a bit epistemental, so I promised myself yesterday to jot down my thoughts and worries, if for no other reason than for future reference.

One big thing that seems to go through my ponderings like a theme is the linguistic flow of the definition language itself, in how the mode of definition changes the relative inference of the results of using that ontology over static data (not to mention how it gets even trickier with dynamic data). We usually say that the two main ontological expressions (is_a, has_a) of most triplets (I use the example of triplets / RDF as they are the most common ones, although I use Topic Maps association statements myself) define a flat world from which we further classify the round world. But how do we do this? We make up statements like this ;

Alex is_a Person
Alex has_a Son

Anyone who works in this field understands what's going on, and that things like "Alex" and "Person" and "Son" are entities, and defined with URIs, so actually they become ;

http://shelter.nu/me.html is_a http://psi.ontopedia.net/Person
http://shelter.nu/me.html has_a http://en.wikipedia.org/wiki/Son

Well, in RDF they do. In Topic Maps we have these as subject identifiers, but pretty much the same deal (except some subtleties I won't go into here). But our work is not done. Even those ontological expressions have their URIs as well, giving us ;

http://shelter.nu/me.html http://shelter.nu/psi/is_a http://psi.ontopedia.net/Person
http://shelter.nu/me.html http://shelter.nu/psi/has_a http://en.wikipedia.org/wiki/Son

Right, so now we've got triplets of URIs we can do inferencing over. But there are a few snags. Firstly, a tuple like this is nothing but a set of properties for a non-virtual property and does not function like a proxy (like for instance the Topic Maps Reference Model does), and transforming between these two forms gives us a lot of ambiguity that quickly becomes a bit of a problem if you're not careful (it can completely render inferencing useless, which is kinda sucky). Now given that most ontological expressions are defined by people, things can get hairy even quicker. People are funny that way.
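To show what "triplets of URIs" look like to a program, here is a minimal Python sketch using the two statements above. The `match` and `types_of` helpers are names I made up; this is plain pattern matching over tuples, the lookup step that inferencing builds on, and not a reasoner.

```python
ME = "http://shelter.nu/me.html"
IS_A = "http://shelter.nu/psi/is_a"
HAS_A = "http://shelter.nu/psi/has_a"
PERSON = "http://psi.ontopedia.net/Person"
SON = "http://en.wikipedia.org/wiki/Son"

# Each statement is a (subject, predicate, object) tuple of URI strings.
TRIPLES = {
    (ME, IS_A, PERSON),
    (ME, HAS_A, SON),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return every triple that fits the pattern; None means 'anything'."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


def types_of(triples, subject):
    """Everything the subject is said to be, via the is_a predicate."""
    return [o for _, _, o in match(triples, subject=subject, predicate=IS_A)]


print(types_of(TRIPLES, ME))
print(match(TRIPLES, predicate=HAS_A))
```

Swap the is_a URI for a was_a or can_be_a URI and the code runs just the same. All the meaning sits in what people decide those predicate URIs stand for.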

So I've been thinking about the implications of more ambiguous statement definitions, so instead of saying is_a, what about was_a, will_be_a, can_be_a, is_a_kindof_a? What are the ontological implications of playing around with the language itself like this? It's just another property, and as such will create a different inferred result, but that's the easy answer. The hard answer lies between a formal definition language and the language in which I'm writing this blog post.

We tend to define that "this is_a that", this being the focal point from which our definition flows. So, instead of listing all Persons of the world, we list this one thing that is a Person, and move on to the next. And for practical reasons, that's the way it must be, especially considering the scope of the Semantic Web itself. But what if this creates bias we do not want?

Alex is_a Person, for sure, but at some point I shall die, and then I change from an is_a to a was_a. What implications will this, if any, have on things? Should is_a and was_a be synonyms, antonyms, allegoric of, or projection through? Do we need special ontologies that deal with discrepancies over time, a clean-up mechanism that alters data and subsequently changes queries and results? Because it's one thing to define and use data as is, another completely to deal with an ever changing world, and I see most - if not all - ontology work break when faced with a changing world.

I think I've decided to go with a kind_of ontology (an ontology where there is no defined truth, only an inferred kind-system), for no other reason than that it makes cognitive sense to me and hopefully to other people who will be using the ontologies. This resonates with me especially these days as I'm sick of the distinction people make between language and society, that the two are different. They are not. Our languages are just like music; with the ebb and flow, drama and silence that makes words mean different things. By adding the ambiguity of "kind of" instead of truth statements I'm hoping to add a bit of semiotics to the mix.

But I know it won't fix any real problems, because the problem is that we are human, and as humans we're very good at reading between the lines, at being vague, clever with words, and don't need our information to be true in order to live with it. Computers suck at all these things.

This is where I'm having a semi-crisis of belief, where I'm not sure that epistemological thinking will ever get past the stage of basic tinkering with identity in which we create a false world of digital identities to make up for any real identity of things. I'm not sure how we can properly create proxies of identity in a meaningful way, nor in a practical way. If you're with me so far, the problem is that we need to give special attention to every context, something machines simply aren't capable of doing. Even the most kick-ass inferencing machines break down under epistemological pressure, and it's starting to bug me. Well, bug me in a philosophical kind of way. (As for mere software development and such, we can get away with a lot of murder.)

I'm currently looking into how we can replicate the warm, fuzzy impreciseness of human thinking through cumulative histograms over ontological expressions. I'm hoping that there is a way to create small blobs of "thinking" programs (small software programs or, probably more correctly, script languages) that can work over ontological expressions without the use of formal logic at all (first-order logic, go to hell!) that can be shared, that can learn what data can and can't be trusted to have some truthiness. Here's to hoping.

The next issue is directional linguistics, in how the vectors of knowledge are defined. The order in which you gain your knowledge matters, just like it matters how you sort it. This is mostly ignored, and the data is treated as it's found and entered. I'm not happy with that state of things at all, and I know that if I had been taught about axioms before I got sick of math, my understanding of axiomatic value systems would be quite different. Not because I can't sit down now and figure it out, but because I've built a foundation which is hard to re-learn when wrong, hard to break free from. Any foundation sucks in that way, even our brains work this way, making it very hard to un-learn and re-train your brain. Ontological systems are no different; they build up a belief-system which may prove to be wrong further down the line, and I doubt these systems know how to deal with that, nor do the people who use such systems. I'm not happy.

Change is the key to all this, and I don't see many systems designed to cope with change. Well, small changes, for sure, but big, walloping changes? Changes in the fundamentals? Nope, not so much.

We humans can actually deal with humongous change pretty well, even though it may be a painful process to go through. Death, devastation, sickness and other large changes we adapt to. There's the saying, "when you've lost everything, there's nothing more to lose and everything to gain", and it holds remarkably true for the human adventure on this planet (look it up; the Earth is not really all that glad to have us around). But our computer systems can't deal with a CRC failure, much less a hard-drive crash just before tax-time.

There's something about the foundations of our computer systems that is terribly rigid. Now, of course, them being based on bits and bytes and hard-core logic, there's not too much you can do about the underlying stuff (apart from creating quantum machines; they're pretty awesome, and can alter the way we compute far more than the mere efficiency claims tell us) to make it more human. But we can put human genius on top of it. Heck, the ontological paradigm is one such important step in the right direction, but as long as the ontologies are defined in first-order logic and truth-statements, it is not going to work. It's going to break. It's going to suck.

Ok, enough for now. I'm heading for Canberra over the weekend, so see you on the other side, for my next ponder.

7 October 2009

Stupidity of systems and debt collection

Today's tale is an example of stupidity put into system. Or, a system that has accumulated enough stupidity to grow sentience, and has become a cancer on society.

A preamble; in my distant, distant past (over 20 years ago now), I accumulated a bit of debt due to unfortunate circumstances, not too big for the world to get scared, but not small enough not to cause trouble. I lost a house over it, basically stemming from taxes on income the government of the country I was living in at the time thought I should pay when I, in fact, didn't have an income at the time (in their wisdom they demanded I had to prove that I didn't have an income, a bit like proving that something doesn't exist which is, in fact, impossible. And when you're arguing with a system, you're not going to be heard). It's a long story, one I'd rather try to forget, but suffice to say I have some experience of debt, debt collection and the various instances and how they work.

Since my distant past I try to help people make sense of these systems, mostly for minor things (like when you forget to pay a bill twice ... you'd be surprised how easy it is :), but sometimes also for larger debts that take time, patience and good negotiating skills to overcome. But I've done it again and again.

So, the other day we got a message on our answering machine from some person who's got the world's fastest talking voice, saying something like 'Hi, Ribbedy Rabbedy from Bing and Bong here (honestly, it sounded just like that!), calling on an urgent matter, call us back on !*$*!!!*$$%%!*$ (I had to go through the message over 10 times to get these numbers right) with reference number %*@%*@%*$$* (another 10 times to get this number), bye!'

I called back straight away, because we have a pretty good system in our house for bills coming in and getting dealt with, and I knew of nothing outstanding: everything gets put into the 'in' folder and dealt with at least three times a week, and once dealt with, it gets a big cross across the bill and 'paid' marked on it in large letters, and is moved from one side of the desk's folder drawer to the other before being filed safely. But when I called the number, I was greeted by a receptionist who didn't know who'd called me, couldn't find anything with my reference number, couldn't tell me quite what it was they do ('business services' yeah, that explains it) and in the end we gave up. I thought, if it is that important, they'll get back to me.

Didn't hear another thing for two weeks. Maybe they made a mistake, and were after someone else.

Then last night we got a call from someone with a thick Indian accent, probably some poor outsourced guy in Bangalore just trying to fill his quota, trying to explain to first my 9 year old daughter, then to my wife, and finally to me, about something or other. We just couldn't work it out, except big words such as "serious matter" and "debt", and this all smack in the middle of dinner-time. What the hell? It sounded more and more like a scam, as he was being very secretive, refusing to tell me anything of value, so I tried to just get out of him what company he was calling from, which was something like B'n'B, D'n'D, E'n'E, or any other combo of letters that go with ee-enn-ee ("what do you do? We do business services" Aaaargh!). With my daughter confused and my wife worked up, I ended the conversation by saying, in a stern but polite manner, that if there is a serious matter and you can't communicate properly, send us a friggin' letter.

Today came a letter. Well, a bill actually, accompanied with threats of "garnishee your wages, tax refund, bank account or *** or take you to court" with "urgency" and "serious" plastered all over it.

I paid the bill after going through our paperwork and not finding a 'paid' version of it, chalking it up as 'human failure to pop an old bill where it belongs for filing' (so, most likely my fault), and then the phone rang. Yup, another representative for this company bugging us. Having just paid the bill, I asked why he was calling, but because these guys (and no Indian accent this time, albeit there was a foreign element to it; being a foreigner myself, I detect these things) can only read from scripts, he insisted on talking to my wife. I said, no, you just called me on my phone, I'm her husband, is there anything we need to know that the letter / bill doesn't address? "If I could only talk to your wife, I could answer that question."

This is where it gets complicated, and I must invoke the powers of logic, inference and bloody common sense. The next 3 minutes went on with me stating "you called me, you tell me, my wife doesn't want to speak to you because you're rude, inconsiderate and mysterious about matters which could be cleared up in no time and you insist on being stupidly pigheaded because 'for legal reasons' that you can't explain further you can't explain it to anyone but her *if* there is or isn't anything of importance you need to tell her that the letter didn't."

"For legal reasons" is more often than not business speak for "we don't want to get into legal trouble ourselves", and is something I've been thinking a bit about lately. I've had phone calls from various companies we have services from, Telstra being one of them, who do courtesy calls to make sure everything is fine, or nag about some service they're pushing, or some such, and they all start by asking me for info to confirm that I am who I am. "For legal reasons."

So I am to tell a stranger, who is calling me on my own bloody phone and claims to be from Telstra or wherever, my personal info for verification of who I am? What is my option for verifying that they are who they claim to be? At present, there is none; this is a one-way street, because I am me, lucky to be their client, and they are whoever the hell they want to be. This whole identity conundrum has been bugging me more and more of late, and culminated today with this idiot (who in his defence was reading from a script) failing miserably to understand that in any conversation there are two parties; you and whoever you are addressing at the time. It may not be who you want to be talking to, but that doesn't alter the reality of it.

I ended the conversation by saying 'I'm going to say no' to his insistent nagging to talk to my wife. The letter and all this insane phone terror comes from Dun & Bradstreet (signed 'sincerely' Corey Smith, National Collections Manager, who I suspect has his name and scanned signature in many D&B templates), one of the bigger players in the debt collecting and reporting business (I've had slightly better dealings with their Norwegian branch in the past, but only marginally).

What's all this hubbub about, you may ask? 63$. Yup, that's right, 63 Australian shiny little dollars, and not only that, but CentreLink - an arm of the Australian government for family benefits, like child support, pensions and the like - had overpaid us the 63$, and now apparently wants it back the hard way, at any cost (and you can just imagine the cost of all this rubbish!). Instead of, you know, just deducting it from our next payment.

63 friggin' dollars. They should feel so ashamed of themselves. This is what you get when stupid systems grow sentience instead of a brain.

29 September 2009

Library Pontifications

Once in a while I get some email from people who ask me some questions or ask me to clarify something I've said in some setting. The other day I ranted on the NGC4LIB (Next-generation catalog 4 libraries) mailing-list about, uh, something or other. And I got email, which I answered, but since I got no reply I'm posting it here in a blog-edited form so that it doesn't go to waste ;
I think I am starting to understand your rants against the culture of MARC, and I'd probably feel offended if I knew what all of the above meant.
Hmm. Well, it wasn't meant to offend anyone. I guess if people thought they were hardcore into persistent identity management, then maybe they would feel I've either overlooked their hard work or don't think what they're doing is the right kind, or something.

I usually have two goals with my "rants"; 1. flush out those who already are on the right track, and make them more vocal and visible, and 2. if no one is on the right track, inspire people in the library world to at least have a look at it. I can do this because I have no vested interest in the library world as such; I cannot lose my library job as I'm not working for a library. :)
Naturally, to feel outside of the mainstream creates a crisis of confidence in one's abilities. What does it mean these days to say that one is a cataloger or that one works in tech services, and is it perceived as a joke for those on the outside? Oh yeah...they still produce cards. What do they know about databases?
Librarians are, from the outside, an incredibly gifted bunch of people who know what they're doing; they have been granted powers outside the realm of normal people (including professionals like software developers, believe it or not), and they know stuff we normal folks don't.

However, having been on the inside you get to glimpse the reality of an underfunded, underprioritized sub-culture of society that knows as little about the "real-world" as normal folks know of the library world. There is a great divide between them, and very little has been done to open up. The blame for this I put squarely on the library world (as the real-world is, well, real and out there), which for many years has demanded a library degree even for software development positions, and when we finally get there we are treated as second-class citizens because we don't have that mark of librarianship that comes from library school. It's a bizarre thing, really, and perhaps the most damaging one you've got, this notion that librarians must have a library degree, as if normal people will never understand the beauty of why a 245 c is needed, or the secret of why shelves must be called stacks, and so on.

One thing that has got me very disillusioned about the library way is philosophy. I deliberately sought out the library as a place to work because I have a few passions mixed with my skills which I thought were a good match, and one of the strongest passions was epistemology. One would think that if there was one institutional string of places that could appreciate the finer details of epistemology, it would be the libraries and the people within. That's what they concern themselves with, no?

Err, no. No, they don't. There's the odd person that ponders how an OCLC number can verify some book's identity, but these are very plain boring questions of database management. Then along came FRBR which does not only dip its toes into epistemology, but outright talks about it! The authors of it clearly had knowledge and wisdom about such things. So, one would think there was hope. Like, when it came out in 1998. That's more than 10 years ago. And people still haven't got it. How much time do you reckon it's going to take, and more importantly, how many years until it's way too late?

But no, RDA comes out of the woodwork and proves once and for all that there is no hope of libraries ever tackling the issues at either a philosophical or a practical level. Let me explain this one, as it sits at the core of much of my "ranting."

FRBR defines work, expression, manifestation, item, and these are semi-philosophical definitions that we're supposed to attach semantics and knowledge to. There are primarily two ways to do that; define entities of knowledge, or create relationships between entities. (Note these two basic ways of doing knowledge management; entities and relationships, as they spring up in all areas of knowledge representation)
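To make the entities-and-relationships idea a bit more tangible, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the class names, the `follow` helper and the sample labels are made up for the example, and only the four entity kinds and the three relationship names are taken from FRBR itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str   # "work", "expression", "manifestation" or "item"
    label: str  # free-text label, purely illustrative


@dataclass(frozen=True)
class Relationship:
    source: Entity
    name: str   # the semantics live here, in the named relationship
    target: Entity


# Hypothetical sample data, invented for this sketch.
work = Entity("work", "Hamlet")
expression = Entity("expression", "Hamlet, English text")
manifestation = Entity("manifestation", "Hamlet, some paperback edition")
item = Entity("item", "Hamlet, one physical copy on a shelf")

relationships = [
    Relationship(work, "is realized through", expression),
    Relationship(expression, "is embodied in", manifestation),
    Relationship(manifestation, "is exemplified by", item),
]


def follow(start, rels):
    """Walk outgoing relationships from start and return the entities visited."""
    path = [start]
    current = start
    while True:
        nxt = next((r.target for r in rels if r.source == current), None)
        if nxt is None:
            return path
        path.append(nxt)
        current = nxt


kinds = [entity.kind for entity in follow(work, relationships)]
print(kinds)
```

The point of the sketch is simply that the knowledge sits in two places: in what each entity is, and in the named links between them.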

Now, can you without looking stuff up tell me the difference between a work and an expression? Or between a manifestation and an item? Sure, we can discuss if this or that thing is an item or something else, back and forth, but is that a good foundation upon which to lay all future library philosophy? Because that's just what it is; a philosophical model we use to make sense of the real world. FRBR is confusing, even if it is a great leap forward in epistemological thinking. When it comes down to identity management, for example, it is right there in the centre of it (although it fails miserably at letting one thing be expressed through a multitude of persistent identifiers, like a proxy), but a lot of it focuses on the wrong part, the part that requires human cognition to make decisions about identity.

Anyway, I guess at this point all I'm trying to say is that there are glimpses of what I'm talking about in the library world, and I was attracted to it, I wanted to dedicate parts of my life to fixing a lot of what was broken in the real-world. I came to the library because they are the shining beacon of light in our society.

So, what happened?
Which is why I am interested in smarting up about some of these things. Where should one go for a decent but not mind-blowing introduction to the types of things you have described lately?
It's hard to say what will blow your mind, and what will not. But since you're a library type person I'm going to go out on a limb here, and assume you're a smart person. :) So, I'm going to assume that http://en.wikipedia.org/wiki/Epistemology won't blow your mind. So let's assume we're using the definition for "subject" as such ;
  • An area of knowledge, a topic, an area of interest or study
In terms of philosophy we usually expand that definition a bit wider (so it will also include most discourse and literature) but I'll try to keep it simple. First, a question:

"What does it mean that something is something?"

This is the basic question for identity, that something exists and that we can talk about and refer to it. Referring to things is a huge portion of what the library does, not only as an archive, but as a living institution where knowledge is harboured. We're talking about subjects put into systems, about being subject-centric in the way we deal with things. Just like our brains do.

Now, for me there are a few things that have happened over the last 20-30 years. The world has become more and more knowledge centric (we've gone from "all knowledge is in books" to "knowledge can be found in many places", and the advent of computers and the internet plays no small part in that), while libraries have become more book specific, more focused on the collection part rather than what the collection actually harbours in terms of knowledge (and I suspect this is because there are no traditional tracks within the library world for technology), probably because it's easier and fits better into budget driven government run institutions.

However, this isn't beneficial to the knowledge management part. Libraries are moving steadily towards being archives, but the world wants them to become knowledge specialists. Ouch. And so the libraries will be closed down when they don't deliver knowledge. Archives are what Google does best, and they're not that bad at harbouring basic knowledge. What hope in hell have you got then?

I'm running out of time right now, but feel free to ask any question and point to any of my wrongs, and laugh at it as well; I need the discourse as much as (I hope) you do. Let me just quickly run through that list with comments and pointers ; [editor's note : this is a list of things I felt the library world 'has no clue about', from my mail to the mailing-list]
  • No idea about digital persistent identification.
What happens to identifiers when people stop maintaining them? They lose their semantic and intrinsic value, and become moot. How many libraries maintain their age-old software? No, a more human, less technological means of resolving is needed, and when the world went digital the choice of multiple identities became not only possible but inevitable. Yet, when the library world manages identities as OCLC / LOC record numbers at the item level, things go horribly wrong and you cannot take what you've defined and learned into the philosophical space. Even if the OCLC / LOC numbers are maintained till the end of the world, they do not solve basic epistemological problems.
  • No subject-centricity.
FRBR does actually provide some, but it is not focused on the epistemological problems, only one of identifying the problem of identification without providing a mechanism (real or philosophical) for doing so.
  • No understanding of semantics in data modeling.
The AACR2 / RDA world is, in some definition of the terms, a data model. And between entities in data models there are semantics, meaning the relationships themselves, their names, roles and intended purpose. But you have to understand, as a human, all of AACR2 / RDA to be able to model anything with it; there's no platform on which to stand, there are no atomic parts you can use to build molecules and then cells and then beings. The whole model is, in fact, a cobbled-together set of fields without structure (and no, numbering them is not a structure :), and without structure there are only rules. And rules without structure are only human-enforceable.
  • No clue about ontologies, inferencing, guides by analogy
This is a stab at what the Semantic Web people are doing. They have a long background in AI and knowledge management, and if you guys were at least on par with that group, there could be some better understanding of the issues. The SemWeb crowd understand a lot of first-order logic, inferencing, analogy, case-based reasoning, and so forth, all stuff you need to have computers understand a tad bit better how your data is cobbled together, how it all interacts, how entities and relationships (remember those? :) are mapped.

I should of course make a note here that I think that the SemWeb efforts are mostly wrong, and that they could learn an awful lot from librarians in the way to deal with collections and access, but that's a different discourse for some other time. :)
  • no real knowledge about collection management ( ... wait for it ...) with multiple hooks and identities
I was actually hoping people would jump on this one, getting offended that I said they had no real knowledge of collection management (which is their forte, it is what they do!), but I guess they saw the hook and line of *identities*, and jumped over it. Dang.

It's all about the identity of what you are collecting. Crikey, publishers haven't even got ISBN to work (how many times do I put in one ISBN to get a completely different book ...), and one would think that would provide hints as to why this is hard, and perhaps what to do otherwise. Hmm.

-- end of mail except some more personal ramblings not fit for generic consumption --

24 August 2009

What event model ontology?

Hmm, it seems that no one has blogged, tweeted or mentioned my blog post in my last plea, which I'm quite disappointed with. However, I'll chalk this one up to the complexity of what I'm trying to accomplish, and my failed attempt at explaining what it is.

In the meantime I've been working at it, converging various models from all sorts of weird places (anything from WebServices and SOAP stacks, to operating systems like Linux, to event models in Java and .Net, to more conceptual stuff in the Semantic Web world), but boy, you can tell that we live in a world shaped by iterative, imperative paradigms of approaching the software world.

One thing I learned quite early was declarative and functional programming, introduced to me, of all places, through using XSLT many years ago. It may not be the most obvious place to find it, and this is one of those hidden gems of the language which still doesn't enjoy too much of a following. And no wonder; people come into it from the imperative stuff that dominates the world, polluting us all with filthy thoughts of changing variables (at least in Scala you can choose between var and val), functions that aren't truly functional, and the classical idea in object-oriented programming of a taxonomical structure that doesn't hold up to scrutiny.

Let me clarify that last point. Why are we doing this stuff? Why are we creating computer programs?

To solve problems. And who are we solving problems for? For humans. It's the classical example (albeit extrapolated) of garbage in, garbage out. I've talked about this in the past a lot, about the constant translation that happens between human and machine, and how we are creating translation models in both worlds in order to "move forward" and solve problems better. But this exercise becomes increasingly harder as our legacy grows, so trying to teach functional programming to people who don't understand certain basic principles of Lambda Calculus is going to be hard, just like it's hard to teach Topic Maps to people who live in a SQL world. Or like it's hard to teach auto-generating user-interfaces to a user-interface developer.

These are usually called paradigm shifts, where some important part of your existing world is totally changed as you learn some other even more important knowledge. You must shift your thinking from one way to a rather different other. And this is hard. Patterns of knowledge in your brain are maintained by traversing certain paths often, and as such strengthening those paths (following the pattern that an often travelled path must be the right path). But if the path is wrong, there are some pretty strong paths you need to unlearn. Damn, that is hard! Which is why I urge you to try it out.

I'm currently using Topic Maps, human behaviour driven ontologies for auto-generating applications and user-interfaces over functionally complete models of both virtual and concrete human domains, all with temporality and continuous change as the central paradigms. Yeah, pretty hefty stuff, and I've spent years trying to unlearn stuff I learnt in the years before that. And those years were unlearning some other stuff before that. My whole life has been one huge unlearning experience, and I don't think any other way conceptually grasps the beauty of life better; nature and life both are in perpetual change. Needless to say, I'm enjoying every single crazy second of it!

But back to my event model ontology. I've learned one important thing in all this; Sowa has suggested a shift from logical inference to analogy, and this coupled with the OODA loop can create an intriguing platform for knowledge management and an eco-system for software applications. I'll let you know more as things progress from here. I'm excited!

And as always, I'd love to hear your comments on all of this. I beg you. Again. :)

14 August 2009

Emotional thunderstorm from the Ukraine

I have to share this one with you. My grandfather, Hans Adolph Johannesen, fought during WWII as part of the Norwegian underground resistance to the German invasion, was captured and put in prison for many years (1 year at Grini, Norway, and 2.5 years in Germany, till the end of the war, where he was imprisoned with and befriended former prime minister of Norway Trygve Bratteli). All of this many years before I was even born. But he shared the stories and the pain and the heroism. I recorded it, I interviewed him, I lived with him. And then he died, almost 10 years ago.

And right now all those emotions came washing over me, a thunderstorm, pouring rain, because I was silly enough to watch this amazing - in the true sense of the word! - artist's retelling of such pain and memories, and in the least likely of all places ;
Ukrainian sand artist proves that reality TV's got talent: "James Donaghy: Kseniya Simonova, the winner of Ukraine's got talent, has become a YouTube phenomenon by telling stories through sand animation. Who needs Susan Boyle?"

5 August 2009

Can I ask you a favour? (Does social media actually work?)

Hi everybody. Could I ask you a favour? I'm not getting much response to my quest for a unified software architecture ontology, so could I humbly ask you to blog, tag, link or otherwise gossip about my previous post on the matter? I would really appreciate it, and I promise I'll share my findings with you all.

(My subtitle "Does social media actually work?" is a blatant attempt to get circulation going by mocking the whole debacle which I try to, ahem, you know, promote. Thanks.)

31 July 2009

Boundless?

Hehe, had to giggle a bit when reading this (in my never-ending quest for semantic mapping of software systems architecture) ;

"The Open Group is a vendor- and technology-neutral consortium, whose vision of Boundaryless Information Flow™ will enable access to integrated information within and between enterprises based on open standards and global interoperability." (My emphasis)

To embrace "open" you would think that getting to the info would be, you know, just open, but no. There are free chapters available for you to see, just to make sure, you know, there's an engine under the hood; all you need to do is to jump through a few hoops, register, and ... ugh. So I bit this bullet, registered and got the introductory guide to the TOGAF framework, but it reads like any other fluffy "use our framework, and all shall be well in the world" vendor selling you miracle cures out there. Disappointing, really. I could get the full thing if I apply for a 90-day personal license, but I don't think I'll bother as I react badly to fluff, and I definitely get an allergic reaction to having to register, revealing personal info and such, just to get to read their fluffy bits. What gives?

I think these people have completely misunderstood what "open" means, which is a shame considering that the content might be useful. But I say only might, as I would have no idea. Phft, open, my ass.

29 July 2009

Missing ontological serenity in the world of software systems architecture

Updates: See bottom, but also this question on StackOverflow.



Ok, so let me say from the get go that I'm a little bit upset. Well, maybe angry and bewildered more than upset, but nevertheless not happy. And it all has to do with the dingbat way we architect our various computer systems. So, yeah, quite generic and not really something we can do much about.

Let's rehash. I'm a SOA junkie, an EDA pimp, and I hate by default the bullshit in any Enterprise camp that insists its way of doing things must be right. And by SOA, I don't mean no ESB bullshit, I mean a hard-core focus on services for architectural means. I build ontologically driven systems, and care deeply about semantics where most others don't give a monkey's bottom.

Lately I've had to rehash my knowledge on plugin architectures (both implementation specific and theoretical), how to modularise complex pieces of software, and implement an event-driven platform on which to run my systems. So I've been snooping around, and there's a ton of models and architectures to be found. But being found is not the same as finding what you're after, especially as I have a few criteria to my search; I want to find something that's generic, simple (but not simplistic), elegant (as in, does not suck) and extendable, an architecture that's event-driven, modular and open. Nothing. I've found nothing. Of course they all claim to be amazingly fantastic and super and great, but looking under the hood, if allowed, reveals yet another statically created shared library stack with some hooks for your software to use, using some misnomer like SOA or EDA or any of the hundreds of other Enterprise bullshit terms out there.

So, I set my goals lower in the hopes of finding anything of value, and even went and asked real programmers what I thought was a simple question, making it specific enough to hopefully muster some replies. Nothing. It seems everybody's got their own way to handle their own little piece of the universe, that people cling to their silos of comfort or something, afraid of what might happen if we all agreed on something. Even when you dig into large architectures, like my own Linux Kernel which I'm using to write this post, there are tons of layers and shared libraries cobbled together in a way that does the job, ok, but doesn't make it, in my eyes, an easy job to do, elegant to extend or easy to change.

I guess I should clarify. I'm knee-deep in ontology work for software systems architecture, a field that's almost chemically free of any active community, has a few scattered experiments that went nowhere (and I'm tempted to put ADL in that category, too), a few papers here and there that talk about it in very generic terms (either as abstracts to academic stroke sessions, or a white paper claiming to be the second coming of Jeebus!), but as to hard-core practitioners like me who want to inject a Topic Map with events of given types that match certain ontological expressions and Topic Map fragments of certain types of architectural patterns, tough! You're on your own, kid.

So, what am I after?

Well, many things, but I'll try to be a bit clear here. I've cut down on my wants, too, in order to try to find others out there doing similar things. So. I'd like to see a simple event-driven software stack that scales ontologically, and isn't bound to any technology, company or otherwise religious platform. This means that the stack with its names and values works just as well for a small plugin as it does for a larger system like an extra-OS or a cloud, works for potato-peelers as well as online booking agents, database connection pools and kernel space memory managers, but can also grow and shrink with need, in such a way that all other parts of it can find out what those changes are when they need to. This digs into creating an upper ontology for information science, of course, but more importantly it means I'd like to plug software into various parts of a stack, so that everything - and I mean everything! - is an event listener. I know some micro-kernels work in similar ways, though highly statically bound, but regardless these ideas are way past the cradle stage by now and need greater exploration in the real world.
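As a rough illustration of "everything is an event listener", here is a minimal Python sketch of a bus where plugging something in or out is itself an event, so other parts of the stack can find out when it grows or shrinks. This is a toy under my own assumptions: the `EventBus` class, the `stack.changed` event name and the potato-peeler listener are all hypothetical and not taken from any existing framework.

```python
class EventBus:
    """Toy event bus: listeners are plain callables taking (event_type, payload)."""

    def __init__(self):
        self._listeners = {}

    def subscribe(self, event_type, listener):
        self._listeners.setdefault(event_type, []).append(listener)
        # Growing the stack is announced as an event too.
        self.publish("stack.changed", {"action": "added", "event_type": event_type})

    def unsubscribe(self, event_type, listener):
        self._listeners[event_type].remove(listener)
        # So is shrinking it.
        self.publish("stack.changed", {"action": "removed", "event_type": event_type})

    def publish(self, event_type, payload):
        # Copy the list so listeners may (un)subscribe while being notified.
        for listener in list(self._listeners.get(event_type, [])):
            listener(event_type, payload)


bus = EventBus()
changes = []
peeled = []


def watch_stack(event_type, payload):
    changes.append((payload["action"], payload["event_type"]))


def peel(event_type, payload):
    peeled.append(payload["count"])


bus.subscribe("stack.changed", watch_stack)
bus.subscribe("potato.peel", peel)
bus.publish("potato.peel", {"count": 3})
bus.unsubscribe("potato.peel", peel)
bus.publish("potato.peel", {"count": 1})  # nobody is listening any more
print(changes, peeled)
```

Nothing here is bound to a platform or a naming scheme; the hard part, of course, is agreeing on the names and what they mean, which no amount of plumbing solves.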

So when I download an open-source package of sorts and try to find out what its stack of operation looks like, why is this information so hard to find? Or compare the Java event model and the .Net model. Or OSes. It seems it's very hard to agree on these things, but I suspect the state of things isn't because they've tried and failed, but because they haven't tried. It's a big world and this is a big field, yet this has not been tried in any meaningful way.

Sure, the technologies promoted through OASIS, ECMA and W3C in themselves have various solutions and try to bind stuff together in a coherent way so as not to confuse us too much, but even within their own stacks of proposals and standards there are huge gaps, great leaps of faith, and generally no clear direction. Even W3C, who push the semantic web movement, haven't got anything to say on the matter. It's starting to drive me bonkers.

Ok, I'm done. My steam has gone out, but I'm not feeling any better. Off to do my own thing, like the rest of them. :)




Update: Ok, it seems I'm not getting my message across. Let me create a simple (and wrong) example ;
  • SOA : Start
  • SOA : Configure
  • SOA : Map
  • ENV : Start
  • ENV : Configure
  • ENV : Map
  • APP : Start
  • APP : Configure
  • APP : Init
  • APP : Connect
  • APP : Perform
  • APP : Teardown
  • ENV : Teardown
  • SOA : Teardown
Here we've got a list of application session events where SOA is, er, SOA, ENV is the "environment" (whatever that should mean), APP is an application, and so forth. This list should be HUGE! Think of all the interesting events one could generate from when you turn the computer on until someone gets Rickrolled on the other side of the planet! I want to map environments, systems and eco-systems, with labels. In some regards it's an enumerated list of points that any computer system traverses on its path from being loaded into memory until it leaves it. And possibly then some.
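The (simple and wrong) list above can be written down as data and poked at. Here is a small Python sketch that records the session as (layer, event) pairs and checks that the layers nest like a stack, meaning a layer that starts later also finishes earlier. The function names `layer_spans` and `is_nested` are my own inventions for this illustration; only the layer and event labels come from the list.

```python
# The session from the list above, as (layer, event) pairs in order.
SESSION = [
    ("SOA", "Start"), ("SOA", "Configure"), ("SOA", "Map"),
    ("ENV", "Start"), ("ENV", "Configure"), ("ENV", "Map"),
    ("APP", "Start"), ("APP", "Configure"), ("APP", "Init"),
    ("APP", "Connect"), ("APP", "Perform"), ("APP", "Teardown"),
    ("ENV", "Teardown"),
    ("SOA", "Teardown"),
]


def layer_spans(events):
    """Return {layer: (first_index, last_index)} for each layer in the trace."""
    spans = {}
    for index, (layer, _event) in enumerate(events):
        first, _last = spans.get(layer, (index, index))
        spans[layer] = (first, index)
    return spans


def is_nested(events):
    """True if every layer that starts later also finishes earlier."""
    spans = sorted(layer_spans(events).values())
    return all(outer[1] > inner[1] for outer, inner in zip(spans, spans[1:]))


# A deliberately broken trace: SOA tears down before ENV does.
BROKEN = [("SOA", "Start"), ("ENV", "Start"), ("SOA", "Teardown"), ("ENV", "Teardown")]

print(layer_spans(SESSION))
print(is_nested(SESSION), is_nested(BROKEN))
```

This only checks ordering; it says nothing about what the labels mean, which is exactly the ontological part that is missing.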

I want to map the software system world! I want to know what people call their various points on the software stack, what they call their events, how they see them work together, how they foresee workflow interactions, how they define system integrity, thoughts on implementation, named entities, the works.

I can find heaps of this stuff, but none of it is globally agreed upon, it's all tucked away in projects or companies, it's their own version of how things should be and what happens. Even big players such as Sun / Java and Microsoft / .Net have very different event models and ontologies, and they are not compatible in any meaningful way. I would expect some parts of CORBA had done work in this area, but what I've seen is very transaction oriented, where clients already know the ontology and use CORBA to travel through rather than be defined by.

As an example, the closest I've found so far in the realm of mapping machine-parts ("machine" here is "software systems") is the Open architecture computing environment, which tries to define the most important parts of software systems (although the final version was released in 2004 ... can these things be considered final? Where are the clouds?), but lacks the ontological and semantic definition, has no event or message structures or standards, nor does it have any notational value or end-points which, admittedly, I could spend the next couple of weeks doing, but let's see what else is out there.

Making any sense?


Update: To be even more specific, trawling through IPC is really what I have been doing for the last few days, but getting to the core ontology of all of this is soooo painful. Surely someone out there has done something like this? I've even gone through POSIX trying to glean what nuggets I could find, but the system level of that beast is just so low it's not funny. Promising is the DBus architecture and event stack, but this again is very low-level, covers only a fraction of the software systems, and is littered with duplication of complexities.

Anything else I should hack at? Yes, I've gone through most of the WS-* stack as well, digging into past knowledge I had hoped to never see again, but here, as with most other technologies out there, they seem to be obsessed with being so flexible that they forget to be defining. So, we get a lot of scaffolding and frameworks that you can extend and define your stuff in, but no clear definitions of what the world looks like. Even an obvious contender like WS-Events and the lesser-known WS-Event from Hewlett-Packard have nothing more than a functional approach to defining and registering events, and that's it, leaving the defining to some semi-ontological layer.

But I'm still convinced lots of people have done this sort of work, especially in these Semantic Web heydays. Browsing through the thousands of OWL ontologies in Swoogle for 'software architecture' (which doesn't really cut it, but is the closest term that yields results) leaves me just overloaded. Sure, the OpenGroup SOA ontology, for example, does provide me with, eh, lots of interesting stuff, but again it's a special domain (SOA, obviously) using a certain moniker (service orientation, which sucks when you want to define events across operational stacks).

Argh! Can you tell I'm going bonkers?

28 July 2009

A submersive state of mind

Sorry for the low blogging of late. I've hit the opposite of a blog-drain; I'm in a state where there's simply too much to write about, and instead of just exploding with it I retreat into myself, thinking I should mull on it a bit before I pop it out. Today is one of those popping days, and I want to talk about something that has been new to me for the last 5 months or so and has proven itself to be a mixed bag of pros and cons ; working from home.

As of the beginning of this year I started to work for Free Systems Technology Labs, an Indian company bent on doing funky stuff the right way with the right people (I had to say that, didn't I?), and as such I now live in my wife's home-town of Kiama, a couple of hours south of Sydney, Australia. We moved here from Norway at the beginning of the year for a number of reasons, but being closer to (my wife's) family, who also have little kids, and the nice climate were two strong contenders (but we're still talking about moving back to Norway someday... or somewhere else entirely, wherever fate and lust drive us, really).

[Image: Kiama at dawn]

Working from home can be boring, I know, but we're actually living in an old two-storey farmhouse built in the 1880s, verandahs all along the house, with some of the best views in town. Here's a pic I took last evening before finishing up work for the day. Yes, it can be hard concentrating on hard-core ontologies and magic Tuple-stores when you can stare at the sea for hours, and it doesn't help either that there's a number of comfy chairs with fluffy pillows right outside my French-doors that lead out to the verandah. Especially on a nice sunny day. Like today.

So, in order to break up my day I've got a schedule of sorts. First, after the kids are out the door for school and breakfast is tidied up, if I've still got tea left I bring it upstairs to my office. Now here's a crucial part of my day; do I sit down and get started, or go and get dressed? Ah, the number of times I've written important emails or talked on Skype in my underwear. Well, the sensible thing to do - and, really, what I try to do every morning - is to get dressed. I know it sounds pathetically lame to lament over this, but it's so easy to just get going. I'm not going anywhere, right?

[Image: Amaki Cottage Cafe]

Well, that's the thing. Part of my schedule, which I don't do absolutely every day, but every so often, is to go to my local coffee-fixer-upper-place. It gives me an excuse to get dressed, and makes for a nice 2 minute (!!) stroll down past the picturesque Kiama town-houses, all the way to the bottom of the hill to get my double-strength coffee, double-strength chocolate Mocha. Some days when I do my walk at lunch time, I might even get one of their amazingly yummy salmon on Turkish-bread, and just stroll another minute down to the park, sit in the sun on the grass, and enjoy the serenity.

Of course, too much serenity is kinda boring, especially when my mind is racing with ontologies, event-models, dual-stored Tuples, or worries about whether I need to consider using Bessel functions for subject equality in the Topic Maps Reference Model. It can get a bit busy in my head. Thankfully when the day is over I've got a way to kinda deal with all that, but during the day itself it's sometimes hard to focus on just one thing and one thing alone. So I need to schedule even such things.

I do have a schedule, though. 9am - 10am is the time for all things not specifically related to work tasks, such as emails, news, blogs, etc. From around 10.30am till about lunch I do the more practical things about my work, such as coding, writing, testing, downloading and installing, meddling, fiddling, prototyping and breaking my machine. Then I have lunch, quite often with my wife, who is downstairs somewhere chasing Samuel around, trying to stop him from getting into stuff he shouldn't. And then after lunch, at about 1pm, I fix my machine and do more thinking-related stuff, hold meetings (mostly by calling through Skype at around 2-2.30pm when India is getting into the office), write emails, and try to come up with plans, thoughts for the next day, and scheming in general.

I try to follow this pattern as much as I can. It's a lonely job in many ways as I don't have that office intermingling that I love. So, to keep myself sane I go places. I often go to the Kiama Library where I meet up with Tim, the crazy-fun-beardy local IT librarian. I sometimes meet up with a few guys I know around the place (not that many) and even got to meet up with Murray Woodman from Sydney the other day. As often as I can, at least once or twice a week.

And then, just like in over a week or so, I go to India (Bangalore mostly, but sometimes Mumbai) for 10-12 days to do an intense stint of socializing, hacking, planning, talking, planning, teaching, drinking excellent Chai tea, more planning, around the clock until I don't know what day it is (which suits, given the jet-lag). Then back home for another 2-3 months of working from home again.

[Image: Jones' Beach, Kiama]

It works. It's not perfect, but it sure beats living a crazy stressful life in a big city where you don't have control over your surroundings. Here, if I'm stuck with something and need a break, I put on my slippers, open the door, and walk 4 minutes to the beach. All is well, and when I get back I know for sure that implementing the Bessel function in my Topic Maps Reference Model is an exercise for the modeler and the TMDM engine, not for the technical implementation itself. Problem refreshingly solved.

Oh, and do come and visit. I'll buy the coffees.

30 June 2009

Sorry, moderation switched on

Sorry everybody, but I've been attacked by spammers of late, and have had to switch moderation on, at least for now, but I'm terribly liberal and will approve every single message that talks badly of my, uh, bum. When things calm down again I'll turn it off I'm sure, but I seriously wish Blogger.com had a better comments system (or even a better way to kill spam from an infected site; the current way is just absolute rubbish and painful!). Or maybe this is another sign from below to switch to WordPress which I've got a half-finished Topic Maps plugin for and integrates against my shiny new xSiteable Framework 3.0. Hmm.

24 June 2009

Linux sound-system sucks!

Yeah, so I've been running Linux / Ubuntu now for about 4 months, and it has been a pleasure almost the whole way. I've had to dabble in Windows from time to time, especially "supporting" our two other Windows machines in the house, but every time I meddle with them, I'm extremely happy to return to my Ubuntu Jaunty 9.04. For the most part.

There is this one area which sucks, though, and I mentioned it in one of my previous reports that I couldn't get the microphone to work. Here's the low-down on this whole thing ;

Ubuntu comes out of the box with ALSA (Advanced Linux Sound Architecture) and PulseAudio (a client/server system for sound over networks, amongst other things), and the two connected together in GNOME (the default desktop environment it uses) should in theory work. But the forums and intertubes are abundant with problems relating to sound setup, anything from sound not working at all, some aspects not working, crackling or garbled sounds, and so on. Because Linux is open-source and has the advantage of "so many options", the disadvantage of "so many options" also becomes quite clear.

When you write your software you write for either OSS (Open Sound System; try Googling for OSS and they'll translate it into Open Source Software ... AAARGH!) or ALSA, and both packages have wrappers for each other, but it means that there's a multitude of ways to reach that haven of well-supported sound. We can throw ESound and GStreamer and JACK into the mix for further confusion as well.

So, one particular part of these options was that the Linux kernel guys decided to throw out OSS and put in ALSA instead, at around Kernel version 2.5.x or so. The reason was mostly that OSS v.2 was in wide use as the developers entered into a lengthy v.3 rewrite, and the company that sponsored them changed the license (dual license, one branch for GPL [often lagging], another for a sellable version). Then they scrapped v.3 to start on the new and improved v.4, which was to be fully GPL'ed and all things sorted out. In theory, but in the mean-time the world moved on, and ALSA became the standard.

So, back to my story. My sound system was finally working, except for the fact that my microphones (plug and internal) didn't work. I tried it all, including downloading and manually compiling and modding the Kernel with the latest version of ALSA drivers and libs, without any luck. (Well, I upgraded ALSA nicely, but alas no microphones). I manually tweaked the modprobe configuration files, upgraded and updated any ESound, GStreamer or ALSA thing I could find, tweaked their linked configs, reset them, autodetection and manual stuff, on and off, on and off. It drove me nuts!

So in the end I got a whiff of the OSS story. First it was simply dismissed because it was "the old system", but as more and more people reported success with the latest OSS v.4 where ALSA failed, I thought I'd give it a try. I removed anything ALSA and PulseAudio (and frankly, not that many people have a need for PulseAudio, and even fewer are able to set it up correctly, so why make it default?), installed some dependencies for OSS, downloaded a .deb package (trickier than it sounds), restarted, installed, configured (set up GNOME with the OSS sinks), and ...

Microphones work! They friggin' actually work, and after only 4 months I can make Skype calls which I need for work. But, as sound in general works fine - and here's the punchline! - now the left channel has crackling when the sound reaches a certain low / mid threshold (no, the mixer is set correctly; this is weirder in that it's only in the left channel. It's not overdrive, but some clicking-ish noise), and I can't friggin' get rid of it!

And yeah, you try searching the intertubes for Ubuntu 9.04, sound and OSS v.4 where "OSS" is treated as "Open Source Software" by Google. Bloody smart-arses.

Sure I love Linux, but I friggin' hate the Linux sound system. And I hate Google a little bit, too, this time.

5 June 2009

My creative past

Moving to a different country, away from old friends and family, can make you somewhat nostalgic. Add to that playing my music collection at random and bumping into something that either has memories attached or, as in this case, blows the memories meter, and I can't not want to share and talk about it.

Many years ago now I had a music studio down-town Oslo (near Børsen, top-floor where that great Indian restaurant is) which I shared with an old musical buddy of mine and a movie production company (more on that later, I suspect). There I laid down the foundations of much of what was to become my music and musical style for years to follow. It was sitting in this loft office in the murky hours of the night that I first met my wife online in one of the few chat sessions I ever did back in those days, chatting with Julie who was in the Australian bush near Bowral in the Southern Highlands. Instead of continuing my musical and movie career, I chose to go to Australia to meet the woman I fell in love with. And 10 years later I'm married to her, got three kids, a house and a Volvo S70 station-wagon and live in Australia. Things certainly took a different path.

But before my married life happened, there were a few years of back and forth and the pain of separation from both Julie and my first daughter, Grace. Two years in which a lot of my frustrations and lonely nights after long working days were filled with the remnants of my old music, and in this brew I concocted a whole slew of stuff. And some of that old music I stumbled upon at random last night, and I've got three tunes I'd like to share.

I popped them into my MySpace, and they are ;

Flying Through - an alien observing life on earth. Well, probably an alien. Could be anything or anyone observing us. This tune is somewhat in the style of Klaus Schulze, and features some well-planned synthesizer counterpoints, and probably most importantly my old friend Bjørn Rummelhoff-Hansen (my old band-mate from Sundrunk) on guitar. It's dedicated to another friend of mine, Øystein Aarseth, who turned me on to old-school synth music. Oh, and if you followed that WikiPedia link, don't take the bad stuff written there as absolute truth; there was more to Øystein than could fit into his act (our shared passions were classical and old-school synth music, protagonist philosophy and port-wine, stuff rather far from the public image he put on).

Sexy DJ - Back in the days when MP3.com was a place of good music and a fantastic community, singer/songwriter Nadine Renee started a cool competition where she released the vocal tracks from her song "Sexy DJ" to the hordes of the interwebs, saying "let's see what you can make of it", and my contribution won the Rhythm'n'blues category (although I think this is rather far from rhythm'n'blues). She sadly passed away from complications during childbirth a few years back, so I note for posterity that the whole competition was as fantastic as she was good-natured and kind. This tune happens to also feature my own dad on saxophone, Milos Ocasek.

Dunish - I'm a Dune-fanatic. If Frank Herbert was a woman, I'd have a crush on her for sure. This music is like a collage of musical themes and styles, and was for me an exercise in music production as I was working on film music at the time. If you loved the movie by David Lynch, you'd hopefully enjoy this one as well.

Update: added a song ;

Bekk - What e-business consultancy company with respect for itself doesn't have a theme song? My old company in Norway, Bekk Consulting, is truly the most rockin' gig in town. This is a tune I made in the wee hours of the night for no apparent reason, featuring my dad on sax, my good friend Hanne Svenningsen on "vocals", and Bjørn Rummelhoff-Hansen again on guitar (what would I have done without you?).

Let me know what you think of my MySpace adventures of the past.

3 June 2009

03.06.09

Wow, what a cool sequence of numbers is that?

03.06.09

And that's today's date, a very special day indeed. Expect me to meddle and go slow and enjoy family and friends, and I'll see you all on the other side tomorrow. (And given my wife's fantastic treatment I'd better start planning something seriously cool for 09.09.09. Suggestions welcome!)

2 June 2009

Successful crap

It never ceases to amaze me, peeking into various successful open-source projects and seeing the innards, how they even got this crap code past their own pride. Yes, I need to vent.

This weekend I was head-down in various content-management systems and their ilk, digging into anything from WordPress to Habari to Joomla to Simple CMS (which you would expect to be simple) to eZ publishing. All of them had rather abysmal code scattered throughout (with eZ publishing being the better of the lot), oddities, and all the warts you'd expect of systems cobbled together, where their success is more an afterthought.

But hang on, I can't criticise systems for their organic growth. But I can criticise them for not doing much about the trouble that comes from it. Sure, I understand that rewriting core parts of a system requires a huge ego and nerves of steel, and I understand how "if it ain't broke, don't touch it" rules at the end of the day, but surely the end goal is good software, right?

But.

It's crap. It's rubbish. And more importantly, as I can put up with crap if it gives me opportunities and love, it hinders innovation, flexibility and, well, love.

Let's pick on a random contender that I worked heaps with last week, WordPress. If we cut away comments, this is its index.php ;
define('WP_USE_THEMES', true);
require('./wp-blog-header.php');
If we look into wp-blog-header.php, here's what we get ;
if ( !isset($wp_did_header) ) {
    $wp_did_header = true;
    require_once( dirname(__FILE__) . '/wp-load.php' );
    wp();
    require_once( ABSPATH . WPINC . '/template-loader.php' );
}
Ok, so let's peek into wp-load.php, and we find about 20 lines of code and more includes. Don't you just love playing hide and seek with files to find out where it's going and why? These things may have come around from shuffling the organic growth of the system into other files, leaving them with these sad little snippets that are hard to get an overview of and take a slight performance toll, too, as well as eating away at your sanity and good programming ethics. And they all contain that newbie error of putting this at the end of every business logic file ;
?>
It's not needed, and if you like your whitespace under reasonable control, you're stuffed. Not a big crime, mind you, but just one of those niggling little things. Then you've got functions and objects, some with the wp_ prefix, some without, business logic in files called wp-settings.php, repetition of code everywhere, hundreds of DEFINE's scattered about in different files, and so on and so on. (And yeah, I should contribute as it is open-source and all that, and I'm actually writing an embedded Topic Maps engine as a plugin, so we'll see)
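To make the whitespace point concrete, here's a minimal sketch. It assumes two throwaway include files written to the temp directory (the file contents and the captured_output() helper are made up for the demo, not anything from WordPress). One file ends with a closing tag followed by stray blank lines, the other simply stops after the code.

```php
<?php
// Hypothetical demo files: nothing here comes from WordPress itself.
$withTag    = tempnam(sys_get_temp_dir(), 'inc');
$withoutTag = tempnam(sys_get_temp_dir(), 'inc');

// Logic file that ends with a closing tag and a few stray blank lines.
file_put_contents($withTag, "<?php\n\$a = 1;\n?>\n\n\n");

// Same kind of logic file, but with no closing tag at all.
file_put_contents($withoutTag, "<?php\n\$b = 1;\n");

// Helper (made up for this sketch): include a file and return whatever it printed.
function captured_output($file) {
    ob_start();
    include $file;
    return ob_get_clean();
}

$leaked = captured_output($withTag);
$clean  = captured_output($withoutTag);

// The first file leaks whitespace into the output, the second leaks nothing.
var_dump(strlen($leaked) > 0);
var_dump($clean === '');

unlink($withTag);
unlink($withoutTag);
```

That leaked whitespace is what bites you later, typically as "headers already sent" warnings when something further down tries to set a cookie or redirect.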

But I'm not here to pick on WordPress per se. It's more about how this organic growth hinders innovation and opportunities. So let's talk about frameworks. All little pieces of code together form a framework, so we're not necessarily talking about a framework of disjoint classes or functions that aid developers in making stuff, like the Zend Framework, or Cake, or Symfony, or CodeIgniter, or 1000 others out there. Well, kinda; all those little things the app is made from also constitute a framework, but it isn't disjoint nor refactored or synergetic or stable or well thought-out as you get in a more established framework, but nevertheless that's what it is.

PHP itself is a framework, of course, and most PHP frameworks are wrappers and added code to make PHP act more like a coherent system, fixing inaccuracies and bugs and niggles, stabilizing behaviour and increasing the need for spending hours and hours learning some new paradigm you can't use elsewhere.

Hmm. Where was I? Oh, right; every app is a framework. But when the framework isn't a particularly good one, where the pieces are either too fragmented or too disjoined to make any sense, making stuff in that framework is going to be a pain. Like WordPress is a pain. And before you know it we get to the next rewrite, and this time we'll get it right, although we need to keep our legacy intact, and hence we write hacks on top of fresh code to drag it back to the hole it came from. The data model needs to be rewritten, but it won't because "well, it works, doesn't it?" and the framework needs to be rewritten, but it won't because "well, it's not broken!"

When I can't replace MySQL with something else, that's a hindrance. If I can't change the way tagging works, I can't move forward. If I can't change the URI handling, I'm stuffed. If I can't use portions of it to write something else, nothing new will come. Sure, there might be a plugin architecture somewhere, perhaps a simple event model that one can tap into, but if I can't replace the model in which it operates I can't make it more beautiful. I am forced to accept the model and framework in which WordPress sits.
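Here's roughly what I mean by being able to replace the model, as a rough sketch only. Every name in it (PostStore, ArrayPostStore, Blog) is hypothetical and made up for illustration; none of it is WordPress API. The point is just that the app talks to an interface, so the storage behind it can be swapped without touching the rest.

```php
<?php
// Hypothetical storage contract: the app only ever talks to this.
interface PostStore {
    public function save($id, array $post);
    public function find($id);
}

// One possible backend: plain in-memory array. A MySQL-backed or
// tuple-store-backed class could implement the same interface.
class ArrayPostStore implements PostStore {
    private $posts = array();

    public function save($id, array $post) {
        $this->posts[$id] = $post;
    }

    public function find($id) {
        return isset($this->posts[$id]) ? $this->posts[$id] : null;
    }
}

// The app itself (again, a made-up class) never knows which backend it got.
class Blog {
    private $store;

    public function __construct(PostStore $store) {
        $this->store = $store;
    }

    public function publish($id, $title) {
        $this->store->save($id, array('title' => $title));
    }

    public function title($id) {
        $post = $this->store->find($id);
        return $post === null ? null : $post['title'];
    }
}

$blog = new Blog(new ArrayPostStore());
$blog->publish(1, 'Successful crap');
echo $blog->title(1), "\n";
```

Swap ArrayPostStore for something else that implements PostStore and Blog doesn't change a line. That's the kind of seam I'm missing.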

And it sits quite squarely on top of everything I want to do. I want to create better and typed links, I want to reuse a model for sequences and storage, I want to replace tags with guided controlled vocabularies (maybe even typed and binary linked to external WordNet sites), I want to use it as a CMS and skip the URI handling altogether, and so on. But I can't, because WordPress wasn't designed with change and innovation in mind.

But a day will come when even the most successful project will face its own innards. And some people will branch it, some will stay on, some will create something new, and some will stop using it altogether. And it's all a really good thing; this is organic growth, it's a framework that spawns other frameworks. And more crap will be successful. Things will be broken and ugly and hackish, just like some things hopefully won't.

And in a few iterations, something beautiful - with a probably different name - will emerge.

Update: It would be great to get your suggestions for open-source projects which are designed for change and have quality and / or elegant code to boot. Let's make a list!