3 July 2008

Round and round it goes

This morning was a good one. I got on the bus, armed with breakfast banana in hand, and right there in front of me sat fellow Topic Mapper Stian Danenbarger (from Bouvet), who happened to be living just literally down the road from me. I've been living at Korsvoll (in Oslo) for 6 months now without bumping into him, how odd is that?

Anyways, the last few days I've written about Language and Semantics and about context for understanding communication (all with strong relations to programming languages), and needless to say this became the topic (heh) of discussion on the bus this morning as well.

In this post I'll try to summarize the discussion so far, including the one I had on the bus this morning, coupled with a discussion I've had with Reginald Braithwaite on his blog, from "My mixed feelings about Ruby". Let's start with Reginald and move backwards.

Matz has said that Ruby is an attempt to solve the problem of making programmers happy. So maybe we aren’t happy with some of the accidental complexity. But can we be happy overall? Can we find a way to program in harmony with Ruby rather than trying to Greenspun it into Lisp?
I think that the goal of making programmers happy is a good one, although I suspect there's more than one way to please a programmer. One way is perhaps rooted in the syntax of the language at hand. Then there's the semantics of your language keywords. Another is to have good APIs to work with. Another is how meta the language is (i.e. how much freedom the programmer has in changing the semantics of the language, where Lisp is very meta while Java is not at all), and yet another is the community around it. Or the type and amount of documentation. Or its run-time environment. Or how the code is run (interpreted? compiled? half-compiled to bytecodes?).

Can we find ways in programming that would make all programmers happy? I need to now point back to my first post about Language and Semantics and simply reiterate that there's a tremendous lack of focus on why we program in most modern programming languages. Their idea is to shift bits around, and seldom to satisfy some overall more abstract problem. So for me it becomes more important to convey semantics (i.e. meaning) through my programming more than just having the ability to do so. Most languages will solve any problem you have, so what do the different languages offer us? In fact, how different are they most of the time?
At this moment in time I have extremely mixed feelings about Ruby. I sorely miss the elegance and purity of languages like Scheme and Smalltalk. But at the same time, I am trying to keep my mind open to some of the ways in which Ruby is a great programming language.
I think we really agree here. My own experiences with over 8 years of professional XSLT development (yes, look it up :) have taught me some valuable lessons about how elegant functional programming can be, just like Lisp and the mix-a-lot Smalltalk (which I like less of the two). But then I like certain ways that Ruby does things too, with a better syntax for one. I like to bicker about syntax. Yeah, I'm one of those. And I think I bicker about syntax for very good reasons, too;


In "just enough to make some sense" I talk about context; how many hints do we need to provide in order to communicate well? Make no mistake; when we program, we are doing more than solving the shifting of bits and bytes back and forth. We are giving hints to 1) a computer to run the code, and 2) the programmer (either the original developer, or someone else looking at her code). Most arguments about syntax seem to stem from 1), in which 2) becomes a personal opinion of individuals rather than a communal exercise. In other words, syntax seems to come from some human designer trying to best express hints to the computer in order to shift bits about, instead of focusing entirely on their programming brothers and sisters.

In the first quote about Ruby being designed in order to please the programmer, that would imply that 2) was in focus, but the focus of that quoted statement is all wrong; it pleases some programmers, but certainly not all, otherwise why are we even talking about this stuff?

Ok, we're ready to move on to the crux of the matter, I think.
I am arguing that while it is easy to agree that languages ought to facilitate writing readable programs, it is not easy to derive any tangible heuristics for language design from this extremely motherhood and apple pie sentiment.
Readability is an important and strong word. And it is very important, indeed. We need everything to be readable, from syntax to APIs to environments and onwards. I think we all want this pipe-dream, but we all see different ways of accomplishing it. Some say it's impossible, others say it's easy, while people like Reginald, I think, are right there in the middle, the ultimate pragmatic stance. And if I had never done Topic Maps I would be right there with him. Like Stian Danenbarger said this morning, there's more to readability than just reading the code well.

Topic Maps

Yeah, it's time to talk about what happens when you drink the kool-aid and you accept the paradigm shift that comes with it. There are mainly three things I've learned through Topic Maps;
  • Everything is a model, from the business ideals and processes, to design and definition, our programming languages, our databases, the interaction against our systems, and the human aspect of business and customers. Models, models, everywhere ...
  • All we want to do is to work with models, and be able to change those models at will
  • All programming is to satisfy recreating those models
Have you ever looked at model-driven architecture or domain-driven design? These are somewhat abstract principles for creating complex systems. Now, I'm not going to delve into the pros and cons of these approaches, but merely point out that they were "invented" out of a need that programming languages didn't solve, namely the focus on models.

Think about it; in every aspect of our programming life, all we do is try to capture models which somehow mimic the real-life problem-space. The shifting of bits wouldn't be necessary if there wasn't a model we're working towards. We create abstract models of programming that we use in order to translate between us humans and those pesky computers that aren't smart enough to understand "buy cheap, sell expensive" as a command. This is the main purpose of our jobs - to make models that translate human problems into computer-speak - and then we choose our programming language to do this in. In other words, the direction is not language first then the problem, but the other way around. In my first post in this series I talked about tools, and about choosing the "right tool for the job." This is a good moment to lament some of what I see as the real problems of modern programming languages.

What objects?

Object-oriented programming. Now, don't get me wrong, I think OOP is a huge improvement over the process-oriented imperative ways of the olden days. But as I said in my last post, it looks so much like the truth, we mistakenly treat it as truth. The truth is there's something fundamentally wrong with what we know as object-oriented programming.

First of all, it's not labeled right. Stian Danenbarger mentioned that someone (can't remember the name; Morten someone?) said it should be called "Class-based programming", or - if you know the Linnaean world - taxonomical programming. If you know about RDF and the Semantic Web, it too is based loosely on recursive key/value pairs, creating those tree-structures as the operative model. This is dangerously deceitful, as I've written about in my two previous posts. The world is not a tree-structure, but a mix of trees, graphs and vectors, with some semi-ordered chaos thrown in.

Every single programming approach, be it a language or a paradigm like OOP or functional, comes with its own meta model of how to translate between computers and the humans that use them. Every single approach is an attempt to recreate those models, to make it efficient and user-friendly to use and reuse those models, and make it easy to change the models, remove the models, make new ones, add others, mix them, and so on. My last post goes into much detail about what those meta models are, and those meta models define the communication from human to computer to human to computer to human, and on and on and on.

It's a bit of a puzzle, then, why our programming languages focus less on the models and more on shifting those bits around. When shifting bits is the modus operandi and we leave the models in the hands of programmers who normally don't think too much about those models (and, perhaps by inference, programmers who don't think about those models go on to design programming languages in which they want to shift bits around ...), you end up with some odd models, which at most times are incompatible with each other. This is how all models are shifted to the API level.

Everyone who has ever designed an API knows how hard it can be. Most of the time you start in one corner of your API thinking it's going smoothly until you meet with the other end, and you hack and polish your API as best you can, and release version 1.0. If anyone but you uses that API, how long until the requests for change, the bugs, the "wouldn't it make more sense to ...", the "What do you mean by 'construct objects' here?" start coming in, on and on and on? Creating APIs is a test of all the skills you've got. And all of the same can be said about creating a programming language.

Could the problem simply be that we're using a taxonomic programming language paradigm in which we try to create a graph structured application? I like to think so. Why isn't there native support in languages for typed objects, the most basic building block of categorisation and graphing?

$mice = all objects of type 'mouse' ;

Or cleanups?

set free $mice of type 'lab' ;

Or relationships (with implicit cardinality)?

with $mice of type ('woodland')
add relationship 'is food' to objects of type 'owl' ;

Or prowling?

with $mice that has relationship to objects of type ('owl')
add type ('owl food') ;

Or workflow models?

in $workflow at option ('is milk fresh?') add possible response ('maybe')
with task ('smell it') and path back to parent ;

[disclaimer : these are all tongue-in-cheek examples]

I know you can extend some languages to do the basic bidding here, for example in JavaScript I can change the prototype for basic objects and types, but it's an extension each programmer must make and the syntax is bound to the limits of the meta model of the language, making most such extensions look kludgy and inelegant. And unless they know all the problems that I think we've been talking about here, they really won't do this. This sort of discussion certainly does not appear where people learn programming skills.
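To make that concrete, here is a minimal sketch of what the library-level route might look like, in Python rather than JavaScript. Everything in it is made up for illustration: the `World` class, its method names and the three example animals are my assumptions, not an existing API and not a Topic Maps engine. It only mimics the mice and owl examples above with plain sets and triples.

```python
from collections import defaultdict


class World:
    """A tiny in-memory graph of typed objects (hypothetical, for illustration)."""

    def __init__(self):
        self.types = defaultdict(set)  # object name -> set of type labels
        self.relations = set()         # (subject, relationship, object) triples

    def add(self, name, *types):
        """Register an object under one or more types."""
        self.types[name].update(types)
        return name

    def of_type(self, *wanted):
        """All objects that carry every one of the given types."""
        return {n for n, ts in self.types.items() if set(wanted) <= ts}

    def relate(self, subjects, relationship, objects):
        """Add the relationship from every subject to every object."""
        for s in subjects:
            for o in objects:
                self.relations.add((s, relationship, o))

    def related_to(self, candidates, targets):
        """Candidates that have at least one relationship to any target."""
        return {s for (s, _, o) in self.relations
                if s in candidates and o in targets}

    def add_type(self, names, new_type):
        """Tag every named object with an additional type."""
        for n in names:
            self.types[n].add(new_type)

    def remove(self, names):
        """Drop the named objects and every relationship touching them."""
        gone = set(names)
        for n in gone:
            self.types.pop(n, None)
        self.relations = {(s, r, o) for (s, r, o) in self.relations
                          if s not in gone and o not in gone}


world = World()
world.add("mickey", "mouse", "lab")
world.add("woody", "mouse", "woodland")
world.add("hedwig", "owl")

# $mice = all objects of type 'mouse' ;
mice = world.of_type("mouse")

# with $mice of type ('woodland') add relationship 'is food' to objects of type 'owl' ;
world.relate(world.of_type("mouse", "woodland"), "is food", world.of_type("owl"))

# with $mice that has relationship to objects of type ('owl') add type ('owl food') ;
world.add_type(world.related_to(mice, world.of_type("owl")), "owl food")

# set free $mice of type 'lab' ;
world.remove(world.of_type("mouse", "lab"))
```

It works, but notice how much of it is plumbing: the types and relationships live in a dictionary and a set that the language itself knows nothing about, which is pretty much the kludginess I'm complaining about.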

No, most programming languages follow the tree-structure quite faithfully, or more precisely the taxonomic model (which is mostly trees but with the odd jump (relationship) sideways in order to deal with the kludges that didn't fit the tree). Our programs are exactly that; data and code, and the programming languages define not only the syntax for how to deal with the data and code, but the very way we think about dealing with blobs of data and code.

They define the readability of our programs. So, Reginald closes;
Again we come down to this: readability is a property of programs, and the influence of a language on the readability of the programs is indirect. That does not mean the language doesn't matter, but it does make me suspicious of the argument that we can look at one language and say it produces readable programs and look at another language and say it does not.
Agreed, except I think most of the languages we do discuss are all forged over the same OOP and functional anvil, in the same "shifting the bits and bytes back and forth" kind of thinking. I think we need to think in terms of the reason we program; those pesky models. Therein lies the key to readability, when the code resembles the models we are trying to recreate.

Syntax for shifting bits around

Yes, syntax is perhaps more important than we like to admit. Syntax defines the nitty-gritty way we shift those bits around in order to accomplish those modeling ideals. It's all in the eyes of the beholder, of course, just like every programming language meta model has its own answer. What is the general consensus on good syntax that conveys the right amount of semantics in order for us all to agree to its meaning?

There are certain things which seem to be agreed on. Using angle brackets and the equal sign for comparators of basic types, for example, or using colon and equal to assign values (although there's a 50/50 on that one), using curly brackets to denote blocks (but not closures), using square brackets for arrays or lists (but not in functional languages), using parentheses for functional lists, certain keywords such as const for constants, var for variables (mostly loosely typed languages, for some reason) or int or Int for integers (basic types or basic type classes), and so on. But does any of this really matter?

As far as shifting bytes around goes, I'd say they don't matter. What matters is why they're shifting the bytes around. And most languages don't care about that. And so I don't care about the syntax or the language quirks of inner closures when inner closures are a symptom of us using the wrong tools for the modeling job at hand. We're bickering about how to best do it wrong instead of focusing on doing it right. Um, IMHO, of course, but that's just the Topic Maps drugs talking.

Just like Robert Barta (who I wish would come to dinner more often), I too dream of a Topic Maps (or graph based) programming language. Maybe it's time to dream one up. :)
