The Fallacy of Simple User Interfaces

Half a year ago, I was challenged with improving the functionality of a piece of software without making its interface more bloated than it already was. That got me thinking about the complexity of user interfaces, how it affects the acceptance of software, and what programmers should do about it.

Even the Catholic Church pays attention to "User-Friendly Features"!

I'm not a UX specialist of any sort. But I was a Gentoo user once. Gentoo is a Linux distribution that is famous for two things: 1) it is very flexible, so every user can build only the part of the system they need, and 2) most installation/configuration operations are performed from your console via CLI tools (at best) or by editing text files (at worst). I found out that long-time Gentoo users who finally get tired of it develop a picky taste for really good user interfaces. So I hope that my thoughts on the matter will be somewhat useful.

Why Do We Want The Interface Simple?

The famous cartoon on complexity of interfaces.

There is a cartoon that explains it all (like many other computing-related topics), which is linked in the sidebar. Complex interfaces are annoying. They make people think, they make people fill in endless forms and ask questions about them, satisfying the kafkaesque bureaucratic monster minds behind their design. Remember all the "forms" you had to fill in at consular sections, customs, and other public offices? They usually come with instructions on how to fill them in that are twice as long as the forms themselves... Well, the complex interfaces of computer software are even worse, because they change as you interact with them!

Note that I would like to talk here about a broader concept than just the "interface" as what the user sees on a single screen. In this post I actually talk about User Experience (UX), which includes, but is not limited to, how forms are laid out. Its concern is a holistic evaluation of how the user interacts with the software throughout its whole lifetime.

Of course, most of the things you fill into these forms are necessary. Compared to the Google search bar, which is approaching simply answering whatever question you type into it, they feel like taking the stairs to the 20th floor instead of the 2nd. But (provided the system cares about its citizens), these government folks really need the information you're asked to fill into each form. Being long doesn't mean being complex, and a longish questionnaire actually makes the user think less than a free-text field where you enter an essay.

Bad Interface as a Programmer's Failure

So there is a huge class of problems where a complex, comprehensive interface is the answer. At the same time, there are complexities of another sort, which have no right to exist and should be banished from each and every product by all means possible. That is when we make the user do our job: follow an algorithm. Making machines follow algorithms is our job as programmers. We are paid for it, respected for it, and despised because of our devotion to it, but it's what we are here to do. Note the emphasis on the word "we". We, not our users.

If you have to instruct your user like this, "if this checkbox is active, then you should set this flag; otherwise, set that flag", or "when you select this option, click the button below as well," then you are implementing a monster your users are going to hate. The only excuse for shipping such a design is that you had to build the tool immediately and are going to improve it ASAP.

Not your job, actually? There are designer guys, User eXperience managers, who should think this through rather than you? But do they really have a foundation that gives them enough room for variation? Don't they have to lower their expectations of what they can design because some of their ideas would require overly costly architectural changes? I'm sure you have at some point had a conversation like, "Whoa, we can't implement this! It would require too much of something!" Well, what it sometimes actually requires is additional effort on our part. We can--and do--affect whether the software is hard to use or has a handy interface, and you should be aware of that power in your hands from the very beginning. It's easy to screw up, put in too many limitations, and then ditch all interface improvements because their cost has risen.

Smells

Some designs are definitely programmers' failures. Look at what I saw while trying to transfer funds from one account to another in my online banking:

What's so damn hard about telling me when my transaction is going to complete if I perform it NOW? Doesn't the web server have access to a clock? Especially since it refers to a standardized time, and it could look up my local time via JavaScript anyway. Why is it making *me* look at my watch and compute the date my transaction will complete? I know how programmers at banks are paid; don't tell me they aren't qualified to do this. It's just a lack of understanding that an interface is bad if the user has to follow some kind of program rather than having the computer do it.

In my opinion, there are two smells that indicate that your interface is wrong. It's badly designed if:

  1. you write policies and instructions instead of preventing mistakes programmatically (see the sketch after this list). The designers of industrial machines got this ages ago. In addition to writing DO NOT PUT YOUR HANDS HERE in big red letters, they just install a guard that physically prevents you from doing it. This has been saving lives and limbs for centuries;
  2. the interface can have a large number of inconsistent states. When you act upon a control, something changes in the interface, i.e. the interface's state changes. If your interface allows, in a comparatively large number of cases, a state that is inconsistent with your policy or your intention, or is just meaningless, it starts to smell. The fewer meaningless states you have, the better your interface is.
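
Here's a toy sketch of what smell number one looks like when fixed, in Python (the form and the field names are made up for illustration): instead of writing the rule down for the user, compute whether the action makes sense and simply never offer the meaningless state.

  # A toy sketch (made-up names): instead of a policy saying "do not transfer
  # more than your balance", the form simply never offers that state.
  def transfer_allowed(amount, balance):
      return 0 < amount <= balance

  def render_transfer_form(amount, balance):
      # The submit button is disabled whenever submitting would be meaningless,
      # so the instruction never has to be written down, let alone followed.
      return {"submit_enabled": transfer_allowed(amount, balance)}

  print(render_transfer_form(amount=500, balance=100))  # {'submit_enabled': False}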

I tried to implement the "Request an Update" checkbox extension for Bugzilla in ROSA Desktop so that it follows these guidelines. The checkbox is disabled when there's not enough information in the comment, and that one checkbox (one bit of information) correctly requests an update and notifies the interested party regardless of what stage the overall workflow is at.

Of course, more established distribution maintenance systems, such as Bodhi or Launchpad, are even simpler, but this is nearly the best you could do with a small patch to Bugzilla.

The reverse doesn't hold, though. If all states are meaningful, it doesn't mean that the interface is already easy enough. It may lack convenience, expose too many parameters, or greet users with insane default values. However, if it's too easy to make the wrong choice, the interface is definitely awful.

The reasonably modern game Alan Wake (I highly recommend it). The action item is highlighted, so you don't need to hunt for it, and frankly this does improve the gameplay. Screenshot courtesy of MobyGames.

Game designers have gotten it recently, and modern games highlight action items for you, expose fewer options of where to go, and always provide an infinite box of ammo if you are supposed to shoot your way through. The result? I see advertisements for Borderlands 2 at a bus stop. Gaming is no longer for nerds only!

Another case is automating the processes the user should not have to understand. One of these cases is installing drivers. Most of the time you install Linux, you have to install a proprietary driver for your video card, and all the manuals say you should select the correct one based on its make. These manuals even provide you with instructions on how to find out which video card is installed.

Why in the world can't the software look this up and choose the correct driver on its own? New hardware comes out, and you can't know the names beforehand? Well, make a guess! Do you think casual users will guess better than you would? You update the wiki pages with instructions, so why not update the software at the same time? No reason not to.

Back to game developers: they all bundle the latest version of DirectX and automatically install it along with the game itself, without any unnecessary questions asked of the user. Assassin's Creed 3 is on billboards all over the city; that's the result.

How Easy Should the Interface Be?

Note that I didn't say anything about complexity; why? Some people think that the complexity of the interface is what prevents others from using a product. This is an easy explanation. Many dream about a "brain-machine interface" as if it would solve all the problems with complex interfaces. Myriads of buttons, an overwhelming number of gauges and interactive controls, and warnings like "your signature must fit completely in the box" would become unnecessary because the answers to the questions would be transferred directly from the human brain.

On the contrary, the brain-machine interface is going to solve nothing, because most users just do not have any brain.

The brain-machine interface does not jump over a barrier unachievable by other means. It is the result of a gradual convergence of the means of control and our thoughts. The simplification of interfaces we see nowadays merely exposes a truth we should have unveiled years ago: if our thoughts are a mess, then a simpler interface doesn't solve anything. Interfaces are getting simpler and simpler, and people's persistent inability to cope with them frightens me. I mean, are we really that dumb?

Since the user is going to be the problem, there's just one thing to do: increase their knowledge. Extended to many systems, this could be described as increasing the likelihood that a person who encounters a system will infer the principles behind it and start asking the right questions.

One of the things that surprised me when I tried to deploy wiki software in an office was that people couldn't create a page. I'm not sure if it was just an excuse to dodge a mundane job, but users reported that they couldn't find a "Create New Page" button. MediaWiki, the software behind Wikipedia, indeed doesn't have this button. Why? Because, unlike Google Docs, a wiki is a set of interconnected pages, so pages not linked from elsewhere don't make sense. So the correct way to create a page is to use it as if it already exists: put a link to it. That simple and that tricky at the same time.

I think my failure was that I didn't show how to use the wiki pages, didn't educate the users. Without cross-references, wiki pages are inferior to Google documents, which puts the whole idea in jeopardy.

No matter how simple your interface is, it is never going to be less complex than the concept behind the software. Search engines and other information-retrieval systems are lucky: their concept converges to a single line of text (or a verbal query) where users enter what they need to know. Such systems can then automatically suggest whether the user was looking for videos, images, and/or nearby shops (as Google currently does). However, when the concepts behind the scenes become more complex, such as maintaining a set of interconnected pages (a wiki) or adjusting privacy settings (Google Plus), the interface can no longer stay that simple, and the user has to grasp the underlying concept to use it at all.

So What?

So here's the list of things I would keep in mind as a programmer to make the interface of the product better:

Code With User Experience in Mind

Consult UX specialists early and often. Make sure the coding decisions you made at earlier stages of development do not prevent you from improving the user experience after you get alpha-version feedback. The attitude "let's get it working first, and then think about User Experience" can easily lead to an inability to actually improve it. It will be very easy not just to say, but to show with compelling evidence, that you can't improve your interface because your code doesn't allow it. But really, putting code in front of User Experience is like putting the cart before the horse.

A convenient interface (or, let's say, a better User Experience) doesn't only mean that your knobs should be shiny. For instance, Google is famous for the speed of its products. Apple's interfaces (Mac, iPhone) are faster than many of their competitors' (modern powerful Android phones have changed that, though). And it's not just convenience: being slow may even kill your project, as, I think, it killed one of mine.

And isn't it the most straightforward task for programmers: to make the gears faster?

Provide Early Response for User Input

If your user has to follow rules when filling something in, provide feedback ASAP. Don't make the user click "Next" and only then receive a response. Don't punish users for providing incorrect input because you think they should have read about the exact format for entering their SSN. Such an approach is an abomination. Do you really want your product to make the same impression as an embassy where you apply for a visa, or a public office full of nine-to-five clerks with their "read 10 pages on how to submit an application, and we'll reject it without explanation if something is wrong"? No, you don't.

That's why you provide pop-up hints and color input fields red until the user gets them right. Just take a look at how the New Google Account window hints at what to do and checks your input as you type, even before you reach the "Submit" button:
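
If a screenshot isn't enough, here is a toy sketch of the same idea in Python (the field and the exact rules are made up): the check runs on every keystroke, so the user learns about a problem while the field is still in front of them.

  def ssn_hint(text):
      digits = [c for c in text if c.isdigit()]
      if len(digits) < 9:
          return "keep typing: %d of 9 digits entered" % len(digits)
      if len(digits) > 9:
          return "that looks like one digit too many"
      return "looks good"

  # The UI would call this on every change event, not when "Next" is clicked:
  for typed in ["123", "123-45-678", "123-45-6789"]:
      print(typed, "->", ssn_hint(typed))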

Decouple Rules From Code, Make Them Configurable

If your users should follow some rules (mainly when providing input to forms), try to make those rules reconfigurable in a declarative style. This is important because the rules tend to change over time, though I wouldn't say it's crucial. It's hard to get right, too, especially since the previous section suggests that you should tailor your feedback instead of just validating your model. Just make sure it doesn't end up like this:

Credit card validation is my pet peeve... Why can't you tell me, "Sir, you missed a digit"? You know exactly how many digits there should be. But instead you're telling me something I can interpret as "Your credit card is blocked!" This one, at least, doesn't ask me explicitly whether I have a Visa or a MasterCard, because you can learn that from the number I enter. Good girl.

Apparently, the form validator they employed for that checkout page made it so hard to configure nice just-in-time validations that they resorted to the abomination of "error messages for the whole form". Of course, that's an additional challenge: making this configuration just flexible enough for your purpose, and here's where your judgement as a professional comes into play. I think that flexibility is one of the reasons "cancan" wins over "declarative_authorization" among Ruby on Rails authorization plugins: the former is much more flexible and easier to configure.
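
A minimal sketch of what "declarative" means here, with made-up fields and rules: the rules live in a data table, and the code that produces just-in-time hints doesn't change when the rules do.

  import re

  RULES = {
      "card_number": {"pattern": r"^\d{16}$", "hint": "a card number has 16 digits"},
      "zip":         {"pattern": r"^\d{5}$",  "hint": "a ZIP code has 5 digits"},
  }

  def validate(field, value):
      rule = RULES[field]
      return None if re.match(rule["pattern"], value) else rule["hint"]

  print(validate("card_number", "1234"))  # "a card number has 16 digits", not a form-wide error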

Educate Your Users

If the concept your software tries to manage is complex, then simplifying the interface will only increase frustration, because the user will feel like they don't have sufficient tools to tailor the information to their needs. Instead, invest time in making users more informed. Make educational rather than promotional videos; talk at conferences; explain the concepts as soon as users encounter difficulties, for example by automatically popping up a relevant video. The thing is, even if your interface is not ideal yet, it will never become simpler than those concepts, so it never hurts to explain them. You can also pop up a chat with a customer representative. The latter has recently been adopted by many service providers, whose websites now pop up windows offering human help if you seem lost while browsing the site.

You think it's the job of your PR manager, not yours? Well, are you sure your PR folks understand the concepts in the first place?

Conclusion

I have tried to summarize my thoughts on what makes a good user interface, why the simplest interface isn't necessarily the best, and what to do about it as a programmer. And you really can do a lot to improve the interface of your software: devise ways to automatically figure out unnecessary details (such as the video card make) for your users, improve speed, and explain the underlying concepts and typical use cases. And, as a programmer, your unwillingness to deal with this can destroy your project, well, just as many other things you fail to take care of can.

Consistency Models Explained Briefly

Wikipedia does a ridiculously bad job of explaining the different consistency conditions of multithreaded/distributed systems. Just take a look at the articles, and see how inconsistent the consistency model descriptions are. I'll try to bring them together here.

What is a Consistency Model?

When you're describing how your distributed system behaves, you usually specify something called a "consistency model". This "model" describes how far the behavior of your system, in a multi-threaded or multi-machine environment, is from the "ideal" behavior.

Assume that you're implementing a data structure (a set, for instance; say, you can add/remove elements and check for presence). The specification (model) of its behavior usually implies that the operations are performed sequentially. For instance, a set indicates the presence of an element if and only if it was added before the presence check and not removed after that. The terms "before" and "after" imply that there is an order to the operations on that structure. While it is straightforward to know the order of operations in a single thread, it gets unclear when there are several threads or machines:

  • Calls may overlap. If two threads call methods on the same structure at approximately the same time, the calls may overlap because they don't happen instantaneously in the real world. It is unclear, however, in what order the calls took effect, because thread scheduling or the network connection introduces arbitrary delays in processing.
  • Calls may not take effect immediately. When a method call returns, it is sometimes not clear when the operation actually "takes effect." By diluting the notion of "taking effect" one may achieve an increase in speed. For instance, the elements added to a stack may not be fetched in exactly the same order, but the operations could be faster. The rationale behind this is that the "order" of events that happen in different parts of your cluster is not that well-defined or meaningful in the first place. I talked about why this is natural in my other post, "Relativity of Simultaneity in Distributed Computing".

Object-Oriented Model and Executions

A consistency model usually assumes an object-oriented approach to describing a system. Assume that we have something (say, a set) whose internal state can be observed and altered only via a well-defined set of methods (say, add(elem), remove(elem), and bool in(elem)). The OO approach assumes that the internal state is concealed and that it is meaningless to discuss it per se, so we do not define consistency models in terms of the object's state. Instead, we restrict the values these methods may return based on when other methods were called in other threads or on other machines.

In practice, method calls don't happen instantly, and our models totally acknowledge this. The set of method call times, the time it takes to complete each call, their arguments, and their return values is called an execution. Executions are usually depicted as intervals plotted on lines. The parallel lines represent how time passes simultaneously in all threads. The intervals are the method calls, the beginning corresponding to the time the method was called and the end to when the method returned. The call arguments and return values, as well as the method and object names, are usually written near the method call. Here's a sample execution of a set object:
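
Since the diagram itself is a picture, here is the same kind of execution written down as plain data (the timings and values are invented for illustration); this form also comes in handy if you ever want to feed executions to a checking tool.

  # One record per call: (thread, start time, end time, method, args, result)
  execution = [
      ("T1", 0.0, 1.0, "add",    (5,), True),
      ("T2", 0.5, 1.5, "in",     (5,), False),  # overlaps with the add() above
      ("T1", 2.0, 3.0, "remove", (5,), True),
      ("T2", 3.5, 4.0, "in",     (5,), False),
  ]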

We will omit the actual call values from here on, because consistency models do not operate in terms of actual arguments and values, but rather try to "map" concurrent executions to sequential ones. In the further diagrams we will also assume that all methods are called on the same object.

Since a consistency model imposes restrictions on an execution, the latter may or may not match a consistency model. A data structure adheres to a consistency model iff every execution over its methods that we may observe adheres to it. That's what's hidden behind phrases such as "sequentially consistent stack".

No Consistency

The most important consistency "model" is, of course, no consistency. Nothing is guaranteed. The methods called on the object may observe anything. We may add two different elements to an empty set, and observe that the size of the set is three, and that none of its elements is what we added.

All other consistency models are opposed to No Consistency in the following sense: they all assume that the method calls may be re-arranged into a single-threaded sequence that is correct with respect to the object's specification. Let's call this "arranged correctly" in the rest of the post. For example, a set of pets that adheres to any of the consistency models described below can't tell us that it contains a cow if only dogs were added to it.

"Multithreading" Consistency Models

What I discovered is that consistency models fall into two categories: those that lean towards multithreading (shared-memory multiprocessor systems), and those that lean towards "multi-machine" setups (distributed systems). I find the former category more formal and mathematical; these models can be used to describe behavior precisely enough to allow math-level, rock-solid reasoning about system properties. Some "distributed" consistency models are less precise in their definitions, but they are widely mentioned in documentation and descriptions, and they do allow users to make informed decisions.

Let's start with the "multithreading" ones.

Strong Consistency aka Linearizability

An execution is strongly consistent (linearizable) if the method calls can be correctly arranged retaining the mutual order of calls that do not overlap in time, regardless of which thread calls them. (See the "No Consistency" section for the definition of "correctly arranged".)

In other words, if two calls overlap, then they may "take effect" in any order, but if a call returns, then you may be sure it already worked.

A more anecdotal definition is "before the method returns, it completely takes effect, consistently with other threads." This is also usually called "thread safety" unless a more precise consistency model is specified. This is quite a stringent restriction, especially for distributed systems, where threads run on different machines.
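
Here is a tiny illustration with a single shared register (the timings are invented). The two calls do not overlap, so linearizability pins their order down:

  Thread A:  |--- write(1) ---|
  Thread B:                       |--- read() -> ? ---|

Here read() must return 1, because write(1) had completed before read() even started. Had the two calls overlapped, both the old and the new value would have been linearizable answers, since the calls could "take effect" in either order.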

Two linearizable executions combined also form a linearizable execution. This is an important property, since it allows us to combine linearizable components to get strongly consistent systems. This is not true for other models, for instance....

Sequential Consistency

An execution is sequentially consistent if the method calls can be correctly arranged retaining the mutual order of method calls in each thread.

The calls in different threads can be re-arranged however you wish, regardless of when they start and when they end.
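
A classic illustration of what this does and does not allow (timings invented, two calls per thread against shared memory):

  Thread 1:  |- write(x,1) -|   |- read(y) -> 0 -|
  Thread 2:  |- write(y,1) -|   |- read(x) -> 0 -|

Both reads returning 0 is not sequentially consistent: in any single-threaded arrangement that keeps each thread's own order, the read that ends up last is preceded by both writes, so it must return 1. Real hardware can nevertheless produce exactly this outcome, which is one reason the RAM example below is called naive.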

An example of a sequentially consistent system is a naive notion of RAM in a multi-core processor. When different threads read from memory, they actually read from a cache or a register, which is synchronized with the common RAM on writes. While values read and written on different cores may not be immediately synchronized (cores may diverge in their notion of what value a cell contains), the values read and written on the same core are always consistent with each other. This is naive, though: even this guarantee is relaxed by modern compilers and hardware, and accesses to different variables within one thread can be reordered (as far as I recall, ARM platforms are guilty of this).

You might notice that the definition of sequential consistency does not directly prevent methods from returning values from the future (i.e. from reading what will be written later in a different thread). Yet I claim that RAM is sequentially consistent; how's that? The thing is that each execution should be sequentially consistent. If the read and the write do not overlap, and the write follows the read, we may just stop the program between them. Since that execution should also be consistent, the value read must have been written earlier in some thread. Therefore, such "future prediction" is not possible in consistent data structures (not only the "sequentially" consistent ones), even though the definition doesn't directly prevent it.

However, if we combine two sequentially consistent executions, the result won't necessarily be sequentially consistent. But there is another model, weaker than linearizability, that preserves this composition property.

Quiescent Consistency

An execution is quiescently consistent if the method calls can be correctly arranged retaining the mutual order of calls separated by quiescence, a period of time where no method is being called in any thread.

This consistency condition is weaker than linearizability, and is still not as strong as what we usually expect from a multithreaded system. See my other post for a more in-depth example of a quiescently consistent stack. The reason to use this model is to attain a higher degree of concurrency and better response time in return for weakening consistency a bit.

Quiescent consistency is not as esoteric as you might think, even if you have barely heard about it. Here's an easy way to imagine a quiescently consistent system: assume that all writes are cached, and the cache is flushed to persistent storage when there are no ongoing writes (or after it reaches a certain size). There are specialized data structures that are designed to be quiescently consistent.
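
Here is a toy, single-process sketch of that "flush on quiescence" idea in Python. The class and its bookkeeping are invented purely for illustration, and it is not actually thread-safe as written; it only shows where the cache would be flushed.

  class QuiescentCounter:
      def __init__(self):
          self.value = 0        # the "flushed" state every reader agrees on
          self.pending = []     # cached writes, not yet applied
          self.in_flight = 0    # how many calls are currently executing

      def _begin(self):
          self.in_flight += 1

      def _end(self):
          self.in_flight -= 1
          if self.in_flight == 0:              # quiescence: no ongoing calls
              self.value += sum(self.pending)  # flush the cache
              self.pending = []

      def add(self, n):
          self._begin()
          self.pending.append(n)
          self._end()

      def get(self):
          self._begin()
          result = self.value   # readers see the state as of the last quiescence
          self._end()
          return result

Calls separated by a quiescence period see each other's effects; calls squeezed between two quiescence points may be reordered when the flush happens.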

Quiescently consistent components can be combined to form quiescently consistent components again. Note also that strong consistency implies quiescent consistency, hence you can combine these two kinds of systems, and the result will be quiescently consistent.

"Distributed" Consistency Models

Consistency is not Fault Tolerance

It may be tempting to mix up the concepts of "consistency" and "fault tolerance". The former is how an object reacts to method calls depending on the calling thread (or, in other words, how data are seen by multiple observers). Fault (or partition) tolerance describes how a system behaves when it becomes partitioned, i.e. messages are lost, and/or parts of the system get disconnected from the network or form several separate networks. These concepts are orthogonal to each other: an object can have one of them, both, or neither.

More likely, it will have just one of them. There is a scientific result known as the CAP theorem, which states that a system can't guarantee all of the following at the same time:

  • Linearizability
  • Availability (whether RPC calls receive replies)
  • Fault Tolerance

In this post, however, we will focus on consistency rather than on other properties of distributed systems. Note that multithreaded systems, due to the close proximity of their components and the less versatile medium that connects them, do not have to worry about fault tolerance... at least at the current state of the art.

The following models are usually seen in the context of distributed systems, i.e. systems that reside on multiple machines with relatively slow communication between them. To bring the object-oriented abstraction to distributed systems, programmers invented the "Remote Procedure Call" (RPC) pattern. Method calls (object identifier, method name, and arguments) are encoded into messages that are then sent across the network to a specified destination, which may be part of the encoding too. Method calls that do not expect a reply (say, unreliable "fire-and-forget" writes) may be treated as instantaneous.

Eventual Consistency

A system is eventually consistent if the method calls can be correctly arranged retaining the mutual order of calls separated by a sufficiently long period of time.

The eventual consistency definition, as well as its common understanding, is deliberately vague on how much time it would take, and on whether this time depends on the call history. We can't really tell how soon an eventually consistent set will respond that an element X exists in it after it was added. But "eventually" it will.

This looks like quite a useless consistency model, since it doesn't actually restrict anything mathematically. Why is it used then? Exactly because of its lack of restriction. Users usually expect some consistency from a system, and if there's actually not much of it, something should be said about this. Eventual consistency is really a limitation of liability; it's used in documentation to convey that the system can't guarantee that different threads get in sync immediately. However, the system is engineered in such a way that the updates eventually take effect. Here's an example (and here's more on eventual vs. strong consistency).
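
Here's a toy sketch of the idea in Python (made-up names, no real networking): writes land on one replica, and a background pass brings the other replica in sync at some unspecified later moment.

  replica_a = {}
  replica_b = {}

  def write(key, value):
      replica_a[key] = value       # returns before replica_b knows anything

  def read_from_b(key):
      return replica_b.get(key)    # may be stale for a while

  def anti_entropy_pass():
      replica_b.update(replica_a)  # "eventually" runs and brings B in sync

  write("x", 42)
  print(read_from_b("x"))          # None: the write hasn't propagated yet
  anti_entropy_pass()
  print(read_from_b("x"))          # 42: consistent, eventually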

Some authors, according to Wikipedia, require that no method calls be invoked during the "long period of time" for the system to get in sync. This actually sounds like a variation of quiescent consistency in which only a sufficiently long quiescence qualifies.

Strict Consistency

An execution is strictly consistent if the method calls sorted by completion time are correctly ordered.

This is as simple and powerful as it is impossible to attain in most practical systems that have more than one thread. First, it orders calls according to "global" time, which is a purely mathematical abstraction, unachievable in real life both according to physics and to the theory of distributed computing. Second, any implementation I can imagine would essentially disallow multithreading per se.

This is my rendition of the more "official" definition (1, 2), which defines a strictly consistent system as one where any read operation on a certain item returns the value supplied to the latest write operation according to a global time counter. I don't like this definition because it needs to distinguish between "reads" and "writes" and introduces the concept of a "data item", i.e. it breaches the object-oriented approach. Still, I'm sure that if the system really does consist of "data items," the definition at the beginning of this section works just as well.

Serializability

An execution is serializable if its method calls in each thread are grouped into contiguous, non-overlapping transactions, and the calls can be correctly ordered in such a way that the transactions remain contiguous.

In other words, the outcome of a sequence of method calls can be described as a serial execution of transactions, where methods of different transactions are not intermingled, even though the transactions actually overlap in time across different threads. The useful consequence here is that calls belonging to overlapping transactions can be reordered, even if the calls themselves don't overlap. However, "serializability" alone is not enough to describe a system: while it limits the reordering of methods across transactions, it leaves unclear to what extent the transactions themselves can be reordered. That is a job for a different consistency model; serializability is usually mentioned to "remind" us of this important property of transaction-based systems.
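
A small illustration (timings invented): two transactions whose calls are interleaved in real time, yet the outcome is serializable because it is the same as running T1 entirely before T2.

  Thread 1 (T1):  |- write(x,1) -|                     |- write(y,1) -|
  Thread 2 (T2):                   |- read(x) -> 1 -|                    |- read(y) -> 1 -|

Had T2 read x -> 1 but y -> 0, it would have observed "half" of T1, and no serial order of the two transactions (T1 then T2, or T2 then T1) could explain that outcome, so the execution would not be serializable.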

Practice

Some systems provide several levels of consistency for their operations (like AppEngine's High-Reliability Datastore linked above). You should understand what tradeoffs each model makes in order to choose the one best suited to your specific application.

Most systems you encounter will be either not consistent at all ("thread-unsafe") or linearizable (which is what is usually understood by "thread-safe"). If the consistency model differs, it is usually noted explicitly.

Strictly speaking, most systems that claim linearizability will not be consistent due to bugs, but these nasty little creatures aren't suitable for careful scientific discourse anyway.

Now that you know what consistency models are out there (and that they actually exist), you can make informed decisions based on what's specified in the documentation. If you're a system developer, you should note that you are not bound to choose between linearizability and "thread-unsafety" when designing a library. The models presented above may be viewed as patterns that enumerate which behaviors in a multithreaded context are considered universally useful. Just consider carefully what your users need, and do not forget to specify which model your system follows. It also helps to link to something that explains well what lies behind the words "somethingly consistent"--like this post, for instance.

Comments imported from the old website

Pavel Shved on 13 November 2012 commented:

I'd like to thank ssmir for valuable comments on how to improve this post.

How to Watch HDTV from Internet on your TV with Linux PC (Legally)

Hooray! The Airbus finally brought me to a land where the law is taken slightly more seriously than in Russia. With great responsibility, however, usually comes great power, and a lot of absolutely legal video streaming services are available in the U.S. Moreover, my corporate-sponsored temporary apartment has a large, like, 30-inch TV, and it would be a sin to just let it collect dust in silence.

No, I couldn't just watch cable TV! The very idea of watching something on a predefined schedule does not appeal to me. I'd have to spend money on a TiVo or something else that records TV shows anyway. And what about DVDs and feature movies? TV shows aren't the only thing I watch, by the way. And what about shows that are no longer airing? How do you watch older seasons? No, TV is for the weak.

Sadly, I could only find out how to watch TV shows in HD, not movies. I expand on this in the section about Amazon. So, TVs are certainly for the weak, but... I eventually solved the problem by buying a big-ass TV that has enough DRM to stream all kinds of HD moving pictures. Sorry, Linux fans.

Streaming even has some advantages compared to those filthy, illegal torrents. You can start watching an HD movie at once, rather than waiting until its download is complete!

So the most rational idea is to stream movies from the Internet via my desktop PC to the TV. Surprisingly, the hard part was that I use Linux, and this was far from the usual "OMG settings are so complex in Lunupz, I can't do console!" whining.

"Nobody uses Linux... Get a real OS, nerds."

The first thing to choose is the content provider. And that's where Linux users walk into an ambush.

The thing is that the mission statement of GNU contradicts what content vendors are trying to achieve. Linux, or, more specifically, "GNU/Linux", because we're talking about the operating system rather than the kernel here, is all about being open, allowing the user to see what software is running and to have full access to its contents. Content providers, however, insist that DRM, one of the technologies for revoking control over content, be imposed onto end users. This is why it is not easy to find Linux support in these matters.

Netflix

The first service to try was Netflix. Actually, that million dollars worked well: I had only heard about Netflix from a Communications of the ACM article (as far as I recall, it was this one) about the Netflix Prize. No wonder I decided to try it first.

When I came to the U.S., I didn't have any personal computer at all. In Russia, we have an adage, "to come to Tula with one's own samovar". Tula was famous for its samovar makers, and the adage mocks those who bring their own items to places that naturally have a great choice of such items. So no wonder I didn't bring my, like, 15-pound chassis here. Instead, I bought a 4-core i8 from Best Buy, but for the first several days all I had was a Galaxy Note smartphone/tablet. Netflix installed smoothly from Google Play, I watched some TV shows there, and I thought I wouldn't have any problems on my desktop PC.

I was very wrong. Netflix doesn't have a way to run on a generic Linux distribution. Windows 7, Mac: here you go; Android: here you are; GNU/Linux: go fsck yourself. Seemingly reassuring was that it ran under ChromeOS, but only on Google's proprietary Chromebooks. Yes, even on the new $250 Samsung ARM Chromebook. Recently, Google and Netflix managed to arrange seamless delivery of video to these laptops, though it didn't work until March '13. I tried it on mine, and it works! However, if you install ChromiumOS in a virtual machine, it won't work either (I tried). A Netflix Chromium plugin also looked promising, but it's just a frontend to their website and doesn't add playback capabilities to Chromium itself; there are a lot of angry reviews by Linux users, and a response to them is the caption of this section. Here's a blog post with a more comprehensive overview.

Hulu

Hulu, on the contrary, has a GNU/Linux application. However, very few of their movies were available in HD, and I was not interested in those that were. So I had to skip this service.

Amazon

Yes, Amazon has a movie service as well. Marketing also worked here: I learned about it from IMDb, which was bought by Amazon years ago. And Amazon is what finally worked, though not without a plot twist.

The thing is that Amazon doesn't support GNU/Linux PCs for its HD movie content! Yep, that's right, check their list of compatible devices. You see a list of TiVos, consoles, smart TVs, and an iPad, but where's the PC? Nope, it's not there. Yet it works.

Not only does this stuff require Flash, it requires special DRM capabilities in Flash, which rely on the already deprecated HAL, which software maintainers started to remove from modern Linux distributions and then brought back just because of Amazon. If your distribution was released in 2012-2013, this may require additional actions you can google. ROSA Marathon 2012 plays Amazon videos out of the box, though, and in ROSA Fresh you just need to urpmi hal from the contrib repository.

That's right, when you pay for streaming an HD TV show, it just plays in your browser. It requires Flash, though. The list of compatible devices apparently means that if you "buy" a movie instead of renting it, you can only download it to one of the compatible devices rather than to your PC.

I used to think that it was a mistake of sorts, but further research revealed that Amazon would probably love to stream all sorts of HD stuff to as wide a variety of users as possible. This looks like a matter of what publishers allow Amazon to do. They seemingly allow HD movies on tivoized and DRM-ready devices only, and just don't care that much about TV shows. Streaming the HD movies I tried didn't work.

Equipment

To connect a TV to your PC, you'll need some extra equipment. When I looked at the price of a 25-foot HDMI cable at Best Buy, I was shocked: it cost more than my monitor! So I bought the 25-foot cable from Amazon for $14. You should already have an Ethernet cable, but in case you don't, you can buy one there as well, and, of course, you should have a good, fast connection.

Setting up Linux

The screenshots below are from the Linux distro that I use at home, namely ROSA Marathon 2012.

First, I connected the TV via the cable, turned it on, and set the input to the corresponding slot. I'm not sure if it's necessary, but it never hurts to do this first.

Unfortunately, the KDE settings (available via "Configure Your Desktop" → "Display and Monitor") thought that I didn't have any other monitors:

It remained this way even after I managed to configure it, so don't rely on it that much.

The second thing to try was the settings for my video card, an nVidia GeForce GT 520. That's where I finally managed to do this. This configuration tool needs to write to your /etc/X11/xorg.conf file, so make a backup first by copying it somewhere. For this very reason, you should run it as root (sudo won't work; use its KDE-specific counterpart):

$ kdesu nvidia-settings

Got to "X server Display Configuration" tab, and press "Detect Displays." Your connected TV should appear there. You could try TwinView, but you don't want to spread your desktop across two screens (it makes no sense), and TwinView sometimes limits your bigger monitor resolution to your TV resolution, which you don't want. However, TwinView doesn't require restarting X.

So you could try TwinView first, and resort to "Separate X screens" if TwinView doesn't work. If the "Separate X screens" option is not shown, don't panic: select it anyway, and the other options should become available (it's a bug); maybe you'll have to Quit, saving the configuration, and reopen the tool again. The goal is to set "Configuration" to "Separate X screen (requires X restart)" for both displays:

By the way, rosa-launcher (the Start Menu) doesn't work (it crashes) when several screens are set up like that. A report has been sent to the developers; I hope they'll fix it.

I also found that the least intrusive option is to place the TV display "Below" the main one. This means that the mouse pointer will appear on the TV screen when you move it past the bottom edge.

Click "Save to X configuration File" (if you get an error here, you forgot to run as root with kdesu), and Quit. Restart your X server (or restart your PC), and there you go.

Now we should actually run something on that new screen. It's not that easy. There are different displays, which are different X processes, and within one process live several "screens," which is what we have just set up.

Remember that videos play in your browser? So you should start a web browser on that other screen (this is not required if you use TwinView; just drag the browser window to where your TV screen resides). Screens are selected much like displays, via the DISPLAY variable; the screen number follows the display number after a dot. I didn't manage to start a second Firefox window on that screen, unfortunately. But nowadays we do have a choice:

$ DISPLAY=:0.1 chromium-browser amazon.com
$ DISPLAY=:0.1 firefox amazon.com

Now move your mouse past the bottom edge of the screen, and start the movie you like. Pro tip: start the browser as usual, log in, and only then restart it on the other screen to avoid excessive strain on your neck.

Start the movie... wait-wait-wait, why is the sound coming from my PC speakers rather than from the TV itself? Back to configuration. Open the KDE desktop configuration tool, click "Multimedia", select Phonon, and adjust the priority of "Audio Playback" (not of any of the subcategories, but of the category itself).

As usual with sound, adjust the mixer settings of the HDMI device. Use the alsamixer console command and unmute / turn up everything there like crazy if the usual mixer doesn't give results:

Sometimes I also had to select another profile on the "Audio Hardware Setup" tab. Now you're set. The sound should be re-routed to the TV without restarting the browser, but, again, try restarting it if that doesn't work.

Enjoy. Here's a proof that it works:

The Internet connection is not used very intensively; I'd be able to watch, like, three HD videos simultaneously:

***

That's how you fight your way through bugs, tricky settings, and "X displays" (wtf? why do I need to worry about that?). Of course anybody would love to avoid all that. But it's basically the same as having to select and tune things like medical insurance on your own, because you are, as with Linux, in the land of the free.

A Dying Tux on an Airbus

A couple of days ago I took a transatlantic flight on an Airbus A330.  It is a large plane where each seat is accompanied by a computer called an "Entertainment System".  It contained games, movies, and stuff like in-flight information (the plane's location mapped onto the world map, and flight progress details, which is kind of reassuring given that the flight lasts 10+ hours).  A killer feature of such a system is its USB port, useful when you want to read/watch something on your mobile phone (or type this story, for instance).

Not only did I learn that day what operating system that computer ran, but I also witnessed another example of how machines can demonstrate such inherently human feelings as courage and hope.

It all started with nothing special.  I casually opened the in-flight information, and fell asleep.  When I woke up and realized that there were a couple of movies worth watching, I tried to launch them.  It didn't work.

"I am a programmer, aren't I," I thought, and started to click all the buttons on the remote control.  None of that worked.  I tried to click the near-screen buttons; they worked properly, while their remote-control counterparts did not.  I concluded that the driver for the remote died somehow, and asked the stewardess to reboot my terminal.

As it started to reboot, I noticed a lot of familiar words.  It used a bootloader named RedBoot copyrighted by RedHat, and it obviously ran Linux with a 1.59 kernel or something.

However, after "decompressing image," it showed, "decompression error: incorrect data check," and rebooted again.  The system did not boot far enough to start that entertainment.

It also displayed a reboot counter. After twenty-five reboots something was supposed to happen, and I didn't know what. Curiosity took me over, and I stared at the screen, finding amusement in the fact that I knew what some of the words being printed there meant.

After 25 reboots, the computer realized that there was something wrong with it, and ran the firmware update program.  It did not help.  He entered the same circle of pain again...and again.

As I watched the guy reboot over and over, I noticed that my curiosity was being replaced by different feelings.  I felt sympathy and compassion for that guy.  He kept trying and trying, and kept failing and failing, as if he didn't even know what would happen next.

There are theories that as computer programs start to gain their own cognition, they will be frustrated by the awareness of their own limitations.  "I envy you humans for your unknown boundaries," says a reddit bot that automatically handles pictures posted to reddit in "his" answer during an open interview (aka "AMA").  I don't know if these theories will have anything to do with reality; they do not sound illogical at all. But this little computer clearly had no awareness of his inability to boot, and kept crucifying himself.

Pain replaced my compassion. Having watched over a hundred reboots and reflashes, I lost all hope. "Why should we keep him alive?" I thought. "He'll just keep doing this, unaware that he's broken.  Why can't we just shut him down and end the series of reboots?  Do we humans even realize how painful each reboot could be," I thought, "maybe he wants to die already?" I tried to recall what people do when euthanasia is not an option. Morphine?  Even morphine can't ease a computer's pain, though none is allowed on an aircraft anyway.

I asked the stewardess to shut the entertainment system down at my seat to release him from the pain.   "Oh, unfortunately it's impossible," she replied. "We can move you to another place if you want though: we have some extra chairs available."

I suddenly realized that I couldn't leave him.  To most other passengers and crew, it was just a malfunctioning computer to be rebooted, replaced, or ignored.  To me, however, it was a dying Tux who fought for his survival against his own limitations.  I recalled how friends and relatives stay at the bedside of a dying loved one, and imagined how they'll soon gather near the charger of a family robot who's about to leave for silicon heaven after a long, fatal malfunction.

I sat by his side for the rest of the flight.

And he finally succeeded!  After six hours of fruitless reboots, he finally reflashed a working kernel, and booted himself!  The usual routine went on: the first-boot symlinks were set up, the X11 system started, showing its tapestry with the well-known black-and-white embroidery.  I could have launched one of the movies I had wanted several hours earlier, but I gave the Tux some rest instead.  He deserved it.

So I experienced this enlightening encounter with a machine who fought for his life, never lost hope, ignored what were believed to be his limitations, and did finally succeed in his struggle.  Not only did he demonstrate feelings only humans seemed to possess, the dying Tux also happened to have more faith in his own victory than the human sitting beside him.


Comments imported from the old website

Oh my god! You hacked the airplane!!!!

Pavel Shved on 10 October 2012 commented:

Oh, I wish I could hack him to perform the surgery necessary to bring him back to life. I tried some shortcuts, like holding Ctrl+C while booting, but then I realized that it was just silly. It was as fruitless as applying acupuncture to a patient in a coma, thinking that it could bring him back to normal.

A Tie for a Python Programmer

A couple of weeks ago I started a new project in ROSA. The project is a dynamic system that tracks the health of our Linux distribution repository in real time by watching builds and repository status and by running external analyzers. This is a relatively small system that should end up at less than 50k LoC, so I could choose any language I wanted. I would have chosen Ruby, of course, but this time I had to give more weight to social issues, so I chose Python. I had never written a single Python program before, but none of my peers wanted to learn Ruby, so I took this chance to learn a new language.

Python logo

This is the logo of Python the programming language. Definitely, these aren't snakes, sure.

Python is named after Monty Python's Flying Circus, as per the Wikipedia entry, rather than after the non-venomous snake family. Snakes, however, are featured in Python's logo, and Monty Python is not well known in Russia, so it's easier to associate the Python language with snakes when it comes to computing.

To aid me in learning the new language, and to make it more fun, I needed an artifact to help me. Due to my unhealthy devotion to shirts and ties, I thought of buying a tie with snakes painted on it. I didn't find one, although I searched all the shops in my district, including those that sell clothes for kids. I couldn't even find a tie made out of snake skin; while such ties are referenced in literature, they're either mythological or require social connections I don't have to obtain one.

It wasn't until my colleague linked me to this website that I found a way to make a snake out of my tie. Indeed, if I can't buy one, why can't I make one?

Ingredients

I started with a black tie I no longer liked. Time to revive it by turning it into a programmer-friendly snake! Here's a step-by-step guide, inspired by the one linked above. You'll need

  • 50 grams of cotton wool;
  • a piece of red cloth (polyethylene should do as well);
  • a sheet of A4 paper to experiment with the eyes, and a pen to paint the pupils; and
  • something to attach the latter two things to the tie (glue, pins, and/or sticky tape; see below).

Making the Snake Tie

Start by tying your favorite knot and marking the place where it ends. From that place down, the two-dimensional tie will turn into a three-dimensional snake. Open the tie up, making sure it won't fall apart (I had to manually re-knot the threads that ran through the yet-to-become Python; I didn't have any GIL to assist me!). I also had to open the bottom end of the tie, because it was detached from the upper part:

Start filling the tie with the cotton wool. To push the wool through the tie, you might need something long and stiff; I used a toothbrush.

As you fill the tie, it may no longer keep its original shape. I used a pin to reinforce the lower end. To keep the pin completely hidden, stick it into the inner half pointing outwards, and pin the outer half onto it, hiding the pin itself in the fold.

Do not fill it all the way up to the knot point! Tie the knot first. Here's the Windsor knot I tied; notice how you still have access to the hole you made at the very beginning. Then stuff the tie with more cotton wool.

Now you can attach a tongue. A nice red split tongue is what makes your former tie a mighty snake rather than a boring worm. Cut it from the red cloth you prepared. I had to cut several before I managed to make a good enough tongue. Attach it with a small pin, sticking it as close as possible to the tip of the tie. The pin will be concealed when you look at it from the front.

Now make eyes for your snake. I just drew them with a pen on small pieces of paper. This is an easy part; you may experiment as much as you like. I found that the snake looks craziest if the eyes are placed close to each other. I also felt that the color of the pupils should match the color of the tie itself, so I used black.

After you've found the proper shape and place for the eyes, you should attach them. Sticking a pin into your future friend's eye is disgusting (and the result wouldn't look real enough), so you'd better use glue. I didn't have any glue, so I used the one-sided sticky tape a body patch was made of, and rolled it up to make it "two-sided".

Your new friend is ready. Make sure it stays away from mice!

If you're brave enough, you may even go to work wearing it.

I hope I won't get fired because of that. This would be programming language discrimination anyway!

***

So that's how you (or, more likely, your kid) can assemble a Python tie from the old, useless things you find at home. I hope that programming in Python itself will be different from putting the same old stuff together, and more fun.

How Static Verification Tools Analyze Programs

One of my previous posts featured a plaque we got for scoring an achievement in the Competition on Software Verification'12. I promised to tell how the tool we improved analyzed programs to find bugs in them... without actually running them.

Please note that most of what's described here was not devised by me. It was the state of the art when I first started to study the field, and my contribution was kind of tiny, as is usual when newbies do their (post)graduate research.

Program Verification is Impossible

I promised to tell you how we do it, but the thing is that... this is actually impossible. Not that I'm holding a secret vigilantly protected by the evil Russian Government; it's just mathematically impossible. It's not just "hard", as in "NP-hard", but plain impossible. A proof of impossibility of a related problem, the halting problem, published in Alan Turing's famous paper, is acknowledged as one of the stepping stones at the beginning of computer science as a mathematical discipline. Later it was found that halting is not the only property you can't universally verify with a computer program: in fact, no nontrivial property can be universally verified! This is known as Rice's theorem, and it has been known since 1953.

The undecidability (another term for the inability to build a program that solves a problem) of program verification only concerns Turing-complete languages, i.e. languages that are as powerful as a Turing machine. Some languages can be verified correctly, such as a hypothetical language with no variables and no statements other than print (it always terminates), or its extension with non-nested Pascal-like for...to and for downto loops, where halting can be decided by comparing the numbers involved.

Most interesting real-world problems involve reasoning about constructively non-trivial languages, though.

The absence of a program that can verify whether any given program halts or has some nontrivial property means that for any program claiming to be such a universal verifier, there exists a counterexample: a piece of source code that will confuse the self-proclaimed verifier or make it loop forever. But the very same verifier may still provide useful results for a multitude of programs. Considerable parts of computer science and software engineering are devoted to studying, building, and putting such verifiers into production. A lot of program analysis, which is closely related to verification, happens in compilers, so you most likely encounter the results of this research, without noticing, on a daily basis.

In the next sections I'll explain one of the approaches to verifying programs, to which my graduate research was closely related.

Why Go Static?

So how would you verify that a program doesn't have certain bugs (say, that it never dereferences a NULL pointer) without running it and letting it take a potentially harmful effect on customers and equipment?

If your initial idea was to interpret the program in an environment where its running wouldn't affect anything, that would not work. First, it hardly counts as "verifying a program without running it," since the program actually runs, just in isolation. Second, and more importantly, the program may involve user input. This may be input from the keyboard, from a text file, or from the network. How do you verify the program's behavior if the way it behaves depends on what comes from external sources?

You could emulate the behavior of these external stimuli by writing some mock-ups (say, by reimplementing scanf("%d", &x) as something that makes x zero), but this is effectively falling back to testing. The whole idea of static verification, where the program is not run, is to extend the number of possible program execution scenarios beyond those you can cherry-pick by testing. When you see a scientific paper on static verification techniques, it will surely list the possibility of checking all possible scenarios as a justification for the increased resource consumption compared to testing in an isolated environment (well, who tests in production anyway?).
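
To make the contrast concrete, here's a tiny Python sketch (the function names and the bug are made up): a mocked input pins the program to one scenario, while a static verifier has to account for every value the read could produce, including the one that triggers the failure.

  def read_int_mock():
      return 0                   # the mock cherry-picks a single scenario

  def process(read_int):
      x = read_int()
      if x == 42:                # a verifier must consider x == 42 as well,
          return 1 // 0          # so it can report this division by zero
      return x + 1

  print(process(read_int_mock))  # prints 1; the mocked run never reaches the bug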

Of course, white-box static verification is not the only sophisticated testing technique. An interesting testing technology, UniTESK, has been developed at ISPRAS, where I previously worked and studied. One of its features is that it can automatically adjust the generated tests to cover all possible disjuncts in the if statements of a method's postcondition. This means that not only should each condition branch be covered, but each of the operands in the disjunction that forms each condition should be true at some point during test execution. The disjuncts we're talking about here are those in the postcondition; the system under test itself remains a black box.

And a justification it is! It's not possible to enumerate all possible inputs and stimuli, because the number of their combinations is infinite even for simple programs. And if you omit some input pattern, who can guarantee that something wrong doesn't happen exactly when users behave like this? You surely know a lot of cases where seemingly "dumb" users unveil behavior no sane person could ever predict when writing tests for software! In fact, any black-box test generation technique is inherently non-exhaustive and can't cover all possible cases. Static analysis "opens" the black box, sees and works with what is inside it, and it is more complicated than writing tests and measuring coverage.

Another case that is better left to automated, comprehensive analysis is security. Security-related bugs, such as buffer overflows, are hard to test. Capable testers for such bugs are called "hackers"; they are rare, expensive, still not fully reliable, and may miss something. One would like to ensure that no matter how malicious or damaged the input from the network is, no matter what mischief the user clicking on the screen has in mind, the program doesn't crash or expose certain security breach patterns such as the aforementioned buffer overflow. Static verification is used in some very big companies, for instance, to make sure that a specific security bug occurs nowhere else in the system, so that a published description of the original problem cannot inspire further exploits.

Thus, static verification, which reads the source code and may prove that the system is not exposed to certain kinds of bugs and security violation patterns, is also useful. Static verification is the only way to vouch, for an infinite number of inputs, that none of them exposes a bug in the program. It allows us to consider all possible inputs at the same time, as if a randomly mashing monkey were operating the console rather than a sentient human being.

But more on that later. It's still not clear how the multitude of possible patterns of security breaches can be studied generically. Do we have to study pointer dereferences, buffer overflows, memory allocation errors, etc. separately? Some tools try to, but there are fundamental approaches that address this problem of excessive diversity.

Reduction to Reachability

The reachability problem can be formulated like this. Given a program "state," find out whether this "state" is reachable in some program execution. If the program accepts external input, this involves checking whether there exists an input such that the program reaches the target state when executed on it. We'll explain what a "state" is later.

This problem is as hard as the halting problem, because the latter can be reduced to the former: if we know how to solve reachability, we know how to solve halting by checking whether the exit state is reachable. It's not just very hard, like an NP-complete problem; it's impossible to solve in a generic manner for all programs, as pointed out at the beginning of the post.

Reachability is important because a number of programming problems and bugs may be expressed as reaching an "error state", a specific program line that gets executed when something undesirable happens. This could be the code responsible for printing "Segmentation fault", and checking its reachability would tell us whether a NULL pointer is dereferenced anywhere.

As this code is OS-specific, the programs being checked are often prepared for verification by inserting such code explicitly. So the code x = r/p, for instance, would become if (p == 0) ERROR(); x = r/p;, where ERROR() terminates the program with an error. Well, it doesn't actually have to terminate anything, because we're not going to run this program at all: the verification is "static". Its purpose is to mark a specific place in the program as undesirable. This process is sometimes called instrumentation. A special tool performs this operation; we developed a GCC-4.6-based one at ISPRAS during our Linux Driver Verification project.

Modeling Nondeterminism

So, how do we model the fread("%d") function? Let's get back to the original goal, which is to check that a program is free from runtime crashes and vulnerabilities. Thus, we can't assume anything about the user input, because a hacker may be at the keyboard, entering numbers chosen specifically to break the program. What can we do then?

Static analysis techniques allow us to assume that the user just enters something—and to soundly act upon that. The integer read may bear any value, literally! How does it play out then? The thing is that the integer entered is read once, and its value doesn't change arbitrarily after this, unless something is directly assigned to it, or to a pointer to its memory location. Such a value is then transferred to other variables, which makes conditions over these variables either true or false, which affects how subsequent conditions are evaluated and what code is executed. The ties that exist between values during the program execution are exposed, and it becomes possible to rule out some code as unreachable.

Note that if all ERROR() functions share the same body, we may treat the body itself as the error location. This means that any number of "error" locations is equivalent to one "error" location, so we sometimes go plural or singular for the sake of a better narrative.

We're lucky if we detect that all the ERROR() function calls are unreachable, which means that there's no bug. In other words, while we don't know the value of x, we know that any concrete value of x predetermines how the program executes from now on, until one more nondeterministic variable is read from somewhere. Let's look at a simple example.
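Roughly, the program could look like this (read_int() here is a helper that returns whatever integer the user typed; a trivial definition is given only so the sketch compiles and runs):

    #include <stdio.h>

    /* Models "an arbitrary integer entered by the user". */
    int read_int(void)
    {
        int v = 0;
        if (scanf("%d", &v) != 1)
            v = 0;
        return v;
    }

    int main(void)
    {
        int x = read_int();
        while (x <= 0) {      /* keep asking until the user enters a positive number */
            x = read_int();
        }
        /* here x >= 1 is guaranteed, whatever the user typed */
        return 100 / x;       /* can this ever divide by zero? */
    }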

This uses a simpler model of reading an integer (to avoid a discussion of pointer aliasing): a read_int() function that returns an integer the user has entered. So, let's assume that it may return any integer. Can we still prove that a division by zero never happens, and that the program never crashes at that statement? Sure. Whatever the user enters, we can't get past the while loop until x is greater than zero. We may loop forever, or exit after several iterations, but once we're out of the loop, x is surely equal to one or greater, i.e. x >= 1.

Now, the crash happens if x == 0. But x == 0 and x >= 1 can't both be true. Moreover, whether they can both be true can be checked automatically by specialized software, which is widely available in both open-source and proprietary forms. This demonstrates that we don't need to know the exact value of each variable to determine whether a program reaches an error location.

So, the purpose of a reachability solver is to model all possible ways the program may work (including those that depend on external, unknown input), and to determine whether any of these ways has a bug (i.e. ends in an "error" or "crash" location). Let's see how this is possible.

Feasibility of Program Paths

The "equation" we used in the previous section to demonstrate how we could model nondeterminism is actually a powerful technique. It is generic enough to work in both nondeterministic and deterministic cases, and no distinction between them is even required! Each program where a certain line is marked as "error location" contains a number of "paths", sequences of statements that end in an error location, but for which it is not yet known if the program could actually follow this path. An example of such path is the above assignment x to zero, and having a check if it's greater than one passed. We've already shown that it's not feasible if we look closer at he path equation rather than just at program structure.

This usually requires some conversion. For instance, if we converted the fairly common assignment i=i+1 into the equation component i==i+1, it would render any path infeasible, since this can never be true for real numbers. To handle such cases, i and all its further occurrences are renamed after each assignment (this is called Static Single Assignment (SSA) form).
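To sketch the idea, each assignment introduces a fresh "version" of the variable, so the path equation stays satisfiable (the numbering below is purely illustrative):

    i = i + 1;      /* becomes:  i_2 == i_1 + 1  */
    i = i * 2;      /* becomes:  i_3 == i_2 * 2  */
    if (i > 4)      /* the path takes this branch:  i_3 > 4 */
        ERROR();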

Checking such equations requires advanced software such as SAT solvers, theorem provers, or customized built-in add-ons or reimplementations of them. These tasks are NP-complete, so the size of the programs being verified matters. Modern software copes with such "path equations" for programs as large as thousands to hundreds of thousands of lines of code. Program size is not the only factor, though. The formula size also depends on how precisely it describes the state of the variables and of the memory as a whole. For instance, pointer alias analysis makes the equation bigger for a given program.

Analyzing Program as a Whole

So far we've examined how tools like BLAST reason about individual paths. How does this turn into whole-program analysis? The problem here is that any program containing a loop may have an infinite number of paths to analyze. Analyzing them one by one will never lead us to a final solution if we only check a single path at a time. We need to somehow analyze paths in large quantities.

One possible pattern for such an analysis is to split all paths into classes, such that we only need to analyze one path and can then conclude that the result of this analysis applies to a whole class of other paths, potentially infinitely many. This approach is applicable to many different tasks and is not limited to the problem we are discussing.

At this point we notice that, wherever the program counter is, the behavior of the program is completely predetermined by the values of all variables, provided the subsequent user input is also predetermined. For example, assume that at line, say, 15, in the middle of the loop, the values of all variables and of all readable memory cells are the same on the 2nd and the 3rd iterations of the loop. It immediately follows that the paths that "start" at each of these points (given the equivalence of the memory) have a one-to-one match between the two "classes". Getting back to the original problem: if we prove that no path that iterates only twice has a bug on it, then the paths where the loop iterates three times or more don't have bugs on them either. This is how we can decrease the number of paths to analyze from infinity to several.

However, loops where the memory states are equal on subsequent iterations are rare. Such loops are either infinite (which is good for our analysis, since we don't encounter a crash if we loop forever doing nothing :-), or consist of reading user input only. The loop in the division-by-zero program does satisfy the condition, but what if it looked like this:
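Judging by the path equation given below, the modified loop would look roughly like this (read_int() as in the earlier sketch; the printf is only there to make the counter observable):

    #include <stdio.h>

    int read_int(void);   /* as defined in the earlier sketch */

    int main(void)
    {
        int attempt = 1;
        int x = read_int();
        while (x <= 0) {
            printf("Attempt %d failed, try again\n", attempt);
            attempt = attempt + 1;   /* the memory state now differs between iterations */
            x = read_int();
        }
        return 100 / x;              /* the division we want to prove safe */
    }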

This way, the attempt counter is different on each loop iteration. How do we tackle this one?

From State of the Whole Memory to Regions

While the memory state of the program in question is different on each loop iteration, this really does not matter. Indeed, whether x equals zero does not depend on the attempt number. If the user enters zero on the second attempt, he or she could have entered it on the very first attempt, and our analysis algorithm accounts for that, as shown above in the "Modeling Nondeterminism" section. So all paths with two or more loop iterations are equivalent to the path with one iteration only--if we consider only the relevant memory cells, those that really affect whether the bug appears. How do we detect these memory cells?

Let's get back to the analysis of a single path. The path that only contains one loop iteration is converted to this equation:

attempt_old == 1, and
attempt == attempt_old + 1, and
x >= 1, and
x == 0

The SAT solver analyzes this and says, "this equation has no solution." But what if we ask "why?" Why does this equation have no solution? "Well," we could answer, "because x can't be at least one and equal to zero at the same time!" What is important in this answer is that it never mentions the attempt variable. So we may discard this variable from the list of "important" variables, and compare paths with one or more loop iterations based on the value of x only; from this point of view they are all equivalent.

Of course, if the value of attempt is later discovered to be relevant (i.e. someone tries to divide by it), the equivalence of the paths may be reconsidered. But sooner or later (if we're lucky: remember that the problem is not solvable universally!) we run out of relevant variables.

Note that we have so far dodged the question of how we realize, automatically and without asking any human, that x is relevant and attempt is not (that's the whole point of automated large-scale program analysis). First, we can try to ask the SAT solver. Such solvers do not run on magic; they execute established algorithms that can sometimes give us hints about why the answer is such and such. Second, we can play with the equation a bit, trying to remove the parts related to the attempt variable and checking whether the equation is still unsolvable. Third, we can use a mathematical concept that formalizes the "why?" question, such as Craig interpolation.

In most cases, such algorithms not only tell us which variables are relevant, but also give us hints about when they are relevant (say, Craig interpolation may tell us that it's enough to check whether x is greater than zero rather than track its actual value). This also solves another problem: tracking exact values of variables is suboptimal. If all that matters is that x is greater than 0, then every single positive value of x is equivalent, which would not be the case if we separated paths based on the exact value of x.

Such "conditions" that constrain the memory values are composed into "memory regions", (potentially infinite) sets of memory cell values that can be used instead of the previous execution history. Indeed, at each line we need to either know the complete history, what lines have been executed by now, or know the current memory state—or at least some constraints on it. We don't need to know both to predict how the program will behave after we continue execution.

The Overall Algorithm

To sum up the previous sections, here's how the anti-bug program analysis algorithm works:

  1. First, consider that memory is totally irrelevant.
  2. Try to find some path that ends up in the ERROR location, given that any memory cell may hold any value at any time (even if variables change magically between the moment they're written to and the moment they're read).
  3. If there's no such path, there's no bug!
  4. If there is such a path, convert it into the path equation, and try to solve it.
  5. If the equation has a solution, then the bug is here.
  6. Otherwise, ask the solver or other advanced tools "why" the equation is unsolvable (a more generic question is, "why is this path infeasible?").
  7. Reevaluate the paths in the program, not blindly this time, but tracking how the memory regions obtained at the previous step affect the equivalence of the paths (we still let the other variables change magically, though). Only follow one path among those equivalent to each other.
  8. Is there any path that ends up in the ERROR location under the new circumstances? Go to 3 or 4.

The algorithm is laid out this way to explain how it works. An explanation of why it is correct, as well as the most convenient framing for advancing the state of the art, requires thinking from the last point to the first: one should first envision a novel representation of the multitude of program paths, and then devise how to compute it efficiently.

If we're lucky, the algorithm finishes. If we're even luckier, it finishes in a sane amount of time. Depending on the task, "sane" ranges from five minutes to several days. In LDV, we considered it unacceptable for an individual driver to take more than 15 minutes to be verified against a single bug type, but the Linux kernel contained 4,500 drivers. In a less massive setting the time constraint can be relaxed; some mid-2000s studies considered three days acceptable for exhaustive verification of 2,500-line programs.

This algorithm pattern is called CounterExample Guided Abstraction Refinement (CEGAR). It was devised less than 15 years ago, and it builds on a much earlier advancement by Cousot & Cousot, who were the first to propose to "abstract away" from the actual variable values (which is what we did when we allowed variables to change magically unless specifically tracked), and to iteratively "refine" the "abstraction" by tracking more and more properties. This is not new stuff, but it is not old either.

We omitted certain points in these algorithms. For instance, even the early framework algorithms, even before being implemented, envisioned tracking whether classes "fold" into each other rather than only whether they're equivalent, hiding the exact implementation behind the "lattice" abstraction of program states. Also, the way "memory regions" are affected by statements is a complex theory of its own, and it's surprisingly easy to create an algorithm that simply piggybacks on the path-formula building and SAT solving; see this paper for details if you're curious. The paper also outlines a framework approach to alias analysis and memory modeling.

Common Limitations

The common limitation of this algorithm (compared to other static analysis algorithms) is that it relies on complex equation-solving and proving tools that are only well-defined when working with numbers and not overly complicated formulae. Even C programs, however, are fairly complex, and consist of complicated concepts such as arrays, pointers, structures, bit shifts, and bounded integers (where adding two positive integers sometimes makes the result negative). Converting them to good ol' numbers introduces complications. For instance, it becomes increasingly hard to answer the "why?" question as you introduce operators beyond addition, subtraction, and multiplication: bit shifts, modulo calculations, and square roots make it nearly impossible. The fact that a pointer may point anywhere—or, at least, to several locations—makes the number of variants of program behavior explode. And if it's a pointer to a function, things get even worse.

Still, the approach works. We found several dozen bugs in Linux kernel device drivers—and keep in mind that we did not have the actual devices! Slowly, scientists are making it less fragile and more reliable, and some day it will be commercialized too, joining products based on different static analysis approaches, such as Klocwork and Coverity.

Meanwhile, you may try such research tools as BLAST or CPAchecker, as well as check out the other SV-COMP participants.

Conclusion

So this is one of the approaches to program verification. There are others, but you may always notice the same pattern in them: abstract away from the concrete program, discard everything that seems impossible, and hope that you discard all bugs and bugs only. However magical it seems, it does work. If you run Windows, your PC is actually executing millions of lines of statically verified code. Microsoft has been carrying out a lot of really interesting and relevant research and development in this area for the past decade, as I learned at one of the conferences hosted by MS Research.

I wanted to write this article for years, but never managed to find time for it. This was fun even though I have stopped active participation in static analysis research and development. I hope I have outlined one of the approaches to program verification in a way a non-scientist can understand. If not, mail me or post a comment, and I'll improve the article. Don't hesitate to ask about anything in the comments, and I'll try to answer your questions—or at least point you somewhere.

Time-Based Protests

Sample time-based Protest

(photo by Ivan Afanasiev). The largest "Strategy 31" time-based protest I attended (January 2011). A protester holds a sign with the protest logo.

A month ago I bumped into a protest manifestation in Moscow. It was one of a series of events the "Different Russia" movement has been organizing since 2009. The interesting thing about this event is that it happens on the 31st of any month that has enough days. This day was chosen because Article 31 of the Russian Constitution, which states the freedom of assembly, was, the movement believed, being violated. The Russian protest movement considered it innovative to hold a massive manifestation on a specific date regardless of whether it was approved. For software engineers, however, doing such a thing is not an epiphany of any kind. We call it "time-based releases", so I call this way of arranging protests "time-based protests".

Time-Based Releases

A time-based release is a pattern in software engineering which has taken over the open-source world's mind during the last five years. It means that the product team promises that each new release will be completed on a pre-established schedule, no matter what new functionality or bug fixes it is about to contain. The release schedules are then arranged to ensure that a complete release cycle, with such stages as feature proposal and selection, development, testing, and distribution, completes by a given date. Release cycles often span from several months to years.

Many open-source projects switched from "when it's done" to time-based release schedules. The most notable examples are the Fedora and Ubuntu Linux distributions, which followed the GNOME project. Our ROSA Desktop Linux distribution does this as well. It is usually said that time-based releases help to improve stability, to maintain sustainable product delivery, and to strengthen collaboration between developers, testers, and users.

The key point here, however, is that the time-based release model trades innovation for an established process. No matter how many new, potentially innovative features we develop, we should release at this point, over. In the aforementioned Ubuntu this has drawn criticism.

Sometimes I feel that just making release schedules is enough, and that this is why people use time-based releases as their planning paradigm in the first place. The thing is that you can't make a time-based release without a schedule. This makes me wonder if some projects opt for the time-based release model just to finally force themselves to write release schedules.

Why It Should Work Then

The most important reason to choose time-based releases over "when it's done" is that this makes you think less and make fewer decisions. Why is this important?

There is a theory (I heard it at a conference, but failed to find any evidence of it at all, so let's consider it a fairy tale that is to teach us something useful :-) ) that the number of decisions a person can make per period of time is an expendable resource. Making each decision supposedly makes our body spend some specific amino acid; its supply replenishes with time when you rest or have a meal, but it never hurts to save some.

So, making decisions is hard, and making as few decisions as possible is a good thing rather than a bad one. Any open-source project has a lot to decide upon, and making one decision fewer is a small relief of sorts anyway. That's why agreeing in advance on a release period lets the project save some strength for other things.

A funny story happened at the first meeting the protest committee had managed to get approved by the police. The committee split into two groups, one of which, led by Limonov, tried to manifest in a place not approved by the police.

Their aim was to protest against any collaboration with the police, the very necessity of which these manifestations were initially organized to protest against. At the meeting an interesting case of oppression happened: the police picked up the leaders of the "unapproved" protest group and forcibly moved them to the approved location, saying, "This is where you should protest [against us]."

What "Strategy 31" has become

(photo by RBC). This is a typical picture of a recent manifestation on the 31st. A lot of people wandering around with cameras, some police, and only a few protesters.

Do you want your software project to become like this?

...And Why It May Not

However, one should never forget the ultimate mission, and never let keeping the schedule put the cart before the horse. I feel that fitting innovations into a time-based release schedule is much harder than fitting them into a release that has more freedom in picking its final date. It's not impossible, no, just harder. Mark Shuttleworth, the leader of Ubuntu, says much the same in his article on the matter.

I can't back these claims up with statistical data about any software projects. What I did see, however, is how the energy of these "time-based protests" decayed over time. Many political commentators observed that the regularity became the only thing these protests could deliver. At first, it was enough, because the meetings were suppressed with an unusual degree of violence and without lawful reason. After the government relaxed the crackdown on these particular meetings (there were even two that it approved), the participants realized that the assemblies had lost their nerve and no longer carried any message whatsoever.

This is an important lesson for software engineers as well, especially for those involved in open-source projects, which consist of volunteers just like public assemblies do. Establishing a regular schedule is an important thing, but it is only the first step. If you commit to it too much, and sacrifice the nerve and the sense of the project to keep it, you will end up like the time-based protests "Different Russia" was arranging. At the manifestation a month ago, I saw dozens of police officers, a hundred loiterers who came to take a look at the meeting and photograph it on their iPads, and only 1 (one) actual protester.

A Different View on the Strategy Pattern

If you read a short description of a pattern in a typical pattern book or in a Wikipedia article, you will most likely not realize its full power. True realization comes when you derive the pattern from the programming tasks you have to accomplish on your own, bottom-up.

A month ago I was chatting with a peer who was about to finish his Object-Oriented Programming course. One of the patterns he mentioned was Strategy. As I didn't recognize at once what it was about, I assumed it was fairly useless. Having gotten home and refreshed my memory with this article, I decided that this heuristic, of inferring a pattern's uselessness from whether I remember what it's about, had led me to the right result. The pattern was about encapsulating a family of algorithms behind an interface. The lack of point was confirmed by a detailed, but not reassuring, description of how Strategy differs from the Bridge pattern.

"Well, just a yet another useless pattern," I had thought... until I found myself writing a bunch of strategies in my script three days later.

The Strategy pattern

First, let's refresh what the Strategy pattern is. This is yet another behavioral pattern that supplies the programmer with an encapsulation of behavior. Strategy encapsulates several methods behind a common interface, and allows a programmer to switch between strategies seamlessly and reuse them in different environments.

This functionality is not unique. The Interface pattern, virtual functions, and even plain function pointers provide more or less the same; the remaining differences are only interesting to OOP aficionados. Why bother with Strategies at all, then?

There is one case, however, when thinking in terms of Strategies is not artificially induced by OOP courses, but rather is quite a natural way to think about your algorithms. This is when you play a game--or write a computer program that plays one.

Strategy games

There is a whole genre of video games named "Strategies". Though gaming is not the main focus of this post, a lot of what I describe below applies to these games as well. Sadly, in these days of video game consoles, Strategy games do not get as much traction as they used to.

"Still narrow", you might think, "I'm not a game AI programmer, so I'm not going to need this". This argument is relevant indeed, but it.misses the fact that some activities may be modelled as games. This doesn't necessarily refer to gamification, a popular concept in our times of ubiquitous crowdsourcing and Web2.0.

The games I'm talking about are closer to mathematical game concepts, where the environment of your application is modeled as an "adversary" who doesn't really tend to respond to your actions in the nicest way possible. An example is the script I was implementing: I wanted to create a crawler for a web forum that should download new posts as soon as possible, given that posts may be deleted by evil moderators, and this should not confuse the script. I'll expand on how this was modeled as a "game" later; let's talk about strategies for now.

Let's take a less abstract example of a game to explain what the strategies are about. You have played Battleship, a game where you are to destroy the ships on the adversary's board in as few moves as possible. In this game you make a move by shooting at a particular cell of the battlefield, and get a response: whether you missed, hit a ship, or destroyed it completely. In fact, you have a "Strategy" that reacts with moves to the messages from the battlefield. (We intentionally leave the art of ship positioning aside.)

Implementing a strategy for the Battleship game was the very first task I was given as an intern at my first job at ISPRAS. We were to implement a strategy for the game, and our algorithms competed with each other afterwards. They played thousands of matches, each being a comparison of who was faster to shoot out a randomly generated battlefield. I came up with two strategies: one hacked into the random number generator by srand()-ing it to a predefined value (that one won all the matches), and the other was built via a series of experiments with combinations of the strategies implemented following the pattern described in this post.

The programs we wrote back then revealed a lot about our subsequent professional lives. One of my peers pre-analyzed the probabilities of a cell being occupied by a ship, and built his strategy on top of this; he's finishing his Ph.D. now. The algorithm of the team supervisor was the simplest, the fastest, and had the largest number of victories (naturally). And my algorithm was over-engineered, slower than the others, but flexible and efficient enough: it scored 3rd in the competition.

Most strategies look like this: shoot the battlefield along the diagonals, starting from the middle, until you encounter a hit. As soon as you do, finish the ship you hit by shooting the nearby cells, and then keep shooting along the diagonals.

Of course, a multitude of other, more complex strategies exist. Each of the strategies, however, fits a common interface: given the result of the previous move, make the next one. A strategy should thus implement the following interface:
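In rough C++ terms (the later mention of C++0x lambdas and NULL suggests the original examples were C++; the type names here are illustrative), such an interface could look like this:

    // A move: which cell of the battlefield to shoot at.
    struct Move {
        int row;
        int col;
    };

    // What the battlefield answers to a shot.
    enum ShotResult { MISS, HIT, KILL };

    // The single-method strategy interface: given the result of the
    // previous shot, produce the next move.
    class Strategy {
    public:
        virtual ~Strategy() {}
        virtual Move next_move(ShotResult previous_result) = 0;
    };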

Decomposition of a complex, stateful strategy

A strategy to play even such a simple game may be quite sophisticated. The implementation of next_move will embody a number of states we may currently be in (such as "shooting along diagonals" or "trying to finish a ship"), and the strategy will also implement complex logic to transition among the states based on the shot result. This pattern, known as a "State Machine", is viable, but the whole logic may be simplified.

We can use the rethought Strategy pattern to decompose the strategy into smaller ones. The "global" strategy described above in italics may be turned into a local one as follows. Instead of approaching the way we play a game as one "big" strategy, we consider that our strategy may change on each move as well. In response to a message from the field, the rethought next_move function now returns the strategy we are to use from now on as well as the next move.

This way, we begin with "shoot diagonals" strategy; it changes to the "kill ship" strategy as soon as the response is "hit", which, in turn, changes back to the shoot diagonals strategy as soon as the response is "ship dead".

A cleaner way to implement this is to change the interface of a strategy slightly. Instead of a single next_move function that both returns the next move and changes the internal state, we transform the interface as follows:
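Continuing the hypothetical C++ sketch from above, something along these lines:

    class Strategy {
    public:
        virtual ~Strategy() {}

        // The next move according to this strategy, or NULL if the
        // strategy has nothing more to say and wants to be switched out.
        virtual Move *peek() = 0;

        // Consume the result of the last move and return the strategy
        // to use from now on, possibly this very object.
        virtual Strategy *move(ShotResult result) = 0;
    };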

The peek function returns the next move you should make according to this strategy, and the move function returns the next strategy.

This offers a simpler way to switch strategies. When the strategy feels that it should be switched out, it returns NULL from peek; this lets the caller detect that the strategy is about to switch, so it doesn't pass any move to the adversary. Instead, it asks for the next strategy and peeks at that one. The overall algorithm now looks like this (memory allocation is intentionally put aside):
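A sketch of the caller's loop under these assumptions (Battlefield is a made-up stand-in for the game state, and the exact calling convention is my reading of the description):

    #include <cstddef>

    // Hypothetical game state: can take a shot and tell whether the game is over.
    class Battlefield {
    public:
        ShotResult shoot(const Move &m);
        bool all_ships_destroyed() const;
    };

    void play(Battlefield &field, Strategy *strategy)
    {
        ShotResult last = MISS;
        while (!field.all_ships_destroyed()) {
            Move *m = strategy->peek();
            if (m == NULL) {
                // The strategy wants to hand over control: don't shoot,
                // just obtain the next strategy and peek at it next time.
                strategy = strategy->move(last);
                continue;
            }
            last = field.shoot(*m);           // make the actual move
            strategy = strategy->move(last);  // usually returns itself
        }
    }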

You may do this in a more complex but better encapsulated manner. In the peek function, you may construct the next strategy and peek at its next move; then save that strategy, route the move function to it, and return the strategy you built.

The implementation for the Battleship game strategies would look like this. The "shoot diagonals" strategy is initialized with a battlefield state, and this should be enough for choosing the next move. As soon as a hit is encountered, the peek function returns NULL, and the move function returns the "finish the ship" strategy. The "finish ship" strategy is initialized with the last hit, and explores the area around that hit. As soon as the result is "killed", it initializes the diagonal strategy with the current battlefield, and returns it as the next strategy.

Now the implementation looks much cleaner and better decomposed. You can combine and chain various strategies together, discovering the most efficient one through experimentation.

Note that this algorithm doesn't have any protection against infinite loops, so you have to ensure the termination manually. On the other hand, no other way to implement a strategy for a game has any protection against this.

A functional way to encode the strategy change

The decomposition may be continued. We see that the implementation of the "kill ship" strategy has the switch to a specific strategy hardcoded. This limits the reusability of the class, because we would like to switch to killing a ship and back from many different strategies, not just from the diagonal one. This hardcoding may be removed via subclassing or via functors, i.e. by passing a function to the strategy's constructor (I prefer the latter).

When a specialized strategy exhausts the moves it can, or ought to, make, it calls the function supplied to its constructor (or a virtual method, if you preferred subclassing), and this function returns a new strategy. Most likely, the function is taught to return the previous strategy via a closure. For instance, the "shoot diagonals" strategy's move function may contain such (pseudo)code (I tried to use C++0x lambdas here):
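As I read the description, it would be roughly this, assuming a made-up KillShip strategy whose constructor accepts the hit cell and a std::function<Strategy *()> callback:

    #include <functional>

    Strategy *ShootDiagonals::move(ShotResult result)
    {
        if (result == HIT) {
            // Hand control over to a "kill ship" strategy, and teach it
            // how to get back to us once the ship is finished.
            return new KillShip(last_hit_,
                [this]() -> Strategy * { return this; });
        }
        // Otherwise keep shooting along the diagonals ourselves.
        return this;
    }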

Concurrent execution of strategies

An established interface allows us to define combinations of strategies. Assume we would like to shoot along the diagonals in different regions of the battlefield. Instead of implementing a strategy that handles two, three, etc. regions, we may implement a strategy composition.

A simple composition takes an array of strategies and asks them for the next move in a round-robin way. In some cases, that is a sufficient composition. In Battleship, however, we don't want, for instance, to shoot into the other regions once we've hit a ship; what we want is to pay full attention to that ship. In this case, we would use a priority queue to store the strategies instead of an array.
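The round-robin composition could be sketched like this (the priority-queue variant would differ only in how the next sub-strategy is picked):

    #include <cstddef>
    #include <vector>

    class RoundRobin : public Strategy {
        std::vector<Strategy *> parts_;
        size_t current_;
    public:
        explicit RoundRobin(const std::vector<Strategy *> &parts)
            : parts_(parts), current_(0) {}

        virtual Move *peek() {
            return parts_[current_]->peek();
        }

        virtual Strategy *move(ShotResult result) {
            parts_[current_] = parts_[current_]->move(result);
            current_ = (current_ + 1) % parts_.size();  // the next region's turn
            return this;
        }
    };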

What about forum downloading?

Now assume we are to download all messages from a forum, given

  1. each message has a unique id that automatically increments;
  2. there's no way to get the current maximum id without posting a new message;
  3. moderators may erase messages, and there's no way to distinguish whether a message with a particular ID was erased or never existed (aside from encountering a message with a greater ID);
  4. we should be capable of both getting new messages and downloading the old messages at the same time.

We can model the work of such a crawler as a Battleship-like game. Our "move" consists of the ID of a message to download and of a pause to make before doing so. The response is the message contents or a notification that it doesn't exist.

Our "strategy" would be a round-robin combination of two strategies. The first goes down from a specified id down to the message number one. The second waits for the new mesages in the following way.

The strategy goes up message by message until it encounters a nonexistent message. Then it switches to a "random walk" strategy that looks for a message with a greater ID that does exist, making appropriate pauses. No matter how many messages a moderator deletes between our attempts, we should be able to find a message that is not deleted. After we have found one, we may be sure that every ID between the last message we downloaded successfully and the one we found during the random walk either belongs to a message that still exists or has been removed forever. This range is handled by another strategy, which then switches back to the "random walk", looping the execution.

I implemented a slightly different downloader, given that we always know the ID of the last message, regardless of whether some of the latest messages were deleted. You may check its sources here (the downloader as of the time of writing), and here is the latest version of the script.

Other applications and limitations

The most obvious limitation is that all the strategy classes you implement depend heavily on what your move is. Each strategy class you implement is going to construct move objects. If the interface used in their construction changes, or its meaning becomes different, then you have to alter all the classes you implemented. The same applies to the response objects.

Limited applicability

One of the oldest turn-based strategy games, chess, is an example where the approach described is not the best one possible. A more effective way to play chess may be something like alpha-beta pruning rather than structural exploration.

Trying to analyze the possible applications of this strategy framework, I came to the conclusion that it's quite limited. It is a nice way to model a strategy when your game is to explore terra incognita, to reveal what is initially hidden. Indeed, in Battleship you do not react to intelligent moves of your adversary. Instead, you explore a carefully prepared field, initially covered by the fog of war. The forum downloader is no different: the actions of the evil moderator are in no way affected by the "moves" the program makes.

The framework supports parallelization via clever strategy combination classes, like the one I described earlier. Note that this abstraction moves the parallelization code away from the strategies themselves.

I don't think this strategy framework is useful for games where you have to make intelligent moves, such as Chess or Go. Moreover, even in "exploration" games (such as Battleship), its ability to combine and decouple may simply lose to a better-tailored but less decoupled counterpart. However, this framework is invaluable when you need to explore a complex and yet uncovered environment with limited resources, and want to experiment throughout the process.

BLAST at the Competition on Software Verification'12

At my previous job in a research laboratory, I was improving a static analysis tool named BLAST. The name abbreviates its origin and decomposes as "Berkeley Lazy Abstraction Software verification Tool."

By the way, my former peers have prepared a nice comparison of Static versus Dynamic analysis for the Linux Kernel.

The tool solves a problem commonly described as unsolvable: the reachability problem. BLAST is capable of telling—sometimes—whether a certain line of code is reachable. It doesn't just run the program with some input data; rather, it reads the program source code and tries to figure out whether there exists input data that makes the program reach a specific location, a specific line of code.

Of course, it can't actually solve the problem in general (nothing can). What it can do is prove, for a certain fraction of programs, that a line is never reachable. A much more useful purpose is finding bugs. For instance, if we want to check for null pointer dereferences, we could insert checks like this:
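The inserted check would be along these lines (the exact shape is tool-specific; ERROR() only marks the line whose reachability is decided):

    #include <stddef.h>

    void ERROR(void);   /* never returns; only marks the "error location" */

    int deref(int *p)
    {
        if (p == NULL)
            ERROR();    /* reachable only if a NULL pointer can get here */
        return *p;      /* the dereference we want to prove safe */
    }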

And it does find such bugs. At ISPRAS, a several-year effort within the Linux Driver Verification program/project has led to finding several dozen problems in Linux device drivers. I even committed fixes for some of these bugs to the kernel sources.

Tool Competitions

Well, I got distracted again. I've been planning a big blog post about this amazing technology for years, and I'll write one, but not now. Anyway, it was not me who devised how a program that checks reachability of lines in other programs works (it was the two Cousots), and I only fixed and improved an already released open-source tool, BLAST. I did create some novel things, but they were quite minor to my taste (just a new zero-overhead structure aliasing algorithm).

So, my last project at the Institute for System Programming was to prepare the tool for the Competition on Software Verification held at TACAS'12, which, in turn, is a part of ETAPS'12. There are a lot of domains where people write tools to solve the unsolvable (like the reachability problem) or the NP-hard (I wrote that you should try to solve such problems anyway). Such problems cannot be universally solved, but one tool may solve them "better" than another, and it's interesting and useful to know who's the best.

A natural way to determine who is "better" is to measure the results of different tools on a common set of representative problems. This leads to specialized competitions among the tools that solve a specific problem. SMT-COMP for Satisfiability Modulo Theories solvers and the Answer-Set Programming System Competition are only two I know about; you could also name a couple of Chess and other formalized game tool competitions. I hope that SV-COMP will become one of them.

Our contribution

The Cousots' approach is about 30 years old, and the technology behind BLAST itself is about 10 years old. However, BLAST contained a lot of improvements that made it applicable to many programs, not just artificially created small "research-like" programs. That's why we chose it for finding bugs in Linux drivers, and worked hard to improve it. Our improvement efforts are stored here, and are available under free software licenses, just like the original tools.

But, being a less developed technology, it was eventually overtaken by more recent tools, such as CPAchecker, which provided a better ground for research experiments and had a more developed theory. Still, the competition results demonstrated that the tool was still capable (and I hope that our improvements played a significant role in that). We got 5th place with 231 points, the 4th place being occupied by SATabs with 236. The final table was divided into two groups (200+ points and the rest), and we belonged to the top tier.

What was funny is that we even got a plaque for achieving the best results in the "DeviceDrivers64" category... naturally. Too bad I wasn't there to see the ceremony :-(. Here's a photo of our team, which contributed to the improvement of BLAST since version 2.5:

By the way, I already featured my former peers in a fictional interview about a "collaboration tool" by Penn&Paper's.

(Left-to-right: Vadim Mutilin, Pavel Shved, Mikhail Mandrykin.)

I'm sure that this project will eventually be superseded by more advanced software, but I still hope that I'll have time to have some fun with it as well.

A more detailed post about the verification technology is pending; meanwhile, I'd like to thank my former peers for the great time I spent at ISPRAS working on what I had wished to do since I was a 7th-grader who first heard about the halting problem. And I wish all the best to the Linux Driver Verification project as well.

Logging in Expressive Languages

I am not aware of any strict mathematical definition of the expressiveness of a language. Informally, a language is more expressive if it allows you to achieve the same with fewer keystrokes. From this point of view, for instance, Perl or Python is more expressive than C. Of course, you can't always achieve with Python what C is capable of at a low level, and therefore one may question whether Python is really more capable than C. But capability has little to do with expressiveness. In certain domains, one language may be more expressive than the other, and in other domains it could be just the opposite. (For instance, Python is more expressive in string operations, while C is more expressive in low-level raw data representation.)

One may notice, however, that the majority of languages are Turing complete, so you basically may express the same things with all of them. What is the difference in expressiveness then, as seen by a practitioner rather than a computer scientist?

I usually use the following criterion: if you can write one non-splittable expression instead of a chain of statements, the language is more expressive. Compare the following two programs that compute the sum of an array, first in C:

...and in Ruby:

...and, in C, we should also find a place to deallocate the array when it's no longer needed.

You may notice that Ruby achieves with a single expression what C needs several operators and variable declarations for. Three Ruby features contribute to this: anonymous lambda functions, the syntactic sugar to pass them to functions (such as inject), and the possibility to avoid the really useless variable declarations. Isn't it always that beautiful?

In some cases it's not.

What does it take, to log?

Now let's try to add some logging to the programs above. Assume we actually want to hook into the loop and make it print the square it has just calculated to a logging stream.

Let's take a look at the code again.

C:

Ruby:

Ewww! Why does our Ruby code look like C? Where is our one-liner? Where is the expressiveness? How come we're explicitly introducing a temporary variable, just like in C?

I spotted such patterns, "compute-log-return" in a lot of places in my practice. For i instance, the BLAST codebase I maintain, contains these here and there, because it's written in an expressive functional language (OCaml), and uses logging as a primary debugging mechanism. Because of it, the code that should look beautiful looks as an old prostitute trying to mask their weary skin with tons of makeup, and I'm not happy with this.

Trying to get to sleep on a plane, I suddenly realized that this logging could be achieved with better code. Since "assign-debugprint-return" is a recurring pattern, let's define a helper function.

Then, the resultant Ruby code will look like this:

Now our Ruby code looks like Ruby again.

Of course, this is not limited to Ruby. Though less versatile, the same function may be implemented for the C code as well.
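A sketch of such a helper in C, with a made-up name log_int, might be:

    #include <stdio.h>

    /* "Compute-log-return": print the value to the logging stream and hand
       it right back, so it can be used inside a larger expression. */
    static int log_int(const char *label, int value)
    {
        fprintf(stderr, "%s%d\n", label, value);
        return value;
    }

    /* Usage inside the summation loop: log each square without introducing
       an extra temporary variable. */
    static int sum_of_squares(const int *a, int n)
    {
        int sum = 0;
        for (int i = 0; i < n; i++)
            sum += log_int("square: ", a[i] * a[i]);
        return sum;
    }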

And this brings us back to the original comparison: one line in Ruby versus five in C.

Discussion

While you may write such a function for a less expressive language (as seen above), it will "stand out" in a large sheet of C code and look like an ugly duckling. The languages that may benefit most are those that mix functional and imperative coding styles, such as the aforementioned Ruby and OCaml. The reason this helper is useful there is that it's relevant to functional languages in two different ways.

First, functional languages allow the programmer to define anonymous functions inline, which makes them frequently passed as parameters to other functions. In C, defining a separate function is, in most cases, uglier than just adding a convenient logging shortcut.

Second, logging an intermediate result contradicts the functional paradigm itself. In functional languages you define a computation as a whole rather than specify a concrete sequence of actions. To log, you have to break a chain of functions with a side-effecting operation. Of course, you can try to make it look nice, as we did, but the very reason we need the kludge is that we're doing something a bit wrong.

But it's not limited to functional languages either. Most expressive languages that do not require the user to strictly specify the type of every identifier can benefit. For example, we might want to debug Perl's regular expression substitution using a similar function:

(we couldn't use log because it conflicted with the mathematical logarithm function that bears the same name).

***

In any case, the "compute-log-return" function that encapsulated the debug print is a widely useful pattern. And it's especially useful when you're writing a program in an expressive language. Do not let the programming cruft conceal the actual idea of your code, use loggers that hide all the details.