Multithreaded Consensus Versus Practice

Many programmers complain that most of what they've learned in their universities is completely useless in their daily work. Anecdotal evidence, a survey among some MIPT alumni, suggests that the median fraction of courses actually useful to a former student is about 5%. That's like taking one class per week instead of whole week of studying, so could you simply follow those wise brats who indeed spent the whole week smoking weed and boozing?

I'm sure that you simply should look closer. In this post, I tell how a portion of academic knowledge appeared in my work out of the blue.

The Challenge

Last week, we've been trying to improve concurrency of a multithreaded messaging system. When multiple threads wait for incoming messages with distinct labels, one and only one of them has to do the work of parsing the incoming message and directing it to the thread that actually waits for it. Choosing a designated parser thread was not possible, so the waiting threads should somehow assign the work to one and only one of them.

This is not a hard problem, but we had some minor block when we first approached it. So I suggested to step back and take a look at the theory.

That Sounds Familiar

I recall that making several threads designate one and only one of them as a "worker" is called "consensus problem" (Wikipedia provides) a slightly different but equivalent definition.) I also recalled, from my reading of TAoMP ("The Art of Multiprocessor Programming" by Herlihy and Shavit, link), that it was used to demonstrate limitations of various synchronization primitives, and most of them were incapable of solving it. That's right, in the world of formalized computing, some software problems can be proved computationally infeasible. An by "infeasible" I don't mean making a decent AI in first-person shooters, which is definitely infeasible, that's more like Turing machines and Halting problem. So maybe "mutex" is one of the tools that can't solve Consensus?

Who Can and Can Not Solve Consensus

A concurrent object is wait-free if there exists and upper limit N on time/instruction count it takes every call to complete, such that the same limit remains in effect as the number of thread grows indefinitely. Very few concurrent algorithms satisfy this property, and those who do are considered impractical.

Chapter 5 of the book is devoted to assessing relative strength of synchronization primitive operations, and it uses their capability to provide a wait-free solution to consensus problem as a measure of their strength. It later turns out, in chapter 6, that Consensus problem is universal enough to construct a wrapper that turns any single-threaded object into a thread-safe one. So, which synchronization primitives are universal?

Consensus vs. Registers

Assume you only have “simple” memory cells or registers at your disposal, which can turn either way when multiple threads write to them simultaneously. Given these, you only can solve Consensus in a wait-free manner for no more than one thread. (In other words, they are completely useless).

The complete proof is sketched in the book. It's constructed as follows. Assume Consensus can be solved for two threads. At one point, the memory state must be such that the next operation over one of these registers in some thread will determine the outcome. Then all the cases of these operations are enumerated, and viable executions, which usually end up with one thread “running solo,” are constructed. They lead to a situation where the algorithm should return different results based on the same thread-local data, which can't happen.

In particular, this can't happen if you're trying to solve Consensus for two or more threads with simple registers. It is scientifically thus proven that one needs more sophisticated synchronization primitives than just reads and writes to one memory cell. Which could these be?

Consensus vs. Multi-Register Operations

Round Robin Table

That's a sample round robin table with some competition results. Threads fill a similar data structure concurrently when they solve Consensus with multi-cell atomic writes.

If you can atomically write to N*(N+1)/2 memory cells rather than just one, you can solve Consensus for N threads. The impossibility of solving it for more than N follows the same scheme as outlined in the previous section, but the sample solution is very interesting.

Imagine that threads compete in a round robin tournament; well, not really compete, but simply fill the table with its results. Each game is either a win or a lose, and the score is a number of wins against threads that played at least one game.

When a thread participates in the Consensus, it atomically writes “I lose” in N-1 cells corresponding to its results; it also writes something in the “against itself” cell to distinguish itself from those who did not participate yet. If all threads do that, then, whenever somebody examines the table and computes the score of each thread. the thread who has the highest score will never change. The highest-score thread will always be the same one even if the reads are not atomic, and results change as one is reading the table.

Consensus vs. Complex Data Structures

Section 5.4 of the book demonstrates a proof that synchronized FIFO queues can't solve Consensus for more than two threads. It also claims that the same is true for “many similar data types, such as sets, stacks, double-ended queues, and priority queues,” which I don't have any reason to not believe.

Consensus vs. Atomic Integer Operations

By “atomic integer operations” I mean such functions as swap(&x, v) (set x to v and return the previous value) and compareAndSet(&x, e, u) (set x to u if x == e; do not change otherwise; return whether the value was equal to e) and others. Infrequently used, they are somewhat known to system programmers.

Turns out, these operations are not created equal. For instance, swap can't solve Consensus for more than two threads, while comareAndSet can solve it for infinitely many threads with a very simple algorithm:

Mutex Solution Paradox

The book begins with demonstration how one can implement mutexes using only simple registers. I formulated one of such implementations as a puzzle in my previous post. If we could use mutexes succesfully to solve Consensus, we could have replaced them with register implementations, which is impossible. Does it mean that we can't solve consensus with mutexes for more than one thread? Well, what about this simple algorithm?

More than this, we can implement the all-powerful compareAndSet with a mutex by wrapping the comparison and setting into a mutex' lock-unlock. Yet the book claims it's not possible. How come?

Different Flavors of “Solve”

A concurrent object is starvation-free if every call to its method eventually completes regardless of how threads are scheduled. For instance, an algorithm that hangs if only one thread is (unfairly) scheduled, while others sleep indefinitely for no reason, is not starvation-free per se, but may be such if OS promises not to do this, and be fair. It is weaker than wait-freedom because it doesn't require a global bound on the call length.

The mutex-based consensus algorithm is only correct if mutex has strong enough progress condition, for instance, if it's starvation-free (wait-freedom is not required here; see the last exercise to the Chapter 5 in the book). Mutexes implemented with simple registers are not starvation-free hence they don't completely “solve” Consensus. The theoretical results on impossibility given above only provide proofs for wait-free algorithms, however, it depends on the actual implementation of mutex whether the resultant algorithm will be wait-free or not.

It indeed is a corollary of the results discussed above that you can't make a starvation-free lock with simple registers only. Threads in such locks usually “spin” waiting for the memory to enter a state that signals this particular thread (and only it) to enter the critical section. If, at this point, some other thread “runs solo,” you are stuck forever. If you apply the register "weakness" proof scheme to a register-based mutex, this infinite loop will eliminate the contradiction that same thread-local data will lead to different results: they'll lead to endless spinning instead. Notice how that's not the case for compareAndSet() solution to Consensus presented above.

So the resolution to the paradox described above is that the mutex-based program is a solution only if the underlying mutex is starvation-free--in addition to simply being a correct mutual exclusion algorithm. To implement a starvation-free mutex, you need to have compareAndSet or similarly strong primitive operations, but they may be concealed within the lock or the OS implementation, and don't have to appear in your program explicitly.

Consensus vs. Humans

Consensus and Humans

Here's how choices of individual subjects changed over time during the experiment on how humans solve Consesus. Read more here.

Another interesting example of a computational system that can's solve consensus problem is a social network of human beings. A group of researchers conducted an experiment where several people were given material incentives to reach a specific consensus (an equivalent of threads proposing their own value), and were set out to reach one. They could not communicate in any way except for announcing the current value each person agrees to to their neighbors, i.e. they didn't have a designated leader or a common strategy.

The experiments demonstrated that people can't reach consensus in this settings. An interesting finding, though, was that the subjects quickly created a "bipartisan" system with only two colors, and started to juggle them around. Sometimes, new colors appeared as an attempt to unite under a different color, but these attempts quickly died off. Doesn't it remind of anything?..

Unsurprising Conclusion

So, it turns out, the theory didn't help us much with the original task. It helped us understand that thread nomination is a separate, studied problem that can be solved given the right tools. However, our initial choice of tools, the "mutex" was correct, and the knowledge of theory couldn't improve our initial choice. I'm glad it made me read "The Art Of Multiprocessor Programming" book more thoroughly than I otherwise would, and I do not consider it a waste of time. Multicore computing is getting more ubiquitous, and knowing the theory won't hurt.

A Mutex Puzzle

I wrote earlier that I know very few decent programming puzzles. To be more precise, I know one, The Prisoners Riddle. So I was reading that The Art of Multiprocessor Programming book (I already made a post influenced by what I read there, and I'll soon make another one--stay tuned). Some very interesting things the book tells about can be formulated as puzzles. Here's one.

Two threads need a mutex object with Lock and Unlock methods, but they only have three usual shared boolean variables. How to implement such a mutex?

Every thread can call GetThreadId() function which returns 0 for one thread and 1 to the other, and the boolean variables are automatically initialized to false before threads execute.

If you solved it, congratulations; you have just reinvented this wheel.

Since I know the solution, I can't really understand if it's a good riddle or not. It's definitely too hard for a job interview, but still, it can be a good one. So I need your comments. What do you think? Could you solve it?

How to Get Binary Search Right

As you already know, binary search is a O(log N) way to search if a particular value exists in a sorted array. There are some reports on how "most of our implementation" of binary searches are broken. The mistake discussed in that post is the overflow when calculating the index of the middle element. But regardless of whether you get this bit right, there's a different problem with all binary search implementations I've seen.

They're just too damn complicated and hard to get right out of your memory.

Key points

Almost always, if you're relying on your memory, you should prefer simple algorithms to complex, like you should implement a treap instead of an AVL tree. So here's a sequence of steps that helps me to write binary search correctly.

Bisection instead of "binary search"

Starting from this point, we aren't searching for anything. Searching for an element in array has a too complex definition. What should we return if the element doesn't exist, or there are several of them equal to each other? That's a bit too confusing. Instead, let's bisect an array, pose a problem that always have a single solution no matter what.

Given a boolean function f, find the smallest k such that, for all i < k, f(a[i]) == false, and for all j >= k, f(a[j] == true). In other words, we are to find the smallest k such that for all i within array bounds f(a[i]) == (i<k). If the array is sorted in such a way that f(a[i]) is monotonic, being first false and then true, this problem always has a single solution. In some cases, when f(a[i]) is false for all array elements, the resultant k is greater by one than the last index of the array, but this is still not a special case because this is exactly the answer as defined by bold text above.

This is more powerful than the binary search, which can be reduced to this problem I'll call here "bisection". Indeed, to find an element x in the array a of length N, we can do the following:

Now let's get back to solving the bisection, once we've defined it.

Well-defined bounds

The answer to our bisection problem is between 0 and N inclusively, so we will start by assigning i to 0, and j to N. No special cases allowed: we can do without them at all. At each step, the answer we seek will be between our values i <= k <= j, and we will not know any information that would help us to reduce the range size.

Non-overflowing pivot point

It's easy to make an overflow mistake in computing the pivot point; i.e. the point at the middle of the array we'll apply f to to make a step. This problem is discussed in the article I linked above more. If we were in the world of non-bounded integers, it would simply be (i+j)/2, but we live in the world of integer overflows. If i and j are positive integers, which makes sense for array indexes in C++, the never-overflowing expression would be:

Another useful property of this expression (in C++, at least) is that it's never equal to j if i < j. Here's why it's important.

Well-founded termination

A basic way to make sure that your program terminates some day is to construct such a "loop variant" that strictly decreases with each loop iteration, and which can't decrease forever. For our bisection, the natural variant is the length of the part of the original array we currently look for the answer in, which is max(j-i, 0). It can't decrease forever because its value would reach zero at some point, from which there's nowhere to proceed. We only need to prove that this variant decreases each iteration by at least one.

Fun fact: special programs can prove that a program terminates automatically, without human intervention! One of these programs is named... Terminator. You can read more about this in a very accessible article "Proving Program Termination" by Byron Cook et al. I first learned about it in a summer school where Mr. Cook himself gave a talk about this.

First step is to make sure that the time it reaches zero would be the last iteration of the loop (otherwise it wouldn't decrease!), so we use while (i < j).

Now assume that we've understood that the pivot point s does not satisfy the target function f. Our key difference with most binary search algorithms is that we should assign the new i to s+1 rather than s. Given our definition of the problem, s is never the answer in this case, and we should keep our candidate range as small as possible. If f(a[s]) is true, then s could still be the answer, so it does not apply here. Here's our iteration step now:

We mentioned in the previous section that i <= s < j, hence i < s + 1 and s < j, so each of assignments decreases j-i by at least 1 regardless of f's value. This proves that this loop never hangs up forever.

The complete program

So here's our bisection that works for all sorted arrays (in the sense of f(a[i]) being first false then true as i increases).

The search itself in terms of finding if a value exists in a sorted array, would be then written (I'm just repeating what I wrote above, no surprises here):

Proof By Authority

And GNU C++ STL implementation of binary_search follows the same guidelines I posted here. In my version of GCC 4.3.4 binary search looks like this:

Where lower_bound is the bisection function.

Conclusion

Here's my version of bisection and binary search I usually use when I need one. It is designed to be simple, and simple things are easier to get right. Just make sure your loop variant decreases, you do not overflow, and you get rid of as much special cases as possible.

Comments imported from the old website

Pavel Shved on 13 July 2013 commented:

Many thanks to Martin Ward who pointed to inaccuracies in termination proof description.

The binary search is a little bit more complicated than usually presented. An exercise that can help a lot in writing a correct version of binary search is given here http://arxiv.org/abs/1402.4843 I highly recommend looking at it. Without it, it looks like guessing game. If we exclude recursive solution, there are no more than 4-5 correct implementations of the algorithm. All solutions including the one you gave can be derived from the table given in exercises. (Actually I am collecting implementations and showing how to do that.) Your example is optimized so it is not a good starting point. j in your code is keeping the index which is one after the last element in the selected subarray. You need to explain this explicitely. GNU code is optimized for several reasons, one of them is speeding up on assembly level. I am saying this is an advanced level what you are writing about. Nobody can use it as a starting point.

The Pipe-Pipe-Equals

It is hard to come up with a google-friendly name for the ||= construct you see in some programming languages quite often, "pipe-pipe-equals" being the is the closest (other names include "double-pipe equals," "or-equal," or "double-or equals"). Why should we name this monster in the first place? While cryptic at first sight, it is a very convenient shorthand that, I believe, allows us to write cleaner code. Let's see how it works on this example:

Essentially this ||= construct is equivalent to other convenience operators, such as +=. x += 2; means x = x + 2. The "or" operator || usually denotes a weak logical "or" that doesn't try to compute the left-hand side if the right-hand side is true. Usually, meaningful values evaluate to true, while the only evaluating to false are NULL pointers/empty values, zeroes, boolean Falses, and, sometimes, empty strings.

Alternatively, you could set a default hash value, but ||= is used more widely, and also takes exactly one line... doesn't it? ;-)

The statement is used mostly to set a variable to a "default" value if it's "unset" at some point. This gives the most visible advantage when you initialized dictionary elements in a loop. Here's a piece that saves counts of each array element into a dictionary (Ruby):

In some languages (Perl, Javascript), you could not even bother with this, as += 1 on an unset value would result in its assignment to 1.

If you don't initialize output[x], you'll get a runtime error for trying to increment NULL. The advantage of ||= against other ways is that you don't repeat anything. You could have written the same piece as

Oh yes... we forgot about the most natural use of ||=, for booleans. Here's how we'd check if an array contains zeroes if we need to iterate over its elements for something else.

Pipe-pipe-equals is rarely used this "natural" way, though.

But here you have to type output[x] twice and add two more lines of code, which is a complete waste of screen space and, possibly, computing resources if the interpreter doesn't optimize the duplicated computation out. Let's take a look how pipe-pipe-equals works in different languages (we've already seen it in action in Ruby).

Perl

Perl was the first interpreted language I learned, and the first place I saw the pipe-pipe-equals operator in. It works as expected, but is used less often than the direct version of the weak logical "or" to specify default parameters of a function. Here's how you'd put a default to a hash bucket:

This works in a strict mode, too. Note that, in Perl, many things evaluate to false, including empty strings (I once even tried to emulate it in Ruby). To restrict this action to undefs only, use //= in Perl6 instead, which sinks to its right like the Pisa tower, and look as if you're trying to put a division while drunk.

Python

Python has no equivalent of this operator. You have to type the very[long(expression)] twice, and it will be computed twice. You have several different ways to do this:

The engineers more experienced in Python programming, though, assure me that this is not a big deal since you have a different way of supplying default arguments for a function, which seemingly covers half of the use cases for pipe-pipe-equals (this doesn't prevent Ruby from having both, though). Another half is covered by dictionary's dict.get(key, []) method, so that the code piece #1 can be written in a succinct manner. But I still miss it.

Bash (Linux shell)

Bash? Its language is so simplistic, and it looks creepy; how come it would have an equivalent of such a beautiful shortcut? Here it is:

This assigns y to variable if the former is unset or null (as per Bash manual). I mostly used it to set default parameters for environment variables user may or may not set before invoking the script.

C++

While |= is a syntactically correct expression (C++ does not have the "short-circuit" version of this expression), it doesn't do what we discussed here.

C++ is statically typed, so the result of the standard logical "or" is boolean. Retaining the nice semantics we find in a dynamically typed language would require it to be "either the type of the left-hand side or the type of the right-hand side". This is hard to pull in a pass-by-value statically typed language.

Pass-by-value semantics also means that not everything can be assigned a NULL, and not everything can be converted to boolean value. C++ has default arguments as well as Python, so the same reasoning could apply here. You'll have to be more verbose in C++. That's probably why only |= expression is available, which is only useful if its left-hand-side is bool (see sidebar above for similar usage.)

OCaml

Everything said above about C++ applies to OCaml as well. Moreover, OCaml, as a functional language, doesn't have a flawless support for mutation, and pipe-pipe-equals statement its inherently mutational. However, its matching operator would require us to use the very_long_variable twice. However, OCaml and other functional languages have a very interesting construct called "option". If something has "X option" type, it may contain either "nothing" or a value of x. Then, this value may be "unpacked" trough pattern matching:

let very_long_variable = match very_long_variable with None -> y | Some t -> t

here, t is not an shorthand for another very long expression; instead, it's just an identifier, written as is. The match...with allows us to "unpack" values of structured (algebraic) types with shorthands like this. Since this was too long, OCaml has made this a library function Option#default:

let very_long_variable = Option.default very_long_variable y

Anyway, OCaml programs are even more explicit than those in C++ and Python, so trying to tie pipe-pipe-equals into them is quite pointless.

Ruby

"Erm, we saw how it is in Ruby at the beginning," you might think. Well, I lied to you a bit. The thing is that, in Ruby, it is not strictly equivalent to an assignment to a result of logical "or". Which do you think x ||= y is equivalent to?

In Ruby, and only for ||= and &&=, it's the second. If you assign to something other than a local variables, what looks like an assignment is actually a method call (think properties,) and, if so, this assignment does not happen at all if the left-hand side of ||= is false. Which makes sense, but looks like a special case. Read more here.

Why You Might not Need This

Some argue that this operator is mostly useless, especially if their favourite language doesn't have it. Here are some arguments they list.

Redundancy

Indeed, this expression is redundant. You can do the same in a multiple different ways, All the examples I demonstrated above showed how to write essentially a simple if statement in a very short-hand form. The number of characters spared is probably not worth it to include support for this feature to a language designer's must-have checklist.

The statement discussed decreases the redundancy in code in return of broader language definition; each language seeks a balance between these, and often leaves the pipe-pipe-equals aside.

Default Function Arguments

The ability to specify default function argument (C++, Python, Ruby, but not Bash or Perl) covers many use-cases for ||=. However, this doesn't help fancy loops that fill complex structures, one of which we showed above. Nor helps it when you have to specify the default parameter anyway but the concrete value is not known at the time of coding, and is an optional field user may or may not fill.

Confusion

It is confusing. The pipe-pipe-equals requires explanations how it works. It becomes a special snowflake, different from its "mathematical" counterparts +=, if a complex language wants to make its very useful (see Ruby section above). While "confusion" as an excuse of not doing something is my pet peeve (and I tried to explain why), it indeed requires some effort to understand the mechanics. I'm sure that you'll love it once you understand it, and the way I learned about this operator is not by reading a textbook or "The Most Confusing and Obscure Programming Language Features Possible" newsletter, but by reading someone's code.

To Use or Not To Use?

When it comes to deciding whether or not to use pipe-pipe-equals in the "creepy" way we discussed throughout the post, the criticism somehow fades out. If the language supports this, it will be used. I haven't encountered any coding style document that bans this feature. In Ruby and Perl, it is considered as good of an idiom as many. So the answer on the question of whether or not you sohuld use pipe-pipe-equals (||=) in your programs is, definitely, "yes".

Zeno's Flat Screen TV Paradox

If I was to explain Zeno's paradoxes to a layperson, I'd start with the Zeno's Flat Screen TV Paradox. It is very practical: most have firsthand experience in real-life application of this deeply philosophical problem each time we shop for something.

That's just too much choice! (pic taken here). Thankfully, ancient Greek philosophers are to the rescue.

Imagine you come to a store, and see two TVs: a cheaper 30 inch one for $500, and a more expensive 50 inch one for $1000; otherwise the TVs are identical. Which one would you buy?

Zeno's Flat Screen TV paradox states that the only rational choice would be to buy a 50'' TV. Here's the proof.

Imagine you buy a 30'' TV instead. You bring it home, unpack it, start watching your favorite show, and a salesman knocks on your door. "Hey, I see your TV is only 30'' long," he says. "Would you like to upgrade to a 31'' TV for just 25 bucks?"

Well, 25 bucks isn't that much. It's like a half tank of gas, two cinema tickets, or a cab ride across the city. For that price, you permanently get one inch of screen diagonal! Only a fool wouldn't agree. You seal the deal, the salesman takes your old TV, and installs a new one, and you enjoy your 31 inch LED flat screen.

A moment later, a salesman knocks onto your door and asks if you'd like to increase a diagonal of your TV for 1 inch at only a $25 price. This is as good of a deal as the previous one, Even better: as the diagonal length increases, you get more screen space per inch for the same price! You seal the deal, and get a 32 inch TV. But before you even install it, a salesman knocks onto your door with a great offer...

This repeats until you end up paying $500 + 20*$25 = $1000 for a 50'' inch TV set--just as you could have done at the store in the first place. Since you got it as a series of increasingly lucrative deals, there is no sense to buy a 30'' one, and you should definitely go for a thousand dollar 50''. Which means that, a choice of a 30'' TV is never optimal.

Quot Erat Demonstrandum.

Note that it's sensible to manufacture smaller TVs in smaller quantities, because the comparison with a smaller screen is the gist of the paradox. Meanwhile, people have been buying 30'' televisions over and over, while this is completely irrational! And that's the Zeno's Flatscreen TV Paradox.

Why I No Longer Work On Weekends

I remember this picture of my typical day two years ago. As sunny Russian winter morning finally banishes the dull winter night, I finally say "bye" to my girlfriend and leave for historical part of Moscow. Where I'm heading to, in a three-story, old cottage that predates even the communist government, lies my beloved office. I'm alone there. It is deserted, but not because of lay-offs, a party I'm not invited to, or because it's a one-man startup. It is just Sunday.

I log in to my workstation, and start doing it. Checking experiment results. Debugging new experiment frameworks. Just thinking, trying to push limitations of a suboptimal algorithm beyond what the humanty knows so far. Being productive, though less than usual: the presence of peers doing something pressures you to check reddit less frequently.

That was my typical Sunday afternoon two years ago. Today is Sunday too, and I'm writing this very sentence riding a ferry over the Bay rather than sitting in an empty office. To answer why, first I should understand why I went to the office on Sundays in the first place.

Integrity

Of course you've realized that the Sunday office was my shelter from the girlfriend, because it would be impossible to work at home in her presence. Keeping off of distractions is the reason why one would prefer to go to the office to do work, not just on Sunday, but on any other day. Other reasons exist, too.

Sadly, I couldn't find any reference to "anchoring" in a more reputable context than, say, site about fitness or life-coaching. This could be more "scientific" foundation behind it.

Some people take keeping everything off from home to the extreme. For instance, one successful finance consultant doesn't even have a fridge, let alone a PC, at home, because cooking and dining do not belong there as well as working.

One of them is psychological "anchoring". Office is associated in your mind as a place to do work, so your mere presence there, in formal clothing and in working hours, sets the right mood, and makes you more productive than relaxing in a comfy chair wearing slippers and a robe. The reverse is useful too: keeping work off of your home as much as possible makes your relaxation there more fulfilling.

Purely practical reasons kick in, too. At work, you have a fast intranet with a fast access to your cluster. It's somewhat amusing that at every place I worked as a programmer there was a cluster to which a large part of computation was offloaded. First it was a cluster that runs experiments, then it was a distributed software compiling and packaging system, and then the product itself was a distributed system that runs user's code. You definitely want to access this and other helpful corporate resources, such as repositories, with low latency.

Moreover, on your office machine you have a ready environment you work in, with editors opened, and consoles chdir-ed, so you can just dive in in and start instantly where you left. Finally, your corporate policies could prevent you from working from anywhere except the office for security reasons.

However, the reason why I no longer work on weekends is not that office became a less appalling place. The list above misses one crucial component.

Alone in the Ivory Tower

After all, I was not a biologist, so I didn't need any help from a fellow researcher to stare at my tube.

When I was a research intern, working on weekends was easy. I spent a lot of time coding new algorithms and experimenting with them, reading papers written by others, writing my own, or just thinking on stuff. I spent total of 2/3 of the overall time on these activities. They all have one thing in common: I could do this alone.

While interacting with others is a very important part of a researcher's work I used to underestimate, you could work alone for a good chunks of time and still be somewhat productive. No wonder I did. Besides, our project had only 4-5 people, and interacting with them just couldn't occupy much enough time.

Industry and a Larger Project

When I, as a Software Engineer, moved to what they usually call "Industry," I also tried to establish the stream of working on weekends. I managed to sneak in on two Sundays, but that was it. With a much more "officy" office, and still a challenging project, it didn't work anymore.

This is a real picture from Android patch workflow. Many workflows in a larger team look no less insane.

The work now demanded collaboration. The team was 5 times larger, and most of the work I was doing required to check with others frequently. Sometimes it was as ridiculous as making three meetings over three weeks to agree over a certain feature or change that took 15 minutes to implement. More often, you were just stuck waiting on others' decisions, on others' feedback, and on others' work as programmers or sysadmins.

I could find side projects and less important things a programmer always has in his backlog that required no collaboration, but unimportant work was never worth a Sunday spent in the office.

Besides, working on weekends now involved making peers feel uncomfortable. Some practical applications involved assigning a security guard to protect me working, which required me to apply for each weekend work permit separately as per company's policies. Application could be automated, but making another person waste his time was too much. I went twice, when the presence of the guard was justified by sysadmins and managers doing other work they couldn't postpone, and never bothered with it afterwards.

Let's Multiply By Ten

Although I spent weekends just like the normal people, I still stayed longer hours to get more work done. I still made security guards uncomfortable because they have much less opportunities to close earlier, but it was now legally my time anyway.

Now let's multiply by ten the size of the team, the volume of the codebase and the complexity of tasks, which describes my shift to the next job. The amount of work that could be done without the communication also shrunk by the same factor.

Email became the primary work tool. Most of the work is triggered either by notifications from others, then it is discussed with the peers who are much familiar with the codebase. When I am stuck with something I can spend hours in the late evening trying to debug the problem on my own, or I can ask my peers who worked on the project for a long time, and get it resolved within minutes.

Not only the office stopped to enjoy my visits on Sundays, I started to work late hours less. The communication with peers is no longer just red tape, it is the only way to be productive. And this communication is so much more efficient in the office than over e-mail, video conferencing, and phone calls, that you being in the office is inevitable. Studies showed ridiculously big numbers for how the bandwidth of the information our brain receives in real-world communication is larger than what we get by a video call. And it's more fulfilling, too.

To increase my productivity I even started watching my sleep cycle, and I now try hard to move my typical shift from 12-9pm to the normal 9-6pm. It doesn't work out well. I try.

Christmas as a Way to Enjoy an Empty Office

This amount of communication sometimes becomes overwhelming. The incoming mail speed starts to become more than the speed with which you can read it. No wonder that people started to value the Empty Office, when distractions and stress decrease, and you can just enjoy exercising your favorite art. Of special value is the US Christmas holiday, where you have two days others usually take an unpaid leave on so you can work with fewer interruptions.

Few come on weekends to achieve the same effect, of course, because it is still not productive; but a legitimate reason to have a less populated office is of understandable value.

***

I still like my job, and am as passionate about it as when I was younger; I still can handle 6-day work week. But, on Sunday, I'd rather read about something in the comfort of my home, go out with my friends, or visit some local nature. Hiking usually ends with a ferry ride with an awe-inspiring view on San Francisco skyline I can't really enjoy because—let's face it—I open my chromebook and read or write something about programming. I just don't do it in the office anymore.

How to Select a Place to Live via Google Maps

This post is not about programming at all, but just about solving old problems with new computing tools. The old problem is finding the neighbourhood to rent the apartment in, and the tool is a map. Not just a map, but the Google Map. While looking at map when renting is a good advice, Google Map has a killer feature that unlocks a very new option.

Google Map can quickly recalculate the best public transportation route between two points on-the-fly while you drag one of them. This allows you to quickly learn what neighbourhoods provide a good commute to your specific office. You quickly uncover all possible commute options you have by moving the start marker around a neighborhood in order to filter then out the worst from your Craigslist search.

You may have other priorities rather than just getting the best commute, of course: quality schools, places to go out, parks nearby, et cetera.  I only describe how to filter by one criteria, and it doesn't have to be primary.

I only have four "small" requirements to daily commute:

  • Public transportation. Driving a car in a big city is a nightmare, and commuting by bus or train allows you to read an interesting book or magazine (well, for now);
  • Zero transfers. Each transfer drains your lifepower, and robs you of several minutes you could have been standing calmly, reading.
  • Short total walk (5-10 minutes). Walking extra 10 minutes on a flat surface will have zero effect on your health, but will rob you of 10 minutes--or even more if you cross a lot of traffic lights.
  • Overall time about 30 minutes. While you can do something while commuting, in-city public transport distracts you too much. 10 minute shorter commute saves you about 2.5 days per year, so investing 20 more minutes into commute time screening repays itself.

So for me, an ideal commute is a 30-minute ride on a bus or train directly to the office. And without Google Map I wouldn't know that I actually can accomplish this, and live in a cheaper neighbourhood at the same time. The map also uncovered a train stop I didn't pay much attention to--right next to the office.

So, here's a 6-minute video how to scan the area with a couple of tips of Google Maps usage.

(Here's a raw mkv if YouTube is banned in your country).

That was exactly how I scanned the city when I was looking for my new home. It turned out that I avoided the areas many people prefer to live in, but I found the best place specifically tuned for me instead.

Please, bear in mind that I know very little about the City of San Francisco, so I might have missed something, or discarded certain areas too vigorously. But knowledge may play against you: say, you know that you can arrive to Embarcadero by any train, and this prevents you from even trying to find closer public transport stops to your specific office, which Google Map does for you automatically.

There are more features in the Map I didn't show, and more ideas still implausible. You can set more preferences (click "More options," and select Bus/Train or prefer "Fewer Transfers"). You could use waypoint search if you were looking for a good commute for two persons at once (make a route between two offices, add a waypoint, which represents the home, and drag it over the map), but waypoints don't work for public transport for some reason, and this doesn't scale well for three, sadly. And if you know more tips, make your videos and comment here, I'd be happy to hear about them!

Not So Static

"Forget these silly runtime errors you had in Ruby: C++ is statically-typed!" Levap, a seasoned engineer, was enlightening the new hire. "You no longer have to wait while the sample run or unit tests finish to learn that your variables have the wrong type, nor you have to worry about this happening in production".

So, Paul, the younger engineer, stopped worrying. He went back to implementing a subclass that, unlike its parent, would only perform a low-level blocking write of data stored in a buffer object instead of more complex writing semantics found in the parent:

The new Write function would assert that the second parameter was false, and would just call a regular system's write, familiar to any C programmer. The only exception was that it fetched the target from the SimpleWriter object's private variables, and it wrote data chunk-by-chunk until finished. Hence, it didn't have to return the number of bytes written, and threw a fatal error on a severe error.

The yonger engineer soon found out that creating NeverBlockingWriter in place of SimpleWriter only required modifying method, and should involve no data copying. In a dynamic language, he would do just that by replacing the SimpleWrite method of the object with a nonblocking-only version, but in C++, Paul had to instantiate a whole new class and copy the information from the parent one to it. An alternative could be to force the superclass to be friends with its own child, which sounded more suitable for a dumb scene in a sitcom. Oh, if he only had the powers of Ruby...

Paul first merely called the parent's SimpleWrite(buffer, block), but then decided that calling SystemWrite directly was clearer and faster, so he fixed the code shortly afterwards:

This code looked right, but didn't work. What was worse, Paul planned to get the code working on the last hour before Christmas, and he couldn't, which created a sense of incompleteness and incompetence for the whole holiday.

What happened was that Paul updated the name of function to call, but forgot to update its arguments. Normally, C++ would barf a compile error in an instant. In this case, however, the trap was perfect: C++ silently, without a single warning, cast the pointer to Buffer to pointer to void, and boolean true to one. The program expected the other endpoint to reply to the data written, so the lack of the complete write created a very weird bug. Apparently, these functions were not different enough for C++ to show a compile error.

The fixed version immediately demonstrated that everything else was OK:

"Statically typed, my ass," were Paul's next words. Nobody heard him, since he stayed later to fight this bug.

Paul saved the code, and finally left the office. On his train, he shed a tear for OCaml that distinguished between integer and boolean, and promised that next time he saw a weird bug, he would check if he fell into another loophole in C++'s "static" typing.

"Lol, just pay more attention next time, smartass," the seasoned engineer advised.

Comments imported from the old website

http://www.koryavov.net/ on 04 January 2013 commented:

Forget C++. Use Java! It is a true statically typed language. :)

Petar Petrov on 29 January 2013 commented:

From: http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html I tried using: -Wconversion what people on StackOverflow recommended too. NADA! no err no warn :( there must be a way to atleast warn on implicit conversation!

Reproducible on G++3.7

Pavel Shved on 08 February 2013 commented:

Well, technically, it isn't a bug. The conversions are very well defined, and should be expected. You are very unlikely to encounter them at the same time, this unlikeliness is why the post appeared in the first place :)

The Fallacy of Simple User Interfaces

A half a year ago, I've been challenged with improving functionality of a piece of software without making its interface more bloated than it already is. That made me thinking about the complexity of user interfaces, how it affects the acceptance of the software, and what should programmers do about it.

Even Catholic Church pays attention to "User-Friendly Features"!

I'm not a UX specialist of any sort. But, I was a Gentoo user once. Gentoo is a Linux distribution that is famous for two things: 1) it is very flexible so every user can only build a part of the system they need, and 2) most installation/configuration operations are performed from your console via CLI tools (at best) or editing text files (at worst). I found out that long-time Gentoo users who are finally tired of it develop a picky taste for really good user interfaces. So I hope that my thoughts on that matter would be somewhat useful.

Why Do We Want The Interface Simple?

The famous cartoon on complexity of interfaces.

There is a cartoon that explains it all (like many other computing-related topics), which is linked in sidebar. Complex interfaces are annoying. They make people think, they make people actually fill endless forms, ask questions about it, satisfying the kafkaesque bureaucrat monster minds behind designing them. Remember all the "forms" you had to fill at consular sections, customs, and other public offices? They usually come with an instruction how to fill them twice as large... Well, the complex interfaces of computer software are even worse because they change as you interact with them!

Note that I would like to talk here about a broader concept than just "interface" as what user sees on a single screen. In this post I actually talk about User Experience (UX), which includes but is not limited to how forms are laid out. Instead, its concern is holistic evaluation of how user interacts with software throughout its whole lifetime.

Of course, most things you fill into these forms is necessary. Compared to Google search bar, which is approaching just answering a question you put there, they seem like getting to 20th floor by stairs compared to 2nd. But (provided the system cares about the citizens), these government guys really need that information you're to fill in into each form. Being long doesn't mean being complex, and a long-ish questionnaire really makes the user think less than a freetext field to enter your essay.

Bad Interface as a Programmer's Failure

So there is a huge class of problems where a complex, comprehensive interface is the answer. At the same time, there are complexities of other sort, which do not have any right to exist, and should be perished from each and every product by all means possible. It is when we make the user do our job, to follow an algorithm. Making machines follow algorithms is our job as programmers. We are paid for it, respected for it, and despised because of our devotion to it, but it's what we are to do. Note the emphasis on the word "we". We, not our users.

If you have to instruct your user like this, "if this checkbox is active, then you should set this flag; otherwise, set that flag", or "when you select this option, click the button below as well," then you are implementing a monster your users are going to hate. An only excuse to putting such decision is that you should build the tool immediately, and are going to improve it ASAP.

Not your job, actually? There are designer guys, User eXperience managers who should think this through rather than you? But do they really have a foundation that allows them enough variation? Don't they have to lower their expectations of what they can design because some of their ideas would require too costly architectural changes? I'm sure you at some point had a talk like, "Whoa, we can't implement this! It would require too much something!" Well, what it sometimes actually requires is additional effort on our part. We can--and do--affect whether the software is hard to use and has a handy interface, and you should be aware of that power in your hands from the very beginning. It's easy to screw up, and put too much limitations, and then ditch all interface improvements because their cost will rise.

Smells

Some designs are definitely programmers' failures. Look what I saw trying to transfer funds from one account to another in my online banking:

What's so damn hard in telling me when my transaction is going to complete if I perform it NOW? Doesn't the web server have access to clock??? Especially if it refers to a standardized time, but it could look up my local time via Javascript either. Why is it making *me* to look at my watch, and compute the date my transaction will complete? I know how programmers in banks are paid, don't tell me they aren't qualified to do this. It's just a lack of understanding that interface is bad if the user has to follow some kind of program rather than have a computer do it..

In my opinion, there are two smells that denote that your interface is wrong. It's badly designed if

  1. you write policies and instructions instead of preventing mistakes programmatically. The designers of industrial machines got this ages ago. In addition to writing in big red letters DO NOT PUT YOUR HANDS HERE, they just install a damper that prevents you from doing this. This has been saving lives and limbs for centuries;
  2. the interface can have a large number of inconsistent states. When you act upon a control, something changes in the interface, i.e. the interface's state changes. If your interface allows that in a comparatively large number of cases a state is inconsistent with your policy or your intention, or is just meaningless, it starts to smell. The fewer meaningless states you have, the better your interface is.

I tried to implement "Request an Update" checkbox Bugzilla extension for ROSA Desktop so that it follows these guidelines. The checkbox is disabled when there's no enough information in the comment, and that one checkbox (one bit of information) correctly requests an update and notifies the interested party regardless of what stage in the overall workflow it's in.

Of course, more established distribution maintenance systems, such as Bodhi or Launchpad are even simpler, but that's nearly the best you could do with a small patch to Bugzilla.

The reverse doesn't hold. If all states are meaningful, it doesn't mean that interface is already easy enough. It may lack handiness, expose too many parameters, greet users with insane default values. However, if it's too easy to make the wrong choice, the interface is definitely awful.

A somewhat modern Alan Wake game (I highly recomment it). The action item is highlighted, you don't need to hunt for it—and frankly this does improve the gameplay. Screenshot courtesy of MobyGames.

Game designers got it recently, and modern games highlight action items for you, expose less options where to go, and always provide an infinite box of ammo if you are to shoot your way through. The result? I see advertisements of Borderlands 2 on a bus stop. Gaming is no longer for nerds only!

Another case is automating the processes user shod not understand. One of these cases is installing drivers. Most of the time you install Linux, you have to install a proprietary driver for hour video card, and all the manuals say you should select the correct one based on its make. These manuals even provide you with instructions how to learn which video card is installed.

Why in the world can't the software look this up, and choose the correct video card on their own? New hardware comes out, and you can't know the names beforehand? Well, do guess! You think that casual users are smarter than your gueses? You update the wiki pages with instructions, why not update the software at the same time? No reason.

Back to game developers, they all bundle the latest version of DirectX, and automatically install it along with tge game itself without any, unnecessary questions asked from the user. Assassin's Creed 3 on billboards all over the city; that's the result.

How Easy Should the Interface Be?

Note that I didn't say anything about complexity; why? Some people think that the complexities of interface is what prevents others from using a product. This is an easy explanation. Many dream about a "brain-machine interface" as it would solve all problems with the complex interfaces. Myriads of buttons, overwhelming amount of gauges and interactive controls, and warnings like "your signature must fit completely in the box" will be useless because the answers to the questions would be transferred directly from human brain.

Contrary, the brain-machine interface is going to solve nothing. Because most users just do not have any brain.

The brain-machine interface does not jump over a barrier unachievable by other means. It is a result of gradual convergence of the means of control and our thoughts. The simplification of interfaces we see nowadays merely exposes the truth we should have unveiled years ago: if our thoughts are a mess, then a simpler interface doesn't solve anything. The interfaces are getting more and more simple, and people's persistent incapability to cope with them frightens me. I mean, are we really that dumb?

Since the user is going to be the problem, there's just one thing to do. To increase its knowledge. Extended to many systems, it could be called as increasing the likelihood that a person who encounters a system will infer the principles behind it, and starts asking the right questions.

One of the things that surprised me when I tried to deploy a wiki software in an office was that people couldn't create a page. I'm not sure if it was just an excuse to dodge a mundane job, but users reported that they couldn't find a button "Create New Page". Mediawiki, which is the software behind Wikipedia doesn't indeed has this button. Why? Because, unlike Google Docs, wiki is a set of interconnected pages, so pages not linked from elsewhere don't make sense. So the correct way to create a page is use it as if it's already exists, put a link to it. That simple and tricky at the same time.

What I think my failure was is that I didn't show how to use the wiki pages, didn't educate users. Without cross-reference, wiki pages are inferior to Google documents, which puts the whole idea into jeopardy.

No matter how simple your interface is, it is never going to be less complex than the concept behind the software. Search engine and other information-retrieval system are lucky: their concept converges to a single line of text (or a verbal query) where user enter what they need to know. Such systems then automatically suggest if user was looking for videos, images, and/or nearby shops (like Google currently does). However, when the concepts of information behind the scenes become more complex, such as maintaining a set of interconnected pages (wiki), or adjusting privacy settings (Google Plus).

So What?

So here's the list of things I would keep in mind as a programmer to make the interface of the product better:

Code With User Experience in Mind

Consult UX specialists early and often. Make sure your coding decisions you made at earlier stages of development do not prevent improving user experience after you get alpha-version feedback. The attitude "let's get it working first, and then think about User Experience," can easily lead to inability to actually improve it. It will be very easy not just to say, but to show with compelling evidence that you can't improve your interface because your code doesn't allow that. But really, putting code in front of User Experience is like putting the cart before the horse.

Convenient interface (or, let's say A Better User Experience) doesn't only mean that your knobs should be shiny. For instance, Google is famous for the speed of its products. Apple's interfaces (Mac, iPhone) are faster than many other their competitors (modern powerful Android phones changed it though). And it's not just convenience: being slow may even kill your project, as, I think, it killed one of mine.

And isn't it the most straightforward task for programmers: to make the gears faster?

Provide Early Response for User Input

If your user has to follow the rules when filing something, provide feedback ASAP. Don't make user click "Next" and only then receive response. Don't punish users for providing incorrect input because you think they should have read about the exact format for entering their SSN. Such approach is abomination. Do you really want your product make the same impression as an embassy where you apply for a visa or a public office full of nine-to-five clerks with their "read 10 pages of how to submit an application, and we'll reject it without explanation if something is wrong"? No, you don't.

That's why you provide pop-up hints, color input fields red until user gets it right. Just take a look how New Google Account window hints you and checks your input as you type, even before you reach the "Submit" button:

Decouple Rules From Code, Make Them Configurable

If your user should follow some rules (mainly when providing input to forms), try to make them re-configurable in a declarative style. This is important since the rules tend to change over time, though, I wouldn't say it's crucial. It's hard to get, too, especially since the previous section suggests that you should tailor your feedback instead of just verifying your model. Just make sure it doesn't end like this:

Credit card validation is my pet peeve... Why can't you tell me, "Sir, you missed a digit?" You know exactly how many digits there should be. But instead you're telling me something I can interpret as, "Your credit card is blocked!" This one, at least, doesn't ask me explicitly if I have a Visa or Master card, because you can learn it from the number I enter. Good girl.

Apparently, form validator they employed for that checkout page made it so hard to configure nice just-in-time validations that they resorted to the abomination of "error messages for the form". Of course, that's an additional challenge: to make this configuration just flexible enough for your purpose (, and here's where your judgement, as of a professional, comes to play. I thing that flexibility is one of the reasons "cancan" wins over "declaratice_auithorization" in Ruby on Rails web authorization plugins: the former is much more flexible and easier to configure.

Educate Your Users

If the concept your software tries to manage is complex, then simplifying the interface will only increase frustration because user will feel like he doesn't have sufficient tools to tailor the information to its needs. Instead, invest time into making users more informed. Make educational, not promotional videos; Talk at conferences, explain the concepts as soon as users encounter difficulties automatically by popping out videos. The thing is that, even if your interface is not ideal yet, it will never become simpler than those concepts, so it will never hurt to explain them. You can also pop up a chats with customer representative. The latter was recently employed by many service providers, and now their websites pop up windows offering human help if you are lost while browsing the site.

You think it's a job of your PR manager, not yours? Well, are you sure your PR folks understand the concepts in the first place?

Conclusion

I tried to summarize my thoughts on what makes a good user interface, why the simple interface isn't necessarily the best, and what to do about it as a programer. And you really can do a lot to improve the interface of your software: devise the ways of automating figuring out unnecessary details (such as video card make) for your users, improve speed, explain the concepts behind and typical use-cases. And, as a programmer, your unwillingness to deal with this can destroy your project, well, just as many other things you fail to take care of can.

Consistency Models Explained Briefly

Wikipedia does a ridiculously bad job at explaining different consistency conditions of multithreaded/distributed systems. Just take a look at the articles, and see how inconsistent the consistency model descriptions are. I'll try to get them together.

What is a Consistency Model?

When you're describing how your distributed system behaves, you usually specify something named a "consistency model". This "model" describes how distant the behavior of your system is in multi-threaded or multi-machine environment from the "ideal" behavior.

Assume that you're implementing a data structure (a set, for instance; say, you can add/remove elements, and check for presence). The specification (model) of its behavior usually implies that the operations are performed sequentially. For instance, a set indicates a presence of an element if ant only if it was added before the presence check, and not removed after that. The terms "before" and "after" imply that there is an order in operations over that structure. While it is straightforward to know the order of operations in a single thread, it gets unclear if there are several threads or machines:

  • Calls may overlap. If two threads call methods on the same structure at approximately the same time, the calls may overlap because they don't happen simultaneously in real world. In what order the calls took effect is not clear, however, because thread scheduling or network connection introduces arbitrary delays in processing.
  • Calls may not take effect immediately. When a method call returns, it is sometimes not clear when the operation actually "takes the effect." By diluting the notion of "taking the effect" one may achieve increase in speed. I.e. the elements added to a stack may not be fetched in the exactly same order, but the operations could be faster. The rationale behind this is that the "order" of events that happen in different parts of your cluster is not that well-defined and meaningful in the first place. I talked about why this is natural in my other post "Relativity of Simultaneity in Distributed Computing.

Object-Oriented Model and Executions

A consistency model usually assumes Object-Oriented approach to describing a system. Assume that we have something (say, a set), and its internal state can be observed and altered via a well-defined set of methods only (say, add(elem), remove(elem), and bool in(elem)). The OO approach assumes that internal state is concealed, and it has no meaning to discuss it per se, and we do not define consistency models in terms of object's state. Instead, we restrict the values these methods may return based on when other methods were called in different threads or machines.

In practice, method calls don't happen instantly, and our models totally acknowledge this. The set of method call times, the times it takes to complete each, their arguments, and their return values is named an execution. Executions are usually depicted as intervals plotted at lines. The parallel lines represent how time simultaneously passes in all threads. The intervals are the method calls, the beginning corresponding to the time the method was called and the end when the method returned. The call arguments and the return values as well as the method and object name are usually written near the method call. Here's a sample execution of a set object:

We will omit the actual call values further because consistency models do not operate in terms of actual arguments and values, but rather try to "map" concurrent executions to sequential ones. In our further diagrams we will also assume that all methods are called against the same object.

Since consistency model imposes restrictions on an execution, the latter may match or not match a consistency model. A data structure adheres to a consistency model iff every execution over its methods we may observe adheres to it. That's what's hidden behind such phrases as "sequentially consistent stack".

No Consistency

The most important consistency "model", is, of course, no consistency. Nothing is guaranteed. The methods called on the object may observe anything. We may add two different elements to an empty set, and observe that the size of set is three, and none of these elements is what we added.

All other consistency models are opposite to the No Consistency in the following sense. They all assume that the method calls may be re-arranged in a single-threaded sequence that is correct with respect to the object's specification. Let's call it arranged correctly in the rest of the post. For example, a set of pets that adheres to any consistency model described below can't tell us that it contains a cow, if only dogs were added to it.

"Multithreading" Consistency Models

What I discovered is that consistency models are divided into two categories: that lean to multithreading (shared memory multiprocessor systems), and to "multi-machine" (distributed systems). I find the former category more formal and mathematical; these models can be used to describe behavior precise enough to allow math-level, rock-solid reasoning about system properties. Some "distributed" consistency models are less precise in their definition, but are widely mentioned in documentation and description, and do allow users to make informed decisions.

Let's start with the "multithreading" ones.

Strong Consistency aka Linearizability

An execution is strongly consistent (linearizable) if the method calls can be correctly arranged retaining the mutual order of calls that do not overlap in time, regardless of what thread calls them. (See "No Consistency" section for definition of "correctly arranged").

In other words, if two calls overlap, then they may "take effect" in any order, but if a call returns, then you may be sure it already worked.

A more anecdotal definition is "before the method returns, it completely takes effect consistently with other threads." This is also usually called "thread safety" unless more precise definition of consistency model is specified. This is quite a stringent restriction, especially for distributed systems, where threads run on different machines.

Two linearizable executions combined also form a linearizable execution. This is an important property, since it allows us to combine linearizable components to get strongly consistent systems. This is not true for other models, for instance....

Sequential Consistency

An execution is sequentially consistent if the method calls can be correctly arranged retaining the mutual order of method calls in each thread.

The calls in different threads can be re-arranged however you wish, regardless of when they start and when they end.

An example of a sequentially consistent system is a naive notion of RAM in multi-core processor. When different threads read from the memory, they actually read from a cache or a register, which is synchronized with a common RAM at writes. While values read and written on different cores may not be immediately synchronized (cores may diverge in the notion of what value a cell contains), the values read and written on the same core are always consistent with each other. This is naive, though: even this guarantee is relaxed by modern compilers, and accesses to different variables within one thread can be reordered (as far as I recall, ARM platforms are guilty of this).

You might notice that the definition of sequential consistency does not directly prevents methods to return values from the future (i.e. to read what will be written later in a different thread). Yet I claim that RAM is sequentially consistent; how's that? The thing is that each execution should be sequentially consistent. If read and write do not overlap, and write follows the read, we may just stop the program between them. Since each execution should be consistent, the value read should be written before in some thread. Therefore, this "future prediction" is not possible in consistent data structures (not only those that are "sequentially" consistent) even that the definition doesn't directly prevent it.

However, if we combine two sequentially consistent executions, the result won't be sequentially consistent. But there is another model, weaker than linearizability, that makes it possible.

Quiescent Consistency

An execution is quiescently consistent if the method calls can be correctly arranged retaining the mutual order of calls separated by quiescence, a period of time where no method is being called in any thread.

This consistency condition is stronger than sequential consistency, but is still not strong enough that we usually expect from a multithreaded system. See my other post for a more in-depth example of a quiescently consistent stack. The reason to use this model is to attain a higher degree of concurrency and better response time in return of weakening consistency a bit.

Quiescent consistency is not as esoteric as you might think, especially since you barely heard about it. Here's an easy way to imagine a quiescently consistent system: assume that all writes are cached, and the cache is flushed to a persistent storage when there are no ongoing writes (or after it reaches certain size). There are special data structures that are quiescently consistent.

Quiescent consistency components can be combined to form quiescently consistent components again. Note also, that strong consistency implies quiescent consistency, hence you can combine these two kinds of systems, and the result will be quiescently consistent.

"Distributed" Consistency Models

Consistency is not Fault Tolerance

It may be tempting to mix the concepts of "consistency" and "fault tolerance". The former is how object reacts to method calls depending on the calling thread (or, in other words, how data are seen by multiple observers). Fault (or Partition) Tolerance describes how system behaves when system becomes partitioned, i.e. messages are lost, and/or parts of the system get disconnected from the network, or form several networks. These concepts are orthogonal to each other: an object can have one of them, both of them, or neither.

It will be, more likely, one of them. There is a scientific result known as CAP theorem that states that a system can't at the same time guarantee:

  • Linearizability
  • Availability (whether RPC calls send replies)
  • Fault Tolerance

In this post, however, we will focus on consistency rather than other properties of distributed systems. Note that multithreaded systems due to close proximity of their components, and less versatile medium that connects them, do not have to worry about fault tolerance...at least, at the current state of art.

The following models are usually seen in the context of distributed systems, i.e. those that reside on multiple machines, communication between which is slow enough. To bring Object-Oriented abstraction to distributed systems, programmers invented "Remote Procedure Call" (RPC) pattern. Method calls (object identifiers, method name, and arguments) are encoded into messages that are then sent across the network to specified destination, which may be the part of the encoding too. Method calls that do not expect a reply (say, unreliable "fire-and-forget" writes) may be treated as instantaneous.

Eventual Consistency

A system is eventually consistent if the method calls can be correctly arranged retaining the mutual order of calls separated by a sufficiently long period of time.

The eventual consistency definition—as well as its understanding—is specifically vague on how much time it would take, and whether this time depends on the call history. We can't really tell how soon an eventually consistent set will respond that an element X exists in it after it was added. But "eventually" it will.

This looks like a quite useless consistency model, since it doesn't actually restrict anything mathematically. Why is it used then? Exactly because of its lack of restriction. Users usually expect some consistency from a system, and if there's actually not much of it, something should be said about this. Eventual consistency is actually a limitation of liability; it's used in documentation to convey that the system can't guarantee that different threads become in sync immediately. However, the system is engineered in such a way that they eventually take effect. Here's an example (here's more on eventual vs. strong consistency)

Some authors, according to wikipedia, require that no method calls should be invoked during the "long period of time" for system to get in sync. This actually sounds like a stronger version of quiescent consistency, where a qualifying quiescence should be long enough.

Strict Consistency

An execution is strictly consistent if the method calls sorted by completion time are correctly ordered.

This is as simple and powerful as it's impossible to attain in most practical systems that have more than one thread. First, it order calls according to "global" time, which is a purely mathematical abstraction unachievable in real life, both according to Physics, and to the theory of distributed computing. Second, any implementation I might imagine would essentially disallow multithreading per se.

This is my rendition on the more "official" definition (1, 2), which defines a strictly consistent system as one where any read operation over a certain item returns the value supplied to the latest write operation according to global time counter. I don't like this definition because it needs to distinguish between "reads" and "writes", and introduces the concept of "data item"—i.e. it breaches the object-oriented approach. Still, I'm sure that if the system really consists of "data items," the definition at the beginning of the section would work as well.

Serializability

An execution is serializable if its methods in each thread are grouped in contiguous, non-overlapping transactions, and the methods can be correctly ordered in such a way that transactions are still contiguous.

In other words, the outcome of a sequence of method calls can be described as a serial execution of transactions, where methods of different transactions are not mingled, while the transactions actually overlap in time across different threads. The useful outcome here is that calls within overlapping transactions can be reordered, even if they don't overlap. However, "serializability" is not enough to describe the system: while it limits reordering of methods across transactions, it is unclear to what extent transactions can be reordered themselves. This is a job for a different consistency model. And serializability is usually used to "remind" about this important property of transaction-based systems.

Practice

Some systems provide several levels of consistency for operations (like the AppEngine's High-Reliability Datastore linked above). You should understand what tradeoffs each model has to make the decision best suitable for your specific application.

Most system you encounter will be not consistent at all ("thread-unsafe"), or linearizable (this is what usually understood by "thread-safe"). If the consistency model differs, it will be noted explicitly.

Strictly speaking, most systems that claim linearizability will be not consistent due to bugs, but these nasty little creatures aren't suitable for careful scientific discourse anyway.

Now that you know what consistency models are out there (and that they actually exist), you can make informed decisions based on what's specified in documentation. If you're a system developer, you should note that you are not bound to choose between linearizability and "thread-unsafety" when designing a library. The models presented above may be viewed as patterns that enumerate what behaviors in multithreaded context are considered universally useful. Just consider carefully what your users need, and do not forget to specify which model your system follows. It will also help to link to something that explains what lies behind the words "somethingly consistent" well--like this post, for instance.

Comments imported from the old website

Pavel Shved on 13 November 2012 commented:

I'd like to thank ssmir for valuable comments on how to improve this post.