Recent blog updates

Shown here are only posts related to similarities. You can view all posts here.

Time-Based Protests

A sample time-based protest

(photo by Ivan Afanasiev). The largest "Strategy 31" time-based protest I attended (January 2011). A protester holds a sign with the protest's logo.

A month ago I bumped into a protest demonstration in Moscow. It was one of a series of events the "Different Russia" movement has been organizing since 2009. The interesting thing about these events is that they happen on the 31st of every month that has enough days. This date was chosen because Article 31 of the Russian Constitution declares the freedom of assembly, which the movement believed was violated. The Russian protest movement considered it innovative to hold a massive demonstration on a fixed date regardless of whether it was approved. For software engineers, however, such a thing is not an epiphany of any kind. We call it "time-based releases", so I call this way of arranging protests "time-based protests".

Time-Based Releases

Time-based release is a pattern in software engineering that has taken hold of the open-source world's mind over the past five years. It means that the product team promises that each new release will be completed on a predetermined schedule, no matter what new functionality or bug fixes it ends up containing. Release schedules are then arranged so that a complete release cycle, with stages such as feature proposal and selection, development, testing, and distribution, finishes by a given date. Release cycles often span from several months to years.

Many open-source projects have switched from "when it's done" to time-based release schemes. The most notable examples are the Fedora and Ubuntu Linux distributions, which followed the GNOME project. Our ROSA Desktop Linux distribution does this as well. It is usually said that time-based releases help to improve stability, to maintain sustainable product delivery, and to strengthen collaboration between developers, testers, and users.

The key consequence, however, is that the time-based release model trades innovation for an established process. No matter how many new, potentially innovative features we are developing, we must release at the appointed time, period. In the aforementioned Ubuntu, this has drawn criticism.

Sometimes I feel that having a release schedule at all is the real point, and that this is why people use time-based releases as their planning paradigm in the first place. The thing is that you can't make a time-based release without a schedule. This makes me wonder whether some projects opt for the time-based release model just to finally force themselves to draw up release schedules.

Why It Should Work Then

The most important reason to choose a time-based release over "when it's done" is that it makes you think less and make fewer decisions. Why is this important?

There is a theory (I heard it at a conference but have failed to find any evidence of it at all, so let's consider it a fairy tale that is to teach us something useful :-) ) that the number of decisions a person can make per period of time is an expendable resource. Making each decision supposedly makes our body spend some specific amino acid; its supply replenishes with time when you rest or have a meal, but it never hurts to save some.

So, making decisions is hard, and making as few decisions as possible is a good thing rather than a bad one. Any open-source project has a lot to decide upon, and one decision fewer is a small relief of sorts anyway. That's why agreeing in advance on a release date lets the project save some strength for other things.

A funny story happened at the first meeting the protest committee managed to get approved by the police. The committee split into two groups, one of which, led by Limonov, tried to demonstrate at a spot not approved by the police.

Their aim was to protest against collaborating with the police at all, the very requirement these demonstrations were initially organized to protest against. At that meeting an interesting case of oppression happened: the police lifted the leaders of the "unapproved" protest group and forcibly moved them into the approved location, saying, "This is where you should protest [against us]."

What "Strategy 31" has become

(photo by RBC). This is a typical picture of a recent demonstration on the 31st. A lot of onlookers wandering around with cameras, some police, and only a few protesters.

Do you want your software project to become like this?

...And Why It May Not

However, one should never forget the ultimate mission, and never let keeping the schedule become the cart put in front of the horse. I feel that fitting innovations into a time-based release schedule is much harder than fitting them into a release that has more freedom in picking its final date. It's not impossible, no, it is just harder. Mark Shuttleworth, the leader of Ubuntu, says much the same in his article on this matter.

I can't back these claims up with statistical data about software projects. What I did see, however, is how the energy of these "time-based protests" decayed over time. Many political commentators observed that regularity became the only thing these protests could deliver. At first, it was enough, because the meetings were suppressed with an unusual degree of violence and with no lawful grounds. After the government relinquished the oppression directed at these particular meetings, and two of them were even approved, the participants realized that the assemblies had lost their nerve and no longer carried any message whatsoever.

This is an important lesson for software engineers as well, especially for those involved in open-source projects, which, just like public assemblies, consist of volunteers. Establishing a regular schedule is an important thing, but it is only the first step. If you commit to it too much, and sacrifice the nerve and the sense of the project to keep it, you will end up like the time-based protests "Different Russia" was arranging. At the demonstration a month ago, I saw dozens of police officers, a hundred loiterers who came to take a look at the meeting and photograph it on their iPads, and only 1 (one) actual protester.



Relativity of Simultaneity in Distributed Computing

About a year ago, I described an allusion to physical phenomena in computing in "Caching and Levers". This post is devoted to a more complex theory, namely the Special Theory of Relativity (STR), and mostly to one of its implications, the "relativity of simultaneity".

Relativity of Simultaneity

The Special Theory of Relativity may be approached as a set of coordinate-transformation equations whose effects become noticeable as the speeds of objects approach the speed of light, c.
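For reference, here is that set of equations, the Lorentz transformation, for a reference frame moving along the x axis with speed v (the primed coordinates are those of the moving frame):

    x′ = γ (x − v t)
    t′ = γ (t − v x / c²)
    γ  = 1 / √(1 − v² / c²)

At everyday speeds γ is almost exactly 1, and the equations reduce to the familiar x′ = x − v t, t′ = t; the effects below only show up when v is comparable to c.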

STR is based on two postulates, one of which states that light is always propagated in empty space with a constant velocity c regardless of the speed of the emitting body. This may be used to introduce a tricky paradox.

Train car experiment

As seen by a passenger of the train. The events happen simultaneously.

As seen by a spectator on the station. The "back" event happens earlier than the "front" event.

(pictures from Wikipedia)

Assume that a train car is passing you while you stand on a station platform. The train car has a lamp installed exactly at its middle. When the train passes you, the lamp flashes, and then you watch the light reach the front and the back of the car. Interestingly, those who sit in the car will see the light hit the two walls simultaneously, while those who stand on the platform will see the back wall lit earlier than the front one. This is implied by the postulate described above: as seen by a spectator on the platform, the light propagates with equal speed in all directions, so it hits the rear wall before the front one, as the former moves toward the light source while the latter recedes from it. See Wikipedia for more info.

This paradox, known as the relativity of simultaneity, may be formulated more generally: whether several events are simultaneous depends on the reference frame one watches them from.

But what does this have to do with computing? Before I answer, I'd like to share what I've learned from a CACM article.

Quiescent consistency

In the multicore age, the classical data structures we all know are about to change. The reason is the increasing demand for computation speed and volume that careful "sequential" CPU chip engineering is no longer capable of providing.

This challenge makes us rethink our approach to algorithm and data structure design. If we want data structures to be efficient, we may no longer expect them to behave like the good old ones of the sequential age.

In distributed programming, one approach to describing data structure behavior is known as quiescent consistency. It is one of a number of consistency conditions, alongside sequential consistency, linearizability, and others. These conditions describe how an object behaves when several threads call its methods.

A data structure possesses quiescent consistency if it is consistent between its states of quiescence, i.e. the states when no methods are currently in progress. As soon as a quiescently consistent structure has no operations pending (i.e. reaches quiescence), we may be sure that the executions of methods before this state and after this state never interleave.

Imagine a quiescently consistent stack. An implementation of one is described in the CACM paper "Data Structures in the Multicore Age", where I first encountered the quiescent consistency concept. Assume the following sequence of events happens to the q.c. stack:

  1. Three pushes x,y,z
  2. Quiescence (the pushes are processed)
  3. Three more pushes a,b,c
  4. Quiescence (the pushes are processed)
  5. Three pops
  6. Quiescence (the pops are processed)
  7. Three more pops
  8. Quiescence

Note that q.c. doesn't mean that a data structure guarantees nothing except this specific consistency. A consistency condition only maps data structure behavior in a concurrent setting onto behavior in a single-threaded environment, i.e. it only limits which of the multitude of possible sequences of method calls may explain a given set of multithreaded events. All the properties the data structure exhibits in that sequential processing still apply.

Quiescent consistency guarantees that the first three pops return x, y, and z (in an arbitrary order), and the next three pops return a, b, and c, again in some arbitrary order.

Note that, unlike linearizability, quiescent consistency doesn't impose any ordering on the results of pops if there was no quiescence between the initial pushes. For instance, even if the processing of the first and the third pushes does not overlap, linearizability ensures the pops will respect this order, while q.c. (in the case where the second push overlaps with both of them) does not.

Put this way, q.c. looks like a very weak and almost useless property. However, it still implies correctness, and that is enough in many circumstances. Imagine that you need a pool of unique numbers. It is enough to use a q.c. counter; the numbers it returns will be unique, which fulfills our initial purpose.
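Here is a minimal sketch of such a pool in C: several threads draw unique numbers from a shared counter. (This naive counter is in fact linearizable, which is even stronger than q.c.; the scalable counters described in the paper relax the guarantee to quiescent consistency precisely to gain speed, while the uniqueness of the results stays the same.)

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    atomic_int counter;                 /* the shared pool of unique numbers */

    void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 3; i++) {
            /* fetch-and-add hands out each value exactly once, so the
               numbers are unique no matter how the calls interleave */
            int id = atomic_fetch_add(&counter, 1);
            printf("got unique number %d\n", id);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[4];
        for (int i = 0; i < 4; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < 4; i++)
            pthread_join(t[i], NULL);
        return 0;
    }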

Do we really need stronger consistency?

However, the reasons why we may be satisfied with weaker consistency conditions are not limited to specific examples like this. Let's try to argue the opposite. Assume we're implementing a stack that is accessed from a number of threads. Quiescent consistency may seem not enough, because if a push of A precedes a push of B, the pops should be ordered in that specific way, and quiescent consistency does not guarantee it.

But wait... We say, "if one push precedes the other," but do we really know what "precedes" means? If two threads in our distributed computational system invoke push independently, how can we be sure that one of them precedes the other? And is there any sense in defining such a measure at all?

Let's recall the paradox we described at the beginning. In one reference frame, one event precedes the other, and in a different reference frame it's vice versa. So whether one push happens before the other depends on the place we look from, and is not, and may not be, determined by a central, ubiquitous authority. That's a law of the universe, and we can't change it in computing.

Surprisingly, this makes quiescent consistency a fairly appropriate condition for a distributed data structure. If there's no strict sense of precedence between events that happen in different places of our network, then why not use this to develop a more efficient data structure? The correct answer is, "indeed, why not?", and such data structures are well defined nowadays.

OSI model

The OSI model is an abstraction that aims to make the design of distributed computation systems easier. The chain of events that happens during data transmission is separated into several layers: each layer has its own protocol, which is independent of the actual implementation of the underlying layers. You may learn more on Wikipedia.

STR vs computing: the difference

We have successfully used a metaphor from physics in distributed computing, but there is an inherent limitation to applying the concept further. In physics, if we have a car of a specific length (as measured by its riders), we may install the light source in such a way that the events at the two ends happen predictably simultaneously in the reference frame of the spectator at the station.

In computing, we can't do that. The OSI model prevents us from predicting how fast events happen by masking out the capabilities of the system at lower layers. In physical reality, we know that as soon as the light "hits" the back and front walls of the car, the event "happens" (or, more precisely, a piece of reflected light is launched toward the spectator). In computing, however, we don't even know how long it takes for a signal to reach another node. Moreover, the OSI model makes this impossible to predict: all the layers guarantee correctness, but none of them makes any timing promises.

Conclusion

The absence of timing in the OSI model suggests that there may be no sense in saying, "this structure may not be used, as it processes requests in a different order than they were issued in". The "order they were issued in" is inherently unpredictable in our computing models.

This inherent unpredictability of the timing of processes and events, as described by the OSI model, is why we can't really apply the special theory of relativity to computing. However, we may still notice that simultaneity in the world around us, as confirmed by physical experimentation, is relative. And nature still works well!

And if nature accepts this, why can't we learn from it, and allow our distributed systems to be less predictable, trading this predictability for speed?..



Pipes in Linux and in The Real World

Linux pipes and open3

Three articles on how open3 is implemented in Linux, and how it is properly used

The importance of inter-process communication

Linux programming has always been dominated by the concept of a diversity of small, highly specialized tools collaborating to achieve the goal a programmer has set for them. Take a look at shell scripting. Here's how one searches for the string FOOBAR in all *.txt files in their home dir:
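    # find lists the files, and xargs feeds them to grep, 100 at a time
    find ~ -name '*.txt' | xargs -L 100 grep FOOBAR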

Pipes

A lot of pipes stocked somewhere in Russia.

Pipes that transfer solid items

This photo is taken from here. These pipes are more widely known as pneumatic tube transport.

Pipes that transfer documents

This pneumatic tube seems to transfer medical documents and samples packed into capsules. I saw such a system in one of the banks I had an account at; there it was used to transfer documents between counters.

It's a good illustration of sequential payload delivery by a pipe when the width of the pipe is the same as the width of the objects.

You may find more sophisticated examples in my article on xargs, which is a robust spawner of such commands. It is this robustness that allowed me, for example, to make a simple "web service" that sends information to the web when lines are merely added to a text file; look how concise its code is!

The use of such tools is not limited to making a desktop user comfortable at their workstation. These tools are actively used in production scripting as well. Having a lot of ready-to-use tools at hand gives a programmer extra opportunities for building larger programs: instead of trying to find a library or re-implement something, one may just look for a small tool for the specific purpose, one that interacts with the outer world via the command line and the input and output streams.

But combining tools requires efficient communication between them. Small tools can nevertheless exchange large amounts of data! Being the world's largest consumer of such communication primitives, Linux uses pipes.

What are "pipes"?

Before you read this section, I strongly encourage you to visit your bathroom or kitchen and closely examine the metal or plastic elongated round items that have moving water inside (or, if your country isn't rich enough for you to have them, take a closer look at the things the oil your country sells flows through). These are pipes.

Pipes that have started a war!

A week ago I watched the movie "Fair Game". Being a non-US national, I nevertheless consider the problems discussed there relevant. And one episode of the plot caught my attention. One of the "casus belli" for attacking Iraq was that it had purchased pipes that could be used to make nuclear weapons (in fact, they couldn't, and it all was a speculation of politicians). Above is a picture of such a pipe.

See how important pipes could be? Read the rest of the article to learn more!

Pipes are used to direct a flow of something from one place to another; it could be water, oil, another liquid, or even documents or change (see the side pictures). Pipes can even start a war! And Linux pipes transfer bytes: you can write to one endpoint and read the written data from the other endpoint, which may live in another process!

The three properties of a pipe

The first is that a pipe can transfer payload in only one direction. You can't use a single pipe to transfer water both ways, so that it leaks from one end while simultaneously taking in water to leak from the other. For that, you need at least two pipes.

Second, a pipe has a limited capacity. When you close the valve in your kitchen, the pipe is full of water, and no matter how hard the pump station tries, there will never be more water inside the pipe than there is now. (Of course, the station may try harder and make your pipe leak, and water can undergo compression under certain conditions, but that's not the general case.) When the station tries to pump more water, the new water is "blocked". This continues until the valve at the other end is opened and some water is removed from the pipe, letting new water in from the other end.

The third property is that a pipe transfers the payload more or less sequentially. Even the transfer of liquids, which mix easily, is somewhat sequential: when you turn on your shower after a cold night, you literally feel how the cold water is flushed out of the pipe before the hot water starts to erupt.

The interesting thing is that Linux pipes are designed closely after "real-world pipes", the only difference being that Linux pipes transfer information: bytes.

Pipes in Linux

The main attributes of Linux pipes, one-way transfer of data, limited capacity, and sequential delivery, are found in the real world too, as shown above. There is, however, one difference.

Pipes in the real world are usually found in the "full" state, i.e. the pipe is full and waits for items to be removed from it. In Linux programming, however, "empty" pipes, where it is the consumer who waits for the input, are much more widespread.

To create a pipe, you just invoke the relevant system call via the standard C library. The standard pipe(2) call returns two file descriptors, one for reading (fh_out) and another for writing (fh_in).
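Here's a minimal sketch of its use in C, with the child process writing and the parent reading (error handling abbreviated):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fh[2];   /* fh[0] is the read end (fh_out), fh[1] the write end (fh_in) */
        char buf[64];

        if (pipe(fh) == -1) { perror("pipe"); exit(1); }

        if (fork() == 0) {              /* child: the writer */
            close(fh[0]);               /* close the unused read end */
            const char *msg = "hello through the pipe\n";
            write(fh[1], msg, strlen(msg));
            close(fh[1]);
            _exit(0);
        }

        close(fh[1]);                   /* parent: close the unused write end */
        ssize_t n = read(fh[0], buf, sizeof(buf) - 1);  /* blocks until data arrive */
        if (n > 0) { buf[n] = '\0'; printf("got: %s", buf); }
        close(fh[0]);
        wait(NULL);
        return 0;
    }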

These file descriptors are usually used in blocking mode, such that:

  • a read from fh_out blocks until something is written to the other end of the pipe;
  • a write to fh_in returns when it has written everything into the pipe. If the pipe doesn't have enough capacity to hold everything, the call blocks until the consumer reads some data from the other end, so that more data can be written.

Of course, you can use ioctl to adjust the modes and make such calls nonblocking. Still, you can't bypass the basic restrictions: you obviously can't read what hasn't been written yet, and you can't store more data in a pipe than it is capable of keeping. If you want execution to continue while data are automatically written to an overflown pipe, you have to allocate a buffer and a concurrent thread that pushes the data there.

You should never forget about these two blockages, as they will affect your everyday workflow (see, for example, this StackOverflow question about the less command). Yes, sometimes it's an obstacle you have to specifically overcome (I'll address that in the third article about pipes in Linux). But in most cases this two-blockage behavior is exactly what you want.

Pipes as a synchronization means

Let's return to the find ... | xargs -L 100 example shown at the beginning of the article. If the find command has already found a lot of files that haven't been processed yet, there's no sense for it to work further, damaging the response time (the frequency at which found matches are printed) of the system. With pipes, it will be seamlessly blocked by a write() to the pipe, and you don't even have to write anything to support this: your simple, usual printf() will just return control only when the second party has done some work!

In other words, for two processes connected with a pipe as A | B, the design of Linux pipes makes this two-blockage system automatically "pause" the faster one to give the slower one a chance to accomplish its job!

So, basically, a pipe is a synchronization "primitive" that pauses a program connected to one of its ends at certain operations. It's not that "primitive", actually, as it may be implemented via a semaphore, but it's simple enough to be considered one. And, as with other synchronization mechanisms, you may deadlock when using a single pipe, let alone multiple pipes.

Road to open3

So, let's return to using inter-process communication for more efficient programming in Linux. A classical way for a tool to work is to print to standard output and read from standard input, putting error messages on the standard error stream. How can you efficiently use such a tool from your program? The answer is: connect to it with pipes! Linux has all the capabilities to arrange the file descriptors in such a way that the program won't even notice it's printing to a pipe instead of a console terminal.

We used named pipes to illustrate how Bash may be used for parallelization.

Of course, you may do it upfront, without impersonating a genuine console. Linux has the capability to create named pipes with the mknod call. These look like usual files but are actually pipes. If the target program can read from such a file instead of reading from standard input (or write to one instead of standard output), you're lucky. However, this sometimes makes target programs unnecessarily complex, and they're already complex enough; just take a look at the various implementations of echo, a program that is supposed to merely print its arguments. Second, this functionality is rarely provided for the standard error stream, and the error log is a very important piece of information for tackling bugs. Therefore, you will have to either use shell redirection or just establish a direct connection from the child's streams to the parent's, which is the easier solution.
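For illustration, a named-pipe session might look like this in the shell (slow_producer and consumer here are placeholders for any programs that write to and read from a file):

    mkfifo /tmp/chan               # create a named pipe; it looks like a file
    consumer < /tmp/chan &         # the reader blocks until data arrive
    slow_producer > /tmp/chan      # the writer is paced by the reader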

As we've already learned, you'll need three pipes: one for each channel among input, output, and error. This gave the name to the function that does the job in standard libraries of various languages: open3. It takes a command line as input and returns three file handles corresponding to the said streams of the spawned program. Here's what it looks like in Ruby:
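    require 'open3'

    # a sketch: spawn grep and talk to all three of its standard streams
    stdin, stdout, stderr, wait_thr = Open3.popen3('grep FOOBAR')
    stdin.puts 'a line with FOOBAR inside'  # write to the child's input...
    stdin.close                             # ...and signal end-of-input
    puts stdout.read                        # read what the child printed
    puts stderr.read                        # read its error stream, if any
    wait_thr.join                           # wait for the child to exit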

However, an open3 implementation may sometimes not be powerful enough to meet all your desires (this happened to me while working on cluster software for a project at work; see some rant in my earlier post), and you'll have to code a more sophisticated version. That's why it's important to know how open3 is implemented. It has several quirks, and in the next blog post I'll explain the internals of an open3 implementation.

Proceed to "How to Implement open3"



Prisoners and Fairness

Fairness is a rarely pronounced, yet important, concept in computing. Its applications appear in different domains. Even when it goes unspoken in documentation and descriptions of systems, I sometimes notice, "hey, for this to work, fairness is required!" So, what's it all about?

Prisoners

The first thing I suddenly encountered fairness in was the prisoners riddle. Here's how it's usually formulated:

100 prisoners are sentenced to death. However, they're given a chance to survive. In one of the cells, there's a light bulb, which is initially off, and a switch connected to it, which can turn the bulb on and off. In an order that is arbitrary and not known to any of them, the jailer lets prisoners into the cell; each can observe whether the bulb is on or off and, optionally, flip the switch.

If a prisoner, after visiting the cell, says that every prisoner has visited the cell at least once, and he's right, then everybody is set free. If he's wrong, then every prisoner is executed. Of course, every prisoner may pass and say nothing; then the iterations continue.

The prisoners can't talk to each other after the trial begins; they're kept in solitary. However, before this "trial" starts, they're given a chance to make up a plan, which they will then follow. Can the prisoners make such an arrangement that guarantees they will be set free, no matter in which order they are let into the room?

There are just a couple of riddles of this sort that I don't consider stupid, and this is one of them. Take time to solve it, and you'll be satisfied with the process and the result. You can keep reading, though; there won't be spoilers.

So, does the problem look sound? Is its definition complete? It sure is not. Indeed, it's quite possible that some prisoner never enters the room at all! Since they enter the cell in an arbitrary, not a random, order, this is possible. That means it's possible that not everyone will ever enter the cell, and then in no situation will the statement "everyone was here" be true. So we should conclude that no matter how the prisoners decide to act, they may never gain freedom.

This makes the problem uninteresting. What should we add to make it have only non-trivial answers? You've already got the idea: we should enforce fairness.

Definition of fairness

Picture A

A simple state machine for the jailer. The jailer nondeterministically picks one of the prisoners, lets him into the bulb room, and checks his answer, if any.

So let's formalize the work of the jailer a bit. All prisoners reside in their separate cells, and from time to time the jailer decides to open one and invite its inhabitant into the bulb room. If the prisoner doesn't decide to make the final guess, the jailer picks a cell to open again. This can be modeled as a simple state machine. See picture A.

The jailer makes an arbitrary choice of whom to invite next. But would it be fair if a certain prisoner were never let into the bulb room, as we've shown before? It wouldn't. And it wouldn't be any fairer if a prisoner were no longer let into the room after a certain moment. That's just the same as never letting him in, because up to that moment the prisoners' plan might not have moved forward at all. (My argumentation is kind of weak, but I'm trying to avoid spoilers.)

So the work of the jailer is fair if it never happens that a prisoner stops being let into the bulb cell after some moment of time. If the jailer follows this, his work is fair. This is closely tied to the concept of fairness in computing.

In one of my previous posts, the one about the search for a declarative language, I came up with my own generic definition of fairness. I'll quote it, slightly reworded, here:

Fairness is a property of a nondeterministic system, which repeatedly and infinitely faces a multiple choice, to not allow such behavior that a certain item is never chosen past any moment of time.

For this prisoners riddle to be interesting and well-defined, prisoners should be let into the bulb room in a fair manner.

Applications of fairness

Task scheduling

The scheduler implemented in your operating system is one of the most important parts affecting its performance. If your system supports multitasking, it should divide the limited CPU time between tasks to reach the goals you set. For a desktop system, you might tune your process scheduler to react quickly to mouse clicks while running some tasks in the background (I can't help quoting Linus' sarcastic letter on this matter). For a system that doesn't need a short response time but should effectively schedule long-term tasks that communicate with each other, you might need a different scheduling algorithm (see my Stack Overflow question, in which I encountered such an inefficiency on a desktop-tuned system).

Usually schedulers work like this: the timeline is divided into chunks, and before the next chunk begins, the scheduler chooses which process to assign the next chunk of CPU time to. Sounds much like the work of that jailer, doesn't it?..

In scheduling, fairness has stricter definitions and more complex metrics, which calculate, and impose restrictions on, the fractions of CPU time allocated to tasks.

Scheduling algorithms should not adhere to communism: some tasks have more priority than others. But schedulers should be fair when choosing the next process to take a CPU core. Indeed, if they weren't fair, it would be theoretically possible that our program never gets a chance to start while something else is happening! If the scheduler is fair, we at least know that the program will terminate (though we still don't know when exactly).

Distributed computing

The first place I heard about fairness as a separate concept was the study of distributed algorithms at my university.

When several machines run in parallel and communicate with each other, the result is usually quite a complex system to study and implement. Just recall all those race conditions and concurrency bugs you've faced!.. To study distributed systems, we should model them in some way.

Actually, what I call "fairness" is known as "strong fairness". There is also a concept of "weak fairness". Its "weakness" is that the system is required to eventually make a certain choice only if that option stays available continuously from some point on. If the opportunity regularly disappears, the system may opt out of it and nevertheless be weakly fair.

The most obvious way to do that is to pretend that

  • at each moment of time, only one machine works, while all the others are paused
  • there is an independent "oracle", which decides which machine will run next, the oracle not being associated with any machine
  • the oracle can stop a machine only after it has executed an atomic operation completely

Picture B

An example model of a machine in a network for which fairness wouldn't be the best way to model the machine's behavior. Fairness forces the machine to eventually end up in the "broken" state.

Sometimes it would be nice to assume that the "oracle" makes machines run in a fair manner. However, this assumption is not always desirable. Consider that we need to model what happens when a machine in a network breaks. We can model this as a nondeterministic choice to either break or keep working (see picture B). Enforcing fairness would make us assume that every machine eventually breaks, and that doesn't conform to what we want to model.

Nevertheless, that's how I first learned about fairness. The theory of distributed computing contains a lot of other interesting stuff, though it's nontrivial. Our lectures used slides like these; they might be interesting to read.

State machines

Fairness can also be defined in terms of state machines. You can check the Wikipedia article, which has a paragraph about that. Basically, there's nothing new there compared to what I described in the previous paragraphs; I even used state machines to demonstrate what I was talking about.

Implementation of fairness

Fair choice is usually implemented as random choice. The probability distribution doesn't need to be uniform (for example, in a dumb implementation of a scheduler, it may be derived from task priorities).

However, this is not completely precise. Fairness requires that every choice eventually be made. Random choice leaves a theoretical possibility that an option is ignored forever. And if this is allowed to happen, the choice is not fair anymore. What are the odds of such behavior? Assume we have two options, each chosen with probability ½, and the first option is never chosen. This happens with probability ½⋅½⋅½⋅½⋅..., which is equal to exactly zero.

Therefore, the "random" implementation only implements fair choice with probability 1, not always. With zero probability the choice will be unfair. Is that too much? Only the requirements of your implementation can tell. After all, we all were born and grew up the way we are with exactly zero probability...
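As a sketch of the difference, compare two C implementations of "choose one of n options": the random one is fair only with probability 1, while plain round-robin is fair by construction, since no option can be skipped for more than n steps.

    #include <stdlib.h>

    /* fair "with probability 1": nothing forbids a run
       of choices that ignores option 0 forever */
    int pick_random(int n)
    {
        return rand() % n;
    }

    /* strongly fair by construction: every option is
       chosen at least once in every n consecutive calls */
    int pick_round_robin(int n)
    {
        static int next = 0;
        int choice = next % n;
        next = (next + 1) % n;
        return choice;
    }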

While fairness is implemented as randomness, it shouldn't be worded as randomness in the specifications of a system. The Whenever language I described in one of my posts doesn't really benefit from the random choice of the next line to execute. Fair choice would be more suitable, as some executions, occurring with zero probability, can make a correct Whenever program useless.

***

Fairness is an important concept in our life. You should be fair to prisoners, and you should be fair when you have power and use it to allocate resources to others. In computing, it's all the same.

Definitions of fairness vary across many domains of computing. In this article I described only a couple: "strong" and "weak" fairness as understood in the theory of distributed computing. Instead of being "totally" fair or unfair, a system may also treat fairness as a metric, with its behavior defined so that this metric never drops below a certain value.

The exact definition of fairness is complex and hard to coin in a general way: it becomes too vague. I made such an attempt in the middle of this article, but if you have a better idea, it's welcome in the comments. Nevertheless, I hope this post gave you at least a notion of what fairness is. And I also hope that when you notice it in a system you design, use, or encounter, you'll use the correct and concise word to describe its behavior.



Programming as gold mining

This article was planned before the creation of this blog. I tried to understand what programming is as an approach to problem-solving, rather than just as a tool. And I found that the approach of programming resembles what happens when you mine gold (though the analogy is not the most obvious one you probably thought of). This article describes the gist of this view of programming and draws the connection to mining.

The difference between programmers and common herd

Unlike IT guys like me and you, people aren't that excited about programming. Connecting and arranging the components of an upcoming system on a blackboard is not an inspiring activity for them. They don't cheer when they manage to write a function in a curiously small number of lines. They never get to settle back, satisfied, and smile when a program works outright after the first compilation, with no debugging needed.

But the world is somehow arranged in such a way that these insensitive people are nevertheless the ones who pay us money! What they value programming for is that it's a good way to solve problems. These problems are usually beyond what a single human can handle, and you usually need a horde of zombies to tackle them.

However, problem solving is approached differently by those who can code and those who can't. The management of a non-software company usually needs something concrete, like the ability to fill out forms M-186 and P-337b through the web browser. After they find out that such an automated system would help run the business, they assign their IT guys to the task. They don't actually care how beautiful the code will be or how much satisfaction the IT crowd will get. They have a problem, and they need a solution.

Sorting ten cards

Assume you are given the task of sorting 10 cards... You could write code like this...

But I am sure that no one who calls himself a programmer does it this way. The code is too inefficient, impossible to maintain, and can't even be written without the help of an external program.

So you end up writing something like this (that's a Bubble sort):
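    /* a sketch in C: bubble sort repeatedly swaps
       adjacent cards that are out of order */
    void sort_cards(int cards[], int n)     /* n == 10 for our task */
    {
        for (int i = 0; i < n - 1; i++)
            for (int j = 0; j < n - 1 - i; j++)
                if (cards[j] > cards[j + 1]) {
                    int tmp = cards[j];
                    cards[j] = cards[j + 1];
                    cards[j + 1] = tmp;
                }
    }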

But let's recall the original task. It was, let me remind you, to sort ten cards. Yet it happens that to sort ten cards you literally have to write a somewhat complex program that sorts any number of cards. You can't write ten-element sorting without writing any-number-of-elements sorting!

In fact, the observation in the previous paragraph explains the crucial difference that prevents others from understanding why programming is hard. They think, "what is so difficult about sorting ten cards?" "You could just... what's the word?.. Um, ah! Hardcode form M-186 onto the webpage and collect the results into a database!" They can't understand that you need to create a program that comes close to processing any form in order to process just two of them.

This is evidence for the idea that programming a solution to a particular problem is always nearly as hard as programming a solution to the general one.

Boon or bane?

The downside of the tendency of "programming" to solve more general problems is that people don't understand how sorting ten items can be so hard. Books are written about it (recall Don Knuth's "Sorting and Searching" volume of TAoCP). Research is still conducted on it (new sorting methods keep being developed to handle an ever-increasing number of edge cases and specifically ordered arrays effectively). But, hell, any kid can arrange ten cards in order of increasing value! He can't devise an algorithm for doing it with N cards, but we didn't ask for N, did we? Thinking like this, most people just can't understand what's so hard, and seem to value the art of sorting less than it really deserves. Let alone the other crafts of programming.

It's no wonder that scaling makes a program more complex, and thus changes the algorithm in ways beyond merely increasing an array's boundary. But scaling and generalizing are nevertheless different problems: while the former is about looking for efficient solutions for larger inputs, the latter is just about the theoretical capability of a developed algorithm to solve a larger problem.

On the bright side is the possibility of recycling the waste of programming. Once you've written a sorting function for an array of 10 cards, it only takes increasing the array size to make your program sort a lot of cards. Of course, if you wrote an Ω(n²) sort, like I did in the example above, it won't scale that well, but let's assume you picked a better solution to the initial problem. So now you have a program that can sort any array.

Creating such residue is inevitable. We have a good analogy in other fields: one who extracts oil should also extract the accompanying gas, build facilities to handle it, and protect the site from accidents that involve the gas. In the very same way, a programmer is doomed to produce programs more general than asked for, and he should be aware that this is simply harder.

By the way, printing invitations in an automated way is a good sign that you're dealing with a programmer. Even if he has never encountered a task of this sort, a programmer will always find a programmable way to print invitations.

When I was at school, I was assigned the task of printing invitations to the prom. I had never done anything like that before but, being a programmer, I managed to find a way. I scanned and OCRed a list of people into Microsoft Excel, imported it into a Microsoft Access database, and printed the invitations as a series of nicely formatted database records over an external background image.

Completely lame, but what else could I do back then?

This property has even been a source of jokes. Alex Exler, in his "Notes of a Programmer's Bride" (oh, I re-read it and fixed the description that follows), describes a situation where a programmer spends a day writing a program that prints invitations to his wedding. The bride brags that her husband has such a good and useful profession. The man she brags to questions whether it wouldn't have been easier to just fill the invitations in manually in a text editor, and, after the bride's reply about "generalization", astonishes her with, "Yeah? So how many times are you going to get married?"

Conclusion

Back to a programmer's residue. Brooks, in his famous "Mythical Man-Month", claims that producing a reusable component takes three times the effort of producing a non-reusable one tied to a specific task. Most programs that lack as clear an abstraction as array sorting are far trickier to scale to larger inputs without effort.

But that does not contradict the main idea: you can't evade producing residue. The raw "general" residue programming leaves behind after solving a particular problem is useless by itself, just as in gold mining. To concentrate the ore, you must put effort into it, and to get some product out of the residue, you must work even more. And that's what happens when we write programs. So, isn't programming a gold mining of sorts?



Software engineering as Kama Sutra

At a meeting today we decided that our bored manager's idea (you've seen it in Dilbert) to adopt Scrum is not so bad. It seems we're experiencing a sudden attempt to use yet another cool software engineering framework, the kind of change developers usually dislike. Having been unusually and weirdly excited about such a commonly undesirable change, I tried to figure out why...

And I think I got it. Tomorrow my beloved wife and I are trying a new sex position. Yes, I lately realized that I'm totally married to my job (which I unconsciously referred to in a recent blog post). And the excitement is that we're finally going to switch from the missionary position to something new, with more... movement, I guess.

But then I suddenly realized: the whole software engineering thing is just a large group-sex Kama Sutra.

Just look at it.

"Agile methodologies" totally require you to be in good physical shape to satisfy the project and to do a lot of different things for it.

We actually started doing Scrum at work a couple of months after this post was published. Here's a post with my impressions of what Scrum is like.

Scrum, with all its "chickens" and "pigs" being references to "men" and "women", requires rhythmic moving. However, I think constantly questioning your partner about whether things are going well doesn't work that well in bed. Hell, "think"; I know; you do too.

Some techniques are even worse, however. "Extreme programming", which values fast delivery, isn't what women usually want. And the DSDM technique... I don't think it needs further explanation.

Of course, you can find a sexual reference in nearly every domain of human activity, but it was funny to look for one in software engineering. It appears, however, to be not as funny as I expected. So, basically, tomorrow I'm having new sex. Isn't that what 80% of blogs are about?



Haskell's "that" operator

«Практика функционального программирования»

Translated as "Practice of Functional Programming", this is the title of an introductory magazine. The magazine is free (free as in "free beer", not as in "free speech"); you can download it at fprog.ru (yes, it's in Russian). It contains a lot of interesting material and is a very fruitful source of references to [mostly English] articles and books about functional programming in general and about particular languages.

The key point is that it focuses on the practical aspects of functional programming. This makes it useful as a source of industrial success stories, which you may need to convince your boss to use such a powerful tool as functional programming.

I liked it so much that I even donated $10 to it. And I recommend this magazine without hesitation.

I recently started reading the magazine «Практика функционального программирования» ("Practice of Functional Programming") to learn more about the world of functional programming languages. There I noticed a nice piece of Haskell syntactic sugar, and that is what I'd like to tell you about.

They call it the "infix version of function application". I call it the "that" operator.

Here's how it works. In functional languages you usually write f(x,y) as f x y. Moreover, you pass a lot of functions as parameters, so f(x(y)) has to be expressed differently. Here's where parentheses come into play: f (x y). Lisp, for example, requires such parentheses to wrap every function application. Haskell instead introduces syntactic sugar to avoid piles of parentheses in expressions like f (x (y (z (t v)))). Or, to be less abstract, in expressions like this (it's functional pseudocode):
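    print (sum (filter even (map square (range 1 10))))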

In Haskell you can avoid these parentheses with the function application operator, $. (Or is it a function, not an operator? It is, in fact, an ordinary library function used in infix position.) It has a different precedence and "handedness" from those of the usual function application, and that's why it's convenient. Here's the example above rewritten with this operator:
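    -- ($) has the lowest precedence and groups to the right:
    --   infixr 0 $
    --   f $ x = f x
    -- ((^ 2) and [1 .. 10] are Haskell's spellings of square and range 1 10)
    print $ sum $ filter even $ map (^ 2) [1 .. 10]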

Or we can use another operator, "dot", which expresses function composition and was chosen to resemble its symbol in math, ∘:
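    -- the stages are glued into one function with (.), then applied once
    print . sum . filter even . map (^ 2) $ [1 .. 10]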

Now, why do I call $ the "that" operator? Because the word "that" serves a similar purpose in the English language: it frees the language from unnecessary parentheses. Compare these two versions of the "This is the house that Jack built..." nursery rhyme by Mother Goose:

With "that" operator:

This is the farmer sowing his corn,
That kept the cock that crowed in the morn,
That waked the priest all shaven and shorn,
That married the man all tattered and torn,
That kissed the maiden all forlorn,
That milked the cow with the crumpled horn,
That tossed the dog,
That worried the cat,
That killed the rat,
That ate the malt
That lay in the house that Jack built. 

With parentheses:

This is the (farmer sowing his corn,
kept (the cock that crowed in the morn,
waked (the priest all shaven and shorn,
married (the man all tattered and torn,
kissed (the maiden all forlorn,
milked (the cow with the crumpled horn,
tossed (the dog,
worried (the cat,
killed (the rat,
ate (the malt
lay in (the house Jack built))))))))))). 

Parentheses

One of the most famous functional languages is LISP. Its name is an acronym for "Lots of Insignificant Stupid Parentheses". There are several LISP jokes involving parentheses. Here's a good one about a Soviet spy and LISP.

Doesn't look too nice, and it finishes with that notorious tail of parentheses (see the side note). But natural language developed in such a way that it provides a nice notation for function application. Was it what inspired Haskell's authors? Hardly. But still, I think there is a lot of similarity between these concepts.

The "that" operator is a very helpful thing, and it's sad that some functional languages (namely OCaml, which I use at work) don't have such a nice feature.



Programming as Controlling Mindless Zombies

The question that bothers many developers, and maybe even people in related fields, is "What is programming?"

Indeed, many of us do it every day (the more seasoned ones taking a break on weekends), but who has really bothered himself with a philosophy of what he's doing? Since I'm fond of making up my own philosophy, I'll try to pore over the subject in several blog posts. In this one I'll address one side of it, and you, having read the title, already know which.

Computer programming is indeed controlling a horde of mindless zombies. Let's check the definition of a zombie on Wikipedia:

A zombie is a creature that appears in books and popular culture typically as a reanimated dead or a mindless human being. Stories of zombies originated in the Afro-Caribbean spiritual belief system of Vodou, which told of the people being controlled as laborers by a powerful wizard

First of all, computers are dumb. They really are. Just compare the time you need to devise an idea and think it out with the time you have to spend explaining it to a computer. To a machine that understands only ugly, primitive texts that make no sense to any human, unless he's a programmer. How many times have you scolded your computer for not being as smart as you? For not understanding that here he shouldn't do this, because it's obvious that he should do that instead?

Recently I was writing an OCaml program. OCaml is a very smart language that can infer the types of identifiers by looking at the operations applied to them. That's right, he's smart, as long as things go the way he's used to perceiving them. But as soon as you write "split the list into a list concatenated with an element" instead of "split the list into an element concatenated with a list", he becomes unbelievably dumb. If he were a Russian bureaucrat, I would even think he was doing it on purpose to extort a bribe...
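A sketch of what I mean (with made-up function names): taking the head of a list, "an element concatenated with a list", is matched directly, while taking the last element, "a list concatenated with an element", can't be pattern-matched at all and has to be reached step by step:

    (* "an element concatenated with a list": :: is a constructor,
       so it can appear in a pattern *)
    let head_opt = function
      | x :: _ -> Some x
      | [] -> None

    (* "a list concatenated with an element": @ is a mere function,
       not a constructor, so "rest @ [x]" is not a legal pattern,
       and the last element has to be reached recursively *)
    let rec last_opt = function
      | [x] -> Some x
      | _ :: rest -> last_opt rest
      | [] -> None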

So, computers are totally incapable of doing anything they are not explicitly told to do. That's quite a good definition of being mindless.

But these mindless machines happen to do some useful work. And the complexity of this work, usually beyond what any human could handle, sometimes gives a false sense that these machines possess intellect. But it's not an intellect of their own. The intellect within a binary is that of the programmer who coded it, of the tester who checked that the dumb machine doesn't crash everything around it, of the manager who gathered them, and of the millions of people who invested their mindpower into compilers and interpreters, the things that make one mindless machine teach another. I can't help quoting the famous Conway's law:

organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations

(quoted via Wikipedia). This law is a direct consequence of the fact that computers are dumb, and it's we, the humans and the organizations, who instill our own properties into them.

So the result of computer programming is a horde of creatures that carry out the will of their masters. Well, sometimes not exactly what the masters meant, and thus you see strange error messages here and there, where you really don't expect them. That's because the creatures are so dumb that a separate programming engineering discipline is devoted to "talking to zombies" (it's also known as "coding"), and the discipline is not practiced well enough by some developers.

And one last thing to conclude the post. While it's always hard to instruct a zombie, it's sometimes also hard to make an already instructed zombie fulfill your needs as a user. Therefore, users are sometimes required to possess the thing these creatures do not have: brains...

Mindless creatures that look for brains? That just can't sound more zombie! And it's precisely what we deal with while programming.


