Recent blog updates

Shown here are only posts related to bash. You can view all posts here.

The Pipe-Pipe-Equals

It is hard to come up with a google-friendly name for the ||= construct you see quite often in some programming languages, "pipe-pipe-equals" being the closest (other names include "double-pipe equals," "or-equal," or "double-or equals"). Why should we name this monster in the first place? While cryptic at first sight, it is a very convenient shorthand that, I believe, allows us to write cleaner code. Let's see how it works with a small example:
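(a Ruby sketch; the values are made up)

x = nil
x ||= "default"    # x was nil, which is falsy, so it gets assigned
x ||= "other"      # x is already set: nothing happens
puts x             # prints "default"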

Essentially, this ||= construct is equivalent to other convenience operators, such as +=: x += 2; means x = x + 2. The "or" operator || usually denotes a short-circuit ("weak") logical "or" that doesn't even evaluate its right-hand side if the left-hand side is true. Usually, meaningful values evaluate to true, while the only things evaluating to false are NULL pointers/empty values, zeroes, boolean Falses and, sometimes, empty strings.

The statement is mostly used to set a variable to a "default" value if it's "unset" at some point. This gives the most visible advantage when you initialize dictionary elements in a loop. Here's a piece that saves the counts of each array element into a dictionary (Ruby):
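Roughly like this (the array contents are arbitrary):

output = {}
[1, 2, 2, 3, 2].each do |x|
  output[x] ||= 0        # make sure the bucket exists before incrementing it
  output[x] += 1
end
# output is now {1=>1, 2=>3, 3=>1}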

In some languages (Perl, Javascript), you don't even need to bother with this, as += 1 on an unset value results in assigning 1 to it.

Alternatively, in Ruby you could set a default value for the hash itself, but ||= is used more widely, and it also takes exactly one line... doesn't it? ;-)

If you don't initialize output[x], you'll get a runtime error for trying to increment nil. The advantage of ||= over the alternatives is that you don't repeat anything. You could have written the same piece as,
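for example, with an explicit check:

output = {}
[1, 2, 2, 3, 2].each do |x|
  if !output[x]
    output[x] = 0
  end
  output[x] += 1
end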

But here you have to type output[x] twice and add two more lines of code, which is a waste of screen space and, possibly, of computing resources if the interpreter doesn't optimize the duplicated computation away.

Oh yes... we almost forgot about the most natural use of ||=: booleans. Here's how we'd check whether an array contains zeroes while we iterate over its elements for something else anyway:
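A sketch (array and do_something_else stand for whatever else the loop is there for):

contains_zero = false
array.each do |x|
  contains_zero ||= (x == 0)    # stays true once a zero has been seen
  do_something_else(x)
end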

Pipe-pipe-equals is rarely used this "natural" way, though.

Let's take a look at how pipe-pipe-equals works in different languages (we've already seen it in action in Ruby).

Perl

Perl was the first interpreted language I learned, and the first place where I saw the pipe-pipe-equals operator. It works as expected, but it's used less often than the plain short-circuit "or", which is the usual way to specify default parameters of a function. Here's how you'd put a default value into a hash bucket:
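A sketch with made-up names:

use strict;

my %settings;
$settings{verbosity} ||= 1;    # set a default unless the bucket already holds something "true"

# and the more common idiom: a default for a function parameter
sub greet {
    my $name = shift || "world";
    print "Hello, $name!\n";
}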

This works under strict mode, too. Note that, in Perl, many things evaluate to false, including empty strings (I once even tried to emulate this in Ruby). To restrict the assignment to undefs only, use //= instead (the defined-or assignment available since Perl 5.10, and in Perl 6), which sinks to its right like the Pisa tower and looks as if you're trying to type a division while drunk.

Python

Python has no equivalent of this operator. You have to type the very[long(expression)] twice, and it will be computed twice. You have several different ways to do this:
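For instance (very[long(expression)] being the same placeholder as above, and default whatever you fall back to):

# the straightforward way: an explicit check
if not very[long(expression)]:
    very[long(expression)] = default

# or, abusing the fact that "or" returns one of its operands
very[long(expression)] = very[long(expression)] or default

# for dictionary keys specifically, setdefault does it in one call
dictionary.setdefault(key, default)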

The engineers more experienced in Python programming, though, assure me that this is not a big deal since you have a different way of supplying default arguments for a function, which seemingly covers half of the use cases for pipe-pipe-equals (this doesn't prevent Ruby from having both, though). Another half is covered by dictionary's dict.get(key, []) method, so that the code piece #1 can be written in a succinct manner. But I still miss it.

Bash (Linux shell)

Bash? Its language is so simplistic, and it looks creepy; how come it would have an equivalent of such a beautiful shortcut? Here it is:
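(variable and y are placeholders; the leading colon is the no-op command, used here only for its side effect of expanding the parameter)

: ${variable:=y}      # assigns y to variable if variable is unset or null
echo "$variable"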

This assigns y to variable if the latter is unset or null (as per the Bash manual). I mostly used it to set default parameters for environment variables that the user may or may not have set before invoking the script.

C++

While a |= b is a syntactically correct C++ expression (there is no short-circuit ||= counterpart), it doesn't do what we have discussed here.

C++ is statically typed, so the result of the standard logical "or" is boolean. Retaining the nice semantics we find in a dynamically typed language would require the result to be "either of the type of the left-hand side or of the type of the right-hand side", which is hard to pull off in a statically typed, pass-by-value language.

Pass-by-value semantics also means that not everything can be assigned a NULL, and not everything converts to a boolean value. C++ has default function arguments just like Python, so the same reasoning applies here; you'll simply have to be more verbose in C++. That's probably why only the |= operator is available, and it is only useful when its left-hand side is a bool (see the boolean example above for a similar usage).
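For illustration, a sketch of that bool-only usage:

#include <vector>

bool contains_zero(const std::vector<int>& xs) {
    bool found = false;
    for (int x : xs)
        found |= (x == 0);    // the closest C++ analogue of the "natural" boolean ||=
    return found;
}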

OCaml

Everything said above about C++ applies to OCaml as well. Moreover, OCaml, as a functional language, doesn't have seamless support for mutation, and the pipe-pipe-equals statement is inherently mutational. Its match construct could express the same thing, but it would require us to type the very_long_variable twice. However, OCaml and other functional languages have a very interesting construct called "option". If something has the type "X option", it contains either "nothing" or a value of type X. Such a value can then be "unpacked" through pattern matching:

let very_long_variable = match very_long_variable with None -> y | Some t -> t

Here, t is not a shorthand for another very long expression; it's just an identifier, written as is. The match...with construct lets us "unpack" values of structured (algebraic) types with shorthands like this. Since this is still too long, OCaml provides it as a library function, Option.default:

let very_long_variable = Option.default y very_long_variable

Anyway, OCaml programs are even more explicit than those in C++ and Python, so trying to tie pipe-pipe-equals into them is quite pointless.

Ruby

"Erm, we saw how it is in Ruby at the beginning," you might think. Well, I lied to you a bit. The thing is that, in Ruby, it is not strictly equivalent to an assignment to a result of logical "or". Which do you think x ||= y is equivalent to?

In Ruby, and only for ||= and &&=, it's the second. If you assign to something other than a local variable, what looks like an assignment is actually a method call (think properties), and, if so, the assignment does not happen at all when the left-hand side of ||= is truthy. This makes sense, but it looks like a special case. Read more here.

Why You Might not Need This

Some argue that this operator is mostly useless, especially if their favourite language doesn't have it. Here are some arguments they list.

Redundancy

Indeed, this expression is redundant. You can do the same thing in multiple different ways; all the examples I demonstrated above show how to write what is essentially a simple if statement in a very short form. The number of characters spared is probably not enough to put this feature on a language designer's must-have checklist.

The statement discussed decreases redundancy in code in return for a broader language definition; each language seeks a balance between the two, and often leaves the pipe-pipe-equals aside.

Default Function Arguments

The ability to specify a default function argument (C++, Python, Ruby, but not Bash or Perl) covers many of the use cases for ||=. However, it doesn't help with fancy loops that fill complex structures, one of which we saw above. Nor does it help when you have to specify the default value anyway but the concrete value is not known at the time of coding, for example an optional field the user may or may not fill.

Confusion

It is confusing. The pipe-pipe-equals requires an explanation of how it works, and it becomes a special snowflake, different from its "mathematical" counterparts such as +=, if a language wants to make it really useful (see the Ruby section above). While "confusion" as an excuse for not doing something is my pet peeve (and I tried to explain why), it indeed requires some effort to understand the mechanics. I'm sure you'll love it once you understand it, and I learned about this operator not from a textbook or a "The Most Confusing and Obscure Programming Language Features Possible" newsletter, but by reading someone else's code.

To Use or Not To Use?

When it comes to deciding whether or not to use pipe-pipe-equals in the "creepy" way we discussed throughout the post, the criticism somehow fades out. If the language supports it, it gets used. I haven't encountered any coding style document that bans this feature, and in Ruby and Perl it is considered as good an idiom as any. So the answer to the question of whether or not you should use pipe-pipe-equals (||=) in your programs is, definitely, "yes".



Syntax elements? User-defined functions!

Many languages, especially the non-functional ones that originated in the 20th century, have immutable syntax: symbols such as +, (, or . have a predefined meaning that can't be changed by the user.

Based on this, when you see a similar symbol, you think that its meaning is predefined by specification of the language. But it's not always true.

C++

The most beautiful example of operator overloading in C++ is "lazy evaluation". Assume you defined a matrix class. You may want to use operator overloading to work with it, so you could write:
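for instance (M, A, B, C and D being matrices):

Matrix M = A * B * C * D;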

The expression at the right-hand side will be evaluated like this: first A*B will be computed and stored in some internal temporary variable tmp1, then (tmp1*C) will be computed, and stored in tmp2, etc.

However, it's known that evaluation of matrix multiplication chain can be optimized. So instead of straightforward definition where multiplying two matrices returns a matrix, we could define it in such a way that it returns a special "chain handler". This "handler" would accumulate references to all matrices involved in the multiplication, and perform the actual multiplication only once, when it gets converted to matrix M as a part of the assignment.
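A rough sketch of that trick (the Matrix internals and the optimal-ordering algorithm itself are left out):

#include <vector>

struct Matrix { /* data, dimensions, ... */ };

// the "chain handler": it only collects the operands of the multiplication
struct MatrixChain {
    std::vector<const Matrix*> parts;

    // the real work happens here, when the chain is converted back to a Matrix
    operator Matrix() const {
        // pick the optimal multiplication order over 'parts' and perform it (omitted)
        return Matrix{};
    }
};

MatrixChain operator*(const Matrix& a, const Matrix& b) {
    return MatrixChain{ { &a, &b } };     // A * B starts a chain
}

MatrixChain operator*(MatrixChain chain, const Matrix& c) {
    chain.parts.push_back(&c);            // (A * B) * C just extends it
    return chain;
}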

C++ is notorious for its operator overloading. You can imbue (nearly) any operator used in expressions with any meaning. This means that a simple expression such as "a = b + c" could actually invoke a complex user-defined procedure.

Of course, you can't redefine everything. Syntax elements used to define control flow can't be redefined, and neither can a few operators (such as :: or ?:). Most importantly, you can't create your own operators. But even with the powers you do have, you can hide quite a few userspace functions behind the usual syntax.

Linux Shell

If you wrote or read Bash programs, you probably encountered such constructs:
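for instance (the file and the variable are arbitrary):

if [ -f /etc/passwd ]; then
    echo "the password file exists"
fi

[ "$answer" = "yes" ] && echo "confirmed"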

The [...] "conditional" expressions are used to query file attributes and to compare strings. They are builtin shell operators...or not? Let's check:

pavel@lonely ~ $ ls -l "/usr/bin/["
-rwxr-xr-x 1 root root 35032 Jul 20 01:53 /usr/bin/[

What the hell? So these brackets are actually small programs that merely take the subsequent tokens as command-line arguments? Well, they could be:

pavel@lonely ~ $ "/usr/bin/[" -f /bin/bash ] && echo Yes
Yes

Thankfully, Bash itself doesn't invoke them (it uses built-in implementation of this functionality). You can check it with strace or by temporarily removing this file. Perhaps, this file is kept for compatibility with other shells.

This example demonstrates that what you think of as syntax elements could actually be a user-defined function.

OCaml

Haskell language has a special notation to make an arbitrary function infix. I.e. you could make an "operator" out of a simple function:

Prelude> min 1 2
1
Prelude> 1 `min` 2
1

However, another functional language, OCaml, doesn't have such syntax. It can define a function as infix, but it can't "convert" an existing one to infix as shown above. And there's no hope for that to work: it would require adding new syntax to the language specification... or would it?

# let ( /* ) x y = y x
  and ( */ ) x y = x y
  in 1 /*min*/ 2 ;;
- : int = 1

The idea is to define functions that look and feel like "syntax", leaving the language definition intact. The code above defines infix functions named "/*" and "*/". And if you saw code like this (1 /*min*/ 2) far away from the definition of these "functions", you would really think it's some kind of OCaml syntax.

Here's the origin of this idea.

***

Modern languages give users more freedom to define and re-define meaning of what looks like syntax elements. This freedom can be used to write more beautiful and concise code, as shown in the examples above. By the way, do you know more?



Limiting time and memory consumption of a program in Linux

For the impatient developers, here's the link to the script that limits time and memory. It's implemented in Perl, weighs 14 kilobytes, and has several additional features, such as hangup detection and collecting resource usage statistics. You may also browse the timeout GitHub project. The sources are licensed under Apache-2.0.

The others, more patient, are welcome to read the rest of this article.

Why do we need to enforce limits?

Most programs we are used to dealing with are well written. Several years of development, stable versions, a bug tracker with a lot of problems fixed: this describes the majority of the programs we work with. Good software is usually developed to do its job in a minimal and predictable time. And if we deal with persistent programs, which are designed to stay in memory, then their memory consumption is usually watched, and bugs about violating the pre-specified limits get fixed... well, at least, such issues are considered defects :-)

However, in certain domains, some programs are not like that. They're immature, support only some narrow use-cases (while claiming that they're more generic tools), and contain a lot of bugs. The time for simple usage is more or less predictable, but no one can tell if it won't suddenly take much more time to finish a particular case. Yes, I'm talking about "research-quality software".

These programs usually attempt to solve NP-complete problems with heuristics (and we know that's a good thing). For many such heuristics, neither the time nor the memory consumption of an implementation has known theoretical limits, let alone practical ones. Usually such algorithms work well, and in most cases the limits are reasonable. However, on certain inputs, which can't be easily distinguished up front, they can easily blow past those limits.

This doesn't seem to be a big problem if your aim is just to do some research and throw your code away (it's sometimes even referred to as "throwaway code"). However, if you're trying to ship something of production quality, you can't accept such immaturity. If the program works with a particular case for too long, it may block providing service to the other customers. And if it consumes all the memory available, it may throw the whole machine to the swap hole until reaped by OOM killer.

What the solution would look like

To solve the problems listed above, we need to run programs under control, and enforce time and memory limits, terminating the program upon their violation. These limits should be set manually (in order to enable tweaking, and allow setting memory limit lower than the available maximum). The most obvious command-line interface would be:

        limit -t 600s -m 1000000k command arguments ...

The command is executed with its arguments and is constantly monitored. As soon as one of the limits is violated, a relevant message is printed to STDERR, and the program is terminated.

The biggest problem with setting these limits is that the processes we run may be "black boxes", i.e. we should be able to limit processes whose source code we don't have access to. We can't install signal handlers in them, or affect their code in any other way.

Another hard problem is that these processes spawn children, which do a considerable fraction of the overall work. Some of these children are mature and well-written programs, while others are the very same "research soft" that can eat all the memory with as great an appetite as their parents have.

This means that we should count the time and memory consumed by all processes spawned by the one we watch for.

"But what's the problem?" you might ask, "Linux provides enough standard utils to solve this problem!" However, not all of them fulfill all the requirements to a good limiting tool. I'll enumerate them after the explanation of how the time limiting script is implemented.

What works

The following solution to the problem is a result of several months of experimenting, which involved studying of a lot of manuals, and several "Eureka!" moments.

It looks fairly dumb: just poll each process in the tree spawned by the controlled process for its time and memory consumption several times per second, and kill the whole tree on violation. However, it was no simple task to make this actually work correctly.

Tracking memory consumption is very simple. At each moment of time, every process exposes valid information about its virtual memory, and the total memory used is no greater than the sum of the current virtual memory sizes of all the controlled processes in the tree. If that sum exceeds the pre-specified limit, we just terminate the controlled processes in the tree and print the relevant message.

However, it's not that simple for time consumption. Time consumption accounts for the history of the process tree evolution, as well as for the snapshot in a specific moment of time. By examining the CPU time for each process in the tree, we can learn for how long they had been running, but we don't know how much time was spent there in the processes that had already finished. The total time is greater than the sum of such snapshots. Even if we manually track which processes die, and store the sum of their times, we miss very short processes that are born and finished between two out polls (see Figure 1).

I found out that Linux kernel calculates, for each process, the cumulative time of its reaped children (like in man 2 times). If a process has been terminated then its total runtime is accounted in the cumulative children time of its parent; if the parent is also finished, then the time's added to its grandparent, et cetera. This way, the total CPU time of the already finished children of the processes we watch for is just a sum of their cumulative user times.

Thus, we may learn the CPU time of all the processes currently running, and of all the processes that have already finished. All we need is to sum these two metrics, and to compare the result with the specified time limit. If the limit is violated, the message is printed, and the controlled processes are terminated.

We can fetch these values accurately via /proc, even for black-box processes.
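For illustration, here's roughly how these numbers can be fetched in shell (the actual script does it in Perl):

pid=$1                      # the process we are examining
# utime, stime, cutime and cstime are fields 14-17 of /proc/PID/stat, in clock ticks
# (naive field counting; it assumes the process name contains no spaces)
read -r utime stime cutime cstime < <(awk '{print $14, $15, $16, $17}' "/proc/$pid/stat")
hz=$(getconf CLK_TCK)
echo "CPU time so far: $(( (utime + stime + cutime + cstime) / hz )) seconds"
# and the current virtual memory size, for the memory limit
grep VmSize "/proc/$pid/status"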

Implementation — the timeout script

Of course, the bare text isn't very inspiring. It just has to be accompanied with a script. And here it is, 14kb of Perl code. The sources are licensed under Apache-2.0. They were a part of the project we develop at work, but we decided to fork this particular script as a separate project.

Seemingly stupid as it is, that's the only thing that works. At first glance, anyone who knows something about signals, children, wait calls and all that machinery will say that the solution is quite dumb and inefficient. However, I couldn't find anything better, and any improvements you suggest will be much appreciated. Here's the list of the things I tried.

What doesn't work

РЖД totally looks like PID!
We started speaking about Linux processes, and I suddenly reckoned that the logo of Russian Railways company (РЖД) totally looks like pid (Process IDentifier)! See it for yourself!

Ulimit

Ulimit (the console tool, not the system call) is the first thing a Linux user would think about. You launch it in the shell, set "soft" and "hard" limits for each resource, and children, which are invoked in this shell, are limited in memory and time.

Its limits are imposed like this:

  • Time limit is set by the setrlimit system call. When the soft limit is reached, a SIGXCPU is sent (which kills by default). This signal can be handled by the process; for example, it can decide to limit its own resource consumption upon receiving it. On hard limit violation, a plain SIGKILL does its dirty job without any possible impediments.
  • Memory limit is set via the same system call. It is enforced by making a violating memory allocation (malloc() and friends) fail with ENOMEM. This usually leads to an unhandled exception of sorts, and the program (or its interpreter) is forced to terminate.
However, this doesn't meet all requirements. The thing is that
  • The setrlimit() call only limits the process in which it's invoked. Children of the process inherit the limit, but their accounting "starts over," whereas we are interested in limiting the sum of child and parent resource consumption.
  • You can't determine if the reason to terminate the process was a limit violation. When a memory limit is violated, we should print a message about that, but for that we should know all possible outcomes of a no-memory exception. Shell, Perl, Ruby, Python, C and other languages print message about lack of free memory in different, non-portable ways, and we can't possibly enumerate them all.

    However, setrlimit notifies the process when it exhausts its CPU time limit by sending SIGXCPU. This, theoretically, could be handled if we had the sources of the controlled process available, which we don't: all processes are "black boxes". We also can't create a wrapper around the program and catch this signal in the wrapper: it would be the wrapper's time that gets limited then, not the target's.

Sleep && kill

Another "easiest" solution to track time consumption is to sleep for a time equal to the limit and kill the controlled process afterwards. This only limits real time, while it's more interesting to limit CPU+sys time.

This may sound an overkill, but believe me, you will have to measure CPU time. Hardware develops in such a way that a performance boost may only be achieved by employing parallelization. And parallelization involves competition for a shared CPU between multiple threads and programs, and when that happens, the "astronomical" time elapsed becomes much greater than the CPU time, the time the process had actually been running for.

Coroner's way

A special wait-like call, wait3, can harvest information about a deceased child. It can return CPU time spent by the child. This would present precise information about the runtime of deceased processes.

However, we can't reap grandchildren, so such information would have to be harvested by every process in the hierarchy. Glibc can make a program notify its parent when it dies, but the relevant call has to be made by the process that spawns the children, and the processes we watch can't be modified. So this option can't be the solution.

One more thing is that if a child process hangs up in an infinite loop, and never terminates, we can't wait for it, hence we can't detect it, and watching for such a limit will fail.

Thus, this way of tracking is not close to the solution for limiting time. Let alone that this way won't allow us to track memory consumption: it's an "instantaneous" property, not a cumulative one.

Using process groups to control children

One of the previous versions of the script used process groups to control the children, just like the shell does! It put the spawned process into a separate process group, and its children were assigned to that group automatically. It was easy to kill them all with a single kill command, which is capable of sending a signal to a whole process group.

However, the issues raised by that "just like the shell does" part were too severe. If your black-box process contains a shell script somewhere inside (be it a system()-like call or a mere Bash script), that shell will unconditionally interfere with your process group setting, and all the processes spawned through it will slip out of our control.

The current version of the script tracks process trees by default (but you may switch to tracking process groups). Of course, it will miss children explicitly detached from their parents (such as spawned daemons). But in the only such case we noticed in our project (an ssh process spawned to serve a network mount), we did not actually want to track those children anyway! So we switched from process groups to trees without hesitation.

What could be improved

Of course, the sources are not ideal. They contain several problems, which are already identified, but not fixed (due to YAGNI principle). Here's the list of them.

Terminating several processes in a group correctly

Let's first elaborate on the case with process groups. The Linux command that sends signals, kill, is capable of sending a signal to a whole process group, i.e. each process in the group receives the specified signal.

However, the processes in the group we track may have internal dependencies. Assume that two processes are connected by a pipe, and the group receives SIGTERM. If there is an interval between delivery of these signals, then, after death of one process, the other may receive SIGPIPE. This signal may cause reporting an error that would look like there's an internal error in that black box. However, this problem was actually caused by the resource controlling script.

I asked a question on Stack Overflow about this, and the only solution close to the truth was to pause the process group, then send the TERM signal, then let the processes in the group continue. Here's the relevant part of the procedure:
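(the script itself is in Perl; in shell terms, with pgid being the id of the controlled process group, the sequence boils down to this)

kill -STOP -- -"$pgid"    # freeze every process in the group
kill -TERM -- -"$pgid"    # deliver the termination signal while nothing is running
kill -CONT -- -"$pgid"    # wake them up; the first thing they do is handle the TERM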

If there's a better solution, or a proof that after the kill syscall the next instruction executed in every target process is its signal handler, please let me know!

Terminating several processes in a tree correctly

Process trees are even more tricky to terminate. Here's how we do it:

  1. Get a list of all PIDs in the controlled tree
  2. Stop all these processes by PIDs
  3. Repeat 1 and 2 for the processes spawned between getting list of PIDs and sending signals
  4. Repeat 3 until no new processes are spawned
  5. Send TERM signal to all the processes in the tree
  6. Get list of PIDs in the tree
  7. Send "continue" signal to them

If this solution can be simplified, please, let me know.

Error due to insufficient frequency

The case when a grandchild is born and finishes between two measurements was mitigated by taking the cumulative children time into account. However, the controlled process is still free to go insane between two wakeups of the polling script. This is especially important on an overloaded system, where the timeout script has to compete for its CPU share with other processes. Perhaps making the controlled process "nicer" (in the nice(1) sense) would help.

Perl quirks or lack of knowledge?

The actual source contains more tricks undisclosed here. For instance, Perl has issues with ALARM signal, and I had to work this around the way I found in the article linked.

However, I'm not sure if I worked around it correctly, especially in the part that terminates the processes after the script detects a limit violation. It involves the usual SIGTERM-sleep-SIGKILL sequence, but that sleep sometimes never ends, which occasionally leads to hangups in the controlled processes' SIGTERM handlers.

Conclusion

In this small article, I shared what I experienced while developing a script that limits the time and memory consumption of a "black-box" process and its children. The result is a "dumb" but surprisingly tricky solution implemented in Perl, along with the reasoning why there aren't many options to choose from. Further directions in which this script could be improved are also listed, and I linked the GitHub project for the timeout script.

I hope this will help you in your work. The Apache license allows you to use the script in commercial products in return for just mentioning its usage and its original copyright owners.



Uploading and processing data with inotify-tools

I'm not a muscleman. In fact, I'm kind of a wimp. But given all this, my weight is 15 kilograms greater than I should have with my height. Time to lose some.

I've been logging my weight for quite a time, but I realized that logging alone doesn't help. I need other people to nudge me, so I would be more ashamed of my condition. To achieve this aim, starting today, I will publish a graph of my weight. It is rendered by gnuplot, and that page contains the script that renders it.

And, of course, since I'm a geek, the process of displaying the graph should be made as automated as possible.

Nearly every morning I weigh and log the result in the text file. My aim is to automatically upload it to the server right after I edit it; of course, manual uploading is out of the question. The update happens at different times, and I don't want to poll the file every hour, since it is just inefficient, and induces an unnecessary latency. And here's the neat solution.

What is inotify

Inotify is an interface to Linux kernel; its purpose is to watch and report when certain filesystem events happen.

Just like Berkeley sockets, inotify calls create file descriptors, which can then be watched by usual select() calls.

A list of events that may be watched includes accessing file, modifying file, moving files and modifying metadata. Directories can also be recursively watched.

More info you can get in man 7 inotify and on Wikipedia.

Tracking filesystem events with inotify-tools

Starting from version 2.6.13, the Linux kernel has an interface that reports filesystem events without the need to wake up on a timer to check for them. This subsystem is referred to as "inotify". We can utilize it for our aims.

The general idea is to run a daemon that waits for the weight log file to change. After I click "Save" in the text editor, the daemon notices it and sends the file to the server via scp (copy over ssh). On the web server, another notification daemon watches for changes in the destination file; when they happen, it renders an image with the plot and copies it to where the web server can serve it.

Of course, nowadays one doesn't need to write C programs to access inotify directly via syscalls. Instead, I installed inotify-tools package (sources), which allows to use inotify from shell; a command we will use is inotifywait.

Inotify, as noted in the kernel docs, is accessed by establishing watches that report information about events via file descriptors. Waiting for a filesystem event is a mere blocking read from a file descriptor, the intrinsics being handled by the kernel. The inotifywait command just wraps this in a convenient way.

One of the ways inotifywait can be used is to "monitor" events: the inotifywait process just prints a line to STDOUT when an event we watch for occurs. So we can pipe the output of an inotifywait invocation into a simple shell loop that reads the lines and does the work on each one. Here's what the daemons look like:

Local PC:
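(a sketch; the paths, the host name, and the event to watch are placeholders)

inotifywait -m -e close_write ~/weight.txt | while read -r line; do
    scp ~/weight.txt myserver:/var/weight/weight.txt
done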

Remote server:
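(again a sketch with made-up paths)

inotifywait -m -e close_write /var/weight/weight.txt | while read -r line; do
    gnuplot /var/weight/weight.gnuplot          # renders weight.png from the log
    cp /var/weight/weight.png /var/www/weight.png
done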

These daemons are currently deployed on my local machine and on server.

Version without monitoring

Due to a misunderstanding of how things work, I encountered strange failures when I used the -m option of inotifywait. That's why I previously used the following daemons:

Local PC:
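(a rough reconstruction, with the same placeholder paths as above)

while true; do
    inotifywait -e close_write ~/weight.txt
    wait                                        # let the previous upload finish first
    ( scp ~/weight.txt myserver:/var/weight/weight.txt ) &
done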

Remote server:
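(also a rough reconstruction)

while true; do
    inotifywait -e close_write /var/weight/weight.txt
    wait                                        # let the previous plotting finish first
    ( gnuplot /var/weight/weight.gnuplot
      cp /var/weight/weight.png /var/www/weight.png ) &
done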

The calls to inotifywait here contain no -m option, which means that they exit after the event watched for occurs.

There are several wait operators and subshell spawns that may seem excessive at the first glance. However, they're there to avoid losing data during the race condition.

Assume there's no waits and no subshells, just inotifywait and copy/plotting commands. Then the following scenario is possible:

  1. I enter new value and save file
  2. The file is copied to remote server
  3. Remote server notices that file's changed and starts plotting it
  4. I notice a mistake I made in data, fix it and save the log again
  5. The file is copied to remote server
  6. Remote server finished plotting previous log with mistake
  7. Remote server starts waiting for file modification
  8. Changes made in p.4 are ignored!

If the file is copied to the remote server while the server is still plotting the previous version (and plotting/converting takes noticeable time), the subsequent inotifywait call won't notice any changes, because when the file was modified, inotifywait was not running!

We solve it by doing conversion work asynchronously. Then, step 7 doesn't necessarily follow step 6. Instead, it's executed (hopefully) right after step 3.

If we spawn subprocesses instead of running the whole conversion inline, the window during which we can miss modification events shrinks dramatically and no longer depends on the length of the processing.

On the other hand, we can miss modifications when we're waiting for child processes to terminate. But even if we miss modification events, when we finish waiting, we will handle the modified file anyway. So that's acceptable. In general, it's also of no harm to do a, say, sleep 10 after inotifywait: another data save is likely to happen after the first one.

Conclusion

Finally, we've created a synchronized system of two daemons for the local and remote machines, based on the inotifywait program from the inotify-tools package. And now I have a public tracker of my own weight, deployed within an hour. I still wonder, though: wouldn't it have been better to spend this time jogging?..



Easy parallelization with Bash in Linux, part 2

Easy parallelization with Bash in Linux

Two articles that show how concurrency can be utilized for bash scripting.
  • Part 1 - pipes and xargs thread pool
  • Part 2 - process substitution and flocks

In the previous article, I told a bit about the xargs command and piping in Bash. But it turns out that I missed a lot of important information... So, to complete the topic, I just have to post another article. The second thing I'll talk about is flocks, and the first thing, which I don't understand how I managed to miss, is...

Subshells and process substitution

Bash has a convenient syntax for spawning subshells, implemented in the Bash language itself. They are written inline in the script, unlike the processes spawned by xargs, and the interaction between them is tighter.

Each subshell acts like a small shell invocation. It has its own set of file descriptors and environment variables; they are inherited from invoking shell, and can be overridden, but no change affects those of the parent shell.

Bash also has functionality called process substitution. It spawns a subshell and is substituted, in the parent process's command line, with the name of a FIFO (for example, "/dev/fd/63"). You must connect one of the file descriptors of the invoking shell to one end of that FIFO at once, since the FIFO is invalidated when the subprocess ends. Here's a small example.

Assume that two kinds of lines come to the input, and you need to process them with two different processes. This can be achieved by spawning a subprocess that separates the input and feeds it to two other subprocesses via different FIFOs. No wonder all three processes run concurrently. But the cool thing is how surprisingly few characters it takes to implement.
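Something like this (a sketch of the same shape as the original split.sh, not its exact text):

#!/bin/bash
# the two "processors": one handles a-lines, the other handles b-lines
process_a() { while read -r l; do echo "${l}yyy"; done; }
process_b() { while read -r l; do echo "${l}zzz"; done; }

line="unharmed"

# the splitter is a subshell; it distributes its input between two more
# subshells created by process substitution
( while read -r line; do                  # this $line lives in the subshell only
      case "$line" in
          a*) echo "$line" >&3 ;;
          b*) echo "$line" >&4 ;;
      esac
  done 3> >(process_a) 4> >(process_b)
) < <( for i in $(seq 6); do echo aaa; echo bbb; done )

echo "The \$line is unharmed : $line"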

You'll see something like that:

pavel@lonely ~/social/blog/bash $ bash split.sh
aaayyy
bbbzzz
aaayyy
aaayyy
bbbzzz
aaayyy
bbbzzz
aaayyy
aaayyy
bbbzzz
bbbzzz
bbbzzz
The $line is unharmed : unharmed

Yes, it's true that when the shell script finishes execution, its subprocesses may still be working. Unfortunately, I don't know a less "dirty" way to synchronize with the subshells. The good news is that you seldom need to wait in such cases (this one is just an artificial example where stdout is shared), and that if you only use process substitution for supplying input, you won't need to wait at all.

Mutual exclusion via flocks

So far we've only seen pipe-like synchronization: a process may wait until data appears at the other end of a pipe. There is also another kind of synchronization, where a process waits until a shared lock is released: mutual exclusion. To implement it, you can use the flock functionality available on Linux systems.

The flock call takes a file descriptor (of an open file) as input and waits until the calling process becomes the only one holding a lock on it among all the processes that flocked it. When the file is closed, the lock is released, and the next process waiting in flock gets control. It sounds like a "named mutex", and indeed it is one.

Here's how I used it in a job server to prevent concurrent scripts from installing to the same directory simultaneously.
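The essence of it, with made-up paths (this is the canonical pattern from the flock(1) man page):

(
    flock 200            # wait until we're the only one installing into this directory
    make install         # if another script has already installed, make has nothing to do
) 200>"$INSTALL_DIR/.install.lock"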

If two concurrent scripts wanted to install into the same folder, one of them waited for the other to perform the installation, and then quickly verified, by a successful make run that compiled nothing, that the software was already installed.

You can play with this one:
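Not the original flock.sh, but a script of roughly the same shape:

#!/bin/bash
install() {
    (
        flock 8                         # installations of the same package are serialized
        sleep 1                         # pretend the installation takes a while
        echo "Installed $1"
    ) 8>"/tmp/flock-demo-$1.lock"
}

for p in b c a a a a a a; do
    install "$p" &
done
wait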

The output shows that "installations" of as become blocked until sleep finishes and hence the flocked file is closed:

pavel@lonely ~/social/blog/bash $ bash flock.sh
Installed b
Installed c
Installed a
Installed a
Installed a
Installed a
Installed a
Installed a

However, flock doesn't block ordinary read/write operations on the locked file; it only blocks other flocks. Yes, the file must be opened to be flocked (that's why we needed those numbered file descriptors), but don't let that confuse you: no read or write on the flocked file blocks if the reader/writer just doesn't use flock itself. It's a named mutex, not a protection feature!

Yet another further studies

While I was writing this post, I checked the Bash man page and suddenly bumped into the coproc keyword. I hadn't seen it before because it was introduced in version 4.0, which I only recently upgraded to. Here's an article about it that I googled.

Yet another conclusion

That's most likely all I know about the concurrency available via Bash primitives in Linux (it could have been one article if I hadn't forgotten about process substitution :-) ). I hope it'll help you write better programs and value the powers of shell scripting a bit more.



Easy parallelization with Bash in Linux

Easy parallelization with Bash in Linux

Two articles that show how concurrency can be utilized for bash scripting.
  • Part 1 - pipes and xargs thread pool
  • Part 2 - process substitution and flocks

I love concurrency. The Idea of several mindless zombies collaborating to provide a useful result thrills me and I'm always so self-satisfied when I see that these are my boys.

Writing a concurrent program is considered complex. But fairly simple concurrency tasks can be accomplished in a casual Linux shell Bash with just a small effort (about Bash: wiki, manual).

Basic piping

The famous pipe allows one program to supply its output as the input of another one.
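For instance, here's a slow producer and a consumer running at the same time:

( for i in $(seq 5); do
      echo "tick $i"
      sleep 1
  done ) | while read -r msg; do echo "got: $msg"; done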

The thing is that these programs are run concurrently and the reading one automatically pauses if the feeding program is still computing next line.

Of course, there are some issues with whether such a pipeline will indeed be scattered to different cores, but on modern kernels (starting somewhere between 2.6.28 and 2.6.30) you'll be okay without special measures or hacks.

Note that you can also pipe whole compound commands, such as loops, because the pipe has a low parsing priority:
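For example, the output of a whole loop can feed a single consumer:

for i in $(seq 1 3); do
    echo "line $i"
done | wc -l            # the loop's entire output goes through one pipe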

Thread pool with xargs

What is xargs?

xargs (pronounced as 'ex-args') is a utility that passes a large, externally computed number of arguments to a command. For example,
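assuming a few .tmp files are lying around,

find . -name '*.tmp' | xargs rm
# and the same thing, one rm per file:
find . -name '*.tmp' | xargs -L 1 rm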

Here xargs reads the file names, builds up a single rm command with all of them as arguments, and executes it. Or it can build several commands: with -L 1, as in the second pipeline of the sketch, each input line results in a separate rm invocation, which is equivalent to running rm once per file.

Learn more from wiki or man page.

There is also a nearly unknown feature of the xargs utility. It... is capable of acting as a thread pool! (Well, a process pool, actually.) Given a lot of arguments, it can maintain a fixed number of worker processes, starting a new one with the next argument from its standard input whenever one of the running processes finishes.

Usually, xargs maintains one process, and only when this process finishes execution, will xargs run the next one (see second example at side note). But this can be extended from "one process" to "one of N processes" with --max-procs=N argument. Try it:
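For instance (a sketch; the sleep is only there to make the concurrency visible):

# one process at a time: the numbers come out in their natural order
seq 1 10 | xargs -I {} bash -c 'sleep 1; echo {}'
# five at a time: they come out in bursts of five, slightly reordered
seq 1 10 | xargs -I {} --max-procs=5 bash -c 'sleep 1; echo {}'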

What is seq?

seq is a utility that just prints a series of consecutive numbers. It's a really useful command, especially when you need to feed xargs a specific number of lines and don't really care about their content (as in our examples).

The simplest usage would be
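something like

seq 15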

which just yields 15 numbers starting from one (not from zero!), each on a new line. I edited this post to use it instead of loops like
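this one, for example:

i=1
while [ $i -le 15 ]; do
    echo $i
    i=$((i+1))
done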

A more advanced example, with an explicit start and step, say,
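seq 1 3 15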

yields
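1
4
7
10
13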

Unfortunately, this command is not in POSIX, but it is included in LSB 4.0 and is present on all major distributions on the x86 arch (proof link).

Without --max-procs the numbers appear in natural order, while with this option they're a bit mingled. That's the sign of concurrency!

Sample usage: test how a web server survives under load
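Something like this, with a placeholder URL:

seq 1000 | xargs -I NONE --max-procs=100 wget -q -O /dev/null http://example.com/some/page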

(-I NONE means that the numbers coming to our input would be substituted in the place indicated by the string NONE. But since there's no such string in the command, we just discard these numbers, while still running the command 1000 times. It also implies -L 1, meaning that each input line triggers a separate invocation.)

Arbitrary commands can be managed this way. For simplicity, let's assume that your commands do not contain quotes or special characters (see the next section for this). Consider you have a file with simple bash commands:
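say, a file commands.txt with contents along these lines:

echo Hello
sleep 2
echo World
ls /tmp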

Now let's execute the commands:
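for instance like this (CMD is just a placeholder string; the file name is the one assumed above):

xargs -I CMD --max-procs=2 bash -c CMD < commands.txt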

(Here we used the -I option as intended: to specify a placeholder that will be substituted with the line read from the input file.)

You'll see that they will run concurrently.

Here is an example of a script you'd have to write unless you knew about this feature of xargs.
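Roughly such a script (a sketch, not battle-tested):

#!/bin/bash
# run the commands from commands.txt, at most 4 at a time
MAX_JOBS=4
while read -r cmd; do
    while [ "$(jobs -r | wc -l)" -ge "$MAX_JOBS" ]; do
        sleep 0.1                       # wait until a slot frees up
    done
    bash -c "$cmd" </dev/null &         # don't let the command steal our stdin
done < commands.txt
wait                                    # let the last jobs finish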

Automatically escaping incoming commands for xargs

In the example above, you had to manually make sure the commands would survive xargs unharmed. A command that uses quotes would have them eaten by xargs's own argument processing, and an attempt to execute it would result in a syntax error or in incorrect behavior (for instance, an echo that was supposed to print two words on two separate lines could end up printing a literal "n" between the words instead).

GNU parallel seems like a better alternative to xargs. This tool was mentioned in a reddit comment on a link to this article that someone posted to /r/compsci (why there?).

Although parallel is more powerful, it is less easy to use. Besides, it is not available by default on as many distributions as xargs; for instance, you have to install additional packages on the Fedora 15 mentioned in the comment.

If your shell is Bash, you still don't need any external utilities such as GNU parallel. Instead, you may use the %q format sequence of Bash's printf builtin. It prints an escaped version of its string argument, so the quotes in our previous example would survive xargs's argument processing.

This may be used to run arbitrary commands stored in a file, one per line:

First, we should convert them to their escaped versions, and then feed them to xargs. The former is done with Bash's native line-wise reading from a file and the printf mentioned above:
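A sketch (assuming, as above, one command per line in commands.txt and no embedded newlines):

while read -r cmd; do
    printf '%q\n' "$cmd"       # re-escape the command so that xargs's own unquoting restores it
done < commands.txt |
xargs --max-procs=4 -n 1 bash -c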

Concurrency when jobs depend on each other

The good ol' make tool suits this task well.

When called as make -j 5, make works as a thread (process) pool, but takes dependencies into account: for example, it won't spawn a process for a target until all of that target's prerequisites are completed. Here's an introductory article I quickly found on Google.
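A toy illustration (target names are arbitrary; recipe lines must be indented with tabs): with make -j 5, a and b are built concurrently, and c waits for both.

all: c

a:
	sleep 1; touch a

b:
	sleep 1; touch b

c: a b
	cat a b > c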

Further studies

This post doesn't cover run-time race conditions at all (all the "race" logic here is in the number of concurrent processes or, in make's case, in the dependencies). For that, you can use another Linux utility, flock; here's a couple of examples of how. Maybe I'll write more about it.

But anyway, it's obvious that the Linux shell has powerful concurrency capabilities. I hope my post will make you use more of them in your daily work.


