Porting Perl's features to Ruby

Diamond. A Girl's Best Friend. Ruby. A Programmer's Best Friend.

An official motto of the Ruby language, with its joke explained.

As I said in my previous post, I now use Ruby as my primary scripting language. The main reason is that Ruby is superior to Perl in both syntax and expressive power. Being object-oriented, Ruby allows a lot of things to be expressed more concisely, and exception handling relieves the developer of the burden of checking the return code of every damned operation.

However, there are some features of Perl that don't map to Ruby directly. Here are a couple of them.

Boolean String Interpretation

The first is "conversion to boolean". In Perl a lot of things convert to false in boolean context: an empty string, a zero, an undef (undefined value).

(As an aside: Perl 6, and recent releases of Perl 5, have a "//" operator, "lhs // rhs" being equivalent to "if lhs is undef, then rhs, otherwise lhs". It makes the language slightly more expressive but, unfortunately, doesn't bring us closer to our aim: making Ruby as convenient as Perl.)

This liberal conversion to false allows the following. If we want to assign a default value to a string parameter, we just use the || operator:

However, in Ruby, only nil (undefined value) and boolean false convert to false. So similar code that uses Ruby's || operator would only work if str is nil. It wouldn't work if str were empty, and in many situations that would be very useful. Consider retrieving a string that, when not specified, should default to some value. If this string comes from a user's input in a form and is left unspecified, it will most likely be an empty string. If it originates from a database, it will be nil. How do we write code that sets the default value as concisely as the Perl version?
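To make the difference concrete, here is a small sketch. The helper name with_default and the sample values are made up for illustration.

```ruby
# Hypothetical helper: falls back to "default" using Ruby's built-in ||.
def with_default(value)
  value || "default"
end

p with_default(nil)    # prints "default": nil is falsy
p with_default(false)  # prints "default": false is falsy
p with_default("")     # prints "": an empty string is truthy in Ruby
p with_default("abc")  # prints "abc"
```

The third line is the troublesome one: the empty string coming from a form survives, and the default is never applied.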

The solution is to define our own operator in a class. We will introduce a new operator for strings that works like Perl's ||. Note that we can't redefine Ruby's || itself: it is built into the language rather than being a method, and even if we could, third-party code relies on its standard behavior, and we shouldn't interfere with that.

One of Ruby's features, which originates from Smalltalk, is that we can redefine methods of objects and classes on the fly. This seems unusual if you come from C++ or Java, the most common object-oriented languages, but in Ruby a lot of things are implemented this way.

This feature is sometimes referred to as "monkey patching".

Let's call the new operator or, and let's redefine it for strings:
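A minimal sketch of such a method might look like the following. The body is my assumption about the intended semantics (return the string itself unless it is empty), and the parameter name fallback is made up.

```ruby
class String
  # Return the receiver unless it is empty; otherwise return the fallback.
  def or(fallback)
    empty? ? fallback : self
  end
end

p "aaa".or("bbb")  # prints "aaa": a non-empty string wins
p "".or("aaa")     # prints "aaa": an empty string yields the fallback
```

Although or is also a Ruby keyword, it can be used as a method name as long as it is called with an explicit receiver, as it is here.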

This works well if the left-hand side is a string. But what if it's nil? Our aim was to get proper behavior for expressions like a.or("b"), where we don't know whether a is defined. But we can't redefine how nil works... can we?

In Ruby the nil keyword represents an "undefined value". Technically, however, it's just a... singleton object of class NilClass. So we can open this class the same way we did for String! In Ruby we can add methods to nil (undefined) objects! Here's the complete code for a sample program:
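A sketch of what such a program could look like, under the same assumption about the semantics of or. The sample values and the settings hash are hypothetical.

```ruby
class String
  # A non-empty string stands for itself; an empty one yields the fallback.
  def or(fallback)
    empty? ? fallback : self
  end
end

class NilClass
  # nil always yields the fallback.
  def or(fallback)
    fallback
  end
end

settings = {}  # hypothetical settings hash with no :name key

puts "aaa".or("bbb")           # non-empty string
puts "".or("aaa")              # empty string
puts nil.or("aaa")             # nil literal
puts settings[:name].or("aaa") # missing hash key evaluates to nil
```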

Check it out: it prints four "aaa" lines, just as planned. Such a convenient feature in exchange for just six lines in the utils.rb file of your project (you do have a file with your favorite code snippets, don't you?).

Interpreting lists in function call arguments

Perl has a creepy way of handling arrays of values. It uses tricky terminology to distinguish "arrays" from "lists", although they look like the same thing. "List context" and "scalar context" add more complexity, and "references to lists" finish off the programmer's brain.

However, there are a lot of places where this nonuniformity turns into a feature. Here's one of them.

A regular Perl function (unless special measures are taken) takes arguments as a single list. If a function is said to "take two arguments", it takes a list instead, and interprets its first two elements as "arguments".

This looks ugly at the function's site, but it's beautiful at the caller's site. Let's write a sample program that, say, finds all "backup" files in a given list of directories by calling one of the Linux shell tools, find. Here's how it would look in Perl:

We pass a list of arguments to system, splicing the script's command-line arguments, @ARGV, into its middle. A flat list with the folders at the beginning will be passed to the system function, not a nested list! In Ruby, however, it's the nested list that gets passed:
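The effect can be reproduced with a small sketch. The helper show_args and the directory names are made up, and echo stands in for find so that the sketch is harmless to run.

```ruby
# Hypothetical stand-in for ARGV so the sketch runs anywhere.
DIRS = ["/tmp", "/var/tmp"]

# Hypothetical helper that simply reports what it received.
def show_args(*args)
  args
end

# The array arrives as one nested element, not as two separate strings.
p show_args("find", DIRS, "-name", "*~")
# prints ["find", ["/tmp", "/var/tmp"], "-name", "*~"]

# Kernel#system expects strings after the command name,
# so the nested array is rejected with a TypeError.
begin
  system("echo", DIRS, "done")
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```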

Ruby is stricter about its arguments, which I find more convenient than Perl's approach, but in this particular case it makes us give up concise and transparent code. Ruby does have a way to accept an arbitrary number of arguments, similar to C's "ellipsis":
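As an illustration, a variable-length parameter is declared with a leading asterisk. The function name describe is made up for this sketch.

```ruby
# Hypothetical function: one required argument, then any number of extras.
def describe(first, *rest)
  "first=#{first.inspect}, rest=#{rest.inspect}"
end

puts describe(1)          # prints first=1, rest=[]
puts describe(1, 2, 3)    # prints first=1, rest=[2, 3]
puts describe(1, [2, 3])  # prints first=1, rest=[[2, 3]]
```

The last call shows the catch: an array passed as an argument stays a single nested element of rest.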

The arguments beyond a certain point are collected into an array, which the function can then examine. However, that's not much help here: Kernel.system already accepts an arbitrary number of arguments, and the call with a nested array still doesn't work.

What Ruby lacks is implicit interpretation of function call arguments as one flat list at the caller's site. But since we can't change that at the caller's site, let's change it at the callee's! Here's what our wrapper around the system function may look like:
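A minimal sketch of such a wrapper, assuming full flattening of the arguments is what we want. The name flat_system is made up, and echo is used in place of find so that running the sketch is harmless.

```ruby
# Hypothetical wrapper: flatten nested arrays, then call Kernel#system.
def flat_system(*args)
  flat = args.flatten  # collapse every level of nesting into one list
  p flat               # show what is about to be executed
  system(*flat)        # pass the flat list as separate arguments
end

dirs = ["/tmp", "/var/tmp"]  # hypothetical stand-in for ARGV
flat_system("echo", "searching", dirs, "for backups")
```

In Ruby 1.9 and later, an explicit splat at the call site, as in system("find", *ARGV, "-name", "*~"), is another way to hand over a flat list.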

This will work, and you'll see an array of arguments printed on your screen before the command executes.

But that's not all. What about nested lists? Perl doesn't have nested lists; it flattens them all at construction. In Perl you could, of course, create nested data structures with "references", but in Ruby, where everything is passed by reference anyway, that's not an option.

For example, if you call a similar wrapper with arguments like these:
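(The call below is a made-up illustration: the wrapper name flat_args and the values are my assumptions, chosen only to show the effect.)

```ruby
# Hypothetical wrapper that flattens its arguments completely.
def flat_args(*args)
  args.flatten
end

# The fourth argument wraps a three-element list we would like to keep whole.
result = flat_args(1, [2, 3], 4, [["nesting", "is", "important"]], 5)
p result       # prints [1, 2, 3, 4, "nesting", "is", "important", 5]
p result.size  # prints 8
```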

it will flatten them down to a list of eight arguments, which may not be what we want. Instead, we might expect the fourth argument to be a three-element list, and Array#flatten doesn't keep it. The solution is to use another function (I called it soften), which we can create with the same monkey-patching technique. Here's the code of a sample solution:
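One possible reading of soften, sketched below, is "remove exactly one level of nesting", so that an array wrapped in an extra pair of brackets survives as a single element. This is my assumption, not necessarily how the original was written. The sample inputs are also invented, chosen so that the printed lines match the output quoted below.

```ruby
class Array
  # Hypothetical soften: strip one level of nesting and keep deeper levels.
  def soften
    flatten(1)
  end
end

p([1, 2].soften)
p([1, [2, 3], [4, 5]].soften)
p([1, 2, [3, 4, 5], [4, 5, 6], 7, 8].soften)
p(["nohup", ["ls", "-l"]].soften)
p([1, [2, 3], 4, [["nesting", "is", "important"]], 5, 4, [[5]], 6, [7, 8], 9].soften)
```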

Here's the script's output, which demonstrates that it works as expected.

[1, 2]
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 4, 5, 6, 7, 8]
["nohup", "ls", "-l"]
[1, 2, 3, 4, ["nesting", "is", "important"], 5, 4, [5], 6, 7, 8, 9]

***

I prefer Ruby to Perl because it's stricter. Its strictness makes Ruby less tolerant of some Perl-ish "tricks", though. On the other hand, the flexibility of Ruby allows us to mimic these features without even modifying the interpreter.

This leads me to suggest that "strict but flexible" is preferable to "more free, but less flexible". Ungrounded generalizations aside, the Ruby tricks listed above at least make the language more convenient to use, especially if you are used to being an undisciplined Perl programmer.
