Searching for the declarative language
Contents
I'm really tired of telling these dumb machines every instruction they should perform! I know that they only pretend to be dumb, and they can act clever if they're taught to.
After I implemented yet another graph traversing algorithm, after I wrote "if it doesn't exists, create, and only then proceed" for the million times, after I had troubles with parallelization of totally independent tasks just because I didn't write the program to be parallel, I felt that I'm done. Most programs I wrote had a similar scheme: loosely connected computations at different nodes with some dependencies between the results of them.
But over and over again I had to implement the exact sequence, while it really either doesn't matter, or is easily computable. I started looking for a language that takes this burden off me, for the language that allows me to declare my intent and build the sequence of actions on its own.
I discovered that there's a programming paradigm devoted explicitly to this: declare intent rather than steps. And, not least, I should be able to use it at work, to write production code with it! However, the languages I knew back then didn't support this paradigm. So I started looking, both for new languages and inside the things I already knew to spot declarativeness in them.
I was disappointed. Nearly nothing out there was really what I want. Here's what I've discovered so far.
Markov algorithms
What is Markov algorithm?
Markov algorithm is one of the basic forms of algorithm notations. It expresses a computations as a series of string rewrites. Given an input—a string—execution of Markov algorithm is a prioritized substitution of one substring with another, the substitutions being prespecified and immutable; the list of substitutions forms an algorithm. Here's a program that converts binary to unary:
|0 → 0|| 1 → 0| 0 →
You can see why this works on the Wikipedia page.
You can simulate a Turing machine with Markov algorithm. Just draw a "caret" on the string and express the algorithm as the caret movements. Inside the caret you may store and modify the "state" of a Turing machine. Here's how a Turing machine's rule "if a caret reads A and is in the fourth state, write B, change state to 10th and move right" would look like:
|4|A → B|10|
That is, Markov algorithms form a Turing-complete language.
This is a declarative language, about which almost everyone who studied computer science knows. (Well, at least, in Russia; due to national origin of its inventor it may be more widespread there). However, nearly no one could name it when asked about declarative languages. Sad that people don't establish connections between what they learn at the university and what they call "real life"...
Back to the topic, Markov algorithm is an ordered set of statements A → B, each of which really means "there can't be sequence A in the resultant string! But if there is, it should be replaced with B"
It's nevertheless questionable if Markov algorithm is declarative. Sure, it doesn't tell exactly how the computation should occur. Of course, the exact calculations can be easily inferred, but it is true for any declarative language.
The problem with it is that Markov algorithm is a mathematical abstraction. Although, following the notation of Markov algorithms, some useful languages were designed (one of them is the Refal language), they still resemble mathematical abstractions, and I can't see how I could use them in production.
Make
What I could—and do—use in production is Make, the language of Linux makefiles. It doesn't have a specific name, actually, so I'll call it "Make" here.
Basically, a structure of a Make program is a list of "rules". Each rule specifies a target (or a target pattern, like %.cpp), and a list of prerequisites. A rule means that in order to accomplish the target, you must
- get done with prerequisites
- execute commands specified within the rule.
The commands usually take prerequisites as input, but they don't have to. This creates an oriented graph of dependencies, which is walked until you reach the target. Sounds like a cool concept, which should have been backed up with some kind of mathematical theory and notation!..
In reality, it's not. Make evolved from (and, perhaps, still remains) a build system. And here go some gory details. The targets and prerequisites denote files in the filesystem, and the timestamp of these files (i.e. the time they were last modified) is used as a measure of whether you're "done" with prerequisites.
The rules are implemented in one of the shell languages you choose. Technically, they could be written in any language, but shell scripts are chosen because they're tied to work with files as input—and files are the primary objects Make works with. This fits building application perfectly, but you can go over this domain if you treat file system as just a key-value database.
However, the biggest limitation of Make is that it can't modify the dependency graph after the execution is started. There are techniques to overcome this restriction, but they're not generic, and are too tricky and fragile to be used in production.
Another thing is that Make is just...too old. Programming has changed a little since 1978, you know... Make is not flexible, has problems with debugging, and doesn't evolve much: its computational model is already exhausted.
Whenever
An esoteric language I was introduced to in a programmers.SE question is Whenever. Basically, a Whenever program consists of an unordered list of clauses. A clause can contain an operator. An operator is executed if the clause that contains it is in the to-do list, and if some conditions apply. Operators may add and/or remove clauses from the to-do list, or print text (and we know that printing text is no different from any useful work, actually).
Memory cells are implemented as possibility to have a clause several times on the to-do list and as expression that returns this number. The conditions mentioned above can refer to this expression.
Fairness is a property of a nondeterministic system, which can repeatedly and infinitely face a multiple choice, to not allow behavior when a certain item is never chosen past any moment of time. I wrote a big article about fairness, check it out for more details.
The language is called "Whenever" because its main feature is that it's not clear when a certain clause could be executed! Each clause in to-do list is considered for execution with uniform probability. (I think the equality of probability could be safely replaced with enforcing fairness). Sadly, the author was so excited with the concept of such "absence of urgency", that he overlooked the declarative nature of the language.
Here's a sample program, which is just a "while" loop. It should print A ten times:
1 defer (2) 3; 2 print ("A"); 3 3; 4 defer (1 || N(3)<=10) -3#N(3),-5; 5 again (5) defer (1 || N(3)>10) 2,1;
Play with the language yourself! You can read more about the available commands, and download Java (sic!) interpreter at the Whenever language homepage. It seems that David Morgan-Mar is its author.
Whenever language's key difference from Make is that it has control over its own dependency graph. It can impose and drop dependencies as the clauses are executed. However, the weakness of this language is its esoteric nature coupled with absence of a developed mathematical model (it resembles several concepts at once, but not any particular one).
Still not what I want
So, my results of searching a declarative language, which could fit into production, are still not great. The languages I observed here are either too conceptual (Markov algorithms and Whenever), or too limited (Refal, Make). I believe this gap will be getting closer over time. Or, perhaps, I'm just searching in the wrong place? Maybe such languages as Haskell and Prolog can already do what I want, but I'm just not aware of it? And do I really know what I want?..Well, this only means that there'll be another blog post when I get to it!
Comments imported from the old website
Hello, fellow programmer!
Thanks for your comment, I learned some very interesting things from it. So, here we go.
The search has stopped without succeeding for what I believe are good reasons. But let me start from the beginning.
First, the questions you ask are straight to the point. "Modifying dependencies" was a plea to have this limitation of Make removed when results of previous computations will not affect what rules will be applied afterwards. The way to combat it is to write an overcomplicated makefile that generates another makefile or calls itself recursively, and other questions are perfectly valid.
I simply wanted to find a language/paradigm/something that addressed them--and many others--for me :-)
I skimmed Wikipedia pages of dataflow languages at the time I wrote this post; I remember I reasoned that they are completely different from what I wanted to achieve, but details have been forgotten.
At the time of writing, I also didn't know about Prolog, and in your comment I first encountered Linda! This looks very promising, especially if I consider more interesting matching implementations. I hope I'll find time to play with these things.
Now to the reason why I deliberately stopped searching. As I wrote more practical programs, I realized that the model I had in mind is rarely used--at least, in the work I've been doing daily.
Persistence of results of intermediate computations is rare, and for a good reason, because most of them are rarely reused, and should perish quickly. Where persistence makes sense, it can be solved with caching just fine.
Coordination usually constitutes very small part of a product, and re-use of readily available "imperative" coordination systems (message-passing and shared memory of all flavors) is simply good enough. Improving a small part with great effort is a violation of Amdahl's Law.
We still will need to solve translation of outputs of one component into inputs of another. This is a problem that separate language, as opposed to bindings, solves, but this is even less practical.
So this search transformed from practical problem to theoretical issue hence lost my focus. While small, the coordination part can be made seamless, but I don't have enough vision to formulate the requirements at this point; I can only keep dreaming.
How's the search two+ years later?
You're right, you're not really clear what you want, but I love this area.
Here are some things you don't specify:
Pattern match?
How do you want to identify the things you want done? Can you produce some task description object and give it a simple ID? Is the description something you could hash? Does it need to include wildcards? Is it a predicate calculus expression with unification and maybe even ands and ors of subexpressions?
Persistent results?
Are the tasks going to produce data that can be reused by multiple further tasks, or are they "eaten" by the task that asked for them?
Modifying dependencies???
Are you sure you know what declarative means? :-) If so, how can the dependencies of a task change? In particular, how can they change without invalidating what's already been done with the graph? Are you just talking about something simple, like an OR, where finishing one task makes others unnecessary (at least for that OR)? Or a piece of code that, once it gets run, may change its mind about what subtasks it needs done?
Concurrency
How much control over allocation of tasks to processors? Do you want the control and coordination to be distributed or centralized?
Those are just points that struck me as important to nail down.
One idea is just to set up a module to make dynamic programming (or caching) easier in whatever language. Then the language for goals is just to call the function that computes the result you want. I know there's a simple module like that in the Python Package Index. I don't mean that caching is all you need, but that using a function interface to each task-type lets you start with a single-threaded program and incrementally add caching, threading, fine-grained scheduling, etc..
The Linda "coordination language" is a language with requests and triggers based on pattern-matches to assertions in a database. Very clean, and it interfaces to multiple languages. In contrast you could look at the CONNIVER manual for more complex and hairy ideas.
There seem to be many dataflow languages. Looks like some research would be needed to find out which ones might be useful to you.
There is a make system called SCons that is written in Python and is extensible to new target types. Its way of expressing dependencies and actions is somewhat different from make's.
You can steal my tiny, Python, single-threaded clone of make, myke. It's a little messy but it's so small you can turn it into your kind of mess.
There is Flat Concurrent Prolog. What's "flat" about it is that it's designed to give you control over the parallelism, within the Prolog paradigm, rather than spawning an arbitrary tree of processes on an ordinary Prolog program. Here's one free implementation.
Here is a link to some old discussions of FCP vs. Linda vs. the E language, a distributed object-oriented language.