Recent blog updates

Shown here are only posts related to me. You can view all posts here.

Why I No Longer Work On Weekends

I remember the picture of my typical day two years ago. As a sunny Russian winter morning finally banishes the dull winter night, I say "bye" to my girlfriend and leave for the historical part of Moscow. Where I'm heading, in a three-story old cottage that predates even the communist government, lies my beloved office. I'm alone there. It is deserted, but not because of lay-offs, or a party I'm not invited to, or because it's a one-man startup. It is just Sunday.

I log in to my workstation and start doing it. Checking experiment results. Debugging new experiment frameworks. Just thinking, trying to push the limits of a suboptimal algorithm beyond what humanity knows so far. Being productive, though less than usual: the presence of peers doing something pressures you into checking reddit less frequently.

That was my typical Sunday afternoon two years ago. Today is Sunday too, and I'm writing this very sentence riding a ferry across the Bay rather than sitting in an empty office. To answer why, I first need to understand why I went to the office on Sundays in the first place.

Integrity

Of course you've realized that the Sunday office was my shelter from my girlfriend, because it would be impossible to work at home in her presence. Keeping away from distractions is one reason to prefer going to the office to do work, not just on Sundays, but on any other day. Other reasons exist, too.

One of them is psychological "anchoring". The office is associated in your mind with doing work, so your mere presence there, in formal clothing and during working hours, sets the right mood and makes you more productive than relaxing in a comfy chair wearing slippers and a robe. The reverse is useful too: keeping work out of your home as much as possible makes your relaxation there more fulfilling.

Sadly, I couldn't find any reference to "anchoring" in a more reputable context than, say, a site about fitness or life coaching, which could have given it a more "scientific" foundation.

Some people take keeping everything out of their home to the extreme. For instance, one successful finance consultant doesn't even have a fridge, let alone a PC, at home, because cooking and dining do not belong there any more than working does.

Purely practical reasons kick in, too. At work, you have a fast intranet with fast access to your cluster. It's somewhat amusing that at every place I've worked as a programmer there was a cluster to which a large part of the computation was offloaded. First it was a cluster that ran experiments, then a distributed software compilation and packaging system, and then the product itself was a distributed system that ran users' code. You definitely want to access this and other helpful corporate resources, such as repositories, with low latency.

Moreover, on your office machine you have a ready working environment, with editors open and consoles chdir-ed, so you can just dive in and instantly pick up where you left off. Finally, corporate policies could prevent you from working anywhere except the office for security reasons.

However, the reason why I no longer work on weekends is not that the office became a less appealing place. The list above misses one crucial component.

Alone in the Ivory Tower

After all, I was not a biologist, so I didn't need any help from a fellow researcher to stare at my test tube.

When I was a research intern, working on weekends was easy. I spent a lot of time coding new algorithms and experimenting with them, reading papers written by others, writing my own, or just thinking about stuff. I spent about two-thirds of my overall time on these activities. They all have one thing in common: I could do them alone.

While interacting with others is a very important part of a researcher's work, one I used to underestimate, you can work alone for good chunks of time and still be somewhat productive. No wonder I did. Besides, our project had only 4-5 people, and interacting with them just couldn't take up that much time.

Industry and a Larger Project

When I moved, as a Software Engineer, to what is usually called "Industry," I also tried to keep up the routine of working on weekends. I managed to sneak in on two Sundays, but that was it. With a much more "officy" office, and a still challenging project, it didn't work anymore.

This is a real picture from the Android patch workflow. Many workflows in a larger team look no less insane.

The work now demanded collaboration. The team was 5 times larger, and most of the work I was doing required checking with others frequently. Sometimes it was as ridiculous as holding three meetings over three weeks to agree on a feature or change that took 15 minutes to implement. More often, you were just stuck waiting on others' decisions, on others' feedback, and on others' work as programmers or sysadmins.

I could find side projects and less important things a programmer always has in his backlog that required no collaboration, but unimportant work was never worth a Sunday spent in the office.

Besides, working on weekends now meant making other people uncomfortable. In practice, it involved assigning a security guard to watch over me while I worked, which, per company policy, required applying for a separate work permit for each weekend. The application could be automated, but making another person waste his time was too much. I went twice, when the guard's presence was already justified by sysadmins and managers doing other work they couldn't postpone, and never bothered with it afterwards.

Let's Multiply By Ten

Although I now spent weekends just like normal people, I still stayed longer hours to get more work done. I still made security guards uncomfortable, because they had far fewer chances to close up early, but at least it was now legitimately my time.

Now let's multiply the size of the team, the volume of the codebase, and the complexity of the tasks by ten; that describes my shift to the next job. The amount of work that could be done without communication also shrank by the same factor.

Email became the primary work tool. Most work is triggered by notifications from others, and is then discussed with peers who are much more familiar with the codebase. When I'm stuck on something, I can spend hours in the late evening trying to debug the problem on my own, or I can ask peers who have worked on the project for a long time and get it resolved within minutes.

Not only did the office stop enjoying my visits on Sundays, I also started working late hours less. Communication with peers is no longer just red tape; it is the only way to be productive. And this communication is so much more efficient in the office than over e-mail, video conferencing, and phone calls that being in the office is inevitable. Studies have reported ridiculously large numbers for how much more information our brain receives in real-world communication than over a video call. And it's more fulfilling, too.

To increase my productivity I even started watching my sleep cycle, and I now try hard to move my typical shift from 12-9pm to the normal 9-6pm. It doesn't work out well. I try.

Christmas as a Way to Enjoy an Empty Office

This amount of communication sometimes becomes overwhelming. Mail starts coming in faster than you can read it. No wonder people have come to value the Empty Office, when distractions and stress decrease and you can just enjoy practicing your favorite art. Of special value is the US Christmas holiday, with its two adjacent days that others usually take unpaid leave on, so you can work with fewer interruptions.

Few come in on weekends to achieve the same effect, of course, because it is still not productive; but a legitimate excuse for a less populated office is of understandable value.

***

I still like my job, and am as passionate about it as when I was younger; I can still handle a 6-day work week. But on Sunday, I'd rather read something in the comfort of my home, go out with my friends, or explore some local nature. A hike usually ends with a ferry ride offering an awe-inspiring view of the San Francisco skyline, which I can't really enjoy because—let's face it—I open my chromebook and read or write something about programming. I just don't do it in the office anymore.



Time-Based Protests

Sample time-based Protest

(photo by Ivan Afanasiev). The largest "Strategy 31" time-based protest I attended (January 2011). A protester holds a sign with the protest logo.

A month ago I bumped into a protest manifestation in Moscow. It was one of a series of events the "Different Russia" movement has been organizing since 2009. The interesting thing about these events is that they happen on the 31st of any month that has enough days. This day was selected because Article 31 of the Russian Constitution declares the freedom of assembly, which the movement believed was violated. The Russian protest movement considered it innovative to hold a massive manifestation on a specific date regardless of whether it was approved. For software engineers, however, such a thing is not an epiphany of any kind. We call it "time-based releases," so I call this way of arranging protests "time-based protests".

Time-Based Releases

A time-based release is a pattern in software engineering that has taken hold of the open-source world's mind over the last five years. It means that the product team promises that each new release will be completed on a pre-established schedule, no matter what new functionality or bug fixes it ends up containing. Release schedules are then arranged to ensure that a complete release cycle, with stages such as feature proposal and selection, development, testing, and distribution, completes by a given date. Release cycles often span from several months to years.

Many open-source projects have switched from "when it's done" to time-based releases. The most notable examples are the Fedora and Ubuntu Linux distributions, which followed the GNOME project. Our ROSA Desktop Linux distribution does this as well. It's usually said that time-based releases help improve stability, maintain sustainable product delivery, and strengthen collaboration between developers, testers, and users.

The key point here, however, is that the time-based release model trades innovation for an established process. No matter how many new, potentially innovative features are in flight, we release at this point, period. In the aforementioned Ubuntu this has drawn criticism.

Sometimes I feel that just having release schedules is enough, and that this is why people use time-based releases as their planning paradigm in the first place. The thing is, you can't make a time-based release without a schedule. This makes me wonder whether some projects opt for the time-based release model just to finally force themselves to make release schedules.

Why It Should Work Then

The most important reason to choose time-based releases over "when it's done" is that this makes you think less and make fewer decisions. Why is this important?

There is a theory (I heard it at a conference, but failed to find any evidence for it at all, so let's consider it a fairy tale meant to teach us something useful :-) ) that the number of decisions a person can make per period of time is an expendable resource. Making each decision supposedly makes our body spend some specific amino acid; its supply replenishes with time when you rest or have a meal, but it never hurts to save some.

So, making decisions is hard, and making as few decisions as possible is a good thing rather than a bad one. Any open-source project has a lot to decide upon, and making one less decision is a small relief of sorts anyway. That's why agreeing on a release period lets the project save some strength for other things.

A funny story happened at the first meeting the protest committee managed to get approved by the police. The committee split into two groups, one of which, led by Limonov, tried to demonstrate in a place not approved by the police.

Their aim was to protest against collaborating with the police at all, the very necessity of such collaboration being what these manifestations were initially meant to protest against. At that meeting an interesting case of oppression happened: the police picked up the leaders of the "unapproved" protest group and forcibly moved them to the approved location, saying, "This is where you should protest [against us]."

What "Strategy 31" has become

(photo by RBC). This is a typical scene from a recent manifestation on the 31st: a lot of people wandering around with cameras, some police, and only a few protesters.

Do you want your software project to become like this?

...And Why It May Not

However, one should never forget the ultimate mission, and never let keeping the schedule become the cart put before the horse. I feel that fitting innovations into a time-based release schedule is much harder than fitting them into a release that has more freedom in picking its final date. It's not impossible, no, just harder. Mark Shuttleworth, the leader of Ubuntu, says much the same in his article on the matter.

I can't back these claims up with statistical data about any software projects. What I did see, however, is how the energy of these "time-based protests" decayed over time. Many political commentators observed that regularity became the only thing these protests could deliver. At first, that was enough, because the meetings were suppressed with an unusual degree of violence and with no lawful grounds. After the government relinquished the oppression directed at these particular meetings, and there were even two approved by the government, the participants realized that the assemblies had lost their nerve and no longer carried any message whatsoever.

This is an important lesson for software engineers as well, especially those involved in open-source projects, which, just like public assemblies, consist of volunteers. Establishing a regular schedule is important, but it is only the first step. If you commit to it too much, and sacrifice the nerve and the sense of the project to keep it, you will end up like the time-based protests "Different Russia" was arranging. At the manifestation a month ago, I saw dozens of police officers, a hundred loiterers who came to take a look at the gathering and photograph it on their iPads, and only 1 (one) actual protester.



BLAST at the Competition on Software Verification'12

At my previous job in a research laboratory, I was improving a static analysis tool named BLAST. The name reflects its origin and stands for "Berkeley Lazy Abstraction Software verification Tool."

By the way, my former peers have prepared a nice comparison of Static versus Dynamic analysis for the Linux Kernel.

The tool tackles a problem commonly described as unsolvable, the Reachability Problem. BLAST is capable of telling—sometimes—whether a certain line of code is reachable. It doesn't just run the program with some input data; rather, it reads the program's source code and tries to determine whether any input data exist that make the program reach a specific location, a specific line of code.

Of course, it can't actually solve the problem (nothing can). What it can do is prove, for a certain fraction of programs, that a line is never reachable. A much more useful application is finding bugs. For instance, if we want to check for null pointer dereferences, we could insert checks like this:
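(The original snippet isn't preserved in this copy of the post, so here is a minimal sketch of the idea; the names are made up. The point is that the bug becomes a question of whether the ERROR label is reachable.)

    /* Sketch: a null-pointer dereference recast as a reachability question.
     * If the analyzer proves the ERROR label unreachable, then no input can
     * make "p" NULL at the dereference below. */
    void process(int *p) {
        if (p == NULL) {
    ERROR:  ;               /* target location: is this line reachable? */
        }
        *p = 42;            /* the dereference we want to prove safe */
    }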

And it does find such bugs. At ISPRAS, a several-year effort within the Linux Driver Verification program has led to finding several dozen problems in Linux device drivers. I even committed fixes for some of these bugs to the kernel sources.

Tool Competitions

Well, I got distracted again. I've been planning a big blog post about this amazing technology for years, and I'll write one, but not now. Anyway, it was not me who devised how a program that checks the reachability of lines in other programs works (that was the two Cousots), and I only fixed and improved an already released open-source tool, BLAST. I did create some novel things, but they were quite minor to my taste (just a new zero-overhead structure aliasing algorithm).

So, my last project at the Institute for System Programming was to prepare the tool for the Competition on Software Verification held at TACAS'12, which, in turn, is part of ETAPS'12. There are a lot of domains where people write tools to solve the unsolvable (like the reachability problem) or the NP-hard (I wrote that you should try to solve these anyway). Such problems cannot be universally solved, but one tool may solve them "better" than another, and it's interesting and useful to know whose is the best.

A natural way to determine who is "better" is to measure the results of different tools on a common set of representative problems. This leads to specialized competitions among the tools that solve a specific problem. SMT-COMP for Satisfiability Modulo Theories solvers and the Answer-Set Programming System Competition are only two I know about; you could also name a couple of chess and other formalized-game tool competitions. I hope SV-COMP will become one of them.

Our contribution

The Cousots' approach is about 30 years old, and the technology behind BLAST itself is about 10 years old. However, BLAST contains a lot of improvements that make it applicable to many programs, not just artificially created small "research-like" programs. That's why we chose it for finding bugs in Linux drivers, and worked hard to improve it. Our improvements are stored here, and are available under free software licenses, just like the original tools.

But, being a less developed technology, it was eventually overtaken by more recent tools, such as CPAchecker, which provide a better ground for research experiments and have a more developed theory. Still, the competition results demonstrated that the tool is still capable (and I hope that our improvements played a significant role in that). We got 5th place with 231 points, the 4th place being occupied by SATabs with 236. The final table was divided into two groups (200+ and the rest), and we belonged to the top tier.

The funny thing is that we even got a plaque for achieving the best results in the "DeviceDrivers64" category... naturally. Too bad I wasn't there to see the ceremony :-(. Here's a photo of the team that has contributed to the improvement of BLAST since version 2.5:

By the way, I already featured my former peers in a fictional interview about a "collaboration tool" by Penn&Paper's.

(Left-to-right: Vadim Mutilin, Pavel Shved, Mikhail Mandrykin.)

I'm sure that this project will eventually be superseded by more advanced software, but I still hope that I'll have time to have some fun with it as well.

With a more detailed post about the verification technology still pending, I'd like to thank my former peers for the great time I spent at ISPRAS working on what I had wished to do ever since I was a 7th-grader who first heard about the halting problem. And I wish all the best to the Linux Driver Verification project as well.



A Visit to the Computer History Museum

There are not many museums around the world specifically devoted to computers. The two I know about are Bletchley Park, located 70 km away from London, and the Computer History Museum. The former is not about computers per se, though; rather, it's about the decryption of German messages during World War II, which happened right there. The latter is completely devoted to computers.

I visited the Computer History Museum two weeks ago, during a short trip to Silicon Valley. Here are a couple of photos I took there.

Here's my personal favorite, a slide rule tie clip (as I love ties, I want one of these too!)

Oh, here's another Penn&Paper's solution. It's a clone of the Microsoft Visio diagram sketching tool:

Nowadays, you may measure the uptime with a single shell command. Back then, there were devices specifically designed for this:

"Man" is a shorthand of "manual" in Linux word, and not what you, a feminist, thought about.

Need a man to learn what uptime is? Here's your man:

And, of course, a full-size working copy of a Babbage machine (I didn't see the demo, but they say it works!)

The museum has more of these. Due to my schedule, I had only a bit more than an hour to visit it (note that it's closed on Mondays and Tuesdays!). But if you're around, it's totally worth it. Don't miss it; it looks like this:

(And if you happen to find yourself thirsty, don't go to that bar across the street: the service is quite bad there.)



Microsoft Office Needs Windows... a lot of them!

Microsoft Office

This is a photo of a Microsoft office (stolen from here), namely the Roger Needham Building at the University of Cambridge, one of the two buildings Microsoft Research resides in. See how many Windows it features? It just couldn't be a coincidence...

Recently, I visited the Microsoft Research PhD Summer School 2011 in Cambridge, Great Britain. The event has been organized by Microsoft for more than ten years. Researchers from institutes all over Europe that are partners of Microsoft Research (I don't know exactly what that means, but apparently the institute I work at is one of them) come there to learn something new about research in general, to meet researchers who work at Microsoft, and to get introduced to what Microsoft is interested in.

The event was mostly held in the Roger Needham Building, where, as far as I know, most of the Microsoft researchers in Cambridge nest. I wasn't given a chance to take a tour, but it was obvious that this Microsoft office really has a lot of Windows on purpose ;-)

Microsoft also blogged about the event, but here I'll share my personal view.

Microsoft and Linux

My poster

That's how my poster looked; check out the Tux in the upper-left corner! I know it's total crap (there's just too much text), but this was my first attempt, and next time I'll do it right... or, at least, more right than this time. Yes, next time I will not use LaTeX to make a poster. (And here is the pdf).

As part of the obligatory participation program, I gave a poster presentation on the research I'm currently doing. Of course, I couldn't miss the chance to troll Microsoft with an image of the penguin Tux, the mascot of Linux, and with a title that starts with "Linux" as well (see sidenote).

It turned out that Microsoft researchers are really open-minded; they didn't get xenophobic, they joked about it, and some of them even confessed that they sometimes use Linux in their work.

Most of the school, though, was devoted to introductory "breadth-first" talks on what Microsoft is doing. The talks were given by practicing Microsoft researchers, and I'll briefly overview the ones that seemed most insightful to me.

Research techniques introspection

Simon Peyton Jones, who is known for his work on the Glasgow Haskell Compiler, gave an excellent talk on how to give excellent talks at a research conference. Not about GHC, though, which would probably have been more interesting.

One of his observations, which I had also made a couple of conferences ago, was that it's easy to be just good enough, because most talks at an average research conference are really boring.

He also gave a talk on how to write a good research paper, but it went beyond that: Simon shared his view on how to interweave writing into the research itself. He advised starting to write the paper at a very early stage, beginning with the "middle", the "core" of the paper. This helps you shape your thoughts and better understand what your research is actually about, he said. However, some people noted that this recipe may not be universal, as it (luckily?) fits the way Microsoft evaluates its employees.

Another good insight was about techniques for presenting finished research. I learned that the approach to presenting scientific results we've been taught here in Russia is not the only one possible. The key difference from what Simon explained is the fundamental motivation for why the research was done. I had always been taught that in each and every paper I should justify my research by how bad, wrong, and suboptimal the previously discovered results were, and here I come, with research whose sole purpose is to improve on them. Simon introduced another approach. The key motivation for the research, in his view, is: "Look, I've created something new! Isn't it cool? Look how efficiently it works! And, by the way, foo and bar have also been discovered, and I sincerely thank their authors, for they have inspired me."

Of course, it's hard to say that one of these is the right way to go. However, I find papers written the second way much more pleasant both to read and to write.

My Website is Closed for Christmas

Kenji Takeda giving a talk on cloud computing

Kenji Takeda gave an overview of what Microsoft is doing in the clouds. Aside from flying here and there on planes, Microsoft also builds datacenters, at both the software and the hardware level. I learned that an interesting solution was developed for cooling datacenters. Instead of installing fans that just move the air away, a simple physical observation can be used to improve on such an inefficient and sometimes failing solution as mechanical fans. The idea is to exploit the fact that hot air is lighter and moves up. If the upper part of a large room is open, the hot air escapes there, creating a draught that sucks cold air in through holes at the bottom.

After the lecture, we had a brief discussion of how to manage utilization of cloud resources at demand peaks, when there's just not enough capacity to serve all the requests per minute, such as at Christmas. The answer was that it may be achieved by raising prices during certain periods. Indeed, if we are running a scientific computation (such as genome sequencing), it can safely be paused over Christmas, whereas large shops or entertainment providers may take the resources instead. Raising prices makes the system self-managing, driven by reducing expenses and maximizing income for all the parties involved.

As a side effect, some non-commercial services that use cloud computing, such as my website, may also get temporarily shut down—just as a brick-and-mortar shop would be closed for the holiday. So "this website is closed for Christmas" is not a distant future for us.

Biology and Ecology, and... Microsoft?

Biological simulation in Sci-Fi

This is how computer simulation in biology is portrayed in the "Hollow Man" sci-fi movie by Paul Verhoeven. As far as I know, Paul used to be a researcher, and really tried to make it look real. But you may notice that the protagonist is running a simulation of a molecule he designed, and it's not model checking: the science was not that advanced at the time of shooting.

It turns out that Microsoft really carries out a lot of research. For instance, I heard a talk about fighting cancer with model checking, work created with the help of Thomas Henzinger, a scientist well known among software verification specialists. Model checking helps fight cancer by discovering and verifying models of internal chemical processes in the human cell—without simulations or actual experiments. These models, shaped as finite state machines, are first verified by computers, and if the automaton found is proven sane, a real experiment is carried out to confirm the relevance of the model.

I had seen something like this in the "Hollow Man" sci-fi movie, but it turned out to be the reality of a Microsoft Research laboratory! Even more, while the Hollow Man ran simulations (see sidenote), model checking techniques do not even require simulating the experiment: the assumptions about the nature of the processes being studied are verified as a mathematical abstraction rather than as a physical object, thus providing a more reliable result.

Another interesting domain is ecological research. Microsoft gathers a lot of ecology-related data and works on providing representations of it. Drew, a researcher from Microsoft, also shared that humanity is starting to experience difficulties with water! He said that today we literally have to mine water from as deep as hundreds of meters, and some drinking water sources are actually getting exhausted: you have to dig deeper and deeper to get to the actual water!.. However, I'm not an ecologist, so for me that was just idle curiosity.

Anthony Hoare and yet another philosophy lecture

Sir Anthony gave a boring talk about the history of computer science projected onto the development of philosophy. Ever since the philosophy lectures that were obligatory for my postgraduate studies, dozing off has been my instant reaction to the history of philosophy.

Still, I couldn't miss the chance to ask such a well-known person a couple of questions. I actually had one prepared. Several months earlier, I had heard a talk on some applications of model checking. Irina Shevtsova advocated the use of models in software verification. Roughly speaking, she insisted that a program should come with additional information to verify its correctness. Ascribing this to Hoare, she explained that the auxiliary information should be a model, a finite-state machine depicting the programmer's—or the customer's—intent. This sounded too controversial to me, but the moderator of the seminar shot down my attempts to start a discussion on the matter.

So, I asked Mr. Hoare whether that was really what he meant. It turned out that it was indeed not models Hoare was talking about, but mere assertions. Any assertion you put in becomes a piece of meta-information about what the correct behavior of the program is. The non-violation of these assertions can be verified automatically, or at least manually, and the strict adherence of every execution to all the assertions is what makes a program comply with the developer's intent. I can't wait for another encounter with Irina, where I'll have a chance to explain (oh, let's be honest, to brag about) this misunderstanding.

Terminator

Byron Cook gave an awesome talk on termination checking techniques. This is not about the extinction of humanity by machines (well, if it is, then only indirectly), but about automatically proving that a given program will terminate. I was aware of techniques for proving that a program will never violate a given property while running (this is called a "safety property"), and I even maintain a tool that performs such verification. However, the details of proving termination (that the program will not get stuck in an infinite loop) had somehow remained unlearned by me. Byron filled this gap with his awesome talk. I won't try to explain it briefly; instead, I'll redirect you to the article in CACM that he co-authored (or, judging by his talk, made the biggest contribution to).

Cambridge

King's College

This is, perhaps, the most well-known view of Cambridge—shot with my own camera! At the time of taking this picture, I was on a boat being punted, the event having been kindly organized by Microsoft Research.

The event's location was Cambridge, a beautiful town an hour's train ride away from London. We quickly got used to the mild smell of potassium the town greeted us with, and the remaining dominant flavor of science encouraged the dissemination of knowledge at the school.

We couldn't help walking through all of Cambridge in our free time, including, but not limited to, several pubs (authentic English ale is awesome!). But the details of the trip are really off-topic here; what I can safely share is that if you get to England, it's totally worth spending a day touring Cambridge.

Conclusion?

What was most valuable to me personally about this school was the breadth. I learned a little about a lot, and that's how you really enjoy getting new knowledge. I met one person working in a similar domain and had an interesting discussion with him (and I missed a chance to have a good discussion with another guy—jeez, I should really learn how to approach people!). I learned something new about how to present my research well, and... well, I learned that Microsoft is not that evil! Isn't that what the organizers hoped for in the first place?..



Scrum: our three iterations

Yes, Scrum is now a 100% buzzword. We learned about it in our university studies, we heard about it from corporate software engineers and consultants at conferences, we see it in the enterprise (see v1), we read about it in job postings, and some GSoC projects even impose Scrum on one-man "teams"!

However, it was not until this summer that I learned it's actually an effective engineering technique. It revitalized the development of one of our projects (which I'll refer to below as "the Project").

Actually, the idea of adopting Scrum was mostly inspired by one of our customers. A couple of our specialists worked in collaboration with another company on its product. The customer used Scrum and held meetings in the room where our group sits. Our managers liked it, and we decided to try the framework.

So, we adopted Scrum for our project as well. When the idea emerged, I made a blog post referring to it as the thought of a bored manager.

To make things less boring, I volunteered for the ScrumMaster role, and we completed three sprints (before taking a pause, since the majority of the team went on vacation).

So, I've learned some lessons, and I want to share my thoughts on what Scrum is and how it was applied.

How I understand what Scrum is

First, what is Scrum? I won't repeat all the definitions (you can find them on Wikipedia (frankly, quite a bad article) or here), just give my perception of what it's really all about.

Our burndown chart

Here's our burndown chart for the 3rd iteration. It was created with the VersionOne online agile project management tool (it's free for small teams of up to 10 developers!).

(To read the rest of the article, you should know the basic definitions of Scrum.)

Scrum values:

  • permanent formalized evaluation of the work done
  • early identification and discussion of problems
  • frequently changing local goals (as an Agile tech)
  • orientation to business values

The most important thing is the permanent evaluation. At the beginning of the post I mentioned that Scrum was used in a GSoC project; there, of course, it could hardly have been anything but a disciplinary measure.

Early discussion of problems is also a valuable feature. During the daily scrum, developers are supposed to speak about problems that prevent them from achieving their upcoming goals. This is a required part of the daily scrum, and it appears to be a valuable measure for keeping things on time.

Frequent changing of local goals is a natural feature of any agile framework; Scrum is no exception.

Orientation to business values is provided by separating those who do the actual work (programmers) from those who decide what the product should look like (the product owner). When we started doing Scrum, I noticed that nearly all the items in my TODO list were replaced by more valuable ones.

Scrum implementation

Scrum is not easy! The formal definitions say what should be done, but not how. I volunteered as ScrumMaster, but only by the 3rd sprint did I succeed in leading a proper sprint planning. Scrum is more like an interface, within which you can vary the actual implementation.

I can't explain every single nuance of our implementation; we have only moderately settled into our own way of doing it. And even if I could, it would be material for a 100-page book. Books are what you should pay attention to. The book that influenced our Scrum the most is "Scrum from the Trenches"; you can find the link in the "Acknowledgments" section below.

However, here are a couple of the most valuable notes.

Planning poker

Here's an image of a Planning Poker deck (taken from the site of the "Crisp" company, which sells them).

We printed similar cards and play them at every sprint planning. The reasoning for using them is based on the fact that an opinion, once expressed, actually affects what other people consider their own opinion. When people are forced to commit to their estimates independently, it usually leads to more productive discussions in which the whole team is involved. You can read more about it at the site I already linked.

It turned out that planning poker is really useful during sprint planning. Contrary to what you might think, it makes sprint plannings faster, as it gives the meeting a formal pace to flow at.

Some people suggest that sitting at a table and walking through the backlog is not effective. However, for our 4-man project it is, in my opinion, the best way to run it. As we play planning poker, I (as ScrumMaster) make notes about splitting things into tasks, and after we select enough stories for the current iteration, we go through the notes and form the actual tasks.

Downsides and lessons of Scrum

Responsibility

The first, quite non-obvious consequence of adopting Scrum is that it makes developers even less responsible than they usually are :-). People start doing things not because they should, but because it was planned, because there's an action item, and you can't close it unless you complete it.

Motivation of developers

As I mentioned, Scrum teaches you to talk about future problems, not about solved ones. However, this leads to underestimating the effort developers have made. If someone has successfully solved a tricky problem, and thereby completed an underestimated story on time, there's no way to notice it in the usual workflow. This makes developers less motivated. Overcoming this is a task for the managers, and Scrum doesn't help here.

Inflexibility

Scrum silently assumes that estimating how long a task will take can be done within minutes. Only a 4-hour meeting is allocated to estimate all the stories and tasks (usually 10-20 stories).

That is probably true for most well-established commercial projects. However, our Project was partly a research one, and it was unclear how much time implementing something that hadn't really been done by anyone before would take. For example, evaluating one of the features in our project took a whole week!

We solved this problem by

  1. making sprints shorter (2 weeks long)
  2. introducing "research stories"

When the "product owner" required to implement such a story, which needed additional research for its estimation, we first asked him to add "research story". An outcome of a "research story" is not an implementation of the feature, but rather a report that contained evaluation of several ways to implement the feature. Of course, several approaches to the problem have to be prototyped, but the quality of such prototypes shouldn't make the developers spend too much time on them.

The report is analyzed upon a completion of such a story, and one way to implement the feature is selected by the product owner. Then implementing of this approach is added to backlog, and is to be addressed in the nearest sprint (or the owner may decide that the feature is not worth it).

Of course, to increase the pace, we made the sprints short (2 weeks is commonly understood as a minimum value of sprint length).

Lack of code quality criteria

Tasks and stories are required to be expressed in business terms. But what if there's a need to refactor code, set up a new logging system, or fix the architecture, with no obviously visible outcome?

The focus factor (also known as the "fuckup factor") is a multiplier for the team's available man-hours. If your focus factor is 70%, it means that the team can only promise to deliver the amount of stories that takes 70% of its total man-hours. The rest of the working hours are spent on the "backend" work, which doesn't map to concrete user stories.

70% is a "recommended" value for the focus factor, but its actual value may vary between teams and projects. For our team it's 80%, because the Project doesn't have many users, so few bugs are filed.

The standard way is to decrease the "focus factor" and let the developers decide how to spend the freed-up time. If there's a great need for refactoring, there's a limited amount of time that can be spent on it.

Prespecifying the focus factor also makes the team choose wisely what to refactor, as they know they can't spend all their time on it.

Who fixes the bugs?

Some bugs are planned for, but some are not! Some bugs get reported by critical customers, but--surprise!--all your working hours are already allocated to different stories. Who fixes the bugs then?

The obvious solution is to set aside some extra time for fixing critical bugs within the "focus factor" mentioned above. Non-critical, lower-priority bugs can be turned into stories for the next sprint.

Scrum requires robots

Integrity and uniformity of the team are required for Scrum to work well. Days off are local catastrophes, and a developer's sudden illness is a great catastrophe that may result in a failed sprint (we're experiencing this right now).

It's obvious that Scrum would work better for robots than for humans. When such a problem occurs, it takes really non-formalized, "human" thinking to resolve it.

Acknowledgments

The most fruitful source of information on how to do Scrum was "Scrum from the Trenches" (free pdf version). Modesty aside, having read the Wikipedia definitions and this book, I became not the worst ScrumMaster in the world.

And, of course, experience matters. Our managers, Vladimir Rubanov and Alexey Khoroshilov, helped the team a lot in facilitating Scrum efficiently.

Conclusion

Another interesting thing I noticed is that working without any formalized process is more boring than working with one. Some time ago I guessed that it would be like having sex in a new position, and after I tried it (Scrum, not sex), I realized that the guess was spot on.

Though not ideal, Scrum turned out to be a very effective framework. It combines agile values with what commercial development wants, which is why it is the most popular agile framework. Its bureaucratic procedures (maintaining a backlog, holding scrums, drawing burndowns) are totally worth the net positive effect on development speed and quality.



Today I wrote my first unit test...

As you probably know, I work in the software engineering department of a scientific institution of the Russian Academy of Sciences, and my research and area of interest is software engineering. According to the tag cloud, the "software-engineering" tag has reached the popularity of the "fun" tag, which does demonstrate my inclination.

But it was only today that I wrote my first unit test.

And I loved it. Seven hours of ceaseless coding, brains not involved. A framework that runs the tests, a hack to the build system that assembles the test application, mock objects controlled by environment variables -- that's how a bash/perl application is unit-tested, and that's totally uninteresting. However, it was fun. I discovered two bugs in my application just by writing some unit tests; otherwise they would have delayed the work when we encountered them later.

Unit-testing and automated test systems development

My whole career has been devoted to developing automated testing systems. I was totally aware of why tests are needed--they were the source of my income, actually. So why had I never written unit tests?

The answer to this question is simple. We, testing systems developers, get tests for free.

Usually, when you develop a testing system, you have a customer whose software will be its primary target. Most likely, this software has several versions. And these versions differ dramatically from a tester's viewpoint: the next version has fewer critical bugs to detect than the previous one.

And there's your test data: just load the stuff you're supposed to test! And there they are: a lot of regression tests for your tools.

The tests are essentially free: you don't even need to think up test cases; comparing against a run on the previous version yields enough information to decide whether your release is OK. And if you have another level of indirection, such as "test each version under each major Linux distribution," then you're likely to satisfy any vendor with your level of QA.

So if you happen to purchase an automated testing system (or merely have one developed for you), ask the developers what kind of testing they did. If they somehow happen to have no regression tests, it means something's really wrong there, because getting such tests for free is an opportunity experienced testers wouldn't miss.



Uploading and processing data with inotify-tools

I'm not a muscleman. In fact, I'm kind of a wimp. And yet, my weight is 15 kilograms more than I should have for my height. Time to lose some.

I've been logging my weight for quite a while, but I realized that logging alone doesn't help. I need other people to nudge me, so I'd be more ashamed of my condition. To achieve this, starting today I will publish a graph of my weight. It is rendered by gnuplot, and that page contains the script that renders it.

And, of course, since I'm a geek, the process of publishing the graph should be as automated as possible.

Nearly every morning I weigh myself and log the result in a text file. My aim is to upload it to the server automatically right after I edit it; manual uploading is out of the question, of course. The update happens at different times, and I don't want to poll the file every hour: it's just inefficient and introduces unnecessary latency. Here's the neat solution.

What is inotify

Inotify is an interface to the Linux kernel; its purpose is to watch for certain filesystem events and report when they happen.

Just like Berkeley sockets, inotify calls create file descriptors, which can then be watched with the usual select() calls.

The list of events that can be watched includes accessing a file, modifying a file, moving files, and modifying metadata. Directories can also be watched recursively.

You can get more info in man 7 inotify and on Wikipedia.
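For the curious, here is a minimal sketch of what the raw kernel interface looks like from C; the path is a placeholder and error handling is omitted. The shell tools described below wrap exactly this pattern:

    #include <stdio.h>
    #include <sys/inotify.h>
    #include <unistd.h>

    int main(void) {
        char buf[4096];
        int fd = inotify_init();                      /* one descriptor for all watches */
        inotify_add_watch(fd, "/home/me/weight.log", IN_CLOSE_WRITE);
        for (;;) {
            ssize_t len = read(fd, buf, sizeof buf);  /* blocks until an event arrives */
            if (len <= 0)
                break;
            printf("the watched file was written and closed\n");
        }
        close(fd);
        return 0;
    }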

Tracking filesystem events with inotify-tools

Starting from 2.6.13, the Linux kernel has an interface that watches for filesystem events without needing to wake up on a timer to check for them. This subsystem is referred to as "inotify". We can use it for our purposes.

The general idea is to run a daemon that waits for the weight log file to change. After I click "Save" in the text editor, the daemon notices it and sends the file to the server via scp (copy over ssh). On the web server, another notification daemon watches for changes to the destination file; when it fires, it renders an image with the plot and copies it into the web server's document root.

Of course, nowadays one doesn't need to write C programs that access inotify directly via syscalls. Instead, I installed the inotify-tools package (sources), which allows inotify to be used from the shell; the command we will use is inotifywait.

Inotify, as noted in the kernel docs, is accessed by establishing watches that report information about events via file descriptors. Waiting for a filesystem event is a mere blocking read from a file descriptor, with the internals handled by the kernel. The inotifywait command just wraps this in a convenient way.

One of the ways inotifywait can be used is to "monitor" events: the inotifywait process just prints a line to STDOUT whenever an event we watch for occurs. So we can pipe the output of the inotifywait invocation into a simple shell script that reads the lines and does the work on each read. Here's what the daemons look like:

Local PC:
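(The original script isn't preserved in this copy of the post; this is a minimal reconstruction of what it does. The paths and host name are placeholders.)

    #!/bin/sh
    # Watch the weight log; on every save, push it to the server.
    LOG="$HOME/weight.log"
    inotifywait -m -e close_write "$LOG" | while read event; do
        scp "$LOG" myserver:weight.log
    done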

Remote server:
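(Again a reconstruction, with the gnuplot script and web root as placeholders.)

    #!/bin/sh
    # Watch the uploaded log; on every change, re-render the plot and publish it.
    LOG="$HOME/weight.log"
    inotifywait -m -e close_write "$LOG" | while read event; do
        gnuplot "$HOME/weight.gp"                 # renders weight.png from the log
        cp "$HOME/weight.png" /var/www/weight.png
    done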

These daemons are currently deployed on my local machine and on the server.

Version without monitoring

Due to a misunderstanding of how things work, I encountered strange denials when I used the -m option of inotifywait. That's why I previously used the following daemons:

Local PC:
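(A plausible reconstruction rather than the original script: without -m, inotifywait exits after a single event, so it runs in a loop, and the copy is pushed into the background so that the watch is re-armed as soon as possible.)

    #!/bin/sh
    LOG="$HOME/weight.log"
    while true; do
        wait                               # let a previous copy finish
        scp "$LOG" myserver:weight.log &   # push the current file asynchronously
        inotifywait -e close_write "$LOG"  # meanwhile, watch for the next save
    done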

Remote server:
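(Same caveat. Here the plotting runs in a background subshell, so inotifywait is back watching while gnuplot is still working; this is what the race discussion below is about.)

    #!/bin/sh
    LOG="$HOME/weight.log"
    while true; do
        wait                                             # reap the previous plot job
        ( gnuplot "$HOME/weight.gp" &&
          cp "$HOME/weight.png" /var/www/weight.png ) &  # render in the background
        inotifywait -e close_write "$LOG"                # keep watching while it renders
    done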

The calls to inotifywait here contain no -m option, which means that they exit after the watched-for event occurs.

There are several wait operators and subshell spawns that may seem excessive at first glance. However, they're there to avoid losing data to a race condition.

Assume there were no waits and no subshells, just inotifywait and the copy/plotting commands. Then the following scenario would be possible:

  1. I enter a new value and save the file
  2. The file is copied to the remote server
  3. The remote server notices that the file has changed and starts plotting it
  4. I notice a mistake I made in the data, fix it, and save the log again
  5. The file is copied to the remote server
  6. The remote server finishes plotting the previous log, the one with the mistake
  7. The remote server starts waiting for file modification
  8. The changes made in step 4 are ignored!

If the file is copied to the remote server while the server is plotting it (and plotting/converting takes noticeable time), the subsequent inotifywait call won't notice any changes, because when the file was modified, inotifywait was not running!

We solve this by doing the conversion work asynchronously. Then step 7 doesn't have to follow step 6; instead, it's executed (hopefully) right after step 3.

If we spawn subprocesses instead of running the whole conversion inline, the window during which we can miss modification events shrinks dramatically, and no longer depends on the length of the processing.

On the other hand, we can still miss modifications while we're waiting for child processes to terminate. But even if we miss modification events there, once we finish waiting we will handle the modified file anyway, so that's acceptable. In general, it also does no harm to do a, say, sleep 10 after inotifywait: another save of the data is likely to follow the first one.

Conclusion

Finally, we've created a synchronized system of two daemons, one on the local and one on the remote machine, based on the inotifywait program from the "inotify-tools" package. And now I have a public tracker of my own weight, deployed within an hour; but I still wonder, wouldn't it have been better to spend that hour jogging?..



How I applied for a web server developer, or why the fastest servers are written in C instead of C++

I realized why I was so aggressive in my attempts to steer one of the recent StackOverflow questions so that it would gain useful answers and wouldn't be closed as subjective. The question title is "Why are most really fast servers written in C instead of C++," and I really wondered why they are. Because that's... a bit personal to me. So let me tell you this story.

Some time ago, after my fiasco trying to get a job in the USA, I thought it was a good idea to steer my career in a less ivory-tower-ish direction. I thought about what was going on in the industry and decided that since everything was moving to the web, becoming a web server developer would be a fruitful choice. So I sent my resume to three companies that were hiring newbie Linux C++ web server developers. Being under the impression of the "Smart and Gets Things Done" stuff, I wrote my resume that way and explicitly stated that I didn't know anything about web serving.

First

One company, the top Russian search engine, declined me with a note saying, "Your skills are valuable, but we can't offer you a job now" (bold is mine). I still wonder what that means: am I good or bad? Should I get more experience and try again, or was that just a polite rejection? But well, alright, that's the number-one company, so I didn't frown upon it.

Second

The other company was an emerging startup. They had a shiny big office with free coffee and a large restroom (at least that's what they said). A warning sign was that the team had failed its previous startup, but hell, why not give it a try? I was contacted by the CEO (it's a small startup, isn't it? Anyway, at one Silicon Valley startup I was also interviewed by the CEO, so that didn't bother me) and was sent a sample task.

The task was to develop a C++ anti-DDoS module that drops any IP that exceeds a connection limit within a prespecified time. The task had been sent without any prior notice and had a week-long time limit (who cares that I had plans for the week?). Period.

Well, alright. I didn't have any experience with Apache or web serving at all, so I started googling like crazy. After five hours of googling, I had learned how to create apache2 modules, devised a way to write a C++, not a C, module, and written a module that worked but was thread-local (apache starts a lot of threads). Essentially it was useless; but, well, spending more than 5 hours on a job application is commonly acknowledged to be overkill!

I thought it proved that I was a fast learner and a capable programmer. I sent that solution along with notes on how it should be improved to make a complete module. However, "the architect was not satisfied with my code," and I was asked to finish what I had been assigned. My note that I had already invested enough time for them to see whether I was good enough to do the job was answered with a hilarious letter. It turns out I don't profess the company's philosophy, which values results rather than capability! And unless I finished the code, I wouldn't be accepted.

The startup I applied to has, I guess, already failed; nobody speaks about them--the whole Google output is their SEO technologies. And they still require all applicants to register on their website--pathetic, isn't it?

Well, if the company values working for it for free, then it indeed has a different philosophy than I do. Good luck to those guys; I hope they won't fail too many more startups before they learn something.

Third

The reasons why I don't and won't apply to Google deserve a separate post, I think.

The third was another top Russian Internet company. They scheduled an interview in their shiny big office with a free pong table and a PS3 available, as well as the usual free coffee and a large restroom! The interview started with the question, "Why didn't you apply to Google?" -- apparently because of my Summer of Code participation (or because I was a candidate to replace a guy who had transferred from there to Google).

The interview went smoothly, even considering that I didn't know anything about how HTTP or web servers work. After the more usual questions, I was given a coding assignment. The clouds started gathering: they asked me to write a C program that parses an apache log and counts how many different IP addresses appear in it. "Why not C++?" I asked, "it's easy and fast in C++!" The reply was that they wanted to judge my algorithmic skills, not my knowledge of standard libraries. Well, OK, that seemed fair... So, let's recall how I did it in programming contests... *rubbing hands*

In 20 minutes the program was ready, but the remote server I was ssh-ed into suddenly ran out of space, and the source somehow got erased after a failed attempt to save it (Vim let me down this time). I spent another 20 minutes coding it again; it was all OK, and the interviewer didn't even find anything to criticize in my code.

The interview seemed to have gone great. The architect who interviewed me happened to come from the same university as me, was a fan of Vim as well, and seemed pleasant. We proceeded to discussing my salary, I named my figure, and then it happened.

"Okay, now I must confess," the architect started, "We lied to you a bit. All our software is actually written in C. Although we were looking for a C++ developer, you'll have to do C programming mostly."

O_O

The thing, yet unknown to the interviewer, is that I hate C; I hate it as much as I love C++. Feeling deceived, I burst into an explanation of how disgusting writing in C is, C being the new assembler, regardless of the fact that I can do it quite well... No wonder I didn't get that job either.

Of course, I asked why they did their development in C. And the answer was, "Well, we sometimes feel the urge to start rewriting it all in C++, module by module. But then we ask ourselves, "but why?", and leave things as they are."

***

And now I also ask myself—"why?" The fact that servers are mostly developed in C was, I think, crucial for my life. Perhaps, if that weren't the case, I would already have said goodbye to science and my master's studies and become a full-time corporate server developer. But sometimes at night I still wonder why... why didn't they rewrite it in C++? And that's why I wanted that SO question answered.



Binary compatibility of C++ Shared Libraries in Linux

A year ago I completed my first "scientific" paper. I presented it at a local conference for junior researchers. It didn't actually have much effect: as far as I understood, no one at the conference was into the subject (ah, no, I'm wrong, there was one guy who developed boost::build). I have wanted to fix it (since it contains minor bugs) for a very long time. But since I haven't done it yet, I suppose I never will.

So here it is. The pdf text is freely available online, and there's even a ppt presentation.

A small abstract: the paper aims to describe why C++ libraries may lose binary compatibility. It also analyzes whether certain compatibility restrictions can be relaxed if the library's specification only allows the user to use a subset of C++. No one had done such an analysis before.

The paper is also a somewhat successful attempt to analyze C++ with the scientific method. It turned out that the language doesn't lend itself to such analysis, since it's just too huge. But anyway, this is the most complete--and thus the most useless--compatibility guide. It was quoted by the GCC folks in their manual, but that's the only known reference, and I have no idea how they found my article.

The developer of the ABI Compliance Checker used the knowledge gained in this study, and that checker happens to be a working tool. This proves that my work hasn't been useless or erroneous.

Well, this weekend I'm submitting another article to a conference, and I hope it will be less useless to the world. Wish me luck.



More posts about me >>