Traffic II: Why does the Fast Lane become Slowest in Traffic?

As I drive up to San Francisco on one of the freeways (either 101 or 280), I approach the inevitable traffic stoppage. The freeway is clogged, and it seems I'm in for 15 minutes of bumper-to-bumper traffic, wishing I had a self-driving car.

But wait, what's this? Why am I stopped while everyone else keeps going? Could it be just another example of the "Why is the Other Lane Always Faster?" illusion?

Series on Car Traffic modeling

Inspired by my almost daily commute on California Highway 101, I explored the achievements of traffic theory and found answers to the most pressing mysteries of my commute.

Casual, repeated observations confidently refuted this. The traffic always gets stuck in exactly the same manner: the leftmost lane's blockage starts farthest back, then the second-to-the-left lane's, and the rightmost lanes are always fastest. How come?

The conclusion, it seems, does not require a deep understanding of traffic theory: more cars arrive through the fast lanes than through the slow lanes, and more cars mean more congestion. But traffic theory can help put mathematical notation around it, so read on.

Fundamental properties of traffic: Flow, Density, Speed

My observations suggested that a model of traffic that can explain this phenomenon requires detaching the speed of the traffic from its other properties, such as the number of cars per unit length of highway. I later learned that traffic theory has been exploring these questions since the 1930s (here's an overview of classical traffic flow models on Wikipedia), so I'll put my observations in these accepted terms.

If you observe a section of a lane of the freeway for some time, as enough cars pass, you'll notice that traffic has some numerical properties.

  1. The number of cars passing through a certain point per unit of time, which traffic engineers call flow. Let's measure it in cars per minute, and name it q.

  2. The average speed with which cars move, or v.

  3. The average number of cars that are simultaneously within the segment boundaries at any given point in time, referred to as density, or k.

We'll talk more about these in the section on traffic models, but for now we'll just use them to discuss the question of the day.
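These three quantities are tied together by the basic identity flow = density × speed (q = k·v), which follows directly from the definitions. A quick sanity check in Python (the numbers below are made up for illustration):

```python
# Fundamental identity of traffic flow: q = k * v
# (flow = density * speed). Units here: cars/km, km/min -> cars/min.

def flow(density_cars_per_km, speed_km_per_min):
    """Flow in cars per minute past a fixed point."""
    return density_cars_per_km * speed_km_per_min

# Illustrative (made-up) numbers: 20 cars/km moving at 60 km/h = 1 km/min
q = flow(20, 1.0)
print(q)  # prints 20.0, i.e. 20 cars per minute
```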

Why does the fastest lane have the longest congestion?

Let's assume a typical highway in the San Francisco Bay Area that's moving cars without congestion:

The speed limit on California freeways is 65 mph (~105 km/h). So when slow trucks drive in the slow right lane, few other cars want to share the lane with them. That means the slow lane has disproportionately small flow.

So while I didn't conduct scientific research on these values of q, they seem entirely explainable within the traffic model and agree with casual observation.

We know that the Fast Lane (the left lane)'s speed will be higher. However, it is also true [in California] that the Fast Lane will have higher flow.

It also makes sense that a slightly faster lane will move slightly more traffic (= will have larger flow). The disproportionately smaller flow in the slow lane is a result of the speed limit effect (see sidenote).

Now let's assume congestion develops at a certain point in the road (because the road narrows, or even spontaneously).

How many cars will get stuck in each lane (assuming no lane changes occur)? If a lane's flow is $q$, then the number of cars passing a given point over time $T$ is the product $q\cdot T$. So if the flow entering the congested zone is $q_n$ and the flow getting through the blockage is $q_0$, then the number of cars that enter the zone but do not get through it during time $T$ equals:

$$(q_n - q_0)\cdot T$$

It seems reasonable that lanes with the highest traffic flow will accumulate the most cars:

Indeed, per unit of time, more cars enter the road segment before the congestion in the high-flow lane than in the low-flow lanes. Therefore, the higher-flow lane will accumulate more cars.
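As a toy illustration of this, cars pile up in each lane at the rate (incoming flow − bottleneck outflow). The per-lane flows below are made-up but plausible numbers, with the left (fast) lane carrying the most:

```python
# Queue growth per lane: cars accumulate at the rate
# (incoming flow - bottleneck outflow). Flows in cars/min are
# illustrative guesses, not measurements.

incoming = {"left": 30, "middle": 25, "right": 15}   # cars/min upstream
bottleneck = 10                                      # cars/min through the jam
T = 10                                               # minutes of congestion

queue = {lane: (q_in - bottleneck) * T for lane, q_in in incoming.items()}
print(queue)  # {'left': 200, 'middle': 150, 'right': 50}
```

Same bottleneck, same duration, yet the high-flow left lane ends up with a queue four times longer than the right lane's.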

Note that "high-flow" does not necessarily mean "faster". Traffic flow theory and practice establish that the highest flow is attained at a certain speed, and both higher and lower speeds lead to a decrease in flow. That makes sense: drivers increase the distance between cars as they drive faster. More on the theory below.

Admittedly, this simple model ignores most of the long-term effects of traffic. However, it does illustrate what I observe pretty much daily: lanes with the highest flow tend to develop longer congested segments than the lanes with the lowest flow.

So a lesson from this could be: when you see congestion ahead, merge into the slow lane.

Another lesson from this: it's sometimes beneficial to drive in a lane that has a high-flow exit. Consider the following situation:

Here, the congestion in the first lane will be half as long as in the other lanes, because the exit "unloaded" it.

This model stops scaling though

The effects described above will likely disintegrate after several minutes, thanks to lane changes. The distribution of speed, flow, and density will "diffuse" from the slowest lane into the other lanes. I'll devote a separate post to lane changes.

Other results of Traffic Modeling

Researchers have been studying traffic and its properties for about as long as there have been cars on city roads. The models I read about focused on two areas:

  1. "Car follower models" that infer macroscopic traffic flow properties from the decision-making behavior of individual drivers.

  2. "Traffic flow models" that study macroscopic traffic flow properties directly, and infer relationships between them.

Note that classical fluid dynamics models (of the kind that study the flow of water in pipes) are not applicable to traffic flow. Although fluid dynamics studies similar properties "such as flow velocity, pressure, density, and temperature, as functions of space and time", cars and molecules of a fluid behave differently. Most notably, cars don't normally push one another as they collide, so things like the Bernoulli principle do not apply; and while liquid in a pipe under pressure accelerates at a bottleneck, car traffic decelerates.

Car follower models

Car follower models basically model the behaviors of individual cars (how the drivers accelerate, brake, change lanes, and generally "follow" one another). For example, there's the Gipps model and Newell's model. Diagrams like this one show individual car tracks:

(Image from the "Traffic flow theory and modelling" chapter by Serge Hoogendoorn and Victor Knoop.)

To illustrate the point I made above, notice how this model simply has "overtaking", as if it had no other effects than one car passing another. However, on a congested freeway, there also needs to be space in the other lane for the car to move into, and subsequently, optionally, back into the original lane. So this particular model does not intend to describe lane changes (which is OK; it can have other uses).

However, some other models do. In fact, a system called TRANSIMS mentions that it models behavior on a freeway as a network of agents trying to maximize their utility, and finds a Nash equilibrium, which becomes the solution for the steady flow of traffic.

Traffic flow models

Early traffic flow models mostly focused on establishing the relationship between speed, density, and flow. For example, the following diagram could be used to predict at what speed the highway reaches maximum capacity:

It's reported that early traffic models did not explain spontaneous congestion on freeways (when there's traffic without any apparent reason). I guess all it took was for freeways to become spontaneously congested in the areas where the researchers worked. :-)

The discovery of spontaneous traffic breakdown by various people followed (in the late 80s to late 90s), and the following relationship, called the "fundamental diagram of traffic flow", was established: as the upstream traffic flow increases, the downstream speed increases until it reaches the breakdown point, at which both the speed and the downstream flow start decreasing as the upstream flow continues to increase. It can be depicted on a neat three-dimensional diagram:

The diagram is borrowed from the "Traffic Stream Characteristics" by Fred L. Hall (pdf here).
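To make the speed-flow-density relationship concrete, here is a sketch of the simplest classical example, Greenshields' linear speed-density model (this is not the exact model from the diagram above, and the parameter values are my own illustrative guesses):

```python
# Greenshields' linear model: v(k) = v_free * (1 - k / k_jam),
# so flow q(k) = k * v(k) is a parabola with its maximum at k_jam / 2.
# Parameter values below are illustrative, not measured.

v_free = 65.0   # free-flow speed, mph
k_jam = 160.0   # jam density, cars per mile (traffic at a standstill)

def speed(k):
    return v_free * (1 - k / k_jam)

def flow(k):
    return k * speed(k)

# Maximum flow occurs at half the jam density, at half the free-flow speed:
k_crit = k_jam / 2
print(speed(k_crit), flow(k_crit))  # 32.5 mph, 2600.0 cars/hour
```

This is exactly the "highest flow is attained at a certain speed" effect from the lane discussion: push the density past the critical point and both speed and flow drop.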

My personal takeaway from this

While I'm not a traffic engineer, I set out to play with traffic simulations and see how I could model my daily commute. While reviewing the literature, it turned out that the field has already amassed ample knowledge about highway traffic, and there are existing open-source simulators (like TRANSIMS) that probably do it better.

For example, the model I wanted to develop would be very similar to the "cellular automata model" described on slide 24 of the slide deck by Benjamin Seibold (Temple University), known as the Nagel-Schreckenberg model.
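For a flavor of what such a cellular automaton looks like, here is a minimal single-lane sketch of the Nagel-Schreckenberg update rule (my own toy implementation; the parameters v_max and p_slow are illustrative defaults):

```python
import random

def nasch_step(road, v_max=5, p_slow=0.3, rng=random.random):
    """One Nagel-Schreckenberg update on a circular single-lane road.
    `road` is a list of cells: None for empty, or the car's speed
    (in cells per step) if a car occupies the cell."""
    L = len(road)
    new_road = [None] * L
    for i, v in enumerate(road):
        if v is None:
            continue
        # count empty cells ahead of this car (the road wraps around)
        gap = 0
        while road[(i + gap + 1) % L] is None and gap < L - 1:
            gap += 1
        v = min(v + 1, v_max)            # 1. accelerate
        v = min(v, gap)                  # 2. brake to avoid the car ahead
        if v > 0 and rng() < p_slow:     # 3. random slowdown
            v -= 1
        new_road[(i + v) % L] = v        # 4. move
    return new_road

# quick demo: a short circular road with three cars, initially stopped
road = [None] * 30
for i in (0, 3, 7):
    road[i] = 0
for _ in range(5):
    road = nasch_step(road)
print(sum(v is not None for v in road))  # cars are conserved: prints 3
```

Despite its simplicity, this rule reproduces spontaneous jams: the random slowdowns occasionally propagate backwards as stop-and-go waves once the density is high enough.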

However, I still wasn't able to find any mention of traffic lane change models. They probably exist; please let me know if you find them!

So I think I'll pull the plug on the simulation and move on to other things. However, I will muse about lane-change modeling and dynamics a bit in the next post, and also tell the story of the 101-92 interchange here.

Traffic I: Why is the Other Lane Always Faster?

"This highway has four lanes. One of them is the fastest, and there's a 75% chance we're not in it."

Last year, I switched from the Google office in San Francisco, next to where I live, to the Mountain View office. In search of more interesting projects and better career prospects, I repeated the move towards the company headquarters many engineers make. But it comes with a price. The 101.

The commute takes anywhere between 45 minutes and 2 hours one way. The spread is not a force of nature; it is completely explained by the variations in traffic.

Series on Car Traffic modeling

Inspired by my almost daily commute on California Highway 101, I explored the achievements of traffic theory and found answers to the most pressing mysteries of my commute.

The more time I spent on the road, the more I noticed that the traffic follows some predictable patterns. Yet the patterns seemed counterintuitive. For example:

  1. Why does the left lane travel faster when the road is clear but seem to get stuck more when there's traffic? (here's why)
  2. Why does the 92-101 southbound interchange always get stuck while the road is always free after it? (here's why)
  3. And finally, why are other lanes always faster?

Over the next several posts, I plan to explore these questions and maybe build some sort of a traffic simulator. So let's begin.

"Why are other lanes always faster?"

Once I was late for a flight, and a colleague offered to drive me so I wouldn't have to spend time on parking. It was a Thursday afternoon so the Bay Area traffic was in a predictable standstill. Luckily, my colleague was an amateur race car driver, so we gave it a shot.

Racing expertise didn't seem to help. Some skills were helpful, like merging at will into a lane that has "no space" between the cars. But we still got predictably stuck with the rest of the drivers. We started talking about traffic, and about why we got stuck in the slowest lane.

I brought up a book on traffic I had listened to before. "Traffic: Why We Drive the Way We Do (and What It Says About Us)" by Tom Vanderbilt is very fitting entertainment for someone stuck in their car (I listened to it on Amazon's Audible). Among other mysteries of traffic, the book explores the paradox of the slowest lane in its very first chapter.

So why do other lanes seem faster? The book posits that it's an illusion (the other lanes are just as slow) and offers the following explanations:

  1. "Unoccupied waiting" seems longer than it actually is.
  2. Humans hate when others "get ahead of them".
  3. We're naturally more aware of things that move than of the things that don't, so we don't notice the other lane when it's slow.

Having heard all that, my colleague offered a simpler explanation.

"This highway has four lanes. One of them is the fastest, and there's a 75% chance we're not in it."

Let's make a model!

And I think my friend was more accurate here. Traffic lanes do differ in the time it takes to travel them. I kept noticing it when driving down the highway in the left lane: the traffic "bunched up" way further back than in the "slower" right lanes!

That's why I want to come up with a mathematical model that explains my own commute experience. Here, I will not take on modelling traffic in a big, densely interconnected city, but focus on something simpler. I'll try to model the traffic on one single long highway (just like the 101), and see where it takes me.

Of course there will be lanes, because explaining the dissimilarity between the flow of traffic in different lanes is the whole goal of this.

Stay tuned.

I tricked the Amazon Go store

I tricked the Amazon Go store.

It's not hard. You need an accomplice. The accomplice swaps two items on the shelf. You take one of the misplaced items. You get charged for the other. But that's not the point.

Mechanical Turk?

Two days ago, I was on a trip to Seattle, and of course I visited the Amazon Go Store. If you are out of the loop, it's a store without checkout. You come into a store, you grab items from the shelves, you walk out. That's it.

Amazon doesn't explain how it works, but we can infer some from observations.

  1. When you walk out, you don't get a receipt instantly;
  2. The app sends you a receipt later;
  3. The time it takes their servers to present you a receipt varies. We had three people enter the store; the person who didn't spend much time got his receipt in 2-3 minutes, the accomplice got his in ~5 minutes, and it took Amazon a whopping 15-20 minutes to serve mine.

We can conclude that tricky interactions get sent for a human review, e.g. to Mechanical Turk, which Amazon conveniently owns. It seems that a bunch of object recognition coupled with a bit of mechanical-turking does the trick.

But it is the future

Once I've satisfied my curiosity, and managed to trick the store, I returned to use it for real.

I walked in, grabbed a bottle of water, and walked out. It took 22 seconds. I got a receipt for a bottle of water later, but I didn't even check.

Folks, this is the future.

In his article "Invisible Asymptotes", Eugene Wei attributes a lot of Amazon's achievement in winning retail consumers hearts to eliminating friction. He writes,

People hate paying for shipping. They despise it. It may sound banal, even self-evident, but understanding that was, I'm convinced, so critical to much of how we unlocked growth at Amazon over the years.

Interestingly, Eugene doesn't apply this to Amazon Go, but that's probably one visit to Seattle away. ;-) Waiting in checkout lines is the worst part of the brick-and-mortar shopping experience; that's obvious to everyone who has shopped at least once.

Therefore, Amazon Go is the future.

By the way, does anyone need a bottle of salad dressing?

Take Apart that Giant Case: A Compact, 1.5kg Thunderbolt3 eGPU Setup

A Thunderbolt 3 external GPU setup that doesn't weigh 7 pounds (3kg)? Is it even possible? Of course it is, but you'll need to do some simple tinkering. This post describes how you can do it in a couple of hours if you're so inclined. You too can lose 4 lbs off your Thunderbolt 3 eGPU within a day!

This is a Thunderbolt 3 DIY setup based on components sourced from an Akitio Node Pro (this and other Amazon links are affiliate links). It is not the first of its kind: here's an example from Dec 2016 linked to me on reddit, and there are many other well-known DIY setups for pre-TB3 tech [1] [2]. My setup weighs 1.5kg including the power supply, only 0.47kg without one, and it can fit larger video cards.

This setup does not aim to save money but to achieve superior portability, utilizing Thunderbolt 3's plug-and-play capabilities while keeping everything light and small. It cost about $400, but at least it doesn't need its own suitcase now!

I happened to speak to some industry professionals who actually know something about electronics. They suggested that removing the case might create significant EMF interference, which would manifest as wifi connectivity issues. I ran some tests and wasn't able to detect any such effect. Perhaps it only appears when you're having a LAN party with 10 of those. But if you're worried about EMF, get a Faraday bag ;-)

If you want to skip my rambling about GPUs, laptops, portability, and the good old days of the greener grass, you may scroll straight to the assembly guide and check out more pictures of the completed setup. Otherwise, read on.

And if you own a business that produces the Thunderbolt 3 enclosures, could you please pretty please just make a retail solution that weighs 2 lbs, 75% of which would be the power supply? Please?

On 4k screens, portability, and priorities

Would you believe that an employed, experienced software engineer does not own a laptop? Neither did my friends when I told them I don't own one. Boy, did it make for some awkward job interview conversations. "Let's code something on your laptop!" they'd say, and I would respond, "Oh, I don't own one," and get that suspicious "oh really" squint.

(Answer: I just don't.)

I finally gave in when gearing up for a vacation in my hometown. I recalled all the times I wanted to use a laptop: mostly when writing blog posts on an airplane (like this one), researching bike routes when traveling, and editing photos I've just taken. Many of these tasks have been mostly, but not completely replaced by smart phones (Lightroom for phones and shooting RAWs from a phone camera dealing a pretty severe blow).

my laptop at the airport

I rarely need a laptop when away from a power outlet: I'm not the sort of explorer who ventures into a remote village and emerges with a 10-page magazine article. In fact, I don't really look at the laptop that much when I travel. But when I do look at it, I demand the premium experience. Many ultralight laptops offer a 1920x1080 screen in exchange for an extra 2-3 hours of battery life... Please, my cell phone has more pixels! I settled on the HP Spectre x360 13-inch with a 4k screen.

What a gorgeous screen it is! It is easily the best display I've ever owned, and probably the best display I've ever looked at. How to make use of this artifact (well, apart from over-processing photos in Lightroom)? Play gorgeous games with gorgeous 3D graphics. Like The Witcher 3. Or DOOM (the new one, not the 1990s classics). Or Everybody's Gone to the Rapture's serene landscapes.

The problem is, for a screen this gorgeous, the Spectre's internal video card is simply... bad. The integrated Intel UHD 620 graphics does not believe in speed. After rendering just one frame of the idyllic British countryside, the video card freezes for 3 seconds, refusing to render another frame until it's done admiring the leaves, and the shades, and the reflections. It produces less than 1 FPS at best, and its 3DMark score of 313 solidly puts it in the worst 1% of computers to attempt the test.

The test doesn't favor the brave--who would attempt to 3dmark an integrated ultralight laptop video card?--but it does show you how bad the result is. How can we improve?

When my desktop PC's GeForce 660 MX saw the first frame of DOOM, it was in similar awe: confused, confronted with a task more demanding than anything it had faced before. After struggling a bit and rendering maybe three frames, the video card decided to retire, pack its things, and move to Florida, while I replaced it with the state-of-the-art-but-not-too-crazy GeForce GTX 1070. DOOM instantly became fast and fun. So the question is now obvious.

How can I connect my almighty GeForce GTX 1070 to the laptop?

DIY eGPUs (before Thunderbolt 3)

Turns out, tinkerers have been connecting external GPUs to laptops since forever. Over time, GPUs required more and more power and space, while laptops kept shrinking as fast as technology allowed and buyers demanded. The GPU power consumption trend has finally reversed, but laptops are going to get lighter still.

video card on top of the laptop.  The GPU is gigantic compared to an ultralight laptop!

A laptop is just a PC stuffed into a small plastic case, so connecting a GPU to it should be just like connecting one to a desktop. Laptop manufacturers would leave a "PCI extension slot" either featured as a supported connector or at least available inside the case for the bravest to solder onto. There are a lot of do-it-yourself external GPU (eGPU) solutions [1] [2] [3], as well as out-of-the-box ones.

But then, Intel developed a further extension of USB-C called Thunderbolt 3. (There were previous "Thunderbolt" generations too; their lightning was just meek, and the thunder went unnoticed.)

eGPU after Thunderbolt 3

Apparently, not all graphic adapters are compatible with Thunderbolt 3, or with the specific hardware I used. For example, I wasn't able to make my GeForce MX 660 Ti work with it (even before I took everything apart if you must ask). My guess would be that older video cards are not compatible. If in doubt, check this forum for compatibility reports.

Thunderbolt 3 is no magic. It's simply a standard to produce USB chips with higher wattage and throughput... so high and fast that it allows you to connect, say, displays or even graphics cards over USB. It "merely" quadruples the throughput of the USB 3 standard, and now you can do plug-and-play for your video card, eh? You would just buy a thing to plug your video card into, and connect that thing over USB. Easy!

So all I need is to buy that "thing". Easy! There are plenty of Thunderbolt 3 "things"; here, take a look:

Comparison table of eGPU enclosures with the "weight" row highlighted

Notice something? That's right: they all are freaking gigantic and weigh 3-4 times more than the ultralight laptop itself. Here, I bought one, and it's the size of an actual desktop computer I own!

The manufacturers are telling us: "Want Thunderbolt 3? Buy a 3 kg case!" Look, the Akitio Node Pro has a freaking handle! A handle!

It didn't use to be this way. Before Thunderbolt 3 enabled plug-and-play, hackers still found ways to attach an eGPU as shown above. These solutions are tiny and they cost pennies! How do we get something similar with Thunderbolt 3?

My other option here would be a smaller external GPU like the Breakaway Puck, which is indeed both smaller and cheaper. I decided against those, as I would have to buy a new GPU that's less powerful than the one I already own. Besides, the power supplies included with those are lackluster: portability is the cited reason, but they under-deliver still.

On top of that, the total weight would still be more than 1 kg, while the build would deliver significantly less power. The bigger enclosures have enough power to both charge the laptop and supply the GPU with enough wattage to churn out those FPS.

Some speculate Intel just takes a pretty big cut in the licensing fees for every Thunderbolt 3 device produced. Since they say it on the internet, it must be true. (See also here, scroll to the middle.) This explains the $200+ price. But that does not explain the 7 lbs of scrap.

It's time to take the matter into my own hands.

"When There's Nothing Left to Take Away"

...perfection is finally attained not when there is no longer anything to add, but when there is no longer anything to take away.

— "Terre des Hommes" via wikiquote. Antoine de Saint Exupéry

So we're going to turn this

A rather big Akitio Node Pro placed next to the laptop

into this

The procedure consists of two steps: disassembling the original box and mounting the internals onto a chunk of wood. Before we start, please understand the risks of this procedure and accept full responsibility for the results.

Tinkering with the device in a manner described in this post will definitely void your warranty, and putting a source of heat next to a piece of wood is likely a fire hazard. Do not attempt unless you know what you're doing.

We'll need the following ingredients:

  • Akitio Node Pro (to source the TB3 extension cards) (amazon);
  • T10 and T8 screwdrivers (to disassemble the case) (see the pic, and sample product);
  • Set of M3 standoffs (at least half an inch tall); we'll recycle the matching screws from the original assembly (amazon);
  • Chunk of wood at least 3x5 inches (to mount the assembly onto) (amazon);
  • Drill or some other way to make deep holes in the said wood;
  • Super Glue (or some other glue to stick metal to wood)

A note on choosing the enclosure

The enclosure used here was Akitio Node Pro. I mostly chose it for these reasons:

  1. a compact 500W power supply. I found the reports of its noisiness unwarranted, but I was a bit unlucky with it.
  2. existing evidence that it's easy to take apart
  3. a proven record of its compatibility with GTX 1070.
  4. powerful enough to also charge the laptop (unlike, say, the non-"Pro" Akitio Node)! I can confirm it does. You should shoot for 450-500W+ for that: the laptop charger would draw 100W, and you can look it up in the reviews.
  5. ... and last but not least it actually scored well in the reviews (see also the comparison).

Taking the case apart

Disassembling the box was fairly straightforward. I needed T8 and T10 screwdrivers, as well as the expected Phillips one. I got the T8 and T10 at an ACE Hardware brick-and-mortar store. If you're curious, hex and flat screwdrivers only got me so far, until I faced a Boss Screw: a T10 right next to a board. That's when I gave up and went to the hardware store:

Basically, just unscrew every bolt you can see, remove the part if you can; then find more screws and repeat.

This bit is tricky: you need to unplug this wire. I couldn't find anything else to unscrew, so I just extracted it using a thin Allen key.

What we need are these three parts: two boards (one with the PCI slots and the other with the USB ports) and the power supply. Look, they weigh only 1kg, as opposed to the other, non-essential parts that weigh 2.3kg.

Putting it back together (without the scrap)

The bottom board, once dismounted, reveals that it can't stand on its own and needs to be mounted on standoffs at least 1cm tall. I decided to mount the boards onto a wooden board at least 3x5 inches. This board set worked, although only 1 out of its 5 boards was rectangular (everything fits pretty snugly, so you only get one chance).

SuperGlue and the board with mounted standoffs

Practice board

Wait, how did I know where to put those, and how did I secure them? Simple: I drilled the wood and glued the standoffs. I first tried to mark the spots to drill by putting nails through the mount holes like so:

This did not work! I wasn't able to drive the nails in straight enough, and missed the correct spots by a millimeter or so. I fixed it by tilting one of the standoffs a bit, but then resorted to a different method of marking the holes: drilling right through the board!

A drill goes through the mounting hole and into the wood!

This looked sketchy but it worked better than trying to hammer nails vertically.

I used super glue to secure the standoffs to the wood. As added security, I put a bit of sawdust back into the holes for a tighter grip (ask Alex for the right way). Some mention epoxy glue, but my dad said it's unsuitable for gluing metal to wood, so I didn't research this question further (I surely didn't have any on hand, and I didn't want to go to the hardware store again).

I practiced mounting standoffs on a different board first, and I highly recommend you do this too. I only had one good board at the time, so I couldn't afford to screw it up; just get more boards and it'll be easier.

Finishing touches and results

After I plugged in the cards, the setup seemed a bit unstable. After all, that is a heavy double-slot monster of a GPU. So I added back one of the previously discarded assembly pieces and secured it with a screw I found. However, I later lost that screw while on the road, ran the setup for hours without the extra piece, and it worked well and didn't fall over (duh).

So here it is (without the power supply)

And here it is without the GPU (and without the power supply either)

The final setup weighs just 1.3 kg, including the power supply and 0.47kg without.

To improve portability, I only used the cross-head (Phillips) screws when putting things back together, and made sure no T8 or T10 screws are needed, so I can travel with a regular Phillips screwdriver. Make sure to pick a large enough screwdriver to unscrew that tricky bolt: I tried a small screwdriver one might use for watches and didn't get enough torque out of it.


And we're done. I ran a variety of tests and benchmarks. Note that I ran all benchmarks on the internal display.

3Dmark Time Spy

I ran the 3DMark Time Spy benchmark multiple times; here are the scores from one run. I also ran it on my PC (same GPU, but a different, older CPU) to check how much performance the switch from the desktop to the laptop costs.

Test                 Score   Percentile        Verdict
Laptop, no eGPU      313     bottom 1%
Laptop with eGPU     3984    better than 40%   top gaming laptop
Desktop, same GPU    5259    better than 47%   what? time to upgrade

My desktop runs Intel(R) Core(TM) i7-3770, whereas the laptop runs way more modern Core i7-8550U. However, it's known that CPUs didn't make much progress in single-core performance over the last several years; most improvements have been in portability and energy efficiency.

Unfortunately, I didn't use 3DMark Pro, so I couldn't force 4k for the desktop; it ran at a lower resolution. I suspect they'd be on par otherwise.

So it seems that the eGPU setup runs as well as the desktop setup with the same GPU (but a way older CPU).


I used the Cuda-Z software to measure throughput.

It seems the connection does churn 2Gb/s of data each way, which is good. Overall, I don't really know how to interpret the rest of the results.


I've played many hours of The Witcher 3 (2015) on max settings to great effect. There was no lag, and I enjoyed beautiful and responsive gameplay. I also played a bit of DOOM (2016) on Ultra settings, and it was a bit too much: I got 20-30 FPS when walking and in most of the fights, but several simultaneous effects lagged a bit. Non-Ultra settings were a blast though!

Both games were enjoyed in 4k resolution. It was awesome.

Portability and features

The laptop charges while playing; just make sure to get a 100W-enabled cable.

As an added bonus, Ubuntu Linux 18.04 (bionic) recognized the plugged in video card without any issues or configuration. I haven't yet tested Machine Learning acceleration but I'll update this post when I can do it.

Field Test Number one... oops!

How does this setup fare outside of benchmarks? I was able to get several hours of continuous gameplay, until the stock Akitio power supply died of overheating and never recovered. The boards, the GPU, and the laptop were OK, but the power supply just stopped working.

I can't quite tell what caused it, but I did notice that the fan didn't turn on. It used to, but this time it didn't. I must have damaged the supply when transporting it in checked luggage.

Common ATX Power Supply Replacement

This section is probably not interesting to you unless your power supply blows up too, so feel free to skip to the final Field Test section.

Instead of trying to recover, debug, and fix the dead power supply, I just bought a new one. I immediately noticed the differences between Akitio's power supply and the typical ATX power unit.

  1. Akitio's power supply is smaller than ATX and weighs less (0.8 kg vs 1.5 kg);
  2. it only has PCI-E cords, whereas an ATX unit has all of them;
  3. it has two PCI-E power connectors on separate wires; the ATX power supply I got has both attached to the same wire, and the distance between them is quite short!
  4. it turns on when the switch on the back is flipped, whereas a normal ATX power supply requires another switch to power up; motherboards usually provide that switch, but we don't have one.

Since the original device was 500W, I tried to match it; I was limited to what my local electronics outlet had in stock, and settled on a Corsair 550W modular supply, the RM550x, which featured:

  1. slightly lower weight than the alternatives (we pulled up the comparison table at the store, and even less powerful units were listed as heavier);
  2. modular wiring, so I could discard the excess wires these units usually have for all the ATX computer internals;
  3. a slightly higher price, because I didn't want to destroy another power supply.

You also need to short-circuit the power-on pins so that the switch on the back actually turns the power on. I also had to add some extenders.

Note that there are several ways to short-circuit the power supply's pins, so don't be confused if you see seemingly conflicting instructions.

However, I'm not quite satisfied with the result. The Corsair RM550x is large. While it doesn't have the cooling issues the original Akitio supply did, I feel there is a middle ground here: something not as large and more specialized.

When selecting an ATX power supply, also buy a PCI-E extender along the way. 6-pin is enough (the Akitio's "motherboard" piece is powered via a 6-pin PCI-E supply). Most likely, the dual PCI-E connectors are designed for two video cards placed right next to one another, whereas a 10-15 inch long wire is needed for our setup.

You might also be able to use a splitter instead.

Wi-fi Interference testing

I've heard that the major reason Thunderbolt 3 eGPUs come with a huge enclosure is to contain electromagnetic emissions. Supposedly, Thunderbolt 3 boards emit quite a bit of EMF radiation, and this can cause Wi-fi interference.

I wasn't able to find evidence of that. I measured Wi-fi connectivity speeds and signal strength, and I wasn't able to notice a drop. That doesn't mean there was no packet drop: perhaps the Wi-fi connection was indeed degraded, but my 100Mb/s broadband was too slow to actually be affected. It could also be that you'd need, say, 5 Thunderbolt 3 cards to emit enough EMF.

I used my cell phone to measure signal strength and the OOKLA speedtest to measure up- and download speeds. I placed the cell phone in three places: 2 ft from the Wi-fi router, 12 ft from it, and in the other room. I also put the eGPU in three states: 2 ft from the Wi-fi router, 10 ft from it, and completely off. Here's what it looked like when the eGPU was 2 ft away from the Wi-fi; you can see the Ubiquiti wi-fi "White Plate" in the top right corner:

I was running the Unigine Superposition benchmarks while measuring the signal strength and the download speeds, in case the EMF interference only appears under load.

Science log is here. The results are in the table below; each cell contains "Download speed (Mb/s) / Upload speed (Mb/s) / Signal strength (dBm)".

Phone position   eGPU off           eGPU 10ft away     eGPU 2ft away
2ft away         114 / 14.0 / -31   114 / 12.8 / -22   116 / 13.3 / -22
10ft away        118 / 14.2 / -38   119 / 13.1 / -34   116 / 13.4 / -35
Other room       110 / 13.8 / -53   116 / 13.8 / -49   116 / 13.6 / -54

So this means my Wi-fi stays pretty much unaffected regardless of eGPU presence. If there was packet drop, it didn't affect the 100 Mb/s connection.

Optional Addons

Buy a longer USB3 Thunderbolt-enabled cable ⚡

The USB cable that comes with the Akitio Node Pro is quite good but a bit too short. No wonder: a longer cable will cost you. A detachable cable that affords the required throughput needs to conform to the Thunderbolt 3 standard and support 40Gbps data throughput. I simply bought a pretty long, 6ft cable by Akitio hoping for the best compatibility, and I've had no issues with it. The Field Tests were done with that longer cable.

Putting the enclosure away reduces noise and improves mobility: you can put the setup close to the power outlet and attach to it from a different side of the table.

Use that extra fan

Akitio Node Pro had one extra fan to draw air into the case, and it is now unused.

Optionally, you can attach it to the board where it originally was. If I were to do this, I would also screw some leftover standoffs into the fan so it gets better intake; the original case had a curved bottom to keep it off the ground. However, I got good enough performance out of the video card as is.

Faraday Bag

A way to reduce EMF exposure is to put the emitter into a Faraday Cage or a special bag.

This one actually works, as probably do others; but just a month ago there were many scams on Amazon: faraday cages you were supposed to place on top of your router, which would "block EMF", improve your health, and make Wi-fi faster at the same time. 😂 The Faraday bag I got actually works (I tested it by placing a cell phone inside and calling it, to no avail). I can't tell you whether you need to use it, but maybe it could put your mind at ease.

Final Evaluation and notes on performance

It works. Moreover, the power supply doesn't overheat. I have never even seen its fan turn on. Perhaps the larger size allowed the internals to be spaced better, or perhaps it's just of better quality.

I've put in about 10 hours of gameplay on the new setup, including about 5 continuous hours. As I test it out more, I'll update this post with better long-term evaluation data.

The performance is stellar, even on the internal display. Various measurements (done by other people) show that using a laptop's internal display, or connecting an external display to the laptop (as opposed to connecting it to the video card itself), saturates some bottleneck and results in a performance drop. My findings are consistent with this blog post: with a card like the GTX 1070, you won't notice the drop because you're getting 40+ FPS anyway.

After playing Witcher 3 at "Ultra High" quality (!), with full-screen antialiasing (!!), at 4k resolution (!!!), for several hours (was hard to resist anyway), I am happy to call it a success.

Moreover, the setup survived a transcontinental flight in the checked-in luggage, wrapped in underwear and padded with socks. And now, just one challenge remains: get it through TSA in a carry-on and prove the bunch of wires is a gaming device.

Containers on Amazon made easy: AWS Elastic Beanstalk Speed Run

In the previous post, I described why I chose AWS Elastic Beanstalk for this blog. AWS Elastic Beanstalk (EBS) is a way to deploy Docker containers onto the autoscaled Amazon Cloud with zero setup. It’s similar to App Engine Flexible Environment, but slower and harder to use. But how do you actually use it?

This post contains a speed run of setting up AWS Elastic Beanstalk. It’s easy to lose your way in AWS documentation, and I hope I can make it easy for you here.

We’re going to set up a very simple application that has only one type of instance. In my case, this instance serves one Docker image with a simple web server listening on port 80. Hopefully, when this guide becomes popular, the instance will scale up (wow!). Otherwise, it will just be the cheapest thing AWS can do for some simple custom code with no disk or mutable state (aka a database).

Choose the right product (which is “Elastic Beanstalk”)

The first challenge is to not confuse it with other, less useful Amazon products. It’s harder than it seems. You do not want Amazon Elastic Container Service despite it having the word “Container” in it, while “Elastic Beanstalk” only seems to offer beans, or stalking, or both. The “Container Service” is a counterpart of EBS that requires you to set everything up manually, including your Elastic Private Network, Managed Instance Group, Elastic Load Balancer, and other Crap You Don’t Care About. On top of that, you will have to manually update Docker installations. “So uncivilized”.

Configure private Docker Registry

The next challenge is to find a way to deploy Docker containers to your private repo. You need Amazon Elastic Container Registry (this one both has the word “Container” in it and is actually useful). Create a repo for your server image (let’s call it megaserver). Optionally (later), add a “Lifecycle Policy” that deletes old images automatically. But for now, you need to configure the client.

Click on “View Push Commands”, which will show you something like this:

aws ecr get-login --no-include-email --region us-west-2

and it will direct you to AWS command-line client installation guide.

Create the Elastic Beanstalk app

Go to the Elastic Beanstalk, and click “Create New Application” in the top right corner. Choose some sensible name; it doesn’t matter. Then, inside the application, choose “Actions”, then “Create new environment”. Choose “Docker” as the platform.

Now, deploy your first version. Download and configure the eb tool, which orchestrates deployment of your local Docker images to Elastic Beanstalk. Follow the AWS-supplied user guide; just substitute the PHP they use as an example with “Docker”. Also skip the DB if you don’t need one, like I didn’t. Run eb init, and follow this guide to configure Access Keys.

sudo apt-get install -y python-dev python-pip && sudo pip install awsebcli
eb init

If eb init shows some permission error, try enabling AWSElasticBeanstalkFullAccess permission.

Make sure that your ~/.aws folder did not exist before running these commands! If you were playing with other AWS products, you might have already written something there and corrupted it. So if some auth commands don’t work, try removing the folder and running the setup again:

rm -r ~/.aws
eb init
$(aws ecr get-login)

(The last line means “run aws ecr get-login, then execute the command it printed to the console”. It prints a docker login command that authorizes docker push to upload containers to the AWS registry.)

Now, your keys should be in ~/.aws/credentials. Mine looks like this:

aws_secret_access_key = BoopBoop12345/+/BlahBlahBlahfoobar

As part of the Beanstalk command-line tool workflow, you’ll need to create ebs/ in your working folder. See here for the documentation. Here’s the file I created (note how it uses the image repository name megaserver we created above).

  {
    "AWSEBDockerrunVersion": "1",
    "Image": {
      "Name": "",
      "Update": "true"
    },
    "Ports": [
      {
        "ContainerPort": "80"
      }
    ],
    "Volumes": [],
    "Logging": "/var/log/nginx"
  }
Now, you can finally deploy your image:

docker tag boring_volhard
docker push
eb deploy --verbose

If all the permissions are correct, your image will start deploying, and it will be available within 1-2 minutes.

Update the application

When you want to deploy the next version, just repeat the same commands. You’ll see how the image is being updated on the environment page.

docker tag boring_volhard
docker push
eb deploy --verbose

If you set the Environment parameters correctly (I don’t remember if you need to change the defaults or not), it will perform a rolling update, where it would replace your running containers one-by-one.

Here’s the configuration that works for me. Note the “Rolling updates and deployments” in the middle. This website can scale to more instances based on network I/O (particularly, based on the O).

Keep the change

My bill for one micro instance is $0.85 per day, which brings it to… $26 a month. In a quirk of Amazon billing, it says I’m paying the most (65%) for the Load Balancer rather than for “running the instances” (27%), which makes me think these costs are made up anyway. Consider this the minimum price at which one can run AWS Beanstalk with Docker containers.

Here’s the resultant cost profile.

Overall, this setup worked for my blog. I hope it works for you as well.

This blog is now a Hugo-powered static website on AWS Elastic Beanstalk

Ten years ago, I thought this website should be a content platform with video hosting, blog platform, and other cool things. It was an essential learning experience: I’ve chased performance bugs and reinvented highload wheels. Today, armed with several years of Industry Experience™, I am ready to present the highest-grade advice of a seasoned web engineer.

The right architecture for this blog is a set of static webpages served off of a file system by a simple web server such as Apache or nginx. One tiny shared VM would serve way more visitors than my content could ever attract.

And I did just that. There’s no dynamic content here anymore. So… wordpress you say?

Why do this again?

I hated managing the website manually, so why did I choose to endure one more manual-ish setup? First, Wordpress is for wimps who love PHP, especially the hosted one. Second, I was going to yet again immerse myself in whatever the current hype was, in order to learn the latest cool tech. And the hype of 2017 was:

  • container-based (Docker)
  • mobile-first
  • implemented in Go
  • deployed onto Managed Cloud
  • with managed services for email, database, etc
  • …and maybe even serverless?

Everyone seems to like Amazon. I would have chosen Google Cloud Platform, of course, if I were optimizing for quality and reliability. However, I chose AWS because it’s a) the hype; b) not where I work. I’ve had enough exposure to Google Cloud as an insider, and I did want to expand my horizons.

But how would I choose what exactly to do? What should my architecture achieve? Of course, it should offer security, long-term maintainability, and… money.

Let’s optimize for… security!

My previous version of the blog ran on Gentoo Linux. It was hell, and it became unmaintainable. I rant about it at length in my previous post.

Anything that is quickly updatable would work. I used whatever Linux. I literally don’t remember which one it is; I need to look….

$ head -n 1 Dockerfile
FROM nginx:latest

What is it? I guess it’s Ubuntu or something else Debian-based, but I literally don’t care. Ok, Security–done, next.

Let’s optimize for… money!

Another motivation I had was to reduce cost. Before the migration, the single VM that was serving my shitty Ruby on Rails app cost ~\$50 / month. Gosh, I could buy a brand new computer every year… So, can I do better than shell out a whopping \$720 / year for this stupid blog nobody reads? Can I do, say, \$10 or \$20 a month?

It sometimes gets worse. As the fascinating Interview with an Anonymous Data Scientist article puts it,

…spot prices on their GPU compute instances were \$26 an hour for a four-GPU machine, and \$6.50 an hour for a one-GPU machine. That’s the first time I’ve seen a computer that has human wages

Turns out, I could get it cheaper but only if I didn’t use all the managed services. Managed is costly.

The smallest (tiniest) MySQL managed database costs \$12 / month on AWS. It supplies me with 1 CPU and 1 GB of memory. This blog doesn’t need one damn CPU dedicated to the DB! It doesn’t even need a DB! It needs to copy static pages from a basic file system to your screen!

Rabbit-holing is another problem. So what if I want an AWS-managed Git? Yes, sure! That’d be free, or \$1 a month. Plus access to the Keys; the Keys would be \$1/month. Oh, and the logging of the key usage? That would be another whatever per access unless there are 10k accesses, but don’t worry, for most workflows that’d be fine!..

Ugh. Can I get something more predictable? One way is to search for clarity, and the other is to get rid of all this.

Getting rid of this. And of that.

Turns out, I can get by on the internet with free services for code storage, bug tracking, and file, photo, and video storage. The \$1/month I pay to Google covers 100 GB of crap, which I’m yet to fill. GitHub and Youtube are the staples. I explained more on private git and other things in the previous post.

Do I even need rendering?

So what about converting human-writable rich-text formats to HTML? Wordpress would be too lame, but I can’t maintain my own rendering engine anymore. The highlight of the previous version was, of course, the custom context-free parser generator that compiled a formal grammar into Ruby code. It took a sweet 1-2 seconds (!) to render each page of this website (not a joke).

That thing burns in hell and gets replaced with Markdown.

There would be no database. The contents (i.e. the text on this and other pages) would be versioned in Git and stored on a “local” disk (disks that are attached to only 1 machine are considered local even if they are actually remote, which is how cloud architectures now work).

If I wanted to change the contents or to post something new, here’s what my workflow would look like:

  • SSH onto the server.
  • Use vim to add or edit a file in a special folder.
  • Use git to push the change into the “release” branch.
  • Run a Ruby script that would use Markdown to format all the blog pages into HTML. It would use ls to get the list of all pages and build the blog archives page. That’s not much different from a DB-based architecture: after all, databases evolved out of simple collections of files arranged into folders.
  • rsync the code onto the remote disk.

That’s it. nginx would serve the generated pages. Since making a new post invalidates all pages anyway because you’d see it pop up in the sidebar to the left, there’s even no need to be smart about it!

What a genius idea! It’s so obvious that I’m surprised nobody…

…and here’s a list of 450 static site generators


I chose Hugo because I wanted to play with Go.

Well, ok, now what do we do with Docker?

Docker can combine the local disk contents with a recipe called Dockerfile to produce a VM-like image that could serve a website.

Now, it would be a fun idea to have a self-hosted Docker image. The image would contain the Git repositories for website content and its own Docker files, and it would build itself and redeploy itself using AWS APIs. I think it could work…

But let’s start with something simpler. Let’s simply build Docker from another Docker container. It’s easy enough and it protects me from the loss of my home machine. In the end, the workflow would work like so:

  • Docker image Build contains all the build tools, including the Hugo website builder.
  • Static contents (including the images and Markdown files) are in a git-versioned local folder.
  • The Build image runs hugo to create a folder with HTML, CSS, and other files that constitute the entirety of the website.
  • Another Dockerfile describes the “Final” image, which combines nginx:latest and the static files created in the previous step.
  • The Script deploys it to Amazon Elastic Beanstalk.
  • a Makefile connects it all together.

Here’s a diagram of what it looks like:

And in the end, you get this website.

Amazon Elastic Beanstalk speed run

The speed run of how to set up autoscaled container service on Amazon Cloud is in a separate post.

A speed run is a playthrough of an otherwise fun game done as fast as possible (primarily, on video). Setting up AWS is a very fun game. For example, it's easy to set up ECS, and then discover, halfway through, that you've made some wrong choices at the beginning that are unfixable, and you have to start over.

I wrote a speed run of the AWS Container game. Check it out, and after that you can enjoy speed runs of less fun games on YouTube.

But did it work?

Yes it did.

It works, it is blazingly fast, and it’s pretty cheap for a managed platform with rolling updates. My bill is \$0.85 per day, which brings it to… \$26 a month. In a quirk of Amazon Billing, it says I’m paying the most (65%) for the Load Balancer rather than for “running the instances” (27%). All these costs are bogus anyway.

Believe me, I tried to delete the Load Balancer (this kills the service) or switch to single-instance configuration (this simply doesn't switch and quickly reverts back--sic!). I couldn't bring it below \$25, and I'm OK with that. Although, I could run this on App Engine for free...

I’ve achieved my goals: I cut costs by almost a factor of 3, and reduced page load times by a factor of 10-100.

I fixed my blog. But I set out to fix it not to blog about fixing it; I wanted to explore something more interesting. But perhaps, next time. ;-)

A Farewell to Gentoo

As I mentioned in my previous post, the previous version of this blog ran on a VM powered by Gentoo Linux. Partly, that was the reason it was such a big mess and frankly, a security hazard.

You see, I’ve become literally scared to update Gentoo. Installing updates on Gentoo is a challenging puzzle game; it is an NP-hard problem, a drain on your time, a family tragedy, and, plain and simple, a security threat. But let’s start at the very beginning, when I first saw a Linux thingy at my grad school….

At the beginning, there was Windows 3.11 for Workgroups. The first computers I interacted with ran MS-DOS or Windows 3.11. Then Windows 95, and 98, and finally Windows XP. I thought Windows was all there is.

And then I went to a CS class in college, and wham! Gentoo.

I immediately fell in love with these green ok ] marks that show when a portion of the system has completed loading. Unlike the never-ending scrollbar of Windows XP, it fosters an immediate connection with the inner workings of the machine. You feel involved. You feel in the know. You feel powerful.

So when I needed to choose a Linux distro to complete my coursework, it was Gentoo.

The main feature of Gentoo is that you build everything from source. Nothing connects you to the inner workings of the machine more than literally witnessing the gcc invocations as they churn through the kernel you manually configured, through the window manager, or through a new version of perl. That’s right, every single package–including the kernel–is rebuilt on your local machine. Why?

One thing is that you can enable unsafe optimizations and tie everything to your machine. Those off-the-shelf distros have to work on a variety of machines, but with Gentoo, you can compile everything with gcc -O3 --arch=icore7 -fenable-unsafe-life-choices.

It is insanely satisfying to watch. You haven’t lived if you’ve never seen Linux software compile. If you haven’t seen it, watch it. It’s worth it. It’s like watching fire.

Another selling point: you can disable features and Reduce Bloat™. You don’t want to build a desktop environment? Fine–all your packages will compile without the GUI bindings. You never use PHP? Fine, no PHP bindings. You don’t like bzip2? (Wait, what?) You can disable that too! You just specify it in the USE flags in your make.conf, like USE="-pgp -gtk -qt4 -bzip2", and then when you emerge your packages, they’ll build without them. (emerge is Gentoo’s apt-get install.)

Awesome. Wait, what did you say about bzip2? You can compile your system without bzip2 and only with gzip? Why do you even care? Because you’re a college kid with a lot of time on your hands. Emerge on.

So I emerge. Back in 2005, it took really long to compile KDE 3. We would leave it to compile overnight, and pray that our particular selection of USE flags didn’t make it fail.

And then you try to update it. emerge -uDpav, I still remember it. It recompiles all your updates.

… or not. If you somehow forget to update the system (e.g. you leave for a vacation, or your cron job crashes), then come back in two weeks and try to update it… it will fail to compile. That’s when you’re introduced to dependency twister.

Since the system is its own build environment, every next version should be buildable on top of the previous version. But sometimes it’s just not. It just doesn’t build. Some library is too old, but in order to compile its new version, you need to downgrade another library. Or worse, build dependencies form loops. Imagine dependency Foo needs a new version of library Bar to compile, and the new version of library Bar requires a new version of Foo–this actually sometimes happens.

Then, you’d have to resolve them by temporarily disabling or re-enabling USE flags. Or randomly rebuilding subsets of your packages (via helper tools like revdep-rebuild). Or applying the updates in the correct order, but you need to figure out the order first.

It’s 2017 and you still have to do it; nothing changed.

As a result, your system quickly rots and becomes a security hazard. A computer that hasn’t been updated for years and is open to the network is a security risk. My access logs showed that automated bots were constantly trying to hack the website (polling URLs like /wp-admin/admin.php). So that’s it. Unless a system can apply security updates quickly and reliably, it’s a security hazard. Gentoo cannot.

I got tired of playing dependency twister around the time I graduated. I also got tired of updating Ruby’s ActiveRecord every now and then. Nothing like doing this for several years to make you really appreciate App Engine and similar products.

So I bid Gentoo farewell and moved on. I moved on to Whatever Linux that makes my Docker Containers up to date… which is now I believe Ubuntu? I don’t really know and I no longer care.

Good bye, ok ]. I will miss you.

A New Look

This website just got a new look (with mobile layout), a new web hosting, and a new technology that powers it. I’ll post the summary in the coming days. And otherwise, just enjoy fewer lags and 500 errors. (Except today when I accidentally routed the traffic to the wrong Load Balancer. >_<)

I’ll be writing separate posts about these migrations, but here’s what went where.

  • Website hosting moved from the Rackspace Cloud hellscape of a single manually managed Gentoo Linux instance to Amazon Elastic Beanstalk. Manually managing a VM is bad enough; managing a Gentoo VM is its own special hell.

  • Image hosting moved to Google Photos, especially since Google Photos is not as much of a walled garden as it used to be.

  • Video hosting died–why keep it when we have Youtube and Vimeo? Yes, I had my own small video hosting service. I’m glad it died.

  • Code hosting moved to my Github page.

  • Email filters. The server used to run an IMAP filtering script that scanned my Gmail inbox and sorted mail. I’ve created filters in the Gmail interface instead.

Non-essential services have been happily shut down. And I hope you’re enjoying the new site and 0ms page load times. Shoot me an email if something’s not right.

Division by Zero? How About Multiplication by Zero

Crash due to division by zero? How about multiplication by zero?

I recently advanced to the next level of floating-point arithmetic caveats. Division by zero is something we all know about, but multiplication by zero can be just as harmful.

An algorithm computed the weights of entities, then converted them to integers, as in
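(A minimal C++ sketch of the kind of code in question; the function name and the exact clamping expression are my reconstruction, not the original production code.)

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cmath>

// Hypothetical reconstruction: compute a weight and clamp it to INT_MAX
// before converting to an integer.
int compute_weight(double x, double y) {
  double w = std::min(x * std::exp(y), (double)INT_MAX);
  int result = (int)w;
  assert(result >= 0);  // weights are never negative... right?
  return result;
}
```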

This code may fail the assertion. Imagine y is large (say, 1000), so exp(y) no longer fits into a double: its value becomes +Infinity. Surprisingly, min() will correctly understand that INT_MAX is less than +Infinity, and the clamping will work as expected. But here's when multiplication by zero kicks in.

What will happen if x is 0 and exp(y) is infinity? Any result would be mathematically nonsensical unless it's the special value "Not a Number": 0 × ∞ is NaN. min() would then return NaN as well, and the integer conversion will happily convert it to... -2147483648. The assertion fails, and the production job crashes because it does not expect the result to be negative. We're multiplying two non-negative floating-point numbers; how can the result be negative?

Yet it is. All because of multiplication by zero.

How Reader Mutexes Can Deadlock

Sample Deadlock

Translucent areas depict waiting for something; incomplete lock statements have dashed border. Note that it doesn't matter in which order the top two acquisitions are made.

Can a cryptic entanglement of your mutex locks lead to a deadlock? It sure can. Deadlocking is the second thing your parents tell you about mutexes: if one thread acquires A, then acquires B before releasing A, and another thread does the same in the reverse order, the threads may potentially deadlock. And deadlock they will if each thread completes its first acquisition before either attempts its second. Here's the code of the two threads; the sidenote depicts the problematic execution:
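(A minimal C++ sketch of those two threads with std::mutex; a reconstruction, not the original code. Running thread_x and thread_y concurrently can hang exactly as described, which is the point.)

```cpp
#include <mutex>

std::mutex A, B;

void thread_x() {
  A.lock();    // (1) X acquires A
  B.lock();    // (3) X waits for B, which Y holds...
  // ... critical section ...
  B.unlock();
  A.unlock();
}

void thread_y() {
  B.lock();    // (2) Y acquires B
  A.lock();    // (4) ...while Y waits for A, which X holds: deadlock
  // ... critical section ...
  A.unlock();
  B.unlock();
}
```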

But what about reader/writer locks, also known as "shared/exclusive" locks? Let's recap what these are first. Sometimes, to achieve greater efficiency, a mutex implementation supports two flavors of locking: reader and writer (otherwise known as "shared" and "exclusive"). If several threads only want to read from a shared variable, there's no need for each of them to wait for the others. That's where you'd use a ReaderLock operation on the mutex guarding the variable. If a thread wants to write, it invokes WriterLock, which means "do not run any readers or writers while I'm holding the lock". Here's a wiki entry for reference, and here's the standard Java API.

Seemingly OK Reader Lock Execution

We no longer have a "must-before" relation between B locks in two threads, so they don't deadlock. This looks OK, but it actually is not!

So what if both threads X and Y happen to use one of the locks as a reader lock? It seemingly should prevent deadlocking: if, say, B is a reader lock, then the execution specified above will make progress: B.ReaderLock() in thread X will not block waiting for thread Y to release it... right? Here's the code for clarity:
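(Again a sketch rather than the original code, with C++17's std::shared_mutex standing in for the reader/writer lock; B is now taken in shared mode by both threads.)

```cpp
#include <mutex>
#include <shared_mutex>

std::mutex A;         // still an exclusive lock
std::shared_mutex B;  // only ever taken in reader (shared) mode here

void thread_x() {
  A.lock();         // (1) X acquires A exclusively
  B.lock_shared();  // (3) readers don't exclude each other, so no waiting... right?
  B.unlock_shared();
  A.unlock();
}

void thread_y() {
  B.lock_shared();  // (2) Y acquires B as a reader
  A.lock();         // (4) Y waits for X to release A; X's reader lock on B proceeds
  A.unlock();
  B.unlock_shared();
}
```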

Turns out, reader locks can deadlock. You just need to make a reader lock wait for another reader lock's release; but how?

Many mutual exclusion implementations make acquiring threads "form a line" of some sort to ensure fairness: no thread should wait forever for a lock. Then a thread that tries to acquire a lock--either shared or exclusive--waits until all threads that called L.WrLock() earlier exit their critical sections. Fairness is especially important when you have reader and writer locks: if you allowed any reader lock to proceed just because another reader is already holding the lock, your writers could "starve" waiting for quiescence among readers, which may never happen on a highly contended lock.

So, to make a reader lock wait on another reader lock, we need a writer lock between them.

Deadlocking Reader Lock Execution

Here's how three threads can interleave such that you have a deadlock between reader mutex locks. The "blocked by" relationship between these reader locks transitively hops over a writer lock in some other thread Z.

Assume that, in the execution described earlier, before thread X attempts to acquire the reader lock on B, thread Z chips in and invokes B.WrLock(), and only then X calls B.RdLock(). Now X's B.RdLock() waits for Z to acquire and then release B, because of the fairness concerns discussed above. Z's B.WrLock() waits for Y to release B.RdLock(). Y waits for X to release A. No thread makes progress; here's the deadlock. Here's sample code of all three threads:
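(A sketch of all three threads, again with std::shared_mutex. Note the C++ standard does not mandate a fairness policy, so whether this actually deadlocks depends on the implementation; a writer-priority or queue-fair implementation reproduces exactly the execution above.)

```cpp
#include <mutex>
#include <shared_mutex>

std::mutex A;
std::shared_mutex B;

void thread_x() {
  A.lock();         // (1) X acquires A
  B.lock_shared();  // (4) fairness queues X's reader behind Z's pending writer
  B.unlock_shared();
  A.unlock();
}

void thread_y() {
  B.lock_shared();  // (2) Y acquires B as a reader
  A.lock();         // (5) Y waits for X to release A
  A.unlock();
  B.unlock_shared();
}

void thread_z() {
  B.lock();         // (3) Z's writer waits for Y to release its reader lock on B
  B.unlock();
}
```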

Note that you will have at least one writer lock for B somewhere (if all acquisitions were reader locks, there would be no point in having the lock at all). Therefore, the only way to prevent this kind of deadlock is to not distinguish reader and writer locks when reasoning about progress guarantees.

This kind of deadlock needs at least three threads in order to bite you, but don't dismiss it outright! Even if the Internet taught us that a million monkeys in front of typewriters will not eventually recreate the body of Shakespeare's work, they would at least trigger all possible race conditions in our typewriters, no matter how contrived the corresponding executions seem.