As I drive up to San Francisco on one of the freeways (either 101 or 280), I approach the
inevitable traffic stoppage. The freeway is clogged, and it seems I'm in for 15 minutes of bumper-to-bumper
traffic, wishing I had a self-driving car.
Casual, repeated observations confidently refuted any notion that this is random. The traffic always gets
stuck in exactly the same manner. The leftmost lane's blockage starts farthest back; then the second-to-left
lane's; and the rightmost lanes are always fastest. How come?
The conclusion, it seems, does not require a deep understanding of traffic theory: more cars arrive through
the fast lanes than through the slow lanes, and more cars means more congestion. But traffic theory can help
put mathematical notation around it, so read on.
Fundamental properties of traffic: Flow, Density, Speed
My observations suggested that a model of traffic that can explain this phenomenon needs to detach the
speed of the traffic from its other properties, such as the number of cars per unit of highway. I later
learned that traffic theory has been exploring these questions since the 1930s (here's an overview of
classical traffic flow models on Wikipedia), so I'll put my observations in these accepted terms.
If you observe a section of a lane of the freeway for some time, as enough cars pass, you'll notice that
traffic has some numerical properties.
The number of cars passing through a certain point per unit of time, which traffic engineers call
flow. Let's measure it in cars per minute, and name it q.
The average speed with which cars move, or v.
The average number of cars that are simultaneously within the segment boundaries at any given point in
time, referred to as density, or k.
We'll talk more about these in the section on traffic models, but for now we'll just use them to
discuss the question of the day.
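These three quantities are not independent: flow equals density times speed, q = k·v. A quick sanity check of the units in Python (the numbers below are illustrative, not measurements):

```python
# Sanity check of the fundamental identity of traffic flow: q = k * v.
# Units: density k in cars/km, speed v in km/h -> flow q in cars/hour.

def flow(k_cars_per_km: float, v_km_h: float) -> float:
    """Flow is density times speed."""
    return k_cars_per_km * v_km_h

# Illustrative numbers: 20 cars per km of lane, moving at 100 km/h.
q = flow(20, 100)
print(q)       # 2000 cars/hour
print(q / 60)  # about 33 cars/minute past an observer
```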
Why does the fastest lane have the longest congestion?
Let's assume a typical highway in the San Francisco Bay Area that's moving cars without congestion:
The speed limit on Californian freeways is 65 mph (~105 km/h). So when slow trucks drive in the slow right
lane, few other cars want to share the lane with them. That means that the slow lane has a disproportionately
small flow.
So while I didn't conduct scientific research on these values of q, they seem entirely explainable within
the traffic model and agree with casual observation.
We know that the Fast Lane's (the left lane's) speed will be higher. However, it is also true [in
California] that the Fast Lane will have higher flow.
It also makes sense that a slightly faster lane will move slightly more traffic (= will have a larger flow).
The disproportionately smaller flow in the slow lane is a result of the speed limit effect (see sidenote).
Now let's assume congestion develops at a certain point on the road (because the road narrows, or even spontaneously).
How many cars will get stuck in each lane (assuming no lane changes occur)? If a lane's flow is $q$, then
the number of cars passing a given point over time $T$ is the product $q\cdot T$. So if the flow incoming
into the congested zone is $q_n$ and the outgoing flow is $q_0$, then the number of cars that enter the zone
but do not get through the blockage over a period $T$ equals:
$$(q_n - q_0)\cdot T$$
It seems reasonable that lanes
with the highest traffic flow will accumulate the most cars:
indeed, more cars per unit of time enter the road before the congestion in the high-flow lane than in
the low-flow lanes. Therefore, the higher-flow lane will accumulate more cars.
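This accumulation is easy to sketch numerically. The per-lane flows below are made-up illustrative numbers, with every lane squeezing the same reduced flow through the bottleneck:

```python
# Cars queued in a lane = (incoming flow - outgoing flow) * time.
# All flows in cars/minute; the numbers are illustrative, not measured.

def queued_cars(q_in: float, q_out: float, minutes: float) -> float:
    """Cars that entered the congested zone but didn't pass the bottleneck."""
    return (q_in - q_out) * minutes

T = 15             # minutes of congestion
q_bottleneck = 10  # cars/minute each lane squeezes through the blockage

for lane, q_in in [("fast lane", 25), ("middle lane", 20), ("slow lane", 14)]:
    print(lane, queued_cars(q_in, q_bottleneck, T), "cars queued")
# fast lane 225, middle lane 150, slow lane 60:
# the highest-flow lane grows the longest queue.
```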
Note that "high-flow" does not necessarily mean "faster". Traffic flow theory and practice establish that
the highest flow is attained at a certain speed, and both higher and lower speeds lead to a decrease in
flow. That makes sense: drivers increase the distance between cars as they drive faster.
Link to the theory below.
Admittedly, this simple model ignores most of the long-term effects of the traffic. However, it does
illustrate what I observe pretty much daily: lanes with the highest flow tend to develop longer congested
segments than the lanes with the lowest flow.
So a lesson from this could be: when you see congestion ahead, merge into the slow lane.
Another lesson from this: it's sometimes beneficial to drive in a lane that has a high-flow exit.
Consider the following situation:
Here, the congestion in the first lane will be half as long as in the other lanes, because the exit
diverts a large share of that lane's flow before the blockage.
This model stops scaling though
The effects described above will likely disintegrate after several minutes, thanks to lane changes. The
distribution of the speeds, flow, and concentration will "diffuse" from the slowest lane into other lanes.
I'll devote a separate post to lane changes.
Other results of Traffic Modeling
Researchers have been studying traffic and its properties for as long as there have been cars on
city roads. Basically, the models I read about focused on two areas:
"Car follower models" that infer macroscopic traffic flow properties from the behavior of individual cars;
"Traffic flow models" that study macroscopic traffic flow properties directly, and infer relationships between them.
Note that classical fluid dynamics models (of the kind that study the flow of water in pipes) are not
applicable to traffic flow. Although fluid dynamics studies similar properties,
"such as flow velocity, pressure, density, and temperature, as functions of space and time", cars and
molecules of a fluid behave differently. Most notably, cars don't normally push one another as they
collide, so things like the Bernoulli principle do not apply; and while liquid in a pipe under
pressure accelerates at a bottleneck, car traffic decelerates.
Car follower models
Car follower models basically model the behaviors of individual cars (how the drivers accelerate, brake,
change lanes, and generally "follow" one another). For example, there's the Gipps model and Newell's
model. Diagrams like
this one show individual car tracks:
To illustrate the point I made above, notice how this model simply has "overtaking", as if it had no other
effects than one car passing another. However, on a congested freeway, there also needs to be space in
the other lane for the car to move into, and subsequently (and optionally) back into the original lane. So
this particular model does not intend to describe lane changes (which is OK; it can have other uses).
However, some other models do. In fact, a system called TRANSIMS mentions that it models behavior
on a freeway as a network of agents trying to maximize their utility, and finds a Nash equilibrium, which
becomes the solution for the steady flow of traffic.
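Going back to the simpler follower models: here is a minimal sketch of Newell-style car following, in which each car reproduces its leader's trajectory shifted back by a reaction time and a spacing offset. The parameter values are illustrative, not calibrated to real traffic:

```python
# Minimal Newell-style car following: a follower copies its leader's
# trajectory, delayed by TAU seconds and offset by D meters.
# Parameter values are illustrative, not calibrated.

TAU = 1.0  # reaction time, seconds (one simulation step)
D = 7.5    # spacing: car length plus minimum gap, meters
DT = 1.0   # simulation time step, seconds

def follow(leader, steps_back=int(TAU / DT)):
    """Trajectory of a car following `leader` (positions DT apart)."""
    return [leader[max(0, t - steps_back)] - D for t in range(len(leader))]

# The lead car cruises at 20 m/s for 10 seconds...
leader = [20.0 * t for t in range(11)]
# ...and each subsequent car shadows the one ahead of it.
car2 = follow(leader)
car3 = follow(car2)
print(car2[5])  # 72.5: the leader's position one second earlier, minus D
print(car3[5])  # 45.0: two cars back, two reaction times behind
```

Plotting such trajectories over time produces exactly the kind of "individual car tracks" diagram mentioned above.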
Traffic flow models
Early traffic flow models mostly focused on establishing the relationship between speed, density, and flow.
For example, the following diagram could be used to predict at what speed the highway reaches maximum flow.
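The classical Greenshields model (1935) is the simplest example of such a relationship: assume speed drops linearly with density, v = v_f(1 − k/k_j), so flow q = k·v is a parabola peaking at half the jam density. A sketch with illustrative parameters:

```python
# Greenshields (1935) linear speed-density model. The parameters are
# illustrative: free-flow speed ~65 mph and a guessed jam density.

V_FREE = 105.0  # free-flow speed, km/h
K_JAM = 120.0   # jam density (standstill traffic), cars/km

def speed(k):
    """v = v_f * (1 - k / k_j): speed falls linearly with density."""
    return V_FREE * (1 - k / K_JAM)

def flow(k):
    """q = k * v(k): a downward parabola in density."""
    return k * speed(k)

# The parabola peaks at half the jam density...
k_best = max(range(int(K_JAM) + 1), key=flow)
print(k_best, flow(k_best))  # 60 cars/km, 3150.0 cars/hour
# ...which corresponds to half the free-flow speed:
print(speed(k_best))         # 52.5 km/h
```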
It's reported that early traffic models did not explain spontaneous congestion
on freeways (when there's traffic without any apparent reason). I guess all it took was for freeways to
become spontaneously congested in the areas where the researchers worked. :-)
The discovery of spontaneous traffic breakdown by various researchers followed (in the late '80s to late
'90s), and the following law, called the "fundamental diagram of traffic flow", was
established: as the upstream traffic flow increases, the speed downstream increases until it reaches the
breakdown point, at which both the speed and the downstream flow start decreasing with a continued increase
of the upstream flow. It can be depicted on a neat three-dimensional diagram:
The diagram is borrowed from "Traffic Stream Characteristics" by Fred L. Hall (pdf).
My personal takeaway from this
While I'm not a traffic engineer, I set out to play with traffic simulations and see how I could
model my daily commute. While reviewing the literature, it turned out that the field has already amassed
ample knowledge about highway traffic, and there are existing open-source simulators (like
TRANSIMS) that probably already do it better.
However, I still wasn't able to find any mention of traffic lane change models. They probably exist; please
let me know if you find them!
So I think I'll pull the plug on the simulation and move on to other things. However, I will muse about
lane change modeling and dynamics a bit in the next post, and also tell the story of
the 101-92 interchange here.
Last year, I switched from the Google office in San Francisco, next to where I live, to the Mountain View
office. In search of more interesting projects and better career prospects, I repeated the move toward the
company headquarters that many engineers make. But it comes with a price: the 101.
The commute takes anywhere between 45 minutes and 2 hours one way. The spread is not a force of nature; it is
completely explained by the variations in traffic.
Series on Car Traffic modeling
Inspired by my almost daily commute on California Highway 101, I explored the achievements of traffic theory
and found answers to the most pressing mysteries of my commute.
The more time I spent on the road, the more I noticed that the traffic follows some predictable patterns.
The patterns seemed counterintuitive, though. For example:
Why does the left lane travel faster when the road is clear but seem to get stuck more when there's
traffic? (here's why)
Why does the 92-101 southbound interchange always get stuck while the road is always free right after it?
And finally, why are other lanes always faster?
Over the next several posts, I plan to explore these questions and maybe build some sort of a traffic
simulator. But let's begin!
"Why are other lanes always faster?"
Once I was late for a flight, and a colleague offered to drive me so I wouldn't have to spend time on parking.
It was a Thursday afternoon so the Bay Area traffic was in a predictable standstill. Luckily, my colleague
was an amateur race car driver, so we gave it a shot.
Racing expertise didn't seem to help. Some skills were helpful, like merging at will into a lane
that has "no space" between the cars. However, we got predictably stuck with the rest of the drivers. We
started talking about traffic, and about why we got stuck in the slowest lane.
I brought up a book on traffic that I had listened to before. "Traffic: Why We Drive the Way We Do (and What
It Says About Us)" by Tom Vanderbilt is very fitting entertainment for someone stuck in their car (I
listened to it on Amazon's Audible). Among other mysteries of traffic, the book explores the
paradox of the slowest lane in its very first chapter.
So why do other lanes seem faster? The book posits that it's an illusion (the other lanes
are just as slow), and offers the following explanations:
"Unoccupied waiting" seems longer than it actually is.
Humans hate when others "get ahead of them".
We're naturally more aware of things that move than of things that don't, so we don't notice the other
lane when it's slow.
Having heard all that, my colleague offered a simpler explanation.
"This highway has four lanes. One of them is the fastest, and there's a 75% chance we're not in it."
Let's make a model!
And I think my friend was more accurate here. Traffic lanes do differ in the time it takes to travel them. I
kept noticing, when driving down the highway, that the traffic in the left lane "bunched up" way further
back than in the "slower" right lanes!
That's why I want to come up with a mathematical model that explains my own commute experience. Here, I will
not take on modelling traffic in a big, densely interconnected city, but focus on something simpler. I'll try
to model the traffic on one single long highway (just like the 101), and see where it takes me.
Of course there will be lanes, because explaining the dissimilarity between the flow of traffic in different
lanes is the whole goal of this.
It's not hard. You need an accomplice. The accomplice swaps two items on the shelf. You take one of the
misplaced items. You get charged for the other. But that's not the point.
Two days ago, I was on a trip to Seattle, and of course I visited the Amazon Go Store. If you are
out of the loop, it's a store without checkout. You come into a store, you grab items from the shelves, you
walk out. That's it.
Amazon doesn't explain how it works, but we can infer some of it from observations.
When you walk out, you don't get a receipt instantly;
The app sends you a receipt later;
The time it takes their servers to present you a receipt varies. We had three people enter the store: the
person who didn't spend much time got his receipt in 2-3 minutes, the accomplice got his in ~5 minutes, and
it took Amazon a whopping 15-20 minutes to serve mine.
We can conclude that tricky interactions get sent for human review, e.g. to Mechanical Turk,
which Amazon conveniently owns. It seems that a bunch of object recognition coupled with a bit of mechanical-turking does the trick.
But it is the future
Once I'd satisfied my curiosity and managed to trick the store, I returned to use it for real.
I walked in, grabbed a bottle of water, and walked out. It took 22 seconds. I got a receipt for a bottle of
water later, but I didn't even check.
Folks, this is the future.
In his article "Invisible Asymptotes", Eugene Wei attributes a lot of Amazon's achievement in winning
retail consumers' hearts to eliminating friction. He writes,
People hate paying for shipping. They despise it. It may sound banal, even self-evident, but understanding
that was, I'm convinced, so critical to much of how we unlocked growth at Amazon over the years.
Interestingly, Eugene doesn't apply this to Amazon Go, but that's probably one visit to Seattle away. ;-)
Waiting in checkout lines is the worst part of the brick-and-mortar shopping experience; it's obvious to
everyone who has shopped at least once.
Therefore, Amazon Go is the future.
By the way, does anyone need a bottle of salad dressing?
A Thunderbolt 3 external GPU setup that doesn't weigh 7 pounds (3 kg)? Is it even possible? Of course
it is, but you'll need to do some simple tinkering. This post describes how you can do it in a
couple of hours if you're so inclined. You too can lose 4 lbs off your Thunderbolt 3 eGPU within a day!
This is a Thunderbolt 3-enabled DIY setup based on the Thunderbolt 3-enabled
components sourced from an Akitio Node Pro (this and
other Amazon links are affiliate links). This is not the first of its kind: here's an example from Dec
2016 linked to me on reddit, and there are many other well-known DIY setups
for pre-TB3 tech. My setup weighs 1.5 kg including the power supply,
and only 0.47 kg without one, and it can fit larger video cards.
This setup does not aim to save money, but to achieve superior portability, utilizing
Thunderbolt 3's plug-and-play capabilities while keeping it light and small. It cost about
$400, but at least it doesn't need its own suitcase now!
I happened to speak to some industry professionals who actually know something about electronics. They
suggested that removing the case might create significant EMF interference, which would manifest as Wi-fi
connectivity issues. I ran some tests and wasn't able to detect any such effect. Perhaps it
only appears when you're having a LAN party with 10 of those in one room. But if you're worried about EMF,
get a Faraday bag ;-)
And if you own a business that produces Thunderbolt 3 enclosures, could you
please, pretty please, just make a retail solution that weighs 2 lbs, 75% of which would be the power
supply?
On 4k screens, portability, and priorities
Would you believe that an employed, experienced software engineer does not own a laptop? Neither
could my friends believe it when I told them I don't own one. Boy, did it make for some awkward job
interview conversations. "Let's code something on your laptop!" they'd say, and I would respond,
"Oh, I don't own one," and get this suspicious "oh really" squint.
(Answer: I just don't.)
I finally gave in when gearing up for a vacation in my hometown. I recalled all the times I wanted
to use a laptop: mostly when writing blog posts on an airplane (like this one), researching bike
routes when traveling, and editing photos I've just taken. Many of these tasks have been mostly,
but not completely replaced by smart phones (Lightroom for phones and
shooting RAWs from a phone camera dealing a pretty severe blow).
I rarely need a laptop when away from a power outlet: I'm not the sort of explorer who ventures into
a remote village and emerges with a 10-page magazine article. In fact, I don't really look at the
laptop that much when I travel. But when I do look at the laptop, I demand the premium experience.
Many ultralight laptops offer a 1920x1080 screen in exchange for an extra 2-3 hours of battery
life... Please, my cell phone has more pixels! I settled on an HP Spectre x360 13-inch with a
4K display.
What a gorgeous screen it is! It is easily the best display I've ever owned, and probably the best
display I've ever looked at. How to make use of this artifact (well, apart from over-processing
photos in Lightroom)? Play gorgeous games with gorgeous 3D graphics. Like The Witcher
3. Or DOOM (the new one, not the 1990s classic). Or Everybody's Gone to the
Rapture's serene landscapes.
The problem is, for a screen this gorgeous, the Spectre's internal video card is simply... bad. The
integrated Intel UHD 620 graphics card does not believe in speed. After rendering just 1 frame
of the idyllic British countryside, the video card froze for 3 seconds, refusing to
render another frame until it was done admiring the leaves, and the shades, and the reflections. It
produces less than 1 FPS at best, and its 3DMark score of 313 solidly puts it in the worst 1% of
computers to attempt the test.
The test doesn't favor the brave--who would attempt to 3DMark an integrated ultralight laptop video
card?--but it does show how bad the result is. How can we improve?
When my desktop PC's GeForce 660 MX saw the first frame of DOOM, it was in similar awe:
confused, confronted with a task more demanding than it had ever faced before. After struggling a bit and
rendering maybe three frames, the video card decided to retire, pack its things, and move to Florida,
while I replaced it with the state-of-the-art-but-not-too-crazy GeForce GTX 1070. DOOM
instantly became fast and fun. So the question is now obvious.
Turns out, tinkerers have been connecting external GPUs to laptops since forever. With time, GPUs
required more and more power and space, while the possible, and hence demanded, laptop size shrank.
The GPU power consumption trend has finally been reversed, but laptops are only going to get lighter.
A laptop is just a PC stuffed into a small plastic case, so connecting a GPU should be just like
connecting it to a desktop. Laptop manufacturers would leave a "PCI extension slot" either
featured as a supported connector or at least available inside the case for the bravest to solder onto.
There are a lot of external GPU (eGPU) do-it-yourself (DIY) and out-of-the-box solutions available.
But then, Intel developed a further extension of USB-C called Thunderbolt 3. The previous
interface generations were also named "Thunderbolt"; just the lightning was meek, and the thunder quiet.
eGPU after Thunderbolt 3
Apparently, not all graphics adapters are compatible with Thunderbolt 3, or with
the specific hardware I used. For example, I wasn't able to make my GeForce MX 660 Ti
work with it (even before I took everything apart, if you must ask). My guess is that older video
cards are not compatible. If in doubt, check this forum for compatibility.
Thunderbolt 3 is no magic. It's simply a standard for USB-C chips with higher wattage
and throughput... so high and fast that it allows you to connect, say, displays or even graphics
cards over the USB-C connector. It "merely" quadruples the throughput of the USB 3 standard, and now you
can do plug-and-play for your video card. You would just buy a thing to plug your video card into, and
connect that thing to the USB port. Easy!
So all I need is to buy that "thing". Easy! There are plenty of Thunderbolt 3 "things"; here, take a look.
Notice something? That's right, they are all freaking gigantic and weigh 3-4 times more than the
ultralight laptop itself. Here, I bought one, and it's the size of an actual desktop computer I own!
The manufacturers are telling us: "Want Thunderbolt 3? Buy a 3 kg case!" Look, the Akitio Node Pro has a
freaking handle! A handle!
It didn't use to be this way. Before Thunderbolt 3 enabled plug-and-play, hackers still found ways
to attach an eGPU as shown above. These solutions are tiny and they cost pennies! How
do we get something similar with Thunderbolt 3?
My other choice here would be one of the smaller external GPUs like the Breakaway
Puck, which is indeed both smaller and cheaper. I decided against those, as I would have to
buy a new GPU that was less powerful than the GPU I already own. Besides, the power supplies
included with those are lackluster: portability is the cited reason, but they under-deliver still.
On top of that, the total weight would still be more than 1 kg, while the build would deliver significantly
less power than the larger enclosures. The bigger enclosures have enough power to both charge the laptop
and supply the GPU with enough wattage to churn out those FPS.
Some speculate that Intel just takes a pretty big cut in licensing fees for every Thunderbolt 3
device produced. Since they say it on the internet, it must be true. (See also
here; scroll to the middle.) This explains the $200+ price. But it does not
explain the 7 lbs of scrap.
It's time to take the matter into my own hands.
"When There's Nothing Left to Take Away"
...perfection is finally attained not when there is no longer anything to add, but when there is
no longer anything to take away.
So we're going to turn this box into something much lighter.
The procedure consists of two steps: disassembling the original box, and mounting the
enclosure onto a chunk of wood. Before we start, please understand the risks of this procedure, and
accept full responsibility for the results.
Tinkering with the device in the manner described in this post will definitely void your warranty, and
putting a source of heat next to a piece of wood is likely a fire hazard. Do not attempt unless you know
what you're doing.
We'll need the following ingredients:
Akitio Node Pro (to source the TB3 extension cards) (amazon);
A power supply powerful enough to
also charge the laptop (unlike, say, the non-"Pro" Akitio Node)! I can confirm the Node Pro's does. You
should shoot for 450-500W+ for that: the laptop charger will draw 100W, and you can look up your GPU's
draw in its spec sheet.
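The wattage budget is simple arithmetic. A sketch with illustrative numbers (the GPU and overhead figures are my assumptions; check your own GPU's spec sheet):

```python
# Rough power budget for the eGPU supply. All numbers are illustrative
# assumptions; look up your actual GPU's power draw in its spec sheet.

LAPTOP_CHARGING_W = 100  # USB-C laptop charging over Thunderbolt 3
GPU_W = 150              # assumed draw of a GTX 1070-class card
BOARD_AND_FAN_W = 25     # rough guess for the enclosure boards and fan
HEADROOM = 1.25          # 25% margin so the supply doesn't run maxed out

needed_w = (LAPTOP_CHARGING_W + GPU_W + BOARD_AND_FAN_W) * HEADROOM
print(round(needed_w))  # ~344 W; a 450-500 W unit leaves a comfortable margin
```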
It was fairly straightforward to disassemble the box. I needed T8 and T10 screwdrivers as
well as the expected Phillips one. I got the T8 and T10 from an ACE Hardware brick-and-mortar store.
If you're curious, hex screwdrivers and flat screwdrivers only got me so far, until I faced a Boss
Screw: a T10 right next to a board. That's when I gave up and went to the hardware store:
Basically, just unscrew every bolt you can see, remove the part if you can, then find more screws and repeat.
This bit is tricky: you need to unplug this wire. I couldn't find any more ways to unscrew anything
else, so I just extracted it using a thin Allen key.
What we need are these three parts: two boards (one with the PCI slots and the other with the USB ports) and
the power supply. Look, they weigh only 1 kg, as opposed to the other, non-essential parts that weigh 2.3 kg.
Putting it back together (without the scrap)
The bottom board, once dismounted, reveals that it can't stand on its own and needs to be mounted on
at least 1 cm standoffs. I decided to mount them onto a wooden board, which needs to be at least 3x5.
This board set worked, although only 1 of the 5 boards was rectangular (it fits
pretty snugly, so you only have one chance).
Wait, how did I know where to put those, and how did I secure them? Simple: I drilled the wood and
glued the standoffs. I first tried to mark the spots to drill by putting nails through the mount
holes like so:
This did not work! I wasn't able to drive the nails in straight enough, and missed the correct
spot by a millimeter or so. I fixed it by tilting one of the standoffs a bit, but I resorted to
a different method of marking the holes: drilling right through the board!
This looked sketchy, but it worked better than trying to hammer nails in vertically.
I used super glue to secure the standoffs to the wood. For added security, I put a bit of sawdust
back into the holes for a tighter grip. (Ask Alex for the right way. Some mention epoxy glue, but
my dad said it's unsuitable for gluing wood to metal, so I didn't research this question
further; I surely didn't have any, and I didn't want to go to the hardware store again.)
Practice mounting the standoffs onto a spare board first; I highly recommend it. I only had one board at
the time, so I couldn't afford to screw it up, but if you just get more boards, it'll be easier.
Finishing touches and results
After I plugged in the cards, the setup seemed a bit unstable. After all, that is a heavy, double-slot
monster of a GPU. So I added back one of the assembly pieces previously discarded and secured it
with a screw I found. However, I later lost that screw while on the road, ran the setup
for hours without the extra piece, and it worked well and didn't fall over (duh).
So here it is (without the power supply)
And here it is without the GPU (and without the power supply either)
The final setup weighs just 1.3 kg including the power supply, and 0.47 kg without it.
In order to improve portability, I used only the Phillips screws when putting things back, and made sure
that no T8 or T10 screws are needed, so I can travel with a regular Phillips screwdriver. Make sure
to pick a large enough screwdriver to unscrew that tricky bolt: I tried a small
screwdriver one might use for watches and couldn't get enough torque out of it.
And we're done. I've run a variety of tests and benchmarks. Note that I ran all benchmarks with the
3Dmark Time Spy
I ran the 3DMark Time Spy benchmark multiple times; here are the scores from one run. I also ran it on my PC
(same GPU, but a different, older CPU) to check if switching from the PC to a laptop made a difference.
My desktop runs an Intel(R) Core(TM) i7-3770, whereas the laptop runs the way more modern Core i7-8550U.
However, it's known that CPUs haven't made much progress in single-core performance over the last several
years; most improvements have been in portability and energy efficiency.
Unfortunately, I didn't use 3DMark Pro, so I couldn't force 4K for the desktop; it ran at a lower
resolution. I suspect they'd be on par otherwise.
So it seems that the eGPU setup runs as well as the desktop setup with the same GPU (but a way older CPU).
I used the CUDA-Z software recommended by egpu.io to measure throughput.
It seems the connection does churn 2 Gb/s of data each way, which is good. Overall, I don't really know how
to interpret the results.
I've played many hours of The Witcher 3 (2015) on max settings, to great effect. There was no lag, and
I enjoyed beautiful and responsive gameplay. I also played a bit of DOOM (2016) on Ultra settings, and it
was a bit too much (I got 20-30 FPS when walking and in most of the fights, but several simultaneous effects
lagged a bit). Non-Ultra settings were a blast, though!
Both games were enjoyed in 4K resolution. It was awesome.
Portability and features
The laptop charges while playing; just make sure to get a 100W-enabled cable.
As an added bonus, Ubuntu Linux 18.04 (bionic) recognized the plugged-in video card without any
issues or configuration. I haven't yet tested machine learning acceleration, but I'll update this post when
I can do it.
Field Test Number one... oops!
How does this setup fare outside benchmarks? I was able to get several hours of continuous gameplay
until the stock Akitio power supply died of overheating and never recovered. The boards, the
GPU, and the laptop were OK, but the power supply just stopped working.
I can't quite tell what prompted it, but I did notice that the fan didn't turn on. It used to, but
it didn't anymore. I must have damaged the supply when transporting it in my checked luggage.
Instead of trying to recover, debug, and fix the dead power supply, I just bought a new one. I
immediately noticed the differences between Akitio's power supply and a typical ATX power unit:
Akitio's power supply is smaller than ATX and weighs less (0.8 kg vs 1.5 kg);
...it only has PCI-E cords, whereas ATX has all of them;
...it has two PCI-E power connectors on different wires. The ATX
power supply I got has the two of them attached to the same wire, and the distance between them is quite
short;
...it turns on when the switch on the back is flipped, whereas a normal ATX power supply requires
another switch to power up, which motherboards usually provide but we don't have.
Since the original device was 500W, I tried to match it and settled on a 550W unit my local store had in
stock. You can choose anything that works; I was limited to what my local electronics outlet had
in stock. So I bought a Corsair RM550x, a 550W modular supply, which featured:
slightly smaller weight than the alternatives (we pulled up the comparison table at the store,
and even less powerful units were listed as heavier);
modular wiring, so I could discard the excess wires these units usually include for the ATX computer internals;
a somewhat higher price, which I paid because I didn't want to destroy another power supply.
You will also need to short-circuit the pins so that the power switch actually turns the power on. I also
had to add some extenders.
Note that there are several ways to short-circuit the power supply's pins, so don't be
confused if you see seemingly conflicting instructions.
However, I'm not quite satisfied with the result. The Corsair RM550x is large. While it doesn't have the
cooling issues the original Akitio supply did, I feel there is a middle ground here: something not as
large and more specialized.
When selecting an ATX power supply, also buy a PCI-E extender
along the way. A 6-pin one is enough (the Akitio "motherboard" piece is powered via a 6-pin PCI-E plug).
Most likely, the dual PCI-E connectors are designed for two video cards placed right next to one
another, whereas our setup needs a wire about 10-15 inches long.
You might also be able to use a splitter instead.
Wi-fi Interference testing
I've heard the major reason Thunderbolt 3 eGPUs come with a huge enclosure is to contain electromagnetic
field emissions. Supposedly, Thunderbolt 3 boards emit quite a bit of EMF radiation, and this can cause
Wi-fi interference.
I wasn't able to find evidence of that. I measured Wi-fi connection speeds and signal strength, and I
wasn't able to notice a drop. That doesn't mean there was no packet drop: perhaps the Wi-fi connection
was indeed degraded, but my 100 Mb/s broadband was too slow for it to actually affect the speeds. It also
could be that you'd need, say, 5 Thunderbolt 3 cards to emit enough EMF.
I used my cell phone to measure signal strength, and the OOKLA speedtest to measure up- and download speeds.
I placed the cell phone in three places: 2 ft from the Wi-fi router, 12 ft from the router, and in the other
room. I also put the eGPU in three states: 2 ft from the router, 10 ft from the router, and completely
off. Here's what it looked like with the eGPU 2 ft away from the router; you can see the Ubiquiti Wi-fi
"White Plate" in the top right corner:
I was running the Unigine Superposition benchmarks while measuring the signal strength and the download
speeds, in case the EMF interference only appears under load.
Science log is here. The results are in the table below; each cell contains "Download speed
(Mb/s) / Upload speed (Mb/s) / Signal strength (dB)".
| Phone location | eGPU off | eGPU 10 ft away | eGPU 2 ft away |
|---|---|---|---|
| 2 ft from router | 114 / 14.0 / -31 | 114 / 12.8 / -22 | 116 / 13.3 / -22 |
| 10 ft from router | 118 / 14.2 / -38 | 119 / 13.1 / -34 | 116 / 13.4 / -35 |
| Other room | 110 / 13.8 / -53 | 116 / 13.8 / -49 | 116 / 13.6 / -54 |
So this means my Wi-fi stays pretty much unaffected regardless of eGPU presence. If there was packet drop, it
didn't affect the 100 Mb/s connection.
Buy a longer USB-C Thunderbolt 3 cable ⚡
The cable that comes with the Akitio Node Pro is quite good but a bit too short. No
wonder: a longer cable will cost you. A detachable cable that affords the required throughput needs
to conform to the Thunderbolt 3 standard and support 40 Gbps of data throughput. I simply bought a
pretty long, 6 ft cable by Akitio, hoping for the best compatibility, and I've had no issues
with it. The Field Tests were done on that longer cable.
Putting the enclosure away reduces noise and improves mobility: you can put the setup close to the
power outlet and attach to it from a different side of the table.
Use that extra fan
The Akitio Node Pro had one extra fan to draw air into the case, and it is now unused.
Optionally, you can attach it to the board where it originally was. If I were to do this, I would
also screw some leftover standoffs into the fan so it gets better intake: the original case was
curved to keep the fan off the ground. However, I got good enough performance out of the video card without it.
A way to reduce EMF exposure is to put the emitter into a Faraday cage or a special Faraday bag.
This one actually works, as probably do others, but just a month ago there were many scams on
Amazon: faraday cages that you were supposed to place on top of your router, and they would "block EMF", improve your
health, and make your Wi-fi faster at the same time. 😂 This Faraday
bag actually works (I tested it by placing a cell phone inside and calling it, to no avail). I can't tell you that you
have to use it, but maybe it could put your mind at ease.
Final Evaluation and notes on performance
It works. Moreover, the power supply doesn't overheat: I have never seen its fan turn on.
Perhaps the larger size allowed the internals to be spaced out better, or perhaps it's just of better
quality. I've put in about 10 hours of gameplay on the new setup, including about 5 continuous hours. As I
test it out more, I'll update this post with better long-term evaluation data.
The performance is stellar, even with the internal display. Various measurements (done by other
people) show that using the internal display on a laptop, or connecting an external display to the
laptop (as opposed to connecting it to the video card itself), saturates some bottleneck and results
in a performance drop. My findings are consistent with this blog post on egpu.io: with
a card like the GTX 1070, you won't notice the drop because you're getting 40+ FPS anyway.
After playing Witcher 3 at "Ultra High" quality (!), with full-screen antialiasing (!!), at 4k
resolution (!!!), for several hours (was hard to resist anyway), I am happy to call it a success.
Moreover, the setup survived a transcontinental flight in checked luggage, wrapped in
underwear and padded with socks. And now just one challenge remains: to get it through TSA in a
carry-on and prove that this bunch of wires is a gaming device.
This post contains a speed run of setting up AWS Elastic Beanstalk. It’s easy to lose your way in AWS
documentation, and I hope I can make it easy for you here.
We’re going to set up a very simple application that has only one type of instance. In my case, this instance
serves one Docker image with a simple web server listening on port 80. Hopefully, when this guide becomes
popular, the instance will scale up (wow!). Otherwise it will just be the cheapest thing AWS can do for
some simple custom code with no disk or mutable state (aka a database).
Choose the right product (which is “Elastic Beanstalk”)
The first challenge is to not confuse it with other, less useful Amazon products. It’s harder than it
seems. You do not want Amazon Elastic Container Service, despite it having the word “Container”
in it, while “Elastic Beanstalk” only seems to offer beans, or stalking, or both. The “Container Service” is a
counterpart of Elastic Beanstalk that requires you to set everything up manually, including your Virtual Private Cloud,
Managed Instance Group, Elastic Load Balancer, and other Crap You Don’t Care About. On top of that, you will
have to manually update Docker installations. “So uncivilized”.
Configure private Docker Registry
The next challenge is to find a way to deploy Docker containers to your private repo. You need Amazon
Elastic Container Registry (this one both has the word “Container” in it and is actually useful). Create a
repo for your server image (let’s call it megaserver). Optionally (later), add a “Lifecycle Policy” that
deletes old images automatically. But for now, you need to configure the client.
Click on “View Push Commands”, which will show you something like this:
Go to the Elastic Beanstalk, and click “Create New Application” in the top right corner. Choose some
sensible name; it doesn’t matter. Then, inside the application, choose “Actions”, then “Create new
environment”. Choose “Docker” as the platform.
Make sure that your ~/.aws folder does not exist before running these commands! If you were playing with
other AWS products, you might have already written something there and corrupted it somehow. So if some auth
commands don’t work, try removing the folder and then authenticating again:
rm -r ~/.aws
$(aws ecr get-login)
(The last line means “run aws ecr get-login, then run the command it printed to the console”. It prints a
docker login command that authorizes docker push to put containers into the AWS registry.)
Now, your keys should be in ~/.aws/credentials. Mine looks like this:
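A redacted sketch of what that file contains (the actual key values come from your IAM user):

```ini
[default]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```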
As part of the Beanstalk Command-line tool workflow, you’ll need to create ebs/Dockerrun.aws.json in
your working folder. See here for documentation. Here’s the file I created (note how it uses
the image repository name megaserver we created above).
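A sketch of that file, reconstructed from the repository name and port used in this post (double-check the field names against the Dockerrun.aws.json version 1 documentation):

```json
{
  "AWSEBDockerrunVersion": "1",
  "Image": {
    "Name": "12345567890.dkr.ecr.us-west-2.amazonaws.com/megaserver:latest",
    "Update": "true"
  },
  "Ports": [
    {
      "ContainerPort": "80"
    }
  ]
}
```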
docker tag boring_volhard 12345567890.dkr.ecr.us-west-2.amazonaws.com/megaserver:latest
docker push 12345567890.dkr.ecr.us-west-2.amazonaws.com/megaserver:latest
eb deploy --verbose
If all the permissions are correct, your image will start deploying and will be available within 1-2 minutes.
Update the application
When you want to deploy the next version, just repeat the same commands. You’ll see how the image is
being updated on the environment page.
docker tag boring_volhard 12345567890.dkr.ecr.us-west-2.amazonaws.com/megaserver:latest
docker push 12345567890.dkr.ecr.us-west-2.amazonaws.com/megaserver:latest
eb deploy --verbose
If you set the Environment parameters correctly (I don’t remember if you need to change the defaults or not),
it will perform a rolling update, replacing your running containers one by one.
Here’s the configuration that works for me. Note the “Rolling updates and deployments” in the middle. This
website can scale to more instances based on network I/O (particularly, based on the O).
Keep the change
My bill for one micro instance is $0.85 per day, which brings it to… $26 a month. In a quirk of Amazon
Billing, it says I’m paying the most (65%) for the Load Balancer rather than for “running the instances”
(27%), which suggests these numbers are made up anyway. Consider this the minimum price at
which one can run AWS Beanstalk with Docker containers.
Ten years ago, I thought this website should be a content platform with video hosting, blog
platform, and other cool things. It was an essential learning experience: I’ve chased performance
bugs and reinvented highload
wheels. Today, armed with several years of Industry Experience™, I
am ready to present the highest-grade advice of a seasoned web engineer.
The right architecture for this blog is a set of static webpages served off of a file system by a simple web
server such as Apache or nginx. One tiny shared VM would serve way more visitors than my content could ever attract.
And I did just that. There’s no dynamic content here anymore. So… WordPress, you say?
Why do this again?
I hated managing the website manually, so why did I choose to endure one more manual-ish setup? First, WordPress is
for wimps who love PHP, especially the hosted kind. Second, I wanted to yet again immerse myself in whatever the
current hype was, in order to learn about the latest cool tech. And the hype of 2017 was:
implemented in Go
deployed onto Managed Cloud
with managed services for email, database, etc
…and maybe even serverless?
Everyone seems to like Amazon. I would have chosen Google Cloud Platform, of course, if I were to
optimize for quality and reliability. However, I chose AWS because it’s a) the hype; b) not where
I work. I’ve had enough exposure to Google Cloud as an insider, and I did want to expand my horizons.
But how would I choose what exactly to do? What should my architecture achieve? Of course, it
should optimize for security, long-term maintainability, and… money.
My previous version of the blog ran on Gentoo Linux. It was hell, and it became unmaintainable. I rant about
it at length in my previous post.
Anything that is quickly updatable would work. I used Whatever Linux. I literally don’t remember what it
is; I need to look…
$ head -n 1 Dockerfile
What is it? I guess it’s Ubuntu or something else Debian-based, but I literally don’t care.
Let’s optimize for… money!
Another motivation I had was to reduce the cost. Before the migration, the single VM that was
serving my shitty Ruby on Rails app cost ~\$50 / month. Gosh, I could buy a brand new computer
every year… So, can I do better than shelling out the whopping
\$720 / year for this stupid blog nobody reads? Can I do, say, \$10 or \$20 / month?
…spot prices on their GPU compute instances were \$26 an hour for a four-GPU machine, and \$6.50 an hour for a one-GPU machine. That’s the first time I’ve seen a computer that earns human wages.
Turns out, I could get it cheaper, but only if I didn’t use all the managed services. Managed is costly.
The smallest (tiniest) managed MySQL database costs \$12 / month on AWS. It supplies me with 1 CPU
and 1 GB of memory. This blog doesn’t need a damn CPU dedicated to the DB!
It doesn’t even need a DB! It needs to copy static pages from a basic file system to your screen!
Rabbit-holing is another problem. So what if I want an AWS-managed Git? Yes, sure! That’d be free,
or \$1 a month. Plus access to the Keys. The Keys would be \$1 / month. Oh, and the logging of the
key usage? That would be another whatever-per-access unless there are under 10k accesses, but don’t worry,
for most workflows that’d be fine!..
Ugh. Can I get something more predictable? One way is to seek clarity, and the other is to get rid of things.
Getting rid of this. And of that.
Turns out, I can get by on the internet with free services for code storage, bug tracking, and file,
photo, and video storage. The \$1 / month I pay to Google covers 100 GB of crap, which I’m yet to fill.
GitHub and YouTube are the staples. I’ve explained more on private Git and other things elsewhere.
Do I even need rendering?
So what about converting human-writable rich-text formats to HTML? WordPress would be too lame,
but I can’t justify my own rendering engine anymore. The highlight of the previous version was, of course,
the custom context-free parser generator that compiled a formal grammar into Ruby code.
It took its sweet 1-2 seconds (!) to render each page of this website (not a joke).
That thing burns in hell and gets replaced with Markdown.
There would be no database. The contents (i.e. the text on this and other pages) would be versioned
in Git and stored on a “local” disk (disks that are attached to only 1 machine are considered local
even if they are actually remote, which is how cloud architectures now work).
If I wanted to change the contents or to post something new, here’s what my workflow would look like:
SSH onto the server
Use vim to add or edit a file in a special folder.
Use git to push the change into the “release” branch.
Run a Ruby script that uses Markdown to format all the blog pages
into HTML. It uses ls to get the list of all pages and builds the [blog archives][archives]
page. That’s not much different from a DB-based architecture: after all, databases evolved out of
simple collections of files arranged into folders.
rsync the code onto the remote disk.
That’s it. nginx would serve the generated pages. Since making a new post invalidates all pages
anyway (you’d see it pop up in the sidebar to the left), there’s no need to be smart about cache invalidation.
What a genius idea! It’s so obvious that I’m surprised nobody…
Docker can combine the local disk contents with a recipe called Dockerfile to produce a VM-like
image that could serve a website.
Now, it would be a fun idea to have a self-hosted Docker image. The image would contain the Git
repositories for website content and its own Docker files, and it would build itself and redeploy
itself using AWS APIs. I think it could work…
But let’s start with something simpler. Let’s simply build Docker from another Docker container. It’s
easy enough and it protects me from the loss of my home machine. In the end, the workflow would work like so:
The “Build” Docker image contains all the build tools, including the Hugo website builder.
Static contents (including the images and Markdown files) are in a git-versioned local folder.
The Build image runs hugo to create a folder with HTML, CSS, and other files that constitute the
entirety of the website.
Another Dockerfile describes the “Final” image, which combines [nginx:latest][nginx-docker] and the static
files created in the previous step.
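That “Final” Dockerfile can be tiny. A sketch, assuming Hugo writes its output into `public/`:

```dockerfile
# Combine stock nginx with the static files generated by the Build image
FROM nginx:latest
COPY public/ /usr/share/nginx/html/
```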
The speed run of how to set up autoscaled container service on Amazon Cloud is in a separate post.
A speed run is a playthrough of an otherwise fun game done as fast as possible (primarily, on video). Setting up AWS is a very fun game. For example, it's easy to set up ECS, and then discover, halfway through, that you've made some wrong choices at the beginning that are unfixable, and you have to start over.
It works, it is blazingly fast, and it’s pretty cheap for a managed platform with rolling updates. My bill is
\$0.85 per day, which brings it to… \$26 a month. In a quirk of Amazon Billing, it says I’m paying the most
(65%) for the Load Balancer rather than for “running the instances” (27%). All these costs are bogus anyway.
Believe me, I tried to delete the Load Balancer (this kills the service) or switch to single-instance
configuration (this simply doesn't switch and quickly reverts back--sic!). I couldn't bring it
below \$25, and I'm OK with that. Although, I could run this on App Engine for free...
I’ve achieved my goals: I cut costs almost threefold, and cut page load times by a factor of 10-100.
I fixed my blog. But I set out to fix it not to blog about fixing it; I wanted to explore something more
interesting. But perhaps, next time. ;-)
As I mentioned in my previous post, the previous version of this blog ran on a VM
powered by Gentoo Linux. Partly, that was the reason it was such a big mess and frankly,
a security hazard.
You see, I’ve become literally scared to update Gentoo. Installing Gentoo updates is a challenging
puzzle game; in fact, it’s an NP-hard problem. It is a drain on your time, it’s a
family tragedy, and it is plain and simple a security threat. But let’s start at the very beginning,
when I first saw a Linux thingy at my grad school…
At the beginning, there was Windows 3.11 for Workgroups. The first computers I interacted with ran
MS-DOS or Windows 3.11. Then Windows 95, and 98, and finally Windows XP. I thought Windows was all there was.
And then I went to a CS class in college, and wham! Gentoo.
I immediately fell in love with those green [ ok ] marks that show when a
portion of the system has completed loading. Unlike the never-ending scrollbar of Windows XP, it
fosters an immediate connection with the inner workings of the machine. You feel involved. You feel
in the know. You feel powerful.
So when I needed to choose a Linux distro to complete my coursework, it was Gentoo.
The main feature of Gentoo is that you build everything from sources. Nothing connects you to
the inner workings more than literally witnessing the gcc invocations as they churn through the
kernel you manually configured, through the window manager, or through a new version of Perl. That’s
right, every single package–including the kernel–is rebuilt on your local machine. Why?
One thing is that you can enable unsafe optimizations and tie everything to your machine. Those
off-the-shelf distros have to work on a variety of machines, but with Gentoo, you can compile
everything with gcc -O3 -march=native -fenable-unsafe-life-choices.
It is insanely satisfying to watch. You haven’t lived if you’ve never seen Linux software compile.
If you haven’t seen it, watch it. It’s worth it. It’s like watching fire.
Another selling point, you can disable the features and Reduce Bloat™. You don’t want to build a
desktop environment? Fine–all your packages will compile without the GUI bindings. You never use
PHP? Fine, no PHP bindings. You don’t like bzip2? (Wait, what?) You can disable that too!
You just specify it in the USE flags in your make.conf, like USE="-pgp -gtk -qt4
-bzip2", and then when you emerge your packages, they’ll build without them. (emerge is the
Gentoo’s apt-get install).
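For illustration, the relevant part of /etc/portage/make.conf might look like this (the exact flag set is hypothetical):

```
CFLAGS="-O2 -march=native"
CXXFLAGS="${CFLAGS}"
USE="-pgp -gtk -qt4 -bzip2"
```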
Awesome. Wait, what did you say about Bzip2? You can compile your system without bzip and only with
gzip? Why do you even care? That’s because you’re a
college kid with a lot of time on your hands. Emerge on.
So I emerge. Back in 2005, it took really long to compile KDE 3. We would leave it
overnight to compile, and pray that our particular selection of USE flags would not make it fail.
And then you try to update it. emerge -uDpav, I still remember it. It recompiles all your outdated packages…
… or not. If you somehow forget to update the system (e.g. you leave for a vacation, or your cron job
crashes), then come back in two weeks and try to update it, it will fail to compile. That’s when you’re introduced to dependency twister.
Since the system is its own build environment, every next version should be buildable on top of the
previous version. But sometimes it’s just not. It just doesn’t build. Some library is too old,
but in order to compile a new version, you need to downgrade another library. Or worse, build dependencies
form loops. Imagine dependency Foo needs a new version of library Bar to compile, and the new
version of library Bar requires a new version of Foo–this actually happens sometimes.
Then you’d have to resolve them by temporarily disabling or re-enabling USE flags, or randomly
rebuilding subsets of your packages (via helper tools like revdep-rebuild), or applying the
updates in the correct order–but you need to figure out the order first.
As a result, your system quickly rots and becomes a security hazard. A computer that hasn’t been
updated for years and is open to the network is a security risk. My access logs showed that
automated bots were constantly trying to hack the website (polling URLs like /wp-admin/admin.php).
So that’s it. Unless a system can apply security updates quickly and reliably, it’s a security
hazard. Gentoo cannot.
I got tired of playing dependency twister around the time I graduated. I also got tired of trying to
update Ruby’s ActiveRecord every now and then. Nothing like doing this for several years to really
make you appreciate App Engine and similar products.
So I bid Gentoo farewell and moved on. I moved on to Whatever Linux that makes my Docker Containers
up to date… which is now I believe Ubuntu? I don’t really know and I no longer care.
This website just got a new look (with mobile layout), a new web hosting, and a new technology that powers it. I’ll post the summary in the coming days. And otherwise, just enjoy fewer lags and 500 errors. (Except today when I accidentally routed the traffic to the wrong Load Balancer. >_<)
I’ll be writing separate posts about these migrations, but here’s what went where.
Crash due to division by zero? How about multiplication by zero?
I recently advanced to the next level of floating-point arithmetic caveats. Division by zero is something we all know about, but multiplication by zero can be just as harmful.
An algorithm computed weights of entities, then converted them to integers, as in:
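The snippet boiled down to something like this (a minimal C++ reconstruction; the function and variable names are mine, only the pattern is from the original):

```cpp
#include <algorithm>  // std::min
#include <cassert>
#include <climits>    // INT_MAX
#include <cmath>      // std::exp, std::isnan

// Clamp x * exp(y) to the int range and convert to an integer.
int compute_weight(double x, double y) {
    // For large y, std::exp(y) overflows to +Infinity.
    // If x == 0, the product 0 * Inf is NaN, and std::min with NaN
    // as its first argument returns NaN here.
    double weight = std::min(x * std::exp(y), (double)INT_MAX);
    int result = (int)weight;  // converting NaN to int is formally undefined;
                               // on x86 it typically yields -2147483648
    assert(result >= 0);       // fires when x == 0 and y is large
    return result;
}
```

With x = 0 and y = 1000, weight becomes NaN and the assertion fails, exactly as described next.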
This code may fail the assertion. Imagine y is large (say, 1000), so exp(y) no longer fits into a double; its value will be +Infinity in this case. Surprisingly, the comparison will correctly determine that INT_MAX is less than +Infinity, and the check will pass as expected. But here's where multiplication by zero kicks in.
What will happen if x is 0 and exp(y) is infinity? Any result would be mathematical nonsense, unless it's the special value "Not a Number" (NaN)–and indeed, 0 × ∞ is NaN. min() will also return NaN, and the integer conversion will happily convert it to… -2147483648. The assertion fails, and the production job crashes because it does not expect the result to be negative. We're multiplying two non-negative floating-point numbers; how can the result be negative?
Yet it is. All because of multiplication by zero.
Translucent areas depict waiting for something; incomplete lock statements have dashed border. Note that it doesn't matter in which order the top two acquisitions are made.
Can a cryptic entanglement of your mutex locks lead to a deadlock? It sure can. Deadlocking is the second thing your parents tell you about mutexes: if one thread acquires A, then acquires B before releasing A, and the other does the same in the reverse order, the threads may potentially deadlock. And deadlock they will if the first two acquisitions are picked from separate threads. Here's the code of two threads, and sidenote depicts the problematic execution:
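A sketch in pseudocode (the Lock/Unlock names follow the post's notation):

```
// Thread X                      // Thread Y
A.Lock();                        B.Lock();
B.Lock();    // waits for Y      A.Lock();    // waits for X
// ... critical section ...      // ... critical section ...
B.Unlock();                      A.Unlock();
A.Unlock();                      B.Unlock();
```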
But what about reader/writer locks, also known as "shared/exclusive" locks? Let's recap what these are first. Sometimes, to achieve greater efficiency, a mutex implementation supports two flavors of locking: reader and writer (otherwise known as "shared" and "exclusive"). If several threads only want to read from a shared variable, there's no need for each of them to wait for the others. That's where you'd use a ReaderLock operation on the mutex guarding the variable. If a thread wants to write, it invokes WriterLock, which means "do not run any readers or writers while I'm holding the lock". Here's a wiki entry for reference, and here's the standard Java API.
Seemingly OK Reader Lock Execution
We no longer have a "must-before" relation between the B locks in the two threads, so they don't deadlock. This looks OK, but it actually is not!
So imagine that both threads X and Y happen to use one of the locks as a reader lock. It seemingly should prevent deadlocking: if, say, B is a reader lock, then the execution specified above will make progress: B.ReaderLock() in thread X will not block waiting for thread Y to release it… right? Here's the code for clarity:
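In the same pseudocode notation, with B taken as a reader lock in both threads:

```
// Thread X                      // Thread Y
A.Lock();                        B.ReaderLock();
B.ReaderLock();                  A.Lock();    // waits for X
// ... critical section ...      // ... critical section ...
B.ReaderUnlock();                A.Unlock();
A.Unlock();                      B.ReaderUnlock();
```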
Turns out, reader locks can deadlock. You just need to make one reader lock wait for another reader lock's release. But how?
Many mutual exclusion implementations make acquiring threads "form a line" of some sort to ensure fairness: no thread should wait forever for a lock. So a thread that tries to acquire a lock–either shared or exclusive–waits until all threads that called L.WrLock() earlier exit their critical sections. Fairness is especially important when you have reader and writer locks: if you allowed any reader to proceed while another reader is holding the lock, your writers could "starve" waiting for quiescence among the readers, which may never happen on a highly contended lock.
So, to make a reader lock wait on another reader lock, we need a writer lock between them.
Deadlocking Reader Lock Execution
Here's how three threads can interleave such that you have a deadlock between reader mutex locks. The "blocked by" relationship between these reader locks transitively hops over a writer lock in some other thread Z.
Assume that, in the execution described earlier, before thread X attempts to acquire the reader lock on B, thread Z chips in and invokes B.WrLock(), and only then does X call B.RdLock(). Because of the fairness concerns discussed above, X's B.RdLock() starts to wait for Z to acquire and then release B. Z's B.WrLock() waits for Y to release its B.RdLock(). Y, in turn, waits for X to release A. No thread makes progress; there's the deadlock. Here's sample code for all three threads:
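A sketch of the three threads, with the deadlocking interleaving spelled out:

```
// Thread X          // Thread Y          // Thread Z
A.Lock();            B.RdLock();          B.WrLock();
B.RdLock();          A.Lock();            // ...
// ...               // ...               B.WrUnlock();
B.RdUnlock();        A.Unlock();
A.Unlock();          B.RdUnlock();

// Deadlocking interleaving:
//   Y: B.RdLock()  -- granted
//   X: A.Lock()    -- granted
//   Z: B.WrLock()  -- waits for Y to release B
//   X: B.RdLock()  -- fairness: queued behind Z's WrLock, waits for Z
//   Y: A.Lock()    -- waits for X
//   Cycle X -> Z -> Y -> X; no thread makes progress.
```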
Note that you will have at least one writer lock for B somewhere (if all acquisitions were reader locks, there would be no point in having the lock at all). Therefore, the only way to prevent this kind of deadlock is to not distinguish between reader and writer locks when reasoning about progress guarantees.
This kind of deadlock needs at least three threads in order to bite you, but don't dismiss it outright! If the Internet taught us that a million monkeys in front of typewriters will not eventually recreate the body of Shakespeare's work, it also taught us that they would at least trigger all possible race conditions in our typewriters, no matter how contrived the corresponding executions seem.