Roberto Vitillo's Blog

Understanding Distributed Systems 1.1

March 25, 2021

Coordination is expensive as it reduces the availability and performance of distributed applications (PACELC theorem). I have extended…

Leader election best practices

December 26, 2020

Sometimes a single process in a system needs to have special powers, like being the only one that can access a shared resource or assign…

How distributed systems fail

December 05, 2020

At scale, any failure that can happen will eventually happen. Hardware failures, software crashes, memory leaks - you name it. The more…

The costs of microservices

November 22, 2020

An application typically starts its life as a monolith. Take a modern backend of a single-page JavaScript application, for example - it…

I am writing a book

June 21, 2020

I have released the first chapter of Understanding Distributed Systems! "Wait, what? Weren't you working on a video class?" - I hear you…

Back of the envelope estimation hacks

May 19, 2020

There are two types of engineers: the ones that can quickly do estimates and the ones that can't. Are these people just smarter, or is there…

Don't trust default timeouts

May 03, 2020

Modern applications don't crash; they hang. One of the main reasons for it is the assumption that the network is reliable. It isn't. When…

I am back

April 24, 2020

It's been three years since my last post on my old WordPress blog! You can still find my earlier posts there as I was too lazy to move…

Differential privacy for dummies

July 29, 2016

Technology allows companies to collect more data, in more detail, about their users than ever before. Sometimes that data is sold to…

How to review a data analysis

July 18, 2016

Writing good code is hard; writing a good analysis is harder. Peer review is an essential tool to fight repetitive errors, omissions and…

Counting at scale

April 12, 2016

How engaged are users for a certain segment of the population? How many users are actively using a new feature? One way to answer that…

Monoids for analytics

January 16, 2016

This is a short post on the elegance of using abstract algebra for analytics in Scala. A monoid is a set that is closed under an…

Spark best practices

June 30, 2015

Spark execution model: Spark's simplicity makes it all too easy to ignore its execution model, and still manage to write jobs that eventually…