Super Volatile

Krzysztof Szafranek's link blog

Hi, I'm Krzysztof and I make websites.
When I'm not making websites, I read these.
Apr 9, 2012 / 2:35pm

Just how big are porn sites?

To put that 800Gbps figure into perspective, the internet only handles around half an exabyte of traffic every day [10], which equates to around 50Tbps — in other words, a single porn site accounts for almost 2% of the internet’s total traffic. There are dozens of porn sites on the scale of YouPorn, and hundreds that are the size of ExtremeTech or your favorite news site. It’s probably not unrealistic to say that porn makes up 30% of the total data transferred across the internet.

So much for the “internet is made of kittens” myth.
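
The arithmetic in the quote is easy to check. A minimal sketch, assuming an exabyte means 10^18 bytes and that traffic is spread evenly over the day (both are my assumptions, not stated in the article):

```python
# Back-of-the-envelope check of the quoted traffic figures.
# Assumptions: decimal exabyte (10**18 bytes), traffic evenly spread over 24 hours.
EXABYTE_BYTES = 10**18
SECONDS_PER_DAY = 24 * 60 * 60


def daily_bytes_to_bps(bytes_per_day):
    """Convert a daily byte volume to an average rate in bits per second."""
    return bytes_per_day * 8 / SECONDS_PER_DAY


internet_bps = daily_bytes_to_bps(0.5 * EXABYTE_BYTES)  # "half an exabyte" per day
site_bps = 800e9                                        # the quoted 800Gbps figure
share = site_bps / internet_bps

print(f"Internet average: {internet_bps / 1e12:.1f} Tbps")  # 46.3 Tbps
print(f"Single site share: {share:.1%}")                    # 1.7%
```

So "around 50Tbps" and "almost 2%" both hold up as roundings.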

Filed under: scalability  
Nov 20, 2011 / 1:33pm

StackExchange Architecture Updates - Running Smoothly, Amazon 4x More Expensive

Notable for their scale-up architecture, you might expect with their growth that they would slam into a wall. Not so. They've been able to scale-up the power of individual servers by adding more CPU and RAM. SSD has been added in some cases. Even their flagship StackOverflow product runs on a single server. New machines have been bought, but very few.

Interesting details of the architecture that powers Stack Overflow and Stack Exchange.

Filed under: scalability   stack overflow  
Sep 13, 2011 / 10:02pm

Amazon is More Interesting than Google

But the world has changed, and Google can’t seem to keep up. Amazon has become the polar opposite of Google, empowering every developer on the planet to make incredible technology. Want MapReduce? Amazon has you covered. Want to play with terabytes of data like it ain’t no thing? Check. Want to launch thousands of servers to handle a tough computation? Check, check, and check. Want to launch thousands of human brains to solve otherwise unassailable problems? No problem. Heck, want to simply send email to your users? They have that too.

Amazon has embraced a much more pragmatic and commercially driven approach to technology than Google. While Google may seem more open, it may eventually lose the hearts of developers through its inability to commercialize the results of its R&D, effectively reducing them to impractical toys.

Filed under: amazon   google   scalability  
Mar 26, 2011 / 1:18pm

Anatomy of a Crushing

A sustainable, credible business model is a big feature.
more on pinboard.in

A scalability story from the trenches. Apart from technical information, there are some interesting observations on the relationship between pricing model and the need to scale.

Filed under: scalability  
Mar 1, 2011 / 11:39pm

Don't scale: 99.999% uptime is for Wal-Mart

To go from 98% to 99% can cost thousands of dollars. To go from 99% to 99.9% tens of thousands more. Now contrast that with the value. What kind of service are you providing? Does the world end if you’re down for 30 minutes?

This advice is obvious to anyone who has shipped (or failed at) some real projects. Yet there's the nerd's urge for perfection, which more often than not brings extra delays, wasted money and doom.
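
To see what each extra nine actually buys, here is a rough sketch that turns an availability percentage into allowed downtime per year. It assumes a 365-day year; the function name is mine, not from the article.

```python
# Allowed downtime per year for a given availability percentage.
# Assumption: a 365-day (non-leap) year.
HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(availability_percent):
    """Hours per year a service may be down and still meet the target."""
    return (100 - availability_percent) / 100 * HOURS_PER_YEAR


for pct in (98, 99, 99.9, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime allows {hours:.2f} hours ({hours * 60:.0f} minutes) down per year")
```

At 98% you may be down for roughly 7.3 days a year, at 99.9% for under 9 hours, and at 99.999% for about 5 minutes. Whether closing those gaps is worth the money is exactly the question the quote asks.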

Filed under: scalability   software development  
Jan 22, 2011 / 10:50pm

Why does Quora use MySQL as the data store rather than NoSQLs such as Cassandra, MongoDB, CouchDB, etc?

The primary online data store for an application is the worst place to take a risk with new technology. If you lose your database or there's corruption, it's a disaster that could be impossible to recover from. If you're not the developer of one of these new databases, and you're one of a very small number of companies using them at scale in production, you're at the mercy of the developer to fix bugs and handle scalability issues as they come up.
more on quora.com

A Quora developer explains why they chose MySQL instead of following the NoSQL fad.

Filed under: nosql   scalability  
Dec 21, 2010 / 2:46pm

The Full Stack, Part I

I'm using approximate powers-of-ten here to make the mental arithmetic easier. The actual numbers are less neat. When dealing with very large or very small numbers it's important to get the number of zeros right quickly, and only then sweat the details. Precise, unwieldy numbers usually don't help in the early stages of analysis.
more on facebook.com

A Facebook engineer shares interesting hints on estimating the scalability of an architecture. The post also illustrates well the power of a bottom-up understanding of the system.
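
A small sketch of the powers-of-ten trick from the quote: round every input to its nearest power of ten, then divide by subtracting exponents. The request volume below is a made-up number for illustration, not a figure from the article.

```python
import math


def magnitude(x):
    """Exponent of the power of ten nearest to x (rounded in log space)."""
    return round(math.log10(x))


# Hypothetical workload, chosen only for illustration.
requests_per_day = 3e8
seconds_per_day = 86_400

# Dividing powers of ten is just subtracting exponents: 10^8 / 10^5 = 10^3.
estimate_exponent = magnitude(requests_per_day) - magnitude(seconds_per_day)
estimate = 10 ** estimate_exponent
exact = requests_per_day / seconds_per_day

print(f"Estimate: ~10^{estimate_exponent} = {estimate} requests/second")
print(f"Exact:    {exact:.0f} requests/second")
```

The estimate says about a thousand requests per second, while the exact figure is a few thousand. That is the right number of zeros, which is all the early stage of an analysis needs.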

Filed under: performance   scalability  
Dec 16, 2010 / 10:32pm

5 Lessons We’ve Learned Using AWS

3. The best way to avoid failure is to fail constantly.

We’ve sometimes referred to the Netflix software architecture in AWS as our Rambo Architecture. Each system has to be able to succeed, no matter what, even all on its own. We’re designing each distributed system to expect and tolerate failure from other systems on which it depends.

If our recommendations system is down, we degrade the quality of our responses to our customers, but we still respond. We’ll show popular titles instead of personalized picks. If our search system is intolerably slow, streaming should still work perfectly fine.

Netflix's developers share their experiences with Amazon Web Services at massive scale. One tip that sounds good: have a fallback that still serves some data when a service you depend on fails.
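
The fallback idea could be sketched like this. It is only an illustration of the pattern: every name here is hypothetical, and the `service_up` flag stands in for a real network call that may fail. It is not Netflix's code.

```python
# Graceful degradation: serve popular titles when personalization is unavailable.
# All names are hypothetical; `service_up` simulates a dependency being up or down.
POPULAR_TITLES = ["Popular Title A", "Popular Title B", "Popular Title C"]


class RecommendationServiceDown(Exception):
    """Raised when the (simulated) recommendation service cannot respond."""


def fetch_personalized(user_id, service_up=True):
    """Stand-in for a remote call to a recommendation service."""
    if not service_up:
        raise RecommendationServiceDown("recommendation service unavailable")
    return [f"Pick {n} for {user_id}" for n in (1, 2, 3)]


def titles_for(user_id, service_up=True):
    """Return personalized picks, or popular titles if the dependency fails."""
    try:
        return fetch_personalized(user_id, service_up)
    except RecommendationServiceDown:
        # Lower quality, but the user still gets a response.
        return list(POPULAR_TITLES)


print(titles_for("alice"))
print(titles_for("alice", service_up=False))
```

The caller never sees the failure. It just gets a less tailored answer, which matches the "we still respond" behaviour described in the quote.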

Filed under: scalability  
Sep 9, 2010 / 8:09am

As Digg Struggles, VP Of Engineering Is Shown The Door

The new version of Digg, v4, is based on a distributed database called Cassandra, which replaced the MySQL database the site ran on before. Cassandra is very advanced—it is supposed to be faster and scale better—but perhaps it is still too experimental. Or maybe it’s just the way Digg implemented it (Twitter uses Cassandra, although not for its main data store, as does Facebook in places, but it obviously is not as battle-tested as it needs to be). Every engineer at Digg is currently just trying to keep the site up and running.

Quinn was the main champion of moving over to Cassandra, say our sources.

This article is just speculative, but if it's true, then it would be another story of a silver bullet that backfired. Reminds me of Twitter's struggle with Rails and Reddit's with Lisp.

Filed under: nosql   scalability  
Apr 2, 2010 / 12:33pm

Why I’ll never own another server – stu.mp

If you run huge amounts of servers AWS can be a few hundred thousand more by comparison on raw numbers that compare cost of your own hardware to cost of AWS. The problem with this vanilla comparison is it forgets one extremely important cost for startups – opportunity cost.
more on stu.mp

The founder and CTO of SimpleGeo talks about the hidden costs of not using the cloud.

Filed under: aws   scalability