Twitter, Rails, Hammers, and 11,000 Nails per Second
There’s an interesting kerfuffle going on regarding the scaling woes that Twitter.com is going through, especially since it’s built on Ruby on Rails. Here’s the original interview with one of the Twitter coders, the somewhat evasive reply by the lead Rails architect, and Mark Pilgrim’s cruel-but-funny dissection of the latter.
Ruby is a lovely language and Rails is a lovely framework, but both of them trade performance for aesthetics and convenience. That is, they’re slow. No problem, though — in the magical world of web apps, you just solve performance problems by throwing hardware at the problem, usually in the form of more and more web servers. (If I sound sarcastic, I’m just envious because I write client-side software, where we don’t have that convenient fallback.)
Then, since 99.99% of web apps use a SQL database, the next bottleneck is the database server that all the web servers are talking to. The next step of course is to buy more database servers, but then you have to distribute/replicate the database across those servers, which is even more challenging. (Or if you’re a crazed wunderkind like LiveJournal founder Brad Fitzpatrick, you invent a memory-based distributed hashtable as a cache to put in front of the database.)
SQL as the universal hammer
This whole thing has me wondering why it is that SQL databases are used as the all-purpose hammer for solving all data storage problems for web apps. To some degree this has to be a historical accident, because I don’t think it’s necessarily obvious. My guess is that (a) a lot of companies already used these databases for storing their, uh, data; (b) in the mid-’90s all these companies were desperate to Get On The Web; so (c) countless web developers wrote “three-tier” frameworks for gluing those SQL databases onto web servers.
I’ve been using SQL a lot (via sqlite, as a data store for the Syndication framework) and it’s quite nice. But I don’t think SQL — or any sort of query-based database system — is the right tool for every job, in particular these kinds of social-messaging apps like Twitter.
“11,000 requests per second”
The figure that stands out to me is “…up to 11,000 requests per second”. Jesus Christ, that’s a lot. Where does that come from? Even assuming a million members who each post ten times a day, that’s only about 100 posts per second. If each member runs a client app polling for updates every 15 minutes, 24 hours a day, that’s still only about 1,000 hits/sec. There’s still an order of magnitude unaccounted for. (Maybe the apps poll every 1.5 minutes?)
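The back-of-the-envelope numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable and function names are made up for illustration, and the inputs are the guesses from the paragraph above, not measured Twitter figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Guesses taken from the paragraph above, not real Twitter data.
members = 1_000_000
posts_per_member_per_day = 10
poll_interval_minutes = 15

posts_per_second = members * posts_per_member_per_day / SECONDS_PER_DAY

polls_per_member_per_day = 24 * 60 / poll_interval_minutes
polls_per_second = members * polls_per_member_per_day / SECONDS_PER_DAY


def poll_interval_for(target_hits_per_second, member_count):
    """Polling interval (in minutes) at which member_count clients
    would generate target_hits_per_second requests."""
    seconds_between_polls = member_count / target_hits_per_second
    return seconds_between_polls / 60


print(f"posts/sec: {posts_per_second:.0f}")
print(f"polls/sec: {polls_per_second:.0f}")
print(f"interval for 11,000 hits/sec: {poll_interval_for(11_000, members):.2f} min")
```

Under these assumptions the posting rate rounds to a bit over 100 per second, polling to a bit over 1,000 per second, and a million clients would have to poll roughly every minute and a half to reach 11,000 requests per second.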
But think about the polling again. Each of these requests just boils down to the client asking “what messages did I get since the last time I asked?” The SQL way of answering that question is to run a complicated join that looks up all of the members I’m subscribed to, looks up all the messages of those members, and selects the ones whose timestamp is later than the date of the client’s last request. Sure, there are indexes, and databases are optimized for this stuff, but it’s still going on a thousand times a second.
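For concreteness, here is roughly what that polling query might look like, using Python’s built-in sqlite3 module. The schema (the subscriptions and messages tables and their columns) is entirely hypothetical; I have no idea what Twitter’s actual tables look like.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE subscriptions (subscriber TEXT, author TEXT);
        CREATE TABLE messages (
            id INTEGER PRIMARY KEY,
            author TEXT,
            body TEXT,
            posted_at INTEGER
        );
        CREATE INDEX messages_by_author_time ON messages (author, posted_at);
        """
    )
    return conn


def poll(conn, subscriber, since):
    """Answer "what did I get since the last time I asked?" with a join."""
    rows = conn.execute(
        """
        SELECT m.author, m.body
        FROM subscriptions AS s
        JOIN messages AS m ON m.author = s.author
        WHERE s.subscriber = ? AND m.posted_at > ?
        ORDER BY m.posted_at
        """,
        (subscriber, since),
    )
    return rows.fetchall()


conn = make_db()
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?)",
    [("alice", "bob"), ("alice", "carol")],
)
conn.executemany(
    "INSERT INTO messages (author, body, posted_at) VALUES (?, ?, ?)",
    [("bob", "old", 10), ("bob", "new", 20), ("carol", "hi", 30), ("dave", "unseen", 40)],
)
print(poll(conn, "alice", 15))
```

Every poll repeats the same lookup work, even when nothing new has arrived for that member.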
Now, a more sane way to do this (if I may be so bold) is to keep a message queue for every member, and whenever someone posts a new message, copy a pointer to that message into the queue of every member who’s subscribed. Then when someone polls, you just have to get the messages out of their queue and send them along. If that sounds kind of familiar, it’s because that’s exactly how email works — the Twitter posting interface is like a listserv, and the polling interface is like a POP server.
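A minimal in-memory sketch of that per-member queue idea, in Python. The class and method names are mine, invented for illustration, and it ignores persistence and concurrency entirely; it only shows the shape of the work moving from poll time to post time.

```python
from collections import defaultdict, deque


class Fanout:
    """Per-member message queues, filled at post time (hypothetical sketch)."""

    def __init__(self):
        self.messages = []                   # message store; the list index is the "pointer"
        self.subscribers = defaultdict(set)  # author -> members subscribed to that author
        self.queues = defaultdict(deque)     # member -> pending message pointers

    def subscribe(self, subscriber, author):
        self.subscribers[author].add(subscriber)

    def post(self, author, body):
        # Store the message once, then copy only its pointer into each queue.
        self.messages.append((author, body))
        pointer = len(self.messages) - 1
        for member in self.subscribers[author]:
            self.queues[member].append(pointer)

    def poll(self, member):
        # No join and no timestamp comparison: hand over the queue and empty it.
        queue = self.queues[member]
        delivered = [self.messages[pointer] for pointer in queue]
        queue.clear()
        return delivered


feed = Fanout()
feed.subscribe("alice", "bob")
feed.post("bob", "hello")
feed.post("carol", "alice is not subscribed to me")
print(feed.poll("alice"))
print(feed.poll("alice"))
```

The first poll returns bob’s message and the second returns nothing, because the queue was emptied by the first.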
The cool thing is that you don’t even need a database to do this. You can store all of this stuff using very simple file formats, with one file per message queue. Since this is so much simpler than triple-decker SQL joins, one server can handle more requests; and if that’s too much, you can use a networked filesystem (or even a SAN) to distribute it across servers.
And of course you could keep going with this novel “distributed” idea, and make this into an actually-distributed system, where individual users (or groups of them) can run their own Twitter servers that queue their incoming messages and relay their outgoing ones. Imagine!
“We’re all doomed,” sighed Eeyore
The trouble is, the idea of centralizing is so entrenched, and there’s so much industry support for “scalability” by means of piling on more servers and moving to bigger data-centers with fatter pipes, that I can see that this is going to go on for some time. (I’ve heard that AIM handles over a billion IMs a day, every one of which streams in and out of an unfathomably fat pipe in a huge AOL data center in Virginia. Shudder.)
In some moods (the same sorts of moods where I fantasize about how $5-per-gallon gas would kick-start desperately needed changes in our transportation system) I think about what would happen if all the DNS servers mysteriously vanished, all the data centers blew up, and we had to re-implement everything in a distributed peer-to-peer fashion. It’s actually really interesting …
April 18th, 2007 at 5:54 PM
Good reminder to not always assume a relational database is the answer. I am such a newb when it comes to web development, that when I bite off some small piece of functionality for my site, the question is usually “how did somebody else do it,” and the answer almost always involves SQL and a database.
I recently added support for providing a discount price to a relatively small (hundreds) number of customers, who had purchased a previous version of the app. My first thoughts again turned to a database. How would I design the schema, etc. Finally I realized it was ridiculous, and popped all of the lines into a text file, which gets searched at request time. My biggest concern was that I was handling the access-mutexing as correctly as a database would, but overall the solution was greatly simplified.
April 18th, 2007 at 8:16 PM
Look at Prevayler and HAppS, two systems that don’t use a database at all. In-memory persistence with write-ahead logging, and they handle give-or-take 1000 hits/s on a stock Xeon server.
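To illustrate the general idea of in-memory state backed by a write-ahead log, here is a toy sketch in Python. It is not Prevayler’s or HAppS’s API; the LoggedStore class and its methods are invented for this example, and it skips snapshots and concurrency.

```python
import json
import os
import tempfile


class LoggedStore:
    """Toy key-value store: state lives in memory, every change is logged first."""

    def __init__(self, log_path):
        self.data = {}
        # On startup, rebuild the in-memory state by replaying the log.
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as log:
                for line in log:
                    self._apply(json.loads(line))
        self._log = open(log_path, "a", encoding="utf-8")

    def _apply(self, command):
        key, value = command
        self.data[key] = value

    def set(self, key, value):
        # Write-ahead: the change reaches the disk before memory is touched.
        self._log.write(json.dumps([key, value]) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self._apply([key, value])

    def get(self, key):
        return self.data.get(key)

    def close(self):
        self._log.close()


log_path = os.path.join(tempfile.mkdtemp(), "store.log")
store = LoggedStore(log_path)
store.set("greeting", "hello")
store.close()

# A "restart": a fresh object recovers the state from the log alone.
recovered = LoggedStore(log_path)
print(recovered.get("greeting"))
recovered.close()
```

Reads never touch the disk; only writes pay for the log append.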
April 18th, 2007 at 9:26 PM
Please remember that I am very old (and in black and white) but these pages are very hard to read.
April 18th, 2007 at 10:00 PM
Certain large ecommerce sites make use of BDB as a read-only caching layer. Since most Twitter access is read-only, it’d make sense for them to move to that model as well, since replicating from SQL into BDB is fairly quick and reading from BDB is lightning-fast.
April 18th, 2007 at 10:31 PM
I’m starting to work on a web app that is expected to need to store about 200 or 300 simple records a year. I suggested that we just throw those records in a text file and be done with it.
The program manager insisted on an Oracle database and assumes I don’t know SQL. Why else would someone store data in a text file?
April 18th, 2007 at 10:56 PM
Sorry to pop your bubble, but you seem to be concentrating on one specific tradeoff while ignoring many others which are just as important (if not more). Let’s look, for example, at the write/read ratios and required IO rates for your suggestion. In your sample scenario (1M members * 10 posts), let’s assume each post has 15 subscribers who each poll 24 times a day. So with a database, we have 1M*10=10M inserts per day, and 1M*15*24 = 360M queries a day.
Now, when using queues we will need 1M*10 * 15 queues = 150M inserts per day. Notice that you have upped the storage requirements by a factor of 15 (!!), not to mention the increased IO throughput needed to support writing all that data.
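The write-amplification arithmetic in this comment can be spelled out as a short Python sketch. The inputs are the comment’s own assumptions (15 subscribers per post), and the variable names are made up for illustration.

```python
# Inputs as assumed in the comment above, not measured figures.
members = 1_000_000
posts_per_member_per_day = 10
subscribers_per_post = 15

posts_per_day = members * posts_per_member_per_day

# Database model: each post is written once.
database_inserts_per_day = posts_per_day

# Queue model: each post adds one entry to every subscriber's queue.
queue_inserts_per_day = posts_per_day * subscribers_per_post

write_amplification = queue_inserts_per_day / database_inserts_per_day

print(f"{database_inserts_per_day:,} inserts/day with a database")
print(f"{queue_inserts_per_day:,} inserts/day with queues")
print(f"{write_amplification:.0f}x more writes")
```

Whether storage grows by the same factor depends on what a queue entry holds: a full copy of the message, or only a pointer to it as the original post suggests.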
What about locking issues? Assuming that each member subscribes to 15 others, each queue must support synchronization and locking of up to 15 simultaneous publishers, a nontrivial task to implement performantly. And of course, the system must do all of the above while still not blocking the 360M read requests coming in.
Let’s add in a few more real-world requirements to the mix (what happens when a disk or an entire server goes down?) - fault tolerance, incremental backup/restore, manageability.
Bottom line - If you don’t buy a database you will end up writing one.
April 18th, 2007 at 11:09 PM
As somebody who started plugging databases into web sites in around ‘95 maybe I can shed some light on why it is so popular, because back then there wasn’t any precedent to go on so we chose our solutions based on whatever tools we happened to have to hand.
Many people did use the file system for handling storage in the way that you describe. The problem is that a web application is a multi-threaded application and an SQL server solves the concurrency problems.
With your solution there are all sorts of subtle contention problems on those files. If two requests are trying to write to the same person’s queue how do you handle it? If somebody else is trying to read it at the same time what happens? If you’ve updated three queues but fail on the next how do you handle this?
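As a rough illustration of the concurrency control you would have to write yourself, here is a sketch in Python that guards a per-member queue file with an advisory lock. It assumes a Unix system (the fcntl module is not available on Windows) and a local filesystem, and the function names are invented for this example. It covers the first two questions only; a failure partway through updating several queues is not handled at all.

```python
import fcntl
import os
import tempfile


def append_to_queue(queue_path, line):
    """Append one message to a queue file while holding an exclusive lock."""
    with open(queue_path, "a", encoding="utf-8") as queue_file:
        fcntl.flock(queue_file, fcntl.LOCK_EX)  # blocks while another process holds the lock
        try:
            queue_file.write(line + "\n")
            queue_file.flush()  # make sure the data is written before unlocking
        finally:
            fcntl.flock(queue_file, fcntl.LOCK_UN)


def drain_queue(queue_path):
    """Return all queued messages and empty the file, under the same lock."""
    with open(queue_path, "a+", encoding="utf-8") as queue_file:
        fcntl.flock(queue_file, fcntl.LOCK_EX)
        try:
            queue_file.seek(0)
            lines = queue_file.read().splitlines()
            queue_file.truncate(0)
        finally:
            fcntl.flock(queue_file, fcntl.LOCK_UN)
    return lines


queue_path = os.path.join(tempfile.mkdtemp(), "alice.queue")
append_to_queue(queue_path, "bob: hello")
append_to_queue(queue_path, "carol: hi")
print(drain_queue(queue_path))
print(drain_queue(queue_path))
```

Note that the lock covers the whole file, so one slow reader stalls every writer to that queue, and nothing here makes an update to several queues atomic.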
These are all questions that SQL databases answer for us and the answers they give are both good and, just as importantly, easy to use. The reason why web applications are so easy to build is because the SQL server abstracts all of the difficult multi-threading issues for us.
It isn’t that we can’t use flat file databases and it isn’t that they wouldn’t be more appropriate in many cases, it is that if we use them we have to write our own concurrency controls and if you’ve ever tried that you know just how hard it is to get right.
The SQL server takes all of that pain away and lets us write web apps as if we were writing single-threaded applications. That’s a very powerful abstraction to get with such low cost.
April 19th, 2007 at 1:04 AM
Even better, a folder format, with one file per message. One’s messages would be in username/messages, and one’s queue would be in username/queue, and one’s friends list would be in username/friends. Each of these would be a directory: the messages directory would contain regular files, the friends directory would contain symlinks to other user directories, and the queue directory would contain hard links to regular files in other users’ messages directories.
Adding a message to someone’s queue would consist of hard-linking it (the ln command, without the -s option); removing a message from someone’s queue would consist of unlinking it (the rm command, with no options at all).
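The same layout can be sketched in Python with os.link and os.unlink, which do what ln and rm do. The function names are hypothetical, and the sketch assumes message file names are unique across all users and that every directory lives on one filesystem (hard links cannot cross filesystems).

```python
import os
import tempfile


def post_message(root, author, message_id, body):
    """Write a message as a regular file under author/messages."""
    messages_dir = os.path.join(root, author, "messages")
    os.makedirs(messages_dir, exist_ok=True)
    path = os.path.join(messages_dir, message_id)
    with open(path, "w", encoding="utf-8") as message_file:
        message_file.write(body)
    return path


def link_into_queue(root, subscriber, message_path):
    """Add a message to a subscriber's queue by hard-linking it (like ln)."""
    queue_dir = os.path.join(root, subscriber, "queue")
    os.makedirs(queue_dir, exist_ok=True)
    os.link(message_path, os.path.join(queue_dir, os.path.basename(message_path)))


def read_and_unlink(root, subscriber):
    """Read every queued message, then remove it from the queue (like rm)."""
    queue_dir = os.path.join(root, subscriber, "queue")
    bodies = []
    for name in sorted(os.listdir(queue_dir)):
        path = os.path.join(queue_dir, name)
        with open(path, encoding="utf-8") as message_file:
            bodies.append(message_file.read())
        os.unlink(path)  # drops only this link; the author's copy remains
    return bodies


root = tempfile.mkdtemp()
message_path = post_message(root, "bob", "0001", "hello")
link_into_queue(root, "alice", message_path)
print(read_and_unlink(root, "alice"))
print(os.path.exists(message_path))
```

Unlinking the queue entry leaves the original in bob’s messages directory untouched, since the file’s data stays until its last link is gone.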
If the Twitter people are reading: You’re welcome to take this idea and use it. ☺
April 19th, 2007 at 1:04 AM
Paul Graham makes the point that they used the Unix file system for storage when building Viaweb: http://www.paulgraham.com/vwfaq.html
However, I believe Yahoo! Store uses an RDBMS now.
April 19th, 2007 at 3:21 AM
I don’t see what the problem is, I really don’t.
It’s Twitter’s problem, and it’s a design error.
Have they not considered partitioning the data over several databases, file systems and/or cache layers? If statistical analysis of twitter usage is not part of the core requirement, then partitioning strategies are very easy to come up with when you look at what the queries are trying to obtain.
This just strikes me as a problem of design, and not technology. If your data storage design fails to scale, then don’t blame the tools.
April 19th, 2007 at 3:52 AM
“Then, since 99.99% of web apps use a SQL database,”
I think zope and plone have a bigger share than 0.01%!!
April 19th, 2007 at 4:49 AM
@DavidK
The Rails philosophy is NOT having that design freedom (convention over … etc). So you are saying that it is not only a technical problem that Rails does not scale; it is a fundamental problem in the Rails philosophy!!!
April 19th, 2007 at 6:18 AM
The real brainbug with Twitter is using a polling model in the first place. Push would be much more efficient: hold a TCP connection open to each client and propagate every relevant new message as it arrives. No need for anything beyond in-memory storage for messages.
April 19th, 2007 at 6:30 AM
Databases are not always the answer. Unfortunately, most people would flatly disagree with that statement. In fact, I wrote a simple website that used two flat files to store information, and was lambasted “because I didn’t use a database.”
I think you’re right about the benefits of using queues based on a simple file format. I know that databases are “supposed” to be the storage medium of choice. But if there is a better, although unconventional, way, they should go for it.
Lots of developers will go crazy hearing that idea, so I reference this link as a way of rebuttal:
http://themicrobusinessexperiment.blogspot.com/2007/03/coding-by-dogma.html
April 19th, 2007 at 6:42 AM
Yes, polling would be horribly inefficient for Twitter, and I’d be very surprised if they were doing it, but you can’t “hold a TCP connection open to each client” when you have as many clients as they do. That works on the small scale, but when you start having lots of them, OSs have a tendency to implode, unless of course you have a massive farm of computers to handle them all, but that’s not a very sensible solution considering most of them are doing nothing most of the time.
April 19th, 2007 at 7:04 AM
Databases are always the best option for storing and recalling large amounts of data.
April 19th, 2007 at 7:23 AM
“Databases are always the best option for storing and recalling large amounts of data.”
That is of course true by definition. It just shows how nontechnical people have opinions on things they do not really understand. A database management system (DBMS) is not the same as a database! A database is more the collection of data. For example, MySQL is not a database but a database management system.
April 19th, 2007 at 7:31 AM
Also, DBMSs are probably faster than files under concurrent load: the locking is finer in a database and coarser in a file (you lock the complete file, whereas a database has advanced locking algorithms designed to perform better under high concurrent load).
April 19th, 2007 at 8:00 AM
So, a couple of you have written itsy-bitsy little web apps which use the file system for storage. And suddenly that qualifies you as experts on writing massively scalable web applications?
As a developer with a decade of experience in heavily multi-threaded server network and communications programming, the idea of solving all the concurrency issues others have described above makes the hair on the back of my neck stand up.
Modern RDBMSs are the cheapest way to achieve even modest scalability. And even if the schema design is horrible, they will still scale better than the proposed file-system solutions (with all those critical sections and mutexes causing bottleneck after bottleneck).
The developers of Oracle, MySQL, SQL Server, Sybase Adaptive Server Enterprise, DB2, PostgreSQL etc. have invested many thousands of man-years into research into locking and concurrency strategies. I have too much experience in this field to think that I could come up with a solution which was even close, let alone better.
And let’s not forget the flexibility with which you can change the schema/indices and queries with a relational database. Good luck writing new query logic against a homemade file format when the requirements change (as they will on a monthly basis with agile systems development). And good luck debugging the data corruption issues which result from the lack of atomic transactions.
If you can’t appreciate the complexity of these issues and as a result believe that the file-system-based approach is a good idea then I hope to God that you aren’t employed developing software I ever have the misfortune to pay for.
April 19th, 2007 at 8:02 AM
Um, not to burst anyone’s bubbles, but a filesystem is a database, so by using one file per message and a directory to hold each person’s queue of messages, you are just using a database: the filesystem.