Possible Twitter Solution
Yes, this is another post on Twitter. Since the last post I have read a few things, had a few thoughts, and seen a few things released that should be shared. First, the Twitter folks publicly admit they don’t know what the issues are. Isn’t that something they should have identified by now? Haven’t these issues happened repeatedly? Like I said in another post, I’m relatively new to Twitter, so I don’t want to be too critical, but it seems they should know by now what is up.
On the drive to work this morning, my roommate and I were discussing a possible infrastructure that would support the scalability Twitter so desperately needs. Then today I read a very interesting blog post with a similar solution to Twitter’s problems. The post is called “A Detailed Five Step Twitter Scaling Plan” and was written by Hank Williams. I really like the simplicity of the diagram in his proposed solution. I don’t want to just repeat his post here, but I encourage you to read it.
This is Hank’s simple infrastructure diagram:
His five steps to solve Twitter’s problems are…
1. Automated CPU provisioning
You really *do* need the ability to provision resources more or less on demand.
2. Denormalization
Every user’s outbound tweets should be stored in a separate database from their inbound tweets.
3. Sharding
In the above diagram you see that we have separated users into clusters.
4. Shard Splitting
In the real world each database should continue to grow until it starts to be over burdened. Then you “split the shard” into two databases.
5. Distributed User Lookup
Storing a lookup table for each user that indicates where their inbound and outbound data servers are.
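To make steps 2 through 5 a bit more concrete, here is a rough Python sketch of how they could fit together. This is my own toy illustration, not Hank's code and certainly not how Twitter works: the names `Shard`, `Cluster`, and `max_tweets_per_shard` are made up, the "databases" are just in-memory dictionaries, and step 1 (automated provisioning) is not modeled at all.

```python
from collections import defaultdict
from itertools import count


class Shard:
    """Stand-in for one database holding tweets for a group of users."""

    def __init__(self, name):
        self.name = name
        self.tweets = defaultdict(list)  # user -> list of tweet texts

    def size(self):
        return sum(len(texts) for texts in self.tweets.values())


class Cluster:
    """Hypothetical toy model: denormalized, sharded tweet storage."""

    def __init__(self, max_tweets_per_shard=1000):
        self.max_tweets = max_tweets_per_shard  # assumed "over burdened" limit
        self._ids = count(1)
        # Step 2: inbound and outbound tweets live in separate databases.
        self.inbound_shards = [self._new_shard("in")]
        self.outbound_shards = [self._new_shard("out")]
        # Step 5: lookup tables saying which shard holds each user's data.
        self.inbound_lookup = {}
        self.outbound_lookup = {}
        self.followers = defaultdict(set)

    def _new_shard(self, kind):
        return Shard(f"{kind}-{next(self._ids)}")

    def add_user(self, user):
        # Step 3: place each new user on the least loaded shard of each kind.
        for shards, lookup in ((self.inbound_shards, self.inbound_lookup),
                               (self.outbound_shards, self.outbound_lookup)):
            shard = min(shards, key=Shard.size)
            shard.tweets[user]  # touching the key registers the user
            lookup[user] = shard

    def follow(self, follower, followee):
        self.followers[followee].add(follower)

    def post(self, author, text):
        # Write once to the author's outbound shard...
        self._write(author, text, self.outbound_shards, self.outbound_lookup)
        # ...and copy it into every follower's inbound shard.
        for follower in self.followers[author]:
            self._write(follower, text, self.inbound_shards, self.inbound_lookup)

    def _write(self, user, text, shards, lookup):
        shard = lookup[user]
        shard.tweets[user].append(text)
        # Step 4: split the shard once it grows past the limit.
        if shard.size() > self.max_tweets and len(shard.tweets) > 1:
            users = sorted(shard.tweets)
            new_shard = self._new_shard(shard.name.split("-")[0])
            for moved in users[len(users) // 2:]:
                new_shard.tweets[moved] = shard.tweets.pop(moved)
                lookup[moved] = new_shard  # keep the lookup table current
            shards.append(new_shard)

    def timeline(self, user):
        # Reading a timeline touches exactly one shard, found via the lookup.
        return list(self.inbound_lookup[user].tweets[user])


cluster = Cluster(max_tweets_per_shard=3)
for name in ("alice", "bob", "carol"):
    cluster.add_user(name)
cluster.follow("bob", "alice")
cluster.follow("carol", "alice")
for i in range(3):
    cluster.post("alice", f"tweet {i}")
print(cluster.timeline("bob"))
print([shard.name for shard in cluster.inbound_shards])
```

The point of the sketch is the trade-off in the plan: posting gets more expensive because each tweet is copied to every follower's inbound store, but reading a timeline only ever hits one shard, and an overloaded shard can be split without touching the others as long as the lookup table is updated.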
This plan to fix Twitter seems very simple to me. It would be cool if we could get a conversation going in the comments around what other infrastructure solutions people think are viable for Twitter. Also, why do you think they are experiencing so many problems? Let us know.



