Here are some interesting facts about Google Reader:
* Google Reader has two kinds of feeds:
- feeds that have a single subscriber (about two thirds of all feeds; they're updated every 3 hours)
- feeds that have more than one subscriber (these feeds are updated every hour)
* Google Reader uses 10 TB for storing all the raw data
* Google Reader crawls 8 million feeds
* Google Reader is the only major feed reader that keeps the entire history for all the feeds.
* many Google applications use Google Reader's infrastructure for feeds: iGoogle, orkut, Gmail's web clips, Blogger widgets, Google Spreadsheets, Ajax API. Google Reader is the place for any kind of user-driven activity that involves feeds, and it's independent of Google Blog Search.
* the number of users grows at the same rate as the number of feeds
* the index size grows 4% every week
* 70% of the Google Reader traffic comes from Firefox (a lot of geeky users)
* Gmail and orkut are the only Google applications that have more pageviews per user than Google Reader
* search requires a lot of computational resources. Google Reader uses two indexes for search:
- a big tree updated twice a day (150 machines, 600 million documents)
- 40 small trees for recent posts, updated every 5 minutes (40 machines, 40 million documents)
* future features:
- very soon: internationalization, feed recommendations, accepting pings sent to Google Blog Search
- in the near future: simple clustering based on links (posts that link to the same page), adding comments to the shared items
- in the distant future: getting calls to Reader from Gmail and orkut's main interface
- idea for monetization: adding AdSense ads and sharing the revenue with publishers, assuming they use AdSense
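
To put some of the numbers above in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses figures from the list (8 million feeds, two thirds of them with a single subscriber, the 3-hour and 1-hour update intervals, 4% weekly index growth, and the machine and document counts of the two search indexes). Everything else is an assumption made for illustration: that every feed is fetched exactly on schedule, that the 4% growth rate stays constant, and that documents are spread evenly across machines. The function names are hypothetical and don't come from Google.

```python
import math

# Figures quoted in the list above.
TOTAL_FEEDS = 8_000_000
SINGLE_SUBSCRIBER_SHARE = 2 / 3   # "two thirds" of all feeds
WEEKLY_INDEX_GROWTH = 0.04        # index grows 4% every week


def fetches_per_day(total_feeds=TOTAL_FEEDS, single_share=SINGLE_SUBSCRIBER_SHARE):
    """Estimated feed fetches per day, assuming every feed is fetched on schedule."""
    single = total_feeds * single_share
    multi = total_feeds - single
    # Single-subscriber feeds: every 3 hours (8 fetches a day).
    # Multi-subscriber feeds: every hour (24 fetches a day).
    return single * (24 / 3) + multi * 24


def doubling_time_weeks(weekly_growth=WEEKLY_INDEX_GROWTH):
    """Weeks needed for the index to double, assuming constant compound growth."""
    return math.log(2) / math.log(1 + weekly_growth)


def yearly_growth_factor(weekly_growth=WEEKLY_INDEX_GROWTH, weeks=52):
    """How many times larger the index gets in a year at the same weekly rate."""
    return (1 + weekly_growth) ** weeks


def docs_per_machine(documents, machines):
    """Average documents per machine, assuming an even spread (an assumption)."""
    return documents / machines


if __name__ == "__main__":
    print(f"fetches per day: {fetches_per_day():,.0f}")
    print(f"fetches per second: {fetches_per_day() / 86_400:,.0f}")
    print(f"index doubling time: {doubling_time_weeks():.1f} weeks")
    print(f"index growth over a year: x{yearly_growth_factor():.1f}")
    print(f"big index: {docs_per_machine(600_000_000, 150):,.0f} docs per machine")
    print(f"recent-posts index: {docs_per_machine(40_000_000, 40):,.0f} docs per machine")
```

Under these assumptions, the crawler would make a little over 100 million fetches a day, and the index would double roughly every 18 weeks. These are estimates derived from the quoted numbers, not figures from the video.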
Note: Most of the information in this post comes from a confidential video in which Google's Ben Darnell explained to some Nooglers how Google Reader works. The video was hosted on Google Video, but it's no longer available. More about the video here.
{ via Blogoscoped }