A Poor Man’s CDN

Disclosure: this post mentions my current hosting provider (DigitalOcean). I’m in no way affiliated with them; I just like them a lot.

Hosting large and often-downloaded files can be tricky, especially when you want users to have decent download speeds and 100% availability. This is the story of how Dash’s docsets are hosted.

First, some data:

  • At the time of writing, there are 102 docsets
  • The total size of these docsets is 1.5 GB (while archived)
  • Bandwidth requirements are in the range of 5-7 TB / month
  • It would cost about $600 / month to host them in a “regular” CDN. In contrast, my hosting only costs $20 / month (thanks to 4 VPSs from DigitalOcean)
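
To make the cost gap concrete, here is a rough per-GB comparison based on the figures above. The 6 TB midpoint and the 1 TB = 1000 GB conversion are assumptions made for illustration, not numbers from the hosting bills.

```python
# Rough cost-per-GB comparison using the figures quoted above.
# Assumptions: 6 TB / month (midpoint of 5-7 TB) and 1 TB = 1000 GB.

def cost_per_gb(monthly_cost_usd: float, monthly_traffic_tb: float) -> float:
    """Return the effective price in USD per GB transferred."""
    return monthly_cost_usd / (monthly_traffic_tb * 1000)


traffic_tb = 6  # assumed midpoint of the 5-7 TB / month range
cdn = cost_per_gb(600, traffic_tb)  # "regular" CDN estimate
vps = cost_per_gb(20, traffic_tb)   # 4 VPSs

print(f"CDN:  ${cdn:.4f} per GB")
print(f"VPSs: ${vps:.4f} per GB")
print(f"Ratio: {cdn / vps:.0f}x")
```

Under these assumptions the CDN works out to roughly $0.10 per GB, about 30 times the effective price of the VPS setup.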

Hosting the docsets

Some docsets are somewhat large, so download speeds need to be decent. This is achieved by hosting the files in different data centers:

  • 2 mirrors in New York (for North America)
  • 1 mirror in San Francisco (for North America and Asia)
  • 1 mirror in Amsterdam (for Europe – or at least Western Europe)
  • Extra mirrors can be added in less than 2 minutes to account for spikes

South America, Eastern Europe, Africa and Australia are not directly covered, but download speeds there should still be acceptable, as no one has complained yet. More mirrors will be added whenever DigitalOcean opens more data centers.

Dash performs latency tests on all available mirrors by loading a small file. The mirrors are then prioritised based on latency. Whenever a mirror goes down, Dash notices it and avoids it.
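
The mirror selection described above could look roughly like the sketch below. This is not Dash's actual code: the mirror URLs, the `probe` and `rank_mirrors` names and the 3-second timeout are all assumptions for illustration.

```python
import http.client
import time
import urllib.request
from typing import Callable, List, Optional, Sequence

# Hypothetical mirror URLs; the post does not list the real hostnames.
MIRRORS = [
    "http://newyork.example.com/latency_test.txt",
    "http://sanfrancisco.example.com/latency_test.txt",
    "http://amsterdam.example.com/latency_test.txt",
]


def probe(url: str, timeout: float = 3.0) -> Optional[float]:
    """Time the download of a small file; return None if the mirror is unreachable."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except (OSError, http.client.HTTPException):
        # Connection errors, timeouts and HTTP errors all count as "down".
        return None
    return time.monotonic() - start


def rank_mirrors(
    urls: Sequence[str],
    probe_fn: Callable[[str], Optional[float]] = probe,
) -> List[str]:
    """Return reachable mirrors, fastest first. Unreachable mirrors are left out."""
    timed = [(probe_fn(url), url) for url in urls]
    reachable = [(latency, url) for latency, url in timed if latency is not None]
    return [url for _, url in sorted(reachable)]
```

Calling `rank_mirrors(MIRRORS)` would return the reachable mirrors ordered by measured latency. Passing a different `probe_fn` makes it possible to try the ranking logic without touching the network.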

This setup has resulted in 100% uptime so far and very low bandwidth costs. I highly recommend considering a similar setup if you need to host large files.

Hosting the docset feeds

The docset feeds are just small XML files which Dash polls to check for updates. These files are requested a lot: on each Dash launch and every 24 hours afterwards. As each docset has its own feed and most users have more than one docset installed, about 320k HTTP requests are made each day.
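
For a sense of scale, the daily request count can be converted into an average rate. This is only an average: real traffic is not spread evenly over the day, and the post gives no peak figures.

```python
REQUESTS_PER_DAY = 320_000       # figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

average_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY
print(f"Average load: {average_rps:.1f} requests per second")
```

That comes to about 3.7 requests per second on average.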

These requests are easily handled by an nginx web server on a 512 MB VPS in New York, and the feeds are also mirrored on GitHub. I tried using Apache, but it would sometimes use over 1 GB of RAM while hosting these files and end up failing completely, whereas nginx serves requests faster and uses less than 40 MB of RAM. I’ll talk about my experiences with nginx in a future post.

Whenever Dash needs to load a feed, it launches 2 threads which race to grab the feed (one from kapeli.com, one from GitHub). Whichever thread finishes first wins and its results are used. Most of the time, the kapeli.com thread wins.
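
The race between the two sources can be sketched as follows. This is an illustration rather than Dash's implementation: the `fetch` and `race` names are hypothetical, and falling back to the slower source when the faster one fails is an assumption the post does not spell out.

```python
import concurrent.futures
import urllib.request
from typing import Callable, Sequence


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download a feed and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def race(sources: Sequence[Callable[[], bytes]]) -> bytes:
    """Run each source in its own thread and return the first successful result."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(sources))
    try:
        futures = [pool.submit(source) for source in sources]
        errors = []
        for future in concurrent.futures.as_completed(futures):
            try:
                return future.result()  # first source to finish successfully wins
            except Exception as exc:
                # Assumption: a failed source does not end the race.
                errors.append(exc)
        raise RuntimeError(f"all sources failed: {errors}")
    finally:
        # Do not block on the loser; its thread finishes in the background.
        pool.shutdown(wait=False)
```

With two placeholder feed URLs, a call would look like `race([lambda: fetch(kapeli_url), lambda: fetch(github_url)])`. Taking callables rather than URLs keeps the racing logic independent of how each source is fetched.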

The chances of both kapeli.com and GitHub being unavailable at the same time are very small, so this approach has resulted in 100% uptime so far.

Source: Kapeli Blog
