Friday, 10 December 2010

The costs of building storage for the cloud

In a previous post, The costs of storage in the cloud, I attempted to assess the differing costs of storing various amounts of data in the cloud using Amazon S3, Dropbox and Rackspace. I ended by asking:

What is interesting is whether [those prices] can be matched or beaten by academic providers (such as Eduserv) and/or in-house institutional provision, and if so by how much?

So... let's do a quick exercise!

Suppose you wanted to build your own Amazon S3-style service. How much would it cost? Could you beat Amazon's S3 prices? And if not, what economies of scale would be necessary before you could?

To answer those questions, let's forget about actual money for a moment. Rather, let's draw up a list of the things that would have to be paid for. We can fill in the numbers later.

So firstly, there's the storage hardware - the raw disks. Of course, it's not quite as simple as that. Decisions need to be made about the kind of storage you want, for example a Fibre Channel-based SAN vs. cheaper SATA disks organised into some kind of network-attached storage (NAS). The trade-off here is reliability vs. cost. Given that we're trying to build an alternative to Amazon S3, let's use SATA disks.

Then there's the chosen architecture. How resilient do you want your storage to be? Amazon offer 99.999999999% durability (which, as I understand it, corresponds to an average annual expected loss of 0.000000001% of stored objects, i.e. about 1 object in 100,000,000,000 per year) with no single points of failure (and a design to sustain the concurrent loss of data in two facilities). Both are pretty hard to compete with, so let's relax that requirement a little. Let's say that we want all data to be replicated across 2 separate NAS clusters (meaning that both clusters have to fail before data is lost), running RAID 6 within each cluster (so that each array tolerates two simultaneous drive failures).

With that kind of configuration, providing 1PB of usable disk would require something like 2.7PB of raw disk. (Actually, according to Backblaze's How to build cheap cloud storage, a RAID 6 design based on 15 disks, of which 2 are parity disks, should deliver "87 percent of the raw hard drive totals" as usable space. Two replicas at that efficiency come to about 2.3PB of raw disk per 1PB usable, so we should be able to do a bit better than that.)
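For anyone who wants to check the raw-disk arithmetic, here is a minimal sketch in Python. The function names are mine, and the model is deliberately crude: it only accounts for RAID 6 parity and the number of replicas, and it assumes nothing is lost to file system overhead, hot spares or spare headroom.

```python
def raid6_usable_fraction(disks_per_group: int, parity_disks: int = 2) -> float:
    """Fraction of raw capacity left for data in one RAID 6 group."""
    if disks_per_group <= parity_disks:
        raise ValueError("a group needs more disks than parity disks")
    return (disks_per_group - parity_disks) / disks_per_group


def raw_capacity_needed(usable_pb: float, replicas: int, usable_fraction: float) -> float:
    """Raw disk (in PB) needed to deliver usable_pb of replicated storage."""
    return usable_pb * replicas / usable_fraction


# 15-disk RAID 6 groups (2 parity disks), data replicated across 2 clusters.
fraction = raid6_usable_fraction(15)
raw_pb = raw_capacity_needed(1.0, replicas=2, usable_fraction=fraction)

print(f"usable fraction per group: {fraction:.1%}")   # 86.7%
print(f"raw disk for 1PB usable:   {raw_pb:.2f}PB")   # 2.31PB
```

Any real design would need more than this, which is one reason to treat 2.3PB as a lower bound rather than a target.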

Then there's the network switching and cabling necessary to join everything together - with, again, decisions to be made about bandwidth and so on. There's also the connection to the outside world to worry about - routers, switching and a firewall for example.

Finally, there's the cost of the physical data centre space to consider, at least insofar as it represents an opportunity cost against doing something else.

That covers the initial investment (which is already non-trivial for any kind of substantial cloud infrastructure), for which costs can probably be depreciated over, say, 5 years.

There are also the recurrent costs...

These are energy (both to power the disk arrays and other servers and to keep everything cool) and staffing. Staffing means operator cover (perhaps 0.5FTE per 10PB?), some developer effort, at least initially (both to keep things running smoothly and to assist in integrating the cloud storage with other systems - let's say 1FTE), a service/project manager (again, let's say 1FTE) and some procurement/financial effort. Again, these are non-trivial sums of money.
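To see how those staffing guesses behave as capacity grows, here is a rough Python sketch. It assumes operator cover scales linearly with capacity while the developer and manager posts stay fixed, and it leaves out procurement/financial effort because I haven't put a number on it. The function name and defaults are illustrative only.

```python
def staffing_fte(capacity_pb: float,
                 operator_fte_per_10pb: float = 0.5,
                 developer_fte: float = 1.0,
                 manager_fte: float = 1.0) -> float:
    """Total FTE for a service of the given usable capacity (PB)."""
    # Assumption: operator effort grows in proportion to capacity.
    operators = operator_fte_per_10pb * capacity_pb / 10
    # Assumption: developer and manager effort do not grow with capacity.
    return operators + developer_fte + manager_fte


for pb in (1, 10, 100):
    total = staffing_fte(pb)
    print(f"{pb:>3}PB: {total:.2f} FTE in total, {total / pb:.2f} FTE per PB")
```

On these assumptions the fixed posts dominate at small scale: staffing per PB falls from about 2 FTE at 1PB to well under 0.1 FTE at 100PB.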

So that's my shopping list. What have I forgotten? What have I over-specified?

At this stage, I'm not going to fill in the actual amounts of money - not least because the costs of storage are dropping all the time and the actual price paid will be subject to negotiation. But the shopping list:

  • Disks
  • Network infrastructure (switching, etc.)
  • Router/firewall
  • Physical space costs
  • Energy
  • Operator cover
  • Development effort
  • Project/service management
  • Procurement/financial effort
seems to me to be a useful way of thinking about the overall costs that would have to be met in any activity of this kind.
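One way to turn the shopping list into a comparable price is sketched below in Python. It follows the split used in this post: disks, network, router/firewall and physical space are treated as initial investment and depreciated in a straight line, and everything else as a yearly recurrent cost. The class, the function and the utilisation parameter are my own illustrative choices, and the figures in the example are arbitrary placeholders, not real prices.

```python
from dataclasses import dataclass


@dataclass
class StorageCosts:
    # Initial investment (any one currency)
    disks: float
    network: float
    router_firewall: float
    physical_space: float
    # Recurrent costs, per year
    energy: float
    operator_cover: float
    development: float
    management: float
    procurement_finance: float

    def capital(self) -> float:
        return (self.disks + self.network
                + self.router_firewall + self.physical_space)

    def recurrent_per_year(self) -> float:
        return (self.energy + self.operator_cover + self.development
                + self.management + self.procurement_finance)


def cost_per_gb_month(costs: StorageCosts, usable_gb: float,
                      depreciation_years: float = 5,
                      utilisation: float = 1.0) -> float:
    """Cost per GB actually stored per month, with straight-line depreciation."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    annual = costs.capital() / depreciation_years + costs.recurrent_per_year()
    return annual / 12 / (usable_gb * utilisation)


# Placeholder figures only, chosen to make the arithmetic easy to follow.
example = StorageCosts(disks=500, network=50, router_firewall=20,
                       physical_space=30, energy=40, operator_cover=20,
                       development=30, management=20, procurement_finance=10)
full = cost_per_gb_month(example, usable_gb=1000)
half = cost_per_gb_month(example, usable_gb=1000, utilisation=0.5)
print(full, half)
```

The utilisation parameter matters: the same infrastructure costs twice as much per stored GB when it is only half full, so any comparison with Amazon's per-GB prices depends on how much of the capacity is actually used.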

Because I haven't indicated actual costs (in monetary terms), I can't answer my opening question, at least not in public. We have begun to answer it privately, and our answers so far indicate that we would be able to beat Amazon's prices. That is partly because a direct connection to JANET means that network bandwidth doesn't need to be charged for, and partly because our not-for-profit mission means that we are focused solely on delivering effective use of IT for the benefit of education (that's why we do what we do).

Of course, individual institutions are also able to do the sums and in some cases may be able to come up with better answers than we can. Ultimately, economies of scale will win out, one presumes.

Our thinking to date gives us confidence that an academic cloud service of the kind envisioned by FleSSR is viable (in a sustainability sense). The next step is to do some more thinking about the numbers and to consider pricing models that might be more acceptable to HEIs than Amazon's. One of the problems with the Amazon model is that the costs are not predictable in advance. That's something that we'd like to see if it is possible to address.

More anon...

13 comments:

  1. I'd love to see the figures on this myself and I would have thought physical space costs, air con and the costs of the employees needed to monitor it and swap out dead disks will run it pretty close.

    Also, are you assuming that the service is fully used, or are you keeping all that capacity online to store 20GB? What, then, is the value-for-money proposition?

    If it is a lot cheaper and there is a large scale demand it begs the question of why it hasn't been done already? I guess the project is designed to answer that one.

  2. If you felt able to share some numbers for the costs on your shopping list more publicly I'd love to see them.

    A rule of thumb I have heard quoted by people who have recently built machine rooms:

    the cost of building and maintaining your physical space is twice the cost of buying the network and server infrastructure that goes inside it.

  3. You seem to have the right mix of elements in there, although how you turn them into a cost model is critical. I might also question some of the figures you have given, although not the basic assumptions.

    Staff are clearly necessary, but I'm not convinced that a service like this needs a full-time manager AND operator support until it's operating on a very large scale. You do need staff, but getting the number right is critical both to the quality of service and your long-term costs. Staff costs tend to increase in a way that technology doesn't. (Energy costs are likely to grow faster than the staff ones, though.) I've heard figures from one of the large-scale cloud compute providers of 1 operator per 1,000 blades. (About 1.4 hours/year staff time per server.) That's clearly only achievable with well-planned automation of many things. It's not easy to translate to storage, where my guess would be that failures requiring manual intervention are more frequent.

    I think you are over-specifying the storage, though, and there's scope here to provide something different to Amazon et al. Not all uses of cloud storage require 2-site replication or high degrees of redundancy or availability. If I am using the cloud to provide off-site resilience for local storage (for instance) I only want one copy, and I can tolerate failure. If you can configure your storage to provide that as an option to some, it could be attractive. We're thinking of DCC services on top of this cloud that would ideally need just that.

    It's also dangerous to assume the network is free. JANET is free at the point of use and costs aren't volume related, but at minimum there's an opportunity cost for you, or another institution, in using your bandwidth in this way. Once your link is saturated you start weighing up which services really require it. And we can't assume JANET's model will continue to work the way it has in the past, although I very much hope that it does continue. One of the best decisions taken about national IT infrastructure in the UK ever.

  4. Andy;

    I love what you're doing here, but I think you're papering over the costs associated with operations by just including a line item for "Operator Cover".

    In order to cover all the technologies you've listed above, you either have an expensive IT organization with separate skillsets (security engineer, network engineer, storage, etc.) or you have one or two crackerjack engineers, who leave you back at a single point of failure.

    These costs are generally not trivial.
