Monday, April 27, 2015

Yahoo chose Ceph for its Cloud Object Store

Yahoo (www.yahoo.com | NASD:YHOO) recently published a very interesting blog post on its engineering portal. No surprise, we learn that Yahoo's object storage footprint covers 250B objects of various content coming from emails, images, videos and blog posts. Yahoo is a strong believer in object storage, having used it for several years with an appliance approach. But the company also realizes that a Software-Defined Storage model is the right one to support the growth of data volume within the company and the various application workloads.
Yahoo insists on 3 dimensions when selecting an SDS solution:
  • Cost trade-off,
  • Access methods, what Yahoo calls interfaces, and
  • Storage abstractions: block, file or object.
Yahoo names its approach Cloud Object Storage, or COS, and started with Flickr with a multi-PB configuration. In 2015, COS supports more than 100 PB for Flickr, Yahoo Mail and Tumblr. Wow, impressive.


After a deep study comparing OpenStack Swift, Ceph and a few commercial solutions, Yahoo selected Ceph. The configuration chosen by Yahoo is a federation of clusters, all Ceph based, that gives Yahoo the level of flexibility the company needs. In addition, they developed and embedded into applications their own hashing algorithm to place data on the right cluster. Each Ceph cluster is 3PB raw, giving simple and fast recovery, and can be seen as the increment size within the supercluster. Yahoo also prefers erasure coding in an 8+3 mode. In terms of storage media supported, Yahoo uses SSD, HDD and SMR drives.
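To illustrate the federation idea, here is a minimal sketch of application-side hash placement across several Ceph clusters. The cluster names, the use of MD5 and the simple modulo rule are assumptions for illustration only; Yahoo has not published the details of its own algorithm.

```python
import hashlib

# Hypothetical list of federated Ceph clusters, each ~3 PB raw as described above.
CLUSTERS = ["cos-cluster-01", "cos-cluster-02", "cos-cluster-03"]

def pick_cluster(object_key: str) -> str:
    """Deterministically map an object key to one cluster in the federation.

    This is an illustrative stand-in for the application-embedded hashing
    Yahoo describes; the real algorithm is not public.
    """
    digest = hashlib.md5(object_key.encode("utf-8")).hexdigest()
    return CLUSTERS[int(digest, 16) % len(CLUSTERS)]

# Inside each cluster, 8+3 erasure coding stores 11 chunks for every 8 chunks
# of data, i.e. ~1.375x raw overhead instead of 3x for triple replication,
# while tolerating the loss of any 3 chunks.
print(pick_cluster("flickr/photos/123456789_original.jpg"))
```

A plain modulo scheme like this makes rebalancing expensive when clusters are added; a production system would more likely use consistent hashing or a lookup table, but the principle of placing data from the client side is the same.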


Yahoo also contributes to the project and changed a few things to boost response time and reduce latency. For instance, in S3 a bucket, which stores objects in Amazon terminology, belongs to one node; Yahoo changed that so the bucket is now sharded across multiple nodes to increase parallelism. Next deployments will be around geo-replication for business continuity, small object configurations and lifecycle management. A very real use case for Ceph at a very large scale. Impressive.
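As a rough illustration of why sharding the bucket index helps, the sketch below spreads index entries for one bucket over several shards by hashing the object name. The shard count and hash function here are assumptions, not Ceph's exact implementation.

```python
import hashlib
from collections import defaultdict

NUM_INDEX_SHARDS = 8  # assumed shard count for illustration

def index_shard(bucket: str, object_name: str) -> int:
    """Map an object to one of the bucket's index shards.

    Hashing the object name spreads index updates over several shards
    (and therefore several nodes) instead of funnelling every write
    through a single bucket-index object on one node.
    """
    h = hashlib.sha1(f"{bucket}/{object_name}".encode("utf-8")).hexdigest()
    return int(h, 16) % NUM_INDEX_SHARDS

# Quick demonstration: writes to one bucket fan out roughly evenly across shards.
shards = defaultdict(int)
for i in range(10000):
    shards[index_shard("photos", f"img_{i}.jpg")] += 1
print(dict(shards))
```

The trade-off is that an ordered bucket listing now has to merge results from all shards, which is why sharding mainly pays off for write-heavy, very large buckets.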