Facebook Scale Graph Processing using Giraph.

Track:

scale

Speaker(s):

Nitay Joffe

This talk will discuss the design and architecture of Giraph - an offline distributed graph processing system built on top of Hadoop. We have scaled Giraph to process very large graphs. For example at Facebook we run PageRank on a 400 billion edge graph in a matter of minutes. A similar workload in Hive or Hadoop takes many hours and requires an order of magnitude more machines. The talk will go through the design decisions we made in order to keep Giraph simple to use yet expressive and powerful. We will dive into the architecture that allows Giraph to scale to very large data sizes. Giraph utilizes Hadoop for job scheduling, resource management, and checkpointing, among other things. We ended up customizing core functionality for efficiency wins. In particular, we built on our own completely in-memory message passing system and use Netty I/O with a lot of caching for performance.

Slides are here: http://www.slideshare.net/nitayj/2013-0603-berlin-buzzwords

About the speaker:

Nitay Joffe is a software engineer at Facebook working on large scale graph computation. He has contributed to several Apache projects including HBase, Thrift and ZooKeeper before working on Giraph. He likes wheat beer.

Schedule info

Time slot:

3 June 12:00 - 12:45

Room:

Kesselhaus

You are currently visiting an old archive website with limited functionality. If you are looking für the current Berlin Buzzwords Website, please visit https://berlinbuzzwords.de

Facebook Scale Graph Processing using Giraph.

Schedule info

Gold-Partner

Silver-Partner

Startup-Sponsor

User login