Putting the Sting In Hive

Track: 
scale
Speaker(s): 

Apache Hive is the most widely used SQL interface for Hadoop. As Hadoop usage continues its explosive growth, Hive's performance and features do not meet the requirements and expectations of many users. These include answering queries in human time (less than 30 seconds) and supporting common analytics operations. The Hive community has risen to the challenge. Work is being done to drive down the startup time of a Hive query, extend Hive to work on Tez (a Hadoop execution environment that is much faster than MapReduce), make Hive operators process records 10x faster than they currently do, add support for analytics and windowing functions such as RANK, NTILE, LEAD, and LAG, and add support to Hive for standard SQL datatypes. This talk will discuss the design and code changes that have been made, and will look at ongoing work and at additional optimizations and features that could be added in the future.

About the speaker: 
Alan is a co-founder of Hortonworks and an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan also designed HCatalog and guided its adoption as an Apache Incubator project. Alan has a BS in Mathematics from Oregon State University and an MA in Theology from Fuller Theological Seminary. He is also the author of Programming Pig, a book from O’Reilly Press.

Schedule info

Time slot: 
3 June 12:00 - 12:45
Room: 
Palais