Crowd-sourced intelligence built into Search over Hadoop


Search has quickly evolved from an extension of the data warehouse into a real-time decision processing system. It is increasingly used to gather intelligence on multi-structured data, with distributed platforms such as Hadoop working in the background. This session will detail how search engines can be abused to work with mathematically derived tokens instead of text, in order to build models that implement reflected intelligence. In such a system, the intelligent or trend-setting behavior of some users is reflected back at other users. More importantly, the mathematics of evaluating these models can be hidden inside a conventional search engine like Solr, making the system easy to build and deploy. The session will describe how to integrate Apache Solr/Lucene with Hadoop. We will then show how crowd-sourced search behavior can be looped back into analysis and how constantly self-correcting models can be created and deployed. Finally, we will show how these models can respond with intelligent behavior in real time.

About the speaker: 
Grant Ingersoll is a co-founder of LucidWorks and an active member of the Lucene community – a Lucene and Solr committer, co-founder of the Apache Mahout machine learning project, and a long-standing member of the Apache Software Foundation. Grant’s prior experience includes work in natural language processing and information retrieval at the Center for Natural Language Processing at Syracuse University. Grant is also a co-author of the upcoming "Taming Text" from Manning Publications.

Schedule info

Time slot: 
4 June 16:00 - 16:45