Cloud-Crunching Big Data with HIVE/Hadoop and R

Event Interactive 2010
Format Dual
Organizer Michael Driscoll Dataspora
Description We live in the Age of Big Data, where even six-month-old start-ups can accumulate billions of data points. We'll show how we've used two emerging tools -- HIVE/Hadoop and R -- to crunch, analyze, and visualize massive data sets on Amazon's EC2 infrastructure.
Questions Answered
  1. What is HIVE/Hadoop and why do I need it?
  2. What is MapReduce and what problems is it ideal for?
  3. How can I launch a HIVE/Hadoop cluster on Amazon's EC2 infrastructure?
  4. What are the best practices for persisting massive data on the cloud?
  5. How can I bolt statistical analyses onto Hadoop using R (and RHIPE)?
  6. How can I sample data from my data set with HIVE?
  7. What is HadoopStreaming and why should I fall in love with it?
  8. What are some workarounds to the limitations of HIVE's built-in data types?
  9. How can I give hints to HIVE to efficiently partition my data?
  10. How much does a petabyte weigh today -- and how much did it weigh in 1970?
Level Advanced
Category Back-End Programming / Databases, Cloud Storage / Delivery, Information Architecture, New Technology / Next Generation, Open Source