Tutorial

Design your Apache Hadoop strategy with this guide

SearchBusinessIntelligence.IN staff

Hadoop, an Apache Software Foundation project, is increasingly the technology of choice for dealing with big data and working with unstructured and new forms of data. This resource is a good starting point for BI professionals who need to know the intricacies of Apache Hadoop.

 

Table of contents:

  1. Will Hadoop help tackle the ‘big data’ problem?
  2. A quick comparison between Hadoop and MapReduce
  3. What works for you? Customary DW concepts or Hadoop?
  4. Build massively distributed applications with Hadoop
  5. How do Hadoop and open source impact BI architecture?
  6. Distributed big data processing with Hadoop
  7. Kobielus: It’s time for a standards body
  8. Expert views on Hadoop, EDW and big data
  9. Know how Yahoo optimizes the banner ads
  10. Microsoft’s Hadoop strategy

 

Will Hadoop help tackle the ‘big data’ problem?

“Problems don’t care how you solve them,” wrote James Kobielus, a Forrester analyst, in his blog last year. He made the statement in the context of Hadoop and big data, and it has already become a guiding principle for using data successfully today. Data scientists are looking for solutions beyond traditional database and business intelligence tools. Let’s take a look at the path Apache Hadoop is paving for big data analytics.

 

A quick comparison between Hadoop and MapReduce

There’s a lot of talk about Apache Hadoop and MapReduce, but there is still a lack of clarity about how these two emerging technologies relate to each other. Read this Q&A for a quick lesson on their association.
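
As a rough illustration of the relationship: MapReduce is a programming model, and Hadoop is an open source framework that runs jobs written in that model across a cluster. The sketch below imitates the model's three phases (map, shuffle, reduce) in plain Python on a single machine. It does not use Hadoop itself, and the function names and sample documents are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is a simple in-memory grouping.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of text documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files spread across a cluster.
    docs = ["big data needs big tools", "Hadoop handles big data"]
    print(word_count(docs))
```

In this toy version every phase runs in one process. The point of a framework such as Hadoop is that the map and reduce steps can run in parallel on many machines while the framework handles the grouping and data movement in between.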

 

What works for you? Customary DW concepts or Hadoop?

Organizations are already neck-deep in traditional data warehousing methods; changing course to Apache Hadoop now has become a matter of proving that the earth is round. Here we explore the pros and cons of customary data warehouse concepts vs. Apache Hadoop. Let the showdown begin.

 

Build massively distributed applications with Hadoop

The deluge of information has ushered in a series of technological breakthroughs that allow organizations to grapple with data stores stretching into the hundreds of gigabytes and even petabytes. Learn how to build massively distributed applications with Apache Hadoop.

 

How do Hadoop and open source impact BI architecture?

Yes, everyone is excited about Apache Hadoop, but just how and where does it fit in? Find out how open source technologies affect business intelligence architecture and BI development.

 

Distributed big data processing with Hadoop

Open source Hadoop provides a distributed data processing framework for handling big data applications across cloud servers. The idea is that distributed, parallel processing will provide redundancy and stronger application performance across clouds, helping to prevent failure. Get the full details.

 

Kobielus: It’s time for a standards body

James Kobielus, senior data management analyst with Cambridge, Mass.-based Forrester Research Inc., is an authority on Apache Hadoop and big data. In this interview, he gets candid about the challenges that the adoption of Apache Hadoop faces.

 

 

Expert views on Hadoop, EDW and big data

The bugle sounds as Yahoo! Inc. takes on Apache Hadoop as its baby. Many organizations line up behind the Hadoop banner. Read on as experts take on Hadoop and discuss whether it is a passing trend or here to stay.

 

Know how Yahoo optimizes the banner ads

Find out how Yahoo’s BI strategy is optimizing banner advertisement campaigns through data management and analytics, all while enabling end users to query unstructured data. We take a look at the problems, business gains, and the future of the implementation.


Microsoft’s Hadoop strategy

“Big data ain’t that big anyway,” say the vendors nonchalantly, when it pleases them. Microsoft hopes to do the same and harness the waves of fast-moving, enormous sets of structured and unstructured data that are overwhelming enterprises by linking the upcoming SQL Server release and the Windows Azure cloud platform to big data workhorse Apache Hadoop.


This was first published in February 2012