Tuesday, November 25, 2014

Hadoop Buyer's Guide - Part 1

Introduction

This Buyer’s Guide by Robert D. Schneider presents a series of guidelines that you can use when searching for the essential Hadoop infrastructure that will be sustaining your organization for years to come. In fact, this guide is specifically designed to be incorporated into your RFP when it comes to evaluating Hadoop platforms.


Related Hadoop Projects

Hadoop has also spawned an entire ecosystem of ancillary initiatives. Here are just a few examples of other projects:



Why Your Choice of Hadoop Infrastructure Is Important

Acquiring, deploying, and properly integrating all of the moving parts that constitute the Hadoop ecosystem has proven to be a hardship for many IT organizations, which would much rather focus on their primary business responsibilities than the care and feeding of a hand-crafted Hadoop environment. To further muddy the water, not only is Hadoop continually evolving, but so are all of the related projects in its ecosystem. In an effort to ease the task of rolling out a complete Hadoop implementation, a number of vendors are offering comprehensive distributions that generally fall into one of three models:
  • Open source Hadoop and support. This pairs bare-bones open source with paid professional support and services. Hortonworks is a good example of this strategy.
  • Open source Hadoop, support, and management innovations. This goes a step further by combining open source Hadoop with IT-friendly tools and utilities that make things easier for mainline IT organizations. Cloudera is an instance of this model.
  • Open source Hadoop, support, and architectural innovations. Hadoop is architected with a component model down to the file system level. Innovators can therefore replace one or more components, package the remaining open source components alongside them, and still maintain compatibility with Hadoop. MapR is an instance of this model.



Critical Considerations When Selecting a Hadoop Platform

Adopting a Hadoop distribution is a vital decision that has far-reaching ramifications for the entire organization, in ways that you can’t fully anticipate when you create your initial appraisal. This is particularly true since we’re only at the dawn of Big Data in the enterprise. Hadoop infrastructure is just that: infrastructure, and it requires the same level of attention and scrutiny as your organization expends when choosing other critical assets, such as application servers, storage, and databases. Thus, you shouldn’t be surprised that your Hadoop environment will be subject to the same requirements as the rest of your IT portfolio, in terms of:
  • Service Level Agreements (SLAs)
  • Data protection
  • Security
  • Integration with other applications
  • Professional services
  • Training
First, don’t think of Hadoop as a single solution, but rather as a platform with a collection of applications on top. These elements must work together for you to derive maximum value. Second, don’t force your enterprise to conform to your chosen Hadoop technology; instead, find solutions that adapt to the way you operate your business.

Consider using the guidelines in this section to help you construct an RFP, just as you would when evaluating any other fundamental software product. For clarity, these are grouped into four major categories:
  • Performance and Scalability
  • Dependability
  • Manageability
  • Data Access
For each recommendation, Robert D. Schneider explains what it is and what to look for in a Hadoop distribution. He also provides several examples that demonstrate how these capabilities add value in real-world situations.
