The HP Big Data Reference Architecture: It’s Worth Taking a Closer Look…
This is a duplicate of the blog I’ve authored on the HP blog site at http://h30507.www3.hp.com/t5/Hyperscale-Computing-Blog/The-HP-Big-Data-Reference-Architecture-It-s-Worth-Taking-a/ba-p/179502#.VMfTrrHnb4Z
I recently posted a blog on the value that purpose-built products and solutions bring to the table, specifically around the HP ProLiant SL4540 and how it really steps up your game when it comes to big data, object storage, and other server based storage instances.
Last month, at the Discover event in Barcelona, we announced the revolutionary HP Big Data Reference Architecture – a major step forward in how we, as a community of users, do Hadoop and big data – and it is a stellar example of how purpose-built solutions can revolutionize how you accelerate IT technology, like big data. We’re proud that HP is leading the way in driving this new model of innovation, with the support and partnership of the leading voices in Hadoop today.
Here’s the quick version on what the HP Big Data Reference Architecture is all about:
Think about all the Hadoop clusters you’ve implemented in your environment – they could be pilot or production clusters, hosted by developer or business teams, and hosting a variety of applications. If you’re following standard Hadoop guidance, each instance is most likely a set of general purpose server nodes with local storage.
For example, your IT group may be running a 10 node Hadoop pilot on servers with local drives, your marketing team may have a 25 node Hadoop production cluster monitoring social media on similar servers with local drives, and perhaps similar for the web team tracking logs, the support team tracking customer cases, and sales projecting pipeline – each with their own set of compute + local storage instances.
There’s nothing wrong with that set up – It’s the standard configuration that most people use. And it works well.
Just imagine if we made a few tweaks to that architecture.
- What if we replaced the good-enough general purpose nodes, and replaced them with purpose-built nodes?
- For compute, what if we used HP Moonshot, which is purpose-built for maximum compute density and price performance?
- For storage, what if we used HP ProLiant SL4540, which is purpose-built for dense storage capacity, able to get over 3PB of capacity in a single rack?
- What if we took all the individual silos of storage, and aggregated them into a single volume using the purpose-built SL4540? This way all the individual compute nodes would be pinging a single volume of storage.
- And what if we ensured we were using some of the newer high speed Ethernet networking to interconnect the nodes?
Well, we did.
And the results are astounding.
While there is a very apparent cost benefit and easier management, there is a surprising bump in performance in terms of read and write.
It was a surprise to us in the labs, but we have validated it in a variety of test cases. It works, and it’s a big deal.
And Hadoop industry leaders agree.
“Apache Hadoop is evolving and it is important that the user and developer communities are included in how the IT infrastructure landscape is changing. As the leader in driving innovation of the Hadoop platform across the industry, Cloudera is working with and across the technology industry to enable organizations to derive business value from all of their data. We continue to extend our partnership with HP to provide our customers with an array of platform options for their enterprise data hub deployments. Customers today can choose to run Cloudera on several HP solutions, including the ultra-dense HP Moonshot, purpose-built HP ProLiant SL4540, and work-horse HP Proliant DL servers. Together, Cloudera and HP are collaborating on enabling customers to run Cloudera on the HP Big Data architecture, which will provide even more choice to organizations and allow them the flexibility to deploy an enterprise data hub on both traditional and newer infrastructure solutions.” – Tim Stevens, VP Business and Corporate Development, Cloudera
“We are pleased to work closely with HP to enable our joint customers’ journey towards their data lake with the HP Big Data Architecture. Through joint engineering with HP and our work within the Apache Hadoop community, HP customers will be able to take advantage of the latest innovations from the Hadoop community and the additional infrastructure flexibility and optimization of the HP Big Data Architecture.” – Mitch Ferguson, VP Corporate Business Development, Hortonworks
And this is just a sample of what HP is doing to think about “what’s next” when it comes to your IT architecture, Hadoop, and broader big data. There’s more that we’re working on to make your IT run better, and to lead the communities to improved experience with data.
If you’re just now considering a Hadoop implementation or if you’re deep into your journey with Hadoop, you really need to check into this, so here’s what you can do:
- my pal, Greg Battas posted on the new architecture and goes technically deep into it, so give his blog a read to learn more about the details.
- Hortonworks has also weighed in with their own blog.
If you’d like to learn more, you can check out the new published reference architectures that follow this design featuring HP Moonshot and ProLiant SL4540:
- HP Big Data Reference Architecture: Cloudera Enterprise reference architecture implementation
- HP Big Data Reference Architecture: Hortonworks Data Platform reference architecture implementation
If you’re looking for even more information, reach out to your HP rep and mention the HP Big Data Reference Architecture. They can connect you with the right folks to have a deeper conversation on what’s new and innovative with HP, Hadoop, and big data. And, the fun is just getting started – stay tuned for more!
Until next time,