Archive

Posts Tagged ‘HP’

Recognizing the Layers of Critical Insight That Data Offers

March 11, 2015

This is a joint blog I did with John Furrier of SiliconAngle / theCube and Leo Leung from Scality, originally published at http://bit.ly/1E6nQuR 

Data is an interesting concept.

During a recent CrowdChat, a number of us started talking about server-based storage, big data, etc., and the topic quickly developed into a forum on data and its inherent qualities. The discussion led us to realize that data actually has a number of attributes that clearly define it – similar to how a vector has both a direction and a magnitude.

Several of the attributes we uncovered as we delved into this notion of data as a vector include:

  • Data Gravity: This concept was developed by my friend, Dave McCrory, a few years ago, and it is a burgeoning area of study today. The idea is that as data accumulates, additional services and applications are attracted to it – similar to how a planet’s gravitational pull attracts objects to it. An example would be the number 10. If the “years old” context is “attracted” to that original data point, it adds a certain meaning to it. If the “who” context is then applied, a dog vs. a human being, it takes on additional meaning.
  • Relative Location with Similar Data: You could argue that this is related to data gravity, but I see it as a distinct point that bears calling out. At a Hadoop World conference many years ago, I heard Tim O’Reilly make the comment that our data is most meaningful when it’s around other data. A good example of this is medical data. Health information from a single individual may lead to some insights, but when placed together with data from members of a family, co-workers at a job location, or the citizens of a town, you are able to draw more meaningful conclusions. When grouped with other data, individual pieces of data take on more meaning.
  • Time: This came up when someone posed the question “does anyone delete data anymore?” With storage costs at scale becoming more and more affordable, we concluded that there is no longer an economic need to delete data (though there may be regulatory reasons to do so). Then came the question of determining what data was not valuable enough to keep, which led to the epiphany that data viewed as not valuable today may become significantly valuable tomorrow. Medical information is a good example here as well – a record that a certain individual in the 1800s was plagued with a specific medical condition may not have seemed meaningful at the time, until you’ve tracked data on descendants of that individual being plagued by similar ills over the next few centuries. It is difficult to quantify the value of specific data at the time of its creation.
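
The “10 years old” example from the Data Gravity point can be sketched in a few lines of Python. This is only an illustration of a raw value picking up context: the `DataPoint` class and its `with_context` and `describe` methods are hypothetical names made up here, not part of any library or of the data gravity work itself.

```python
from dataclasses import dataclass, field


@dataclass
class DataPoint:
    """A raw value plus whatever context has been attached to it (hypothetical)."""
    value: int
    context: dict = field(default_factory=dict)

    def with_context(self, **attrs):
        # Return a new point with extra context; the original is left untouched.
        return DataPoint(self.value, {**self.context, **attrs})

    def describe(self):
        # A bare value has nothing to say beyond the number itself.
        if not self.context:
            return str(self.value)
        details = ", ".join(f"{key}={val}" for key, val in self.context.items())
        return f"{self.value} ({details})"


raw = DataPoint(10)
dog = raw.with_context(unit="years old", who="dog")
human = raw.with_context(unit="years old", who="human")

for point in (raw, dog, human):
    print(point.describe())
```

The same number 10 reads very differently once the “years old” and “who” attributes are attached, which is the point the example in the text is making.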

[Image: Data as a vector]

In discussing this with my colleagues, it became very clear how early we are in the evolution of data / big data / software defined storage.  With so many angles yet to be discussed and discovered, the possibilities are endless.

This is why it is important that you start your own journey to uncover the critical insights your data offers. It can help you drive efficiency in product development, better serve your constituents, and solve seemingly unsolvable problems. Technologies like object storage, cloud-based storage, Hadoop, and more are allowing us to learn from our data in ways we couldn’t imagine 10 years ago.

And there’s a lot happening today – it’s not science fiction.  In fact, we are seeing customers implement these technologies and make a turn for the better – figuring out how to treat more patients, enabling student researchers to share data across geographic boundaries, moving media companies to stream content across the web, and allowing financial institutions to detect fraud when it happens.  Though the technologies may be considered “emerging,” the results are very, very real.

Over the next few months, we’ll discuss specific examples of how customers are making this work in their environments, tips on implementing these innovative technologies, some unique innovations that we’ve developed in both server hardware and open source software, and maybe even some best practices that we’ve developed after deploying so many of these big data solutions.

Stay tuned.

Until next time,

Joseph George – @jbgeorge

Director, HP Servers

Leo Leung – @lleung

VP, Scality

John Furrier – @furrier

Founder of SiliconANGLE Media

Cohost of @theCUBE

CEO of CrowdChat

The HP Big Data Reference Architecture: It’s Worth Taking a Closer Look…

January 27, 2015

This is a duplicate of the blog I’ve authored on the HP blog site at http://h30507.www3.hp.com/t5/Hyperscale-Computing-Blog/The-HP-Big-Data-Reference-Architecture-It-s-Worth-Taking-a/ba-p/179502#.VMfTrrHnb4Z

I recently posted a blog on the value that purpose-built products and solutions bring to the table, specifically around the HP ProLiant SL4540 and how it really steps up your game when it comes to big data, object storage, and other server based storage instances.

Last month, at the Discover event in Barcelona, we announced the revolutionary HP Big Data Reference Architecture – a major step forward in how we, as a community of users, do Hadoop and big data – and it is a stellar example of how purpose-built solutions can revolutionize how you accelerate IT technology, like big data.   We’re proud that HP is leading the way in driving this new model of innovation, with the support and partnership of the leading voices in Hadoop today.

Here’s the quick version on what the HP Big Data Reference Architecture is all about:

Think about all the Hadoop clusters you’ve implemented in your environment – they could be pilot or production clusters, hosted by developer or business teams, and hosting a variety of applications.  If you’re following standard Hadoop guidance, each instance is most likely a set of general purpose server nodes with local storage.

For example, your IT group may be running a 10 node Hadoop pilot on servers with local drives, your marketing team may have a 25 node Hadoop production cluster monitoring social media on similar servers with local drives, and perhaps similar for the web team tracking logs, the support team tracking customer cases, and sales projecting pipeline – each with their own set of compute + local storage instances.

There’s nothing wrong with that setup – it’s the standard configuration that most people use. And it works well.

However…

Just imagine if we made a few tweaks to that architecture.

  • What if we replaced the good-enough general purpose nodes with purpose-built nodes?
    • For compute, what if we used HP Moonshot, which is purpose-built for maximum compute density and  price performance?
    • For storage, what if we used HP ProLiant SL4540, which is purpose-built for dense storage capacity, able to get over 3PB of capacity in a single rack?
  • What if we took all the individual silos of storage, and aggregated them into a single volume using the purpose-built SL4540?  This way all the individual compute nodes would be pinging a single volume of storage.
  • And what if we ensured we were using some of the newer high speed Ethernet networking to interconnect the nodes?

Well, we did.

And the results are astounding.

While there is a very apparent cost benefit and easier management, there is also a surprising bump in read and write performance.

It was a surprise to us in the labs, but we have validated it in a variety of test cases.  It works, and it’s a big deal.

And Hadoop industry leaders agree.

“Apache Hadoop is evolving and it is important that the user and developer communities are included in how the IT infrastructure landscape is changing.  As the leader in driving innovation of the Hadoop platform across the industry, Cloudera is working with and across the technology industry to enable organizations to derive business value from all of their data.  We continue to extend our partnership with HP to provide our customers with an array of platform options for their enterprise data hub deployments.  Customers today can choose to run Cloudera on several HP solutions, including the ultra-dense HP Moonshot, purpose-built HP ProLiant SL4540, and work-horse HP Proliant DL servers.  Together, Cloudera and HP are collaborating on enabling customers to run Cloudera on the HP Big Data architecture, which will provide even more choice to organizations and allow them the flexibility to deploy an enterprise data hub on both traditional and newer infrastructure solutions.” – Tim Stevens, VP Business and Corporate Development, Cloudera

“We are pleased to work closely with HP to enable our joint customers’ journey towards their data lake with the HP Big Data Architecture. Through joint engineering with HP and our work within the Apache Hadoop community, HP customers will be able to take advantage of the latest innovations from the Hadoop community and the additional infrastructure flexibility and optimization of the HP Big Data Architecture.” – Mitch Ferguson, VP Corporate Business Development, Hortonworks

And this is just a sample of what HP is doing to think about “what’s next” when it comes to your IT architecture, Hadoop, and broader big data. There’s more that we’re working on to make your IT run better, and to lead the communities to an improved experience with data.

If you’re just now considering a Hadoop implementation or if you’re deep into your journey with Hadoop, you really need to check into this, so here’s what you can do:

  • My pal, Greg Battas, posted on the new architecture and goes technically deep into it, so give his blog a read to learn more about the details.
  • Hortonworks has also weighed in with their own blog.

If you’d like to learn more, you can check out the newly published reference architectures that follow this design, featuring HP Moonshot and ProLiant SL4540.

If you’re looking for even more information, reach out to your HP rep and mention the HP Big Data Reference Architecture.  They can connect you with the right folks to have a deeper conversation on what’s new and innovative with HP, Hadoop, and big data. And, the fun is just getting started – stay tuned for more!

Until next time,

JOSEPH

@jbgeorge

Highlights from the Open Source Business Conference 2012

May 28, 2012

Last week I had the pleasure to head (back) to San Francisco to spend a few days with other open source believers at this year’s Open Source Business Conference.  I was there on behalf of Dell, the company I work for.

Here are some of my thoughts from the sessions / keynotes I sat in on this past week.

  • Jim Whitehurst of Red Hat spoke at a keynote and highlighted how the innovation that will be built on IaaS is where the revolution will reside, and that the role vendors will play in this new open source friendly enterprise will focus more on support and services.
      
  • There was a great open source panel with personnel from Yahoo, Warner Music, Black Duck, Acquia, and North Bridge that talked through real use cases at Yahoo and Warner, plus feedback on their annual open source survey, which covered the rise of open source adoption in the enterprise, how quality and cost are driving that, and how many companies now view open source software as a starting point for projects rather than an alternative option.
      
  • HP’s Biri Singh talked through their cloud strategy, including their tiered strategy of IaaS + ecosystem + marketplace. Turns out they’re using quite a bit of open source as they build out their public cloud, with a focus on web services at scale.
      
  • On a panel titled “Amazon vs the world,” panelists from Canonical, Eucalyptus, and Citrix talked about open private cloud against the backdrop of Amazon’s dominance as a public cloud provider. AWS API compatibility came up a lot, as well as the need to productize open source technologies more. Some opportunities that were highlighted included the need for vendors who know more than just software, but also the “wiring” of actual working systems, and the importance of staying open as we are just starting to see adoption by the enterprise.
      
  • CloudScaling hosted a great session on why open cloud is winning – how internet companies drove cloud technologies and how they were built with open source, the differences between the “Enterprise IT cloud” and the “Next Gen IT cloud”, and how “no lock-in” + flexibility + scale are the key tenets of open cloud.
      

Obviously there was a lot more at the event that I was not able to get to – you can check out a few of the presentation slides at https://www.eiseverywhere.com/ehome/31601/50199/?&

If you were out there last week, be sure to leave a comment with your thoughts.

I enjoyed the few days out there – looking forward to the next open source event – likely in San Fran again. 🙂

Until next time,

JBGeorge
@jbgeorge