Archive
Day 1: Big Data Innovation Summit 2014
Hello from sunny Santa Clara!
My team and I are here at the BIG DATA INNOVATION SUMMIT representing Dell (the company I work for), and it’s been a great day one.
I just wanted to take a few minutes to jot down some interesting ideas I heard today:
- In his keynote, Daniel Austin argued that the “Internet of things” should really be the “individual network of things” – highlighting that the number of devices, their connectivity, their availability, and their partitioning are what will be key in the future.
- One data point that also came out of Daniel’s talk – every person is predicted to generate 20 PETABYTES of data over the course of a lifetime!
- Juan Lavista of Bing hit on a number of key myths around big data:
- the most important part of big data is its size
- to do big data, all you need is Hadoop
- with big data, theory is no longer needed
- data scientists are always right 🙂
QUOTE OF THE DAY: “Correlation does not yield causation.” – Juan Lavista (Bing)
- Anthony Scriffignano was quick to admonish the audience that “it’s not just about data, it’s not just about the math… [data] relationships matter.”
- Utah’s state government is taking a very progressive view of where analytics can help drive efficiency at the state level – census data use, welfare system fraud, etc. It appears Utah is taking a leadership position here.
I also had the privilege of moderating a panel on the convergence of the HPC and big data spaces, with representatives on the panel from Dell (Armando Acosta), Intel (Brent Gorda), and the Texas Advanced Computing Center (Niall Gaffney). Some great discussion about the connections between the two, plus tech talk on the Lustre plug-in and the SLURM resource management project.
Additionally, Dell product strategists Sanjeet Singh and Joey Jablonski presented on a number of real user implementations of big data and analytics technologies – from university student retention projects to building a true centralized, enterprise data hub. Extremely informative.
All in all, a great day one!
If you’re out here, stop by and visit us at the Dell booth. We’ll be showcasing our Hadoop and big data solutions, as well as some of the analytics capabilities we offer.
(We’ll also be giving away a Dell tablet on Thursday at 1:30, so be sure to get entered into the drawing early.)
Stay tuned, and I’ll drop another update tomorrow.
Until next time,
JOSEPH
@jbgeorge
NOW HIRING: Dell’s Revolutionary Cloud and Big Data Team Expands
We’re growing!
The Revolutionary Cloud and Big Data Team at Dell (the company I work for) is looking to expand our team of rockstars, so we’re putting the word out. Specifically, we’re looking for architects, engineers, and developers, and I’m looking to hire a few more senior product managers to join my team of subject matter experts.
Just for context, we’re the team that has taken to market the Dell OpenStack-Powered Cloud Solution, the Dell Apache Hadoop Solution, and the Dell Crowbar software framework and open source project.
And if you’re a rockstar in any of those spaces, we’d like to talk to you.
SPOILER ALERT – If you’re interested in talking to us about a technical spot on our team, you can email us your info and resume at OpenStack@Dell.com or Hadoop@Dell.com.
What is this team about?
A few years ago, the Dell Data Center Solutions team came into being with a mission of servicing the biggest hyperscale environments in the world, which included many of the market’s top cloud providers. It has succeeded in its mission of dominating the density-optimized space (check out more on that here), and in fact, just shipped its ONE MILLIONTH SERVER.
An extension of DCS’s mission soon became clear – as many customers were looking to accelerate into spaces like cloud and big data, providing them integrated solutions would ease their implementation of these technologies. And so our Revolutionary Cloud and Big Data Solutions team was born – to deliver integrated solutions based on cutting edge technologies like OpenStack and Hadoop (and more), as well as innovative Dell projects like Crowbar, in an effort to enable customers to grow and thrive in their businesses with our products, innovation, and expertise.
Who are we?
The team at Dell is made up of a number of people, like myself, that you’d recognize from OpenStack and Hadoop circles – folks like Rob Hirschfeld, Greg Althaus, Kamesh Pemmaraju, and others. We all come from a variety of backgrounds – some from big companies in the technology spaces and many from startups – we happen to have quite a few entrepreneurs on our team! And we try to service our customers in the best way possible – agile development processes, open source friendly, community oriented, etc.
What are we trying to do?
Our mission is to develop and deliver HW+SW+Services solutions to market that will enable our customers to be successful. Clear and simple.
Here’s a sampling of what our team has done over the course of our existence:
- The first hardware solutions vendor to support OpenStack
- Released the first HW+SW+Services OpenStack solution to market – the Dell OpenStack-Powered Cloud Solution
- Launch of open source project “Crowbar” to fill the void of an automated bare metal OpenStack provisioner
- Released HW+SW+Services Apache Hadoop solution to market – the Dell Apache Hadoop Solution
- Launch of the Emerging Solutions Ecosystem Partner Program to enable our customers by incorporating some of our best in breed partner technologies into our solutions, which includes Datameer, Pentaho, enStratus, Mirantis, and Canonical, with more to come
- Launch of the Emerging Solutions Platform Partner Program to enable our customers by delivering solutions focused on specific workloads and target markets
In addition, we’re big believers in the community – we regularly hold hackfests to help move these communities forward, lead community meetups in Austin and Boston working with other key vendors that co-sponsor with us (you may be surprised), and are regularly active in IRC, Skype discussions, conference breakout sessions, and more.
It’s a fast-paced, customer-focused, ever-evolving group, and it’s a great place to deliver tangible, difference-making solutions to customers.
It’s not for the faint of heart, but it’s DEFINITELY for the mover and shaker.
Who we want to hear from
We’re looking to expand in a number of areas, but specifically we’re looking for technical talent:
- Developers / QA
- Technical Product Managers and Strategists
- Architects and Technical Leads
If I’ve piqued your interest, drop me a note and your resume at OpenStack@Dell.com.
Look forward to hearing from cloud / big data / open source rockstars.
Until next time,
JOSEPH
@jbgeorge
HADOOP WEBINAR: “New Business Insights with Hadoop Analytics”
Hadoop World last week was a blast, so hopefully, you’re still on a Hadoop high and checking out all the new ideas coming from Dell (the company I work for), and others in the space.
And to keep the good times going, Dell is joining forces with our partner, Datameer, to host a webinar deep diving into Hadoop analytics.
Jeff Stacey, Dell’s Senior Product Manager of Big Data Solution (which includes our Dell Apache Hadoop Solution), will be co-hosting this webinar with Datameer as they dig into real-world examples and use cases of how companies are taking advantage of hardware and software advances to analyze data with Hadoop. They’ll take a look at numerous data sources that are being leveraged, and how this wealth of data is already providing critical new insights in industries ranging from financial services to new media.
Here’s all you need to know:
- Date: Wed, Nov 7, 2012
- Time: 10:00 AM Pacific / 1:00 PM Eastern
- What: Dell + Datameer webinar – “New Business Insights with Hadoop Analytics”
- Register here: LINK
Come check out how analytic use cases spanning marketing, internet security, asset risk management, product usage and IT infrastructure are already driving competitive advantages and operational efficiencies.
If you’d like to learn more about how Dell is making our customers successful with Hadoop via the Dell Apache Hadoop Solution, visit www.Dell.com/Hadoop or drop us a line at Hadoop@Dell.com.
See you at the webinar!
Until next time,
JOSEPH
@jbgeorge
Highlights from the 2012 Hadoop World
Had a great time at last week’s Hadoop World, so wanted to write up a few of my thoughts from the event.
- This year’s Hadoop World was the best attended to date – I believe I heard the attendee number to be at 2500 vs 1400 last year! It’s great to see this kind of growth among the community considering there were only 500 attendees just four years ago.
- In some similarities to what I’m seeing in the OpenStack community, this conference seemed to boast more from the “user” ranks as opposed to just developers as in the recent past. It speaks volumes to the general adoption that Hadoop is seeing in the market.
- Dell, the company I work for, and our Ecosystem Partner Datameer hosted a networking event for a number of folks at Hadoop World at the prestigious Circo NYC restaurant – great food and a great time with some innovative Hadoop implementers. Got to really dig in depth into how real people are implementing Hadoop in their environments today. Appreciate those that took the time out to attend, and for those who missed out, see you next time!
- Cloudera announced their beta project called “Impala”, which allows users to perform real-time queries of their data, a feature that a number of Hadoop users have been anticipating. According to Cloudera, Impala can process queries up to 30 times faster than Hive / MapReduce – very cool, and I look forward to checking it out.
- Finally, Dell made an announcement about our donation of “Zinc”, an ARM-based server concept, to the Apache Software Foundation, with support from our partner, Calxeda, where we see ARM infrastructures as an interesting technology for Hadoop environments. The donation includes hosting and technical support for the Apache community, and we’re hosting the server concept at an Austin-based co-location. The Apache Hadoop project has actually performed more than a dozen builds within the first 24 hours of the servers’ deployment. (You can check out the full press release here to learn more.)
All in all, Hadoop World is another hit! It was a great event overall and I look forward to next year’s conference.
To learn more about the Dell Apache Hadoop Solution and more about what Dell is doing in this space, visit us at www.Dell.com/Hadoop.
And if you want to chat about how Dell can help you with your Hadoop initiative, drop me an email at Hadoop@Dell.com.
Until next time,
JOSEPH
@jbgeorge
Dell @ Hadoop World 2012: Experts, Solutions, and Networking Event

- Date: Tuesday, October 23, 2012
- Time: 6:30 – 8:30 p.m. EST
- Place: Circo NYC, 120 W. 55th Street, New York, NY 10019, (212) 265-3636, circonyc.com
- Circo: offers upscale Italian fare built upon a foundation of signature Tuscan recipes from the kitchen of Maccioni matriarch Egidiana and prepared by Executive Chef Michael Galata. The menu is served in a lively, sophisticated setting reminiscent of the old-style European circus tents which inspired the restaurant’s name.
If you’re interested in joining us, be sure to RSVP with Dianna Doan (ddoan@datameer.com) ASAP. There are only a few spots left, so be sure to RSVP now.
I’ll be there, so I hope to see you too.
Looking forward to a great week!
Until next time,
JOSEPH
@jbgeorge
Highlights from the Open Source Business Conference 2012
Last week I had the pleasure to head (back) to San Francisco to spend a few days with other open source believers at this year’s Open Source Business Conference. I was there on behalf of Dell, the company I work for.
Here are some of my thoughts from the sessions / keynotes I sat in on this past week.
- Jim Whitehurst of Red Hat spoke at a keynote and highlighted how the innovation that will be built on IaaS is where the revolution will reside, and that the role vendors will play in this new open source friendly enterprise will focus more on support and services.
- There was a great open source panel with personnel from Yahoo, Warner Music, Blackduck, Acquia, and NorthBridge that talked through real use cases at Yahoo and Warner, plus feedback on their annual open source survey which talked through the rise of open source adoption in the enterprise, how quality and cost is driving that, and how many companies are viewing open source software as a starting point for projects now, rather than an alternative option.
- HP’s Biri Singh talked through their cloud strategy, including their tiered strategy of IaaS + ecosystem + marketplace. Turns out they’re using quite a bit of open source as they are building out their public cloud with focus on web services at scale.
- In a panel on “Amazon vs the world”, panelists from Canonical, Eucalyptus, and Citrix talked about open private cloud with the backdrop of Amazon’s dominance as a public cloud provider. AWS API compatibility came up a lot, as well as the need to productize open source technologies more. Some opportunities that were highlighted included the need to have vendors who know more than just software, but also the “wiring” of actual working systems, and the importance of staying open as we are just starting to see adoption by the enterprise.
- CloudScaling hosted a great session on why open cloud is winning – how internet companies drove cloud technologies and how they were built with open source, the differences between the “Enterprise IT cloud” and the “Next Gen IT cloud”, and how “no lock-in” + flexibility + scale are the key tenets of open cloud.
Obviously there was a lot more at the event that I was not able to get to – You can check out a few of the presentation slides at https://www.eiseverywhere.com/ehome/31601/50199/?&
If you were out there last week, be sure to leave a comment with your thoughts.
I enjoyed the few days out there – looking forward to the next open source event – likely in San Fran again. 🙂
Until next time,
JBGeorge
@jbgeorge
Play Ball! Hadoop Players Sponsor Big Data Event in Chicago
What does data analytics have to do with baseball????
Well actually, quite a bit. Moneyball anyone?
(If you haven’t seen it, I highly recommend it. It’s an adaptation of the true story of Billy Beane and the Oakland A’s using intense number crunching to build a solid baseball team in a smaller market, competing with bigger markets – and bigger salaries.)
The Technology
Last week, I had the pleasure of representing Dell (the company I work for), as we joined Intel, Cloudera, and Clarity to meet with a number of customers at the Ivy League Baseball Club across from Wrigley Field, right before the Cubs – Cardinals game. It was great to talk to customers who were using Hadoop, as well as those that were just learning about the technology.
The presentation delivered by all four companies focused on the Dell Apache Hadoop Solution, a powerful packaged solution that features:
- A reference architecture featuring Intel technology
- A set of software which includes Cloudera’s CDH distribution (with option to upgrade to Cloudera Enterprise), along with Dell’s innovative Crowbar software framework to enable easy provisioning and management
- Services provided by a combination of Dell, Cloudera, and Clarity, to provide our customers with deployment, support, and consulting services
The Experience
Even more impactful than the presentation was the more 1:1 time after the presentation, where many users and newbies shared stories, experiences, best practices, etc. Got to hear about a lot of the struggles around “going it alone”, and enthusiasm that Dell and our partners were delivering a solution that would make that a bit simpler.
Here’s a sampling of some of the topics that came up.
Why should I care about big data / Hadoop?
Here’s the thing: you have data. It’s in your sales tracking system, from your website traffic, from your social media outlets, in your customer support databases, and more. And not only do you have data, you have A LOT of data. But here’s the power of data. Your company has strategic objectives, customer strategies, and product plans. Data gives you insight into how to best spend your resources, where to focus your product development, where your customers are buying your products, and what problems they are encountering. This enables your business to make intelligent decisions to better satisfy your customers.
I already have a data warehousing solution – what’s the benefit of Hadoop?
Many analytics solutions today require data to be in a format that adheres to the standards of a relational database (aka structured data). This is fine for data that conforms to this format. However, a lot of the new data that is available to us is not formatted in that manner – this is referred to as unstructured data. Unstructured data includes data types such as audio, video, graphics, log files, etc. Hadoop as a technology handles unstructured data very well, allowing for analysis of those types of data. Additionally, a number of the traditional enterprise-level analytics solutions are building Hadoop connectors so that Hadoop-processed data can be used by the enterprise tool set. Finally, as data scales, using an open source based technology like Hadoop makes things very cost efficient.
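To make the unstructured data point a bit more concrete, here is a minimal sketch of the map / reduce pattern that Hadoop is built around, applied to raw log lines. This is plain Python running on one machine, not Hadoop itself and not part of the Dell solution. The log lines, the regular expression, and the function names (map_phase, reduce_phase, run_job) are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical raw web-server log lines: free-form text with no relational schema.
LOG_LINES = [
    '10.0.0.1 - - [23/Oct/2012:10:00:01] "GET /products HTTP/1.1" 200',
    '10.0.0.2 - - [23/Oct/2012:10:00:02] "GET /support HTTP/1.1" 500',
    '10.0.0.1 - - [23/Oct/2012:10:00:05] "GET /products HTTP/1.1" 200',
    'malformed line that a strict schema would reject',
]

# Assumed log layout: the HTTP status code is the last field, after the quoted request.
STATUS_RE = re.compile(r'"\s(\d{3})$')


def map_phase(line):
    """Emit a (status_code, 1) pair for each parsable line; skip the rest."""
    match = STATUS_RE.search(line)
    if match:
        yield match.group(1), 1


def reduce_phase(pairs):
    """Sum the counts per key, as a reducer would after the shuffle step."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return reduce_phase(pairs)


status_counts = run_job(LOG_LINES)
print(status_counts)
```

The line that doesn’t fit the expected layout is simply skipped rather than rejected up front, which is roughly the appeal of this style of processing for messy data like log files. In a real cluster, the map and reduce steps would run in parallel across many nodes.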
How does the Dell Apache Hadoop Solution help me with Hadoop?
Before this solution was made available, many of our Dell customers came to us asking, “If Dell was going to build a Hadoop solution, how would you design it?” And this was how we started down the path of Hadoop. What we discovered was many customers had pockets of Hadoop projects in their companies, but progress was at a crawl. Many of the issues were around infrastructure design, deployment, and overall general help around the technology. And that is the basis for the Dell Apache Hadoop Solution – making Hadoop accessible, quick, and simple to deploy from bare metal and get to a functional Hadoop cluster ASAP. We’ve enabled many of these customers to go from a science experiment to a productive Hadoop instance very quickly, and provide them the consulting and education they need to maximize its benefit.
You can learn more about what Dell is doing with Hadoop at www.Dell.com/Hadoop or you can drop me an email at Hadoop@Dell.com.
The Game
For those of you not interested in sports, you can now turn your TVs off – about to talk baseball for a bit.
As far as the game went, it was a doozy. I have ties to Chicago, so I was rooting for the Cubs.
- The Cubs were up 1-0 most of the game until the top of the 8th, when Cardinal Matt Holliday knocked out a 2-run homer
- Trailing in the bottom of the 9th, Cubs first baseman Bryan LaHair hit a homer to tie it up 2-2, and take us into extra innings
- Here’s where the fireworks really began!
- Bottom of the 10th
- Cubs LF Tony Campana gets on base with a single
- Campana then tries to steal 2nd and barely makes it
- Cardinals manager Mike Matheny did not agree and made a federal case out of it with the 2nd base umpire
- And out goes Matheny – ejected!
- Cardinals walked LaHair
- With two men on base, Cubs LF Alfonso Soriano gets a single and drives Campana home for the 3-2 win!
- Prior to this, the Cardinals had beaten the Cubs in the LAST THIRTEEN SERIES between the two clubs. With this win, that streak has been broken.
Great game, great crowd, great partners! Thanks to everyone who came out. I look forward to the next one. 🙂
Until next time,
JBGeorge
@jbgeorge