Living Large: The Challenge of Storing Video, Graphics, and other “LARGE Data”
UPDATE SEP 2, 2016: SUSE has released a brand new customer case study on “large data” featuring New York’s Orchard Park Police Department, focused on video storage. Read the full story here!
This is a duplicate of a blog I authored for SUSE, originally published at the SUSE Blog Site.
Everyone’s talking about “big data” – I’m even hearing about “small data” – but those of you who deal in video, audio, and graphics are in the throes of a new challenge: large data.
Big Data vs Large Data
It’s 2016 – we’ve all heard about big data in some capacity. Generally speaking, it is truckloads of data points from various sources, massive in its volume (LOTS of individual pieces of data), its velocity (the speed at which that data is generated), and its variety (both structured and unstructured data types). Open source projects like Hadoop have been enabling a generation of analytics work on big data.
So what in the world am I referring to when I say “large data?”
For comparison: while “big data” is a huge number of individual data points, each of “normal” size, I’m defining “large data” as an individual piece of data that is massive in its size. Large data generally doesn’t require real-time or fast access, and is often unstructured in its form (i.e., it doesn’t conform to the parameters of relational databases).
Some examples of large data:
- video files (e.g. surveillance footage, media production)
- audio files
- high-resolution graphics and imaging
Why Traditional Storage Has Trouble with Large Data
So why not just throw this into our legacy storage appliances?
Traditional storage solutions (much of what is in most datacenters today) are great at handling standard data – meaning data that is:
- average / normal in individual data size
- structured and fits nicely in a relational database
- when totaled up, doesn’t exceed roughly 400TB in total space
Unfortunately, none of this works for large data. Large data, due to its size, can consume traditional storage appliances VERY rapidly. And, since traditional storage was developed when data was thought of in smaller terms (megabytes and gigabytes), large data on traditional storage can bring about performance / SLA impacts.
So when it comes to large data, traditional storage ends up being consumed too rapidly, forcing us to consider adding expensive traditional storage appliances to accommodate.
“Overall cost, performance concerns, complexity and inability to support innovation are the top four frustrations with current storage systems.” – SUSE, Software Defined Storage Research Findings (Aug 2016)
Object Storage Tames Large Data
Object storage, as a technology, is designed to handle storage of large, unstructured data from the ground up. And since it was built on scalable cloud principles, it can scale to terabytes, petabytes, exabytes, and theoretically beyond.
When you introduce open source to the equation of object storage software, the economics of the whole solution become even better. And since scale-out, open source object storage is essentially software running on commodity servers with local drives, the object storage should scale without issue – as more capacity is needed, you just add more servers.
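The “just add more servers” property comes from hash-based data placement: each object’s location is computed from its name, so no central lookup table has to grow with the cluster. Ceph’s actual CRUSH algorithm is far more sophisticated (it accounts for failure domains, device weights, and replica placement), but a toy rendezvous-hashing sketch in Python shows the basic idea – when a server is added, only the objects that belong on the new server move:

```python
import hashlib

def node_for(obj_key, nodes):
    """Pick the node with the highest hash score for this object
    (rendezvous hashing -- a toy stand-in for CRUSH, not the real thing)."""
    def score(node):
        digest = hashlib.sha256(f"{node}:{obj_key}".encode()).hexdigest()
        return int(digest, 16)
    return max(nodes, key=score)

nodes = ["server-1", "server-2", "server-3"]
objects = [f"video-{i:03d}.mp4" for i in range(1000)]
before = {o: node_for(o, nodes) for o in objects}

# Add a fourth server: only objects that now score highest on the
# new node relocate; everything else stays exactly where it was.
after = {o: node_for(o, nodes + ["server-4"]) for o in objects}
moved = sum(1 for o in objects if before[o] != after[o])
print(f"{moved} of {len(objects)} objects moved")  # roughly a quarter, all to server-4
```

Under rendezvous hashing an object only ever moves to the newly added node, so growing the cluster rebalances roughly 1/N of the data instead of reshuffling everything – which is why capacity can be added without disruption.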
When it comes to large data – data that is unstructured and individually large, such as video, audio, and graphics – SUSE Enterprise Storage provides the open, scalable, cost-effective, and performant storage experience you need.
SUSE Enterprise Storage – Using Object Storage to Tame Large Data
- It is designed from the ground up to tackle large data. The Ceph project, which is the core of SUSE Enterprise Storage, is built on a foundation of RADOS (Reliable Autonomic Distributed Object Store) and leverages the CRUSH algorithm to scale data across whatever size cluster you have available, without performance hits.
- It provides a frequent and rapid innovation pace. It is 100% open source, which means you have the power of the Ceph community to drive innovation, like erasure coding. SUSE passes these advantages on to its customers by providing a fully updated release every six months, while other Ceph vendors deliver large data features only as part of a once-a-year release.
- It offers pricing that works for large data. Many object storage vendors, both commercial and open source, charge you based on how much data you store: storing 100TB costs more than storing 50TB, and 400TB costs more still – even if all that data sits on a single server! SUSE instead provides its customers “per node” pricing – you pay subscription only as servers are added. And when you use storage-dense servers, like the HPE Apollo 4000 storage servers, you get tremendous value.
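One of the community innovations mentioned above, erasure coding, stores an object as data chunks plus parity chunks so it can survive drive or server failures with far less overhead than keeping full replicas. Production systems like Ceph use general Reed-Solomon-style codes with configurable data/parity counts; the single-parity XOR case below is just a minimal Python illustration of the principle:

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Split a "large" object into two data chunks and one XOR parity chunk.
data = b"frame-data-from-a-large-video-file!!"  # even length, so halves match
half = len(data) // 2
chunk_a, chunk_b = data[:half], data[half:]
parity = xor_bytes(chunk_a, chunk_b)

# Lose chunk_b (say, a failed drive) and rebuild it from what survived.
recovered_b = xor_bytes(chunk_a, parity)
assert chunk_a + recovered_b == data  # the full object is intact
```

Here the overhead is 50% (one parity chunk for two data chunks) versus 200% for three-way replication – the economics that make erasure coding attractive for large data.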
Why wait? It’s time to kick off a conversation with your SUSE rep on how SUSE Enterprise Storage can help you with your large data storage needs. You can also click here to learn more.
Until next time,