
URL: https://solved.scality.com/solved/object-storage-for-data-lakes/

WHY OBJECT STORAGE IS IDEAL FOR DATA LAKES

by Paul Speciale April 19, 2022

BY PAUL SPECIALE, CHIEF PRODUCT OFFICER, SCALITY

The term “data lake” (a centralized repository that holds a vast amount of raw
data in its native format) has only been around for about a decade. Although the
term is relatively new, data lakes are expected to reach an annual market volume
of $20.1 billion by 2025, according to Research and Markets.

Usually, a data lake houses data from many sources in multiple formats, all of
which must be analyzed to yield business insights. Increasingly, we hear “data
lake” and “big data” mentioned in the same breath. That makes sense, because big
data analytics requires a massive trove of data from which to derive insights.


THE NEED FOR FLEXIBLE, SCALABLE MANAGEMENT OF ALL DATA FORMATS

Because data lakes aggregate data from various sources, they can quickly reach
petabyte scale and beyond. This data volume exceeds the capacity of traditional
database technologies, such as relational database management systems (RDBMS),
which were primarily designed to handle structured data. 

Capacity is not the only issue: data lakes amass structured, semi-structured,
and unstructured data. To manage these different data types flexibly and at
scale, newer storage systems such as the Hadoop Distributed File System (HDFS)
have been used as data lake storage. But, like any technology, HDFS has its
limitations.

A major downside to HDFS is that its compute and storage resources are tightly
coupled as it scales (because the file system is hosted on the same machines as
the application). Compute and storage capacity therefore grow together, even
when only one of them is needed, which can end up being quite expensive.
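
To make the cost of tight coupling concrete, here is a minimal sizing sketch. Every number in it (2,000 TB of data, 200 cores of analytics demand, nodes with 50 TB and 20 cores each) is a hypothetical assumption chosen for illustration, not a figure from any real HDFS or Scality deployment, and the function names are invented for this example.

```python
import math


def coupled_nodes(storage_tb, cores, tb_per_node, cores_per_node):
    """Coupled scaling: each node adds storage AND compute together,
    so the node count is driven by whichever demand is larger."""
    by_storage = math.ceil(storage_tb / tb_per_node)
    by_compute = math.ceil(cores / cores_per_node)
    return max(by_storage, by_compute)


def decoupled_nodes(storage_tb, cores, tb_per_storage_node, cores_per_compute_node):
    """Decoupled scaling: storage and compute tiers are sized separately.
    Returns (storage_nodes, compute_nodes)."""
    return (
        math.ceil(storage_tb / tb_per_storage_node),
        math.ceil(cores / cores_per_compute_node),
    )


# Hypothetical workload: storage-heavy, compute-light.
STORAGE_TB, CORES = 2000, 200
TB_PER_NODE, CORES_PER_NODE = 50, 20

n_coupled = coupled_nodes(STORAGE_TB, CORES, TB_PER_NODE, CORES_PER_NODE)
idle_cores = n_coupled * CORES_PER_NODE - CORES
storage_nodes, compute_nodes = decoupled_nodes(
    STORAGE_TB, CORES, TB_PER_NODE, CORES_PER_NODE
)

print(f"coupled:   {n_coupled} nodes, {idle_cores} cores paid for but unused")
print(f"decoupled: {storage_nodes} storage nodes + {compute_nodes} compute nodes")
```

Under these assumed figures, the coupled cluster has to buy compute it does not need simply to reach the required capacity, while the decoupled layout sizes each tier to its own demand.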


MODERN OBJECT STORAGE OFFERS FUNDAMENTAL ADVANTAGES FOR DATA LAKES

To fully reap the business insights that lie in these massive data lakes,
organizations depend on both the analytics tools and the repository where the
data is stored. The repository is arguably the more important of the two.

Why? Because the repository must process data from various sources with just the
right performance, plus it must be able to grow in both performance and capacity
so that data is broadly available to applications, tools and users.

In the search for greater scalability, flexibility and lower cost, object
storage is quickly emerging as the storage standard for data lakes. 

With object storage, there’s no practical limit on the volume of data. Another
key benefit is that it accommodates all types of data without the need for
predefined “schemas” (unlike an RDBMS, where the structure of tables and the
relationships between them must be defined before complex queries can run).
This capability increases flexibility.
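
The schema-free model can be pictured with a toy in-memory bucket. This is only an illustrative sketch: `ToyBucket` and `StoredObject` are hypothetical names invented here, not part of any Scality or Amazon S3 API. It shows the general shape of object storage, where each object is a key, an opaque byte payload, and free-form metadata.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes                                   # opaque payload, any format
    metadata: dict = field(default_factory=dict)  # free-form key/value tags


class ToyBucket:
    """Hypothetical in-memory stand-in for an object storage bucket."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # No schema check: the bucket never inspects the payload.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key):
        return self._objects[key]

    def list(self, prefix=""):
        # Keys are flat strings; "folders" are just shared prefixes.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = ToyBucket()
# Structured, semi-structured, and unstructured data side by side.
bucket.put("raw/sales/2022-04.csv", b"id,amount\n1,9.99\n", content_type="text/csv")
bucket.put("raw/events/clicks.json", b'{"user": 7, "page": "/"}',
           content_type="application/json")
bucket.put("raw/images/scan-001.bin", bytes([0x89, 0x50, 0x4E, 0x47]),
           content_type="application/octet-stream")

print(bucket.list("raw/"))
```

Because the store treats every payload as bytes, interpreting the format is left to whichever analytics tool reads the object later.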

In addition, modern object storage systems like Scality support independent
scale-out of capacity and performance, a major bonus for large analytics
projects. Scaling the two independently delivers the right compute performance
for data analysis on demand and substantially decreases the total cost of a
data lake solution.

Object storage has also been embraced by application vendors in their quest to
solve the challenges of increasing data capacities for customers. Solutions such
as Splunk now support object storage via the SmartStore interface (which
leverages the Amazon S3 API), and Micro Focus Vertica provides Eon Mode (which
also leverages S3).

These solutions decouple the compute (search) tier from the persistent capacity
tier, giving users more flexibility and cost efficiency while at the same time
enabling much higher data volumes to make analytics more effective. Furthermore,
the Apache Spark tool ecosystem, which traditionally used HDFS for storage, is
also compatible with S3 object storage over the S3A Hadoop-compatible file
system interface, which leverages the S3 API.
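
As a rough sketch of what pointing Spark at an S3-compatible store over S3A can look like, the helper below builds the relevant settings. The `fs.s3a.*` property names are standard Hadoop S3A options (passed to Spark with the `spark.hadoop.` prefix); the function name, the endpoint URL, the credentials, and the bucket path are placeholders assumed for illustration.

```python
def s3a_spark_conf(endpoint, access_key, secret_key, path_style=True):
    """Build Spark settings that route s3a:// paths to a given S3 endpoint.

    The endpoint and credentials are supplied by the caller; nothing here
    is specific to any one object storage product.
    """
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Path-style addressing is commonly needed for on-premises S3 endpoints.
        "spark.hadoop.fs.s3a.path.style.access": str(path_style).lower(),
    }


# Placeholder values, assumed for this example only.
conf = s3a_spark_conf("https://s3.example.internal", "ACCESS_KEY", "SECRET_KEY")
for key, value in sorted(conf.items()):
    print(f"{key}={value}")

# With pyspark installed and the hadoop-aws module on the classpath, the
# settings could be applied roughly like this (not executed here):
#
#   from pyspark.sql import SparkSession
#   builder = SparkSession.builder.appName("data-lake-demo")
#   for key, value in conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
#   df = spark.read.parquet("s3a://example-bucket/raw/sales/")
```

In practice, credentials would usually come from a credentials provider or the environment rather than being passed inline as they are in this sketch.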


Want to know more about the advantages of object storage for data lakes? Read my
recent article for Data Center Dynamics.



PAUL SPECIALE

Chief Marketing Officer at Scality. Expert in Cloud Computing, Object Storage,
NAS & file systems, data management and database technologies.

