HOW TO COMPARE ETL TOOLS

Use these criteria to choose the best ETL tool for your data integration needs.
Charles Wang
May 26, 2021

ETL means “extract, transform, and load” and refers to the process of moving
data to a place where you can do data analysis. ETL tools make that possible for
a wide variety of data sources and data destinations. But how do you decide
which ETL tool — or ELT tool — you need?  

The first step to making full use of your data is getting it all together in one
place — a data warehouse or a data lake. From that repository you can create
reports that combine data from multiple sources and make better decisions based
on a more complete picture of your organization’s operations.

Theoretically, you could write your own software to replicate data from your
sources to your destination — but generally it’s ill-advised to build your own
data pipeline.

Fortunately, you don’t have to write those tools, because data warehouses are
accompanied by a whole class of supporting software to feed them, including open
source ETL tools, free ETL tools, and a variety of commercial options. A quick
look at the history of analytics helps us zero in on the top ETL tools to use
today.

The key criteria for choosing an ETL tool include:

 * Environment and architecture: Cloud native, on premises, or hybrid?
 * Automation: You want to move data with as little human intervention as
   possible. Important facets of automation include:
   * Programmatic control
   * Automated schema migration
   * Support for slowly changing dimensions (SCD)
   * Incremental update options
 * Reliability
   * Repeatability, or idempotence

We’ll cover each of these in detail below. But first, here’s a short background
on how ETL tools came about and why you need an ETL tool.


THE RISE OF ETL

In the early days of data warehousing, if you wanted to replicate data from your
in-house applications and databases, you’d write a program to do three things:

 1. extract the data from the source,
 2. change it to make it compatible with the destination, then
 3. load it onto servers for analytic processing.

The process is called ETL — extract, transform, and load.
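
To make those three steps concrete, here is a toy sketch in Python. It is only
an illustration: the table names, columns, and local SQLite files stand in for
real source systems and analytics servers, and a real pipeline would be far
more involved.

import sqlite3

# Extract: pull raw rows out of the source application's database.
source = sqlite3.connect("source_app.db")
rows = source.execute("SELECT id, email, signup_ts FROM users").fetchall()

# Transform: reshape the data to fit the destination's model *before* loading,
# the defining trait of classic ETL (here, lowercase the email and keep only the date).
transformed = [(user_id, email.lower(), signup_ts[:10]) for user_id, email, signup_ts in rows]

# Load: write the conformed rows into the analytics server.
warehouse = sqlite3.connect("warehouse.db")
warehouse.execute("CREATE TABLE IF NOT EXISTS dim_users (id INTEGER, email TEXT, signup_date TEXT)")
warehouse.executemany("INSERT INTO dim_users VALUES (?, ?, ?)", transformed)
warehouse.commit()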



Traditional data warehouse providers such as Teradata, Greenplum, and SAP HANA
offer data warehouses for on-premises machines. Analytics processing can be
CPU-intensive and involve large volumes of data, so these data-processing
servers have to be more robust than typical application servers — and that makes
them a lot more expensive and maintenance-intensive.

Moreover, the ETL workflow is quite brittle. The moment data models either
upstream (at the source) or downstream (as needed by analysts) change, the
pipeline must be rebuilt to accommodate the new data models.

These challenges reflect the key tradeoff made under ETL: conserving computation
and storage resources at the expense of labor.


CLOUD COMPUTING AND THE CHANGE FROM ETL TO ELT

In the mid-aughts, Amazon Web Services began ramping up cloud computing. By
running analytics on cloud servers, organizations can avoid high capital
expenditures for hardware. Instead, they can pay for only what they need in
terms of processing power or storage capacity. That also means a reduction in
the size of the staff needed to maintain high-end servers.

Nowadays, few organizations buy expensive on-premises hardware. Instead, their
data warehouses run in the cloud on AWS Redshift, Google BigQuery, Microsoft
Azure Synapse Analytics, or Snowflake. With cloud computing, workloads can scale
almost infinitely and very quickly to meet any level of processing demand.
Businesses are limited only by their budgets.

An analytics repository that scales means you no longer have to limit data
warehouse workloads to analytics tasks. Need to run transformations on your
data? You can do it in your data warehouse — which means you don’t need to
perform transformations in a staging environment before loading the data.

Instead, you load the data straight from the source, faithfully replicating it
to your data warehouse, and then transform it. ETL has become ELT — although a
lot of people are so used to the old name that they still call it ETL.
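
The difference is easy to see in code. In this minimal ELT sketch — again with
SQLite standing in for a cloud warehouse and invented table names — the records
land raw, and the reshaping happens afterward, in SQL, inside the warehouse:

import json
import sqlite3

warehouse = sqlite3.connect("warehouse.db")

# Load first: land the source records faithfully, as raw JSON, with no reshaping.
raw_events = [{"user_id": 1, "type": "purchase", "amount": 42.0}]
warehouse.execute("CREATE TABLE IF NOT EXISTS raw_events (payload TEXT)")
warehouse.executemany("INSERT INTO raw_events VALUES (?)",
                      [(json.dumps(event),) for event in raw_events])

# Transform later, inside the warehouse, with SQL that analysts can own and revise.
warehouse.executescript("""
CREATE TABLE IF NOT EXISTS purchases AS
SELECT json_extract(payload, '$.user_id') AS user_id,
       json_extract(payload, '$.amount')  AS amount
FROM   raw_events
WHERE  json_extract(payload, '$.type') = 'purchase';
""")
warehouse.commit()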

We go into more detail in ETL vs. ELT: Choose the Right Approach for Data
Integration.


WHICH IS THE BEST ETL TOOL FOR YOU?

Now that we have some context, we can start answering the question: Which ETL
tool or ETL solution is best for you? The four most important factors to
consider are environment, architecture, automation, and reliability.


ENVIRONMENT

As we saw in our discussion about the history of ETL, data integration tools and
data warehouses were traditionally housed on premises. Many older, on-premises ETL
tools are still around today, sometimes adapted to handle cloud data warehouse
destinations.

More modern approaches leverage the power of the cloud. If your data warehouse
runs on the cloud, you want a cloud-native data integration tool that was
architected from the start for ELT.


ARCHITECTURE

Speaking of ELT, another important consideration is the architectural difference
between ETL and ELT. As we have previously discussed, ETL requires high upfront
monetary and labor costs, as well as ongoing costs in the form of constant
revision.



By contrast, ELT radically simplifies data integration by decoupling extraction
and loading from transformations, making data modeling a more analyst-centric
rather than engineering-centric activity.


AUTOMATION

Ultimately, the goal is to make things as simple as possible, which leads us
right to automation. You want a tool that lets you specify a source and then
copy data to a destination with as little human intervention as possible.

The tool should be able to read and understand the schema of the source data,
know the constraints of the destination platform, and make any necessary
adaptations to move the data from one to the other. Those adaptations might, for
example, include de-nesting source records if the destination doesn’t support
nested data structures.
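
To picture what such an adaptation looks like, here is a small de-nesting
sketch in Python; the record shape and field names are hypothetical:

# A nested source record, e.g. from a document store or an API response.
order = {
    "order_id": 17,
    "customer": {"id": 3, "name": "Acme"},
    "items": [{"sku": "A-1", "qty": 2}, {"sku": "B-9", "qty": 1}],
}

# De-nested into flat rows a relational destination can store:
# one parent row, plus one child row per list element, linked by order_id.
order_row = {
    "order_id": order["order_id"],
    "customer_id": order["customer"]["id"],
    "customer_name": order["customer"]["name"],
}
item_rows = [
    {"order_id": order["order_id"], "sku": item["sku"], "qty": item["qty"]}
    for item in order["items"]
]
print(order_row)
print(item_rows)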

All of that should be automatic. The point of an ETL tool is to avoid coding.
The advantages of ELT and cloud computing are significantly diminished if you
have to involve skilled DBAs or data engineers every time you replicate new
data.


RELIABILITY

Of course, all this simplicity is of limited use if your data pipeline is
unreliable. A reliable data pipeline has high uptime and delivers data with high
fidelity.

One design consideration that enhances reliability is repeatability, or
idempotence. The platform should be able to repeat any sync if it fails, without
producing duplicate or conflicting data.
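
In practice, idempotence usually comes down to writing with upsert (merge)
semantics keyed on a primary key, so replaying the same batch leaves the
destination unchanged. A minimal sketch, with illustrative table and column
names:

import sqlite3

warehouse = sqlite3.connect("warehouse.db")
warehouse.execute("CREATE TABLE IF NOT EXISTS customers (id INTEGER PRIMARY KEY, email TEXT)")

batch = [(1, "a@example.com"), (2, "b@example.com")]

# Upsert keyed on the primary key: running this twice -- say, after a failed,
# retried sync -- produces exactly the same table, not duplicate rows.
warehouse.executemany(
    "INSERT INTO customers (id, email) VALUES (?, ?) "
    "ON CONFLICT(id) DO UPDATE SET email = excluded.email",
    batch,
)
warehouse.commit()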

We all know that failure happens: Networks go down, storage devices fill up,
natural disasters take whole data centers offline. Part of the reason you choose
an ETL tool is so you don’t have to worry about how your data pipeline will
recover from failure. Your provider should be able to route around problems and
redo replications without incurring data duplication or (maybe worse) missing
any data.


HOW RITUAL REPLACED THEIR BRITTLE ETL PIPELINE WITH A MODERN DATA STACK

With a modern data stack, Ritual saw a 95% reduction in data pipeline issues, a
75% reduction in query times, and a threefold increase in data team velocity.


ETL AUTOMATION EXPLAINED

Automation means expending a minimum of valuable engineering time, so it
deserves further discussion. The most essential things an automated data pipeline can
offer are plug-and-play data connectors that require no effort to build or
maintain. Automation also encompasses features like programmatic control,
automated schema migration, and efficient incremental updates. Let’s look at
each of those in turn.


PROGRAMMATIC CONTROL

Besides automated data connectors, you might want fine control over setup
particulars, such as field selection, replication type, and process
orchestration. Fivetran provides that by offering a REST API for Standard and
Enterprise accounts. It lets you do things like create, edit, remove, and manage
subsets of your connectors automatically, which can be far more efficient than
managing them through a dashboard interface.
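
As a rough sketch of what programmatic control looks like, here is a script
that pauses a connector. The endpoint path and payload follow Fivetran's v1
REST API, but treat them as illustrative and check the current API reference
before relying on them; the credentials and connector ID are placeholders.

import requests

API_KEY = "your_api_key"            # generated in the Fivetran dashboard (placeholder)
API_SECRET = "your_api_secret"      # placeholder
CONNECTOR_ID = "your_connector_id"  # hypothetical connector identifier

# Pause a connector programmatically instead of clicking through the dashboard.
response = requests.patch(
    f"https://api.fivetran.com/v1/connectors/{CONNECTOR_ID}",
    auth=(API_KEY, API_SECRET),
    json={"paused": True},
)
response.raise_for_status()
print(response.json())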

Learn more about how it works from our documentation, and see a practical use
of the API in a blog post we wrote about building a connector status dashboard.


AUTOMATED SCHEMA MIGRATION

Changes to a source schema don’t automatically modify a corresponding
destination schema in a data warehouse. That can mean doing twice as much work
just to keep your analytics up to date.

With Fivetran, we automatically, incrementally, and comprehensively propagate
schema changes from source tables to your data warehouse. We wrote a blog post
that explains how we implement automatic schema migration, but the key takeaway
is: less work for you.
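
Without that automation, keeping the two schemas in sync is a hand-written
chore that looks roughly like the sketch below (SQLite and an invented users
table stand in for real source and destination systems):

import sqlite3

source = sqlite3.connect("source_app.db")
warehouse = sqlite3.connect("warehouse.db")

def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

# Manual schema migration: find columns the source has gained and add them to
# the destination -- exactly the kind of upkeep an automated pipeline removes.
missing = column_names(source, "users") - column_names(warehouse, "users")
for column in sorted(missing):
    warehouse.execute(f"ALTER TABLE users ADD COLUMN {column} TEXT")
warehouse.commit()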


SLOWLY CHANGING DIMENSIONS

The term slowly changing dimensions (SCD) describes data in your data warehouse
that changes pretty infrequently — customers’ names, for instance, or business
addresses or medical billing codes. But this data does change from time to time,
on an unpredictable basis. How can you efficiently capture those changes?

You could take and store a snapshot of every change, but your logs and your
storage might quickly get out of hand.

Fivetran lets you track SCDs in Salesforce and a dozen other connectors on a
per-table basis. When you enable History Mode, Fivetran adds a new timestamped
row for every change made to a column. This allows you to look back at all your
changes, including row deletions.

You can use your transaction history to do things like track changes to
subscriptions over time, track the impact of your customer success team on
upsells, or analyze any other time-based process.
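
Conceptually this is a classic SCD Type 2 history table, and point-in-time
questions become simple range queries. The sketch below uses generic
valid_from/valid_to column names for illustration, not necessarily the columns
the connector emits:

import sqlite3

db = sqlite3.connect(":memory:")
# A history table: every change to a customer's plan adds a new timestamped row.
db.execute("CREATE TABLE customer_history (id INTEGER, plan TEXT, valid_from TEXT, valid_to TEXT)")
db.executemany(
    "INSERT INTO customer_history VALUES (?, ?, ?, ?)",
    [
        (1, "starter",    "2021-01-01", "2021-03-15"),
        (1, "enterprise", "2021-03-15", "9999-12-31"),  # open-ended current row
    ],
)

# Point-in-time question: which plan did customer 1 have on 2021-02-01?
row = db.execute(
    "SELECT plan FROM customer_history WHERE id = 1 AND valid_from <= ? AND ? < valid_to",
    ("2021-02-01", "2021-02-01"),
).fetchone()
print(row[0])  # -> starter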


INCREMENTAL UPDATE OPTIONS

Copying data wholesale from a source wastes precious bandwidth and time,
especially if most of the values haven’t changed since the last update.

One solution is change data capture (CDC), which we talk about in detail in a
recent blog post. Many databases create changelogs that contain a history of
updates, additions, and deletions. You can use the changelog to identify exactly
which rows and columns to update in the data warehouse.
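
Applying a changelog is simpler than it sounds. A toy sketch follows; the
changelog format and table names are invented for the example:

import sqlite3

# A simplified changelog, as a CDC reader might emit it.
changelog = [
    {"op": "insert", "id": 7, "email": "new@example.com"},
    {"op": "update", "id": 3, "email": "changed@example.com"},
    {"op": "delete", "id": 9},
]

warehouse = sqlite3.connect("warehouse.db")
warehouse.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT)")

# Apply only the changed rows instead of re-copying the whole source table.
for change in changelog:
    if change["op"] == "delete":
        warehouse.execute("DELETE FROM users WHERE id = ?", (change["id"],))
    else:  # inserts and updates can share an idempotent upsert
        warehouse.execute(
            "INSERT INTO users (id, email) VALUES (?, ?) "
            "ON CONFLICT(id) DO UPDATE SET email = excluded.email",
            (change["id"], change["email"]),
        )
warehouse.commit()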


PRICING

Beyond the top factors of environment, architecture, automation, and reliability
are a few other considerations, including security, compliance, and support for
your organization’s data sources and destinations.

Finally, if an ETL provider has ticked all of those boxes, you have to consider
pricing. ETL providers vary in how they charge for their services. Some are
consumption-based. Others factor in things like the number of integrations used.
Some put their ETL tool prices right on their websites. Others force you to
speak with a sales rep to get a straight answer. Pricing models may be simple or
complicated.

When you’ve done your due diligence, you’ll find Fivetran excels at all the key
factors we’ve covered, and we have several consumption-based pricing plans that
suit a range of businesses, from startups to the enterprise. Take a free trial,
or talk to our sales team.



