
Submitted URL: https://go.clumio.com/NjY2LUpGVS03NjgAAAGP_V-7D_xStlNKXTXtszpLdCMlciG3mtM3SSnJP7e4Uuj7OPNwnQdmhcuzYBIAwseT6uRMlBY=
Effective URL: https://clumio.com/customer-stories/atlassian/?utm_source=newsletter&utm_medium=email&utm_campaign=nl202312&mkt_tok...
Submission: On December 13 via api from US — Scanned from DE



ATLASSIAN


HOW ATLASSIAN IMPROVED JIRA CLOUD RESILIENCE


“CLUMIO’S GAME-CHANGING APPROACH TO CLOUD DATA BACKUP AND RECOVERY HAS BEEN
INSTRUMENTAL IN TACKLING SOME OF OUR BIGGEST DATA RESILIENCY CHALLENGES.”

Andrew Jackson, Senior Engineer

Atlassian unleashes the potential of every team. Their collaboration software
products like JIRA and Confluence help teams organize, discuss and complete
shared work.

Atlassian saw an opportunity to improve the resiliency of their JIRA Cloud
product, with critical customer data in Amazon S3 to the tune of 40 billion
objects at 43 petabytes, and more than 1 million changes per hour. A previous
backup effort had yielded a longer than ideal recovery time, a lot of API
operations, and a high cost.

With Clumio, Atlassian was able to back up an initial 24.5 billion objects in 17
days, without application impact. Their critical customer data is now backed up
continuously and securely, with a 15-minute RPO, and at a 70% reduction in cost.
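
To put those figures in perspective, the short calculation below converts them into average rates. It is only back-of-the-envelope arithmetic on the numbers quoted above; the function names are illustrative, and real throughput would vary over the 17 days rather than stay at the average.

```python
# Figures quoted in the summary above.
INITIAL_OBJECTS = 24_500_000_000   # objects in the initial backup
INITIAL_DAYS = 17                  # duration of the initial backup
CHANGES_PER_HOUR = 1_000_000       # "more than 1 million changes per hour"

SECONDS_PER_DAY = 24 * 60 * 60


def objects_per_second(objects: int, days: float) -> float:
    """Average sustained copy rate needed to finish in the given time."""
    return objects / (days * SECONDS_PER_DAY)


def changes_per_window(changes_per_hour: int, window_minutes: int) -> float:
    """Average number of changes that accumulate in one backup window."""
    return changes_per_hour * window_minutes / 60


initial_rate = objects_per_second(INITIAL_OBJECTS, INITIAL_DAYS)
window_changes = changes_per_window(CHANGES_PER_HOUR, 15)

print(f"initial backup: about {initial_rate:,.0f} objects/s on average")
print(f"steady state: about {window_changes:,.0f} changes per 15-minute window")
```

In other words, the initial backup averaged roughly 16,700 objects per second, and each 15-minute window has to absorb at least 250,000 changes.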

TRANSCRIPT



 

Woon: My name is Woon, I’m the co-founder and CTO of Clumio. And today I have
Andrew Jackson from Atlassian.

Andrew: Thank you, Woon. Hi, everyone. My name is Andrew. I’m a Senior Engineer
at Atlassian, and I spend my days working with teams to solve problems
surrounding data, with my most recent focus being disaster recovery. It’s an
absolute pleasure to be here today to talk a little bit about our successful
collaboration with Clumio, and how their game-changing approach to cloud data
backup and recovery has been instrumental in tackling some of our biggest data
resiliency challenges.

 



Andrew: I’m going to quickly run through the agenda for today’s session.

 * I’m going to provide some context about Atlassian and what we’re all about,
   and details like a high-level walkthrough of our architecture.
 * Then I’ll touch on one of the most complex problems in this line of
   work, data resiliency.
 * I’ll provide some of the complexities and challenges associated with the
   space.
 * Then I’ll pass it over to Woon who will delve deeper into the efforts
   required to provide a highly performant and highly scalable S3 backup
   solution.
 * And then we’ll come together, talk a little bit more about the implementation
   at Atlassian specifically, as well as the benefits achieved.

 



Andrew: So what is Atlassian? At its core, it’s a company that drives innovation
and collaboration, with the mission being simple yet powerful, to unleash the
potential of every team. And Atlassian helps facilitate this by providing
software that you probably use already in your day-to-day work. Some of you may
have already used JIRA for project management capabilities, Confluence for
extensive content collaboration, Bitbucket for source code management, or even
Trello to organize tasks in a fun and flexible way. Those are just four of a
variety of different products we provide.

 




Andrew: But in line with the mission, it’s not necessarily about the range of
the products, but more about how they’re used. And since being founded in 2002,
Atlassian has amassed over 260,000 customers worldwide with some of the biggest
names in various industries. What that really means is that whether you’re
coming from a small startup or a large enterprise, these tools have been
available to help streamline processes, enhance productivity, and foster
innovation.

But with 260,000 customers, the amount of data that we have to process daily is
vast, not just vast in terms of the volume of data, but also because of the
requirements and compliance standards associated with that data. So I’m going to
quickly walk through the high-level architecture to give you some ideas about
how we operate.




Andrew: At its core, Atlassian is powered by a variety of different AWS
services, including compute, data, network, and storage-related services. These
services underpin Atlassian’s own internal platforms and services that in turn
provide capabilities for our products. And these product capabilities include
things such as the JIRA issue service, or Confluence analytics. You’ve got
things like the editor in Confluence, as well as media identity and commerce for
anyone familiar with those aspects.

Typically an Atlassian product consists of multiple containerized services that
have been deployed on AWS using our in-house provisioning layer, Micros, which
effectively orchestrates our AWS deployments. And these services contain a
variety of different features from request handling, transactional
user-generated content, authentication management, data lakes, observability and even
analytical services. And with all these different services powering the
Atlassian ecosystem, we need to have extremely strong foundations, which in this
environment is our data source.

 




Andrew: For that purpose today, I’ll focus on S3, because S3 in itself is
probably the largest data store we’ve got due to the vast variety of data that
has to be received and processed daily. Now, with such a significant
architecture comes equally tricky challenges. First off, in an era when data is
vital, it actually becomes an increasingly complex task to handle the amount of
data that we’ve got, as more unique ways to view and transform data actually
occur every day. And that means it becomes harder to guarantee our requirements
are satisfied.

The second aspect of it is that compliance and scalability are crucial. It was
becoming an arduous task to guarantee requirements were satisfied as more and
more customers onboarded to our platform. And what this really meant is that
these challenges are not just operational hurdles, but they’re actually
impediments to the quality experience that people expect from Atlassian.

And so we knew that there had to be a data backup solution that didn’t just meet
our requirements for minimal downtime and quick recovery, but could actually
tick off some of these requirements, like 99.95% uptime, data residency
controls, and a one-hour RPO. These are just the highlights of those requirements,
but there are substantially more.

And so we were keen to find out what was on the market. We investigated
internally, and luckily Clumio reached out to us after seeing our interest in
the space. We collaborated very closely, and through that collaboration we were
able to obtain an optimal solution that enabled us to back up the S3 data at
that scale of 40 petabytes, whilst at the same time ensuring that we had
safety and accessibility in the event of unforeseen circumstances.

I’m going to pass it over to Woon, who will delve a little bit deeper into the
efforts required to provide such an optimal solution that has effectively
exceeded our expectations in that space, turning those challenges into a success
story.

 




Woon: Thank you, Andrew. What I want to start with first is the high-level
overview of the architecture, and then go deeper. First of all, on the left
hand side, we have the customer account, which in this example would be
Atlassian’s AWS account. That bucket in the middle is the bucket that we’re
trying to protect.

The way that you onboard is pretty straightforward. You onboard by installing
either a CloudFormation template or a Terraform template that we provide.
Obviously, every environment is unique, so we also allow you to customize that
CloudFormation template to fit the needs of your environment. Once that
CloudFormation template is installed, we install all the assets needed to carry
out the backup.

For example, an IAM role gets created in your AWS environment that we use to
carry out all the operations needed for backups. Along with that, we also
install things like S3 Inventory and S3 EventBridge. These are all mechanisms
for us to get that list of objects to be backed up.

The S3 Inventory will get us the full list of the objects, whereas the S3
EventBridge will get us the delta, or the changes happening in your bucket
minute by minute. And these are the technologies that allow us to build
continuous backups. We back up every 15 minutes, providing the 15-minute RPO
through that EventBridge integration.

Once that is set up, on the right-hand side is everything that is actually
managed by Clumio. This includes all the processing, cataloging, and data
verification, which all happens on the Clumio side, with the entire architecture
being serverless. It actually scales up and down based on the load that you
have.
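
The split between a full listing and a stream of changes can be sketched in a few lines. This is a minimal illustration and not Clumio's implementation: `ChangeEvent`, `window_start`, and `plan_backups` are hypothetical names, the event shape is a simplified stand-in for real bucket notifications, and the only figure taken from the talk is the 15-minute interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)  # backup interval quoted in the talk


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical, simplified stand-in for one bucket change notification."""
    key: str
    kind: str      # "put" or "delete"
    at: datetime


def window_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute window."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + ((ts - midnight) // WINDOW) * WINDOW


def plan_backups(
    inventory: set[str], events: list[ChangeEvent]
) -> tuple[list[str], dict[datetime, dict[str, str]]]:
    """Return the one-time full-copy list plus per-window incremental work."""
    seed = sorted(inventory)  # full listing, used once for the initial backup
    windows: dict[datetime, dict[str, str]] = {}
    for ev in sorted(events, key=lambda e: e.at):
        # Within one window only the latest change to a key matters.
        windows.setdefault(window_start(ev.at), {})[ev.key] = ev.kind
    return seed, windows


t0 = datetime(2023, 12, 1, 10, 0)
events = [
    ChangeEvent("a/1.png", "put", t0 + timedelta(minutes=2)),
    ChangeEvent("a/1.png", "delete", t0 + timedelta(minutes=9)),
    ChangeEvent("b/2.txt", "put", t0 + timedelta(minutes=20)),
]
seed, windows = plan_backups({"a/1.png", "c/3.csv"}, events)
for start, changes in windows.items():
    print(start.time(), changes)
```

The inventory seeds the initial copy once; after that, each window only carries the keys that changed, which is what keeps the recurring work proportional to the change rate rather than to the bucket size.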

If there are a lot more objects to be backed up, we’ll employ more Lambda
functions, and if there are fewer, we’ll employ fewer Lambda functions, but all
that is actually completely managed by us. And then on top of that, all this
processing happens and the data is housed in a separate account that is
dedicated for that one customer. So in this example, this account is dedicated
for Atlassian, and all the data processing and housing of the data happens in
that account.
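
As a rough sketch of scaling with load, the function below sizes a worker pool from the backlog. The name `workers_needed`, the per-worker batch size, and the concurrency cap are all assumptions made for illustration; the talk does not say how Clumio sizes its Lambda fleet.

```python
import math


def workers_needed(pending_objects: int, objects_per_worker: int, max_workers: int) -> int:
    """Number of workers to launch for the current backlog.

    objects_per_worker and max_workers are hypothetical tuning knobs.
    """
    if pending_objects <= 0:
        return 0  # nothing to copy, so scale to zero
    return min(max_workers, math.ceil(pending_objects / objects_per_worker))


for backlog in (0, 2_500, 1_000_000_000):
    print(backlog, "->", workers_needed(backlog, objects_per_worker=1_000, max_workers=50))
```

The cap matters as much as the scaling: without an upper bound, a large backlog would translate directly into more requests against the source bucket.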

If I move on, in this presentation we’ll talk about some of the innovations in
the backup space and the ingestion layer that allow us to back up large buckets
like the ones that Atlassian has.

 




Woon: So let’s start first with the bucket again on the left hand side. So
imagine that you have a bucket with thirty billion objects. You have objects
starting with the prefix A, all the way to prefix Z all sorted in that list. So
at a high level, you might think, how hard could it be, right?

You fire up a bunch of Lambda functions and you start copying objects out and
that’s basically your backup. But not so fast. If you have a ton of Lambda
functions and you schedule them in a way that they all work on the same prefix,
then what’s going to happen is that you’re going to hit the API limit before
anything else.
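
One simple way to avoid piling every worker onto the same prefix is to hand out keys round-robin across prefixes instead of in sorted order. The sketch below assumes a one-character prefix purely for illustration; `interleave_by_prefix` is a hypothetical helper, not something described in the talk.

```python
from collections import defaultdict, deque
from typing import Iterable, Iterator


def interleave_by_prefix(keys: Iterable[str], prefix_len: int = 1) -> Iterator[str]:
    """Yield keys round-robin across prefixes.

    Consecutive work items then land on different prefixes whenever more
    than one prefix still has work left.
    """
    queues: dict[str, deque[str]] = defaultdict(deque)
    for key in sorted(keys):
        queues[key[:prefix_len]].append(key)

    order = deque(sorted(queues))  # prefixes that still have pending keys
    while order:
        prefix = order.popleft()
        yield queues[prefix].popleft()
        if queues[prefix]:
            order.append(prefix)  # revisit this prefix after the others


keys = ["a/1", "a/2", "a/3", "b/1", "z/1"]
print(list(interleave_by_prefix(keys)))
```

Processing the list in plain sorted order would send the first three requests to prefix `a`; the interleaved order spreads them over `a`, `b`, and `z` first.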

 




Woon: The primary application is the primary owner of that bucket. So if the
backup operation goes and steals all the API requests per second, all the TPS,
then guess what? The primary application is the one that suffers. It’s the JIRA
tickets that are going to load slowly, or the attachment that is not going to
load in a speedy manner.

So what we do is constantly look at the API back pressure. We observe and
track the API back pressure, and schedule accordingly. So if we see a ton of
back pressure happening on partition A, we reschedule the Lambda functions to
give a little bit of a break to partition A, and we do a little bit more work
on partition B because there’s more capacity there. And this process happens
all the time, dynamically, up until the backup is complete. All of this is
automatic and managed by Clumio.
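
The rebalancing idea can be illustrated with a small allocation function. This is a sketch under stated assumptions, not Clumio's scheduler: it assumes back pressure is measured as the fraction of recent requests that were throttled on each partition, and `rebalance` is a hypothetical name.

```python
def rebalance(throttle_rate: dict[str, float], total_workers: int) -> dict[str, int]:
    """Split workers across partitions in proportion to observed headroom.

    throttle_rate maps a partition to the fraction of its recent requests
    that were throttled (0.0 = no back pressure, 1.0 = fully throttled).
    """
    headroom = {p: max(0.0, 1.0 - rate) for p, rate in throttle_rate.items()}
    total = sum(headroom.values())
    if total == 0:
        return {p: 0 for p in throttle_rate}  # every partition is saturated

    shares = {p: int(total_workers * h / total) for p, h in headroom.items()}
    # Rounding down can leave a few workers unassigned; give them to the
    # partitions with the most headroom.
    leftover = total_workers - sum(shares.values())
    for p in sorted(headroom, key=headroom.get, reverse=True)[:leftover]:
        shares[p] += 1
    return shares


print(rebalance({"A": 0.5, "B": 0.0}, total_workers=12))
```

With half of partition A's requests being throttled and none of B's, B ends up with twice as many workers as A. Re-running the function on each fresh measurement gives the continuous, dynamic adjustment described above.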

So just to sum up my part end to end: onboarding is pretty straightforward. It’s
really a Terraform template, and in 15 minutes you’re up and running. All that
complexity that I talked about, like dynamic scheduling and partitioning,
all happens behind the scenes and is delivered to you as a service.
Andrew, do you want to take it over and discuss your experience?

 




Andrew: Absolutely. The beautiful thing about these types of results is that we
had that initial backup done within 17 days, and for context, we actually had
substantial challenges getting to that point. Once we got there though, that was
huge because it meant we could now start working towards the standard that our
customers expect from us.

The very first thing that comes to mind when it comes to backup and recovery is
RPO. So we wanted to make sure we could provide that one-hour RPO when it came
to being able to restore data. But what was fantastic was that after some
testing and rolling out this feature, we were able to identify through
our tests that we were meeting a 15-minute RPO, which had practically been
unheard of with this level of data.
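
A test of this kind boils down to comparing when each change happened with when a backup first captured it. The sketch below is one hypothetical way to express that check; the talk does not describe how Atlassian ran its tests, and the sample keys and timestamps are made up.

```python
from datetime import datetime, timedelta

RPO_TARGET = timedelta(minutes=15)  # figure quoted in the talk


def worst_case_lag(changed_at: dict[str, datetime], protected_at: dict[str, datetime]) -> timedelta:
    """Largest gap between a change and the backup that first captured it."""
    return max(protected_at[key] - changed_at[key] for key in changed_at)


def meets_rpo(changed_at: dict[str, datetime], protected_at: dict[str, datetime],
              target: timedelta = RPO_TARGET) -> bool:
    """True when every sampled change was protected within the target."""
    return worst_case_lag(changed_at, protected_at) <= target


# Made-up sample: two changes, both captured by the 10:15 backup.
t0 = datetime(2023, 12, 1, 10, 0)
changed_at = {"a/1.png": t0 + timedelta(minutes=1), "b/2.txt": t0 + timedelta(minutes=14)}
protected_at = {"a/1.png": t0 + timedelta(minutes=15), "b/2.txt": t0 + timedelta(minutes=15)}

print(worst_case_lag(changed_at, protected_at), meets_rpo(changed_at, protected_at))
```

The worst case is set by the change made just after a backup starts, which is why a 15-minute backup interval corresponds to a 15-minute RPO.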

Once that technical feasibility was handled, then it just became a question of
actually tackling the business problems, which for the purposes of this, I’ll
talk about two very specific ones. The first one being data residency controls.
So the beautiful thing about data residency controls is that you have to make
sure that you can easily roll out your controls to new environments as you need
to. And the best part about this is that we were simply able to work with Clumio
and enable those features for those specific regions, which lifted a substantial
weight off our shoulders there and ticked that box in terms of compliance
standards.

The second part, which I’m sure everyone loves talking about, is cost reduction.
For context, Clumio provides an air-gapped and immutable solution. For those not
familiar with that, it means that if we wanted to delete our backups, we would
have to go through a very rigid process to do so. So developers can’t just
simply knock out their backups.

With this in place, we could revisit some of our existing backup solutions and
optimize accordingly, remove some of those resources, and tune our approach,
which overall resulted in a 70% cost reduction. I can’t stress enough how huge
that is given the type of data that we’re working with. It has made this a major
success overall.

Woon: This has been a ton of fun working with the folks at Atlassian. So today
we’re in 20 different regions and we’re going to continue to expand to match the
regions that Atlassian is using. The level of performance optimizations and the
scalability work that we did with Atlassian was a huge amount of fun, and we’re
in a position that we can actually back up 30 billion objects in literally two
weeks, 17 days, to be exact, while also throttling because we don’t want to
impact the primary application. Our partnership and collaboration with the team
and Andrew has been a pleasure.

Andrew: Absolutely.

 



 

Woon: So this is it. This is the end of the talk and we have our demo. Stop by
for the demo or visit us at Clumio.com. Thank you.




ABOUT ATLASSIAN

Behind every great human achievement, there is a team.

From medicine and space travel, to disaster response and pizza deliveries,
Atlassian’s products help teams all over the planet advance humanity through the
power of software.



© Copyright 2023 Clumio, Inc. All Rights Reserved. Read Privacy Policy and Terms
of Use.