www.cockroachlabs.com
2a05:d014:58f:6201::1f4
Public Scan
Submitted URL: https://friends.cockroachlabs.com/MzUwLVFJTi04MjcAAAGQMM7_XJMuOFVS6pwMAIxMOQrv4elzqzT6GdrSsdzguHCGBjUvDYTHF0O8fwAi-x1rYPj876E=
Effective URL: https://www.cockroachlabs.com/big-ideas-podcast/andy-pavlo-ottertune/?utm_campaign=eblast-podcast-big-ideas-one-off-promo-best...
Submission: On January 02 via api from US — Scanned from DE
Form analysis
3 forms found in the DOM

<form id="footer-mktoForm_1480_5" class="mkto-install-form mkto-footer-form m-auto p-0 mktoForm mktoHasWidth mktoLayoutLeft" __bizdiag="-1089065311" __biza="WJ__" novalidate="novalidate"
style="font-family: Helvetica, Arial, sans-serif; font-size: 13px; color: rgb(51, 51, 51);">
<style type="text/css">
.mktoForm .mktoButtonWrap.mktoSimple .mktoButton {
color: #fff;
border: 1px solid #75ae4c;
padding: 0.4em 1em;
font-size: 1em;
background-color: #99c47c;
background-image: -webkit-gradient(linear, left top, left bottom, from(#99c47c), to(#75ae4c));
background-image: -webkit-linear-gradient(top, #99c47c, #75ae4c);
background-image: -moz-linear-gradient(top, #99c47c, #75ae4c);
background-image: linear-gradient(to bottom, #99c47c, #75ae4c);
}
.mktoForm .mktoButtonWrap.mktoSimple .mktoButton:hover {
border: 1px solid #447f19;
}
.mktoForm .mktoButtonWrap.mktoSimple .mktoButton:focus {
outline: none;
border: 1px solid #447f19;
}
.mktoForm .mktoButtonWrap.mktoSimple .mktoButton:active {
background-color: #75ae4c;
background-image: -webkit-gradient(linear, left top, left bottom, from(#75ae4c), to(#99c47c));
background-image: -webkit-linear-gradient(top, #75ae4c, #99c47c);
background-image: -moz-linear-gradient(top, #75ae4c, #99c47c);
background-image: linear-gradient(to bottom, #75ae4c, #99c47c);
}
</style>
<div class="mktoFormRow">
<div class="mktoFieldDescriptor mktoFormCol" style="margin-bottom: 10px;">
<div class="mktoOffset" style="width: 10px;"></div>
<div class="mktoFieldWrap mktoRequiredField input-group float-none"><label for="Email" id="LblEmail" class="mktoLabel mktoHasWidth" style="width: 17px;">
<div class="mktoAsterix">*</div>
</label>
<div class="mktoGutter mktoHasWidth" style="width: 10px;"></div><input id="Email" name="Email" placeholder="Email*" maxlength="255" aria-labelledby="LblEmail InstructEmail" type="email"
class="mktoField mktoEmailField mktoHasWidth mktoRequired form-control border-0" aria-required="true" style="">
<div class="mktoButtonRow" style="display: flex;"><span class="mktoButtonWrap mktoSimple" style=""><button type="submit" class="mktoButton">Subscribe</button></span></div><span id="InstructEmail" tabindex="-1" class="mktoInstruction"></span>
<div class="mktoClear"></div>
</div>
<div class="mktoClear"></div>
</div>
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="Subscription_Podcast__c" class="mktoField mktoFieldDescriptor mktoFormCol" value="TRUE" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="Would_you_like_to_receive_email_updates__c" class="mktoField mktoFieldDescriptor mktoFormCol" value="Yes" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="optin" class="mktoField mktoFieldDescriptor mktoFormCol" value="TRUE" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="utm_adgroup__c" class="mktoField mktoFieldDescriptor mktoFormCol" value="" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="utm_campaign__c" class="mktoField mktoFieldDescriptor mktoFormCol" value="eblast-podcast-big-ideas-one-off-promo-best-of-2023" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="utm_content__c" class="mktoField mktoFieldDescriptor mktoFormCol" value="listen-now" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="utm_medium__c" class="mktoField mktoFieldDescriptor mktoFormCol" value="email" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="utm_source__c" class="mktoField mktoFieldDescriptor mktoFormCol" value="mkto" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="utm_term__c" class="mktoField mktoFieldDescriptor mktoFormCol" value="episode" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="utm_sfcamp" class="mktoField mktoFieldDescriptor mktoFormCol" value="" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div>
<div class="mktoFormRow"><input type="hidden" name="gclid_field" class="mktoField mktoFieldDescriptor mktoFormCol" value="" style="margin-bottom: 10px;">
<div class="mktoClear"></div>
</div><input type="hidden" name="formid" class="mktoField mktoFieldDescriptor" value="1480"><input type="hidden" name="munchkinId" class="mktoField mktoFieldDescriptor" value="350-QIN-827">
</form>
<form novalidate="novalidate" class="mktoForm mktoHasWidth mktoLayoutLeft" style="font-family: Helvetica, Arial, sans-serif; font-size: 13px; color: rgb(51, 51, 51); visibility: hidden; position: absolute; top: -500px; left: -1000px; width: 1600px;"
__bizdiag="-1305081519" __biza="WJ__"></form>
<form class="mkto-install-form mkto-footer-form m-auto p-0 mktoForm mktoHasWidth mktoLayoutLeft" __bizdiag="-1305081519" __biza="WJ__" novalidate="novalidate"
style="font-family: Helvetica, Arial, sans-serif; font-size: 13px; color: rgb(51, 51, 51); visibility: hidden; position: absolute; top: -500px; left: -1000px; width: 1600px;"></form>
Text Content
Big Ideas in App Architecture, Episode 8: Database Benchmarking Efficiency with OtterTune's Andy Pavlo

Guest: Andy Pavlo, Associate Professor of Databaseology at Carnegie Mellon and Co-Founder at OtterTune

EPISODE SUMMARY

Database building is not for the faint of heart! It's a grueling process that can take years to master. This week, we're with one of those masters of database building: Andy Pavlo, an Associate Professor with indefinite tenure of databaseology in the computer science department at Carnegie Mellon University and the Co-Founder of OtterTune. Andy discusses how his introduction to "databaseology" changed the way databases are not only built but also studied for more efficiency at companies. From optimizing databases for efficient testing and usage to building effective databases using the right knobs and building blocks, Andy shares his insights and expertise on how to improve the performance and reliability of your applications. Join as we discuss:

* The emerging field of databaseology and its importance for efficient testing and usage.
* The design and implementation of effective databases using the right knobs and building blocks.
* The latest trends and techniques in database management and design that can improve application performance.

TRANSCRIPT

Tim Veil: Well, welcome to another episode of Big Ideas in App Architecture. I am super excited today to welcome Andy Pavlo as a guest on the show. Andy, as you and I have talked, I wanted to make sure I properly introduced you with your proper title, but then you threw me for a loop with this very long thing that I was too embarrassed or afraid to restate.
So, I've asked you to introduce yourself properly, and then we can jump right into all the fun stuff.

Andy Pavlo: Again, to be very clear, it's not like I go to bars and introduce myself with these titles. If someone has to put a name tag up, this is what it says.

Tim Veil: I fully understand.

Andy Pavlo: I am an Associate Professor with indefinite tenure of Databaseology in the Computer Science Department at Carnegie Mellon University.

Tim Veil: We are thrilled to have you on the show today, and thank you for explaining that title. So, welcome. You and I got to know each other maybe informally. It's a couple of years ago now when I was asked to poke around a benchmarking framework that you had something to do with. So, I thought maybe we'd start there. But maybe even before we get into database benchmarking and OLTP Bench and the like, do tell us a little bit about you, how you got to Carnegie Mellon, what your background is, because there are some really interesting things, obviously, that you've been doing, been researching, been looking at. Then obviously, we want to spend a lot of time talking about your newest venture, which is OtterTune. So, before we get into all that, maybe just spend a few minutes and tell us a little bit about how you got to where you are.

Andy Pavlo: I mean, what is databaseology? First of all, it's not a real term, to be very clear here. As I said, when they put name tags up, they put that on, because the university treats it like ecology or neurology, right? It's made up, but my area of research is focused on database systems. So, that means I'm interested in how you design and optimize software systems to efficiently store, retain, and process queries for data. So, prior to being at Carnegie Mellon, and this is actually my 10th year, I completed my PhD at Brown University under Stanley Zdonik and Mike Stonebraker.
The Stonebraker name should resonate with your audience because he's the inventor of Postgres and Ingres and the Turing Award winner in 2014. So, when I was a grad student, I worked on a system called H-Store that was commercialized as VoltDB. When I was graduating with my PhD, I didn't really know what I wanted to do. I knew what I didn't want to do: work at a big bank or something like that. So, I applied to a bunch of startups, research labs, universities. I did not think Carnegie Mellon would hire me. So, I was very relaxed when I came here and interviewed, and they wanted someone that does databases. I've been here ever since and I love it here. This leads into, "Why did we build OLTP Bench, or how did I meet you?" When I was in grad school, I wrote four different variations of TPCC. That's the standard OLTP benchmark that everyone has used since 1992 to measure these systems. So, when I started thinking about going into academia and what I wanted to do next, I realized that if I have students, I don't want them to have to write TPCC four times. So, we thought, "Let's write it once in a single framework, have a bunch of other workloads that we can reuse, and just have everyone take advantage of these things."

Tim Veil: It's been a fantastic tool. So, just a little bit of context on how I stumbled upon it. Working at CockroachDB, people come to us all the time and say, "Geez, how does your performance stack up against some other database?" Database benchmarking is one way people ascertain whether or not your product is better or worse than some other thing. What we have found, and that's why we were so happy to stumble upon the work that you had done, is that the benchmarking frameworks that were widely available were in various states of, I think, maturity, various states of supporting one database or another. Some were just more accessible than others. I mean, I was a Java guy and spent a lot of time writing Java.
I could understand and reason about how you'd put together OLTP Bench. We looked at some of the other things that were popular at the time, and this one is written in Tcl or Lua or some other thing. I don't know what this is. So, I ended up spending quite a bit of time in there and trying to make it work with Cockroach, and the rest is history. But to this day, it comes up all the time: hey, how can we test the database against some other thing? What has now become BenchBase is what we talk to folks about. But before we go down that path a little bit more, I wanted to ask you another question. I don't know why. I was back in my hometown this past week. I was born in Orlando, and we were down there for the Gartner Analytics Summit. It just had me reflecting on childhood a little bit. When I graduated college, I didn't know what I wanted to do either, I suppose, but I ended up dual majoring in, I think, MIS and finance. It's funny you say big bank; I realized very quickly I didn't want to go work for a bank either. So, I stayed and did MIS. How did you end up getting into databases as a field? I mean, what was the thing that got you down this path to begin with? I'm just so curious about how folks end up in the industries they chose. Do you remember what drove you down this path?

Andy Pavlo: Absolutely, yes. So, in undergrad, I think a recurring theme that I experienced was that for whatever reason, I understood databases better than everyone else. My first interaction with MySQL, and this is going back to 1999 with MySQL 3, was when I used to work for a... I don't want to go into details. It was a sketchy startup. My boss was a crook, and he was doing shady stuff, and his business partners fired him. Then they were like, "Okay, well, you were doing web programming stuff, start working on this new project." I had to learn what MySQL was. So, this is when I was in high school. For me, it just clicked, and relational databases made sense.
Throughout my undergraduate career, and then when I went off to do a pre-doc at the University of Wisconsin, the recurring theme that I noticed was, for whatever reason, I seemed to understand what databases were doing much more easily than other people. I'm not saying that I'm smarter than everyone else, of course. I realized this was my thing. Then when it came time to go to grad school, I had the fortunate opportunity to hook up with Stan and Stonebraker and start building this database system from scratch. Then it's the same thing. I learned a lot as I went along, and I realized that, incredibly, I enjoyed this. It's fun. Databases are awesome because they're in everything. Yeah, it was just one of the things that I just picked up, maybe in my early twenties, that this is something that I'm just better at than everyone else. So, I just keep going and see how far I can go with this.

Tim Veil: It's funny you say that. I've found a similar thing. I think they rule the world. I mean, they run the world. There's very little out there that, in some way, shape, or form, you can't trace the lineage of what's happening back to something being stored in a database.

Andy Pavlo: I tell my students to think of the classes of software that are super important. There are several categories, but: operating systems, of course; the web browsers everyone's using; and databases. At the university, we don't really teach a course on how to build a web browser, but we do teach a course on how to build a database system. It's that important. I can't really think of any other class of applications above the operating system where you have a whole dedicated course on how to build a specific type of software, because as you said, it's in everything. They're not going away. They're the backbone of every major application.

Tim Veil: Yeah, I totally agree.
So, going back to this idea of OLTP Bench. You wrote a paper, and please do correct me, because I'm doing some of this from memory, but you wrote this paper where you described this approach to benchmarking databases. In that paper, you described not only TPCC, which I think many people would know about, but a handful of other well-known or proposed benchmarks for doing this, which again was super fascinating for me in my reading, just because, working at a database company, people are always like, "How do I test? What do I do?" Can you talk just a little bit about some of the thinking behind creating that? And then I would love to talk a little bit about how OLTP Bench has been constructed, how it evolved over the years, and where it is today. Then I think that's a nice segue into OtterTune, which I'm really looking forward to hearing about.

Andy Pavlo: OLTP Bench, and I'm sure you guys will provide the GitHub link in the description, actually was the precursor to OtterTune. That's a natural segue. So, the original OLTP Bench project was a collaboration between myself; Carlo Curino, who was a postdoc at MIT and is now at Microsoft; and Philippe Cudré-Mauroux. He's Swiss, I'm butchering it, but he's at the University of Fribourg, along with his student Djellel. So, the MIT guys had this other project, Relational Cloud, and they had a Java-based benchmarking framework. Then when I was working on H-Store, I had my own Java-based benchmarking framework, and we realized, "Okay, we basically have the same thing. Other people implemented the same thing. Let's implement this once and for all and see how far we can push it." The reason why we had so many workloads beyond TPCC is because in academia, it's very hard to get access to real workloads to run experiments. If you're building a system, it's a research project; you don't have customers. You don't have anything to try things out on other than synthetic workloads.
So, that was the additional side of OLTP Bench. In addition to building a single framework, we then tried to find other workloads that we could port into OLTP Bench. So, for example, we have a workload based on what we think of as the access patterns in Twitter. We actually took the Wikipedia source code, the MediaWiki PHP source code, and converted its transactions into Java. Then there's a benchmark for that. So, obviously, the data's all synthetic, but it's based on actually analyzing some real workload traces and understanding what they actually do. That was the original motivation: "Okay, it's hard to get real workloads in academia. Let's just have a suite that has a bunch of them that we can use for all our different projects."

Tim Veil: I think that's so important, because it really started, just as you described, as this extensible way to add additional workloads to simulate new kinds of things. Because certainly in our work, somebody may describe what they're currently doing today in a database, and we have to do this pattern matching between what their workload looks like and what their schema looks like versus something else. Having lots of different options matters, because not every workload looks like TPCC or looks like YCSB, which is another certainly well-known one. So, I think one of the really neat things that drew us initially to OLTP Bench was that there were these different flavors. So, we could say, "You know what? Okay, I get it. What you're trying to do is more like this thing than this other thing, and let's let it rip." Then the other thing is there's tons of customization you can do. So, this idea of parallelism: I'm going to launch any number of these threads. I want to generate this amount of work. So, you could really get in there and fine-tune it to be at least as close as possible to representing some workload, to give people a sense of what it is.
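The knobs Tim mentions here, thread count and the amount of work to generate, are the core of any benchmark driver. As a rough illustration of the idea only, not BenchBase's actual implementation, a driver that fans a fixed amount of work out over a configurable number of worker threads might look like this (the `run_transaction` stub stands in for a real database call):

```python
import threading
import time

def run_transaction():
    """Stand-in for one unit of benchmark work (e.g., one TPCC-style transaction)."""
    time.sleep(0.001)  # simulate a ~1 ms round trip to the database

def run_benchmark(num_threads, txns_per_thread):
    """Fan txns_per_thread transactions out over num_threads worker threads."""
    completed = []            # per-worker completion counts
    lock = threading.Lock()   # guards the shared list

    def worker():
        count = 0
        for _ in range(txns_per_thread):
            run_transaction()
            count += 1
        with lock:
            completed.append(count)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.time() - start
    total = sum(completed)
    return total, total / elapsed  # (transactions completed, throughput in txn/sec)

if __name__ == "__main__":
    total, tps = run_benchmark(num_threads=4, txns_per_thread=50)
    print(f"{total} transactions at {tps:.0f} txn/sec")
```

A real framework adds rate limiting, warm-up phases, and latency histograms on top of this skeleton, but varying `num_threads` against a fixed workload is essentially how the parallelism knob works.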
Of course, you supported multiple databases, which again was nice. You happened to support Postgres, which was a close cousin of Cockroach.

Andy Pavlo: Well, you asked about the early days. DB sent us patches for their data system. Cockroach did a lot of work as well. We'll talk about that later, but we had some vendors actually send patches for their databases as well. The only challenge with that is that some of them we can't test, because of the cloud platforms or because they're proprietary. So, I think we've gotten better at making sure we test and make sure that things aren't broken for the different databases. But that's the goal: the idea is a single framework that supports different workloads for different databases. So, you can start doing true apples-to-apples comparisons, which is hard to do.

Tim Veil: Yeah, it's super, super hard to do. I may have this wrong, but I think over the years, you've had your students participate in adding to and evolving it. Isn't that correct?

Andy Pavlo: Yes. So, yeah, over the years, I've had new students get started... I'm not saying I'm vetting them or that it's a job interview, but a new student comes to me and says, "I want to work on databases." So, before we start throwing them on a bigger project that's more complicated, we give them something smaller in BenchBase, OLTP Bench, and see how they do with it. Part of the reason we changed the name from OLTP Bench to BenchBase is that we started adding analytical workloads like TPCH and something like TPCDS. We're working on that. But new students add new workloads over the years. So, even though it's a 10-year-old project at this point, we are still trying to keep things up to date and add new workloads as we need them, as they come along.

Tim Veil: Yeah, it's great.
Then, my recollection of this might be different than reality, but at one point, at the direction of my boss at the time, who wanted me to go and see if I could get OLTP Bench to work with Cockroach, I forked it. And I have this bad habit of just doing all these things at once. So, I made all these crazy changes, and I think I realized not terribly long after I did that, there's no way. Now the train has left the station. There's no way these guys are ever going to want any of my stuff back in OLTP Bench. Then I randomly got this email one day, and I feel like this was a long time after I'd gone down this path, where either you or one of the students reached out: "Hey, we're trying to merge this stuff." I'm like, "Really? Oh, okay. I'd love to help. That's awesome."

Andy Pavlo: You did a hard fork, and I think a student found you. I forget what they were searching for, because it's not like you showed up in the GitHub fork list. And these changes were amazing. We wanted all these things. Let's see what we can do to get you to help us put it back in.

Tim Veil: So that was fun. We worked for a while to get everything back in there, but it's been fantastic. I appreciate you all taking a look at the work that I did, because I've so enjoyed learning a lot, not only about databases, but about benchmarking and how to write code. We actually learned a ton about Cockroach, I think, in the process, because just as you described, the ability to run these multiple workloads that have different flavors, different query patterns, different requirements for indexing, different types of schema, and doing that in an efficient way has been a neat process for us.

Andy Pavlo: I would say also, recently, Carlo Curino's team at Microsoft has been helping out and contributing, not just making it work better on SQL Server, but also fixing other bugs and issues like that.
So, they've stepped up in the last year or so and been contributing back to the project as well. It's good to see Carlo and his team, ten years on, come back around and still work with it.

Tim Veil: Yeah. As I've said to you over Slack and in other places, I've been a very bad helper recently by getting distracted with other things. So, for Cockroach, the interesting thing has been that we're running all these workloads and we're finding things that we can internally do better. It's been a really good way for us to identify potential optimizations within our code as a database product. It's been a great way, and we've had a number of prospects and customers start their journey with Cockroach using BenchBase. But I think that's a nice segue into what you've been doing on the side, which is this startup that you've created called OtterTune. I did not realize you guys had been at it for as long as you have. The more I learn and read about it, the more excited I am about what you all are trying to do. So, maybe we use that as a jumping-off point into what is OtterTune. Because look, we named a database after a cockroach. We have this funny name. So, now I have this interest in naming. I'd love to know why or how you came up with this name, but then, yeah, tell us all about what you've been doing with OtterTune.

Andy Pavlo: There is a connection between BenchBase and OtterTune; I'll discuss it in a second. The name OtterTune is just a play on the word autotune. So, my PhD student, Dana Van Aken, she went to, I think, some zoo, liked otters, got a t-shirt of an otter, and then she's like, "Oh, we should call this OtterTune." I'm like, "Brilliant, let's do it." As for the name, there's some a cappella group at some university that's also called the OtterTunes, but that was our only competition for this. So, it's just a play on autotune.

Tim Veil: Okay. I thought, like cockroaches, they'll survive anything.
I thought maybe with otters there's something about the animal that was indicative of high performance or something.

Andy Pavlo: We did not realize until later that otters, the animal, are vicious animals.

Tim Veil: Really?

Andy Pavlo: Oh yeah, go Google it. Look on YouTube, type "otter fights" plus whatever other animal. They'll fight. They look all cute and cuddly, but they're vicious.

Tim Veil: Oh, that's funny. I didn't know that. I've heard that about pandas, by the way. Or is it pandas or koala bears? I think they're cuddly.

Andy Pavlo: No, koalas, they're like brain-dead, because the eucalyptus doesn't give them enough energy.

Tim Veil: Oh, really?

Andy Pavlo: Yeah.

Tim Veil: So, panda is it.

Andy Pavlo: So, I think there was a database conference in Australia at some point. We went down, and there's some petting zoo where you can touch koalas, and they smelled terrible. They barely moved because the eucalyptus doesn't give them enough nutrients.

Tim Veil: I had no idea.

Andy Pavlo: I'll describe what OtterTune is in a second, but I'll tell one quick story about how vicious otters are. For marketing reasons, we were going to sponsor an otter at the Pittsburgh Zoo, the local zoo, because one of our investors is the founder of Duolingo. They sponsored an owl, after their mascot, at the National Aviary here in the city. So, we're like, "Okay, let's sponsor an otter. That can't cost that much." So, when we called the zoo to do it, someone was like, "Oh, yeah, let me go check with the trainers and so forth," but then she's like, "Just so you know, you can't go into the enclosure with them." She's like, "They're so vicious that even the handlers don't want to go in there, because they'll fight and kill anything." But we didn't know this before. We were like, "Oh, otters are cute." Then apparently they're like murderers.

Tim Veil: Have you thought of changing the name at all, or now it's just like, we're embracing the violence of this mascot?
Andy Pavlo: I think that as long as we're careful in what we say... again, this is your podcast, I don't want to get you fired.

Tim Veil: You just lean into it. You'll just lean into the name.

Andy Pavlo: Well, yes, absolutely, yes. So, as I said, OtterTune, the name is a play on autotune, and the big picture of what we're trying to do is to apply machine learning techniques to optimize database systems. The original research project at the university was on doing knob configuration tuning. These are runtime parameters that control the behavior of the system. Every database system has them, and they're a huge pain. They're basically buffer pool sizes, caching policies, log file sizes: things you can tune as the end user for how the database system is going to be used by the application. The reason why these knobs exist is that when the developer is actually building the database system, at some point they have to make a decision about how much memory to allocate for a hash table. Then, instead of putting a #define in the source code or some hard-coded value, they expose it as a knob, because they assume someone else who knows more about databases or knows more about the application will come along and tune it, but that never happens. So, you just accumulate these knobs over time. Now, the new version of OtterTune that we're releasing... by the time this podcast comes out, it'll be out. The new version that we're putting out this year expands and goes beyond knob tuning. We're doing index tuning, we're doing query tuning. We'll talk about why that matters as well, but the original research project of OtterTune was just doing knob tuning.
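At its simplest, automated knob tuning is a search over a configuration space against a measured objective. This toy sketch does a seeded random search; the knob names and the synthetic throughput model are invented for illustration, and the real OtterTune uses far more sophisticated machine learning than this:

```python
import random

# Hypothetical knob space: (low, high) bounds for each tunable parameter, in MB.
KNOBS = {
    "buffer_pool_mb": (128, 8192),
    "log_file_mb": (16, 1024),
}

def measure_throughput(config):
    """Stand-in for running a benchmark against a database with this config.
    This synthetic model just rewards larger settings with diminishing
    returns; a real tuner would measure an actual running system."""
    return (1000 * (config["buffer_pool_mb"] / 8192) ** 0.5
            + 100 * (config["log_file_mb"] / 1024) ** 0.5)

def random_search(trials=50, seed=42):
    """Try random configurations and keep the best-performing one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = {k: rng.randint(lo, hi) for k, (lo, hi) in KNOBS.items()}
        score = measure_throughput(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

Even this naive search beats the "default" (minimum) settings under the toy model; the hard parts that OtterTune addresses are measuring real systems cheaply and searching the space with far fewer trials.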
So, how this relates to BenchBase, or OLTP Bench: one of the things that we were building out in OLTP Bench when I started at CMU was the ability to collect the runtime metrics from the database system, the internal telemetry that every system generates: pages read, pages written, locks, and so forth. OLTP Bench would then upload them automatically to a website to keep track of these things. We were heavily inspired by the Codespeed project from PyPy: thinking of continuous integration, keeping track of the performance of things over time. What I really wanted was to be able to run experiments, store it all in a single repository, and click one button to make the graph that I could put in my research papers. That was what I really, really wanted. Then from there, I realized, "Oh, well, okay, if you have all this telemetry, what can you start to do with it?" My PhD thesis was on using ML-like techniques before ML was a big thing, basically automated techniques to optimize databases. But as I said, the big challenge I was facing was that it's impossible to get real workloads from customers. So, we were relying on the synthetic workloads in OLTP Bench. So, the idea with the original version of OtterTune was, "What can I optimize in a database system automatically, using a machine learning or automation technique, that does not actually need to look at the queries?" The idea was that this runtime telemetry is a stand-in or representation of what the workload actually is, without actually seeing the workload. So, we were trying to use that as the signal to decide how to optimize the system. Also, at the time, there hadn't been a lot of work on automated knob tuning. There was maybe some work done in the mid-2000s, but there wasn't the long history of research projects like there is for index tuning or query tuning.
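The "telemetry as a stand-in for the workload" idea can be sketched as follows: summarize each workload as a vector of runtime counters and match new telemetry to the nearest previously seen workload. The metric names and numbers below are invented for illustration; this shows the general shape of the technique, not OtterTune's actual code:

```python
import math

# Each known workload is summarized by internal telemetry counters
# (hypothetical values, imagined as normalized per-second rates).
KNOWN_WORKLOADS = {
    "oltp-like": {"pages_read": 500.0, "pages_written": 400.0, "lock_waits": 50.0},
    "analytics-like": {"pages_read": 9000.0, "pages_written": 20.0, "lock_waits": 1.0},
}

METRICS = ["pages_read", "pages_written", "lock_waits"]

def distance(a, b):
    """Euclidean distance between two metric vectors."""
    return math.sqrt(sum((a[m] - b[m]) ** 2 for m in METRICS))

def nearest_workload(observed):
    """Map observed telemetry to the most similar known workload,
    without ever looking at the queries themselves."""
    return min(KNOWN_WORKLOADS,
               key=lambda name: distance(observed, KNOWN_WORKLOADS[name]))
```

Once a new deployment is mapped to a familiar workload this way, tuning knowledge gathered on the familiar workload can be reused, which is the payoff of never needing to see the queries.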
So, then the OLTP-Bench project, that website, morphed into the first version of OtterTune, where we just used simple machine learning techniques to tune basic knobs, and then it got more sophisticated over time.

Tim Veil: Yeah, I remember making some of my changes. I remember wondering, what the hell is it uploading? What is this IP address? Where is this going? I think one of the first changes I made was ripping this out. I don't know what this is doing here, I can't use this. But now it makes sense what you all were doing.

Andy Pavlo: So basically, what happened was we published the first paper about OtterTune at SIGMOD, which is the top conference in databases. The other academics were like, "Oh, this seems cool, this seems nice," but nobody in industry paid attention to it. Then we met the guy that runs all of Amazon's machine learning division, or group, whatever you want to call it. He came to Carnegie Mellon. I got five minutes of his time, just to thank him for giving Dana, my PhD student, a couple thousand dollars to run her experiments on EC2. He was like, "Great, can you write a blog article for us? We just started the new Amazon AI blog. We need material." So, we converted the SIGMOD paper into a blog article and it got published on Amazon's site. Then that's when everyone started emailing us and saying, "We have the exact same problem. We'll give you money to fly a student out and set up OtterTune for us." This happened so many times, we were like, "Okay. Clearly, there's a signal here. We should go do a startup."

Tim Veil: Well, it's funny. I think it's such an interesting thing, because again, I take more of a blue collar approach to this, but obviously, in the applications I've been building, the teams I've run, and even at Cockroach when we're working on evaluations for folks, people are always trying to squeeze as much out of the database as they can. We do this today too. We start at the schema. What does the schema look like? What do your queries look like?
Are those properly tuned? Do we have indexes? But I know Cockroach does this, and many databases I've worked with have hundreds of these knobs, as you call them, to tune. You're out there searching and it's like, "Oh, somebody said do this. Somebody said do that." You end up with this list of 5 or 10, 15, 20 things you're going to adjust, but you have no idea whether it's working. You run something and it's like, "Well, it didn't break, so maybe somewhere it's going to be better." So, I find this fascinating. How do you all determine what knobs to even touch, and how do you set expectations about whether a change is even relevant to whatever I'm trying to do? That to me was one of the biggest questions I had as I was reading through. How do you know what would make sense?

Andy Pavlo: So, the first step, as you said, is you've got to figure out what knobs to actually tune. There are about 500 knobs in MySQL and 400 knobs in Postgres, but not all of them are things you would actually want to tune automatically. There are directory names, file names, port numbers. If you tune those, the system doesn't work. So, we obviously put those on a denial list.

Tim Veil: We really like this port over here. Why don't you try this?

Andy Pavlo: Yes. So, then the next step is you do have to do some manual curation to find knobs, or particular values for knobs, that could affect the safety of the database. The most obvious thing is if you turn off disk writes, calling fsync to flush the log to disk when you commit a transaction: the machine learning algorithms figure out really quickly that not writing things to disk makes your database go faster. But if you then crash, you lose the last 10 seconds of the log, you lose your data. There's an external cost that the machine learning models or algorithms can't reason about.
So, a human has to come in and say, "Okay, we don't turn off fsync," because that's an external cost the algorithms can't even reason about. Then the next step is we basically did a random walk, if you will, trying out different knobs on different workloads, different situations, just to collect training data. Then you run a ranking algorithm to figure out which knobs have the most impact on performance. It actually works reasonably well. To no surprise, in MySQL it's the InnoDB buffer pool size; in Postgres it's the shared buffers size. The buffer pool is almost always the biggest thing, because if you're reading from disk, your performance is terrible. So, we basically looked at the list and it seemed reasonable. It's hard to say whether one ranking is better than another. In some situations, depending on the workload, the ranking might differ, but the top 8 or top 10 are almost always the same. So, that gives you the knobs you think you should target first. The original version of OtterTune would take an incremental approach: maybe tune 5 knobs, let it go for a little while, then tune 8, then 10, expanding it, because the knobs that have the most impact will give you the most benefit right away. There are other methods in the academic literature that try them all at once using deep nets or reinforcement learning. So, there are improved techniques for looking at a wider number of knobs than the original version of OtterTune did. Then your next question is, okay, how do you determine what you think is going to make a difference, or what do you tell the user? Do you want to know how the algorithms figure out these are the knobs to tune this way?

Tim Veil: Yeah. Not only that, but I would imagine I could do something and it could have a negative impact for some reason. So, how the system makes sure that I'm continually making forward progress is an interesting thing.
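The ranking step Andy just described, sample random configurations, measure performance, then rank knobs by impact, can be sketched as a toy example. This is a hedged illustration with made-up knob names and a synthetic workload, not OtterTune's actual algorithm (the published system used Lasso regression; plain correlation is used here to keep the sketch self-contained):

```python
import random

random.seed(0)

def pearson(xs, ys):
    # Pearson correlation: a cheap stand-in for a real importance measure.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

# "Random walk": try random knob settings and record a performance metric.
# The synthetic workload makes the buffer pool dominate, the log size matter
# a little, and one knob be pure noise, mirroring what Andy reports.
samples = []
for _ in range(200):
    cfg = {"buffer_pool_gb": random.uniform(1, 32),
           "log_file_mb": random.uniform(64, 1024),
           "some_noise_knob": random.uniform(0, 1)}
    throughput = 100 * cfg["buffer_pool_gb"] + 2 * cfg["log_file_mb"] + random.gauss(0, 50)
    samples.append((cfg, throughput))

# Rank knobs by how strongly each one correlates with throughput.
ranking = sorted(
    samples[0][0],
    key=lambda k: abs(pearson([c[k] for c, _ in samples], [t for _, t in samples])),
    reverse=True,
)
print(ranking)
```

On this synthetic data the buffer pool knob ranks first and the noise knob last, which is the shape of result Andy says the real ranking produced.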
Andy Pavlo: So, this is a good place to note the difference between the academic version and the commercial version. In the academic version, we made the assumption that people would not tune production databases with something like OtterTune, because it's machine learning and the models have to learn. That means there may be times when it tries out things that make performance worse. We also assumed that because you're not running on production, you're running on a clone, and that you would have a workload trace that you can run repeatedly, over and over, to see whether things are getting better. Once the models converge, you think you have the best recommendation, and then you can apply it to the production database. Again, when we were at the university, everybody we talked to could do this. We did a deployment at a major bank in France; they had a whole team that could set things up on clones. We talked to the patent office. We talked to other pretty big companies that could do this. Since we commercialized it, we've realized that not everyone can. Most people cannot capture a workload trace, here are all the SQL queries executed, and then run that on the side. Most people also don't want to set up another clone or machine, because it's expensive to run experiments for OtterTune to tune against. So, in the commercial version, we're actually tuning the production database directly. That means we have to be more cautious in the recommendations we make and set up more guardrails to make sure that things don't go wrong. It'll be less aggressive in exploring the solution space, and we try to use training data we collected from previous databases to help make sure that the first things we try out on your database are not way out of line, way out of whack.
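The "guardrails" Andy mentions for tuning production directly can be sketched like this. The denial list, the 25% step limit, and the knob names are illustrative assumptions for the sketch, not OtterTune's actual rules; the point is only the two mechanisms: refuse unsafe knobs outright, and bound how far any value can move per tuning round:

```python
# Knobs an automated tuner should never touch in production
# (hypothetical denial list, in the spirit of the fsync example above).
DENY_LIST = {"fsync", "port", "data_directory"}

# Cap exploration: move a numeric knob at most 25% per round
# (an assumed limit for illustration).
MAX_RELATIVE_STEP = 0.25

def safe_recommendation(knob, current, proposed):
    """Clamp a tuner's proposal so production exploration stays conservative."""
    if knob in DENY_LIST:
        return current  # leave safety-critical knobs alone entirely
    limit = current * MAX_RELATIVE_STEP
    return max(current - limit, min(current + limit, proposed))

# The model wants an 8x jump in the buffer size; the guardrail allows 25%.
print(safe_recommendation("shared_buffers_mb", 1024, 8192))  # prints 1280.0
# The model wants to disable fsync; the guardrail refuses.
print(safe_recommendation("fsync", 1, 0))                    # prints 1
```

Repeated rounds still converge toward a good configuration, just without the risky jumps a clone-based setup could afford to try.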
Tim Veil: I'm always surprised when I go and visit prospects or customers, maybe not even so much now, just in previous work, by, quite honestly, people's existing knowledge of databases or whatever the technology is. I think sometimes you assume that people who have been at a company for many years, working on a product, have this deep knowledge. It's not necessarily true. There are folks out there doing important things who maybe don't understand all the details. I'm curious, to the extent that you can share: is there a, "Hey, we walked into one place and we were able to do this amazing work"? In other words, is there an outlier where OtterTune came in and just blew somebody's doors off because their existing setup was so problematic? I'm just curious what the average-

Andy Pavlo: I don't want this to sound like I'm bragging or come off as pretentious, but we have found that for knob tuning, the algorithms work better in the real world than they did at the university. Because as a baseline, when we did our experiments for the research projects, we assumed that people had done a bit more tuning than they actually do. We assumed the user would be a bit more sophisticated than we are finding. In both cases, though, there are people that have done almost no tuning, or actually zero tuning. Since we're targeting Postgres and MySQL running on Amazon, customers have told us that they thought Amazon had already tuned their database for them. They're not. Jeff Bezos is not doing anything to your database. But even in places where they had in-house DBAs that had tuned the database, the OtterTune algorithms can still carve off another 20, 25% improvement, because it's just hard. There are so many different things you have to deal with.
Even if you have a full-time DBA, if you have a lot of databases, which you would if you're going to pay for a DBA, they're doing so many other things that they don't really have time to tune every single database exactly-

Tim Veil: Look, as somebody who works with databases every day and represents a database company, our own documentation and our own list of knobs, it's incredible. It is daunting to understand and fully appreciate what each and every change does. So, that doesn't surprise me, because it's hard enough to do all the other stuff right; getting into these esoteric settings is a whole other ball of wax. You had said, though, that you started with knobs, but you all are moving into index tuning, query tuning, and the like. What does that look like? The reason I'm asking is that I'm curious, as you've gotten into that, where's the biggest bang for the buck? Is it schema and index tuning? Is it query plans? Is it all of this stuff? Have you even measured that? What are you doing and what are your thoughts on that?

Andy Pavlo: So, we haven't measured it, but it varies. For some applications, they have all the right indexes and the query plans could use some work, but the biggest win they can get right away is knob tuning. In other cases, if you don't have the right indexes and all your queries are sequential scans, or you're doing terrible nested loop joins, then we can tune the knobs all day, but we can't magically make your query run faster, because it doesn't have the index. So, I would say it varies, but even then, everybody needs everything. You know what I mean? We have not come across any database where it's like, "Oh, my God. Here's your money back. You don't need us." There's always something. The reason is that databases aren't static, in that they're ingesting new data, but also, upstream, the application's never static.
The only place we've ever come across where someone told us, "Yeah, our application hasn't changed in three years," was the patent office, and rightfully so. What changes? But everybody else is like any other company: they're putting out new features based on what customers want and so forth. New features mean new queries, new complexity, and new data for the database. So, people struggle with understanding how, as their application evolves, the configuration and tuning setup for the database needs to evolve over time as well. There are other things too, where people may know how to do certain things, but it just falls through the cracks or people forget. One example: we had somebody who had backups turned off on their production database. The reason is that the database used to be the staging database, and then they were upgrading from MySQL 5.7 to 8 or something like that. When they upgraded to 8, the staging database became the production database, and someone forgot to turn on backups. Things like that. It's really just hard to know what the best practices for databases are, because again, most of these companies don't have DBAs. It's developers. It's people who are writing application code, and it's somebody who maybe set up the database at their last job, so they're responsible for setting up the database at this job, but they're doing a bunch of other stuff. There are just too many things going on.

Tim Veil: Yeah, it's always somebody else's responsibility. We see that a lot and I've always seen that. It's interesting.

Andy Pavlo: So, the new version of OtterTune that we're putting out now, at its core, is still trying to optimize the performance and efficiency of the database through knob tuning, indexes, query tuning, and so forth. But the way we're pitching it is that we're selling peace of mind. I realize that's a fuzzy term to use about your database, especially as a scientist.
It seems like a marketing thing, and it's hard to quantify what it means to have peace of mind about your database. But this is what people tell us: they just don't know what they don't know about their database, and they don't know what they should be doing. So, the new version of OtterTune does all the things I said before, but it's also checking to make sure your database is set up correctly, like the backups example. As your application evolves over time, OtterTune is there with you, seeing how it evolves and making suggestions along the way. The original version of OtterTune, at least the academic project, was like, "Okay, you tune once and then you're done." But it's really this long-term lifecycle of the database. That's really what people need help with. That's where we're going.

Tim Veil: Again, something we see play out over and over again, because once people understand a particular technology, they tend to use it for everything, whether they should or shouldn't. So, we've certainly seen many instances where a database designed, sized, and provisioned for a certain task or workload is all of a sudden accepting work from some other thing, and it's just no longer suited for that. So, yeah, that makes total sense to me. This ability to go in and constantly reevaluate whether things are set up correctly based on new workloads makes a ton of sense. I'm curious, just on that, because this is a topic that comes up for us a little bit: this idea of multi-tenancy. I don't know if that makes sense or how you think about it. It's this idea that maybe a single database instance is used to serve multiple workloads, maybe different applications out of different schemas. Do you run into that? Do you have challenges with that? I'd imagine that would be a tricky thing.
Or are you just saying, "Hey look, that's not exactly how we would advise you to set things up anyway," which is a totally fair response, by the way.

Andy Pavlo: So, in the current version of OtterTune, we see the metadata about what it is, but we're not differentiating which of the databases are being taxed the most. The thing we're working on now is looking at the fleet of databases holistically. What I mean by that is not just tuning this one instance to get the best performance out of that one thing, but really understanding how that instance interacts with and is related to other databases in the fleet. Now, some things are obvious. You have a replication setup; you know which is the primary, the reader or the writer. Amazon sees that relationship. But oftentimes there are implicit relationships, and there are optimizations or recommendations you can make if you understand how databases are related to each other. A classic example would be staging and production. Amazon doesn't know which is the staging database and which is the production database, but because we can see the schemas of the databases, we can identify, "Oh, they're actually the same." Therefore, if we see a schema migration happen on the staging database, we can learn, "Okay, it's going to get applied to the production database in one week, two weeks," or we can ask the user when they're going to apply it. You can start making recommendations like, "Okay, you're going to add a column. That's expensive; do it at this time so it won't interfere with things," or, "You're renaming a column. That's cheap. Do it at this time." You can make those recommendations if you understand how these things are related at a logical application level. Another example: we had a customer that deployed their application in two different locations. So, it was two different front-end applications, two different database instances.
One in the US, one in the EU. Same schema, same workload, same application code, just different physical instances. Amazon doesn't know that they're related, because they're not replicating to each other. But then what happened was they found that the query latency on the EU database was 10X slower than on the US one. It's because an index they had added to the US one, they forgot to add to the EU one. So, the new version of OtterTune can identify that these schemas are the same and prompt the user: "Hey, look, you added this index over here. You really should be adding it over there too, because we think they're the same. Yes or no?" Again, none of that is machine learning; it's just identifying that these things are related, but it matters a lot. This is what the new version of OtterTune is trying to provide.

Tim Veil: Well, it's like you say, though, it's peace of mind. You're running a complex system. You're doing tough stuff, and this is hard. It's hard to be an expert at everything. So, the idea that somebody can be out there auto, or otter, tuning this is pretty neat. I know we're running up toward the end of the time we have allotted. I wanted to ask you, though: we've been talking a lot about the technology, but obviously starting a business, starting a startup, can't be for the faint of heart. I've been privileged to watch from the inside as we've been trying to build Cockroach. I'm just curious about your take on trying to build this thing, getting investors, getting customers. What's that journey been like for you? If you want to share; you certainly don't have to. But I'm just curious, less technical and more business, what has this been like?

Andy Pavlo: I mean, it's definitely scratching an itch. Many academics do startups. I was very fortunate that my fellow co-founders are former students that I'd worked with before on OtterTune. Dana Van Aken is the CTO.
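The missing-index story above boils down to a plain set comparison, no machine learning required, just as Andy says. Here is a minimal sketch; the table and index names are made up for illustration, and real schema introspection would of course read them from the database catalog rather than hard-code them:

```python
# Index inventories for two instances that share a schema
# (hypothetical names, standing in for the US and EU databases above).
us_indexes = {"orders": {"orders_pkey", "orders_customer_idx"},
              "users":  {"users_pkey", "users_email_idx"}}
eu_indexes = {"orders": {"orders_pkey"},
              "users":  {"users_pkey", "users_email_idx"}}

def missing_indexes(reference, other):
    """Report indexes present on the reference instance but absent elsewhere."""
    gaps = {}
    for table, idxs in reference.items():
        diff = idxs - other.get(table, set())
        if diff:
            gaps[table] = diff
    return gaps

print(missing_indexes(us_indexes, eu_indexes))
# flags orders_customer_idx as missing on the EU instance
```

Once two instances are identified as logically the same application, a check like this is enough to surface the 10X latency gap before a user ever notices it.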
She did her PhD with me at Carnegie Mellon, and Bohan Zhang is also a co-founder. He did his master's degree at CMU and worked on the first version of OtterTune. So, I was fortunate that I could start the company with people I'd worked with and trust, and they're smarter than I am. They just don't know it. Then when we raised money, we raised it at the beginning of the pandemic, which was choppy waters, but we used the funds we got at the very beginning to go hire my best former students. Again, I'm very fortunate that they came along with me, and they've been fantastic. That part's been good: I get to work with the students I liked at the university, with me at the company. There are ebbs and flows, of course. There are lows and highs. The one thing I would say, coming from academia, that I underappreciated was the importance of marketing, sales, and operations. It's one of those things where you don't know what you don't know. Now that we have people handling most of these things, I'm like, "Okay, now I see it." Our operations manager is fantastic. She does a bunch of stuff for me and has been a huge lifesaver. If I had known that sooner, I would have hired people like that earlier. I think that's the thing I've learned the most. I've enjoyed it, because I'm seeing more databases, more things, more real use cases than I would have just building stuff at the university. In some cases, there have been problems that have come up with customers that OtterTune is not really in a position to solve. But then there are things like, "Oh, this is a hard problem," and that guides my own research back at the university. One example would be proxies, like PgBouncer or Odyssey from Yandex. I don't think people even realize how widely deployed these things are.

Tim Veil: Everywhere.

Andy Pavlo: Everywhere.

Tim Veil: They're everywhere.
Andy Pavlo: Nobody does any research on them, and they're actually not very good. They're not very high performance. The Yandex one is pretty good. I have a PhD student working on proxy stuff now, because I saw them a lot at OtterTune. It's not a problem OtterTune can solve, so we've been looking at it.

Tim Veil: Can I give you just a little bit of an anecdote there?

Andy Pavlo: Yes.

Tim Veil: We see PgBouncer everywhere. It frustrates us because the documentation isn't great and nobody really understands how to tune it, but it sits in the middle of every single connection to the database. So, if that thing goes sideways or is suboptimal, it can look like the database is not doing its job, and that may not be the case. So, that's really interesting, because it is pervasive in the Postgres world, obviously, and it is not well understood. It's not well understood here either, in part because the whole thing seems very opaque, the documentation, everything. So, that's really fascinating.

Andy Pavlo: We've come across people that run two or three layers of PgBouncer. PgBouncer talks to PgBouncer, which talks to PgBouncer, which talks to Postgres. It's insane.

Tim Veil: Yeah, it's everywhere.

Andy Pavlo: Yeah. So, like I said, doing a startup, plus I'm now teaching full-time at the university again. I'm back; I did a one-year leave of absence. Plus having a three-year-old daughter. I don't know how long I can keep doing this, but I enjoy it.

Tim Veil: All right. Well, first of all, I know this is mostly audio only, but for those on video, I'll probably get shot if I don't ask what is behind you there. At least in my view, there is a mannequin. I feel like you owe it to the people who may be on video to explain what that is.

Andy Pavlo: Yeah, that's Little Billy. That's the child mannequin I got when I was in grad school; it's how I proposed to my wife.
We can get into that story if you want, but when I came to Carnegie Mellon, she said, "You can't bring that child mannequin anywhere until you get tenure, because it creeps people out." So, I got tenure last year. And I taught SIMD, or vectorized instructions, in my database class, so we got to use the mannequin as a prop.

Tim Veil: That's so awesome. Well, we'll definitely do the proposal story on part two of our conversation. So, I've been ending a lot of the podcasts like this, because I know we're running out of time. For us, it's the beginning of a new fiscal year. It's spring. It's this time of optimism. I'm just curious, what are you looking forward to this year? You've got a lot going on. You've got a daughter, you've got this startup, you've got things at the university. What's exciting? What are you looking forward to this coming year?

Andy Pavlo: Yeah, so I was very COVID cautious, and I'm actually starting to travel more and go visit places and give talks, so I'm looking forward to seeing all my database friends at universities and other places in ways that I hadn't in recent years. My daughter, she can't program yet, but if you ask her what her favorite programming language is, she says SQL. So far so good there. I'm looking forward to seeing how she grows. Then my wife is fantastic, so I want to spend as much time with her as possible. It's hard with a three-year-old. On the database side, with the economy looking dicey, I'm interested in seeing what the startup landscape looks like for database companies. I realize you're at a database startup, although you guys have been around for a while, but I think it's going to shake out a lot of the weaker companies, and I'm interested in what that looks like this year. I don't think there's any exciting hardware on the horizon that Intel's putting out or Nvidia's putting out that will change how we think about building database systems.
There's nothing really, as far as I know, that's going to change anything. Then the large language models and the ChatGPT stuff, I think that's super fascinating. We've tried using it to tune databases. It doesn't always work, because it doesn't know what your database is actually doing. If you ask it for certain things, it just regurgitates Stack Overflow, which isn't always correct. But I'm really interested in seeing where that goes next in the context of databases.

Tim Veil: Yeah, we ran into a company at Gartner this week that was using it as an interface to query databases, which was really fascinating.

Andy Pavlo: At some point, I want to start building something new. I don't know what it's going to be yet. I don't have any free time, but I thought I might do generative art for databases. You upload your schema and it makes a pretty picture or something like that, using Midjourney or DALL-E. We'll see.

Tim Veil: Well, what I'm going to do after this, because I did this for something at our recent sales kickoff: I want to see what DALL-E or one of these things produces for an otter fighting a cockroach.

Andy Pavlo: Yes.

Tim Veil: See what kind of savagery is revealed there. Well, listen, I've always enjoyed our chats, and certainly the work that you've done and the projects that you've started. I think what you all are doing at OtterTune is incredibly fascinating. So, I appreciate you joining and telling us a little bit about it. If you're willing, I'd love to have you on again sometime in the future, and we can talk about even more interesting things.

Andy Pavlo: Absolutely. Like I told you, I can do this all day. I really appreciate you having me.

Tim Veil: Yeah, it's been great. So, thanks again, Andy. We'll talk soon.

Andy Pavlo: Okay, thanks, Tim. See you.

Tim Veil: Thanks again for listening to this week's episode. If you're a fan of the show, be sure to subscribe to the podcast to get every new episode in your feed as they're available.
Also, rate us five stars on your favorite podcast platform. If you like what you heard, you can also watch Big Ideas in App Architecture on our YouTube page, linked in the description. Thanks again. Bye.

Big Ideas in App Architecture: a podcast for architects and engineers who are building modern, data-intensive applications and systems, hosted by Tim Veil of Cockroach Labs.
Building conversational AI to improve the customer experience Akshay Kayastha Senior Engineering Manager at ConverseNow Engineering resilient systems: Rescuing old treasures and unleashing modern capabilities Marianne Bellotti Author, Engineering Leader, Systems Geek The Full Package: How Route architects its all-in-one post-purchase platform Siddhartha Sandhu Engineering Manager at Route A historical journey in developer technologies Mike Willbanks CTO at Spark Labs From Legacy to Cloud: Success stories from migrating mission-critical applications Kishore Koduri Senior Director of Enterprise Architecture at Ameren Building purpose-driven engineering cultures Jason Valentino Head of Engineering Enablement at BNY Mellon Modernizing Insurance Application Architecture at New York Life Mike Murphy Corporate Vice President and Life Insurance Domain Architect at New York Life Innovation and Disruption: How Materialize pioneered a new era in data streaming Arjun Narayan Co-Founder and CEO at Materialize Stories from an SRE: How Hans Knecht builds better developer experiences Hans Knecht Cloud Consultant at Knechtions Consulting (Ex: Capital One; Ex: Mission Lane) Inside Chick-fil-A’s infrastructure recipe for a perfect customer experience Brian Chambers Chief Architect at Chick-fil-A Corporate Modernizing from the Mainframe: An Exploration of Distributed Systems Chris Stura Director, PwC UK IoT Standards & Data Mesh: Utility Facility App Architecture Grant Muller Vice President, Applications and Technology Architecture at Xylem Relational Data Problems: Doubble Dating Application Architecture Mattias Siø Fjellvang CTO & Co-Founder at Doubble From Legacy Systems to Limitless Scaling with Paycor’s Systems Engineering Fellow Adam Koch Systems Engineering Fellow at Paycor How to Understand Problems & Build Better Software with Technical Leader Joe Lynch Joe Lynch Technical Leader Observability in the Cloud & Dataflow Modifications with Yolanda Davis from Cloudera Yolanda 
Davis Principal Software Engineer, Data Flow Operations Early Days at Google & Building CockroachDB with Peter Mattis Peter Mattis Co-Founder and CTO of Cockroach Labs Database Benchmarking Efficiency with OtterTune’s Andy Pavlo Andy Pavlo Associate Professor of Databaseology at Carnegie Mellon and Co-Founder at OtterTune Observability & Statelessness with TripleLift’s Chief Architect Dan Goldin Chief Architect at TripleLift Understanding AI: PubNub CTO Stephen Blum’s Key to Faster App Development Stephen Blum PubNub Building reliable systems with DoorDash's Matt Ranney Matt Ranney DoorDash Real-Time Data Capturing: The Future of Fitness Technology Paul Lawler Head of Software at Wahoo Fitness Building Efficient App Architecture with Alloy Automation’s Gregg Mojica Gregg Mojica Co-Founder and CTO Alloy Automation Unleashing the Power of Hiring Software with Greenhouse CTO Mike Boufford Mike Boufford CTO at Greenhouse Software Decoding Data Warehousing: Insights from Ken Pickering, SVP of Engineering at Starburst Data Ken Pickering Senior Vice President of Engineering, at Starburst Data PRODUCT * CockroachDB * CockroachDB Dedicated * Pricing * Get CockroachDB * Sign In * Download RESOURCES * Guides * Video & Webinars * Podcast * Compare * Architecture Overview * FAQ * Security LEARN MORE * Docs * University * GitHub SUPPORT CHANNELS * Forum * Slack * Support Portal * Contact us COMPANY * About * Blog * Careers * Customers * Partners * Events * News * Trust * Privacy * Legal Notices Ask AI COOKIE CONSENT We use cookies to personalise content and ads, to provide social media features and to analyse our traffic. We also share information about your use of our site with our social media, advertising and analytics partners.Privacy Policy Cookies Settings Reject All Accept All Cookies PRIVACY PREFERENCE CENTER When you visit any website, it may store or retrieve information on your browser, mostly in the form of cookies. 