smallpdf.com
Open in
urlscan Pro
108.157.4.81
Public Scan
URL:
https://smallpdf.com/blog/stabilizing-keycloak-at-scale
Submission: On February 02 via api from GB — Scanned from GB
Submission: On February 02 via api from GB — Scanned from GB
Form analysis
0 forms found in the DOMText Content
Home Tools * Convert & Compress * Compress PDF Reduce the size of your PDF without losing quality * PDF Converter Convert Word, PowerPoint and Excel files to and from PDF * PDF Scanner Create PDFs from scans on your mobile * Organize * Merge PDF Combine multiple PDFs into one unified document * Split PDF Extract pages from your PDF or save each page as a separate PDF * Rotate PDF Rotate one or all pages in your PDF * Delete PDF Pages Remove one or multiple pages from your PDF * Extract PDF Pages Pick only the pages you need from a PDF * View & Edit * Edit PDF Add text, shapes, images and freehand annotations to your PDF * PDF Reader Display, print, and share PDFs online * Number Pages Insert page numbers in PDF with ease * AI PDF Summarizer Summarize or chat with your PDF using AI * Convert from PDF * PDF to Word Convert PDFs to editable Word documents * PDF to Excel Convert PDFs to editable Excel spreadsheets * PDF to PPT Convert PDFs to editable PowerPoint presentations * PDF to JPG Extract images from your PDF or save each page as a separate image * Convert to PDF * Word to PDF Convert Word documents to PDF files * Excel to PDF Convert Excel spreadsheets to PDF documents * PPT to PDF Convert PowerPoint presentations to PDF documents * JPG to PDF Transform JPG, PNG, BMP, GIF, and TIFF images to PDF * Sign & Security * eSign PDF Create your signature, sign your PDF and request people to sign * Unlock PDF Remove password, encryption, and permission from your PDF * Protect PDF Add a password and encrypt your PDF file Tools * Convert & Compress * Compress PDF Reduce the size of your PDF without losing quality * PDF Converter Convert Word, PowerPoint and Excel files to and from PDF * PDF Scanner Create PDFs from scans on your mobile * Organize * Merge PDF Combine multiple PDFs into one unified document * Split PDF Extract pages from your PDF or save each page as a separate PDF * Rotate PDF Rotate one or all pages in your PDF * Delete PDF Pages Remove one or multiple pages from your PDF * Extract PDF Pages Pick only the pages you need from a PDF * View & Edit * Edit PDF Add text, shapes, images and freehand annotations to your PDF * PDF Reader Display, print, and share PDFs online * Number Pages Insert page numbers in PDF with ease * AI PDF Summarizer Summarize or chat with your PDF using AI * Convert from PDF * PDF to Word Convert PDFs to editable Word documents * PDF to Excel Convert PDFs to editable Excel spreadsheets * PDF to PPT Convert PDFs to editable PowerPoint presentations * PDF to JPG Extract images from your PDF or save each page as a separate image * Convert to PDF * Word to PDF Convert Word documents to PDF files * Excel to PDF Convert Excel spreadsheets to PDF documents * PPT to PDF Convert PowerPoint presentations to PDF documents * JPG to PDF Transform JPG, PNG, BMP, GIF, and TIFF images to PDF * Sign & Security * eSign PDF Create your signature, sign your PDF and request people to sign * Unlock PDF Remove password, encryption, and permission from your PDF * Protect PDF Add a password and encrypt your PDF file Documents Search * Compress * Convert * Merge * Edit * Sign * Pricing * Teams Log In Free Trial 1. Home 2. Blog 3. Engineering 4. Stabilizing Keycloak at Scale: Tools for Debugging & Investigating The Engineering Blog STABILIZING KEYCLOAK AT SCALE: TOOLS FOR DEBUGGING & INVESTIGATING June 14, 2022 by Silvan Jegen You can also read this article in German, French, Indonesian, Italian and Portuguese. SMALLPDF STAFF SOFTWARE ENGINEER, SILVAN JEGEN EXPLORES SETBACKS ENCOUNTERED WITH KEYCLOAK AUTHENTICATION AND SOLUTIONS FOR RESOLVING THEM. At Smallpdf we introduced Keycloak for authentication in the summer of 2021. Like many new implementations, it wasn’t a smooth process and took a lot longer than expected. In this article, Smallpdf Staff Software Engineer, Silvan Jegen, takes a deep dive into issues encountered and techniques used to investigate and debug them. STABILITY ISSUES Our initial setup comprised several Keycloak instances, each with an embedded Infinispan cache. When we put these instances into production, we quickly encountered stability issues, where the nodes ran out of memory and restarted. These restarts also resulted in user sessions being lost, which then logged these users out. Several months of experimentation on how to address these stability issues followed. We tried different combinations of embedded and external Infinispan caches with different persistence options. For a while, we used the SIFS persistence option to an externalized Infinispan cache, but after about two weeks, this setup resulted in an outage. In the end, we settled on a setup with multiple Keycloak instances, all connecting to an external Infinispan cluster. When we provisioned enough nodes to be able to sustain the production load on both the Keycloak and the Infinispan side, we emerged from the experience with a stable production system. What did the process of finding our current setup look like? LOAD TESTING Our experience with the Keycloak+Infinispan with RocksDB showed us that our production workload wasn’t as close to the load testing conditions as we thought. The first measure we took to make sure that these outages wouldn’t happen again was to make more use of load testing. The tool we turned to for this was “k6,” which allows us to generate requests with fine-grained control (thanks to JavaScript scripting) and to customize the way they’re sent (i.e., frequency, duration, etc.). We had to switch from a legacy authentication system, so we needed to send direct grant requests to our Keycloak instance, which would mimic creating a lot of sessions on our cluster. To do this, we used the following script: To run this script with “k6” for one hour using 16 TCP connections to send requests in parallel, you can use the following command line: Note: when using this script, you create multiple direct grant sessions for the same user account. This may or may not be what you actually want to test. SOLVING THE PROBLEM Using “k6” to load sessions and checking the JVM metrics and logs that we collect in our observability solution allows us to verify whether we can expect our setup to be able to support our production load. Loading sessions into Infinispan with “k6” directly instead of sending them to Keycloak turned out to not accurately reflect what would happen when we put Keycloak together with Infinispan in production. So for the best results, we recommend loading sessions by sending requests to Keycloak using the script above instead of loading them directly into Infinispan—although this would be faster. LOST SESSIONS Once our Keycloak and Infinispan clusters were stable, we expected the Refresh Token Error counts that we were observing previously to go down. The errors did in fact go down, but they were still higher than expected, meaning that thousands of users were still being logged out daily, because when they were trying to refresh their access token, they were served with an error. GETTING MORE INFO To debug this issue, we first consulted the event log Keycloak comes with. However, the default log level that’s set was too high to get the information we were interested in. So, to get more detailed logs, we set the level to “DEBUG.” It helps to be aware that this can create a lot of data which will most likely cause issues if you’re saving the event logs in the relational Keycloak DB and have substantial traffic. The logs obtained this way still didn’t contain the required error details that we needed to identify the source of these Refresh Token Errors. We ended up patching the code of our Keycloak instance to add more details than the generic errors provided, as well as the session IDs. The latter allowed us to associate the Refresh Token Errors more easily with the corresponding “Login” events. This extra info then pointed us in the right direction: the vast majority of our Refresh Token errors were due to “session_not_found” errors. This indicated that Keycloak wasn’t finding the sessions of our users after some time had passed, though the time until the session was lost seemed to vary greatly. CREATING A CONTROLLED SETUP The next step was to reproduce the issue in a controlled environment. We found that the best way to do this was to create a local Keycloak+Infinispan setup to which we could send requests using Postman. Using this local setup, we could increase the log level to “TRACE” without drowning in events, since there was no production traffic to care about. The “TRACE” debug level allowed us to see that new sessions were initially successfully retrieved from Infinispan after creation. After some time passed, however, Keycloak would try to retrieve a session on the Infinispan side and the session would be gone with no discernible reason. It turned out that evicting and recalling a session back from disk twice in a row resulted in the session not being found at the time of the next refresh token request. The reason for this is still unknown. The session disappearing was the source of the Refresh Token Errors that we were observing in our production environment. SO WHAT DID WE LEARN? Debugging Keycloak is complex, especially if you’re running it with an externalized Infinispan cluster. Having the JVM metrics and the logs readily available might not be enough. In that case, depending on a local setup to reproduce the issue you’re having is the best way to go—even if it requires you to patch Keycloak to get the data from it that you need to debug the issue after. Don’t forget to do some proper load testing for changes that you’re making in your Keycloak setup either. Based on our experience, it will probably save you some trouble in production! With special thanks to Sam Smith and Zhandos Zhylkaidar for their insights and contributions to this article. Enjoyed this post? Look out for more info, insights, and tips from the engineers at Smallpdf coming soon on the blog. Silvan Jegen Staff Software Engineer @Smallpdf RELATED ARTICLES Guides to Working from Home HOW TO INCREASE PRODUCTIVITY WHEN WORKING FROM HOME Maximize and sustain that productivity while working at home. Quick & easy tips for your remote job. The Real Estate Blog E-SIGNING & REAL ESTATE: IMPORTANT FACTS & FIGURES Real estate is a powerful tool with which to build personal and economic wealth. We take a look at some facts and figures about this sector in the US. Sustainability THE BENEFITS OF AN ENVIRONMENTALLY-FRIENDLY BUSINESS The benefits of sustainability are many, but in business, even more so. Let’s discover the many advantages of sustainability in business. Show more → * Convert & Compress * Compress PDF * PDF Converter * PDF Scanner * Organize * Merge PDF * Split PDF * Rotate PDF * Delete PDF Pages * Extract PDF Pages * View & Edit * Edit PDF * PDF Reader * Number Pages * AI PDF Summarizer * Convert from PDF * PDF to Word * PDF to Excel * PDF to PPT * PDF to JPG * Convert to PDF * Word to PDF * Excel to PDF * PPT to PDF * JPG to PDF * Sign & Security * eSign PDF * Unlock PDF * Protect PDF We make PDF easy. * SOLUTIONS * Business * Education * COMPANY * About * Help * Blog * Careers * PRODUCT * Pricing * Teams * Embed PDF * Developers * APPS * Download Smallpdf * iOS App * Android App * PDF Scanner * Windows App * * * * * * © 2024 Smallpdf AG — Made with for the people of the internet. Privacy Notice Terms & Conditions Imprint Contact Us We use cookies This website uses cookies and similar technologies (together “cookies”) that are essential for the operations of this website and its core functions. Other cookies will only be placed with your consent. For more details visit our Privacy Policy. Accept All Cookie settings