$600M Enterprise SaaS Platform Upgrades From Stand-Alone Keycloak to Kubernetes-Based Deployment, Reducing Load Time by 83.33%
Client Overview
The client is a $600M enterprise SaaS platform that helps organizations enable secure remote work without managing complex infrastructure. They cater to companies seeking enterprise-level collaboration features.
The Objective: Upgrade Keycloak From a Stand-Alone to a Kubernetes-Based Instance
The client ran an older version of Keycloak, an open-source identity and access management tool, as a stand-alone instance. They wanted to update the application to gain new features and bug fixes and to use Kubernetes for scalability.
Because they were not using the Red Hat version (Red Hat Single Sign-On), they had no access to enterprise support, and their internal team did not have the capacity to carry out the upgrade to the latest version of the application. Their IT team, while capable, was busy managing technical debt and improving features.
The team wanted to understand the level of effort required to upgrade the platform. To minimize disruption, they also wanted someone to carefully test the application in a mirror environment before upgrading their infrastructure. Most importantly, they wanted to carry their data over as-is and keep their current Postgres database.
Our Red Hat partners referred the client to Shadow-Soft, given our experience with Keycloak, Kubernetes, and Application Modernization.
The Solution: Kubernetes Instance of Keycloak (v25)
We successfully upgraded the client from Keycloak v18 to v24 while preserving the database and data integrity. After testing and validating the change, we moved the client from the v24 stand-alone environment to v25 on Kubernetes.
Additionally, our team created extensive documentation to ensure the internal team could understand the application, replicate our journey, and improve upon the new instance.
Our Process
To begin, the client built a mirror instance for our team. Using the mirror database, we built a PoC to test the upgrades.
We started the project by testing an incremental upgrade from v18 to v21 and then from v21 to v24. It was a safe approach, but we found no significant changes between those versions. The move from v24 to v25 on the Kubernetes cluster, however, was a significant change.
The client had numerous custom SPIs (Keycloak extensions). Because they were jumping several versions, they wanted to test whether these SPIs would be compatible with the later versions and, if not, to know how to make them compatible.
The Step-By-Step Process:
- Assist the client with creating a duplicate running instance of the Keycloak Environment and Database.
- Update the Keycloak instance configuration as a PoC using the duplicate Postgres database.
- Upgrade the initial binary to Keycloak 24.
- Test database migration scripts and functionality.
- Test integration of 10 custom SPIs to ensure functionality.
- Review and fix any issues.
- Create Kubernetes Deployment of Keycloak 24.
- Create an NFS Deployment Instance in a customer-provided Kubernetes cluster.
- Upload custom Keycloak themes to NFS.
- Deploy Keycloak 24 using the existing Bitnami Keycloak Helm chart.
- Add config maps to deployment to configure each custom SPI.
- Mount NFS share to the directory.
- Deploy ingress entry.
- Test by validating session replication across multiple instances.
- Migrate remaining changes during off-peak hours.
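As a rough illustration of the Kubernetes steps above, the sketch below assembles a Helm values override that keeps the existing Postgres database, mounts an NFS share for themes, and mounts one ConfigMap per custom SPI. It is a minimal sketch, not the client's actual configuration: the top-level key names follow the Bitnami Keycloak chart's conventions as we understand them and should be checked against the `values.yaml` of the chart version in use, and every host name, path, ConfigMap name, and mount location is a hypothetical placeholder.

```python
import json


def build_keycloak_values(nfs_server, nfs_path, spi_configmaps, hostname):
    """Build a Helm values override as a plain dict (illustrative only)."""
    # The NFS share holds the custom themes; each custom SPI gets its own ConfigMap.
    volumes = [{"name": "themes", "nfs": {"server": nfs_server, "path": nfs_path}}]
    mounts = [{"name": "themes", "mountPath": "/opt/bitnami/keycloak/themes"}]  # assumed path
    for name in spi_configmaps:
        volumes.append({"name": name, "configMap": {"name": name}})
        # Hypothetical mount location for SPI configuration files.
        mounts.append({"name": name, "mountPath": f"/opt/bitnami/keycloak/conf/{name}"})

    return {
        # More than one replica, so session replication can be validated.
        "replicaCount": 2,
        # Keep the existing Postgres database instead of a chart-managed one.
        "postgresql": {"enabled": False},
        "externalDatabase": {
            "host": "postgres.example.internal",  # hypothetical host
            "port": 5432,
            "database": "keycloak",  # hypothetical database name
        },
        "extraVolumes": volumes,
        "extraVolumeMounts": mounts,
        "ingress": {"enabled": True, "hostname": hostname},
    }


values = build_keycloak_values(
    nfs_server="nfs.example.internal",
    nfs_path="/exports/keycloak-themes",
    spi_configmaps=["spi-example-a", "spi-example-b"],
    hostname="sso.example.com",
)
print(json.dumps(values, indent=2))
```

Because JSON is a subset of YAML, the printed output can be saved to a file and passed to Helm with the `-f` flag. Generating the values from code is only one option; a hand-written values file works just as well.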
Key Features
- New Application Version: The client now has access to the latest version of Keycloak, leveraging the benefits of Kubernetes.
- Autoscaling: The platform's initial load time is much shorter, and the application automatically scales to meet user demand.
- Faster Future Updates: The team can redeploy future updates much faster and more easily than was possible on the stand-alone instance.
- Improved User Experience: The client can take full advantage of all new features to serve customers better and provide an optimal experience.
- Extensive Documentation: We created a YAML file for the client to relaunch the entire system with one command if there is any issue or disaster with the instance.
- Application Resilience: Because the application is built on Kubernetes, any crash only impacts a single pod that will restart independently.
- Rolling Restart: If the team needs to update any SPI, they can do so without downtime when updating the configuration.
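To make the rolling restart concrete, the sketch below builds the two `kubectl` commands a team might run after changing an SPI's configuration: one to trigger the restart and one to wait for it to finish. It is a minimal sketch under stated assumptions: the workload kind (`statefulset`), the workload name, and the namespace are hypothetical and depend on how the chart was installed.

```python
def rolling_restart_commands(name, namespace, kind="statefulset", timeout="10m"):
    """Return kubectl commands that restart a workload pod by pod and wait for completion."""
    target = f"{kind}/{name}"
    return [
        # Replaces pods gradually, so other replicas keep serving traffic.
        ["kubectl", "rollout", "restart", target, "-n", namespace],
        # Blocks until the rollout completes or the timeout expires.
        ["kubectl", "rollout", "status", target, "-n", namespace, f"--timeout={timeout}"],
    ]


# Hypothetical workload name and namespace.
commands = rolling_restart_commands("keycloak", "identity")
for command in commands:
    print(" ".join(command))
```

The commands are returned as argument lists so they can be passed to `subprocess.run` without shell quoting concerns.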
Challenges with Migrating from Stand-Alone to Kubernetes
Upgrading across several versions changes the database: Keycloak makes permanent changes to the database schema. We needed to implement the upgrade carefully, with certainty that the data and database parameters would remain intact.
Every time we updated the Keycloak binary, the schema migration had to process more than 600 realms, which took about six hours of hands-off processing and caused the system to time out. After testing, we discovered it was safe to upgrade from v18 to v24 directly, which avoided repeating that migration for each intermediate version. Then, we focused on providing more direct support for upgrading to v25 and managing the shift to Kubernetes.
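The cost of each extra upgrade hop can be estimated with simple arithmetic. The sketch below assumes, as a simplification, that every binary update triggers a migration of roughly six hours, and it uses 600 realms as a round figure (the actual count was higher, so the per-realm time is an upper bound).

```python
HOURS_PER_MIGRATION = 6.0  # approximate hands-off processing time per binary update
REALM_COUNT = 600          # round figure; the client had more than 600 realms


def total_migration_hours(hops, hours_per_hop=HOURS_PER_MIGRATION):
    """Estimate hands-off migration time for an upgrade path with the given number of hops."""
    return hops * hours_per_hop


incremental_hours = total_migration_hours(2)  # v18 -> v21 -> v24
direct_hours = total_migration_hours(1)       # v18 -> v24
seconds_per_realm = HOURS_PER_MIGRATION * 3600 / REALM_COUNT

print(f"Incremental path: {incremental_hours:.0f} hours")
print(f"Direct path: {direct_hours:.0f} hours")
print(f"Per-realm upper bound: {seconds_per_realm:.0f} seconds")
```

Under these assumptions, the direct path halves the hands-off migration time compared with the two-hop path.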
The internal IT team was often busy managing technical debt and working on application improvements. This slowed responsiveness and added time to the project. To mitigate this, we followed up with the client regularly to keep the project moving forward.
Tools
As leading Red Hat enterprise partners and Kubernetes experts with over 15 years of experience in the field, we bring a range of expertise to every client engagement, whether the client is a current Red Hat customer or uses open-source versions.
Results: 83.33% Load Time Improvement
Our work with the client helped them modernize their application stack.
Moving from v18 to v25 of the application allowed the team to take full advantage of Kubernetes. Our careful approach minimized disruption and risk to the business while allowing them to improve application performance and the customer experience.
While the client’s team is familiar with Linux, we’ve helped them think differently about how they provision infrastructure and applications using Kubernetes.
Key Results:
- The business now runs v25 of the application in Kubernetes instead of a standalone instance, opening the door to new potential customer outcomes.
- The application loads 83.33% faster. This reduction in load time (from 15 minutes to 2.5 minutes) helps the platform scale better to meet user demand.
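The headline figure follows directly from the two load times quoted above. The short sketch below shows the calculation; the function name is ours, introduced only for illustration.

```python
def percent_reduction(before, after):
    """Return the percentage reduction from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


reduction = percent_reduction(15, 2.5)  # load time in minutes
speedup = 15 / 2.5

print(f"Load time reduction: {reduction:.2f}%")  # 83.33%
print(f"Speedup factor: {speedup:.0f}x")         # 6x
```

In other words, the application now loads in one sixth of the original time.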