A couple of years ago - before LLM coding tools became ubiquitous - I was tasked with migrating the SAML implementation of an enterprise SaaS solution, which was a Grails application. It wasn't quite legacy, but nobody wanted to touch the SAML code. It worked, and breaking it meant breaking sign-in for thousands of users. In this post I'll talk about the challenges I faced, how I grew, and what I took away from the experience.

Around this time our security team had been ramping up their tooling, and we'd adopted a static dependency analysis tool. That analysis flagged the Security Plugin used in our main application for relying on a dependency with multiple critical CVEs.

That dependency was our SAML library, the Grails SAML 2.0 Plugin, which is no longer maintained* (to be completely accurate, the issue was the plugin's dependency on the Spring Security SAML Extension, which was its primary component). We considered maintaining a fork, but concluded that it was better to migrate to Spring's Native SAML 2.0 Library.

* While writing this post, I found that the maintainers seem to have actually moved that project off of the extension and onto Spring's Native SAML2 Library! It's hard to tell from the commit history, but it appears to have happened shortly after my migration to native. Had our security team been delayed a month or two, this migration and the lessons I learned from it might never have happened!

Quick Overview on SAML

SAML stands for Security Assertion Markup Language. It is an authentication standard that enables SSO (single sign-on). In SAML we have an IDP (identity provider: the single sign-on side, such as Google) and an SP (service provider: the application you log in to through the IDP). The IDP and SP exchange XML in a standard format to complete this process.

The general flow looks like this:
- User tries to access the service provider (SP).
- SP sends the user to the IDP with a SAML Authn Request.
- IDP authenticates the user (the user signs in through SSO).
- IDP generates a SAML Assertion and signs it.
- The SAML Assertion is POSTed back to the SP (to an 'assertion consumer' endpoint).
- SP verifies the signature and logs the user in.

samlflow.png
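
To make the second step of the flow a bit more concrete: one common way for the SP to hand the Authn Request to the IDP is the SAML HTTP-Redirect binding, where the request XML is raw-DEFLATE compressed, Base64 encoded, URL encoded, and sent as a `SAMLRequest` query parameter. The sketch below shows only that encoding and its reverse using the Java standard library. The class name, the sample XML, and the IDP URL are made up for illustration; a real SAML library does this for you (and handles signing), so treat this as a learning aid rather than what our application ran.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of any SAML library.
class RedirectBindingSketch {

    // SP side: raw DEFLATE -> Base64 -> URL-encode.
    static String encode(String authnRequestXml) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // true = no zlib header
        deflater.setInput(authnRequestXml.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            compressed.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        String base64 = Base64.getEncoder().encodeToString(compressed.toByteArray());
        try {
            return URLEncoder.encode(base64, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // IDP side: the same steps in reverse.
    static String decode(String samlRequestParam) {
        try {
            String base64 = URLDecoder.decode(samlRequestParam, "UTF-8");
            Inflater inflater = new Inflater(true);
            inflater.setInput(Base64.getDecoder().decode(base64));
            ByteArrayOutputStream xml = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) {
                    break; // truncated input: stop instead of looping forever
                }
                xml.write(buffer, 0, n);
            }
            inflater.end();
            return new String(xml.toByteArray(), StandardCharsets.UTF_8);
        } catch (UnsupportedEncodingException | DataFormatException e) {
            throw new IllegalArgumentException("not a valid SAMLRequest parameter", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<samlp:AuthnRequest xmlns:samlp=\"urn:oasis:names:tc:SAML:2.0:protocol\""
                + " ID=\"_example\" Version=\"2.0\"/>";
        // idp.example.com is a placeholder IDP endpoint.
        System.out.println("https://idp.example.com/sso?SAMLRequest=" + encode(xml));
    }
}
```

The response leg of the flow (the signed Assertion POSTed back to the SP) uses a different binding and is not covered by this sketch.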

When implementing or working with SAML, there are good libraries out there that provide the scaffolding and sensible defaults. However, to implement SAML for an enterprise SaaS solution you need to understand what's going on under the hood so you can meet the requirements of many clients, ensure compatibility with existing infrastructure, and provide documentation to professional services and implementation folks.

The Migration

After some initial analysis and discovery, it was time to dive in. It was pretty obvious the libraries had functional parity; the challenge was going to be using the Spring native implementation in a Grails context while maintaining backward compatibility for our existing SAML clients.

I initially struggled to get started: the official docs assumed Spring Boot, so I had to reverse-engineer an unofficial Grails plugin buried in GitHub (I shouldn't understate what a stroke of luck finding this GitHub project was - there wasn't a lot out there pertaining to the problem I was facing). The reverse engineering meant studying how the beans were instantiated and configured, which gave me a much better idea of which beans from the native library corresponded to the legacy plugin's beans.

This reference implementation gave me a foundation to build off of - I had a working SAML implementation. But the work was far from over: I had to ensure backwards compatibility so that, from a functional perspective, nothing changed when we went live.

The first challenge was routing: the Spring Security SAML2 Library did not play well with Grails configurations. This meant I couldn't easily specify which URL(s) assertions should be consumed at with our new implementation. I ended up adding a middleware filter that matched on the URLs previously configured for existing clients and forwarded the request internally after replacing the URLs on the request object. This was required because the SAML library internally validated the URLs against different routes whose paths could not be configured (in our Grails app). An even more in-the-weeds problem was that POSTs can't simply be redirected. Luckily, the Spring Boot dispatcher has something called 'Forwarding', which lets you manipulate a request and 'forward' it; that got me past not being able to simply redirect the incoming request to a different endpoint. In the same filter I'd intercept the just-forwarded request I'd manipulated and manually invoke the built-in Spring Security SAML filter chain, which allowed me to get around the URL validation.
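
As a rough illustration of the routing decision such a filter makes, here is a minimal sketch in plain Java. It only covers the "which internal path should this legacy request be forwarded to" part. The class name, the legacy path, and the client id are hypothetical; the native prefix is Spring Security's documented default assertion consumer path (`/login/saml2/sso/{registrationId}`), which may or may not match what a given application uses. I'm also assuming the forward itself would be done with the Servlet API's `RequestDispatcher.forward`, which is mentioned only in a comment so the example stays dependency-free.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: maps legacy assertion consumer URLs to the native library's path.
class LegacyAcsRouter {

    // Spring Security's default ACS path prefix; the registration id is appended to it.
    static final String NATIVE_ACS_PREFIX = "/login/saml2/sso/";

    private final Map<String, String> legacyPathToRegistrationId;

    LegacyAcsRouter(Map<String, String> legacyPathToRegistrationId) {
        this.legacyPathToRegistrationId = new HashMap<>(legacyPathToRegistrationId);
    }

    // Returns the internal path to forward to, or empty if the request should pass through untouched.
    Optional<String> forwardTarget(String httpMethod, String requestPath) {
        if (!"POST".equalsIgnoreCase(httpMethod)) {
            return Optional.empty(); // assertions arrive as POSTs; ignore everything else
        }
        String registrationId = legacyPathToRegistrationId.get(requestPath);
        return Optional.ofNullable(registrationId).map(id -> NATIVE_ACS_PREFIX + id);
    }

    public static void main(String[] args) {
        Map<String, String> legacy = new HashMap<>();
        legacy.put("/saml/SSO/alias/acme", "acme"); // made-up legacy URL and client id
        LegacyAcsRouter router = new LegacyAcsRouter(legacy);

        // Inside a servlet filter (assumption), a present target would be used roughly as:
        //   request.getRequestDispatcher(target).forward(request, response);
        System.out.println(router.forwardTarget("POST", "/saml/SSO/alias/acme").orElse("(pass through)"));
    }
}
```

Keeping the mapping in a small class like this makes the backward-compatibility rules testable without standing up a servlet container. The second half of the filter described above (intercepting the forwarded request and invoking the SAML filter chain) is intentionally left out.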

This filter/interceptor was the first middleware I had ever written. While this whole project was a before-and-after moment for me, this piece was a microcosm of that: my eyes were opened to all the amazing (and potentially gross and hacky) things you could do with web application middleware.

I also had to write a custom SAML Authentication Request Factory. The legacy Grails library had different defaults and allowed more configuration than Spring's native SAML library. I had to extend what the native library provided and hook into the SAML Authn request creation to apply configs we had before that were not natively configurable. One example ended up turning into an enhancement as part of the migration: we had not previously developed or tested with Okta as an IDP, just a SimpleSaml Docker container, so for rigor I included dev/testing with Okta during this migration. Okta rejected our Authn request because we were omitting a NameIDPolicy. This had been configurable with the previous plugin, but through Spring Security SAML it was not, so I exposed it as a configuration in our application via my custom Authn Request Factory. Shortly after we released this to prod, we had our first Okta SAML client begin implementation. It felt kind of cool to defuse that landmine before we encountered it.
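
For readers who haven't looked at the XML itself, here is a minimal sketch of what "making NameIDPolicy configurable" means at the message level. It builds a bare-bones, unsigned AuthnRequest as a string and only emits the `NameIDPolicy` element when a format is configured. The class and method names are hypothetical, the values are assumed to be XML-safe, and this is not how Spring Security builds requests internally (a real factory would go through the library's object model and handle signing).

```java
import java.time.Instant;
import java.util.UUID;

// Hypothetical illustration of an AuthnRequest with an optional NameIDPolicy.
class AuthnRequestSketch {

    // nameIdFormat may be null, in which case no NameIDPolicy element is written.
    static String build(String issuer, String acsUrl, String nameIdFormat) {
        StringBuilder xml = new StringBuilder();
        xml.append("<samlp:AuthnRequest")
           .append(" xmlns:samlp=\"urn:oasis:names:tc:SAML:2.0:protocol\"")
           .append(" xmlns:saml=\"urn:oasis:names:tc:SAML:2.0:assertion\"")
           .append(" ID=\"_").append(UUID.randomUUID()).append('"')
           .append(" Version=\"2.0\"")
           .append(" IssueInstant=\"").append(Instant.now()).append('"')
           .append(" AssertionConsumerServiceURL=\"").append(acsUrl).append("\">");
        xml.append("<saml:Issuer>").append(issuer).append("</saml:Issuer>");
        if (nameIdFormat != null) {
            // The element an IDP may require before it accepts the request.
            xml.append("<samlp:NameIDPolicy Format=\"").append(nameIdFormat)
               .append("\" AllowCreate=\"true\"/>");
        }
        xml.append("</samlp:AuthnRequest>");
        return xml.toString();
    }

    public static void main(String[] args) {
        // sp.example.com is a placeholder; the format URN is a standard SAML NameID format.
        System.out.println(build(
                "https://sp.example.com/metadata",
                "https://sp.example.com/login/saml2/sso/acme",
                "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"));
    }
}
```

Driving the element from a nullable setting keeps existing clients on their old behavior while letting a new client opt in, which is the backward-compatibility property the migration needed.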

At this point, there were only minor edge cases to correct. The work was done, and at that moment I could have aced any exam on SAML or Spring Boot middleware.

One cool extra that we got for free as part of the migration was SLO (Single Logout), which meant that when you logged out of our application you were also logged out of the IDP. We had previously hacked around this by setting a post-logout redirect URL to the IDP's clear session endpoint, but now we officially supported SAML 2.0 SLO.

Growth

Coming out of this project I certainly gained an appreciation of the IoC pattern and, specifically, the elegance of Spring's IoC. The ability to plug in my own implementations of library interfaces was an absolute godsend. Sure, this is just what object-oriented programming is, but with Spring Beans it was so easy.

I also learned the effectiveness of digging in and getting your hands dirty. I experienced some paralysis-by-analysis in the discovery phase. I was scared to start because I didn't know what to do. Once I began Just Trying Things, the ball started to roll and I gained a much better understanding of what needed to be done and how it all worked.

Again, learning about middleware and the concept of filters and a filter chain was an eye opener for me. Adding custom filters and using Spring's request forwarding to maintain backwards compatibility was huge.

I also gained a deeper knowledge of SAML, including which parts of the process can and should be signed or encrypted. Finding the quirks and details of a SAML library forced a deeper understanding, because I needed it to implement workarounds.

Lastly, I had to Just Figure It Out: I had just become the Engineering Manager of the team, and my boss was heads down working with another team. If I had an absolute red alert I could contact him, but I was pretty much on my own in a sea of unknowns. Having no option other than to figure it out, and being able to accomplish that, gave me a ton of confidence moving forward.

Conclusion

Being a couple of years removed from this experience, I had to go back and look at the PR to jog my memory about the details of the problems I had to solve - but I'll never forget this project as a turning point for me. It changed the way I think about software applications and authentication. Since then, I've been much more confident in my ability to understand and solve hard problems.