Horizontal Scaling of Atlassian Crowd

Context

Atlassian Crowd is an identity management server for web applications. It integrates very well with other Atlassian tools like JIRA, Confluence, Stash, Fisheye, Crucible, etc. We also used it, with some custom plugins, as the authentication and authorization mechanism for our CI servers (TeamCity), SonarQube (via OpenID) and source code repositories (SVN, Atlassian Stash).

Crowd was connected to Active Directory through an LDAP connector and delegated authentication to it. The picture below represents our setup.

Crowd Setup

We had about 15000 users and 1500 groups across 3 directories, each of which delegated authentication to Active Directory.

Performance issues

After a short while we started to get users complaining that they were unable to log in to JIRA, check out code from Subversion, etc. It turned out that Crowd was slowing down. Profiling the system Crowd was running on showed that it had plenty of CPU and memory to spare.

The problem turned out to be Active Directory, more specifically its slow response time for authentication requests at times of substantial load.

Plan and decision

We decided to run a number of Crowd instances, each serving a different set of applications.

Implementation

We created a Master -> Slave setup. All user and group management would happen in the Master Crowd and propagate to the Slave Crowd. We grouped the applications by authentication load and set them up on the appropriate Crowd servers.

The Crowd configuration was duplicated and the catalogs imported, so the Slave Crowd was still using Active Directory for authentication. Then we tackled the problem of keeping the users and groups of the Crowd Master and Slave in sync.

Keeping Slave up to date

Unable to find a good out-of-the-box solution, we developed a custom Crowd plugin and a bunch of web services.

The plugin functionality was rather simple: listen to any user or group events in Crowd (add/update/remove) and perform a simple HTTP GET request with the change details to a preconfigured URL.
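A minimal sketch of how such a notification URL could be put together, assuming the change details travel as query parameters. The function name, the parameter names and the example URL are hypothetical. The real plugin was written against Crowd's Java event API, which is not shown here.

```python
from urllib.parse import urlencode

# Hypothetical vocabulary for the change details; the original post only
# says that user/group add, update and remove events were forwarded.
ENTITIES = {"user", "group"}
ACTIONS = {"add", "update", "remove"}


def build_notification_url(base_url, entity, action, directory, name):
    """Encode one change event as query parameters on the preconfigured URL."""
    if entity not in ENTITIES:
        raise ValueError(f"unknown entity: {entity}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    query = urlencode(
        {"entity": entity, "action": action, "directory": directory, "name": name}
    )
    return f"{base_url}?{query}"
```

The plugin would then issue a plain GET request against the resulting URL. Keeping the payload in the query string means the receiving web service needs no request body parsing.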

We created two types of web services:

  1. The web service that the Crowd plugin calls when a change happens (WS 1). It asynchronously calls the next web service, which performs the changes in the Slave Crowd.
  2. The web service that performs the changes in the Slave Crowd (WS 2).
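The hand-off between the two services could look roughly like the sketch below. It assumes a thread pool for the asynchronous part and an injected `send` callable that stands in for the HTTP call to the second web service. The class and its names are hypothetical, not taken from the original implementation.

```python
from concurrent.futures import ThreadPoolExecutor


class ChangeRelay:
    """Receives a change from the plugin and forwards it to every slave-side
    web service in the background, so the plugin's request returns quickly."""

    def __init__(self, slave_endpoints, send, max_workers=4):
        self._endpoints = list(slave_endpoints)
        self._send = send  # callable(endpoint, change); assumed to do the HTTP call
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def on_change(self, change):
        """Queue one delivery per slave endpoint and return the futures."""
        return [
            self._pool.submit(self._send, endpoint, change)
            for endpoint in self._endpoints
        ]

    def close(self):
        """Wait for queued deliveries to finish and release the threads."""
        self._pool.shutdown(wait=True)
```

Because the relay holds no state beyond its list of endpoints, several copies can run side by side, which fits with web services that have no database.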

The picture below represents the solution.

Crowd Simple Setup

Multiple Slaves

We ended up with multiple Crowd Slave instances. One of the instances had to run an older, different version. Separating the web services made it possible to use different versions of the Crowd Java client libraries.

We also used this setup to keep our UAT environment supplied with up-to-date data from the Master Crowd.

The Web Services are stateless and have no database.

Crowd Full setup

Full Crowd synchronization

As additional functionality we implemented a full synchronization triggered from WS 1. WS 1 takes the list of all users and groups for each directory from the Master Crowd. Once collected, it calls each WS 2 with a full update.

This functionality makes it very easy to bring up new UAT/DEV Crowd environments and populate them with production data. It also makes it possible to resync an entire directory if one of the web services goes down.
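The collect-then-push flow can be sketched as follows. The callables `list_users`, `list_groups` and the entries of `push_targets` are hypothetical stand-ins for the calls to the Master Crowd and to each WS 2. The snapshot layout is likewise an assumption.

```python
def full_sync(directories, list_users, list_groups, push_targets):
    """Collect every directory from the master first, then push the complete
    snapshots to each slave-side web service."""
    snapshots = []
    for directory in directories:
        snapshots.append(
            {
                "directory": directory,
                "users": sorted(list_users(directory)),
                "groups": sorted(list_groups(directory)),
            }
        )
    # Push only after everything has been collected, so every slave
    # receives the same view of the master.
    for push in push_targets:
        for snapshot in snapshots:
            push(snapshot)
    return snapshots
```

Collecting before pushing keeps the number of reads against the master independent of the number of slaves.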

Summary

The solution described above gave us horizontal scalability and the ability to work with different Crowd client libraries and Crowd versions. It also makes it easy to upgrade Crowd instances.

Frame – works?!

 

For the last few years I’ve been listening to software architects having conversations about which frameworks should be used and where. There are always different opinions. Heated discussions. In the end, a decision is made that turns out to be a pain in the developer’s back.

What is a framework?

A framework is a common piece of code that provides some generic functionality that can be reused. All frameworks are created with the aim of solving some kind of problem. I like to think of frameworks as tools with default solutions.

In software development there is a great multitude of frameworks. There are testing frameworks for writing your tests. Dependency Injection frameworks for all your DI needs. Multitudes of web frameworks for every possible programming language that can serve HTML.

Framework creation

I have been involved in building frameworks. They were designed for a specific technology, with the purpose of solving a particular problem. Once they had been created, people thought those frameworks would be good at solving any problem.

Unfortunately, those frameworks were designed to deal with the few issues their authors had. It is possible that by chance you are dealing with exactly the same problem and the framework suits you.

Framework on steroids

There are frameworks that have become monsters. Their architects started with a simple idea and developed it into a mammoth project. Those frameworks try to solve all your problems. All they do in the end is create more.

Process of selection

As I mentioned earlier, a framework is designed to deal with a specific problem, which is why it will most likely not solve yours 100%. During my development life I have learned that no two software projects look the same.

That is why you should leave some room for yourself when you select a framework. Don’t rely entirely on all the framework features. Don’t bind yourself to one framework either. If there is a way, try to shield yourself with some kind of abstraction that can easily be adapted to another framework.

Avoid frameworks on steroids. If you go down that route, you’ll end up hacking around the framework, or putting up with it and jeopardising your design.
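One way to picture such a shield is a small interface that your own code owns, with a thin adapter per framework behind it. The sketch below uses the standard `json` module as a stand-in for "the framework". All class and function names are made up for illustration.

```python
import json
from typing import Protocol


class Serializer(Protocol):
    """The only thing application code is allowed to know about."""

    def dumps(self, data: dict) -> str: ...


class JsonSerializer:
    """Adapter around the json module, standing in for framework A."""

    def dumps(self, data: dict) -> str:
        return json.dumps(data, sort_keys=True)


class KeyValueSerializer:
    """A second adapter, standing in for a replacement framework B."""

    def dumps(self, data: dict) -> str:
        return ";".join(f"{key}={data[key]}" for key in sorted(data))


def export_report(report: dict, serializer: Serializer) -> str:
    # Application code depends on the Serializer interface only, so
    # swapping the framework means writing one new adapter.
    return serializer.dumps(report)
```

The abstraction costs a few lines up front, but it confines the framework to one adapter class instead of letting it spread through the codebase.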

One tool to rule them all

I don’t like to use one tool to solve many problems. I always compare it to using a hammer. You can drive a nail into a piece of wood with it perfectly well. I’m sure you can cut a wooden plank with a hammer as well, but unfortunately the result might not be as pretty as if you had used a saw.

Simplicity is bliss

According to Dr John Medina, the human brain is constantly learning, trying to see patterns, matching them and generalising. This is what we tend to do when it comes to software design. We often see patterns in places where there aren’t any.

I was once at a brilliant presentation delivered by Dan North on architecture design. The bottom line was that we should take a step back and look at what we are doing, as the solution might be much simpler without any fancy framework.

Greg