upgrade

Upgrading Juju using model migrations

Since Juju 2.0 there's a new feature, model migrations, intended to help provide a bulletproof upgrade process. The operator stays in control throughout and has numerous sanity checks to help provide confidence along the upgrade path. Model migrations allow an operator to bring up a new controller on a new version of Juju and to then migrate models from an older controller one at a time. These migrations go through the process of putting agents into a quiet state and queue'ing any changes that want to take place. The state is then dumped out into a standard format and shipped to the new controller. The new controller then loads the state and verifies it matches by checking it against the state from the older controller. Finally, the agents on each managed machine are checked to make sure they can communicate with the new controller and that any state matches expectations before those agents update themselves to report to the new controller for duty.

Once this is all complete the handoff is finished and the old controller can be taken down once the last model is migrated away. In order to show how this works I've got a controller running Juju 2.1.3 and we're going to upgrade my models running on that controller by migrating them to a brand new Juju 2.2 controller. 

One thing to remember is that Juju controllers are the kings of state. Juju is an event based system and on each machine or cloud instance managed an agent runs that communicates with the controller. Events from those agents are processed and the controller updates the state of applications, triggers future events, or just takes note of messages in the system. When we talk about migrating the model, we're only moving where the state system is communicating. None of the workloads are moved. All instances and machines stay exactly where they're at and there's no impact on the workloads themselves. 

$ juju models -c juju2-1
Controller: juju2-1

Model       Cloud/Region   Status     Machines  Cores  Access  Last connection
controller  aws/us-east-1  available         1      1  admin   just now
gitlab      aws/us-east-1  available         2      2  admin   49 seconds ago
k8s*        aws/us-east-1  available         3      2  admin   39 seconds ago

This is our controller running Juju 2.1.3 and it has on it a pair of models running important workloads. One is a model running a Kubernetes workload and the other is running gitlab hosting our team's source code. Lets upgrade to the new Juju 2.2 release. The first thing we need to do is to bootstrap a new controller to move the models to. 

Gitlab running in the gitlab model to host my team's source code.

Gitlab running in the gitlab model to host my team's source code.

First we upgrade our local Juju client to Juju 2.2 by getting the updated snap from the stable channel. 

sudo snap refresh juju --classic

Now we can bootstrap a new controller making sure to match up the cloud and region of the models we want to migrate. They were in AWS in the us-east-1 region so we'll need to make sure to bootstrap there.

$ juju bootstrap aws/us-east-1 juju2-2

Looking at this model we have the two, out of the box, models a new controller always does. 

$ juju models
Controller: juju2-2

Model       Cloud/Region   Status     Machines  Cores  Access  Last connection
controller  aws/us-east-1  available         1      1  admin   2 seconds ago
default*    aws/us-east-1  available         0      -  admin   8 seconds ago

To migrate models to this new controller we need to be on the older controller. 

$ juju switch juju2-1

With that done we can now ask Juju to migrate the gitlab model to our new controller. 

$ juju migrate gitlab juju2-2
Migration started with ID "44d8626e-a829-48f0-85a8-bd5c8b7997bb:0"

The migration kicks off and the UUID is provided as a way of tracking among potentially several migrations going on. If a migration were to fail for any reason and resume its previous state we could make corrections and try again. Watching the output of juju status while the migration processes is a interesting. Once done you'll find that status errors because the model is no longer there. 

$ juju list-models -c juju2-1
Controller: juju2-1
    
Model       Cloud/Region   Status     Machines  Cores  Access  Last connection
controller  aws/us-east-1  available         1      1  admin   just now
k8s         aws/us-east-1  available         3      2  admin   54 minutes ago

Here we can see our model is gone! Where did it go?

$ juju list-models -c juju2-2                                                                                        (rharding@dingy:)
Controller: juju2-2

Model       Cloud/Region   Status     Machines  Cores  Access  Last connection
controller  aws/us-east-1  available         1      1  admin   just now
default*    aws/us-east-1  available         0      -  admin   45 minutes ago
gitlab      aws/us-east-1  available         2      2  admin   1 minute ago

There we go, it's over on the new juju2-2 controller. The controller managing state is on Juju 2.2, but now it's time for us to update the agents running on each of the machines/instances to also be the new version.

$ juju upgrade-juju -m juju2-2:gitlab
started upgrade to 2.2

$ juju status | head -n 2
Model    Controller  Cloud/Region   Version    SLA
default  juju2-2     aws/us-east-1  2.2        unsupported

Our model is now all running on Juju 2.2 and we can use any of the new features that are useful to us against this model and the applications in it. With that upgraded let's go ahead and finish up by migrating the Kubernetes model. The process is exactly the same as the gitlab model. 

$ juju migrate k8s juju2-2
Migration started with ID "2e090212-7b08-494a-859f-96639928fb02:0"

...one, two, skip a few...

$ juju models -c juju2-2
Controller: juju2-2

Model       Cloud/Region   Status     Machines  Cores  Access  Last connection
controller  aws/us-east-1  available         1      1  admin   just now
default*    aws/us-east-1  available         0      -  admin   47 minutes ago
gitlab      aws/us-east-1  available         2      2  admin   1 minute ago
k8s         aws/us-east-1  available         3      2  admin   57 minutes ago

$ juju upgrade-juju -m juju2-2:k8s
started upgrade to 2.2

The controller juju2-1 is no longer required since it's not in control of any models. There's no state for it to track and manage any longer. 

$ juju destroy-controller juju2-1

Give model migrations a try and keep your Juju up to date. If you hit any issues the best place to look is in the juju debug-log from the controller model. 

$ juju switch controller
$ juju debug-log

Make sure to reach out and let us know if it works well or if you hit a bug in the system. You can also use model migrations to move models among the same version of Juju to help balance load or for maintenance purposes. You can find the team in #juju on Freenode or send a message to the mailing list. 

Designing for long term operations, upgrades, and rebalancing

When you're building a tool for a user there's a huge amount of design, decision making, and focus put on getting the user started. There's good reason for this. Users have an ocean full of tools at their disposal, and if you're building something for others to use you need to win the first five minute test.

You've got about a five minute window for a user to make useful headway in understanding your tool and visualizing how it will aid them in their work. Often this manifests in tools that demo well, but that then have to be left behind when the real work hits. I like to use text editors as an example of this. So many users start out with notepad, nano, or some other really light editor. In time, they learn more and the learning curve of something like VIM, Eclipse, or Emacs really pays off. There's a big gap in folks that make that leap though. If you want to get the most folks in the door it needs to have that hit the ground running feel to it.

When you're designing a tool to help users deploy and run software there's a lot of focus on the install process. Nearly every hosting provider has worked with some sort of "one button install" tool. They're great because it gets users started quickly in that five minute window. Over time though, users find that those one click tools end up being very shallow. They need to add users, create new databases, back up data, restart daemons when appropriate, the list goes on and on. Two operational tasks which are particularly interesting are upgrades and rebalancing.

Juju is an operations tool that can track the state for many different models, or state of operated applications,  potentially run by many different users. These models evolve over time and are running production workloads for years. We know of many models that are nearly a year and a half old. In that same time Juju has had five 1.2x minor versions and three 2.x minor versions. That means you’d want to keep up with improvements in performance, features, and security by upgrading nearly every other month. Aiding in this the Juju 2.1 release includes a new feature, known as model migrations, specifically to help operators managing their infrastructure over the long term.

The general idea is that the largest danger is in doing complex upgrade steps such as database migrations and making sure that everything running is able to communicate on the new version of the software. In Juju 2.1, the Juju infrastructure that manages the state of the models as well as API connections from clients (known as the “controller”) uses model migrations allowing an operator to bring up a new controller and migrate the model state over to a new version. In that process the data is shipped over, it’s gone through any migrations that need to occur, the data is sanity checked between old and new controller, and the system is put together in such a way that if anything fails to check out it can roll back and use the previously running controller. That's some reassurance to lean on when you're doing an important upgrade. Since one controller operates many models, the operator is in control of which models get migrated at what time and allows for a very controlled rollout to the new software in a way that permits safely checking that all remains in the green as the new version of the Juju software is adopted.

Another big use case for model migrations is to balance out the load on Juju controllers over time. As an organization we all grow and change in our needs and over time it's important to be able to move to new hardware, shift services that generate heavy load to dedicated hardware, etc. Juju is tested to perform at thousands of managed machines, however there are dependencies such as the size of the controller machines that track state and over time a normal part of a growing organization is to put into play machines with newer CPUs, more memory, or just flat out beefier hardware with more cores.

I'd love to hear about the tool you use and which ones have fallen short on aiding you in managing your long term operational needs and what other types of long term operations you want your tools to assist with. Hit me up on twitter @mitechie

In a follow up post I'll walk through exactly how to perform a migration to the new Juju 2.2 release using the built in model migration feature.