The holy grail of web application deployment is restarting without dropping ongoing requests and without any downtime. This is called a graceful restart or graceful reload, and one of the easy ways to achieve it is to run multiple processes and have some of them stop accepting new requests and restart themselves once they finish serving the existing ones. In the meantime, the remaining processes run as usual. When the first batch completes its restart, the rest can undergo the same procedure.
uWSGI is slowly becoming the de facto application server in the Python world and it shows great promise with many more languages. The Art of Graceful Reloading from the official documentation describes various strategies. The one I’m focusing on is described in the Subscription system section.
Long story short, the application processes run as vassals controlled by an emperor and subscribe to a fastrouter. A web server like Nginx uses the fastrouter as a backend. During the graceful reload the vassals unsubscribe from the fastrouter in order not to receive new requests and subscribe again after restarting.
Triggering a vassal’s graceful reload is as simple as touching its configuration file. Telling when a certain vassal has finished reloading is more complicated.
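To make the "touching" part concrete, here is a minimal Python sketch. The function name trigger_reload is made up for illustration and is not taken from the project; it only bumps the file's modification time, which is the change the emperor reacts to.

```python
import os


def trigger_reload(vassal_config):
    """Update the config file's timestamps, like the `touch` command does.

    os.utime raises FileNotFoundError if the file is missing, so a typo in
    the path does not silently create a new, empty vassal config.
    """
    os.utime(vassal_config, None)  # None means "set atime and mtime to now"
    # Return the new mtime so the caller can later compare it with
    # whatever timestamp the emperor reports for this vassal.
    return os.stat(vassal_config).st_mtime
```

Keeping the returned timestamp around is a design choice for this sketch: it gives the caller something to compare against when checking whether the reload has actually been picked up.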
Here’s where my new project comes in. uwsgi_reload boils down to a script and example configuration files detailing the deployment of a Django project with uWSGI and Nginx. The script does not communicate with the application, so it should work with most (all?) of the frameworks/languages supported by uWSGI.
This is what the script does:
– it makes sure only one instance is running at any given time – with a simple and elegant file lock on its own file.
– it divides the vassals into two groups – the first is reloaded in parallel to speed things up and then the second (containing by default only one vassal) is reloaded sequentially.
– when the script exits, all the vassals have been restarted or the timeout has been reached.
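The three steps above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the actual script: the function names are invented, and reload_one is a hypothetical callable that touches one vassal's config and waits until that vassal is back or the deadline passes. The lock uses fcntl.flock, so it only works on Unix-like systems.

```python
import fcntl
import time
from concurrent.futures import ThreadPoolExecutor


def acquire_single_instance_lock(path):
    """Return an open, exclusively locked file, or None if the lock is taken."""
    handle = open(path, "r")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle  # keep a reference: closing the file releases the lock


def split_vassals(vassals, last_group_size=1):
    """Split vassals into (parallel group, sequential group)."""
    vassals = list(vassals)
    cut = max(len(vassals) - last_group_size, 0)
    return vassals[:cut], vassals[cut:]


def reload_all(vassals, reload_one, timeout=60.0, last_group_size=1):
    """Reload the first group in parallel, then the second one by one.

    reload_one(name, deadline) -> bool is hypothetical; it should return
    True once the vassal is serving again and False on timeout.
    """
    deadline = time.monotonic() + timeout
    first, second = split_vassals(vassals, last_group_size)
    results = {}
    if first:
        with ThreadPoolExecutor(max_workers=len(first)) as pool:
            outcomes = pool.map(lambda name: reload_one(name, deadline), first)
            results.update(zip(first, outcomes))
    # The second group keeps serving while the first one is reloading.
    for name in second:
        results[name] = reload_one(name, deadline)
    return results
```

A script built this way could call acquire_single_instance_lock(__file__) at startup and exit right away if it gets None back, which mirrors the idea of locking the script's own file.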
These guarantees make it ideal for a deployment pipeline. If multiple developers trigger it, only one gets to execute it. It’s as fast as possible because of the parallelization and it waits for the end of the operations so you know for sure when the new version of the site is up and running.
Under the hood
For all its features and configurability, uWSGI is not perfect. It suffers from high complexity and documentation that is not always up to date. To do something as simple as finding out when a vassal is ready to accept new requests, I had to look in the emperor stats for the last modification timestamp and the ‘accepting’ flag. The subscription and ‘death_mark’ status come from the fastrouter statistics. Most of the development time was spent trying to get reliable information about a vassal’s status.
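Here is a rough Python sketch of what such checks can look like. The uWSGI stats servers send a JSON document to whoever connects and then close the connection, which is what read_stats relies on. The JSON key names used below (vassals, id, last_mod, accepting, subscriptions, nodes, name, death_mark) are my assumptions about the layout and should be checked against the output of your own uWSGI version; the function names are invented for this example.

```python
import json
import socket


def read_stats(address):
    """Read one JSON document from a stats socket.

    address is a (host, port) tuple for TCP or a string path for a
    UNIX socket.
    """
    family = socket.AF_UNIX if isinstance(address, str) else socket.AF_INET
    chunks = []
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        sock.settimeout(5.0)
        sock.connect(address)
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closes the connection when done
                break
            chunks.append(chunk)
    return json.loads(b"".join(chunks).decode("utf-8"))


def vassal_reloaded(emperor_stats, vassal_id, touched_at):
    """True if the emperor saw the touch and the vassal accepts requests.

    Assumed keys: "vassals", "id", "last_mod", "accepting".
    """
    for vassal in emperor_stats.get("vassals", []):
        if vassal.get("id") == vassal_id:
            seen_touch = vassal.get("last_mod", 0) >= int(touched_at)
            return seen_touch and bool(vassal.get("accepting"))
    return False


def node_subscribed(fastrouter_stats, node_name):
    """True if the node is listed and not marked for removal.

    Assumed keys: "subscriptions", "nodes", "name", "death_mark".
    """
    for subscription in fastrouter_stats.get("subscriptions", []):
        for node in subscription.get("nodes", []):
            if node.get("name") == node_name:
                return not node.get("death_mark")
    return False
```

A polling loop would call read_stats for both the emperor and the fastrouter every second or so and stop once both checks pass or the timeout expires.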
Convincing the vassals to only subscribe when the application server is ready to accept requests took some digging in mailing lists for the obscure configuration lines. Convincing Django to give up its lazy loading ways and warm up before entering the field was another story. Save yourself the trouble and use the files in the examples directory.
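To give an idea of what "warming up" can mean, here is a generic WSGI sketch that pushes one synthetic request through an application at load time. It is not the approach from the examples directory, just an illustration using the standard library; warm_up is a hypothetical helper name.

```python
from wsgiref.util import setup_testing_defaults


def warm_up(application, path="/"):
    """Send one synthetic GET request through a WSGI app.

    Returns the status line the application answered with, or None if
    start_response was never called.
    """
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid GET environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        return lambda data: None  # legacy write() callable, unused here

    body = application(environ, start_response)
    try:
        for _ in body:  # consume the response so lazy work really runs
            pass
    finally:
        close = getattr(body, "close", None)
        if close is not None:
            close()
    return captured.get("status")
```

In a Django project this could be called at the bottom of wsgi.py, right after the application object is created, so that the first real visitor does not pay for the lazy imports. The synthetic request uses 127.0.0.1 as its host, so that host may need to be allowed in the project settings.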