Django application with puppet
This post is a quick tutorial how to provision geodjango application using puppet. While writing this tutorial I have taken the approach that I start with running code and then refactor this to something better.
Firstly what is puppet? From their website :
Puppet provides a standard way of delivering and operating software, no matter where it runs. With the Puppet approach, you define what you want your apps and infrastructure to look like using a common language.
So it’s a tool for automatic deployment. Other choices are: fabric or ansible. I’ve chosen this tool first because I use it in my work as a tool for automation as well as I was keen to look more how this all works.
Puppet is different from other mentioned tools in a way it does deployment: there are two entities: puppet master and a puppet agent. Master is responsible for keeping the configuration how puppet agent should look like. When puppet is run it pulls out information from puppet master and apply to puppet agent. In other words, puppet agent doesn’t have information about its configuration directly- it pulls this from puppet master. Other tools have a different approach: to push configuration via SSH.
To play with puppet I decided to choose my project: geodjango + leaflet. As I said before to run puppet you have to have two machines: puppet master + puppet agent. Fortunately, there is a way to develop puppet modules (module is responsible for configuration of one thing: like module for PostgreSQL or APT) via vagrant.
This tool is so awesome that it allows you to have puppet master and agent on the same machine. How to do this? After installing Vagrant & VirtualBox place a file called Vagrantfile inside your project folder:
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure(2) do |config|
config.vm.network "private_network", ip: "192.168.33.10"
config.vm.box = "ubuntu/trusty64"
config.vm.provision :shell do |shell|
shell.inline = "mkdir -p /etc/puppet/modules;
puppet module install puppetlabs-stdlib;
puppet module install ripienaar-concat;
puppet module install puppetlabs-apt;
puppet module install puppetlabs/postgresql;
puppet module install puppetlabs/vcsrepo;
puppet module install puppetlabs-git;
puppet module install arioch-redis;
puppet module install ajcrowe-supervisord;
puppet module install jfryman-nginx"
end
config.vm.provision "puppet" do |puppet|
puppet.options = ["--templatedir","/vagrant/templates"]
end
end
In this file, I set up ip address of machine: 192.168.33.10
as
well as what OS will be inside vagrant: ubuntu/trusty64
. Right after
that, I tell vagrant to execute shell commands for creating a directory
structure for puppet modules as well as install those modules that I
will need later. At the end, I tell vagrant to run puppet with template
directory. If you wanted to run this few times you can add to every
puppet module install flag --force
at the end of command like
puppet module install puppetlabs-stdlib --force;
.
Now I can move on to puppet code itself. Puppet modules have to be under
folder called manifests. The name of pp file is right now not important
so I left it as default value- default.pp
. So what is in this
file?
At the top I declared bunch of postgresql statements:
# required to postgresql resources to work
class { 'postgresql::server': }
# required by geodjango
class {'postgresql::server::postgis': }
# create db
postgresql::server::db { 'geodjango':
user => $title,
password => $title,
}
postgresql_psql { 'Add password to role':
db => 'geodjango',
command => "ALTER ROLE geodjango WITH PASSWORD 'geodjango';",
require => Postgresql::Server::Role['geodjango'],
}
# create geodjango role
postgresql::server::role {'geodjango':;}
postgresql::server::database_grant { 'grant ALL privilleges for user geodjango':
privilege => 'ALL',
db => 'geodjango',
role => 'geodjango',
}
postgresql_psql { 'Enable postgis extension':
db => 'geodjango',
command => 'CREATE EXTENSION postgis;',
unless => "SELECT extname FROM pg_extension WHERE extname ='postgis'",
require => Postgresql::Server::Db['geodjango'],
}
As you can see the puppet syntax is straightforward. To read more about
classes in puppet go
there.
I added one thing that can be not clear:
require => Postgresql::Server::Role['geodjango']
. It tells puppet that
first postgresql::server::role
resource needs to be applied. This is
how to create dependencies.
So I’ve setup database needed for geodjango application, but there are more dependencies for geodjango- GIS libraries. How to install them via puppet:
package {
'binutils': ensure => present;
'libproj-dev': ensure => present;
'gdal-bin': ensure => present;
'postgresql-server-dev-9.3': ensure => present;
'build-essential': ensure => latest;
'python3': ensure => latest;
'python3.4-dev': ensure => latest;
'python3-setuptools': ensure => latest;
'python3-pip': ensure => latest;
'python3.4-venv': ensure => latest;
'python-pip': ensure => present;
}
I’ve used redis for my application so I need it too. I’ve default config for redis and I don’t need to specify additional arguments for this resource:
class { 'redis':;}
I don’t like when application is run by root user that’s why I created a
special dedicated one only for my application. I also like to keep my
code on machines under /opt/name_of_project
path so I created this
too:
user { 'geodjango':
ensure => present,
managehome => true,
}
file { ['/opt/geodjango/','/opt/geodjango/geodjango']:
ensure => 'directory',
owner => 'geodjango'
}
For running my application I need it source code which is under git. To download it to vagrant machine I use:
include git
vcsrepo { '/opt/geodjango/geodjango':
ensure => latest,
provider => git,
source => 'https://github.com/krzysztofzuraw/geodjango-leaflet.git',
user => 'geodjango',
force => true,
}
In vcsrepo
, I added parameter force
to make sure that repo is
updated with new commits if it already exists on my deployed machine.
It’s a good practice in python word to have isolated environments per
application. In python 3 there is a tool for that in standard library
called venv. How to
create such virutal enviroment? By invoking similar command in shell:
python3 -m venv /opt/geodjango/env
As it is the command that is run in the shell, puppet has the special
resource to handling these cases: exec
. How to use it? It’s simple:
exec { 'create venv':
command => 'python3 -m venv /opt/geodjango/env',
path => '/usr/local/bin:/usr/bin:/bin',
require => Vcsrepo['/opt/geodjango/geodjango'],
}
I’m telling puppet to execute command
that is in path
. I decided
that this command will be run only when there are changes in the repo.
That’s why require
argument.
Right now I created virtual environment. It’s time to install python packages that are needed for proper operation of the whole application. I’ve used so-called requirements.txt. To install packages from that file via puppet I need:
exec { 'install requirements':
command => '/opt/geodjango/env/bin/pip install --requirement /opt/geodjango/geodjango/requirements.txt',
path => '/usr/local/bin:/usr/bin:/bin',
require => Exec['create venv']
}
I specify here full paths for pip
as well as for requirements
file.
As everything is installed I need a tool for managing my geodjango
application. I can do this by invoking django command runserver
as a
deamon. But there is a tool designed especially for that-
supervisor. How does it works? You specify in
ini file which commands needs to be run by supervisor. In addition to
that, you can see if your command run was successful or not. To use
supervisor you need:
include ::supervisord
supervisord::program { 'django':
command => '/opt/geodjango/env/bin/gunicorn geodjango_leaflet.wsgi -b 127.0.0.1:9000',
user => 'geodjango',
directory => '/opt/geodjango/geodjango',
subscribe => Vcsrepo['/opt/geodjango/geodjango'],
}
At the top, I included supervisord resource. D
at the end stands for
the daemon. Right below that I setup program django
which is a simple
gunicorn command run by user geodjango inside specified directory.
I have my app running via gunicorn managed by supervisor but there is one more thing: web server. In my apps I use nginx so I’m gonna setup that:
class {'nginx':
confd_purge => true,
vhost_purge => true,
}
$nginx_settings = {
'upstream_name' => 'geodjango',
'upstream_address' => '127.0.0.1:9000',
}
file { ["/etc/nginx/sites-available/geodjango.conf","/etc/nginx/sites-enabled/geodjango.conf" ] :
ensure => file,
content => template('nginx.erb'),
notify => Service['nginx']
}
Starting from the top: I configured class nginx to do not setup conf.d
files as well as vhost ones. Right after that, I defined puppet variable
$nginx_settings
which is a hash. I will be using this variable in
resource file
where I tell puppet to setup file in sites-available
as well as in sites-enabled
. Content of this file is present in
template nginx.erb
:
upstream <%= @nginx_settings['upstream_name'] %> {
server <%= @nginx_settings['upstream_address'] %>;
}
server {
location /static {
alias /opt/geodjango/static;
}
location / {
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Host $http_host;
proxy_redirect off;
proxy_pass http://<%= @nginx_settings['upstream_name'] %>;
}
}
As you can see I use nginx_settings
inside my template. It’s because
puppet takes variables for the local scope of given module- in this case
default.pp
. It’s good to know that they are two types of templates
that puppet can use- one erb style (ruby) that I currently used in this
example and puppet style
(epp).
There are three more things to do: first to run database migrations, load initial data to the database and the third one to collect static files. I want to do them manually but here is puppet code if you are interested:
exec { 'run django migrations':
command => '/opt/geodjango/env/bin/python /opt/geodjango/geodjango/manage.py migrate --no-input',
path => '/usr/local/bin:/usr/bin:/bin',
require => Exec['install requirements'],
subscribe => Postgresql_psql['Add password to role'],
refreshonly => true,
}
exec { 'load initial data to db':
command => '/opt/geodjango/env/bin/python /opt/geodjango/geodjango/manage.py loaddata',
path => '/usr/local/bin:/usr/bin:/bin',
require => Exec['install requirements'],
subscribe => Postgresql_psql['Add password to role'],
refreshonly => true,
}
exec { 'collect static files':
command => '/opt/geodjango/env/bin/python /opt/geodjango/geodjango/manage.py collectstatic --noinput',
path => '/usr/local/bin:/usr/bin:/bin',
require => Exec['install requirements'],
subscribe => Vcsrepo['/opt/geodjango/geodjango'],
refreshonly => true,
}
All these 3 commands are django one (loaddata is made by myself). To use
them with puppet you need to specify them under exec
resource.
That’s all for this time. To sum these two articles up: I really enjoyed playing with puppet. Especially this clear syntax that puppet provides. I also like that you can even write a tests for puppet code! Having two machines (puppet master & agent) for provisioning is good because you can have real time update of your agent machine but requiers resources.
What is more I currently use vagrant with default config which is not good- not enough RAM on client machine forces puppet run to stop. I could set it up for higher value but my computer isn’t’ good enough. To bypass this I have plan to use docker with puppet master and agent. Lastly installing every time puppet modules in Vagrantfile isn’t good idea- that’s another thing to change and maybe use something like puppet-librarian?
Source code for this is avaiable here.