FabulAWS¶
FabulAWS is a tool for deploying Python web applications to autoscaling-enabled AWS EC2 environments.
Simple example¶
FabulAWS lets you create EC2 instances using a context manager in Python and easily execute work on those instances. A typical workflow might look like this:
from fabric.api import run
from fabulaws.ec2 import MicroLucidInstance

with MicroLucidInstance():
    run('uname -a')
If needed, you can extend the instance classes defined in the fabulaws.ec2
module to further customize the instance before presenting it as
a context manager (or using it in your fab file). To do so, simply extend
the setup() and cleanup() methods in one of the existing classes.
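For illustration, the lifecycle these classes follow can be sketched in plain Python. This is a hypothetical minimal base class, not fabulaws itself — the real classes also launch and terminate an EC2 instance around these calls:

```python
# Hypothetical sketch of the setup()/cleanup() lifecycle that fabulaws
# instance classes follow; the real classes also create and terminate
# the EC2 instance.
events = []

class DemoInstance(object):
    def setup(self):
        events.append('setup')

    def cleanup(self):
        events.append('cleanup')

    def __enter__(self):
        self.setup()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.cleanup()

class CustomInstance(DemoInstance):
    def setup(self):
        # extend setup() to customize the instance after it boots
        super(CustomInstance, self).setup()
        events.append('install extra packages')

with CustomInstance():
    events.append('run commands')
```

Entering the `with` block runs setup(), and cleanup() runs on exit even if the body raises, which is what makes the context-manager form convenient for throwaway instances.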
Contents¶
Server Architecture¶
Prior to creating any new servers or deploying new code using FabulAWS, it’s helpful to have an overall understanding of the different components of the server architecture.
FabulAWS creates 5 different types of servers, plus an Amazon Elastic Load Balancer.
Load Balancer¶
Load balancers are created and managed in the AWS Management Console. The following ports need to be configured:
- Port 80 forwarded to Port 80 (HTTP)
- Port 443 forwarded to port 443 (HTTPS)
The load balancer health check should be configured as follows (the defaults are fine for the values not listed):
- Ping Protocol: HTTPS
- Ping Port: 443
- Ping Path: /healthcheck.html
Web Servers¶
Web servers are created automatically using FabulAWS. The web servers run Nginx, which proxies a lightweight Gunicorn-powered WSGI server for Django. Also running on the web servers are PgBouncer and Stunnel, which proxy connections to the database master and slave servers, both to speed up connection times and to reduce the load of creating and destroying connections on the actual database servers.
- Sample security groups: myproject-sg, myproject-web-sg
Worker Server¶
The worker server is very similar in configuration to the web servers, but it runs on a small instance type and does not have Nginx installed. Instead, it is configured to run Celery, the Python package for periodic and background task management. This server is used for tasks like creating response file exports, counting survey start and complete events as they happen, sending out scheduled mailings, and other related tasks. It exists as a separate server to isolate the web servers (which are typically expected to respond very quickly to short requests) from longer-running tasks. Some background tasks may take 5-10 minutes or more to complete.
- Sample security groups: myproject-sg, myproject-worker-sg
Cache and Queue Server¶
The cache and queue server runs Redis and RabbitMQ. Redis is used both as a cache and an HTTP session storage database. RabbitMQ handles receiving tasks from the web servers and delegating them to the worker server for completion.
- Sample security groups: myproject-sg, myproject-cache-sg, myproject-queue-sg
Database Master¶
The database master server runs PostgreSQL. It allows encrypted connections from the web and worker servers.
- Sample security groups: myproject-sg, myproject-db-sg
Database Slave¶
The database slave server also runs PostgreSQL and is set up with streaming replication from the master database server. This results in very low lag (typically less than a few seconds) between the two machines.
- Sample security groups: myproject-sg, myproject-db-sg
Autoscaling¶
Each server environment uses EC2 Auto Scaling (AS) to bring up and down new instances based on current demand. When deploying, a new AS Launch Configuration is created for the new revision of the code. The AS Group, which is created and managed largely via the EC2 console, is then updated via the API to point to the new Launch Configuration.
SSL Certificates¶
SSL certificates for the production and staging domains can be updated and
managed via the Elastic Load Balancers in the AWS console. Internally, the
load balancer communicates with the web instances over SSL using the default
self-signed certificate that’s created on a standard Ubuntu installation
(/etc/ssl/certs/ssl-cert-snakeoil.pem).
New Project Setup¶
AWS Configuration¶
Some configuration within the AWS console is necessary to begin using FabulAWS:
IAM User¶
First, you’ll need to create credentials via IAM that have permissions to create
servers in EC2 and manage autoscaling groups and load balancers. Amazon will
provide you with a credentials file containing AWS_ACCESS_KEY_ID and
AWS_SECRET_ACCESS_KEY, which you will need later in this document.
Security Groups¶
You’ll also need the following security groups. These can be renamed for your project and updated in fabulaws-config.yml.
- myproject-sg
- TCP port 22 from 0.0.0.0/0
- myproject-cache-sg
- TCP port 11211 from myproject-web-sg
- TCP port 11211 from myproject-worker-sg
- myproject-db-sg
- TCP port 5432 from myproject-web-sg
- TCP port 5432 from myproject-worker-sg
- TCP port 5432 from myproject-db-sg
- myproject-queue-sg
- TCP port 5672 from myproject-web-sg
- TCP port 5672 from myproject-worker-sg
- myproject-session-sg
- TCP port 6379 from myproject-web-sg
- TCP port 6379 from myproject-worker-sg
- myproject-web-sg
- For EC2-classic:
- TCP port 80 from amazon-elb/amazon-elb-sg
- TCP port 443 from amazon-elb/amazon-elb-sg
- For VPC-based AWS accounts:
- TCP port 80 from myproject-web-sg
- TCP port 443 from myproject-web-sg
- myproject-worker-sg
- (used only as a source - requires no additional firewall rules)
- myproject-incoming-web-sg
- TCP port 80 from any address
- TCP port 443 from any address
Load Balancer¶
You will need to create at least one load balancer for each environment. Note that multiple load balancers can be used if the site serves different domains (though a single load balancer can suffice with a wildcard SSL certificate). Use the following parameters as a guide:
- Choose a name and set it in fabulaws-config.yml
- Ports 80 and 443 should be mapped to 80 and 443 on the instances
- If on EC2-Classic (older AWS accounts), you can use ‘EC2-Classic’ load balancers. Note that this will cause a warning to be shown when you try to ‘Assign Security Groups’. That warning can be skipped.
- If on newer, VPC-based AWS accounts:
- Add security group myproject-incoming-web-sg to the load balancer so the load balancer can receive incoming requests.
- Add security group myproject-web-sg to the load balancer so the backend instances will accept forwarded requests from the load balancer.
- Set up an HTTPS health check on port 443 that monitors /healthcheck.html at your desired frequency (you’ll set up the health check URL in your app below)
- Backend authentication and stickiness should be disabled
- The zones chosen should match those in fabulaws-config.yml (typically 2)
- Configure a custom SSL certificate, if desired.
After the load balancer is created, you can set the domain name for the
associated environment in fabulaws-config.yml to your custom domain or the
default domain for the load balancer.
Auto Scaling Group¶
You will also need to create one auto scaling group per environment, with the following parameters:
- Choose a name and set it in fabulaws-config.yml
- Choose an existing dummy launch config; set the group’s “min” and “desired” instance counts to 0 to start, and its “max” to at least 4 (a higher max is fine).
- Select Advanced, choose your load balancer, and select the ELB health check
- Choose the same availability zones as for your load balancer
- You don’t need to configure scaling policies yet, but these will need to be set eventually based on experience
- You must configure the auto scaling group to tag instances like so:
- Name: myproject_<environment>_web
- deployment: myproject
- environment: <environment>
- role: web
Local Machine¶
You’ll need to make several changes to your local machine to use FabulAWS:
System Requirements¶
- Ubuntu Linux 14.04 or later
- Python 2.7
- PostgreSQL 9.3
- virtualenv and virtualenvwrapper are highly recommended
AWS API Credentials¶
First, you need to define the AWS credentials you created above in your shell environment:
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
It’s helpful to save these to a file (e.g., aws.sh) that you can source
(. aws.sh) each time they’re needed.
Passwords¶
Local passwords¶
A number of passwords are required during deployment. To reduce the number of
prompts that need to be answered manually, you can use a file called
fabsecrets_<environment>.py
in the top level of your repository.
If you already have a server environment set up, run the following command to get a local copy of fabsecrets_<environment>.py:
fab <environment> update_local_fabsecrets
Note: This will not obtain a copy of the luks_passphrase secret (if
applicable), which for security’s sake is not stored directly on the servers.
If you will be creating new servers, this must be obtained securely from
another developer.
If this is a brand-new project, you can use the following template for
fabsecrets_<environment>.py:
database_password = ''
broker_password = ''
smtp_password = ''
newrelic_license_key = ''
newrelic_api_key = ''
s3_secret = ''
secret_key = ''
All of these must be filled in before any servers can be created.
Remote passwords¶
To update passwords on the server, first retrieve a copy of fabsecrets_<environment>.py
using the above command (or from another developer) and then run the following
command:
fab <environment> update_server_passwords
Note
It’s only necessary to have a copy of fabsecrets_<environment>.py
locally if you
will be deploying new servers or updating the existing passwords on the servers.
Note
This command is only useful for the web and worker servers; on the other server types, nothing updates the configuration files to use the new secrets.
Project Configuration¶
You’ll need to add several files to your repository, typically at the top level. You can use the following as templates:
fabfile.py¶
import logging
root_logger = logging.getLogger()
root_logger.addHandler(logging.StreamHandler())
root_logger.setLevel(logging.WARNING)
fabulaws_logger = logging.getLogger('fabulaws')
fabulaws_logger.setLevel(logging.INFO)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
# XXX import actual commands needed
from fabulaws.library.wsgiautoscale.api import *
fabulaws-config.yml¶
instance_settings:
  # http://uec-images.ubuntu.com/releases/trusty/release/
  ami: ami-b2e3c6d8 # us-east-1 14.04.3 LTS 64-bit w/EBS-SSD root store
  key_prefix: 'myproject-'
  admin_groups: [admin, sudo]
  run_upgrade: true
  # Secure directories, volume, and filesystem info
  secure_root: #/secure # no trailing /
  secure_home: #/home/secure
  fs_type: ext4
  fs_encrypt: false
  ubuntu_mirror: us.archive.ubuntu.com
  # create swap of swap_multiplier * available RAM
  swap_multiplier: 1
## REMOTE SETTINGS ##
deploy_user: myproject
webserver_user: myproject-web
database_host: localhost
database_user: dbuser
home: /home/myproject/
python: /usr/bin/python2.7
## LOCAL / PROJECT SETTINGS ##
disable_known_hosts: true
ssh_keys: deployment/users/
password_names: [database_password, broker_password, smtp_password,
                 newrelic_license_key, newrelic_api_key, s3_secret,
                 secret_key]
project: myproject
wsgi_app: myproject.wsgi:application
requirements_file: requirements/app.txt
requirements_sdists:
settings_managepy: myproject.local_settings
static_html:
  upgrade_message: deployment/templates/html/503.html
  healthcheck_override: deployment/templates/html/healthcheck.html
localsettings_template: deployment/templates/local_settings.py
logstash_config: deployment/templates/logstash.conf
# Set gelf_log_host to the host of your Graylog2 server (or other GELF log
# receiver)
# gelf_log_host: hostname
# Set syslog_server to a "hostname:port" (quote marks required due
# to the ":" in there) and server logs will be forwarded there using
# syslog protocol. "hostname:port" could be e.g. papertrail or a
# similar service.
# (You might want to set this in fabsecrets instead of here.)
# syslog_server: "hostname:port"
# You can alternatively supply a multi-line config for rsyslog as follows
# (e.g., in the event you need to enable TLS). For more information, see:
# http://www.rsyslog.com/doc/v8-stable/tutorials/tls_cert_client.html#sample-syslog-conf
# syslog_server: |
# # make gtls driver the default
# $DefaultNetstreamDriver gtls
#
# # certificate files
# $DefaultNetstreamDriverCAFile /rsyslog/protected/ca.pem
# $DefaultNetstreamDriverCertFile /rsyslog/protected/machine-cert.pem
# $DefaultNetstreamDriverKeyFile /rsyslog/protected/machine-key.pem
#
# $ActionSendStreamDriverAuthMode x509/name
# $ActionSendStreamDriverPermittedPeer central.example.net
# $ActionSendStreamDriverMode 1 # run driver in TLS-only mode
# *.* @@central.example.net:10514 # forward everything to remote server
# Set awslogs_access_key_id to the AWS_ACCESS_KEY_ID of the user with
# permissions to create log groups, log streams, and log events. For help
# setting up this role, see:
# http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/QuickStartEC2Instance.html
# NOTE: You will also need to set awslogs_secret_access_key in your
# fabsecrets_<environment>.py file
# awslogs_access_key_id: AK....
# Set extra_log_files to a list of log files you want to monitor, in addition
# to the default logs monitored by FabulAWS itself:
# extra_log_files:
# /path/to/file:
# tag: mytag
# date_format: '%Y-%m-%d %H:%M:%S'
backup_key_fingerprint:
vcs_cmd: git # or hg
latest_changeset_cmd: git rev-parse HEAD # hg id -i # or git rev-parse HEAD
repo: git@github.com:username/myproject.git
# Mapping of Fabric deployments and environments to the VCS branch names
# that should be deployed.
branches:
  myproject:
    production: master
    staging: master
    testing: master
## SERVER SETTINGS ##
# Local server port for pgbouncer
pgbouncer_port: 5432
# Version of Less to install
less_version: 2.5.3
# Local server ports used by Gunicorn (the Django apps server)
server_ports:
  staging: 8000
  production: 8001
  testing: 8002
# Whether we're hosting static files on our webservers ('local')
# or somewhere else ('remote')
static_hosting: remote
# Mapping of celery worker names to options
# The worker name (key) can be any text of your choosing. The value should
# be any additional options you'd like to pass to celeryd, such as specifying
# the concurrency and queue name(s)
celery_workers:
  main: -c 10 -Q celeryd
# Start this many Gunicorn workers for each CPU core
gunicorn_worker_multiplier: 8
# Mapping of environment names to domain names. Used to update the
# primary site in the database after a refresh and to set ALLOWED_HOSTS
# Note that the first domain in the list must not be a wildcard as it
# is used to update a Site object in the database.
# Wildcard format used per ALLOWED_HOSTS setting
site_domains_map:
  production:
    - dualstack.myproject-production-1-12345.us-east-1.elb.amazonaws.com
  staging:
    - dualstack.myproject-staging-1-12345.us-east-1.elb.amazonaws.com
  testing:
    - dualstack.myproject-testing-1-12345.us-east-1.elb.amazonaws.com
## ENVIRONMENT / ROLE SETTINGS ##
default_deployment: myproject
deployments:
- myproject
environments:
- staging
- production
- testing
production_environments:
- production
valid_roles:
- cache
- db-master
- db-slave
- web
- worker
## AWS SETTINGS ##
region: us-east-1
avail_zones:
- e
- c
# Mapping of role to security group(s):
security_groups:
  db-master: [myproject-sg, myproject-db-sg]
  db-slave: [myproject-sg, myproject-db-sg]
  cache: [myproject-sg, myproject-session-sg, myproject-cache-sg, myproject-queue-sg]
  worker: [myproject-sg, myproject-worker-sg]
  web: [myproject-sg, myproject-web-sg]
# Mapping of environment and role to EC2 instance types (sizes)
instance_types:
  production:
    cache: c3.large
    db-master: m3.xlarge
    db-slave: m3.xlarge
    web: c3.large
    worker: m3.large
  staging:
    cache: t1.micro
    db-master: m1.small
    db-slave: m1.small
    web: m1.small
    worker: m3.large
  testing:
    cache: t1.micro
    db-master: t1.micro
    db-slave: t1.micro
    web: m1.small
    worker: m1.small
# Mapping of Fabric environment names to AWS load balancer names. Load
# balancers can be configured in the AWS Management Console.
load_balancers:
  myproject:
    production:
      - myproject-production-lb
    staging:
      - myproject-staging-lb
    testing:
      - myproject-testing-lb
# Mapping of Fabric environment names to AWS auto scaling group names. Auto
# scaling groups can be configured in the AWS Management Console.
auto_scaling_groups:
  myproject:
    production: myproject-production-ag
    staging: myproject-staging-ag
    testing: myproject-testing-ag
# Mapping of Fabric environment and role to Elastic Block Device sizes (in GB)
volume_sizes:
  production:
    cache: 10
    db-master: 100
    db-slave: 100
    web: 10
    worker: 50
  staging:
    cache: 10
    db-master: 100
    db-slave: 100
    web: 10
    worker: 50
  testing:
    cache: 10
    db-master: 100
    db-slave: 100
    web: 10
    worker: 50
# Mapping of Fabric environment and role to Elastic Block Device volume types
# Use SSD-backed storage (gp2) for all servers. Change to 'standard' for slower
# magnetic storage.
volume_types:
  cache: gp2
  db-master: gp2
  db-slave: gp2
  web: gp2
  worker: gp2
app_server_packages:
- python2.7-dev
- libpq-dev
- libmemcached-dev
- supervisor
- mercurial
- git
- build-essential
- stunnel4
- pgbouncer
db_settings:
  # for help adjusting these settings, see:
  # http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
  # http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
  # http://thebuild.com/presentations/not-my-job-djangocon-us.pdf
  postgresql_settings:
    # Settings to apply to Postgres servers
    # You can put anything here from postgresql.conf
    # connections
    max_connections: '80' # _active_ connections are limited by pgbouncer
    # replication settings
    wal_level: 'hot_standby'
    hot_standby: 'on'
    hot_standby_feedback: 'on'
    max_wal_senders: '3'
    wal_keep_segments: '3000' # during client deletion 50 or more may be generated per minute; this allows an hour
    # resources - let pgtune set these based on actual machine resources
    # shared_buffers: '8GB' # 25% of available RAM, up to 8GB
    # work_mem: '750MB' # (2*RAM)/max_connections
    # maintenance_work_mem: '1GB' # RAM/16 up to 1GB; high values aren't that helpful
    # effective_cache_size: '48GB' # between 50-75%, should equal free + cached values in `top`
    # checkpoint settings
    wal_buffers: '16MB'
    checkpoint_completion_target: '0.9'
    checkpoint_timeout: '10min'
    checkpoint_segments: '256' # if checkpoints are happening more often than the timeout, increase this up to 256
    # logging
    log_min_duration_statement: '500'
    log_checkpoints: 'on'
    log_lock_waits: 'on'
    log_temp_files: '0'
    # write optimizations
    commit_delay: '4000' # delay each commit this many microseconds in case we can do a group commit
    commit_siblings: '5' # only delay if at least N transactions are in process
    # index usage optimizations
    random_page_cost: '2' # our DB servers have a lot of RAM and may tend to prefer Seq Scans if this is too high
  # More Postgres-related settings.
  # How to install Postgres:
  postgresql_packages:
    - postgresql
    - libpq-dev
  # Whether and how to apply pgtune
  postgresql_tune: true
  postgresql_tune_type: Web
  # Kernel sysctl settings to change
  postgresql_shmmax: 107374182400 # 100 GB
  postgresql_shmall: 26214400 # 100 GB / PAGE_SIZE (4096)
  # Networks to allow connections from
  postgresql_networks:
    - '10.0.0.0/8'
    - '172.16.0.0/12'
  # Whether to disable the Linux out-of-memory killer
  postgresql_disable_oom: true
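As an example of how one of these settings is used: gunicorn_worker_multiplier scales the number of Gunicorn workers with the CPU count of each web server. A sketch of the resulting worker count (the helper function is illustrative, not part of FabulAWS):

```python
import multiprocessing

def gunicorn_worker_count(multiplier=8):
    # FabulAWS starts multiplier workers per CPU core on each web server;
    # e.g., with the multiplier of 8 above, a 2-core instance runs 16 workers.
    return multiplier * multiprocessing.cpu_count()
```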
local_settings.py¶
This file should be placed at the location specified in fabulaws-config.yml,
typically deployment/templates/local_settings.py.
from myproject.settings import *
DEBUG = False
# logging settings
#LOGGING['filters']['static_fields']['fields']['deployment'] = '{{ deployment_tag }}'
#LOGGING['filters']['static_fields']['fields']['environment'] = '{{ environment }}'
#LOGGING['filters']['static_fields']['fields']['role'] = '{{ current_role }}'
AWS_STORAGE_BUCKET_NAME = '{{ staticfiles_s3_bucket }}'
AWS_ACCESS_KEY_ID = 'YOUR-KEY-HERE'
AWS_SECRET_ACCESS_KEY = "{{ s3_secret }}"
SECRET_KEY = "{{ secret_key }}"
# Tell django-storages that when coming up with the URL for an item in S3 storage, keep
# it simple - just use this domain plus the path. (If this isn't set, things get complicated).
# This controls how the `static` template tag from `staticfiles` gets expanded, if you're using it.
# We also use it in the next setting.
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
# This is used by the `static` template tag from `static`, if you're using that. Or if anything else
# refers directly to STATIC_URL. So it's safest to always set it.
STATIC_URL = "https://%s/" % AWS_S3_CUSTOM_DOMAIN
# Tell the staticfiles app to use S3Boto storage when writing the collected static files (when
# you run `collectstatic`).
STATICFILES_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
# Auto-create the bucket if it doesn't exist
AWS_AUTO_CREATE_BUCKET = True
AWS_HEADERS = {  # see http://developer.yahoo.com/performance/rules.html#expires
    'Expires': 'Thu, 31 Dec 2099 20:00:00 GMT',
    'Cache-Control': 'max-age=94608000',
}
# Having AWS_PRELOAD_META turned on breaks django-storages/s3 -
# saving a new file doesn't update the metadata and exists() returns False
#AWS_PRELOAD_METADATA = True
# database settings
DATABASES = {
{% for server in all_databases %}
    '{{ server.database_key }}': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': '{{ server.database_local_name }}',
        'USER': '{{ database_user }}',
        'PASSWORD': '{{ database_password }}',
        'HOST': 'localhost',
        'PORT': '{{ pgbouncer_port }}',
    },{% endfor %}
}
# django-balancer settings
DATABASE_POOL = {
{% for server in slave_databases %}
    '{{ server.database_key }}': 1,{% endfor %}
}
MASTER_DATABASE = '{{ master_database.database_key }}'
# media roots
MEDIA_ROOT = "{{ media_root }}"
STATIC_ROOT = "{{ static_root }}"
# email settings
EMAIL_HOST_PASSWORD = '{{ smtp_password }}'
EMAIL_SUBJECT_PREFIX = '[{{ deployment_tag }} {{ environment }}] '
# Redis DB map:
# 0 = cache
# 1 = unused (formerly celery task queue)
# 2 = celery results
# 3 = session store
# 4-16 = (free)
# Cache settings
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '{{ cache_server.internal_ip }}:11211',
        'VERSION': '{{ current_changeset }}',
    },
    'session': {
        'BACKEND': 'redis_cache.RedisCache',
        'LOCATION': '{{ cache_server.internal_ip }}:6379',
        'OPTIONS': {
            'DB': 3,
        },
    },
}
# Task queue settings
# see https://github.com/ask/celery/issues/436
BROKER_URL = "amqp://{{ deploy_user }}:{{ broker_password }}@{{ cache_server.internal_ip }}:5672/{{ vhost }}"
BROKER_CONNECTION_TIMEOUT = 4
BROKER_POOL_LIMIT = 10
CELERY_RESULT_BACKEND = "redis://{{ cache_server.internal_ip }}:6379/2"
# Session settings
SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
SESSION_CACHE_ALIAS = 'session'
# django-compressor settings
COMPRESS_URL = STATIC_URL
# Use MEDIA_ROOT rather than STATIC_ROOT because it already exists and is
# writable on the server.
COMPRESS_ROOT = MEDIA_ROOT
COMPRESS_STORAGE = STATICFILES_STORAGE
COMPRESS_OFFLINE = True
COMPRESS_OFFLINE_MANIFEST = 'manifest-{{ current_changeset }}.json'
COMPRESS_ENABLED = True
ALLOWED_HOSTS = [{% for host in allowed_hosts %}'{{ host }}', {% endfor %}]
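For reference, once FabulAWS renders the template above for an environment with a single master database, the DATABASES section might come out like this (host names, database name, and password are hypothetical):

```python
# Hypothetical rendered result of the DATABASES template above.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'myproject_production',
        'USER': 'dbuser',
        'PASSWORD': 'example-password',
        'HOST': 'localhost',  # PgBouncer on this host proxies to the real DB server
        'PORT': '5432',       # pgbouncer_port from fabulaws-config.yml
    },
}
```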
SSH keys¶
Before attempting to deploy for the first time, you should add your SSH public key
to a file named deployment/users/<yourusername>
in the repository. This path
can also be configured in fabulaws-config.yml
. Multiple SSH keys are permitted
per file, and additional files can be added for each username (developer).
Django Settings¶
FabulAWS uses django_compressor and django-storages to store media on S3. The following
settings changes are required in your base settings.py
:
- compressor, storages, and djcelery should be added to your INSTALLED_APPS.
- Add the following to the end of your settings.py, modifying as needed:
# Celery settings
import djcelery
from celery.schedules import crontab
djcelery.setup_loader()
CELERY_SEND_TASK_ERROR_EMAILS = True
# List of finder classes that know how to find static files in
# various locations.
STATICFILES_FINDERS = (
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder',
    'compressor.finders.CompressorFinder',
)
STATIC_ROOT = os.path.join(BASE_DIR, 'static')
COMPRESS_ENABLED = False # enable in local_settings.py if needed
COMPRESS_CSS_HASHING_METHOD = 'hash'
COMPRESS_PRECOMPILERS = (
    ('text/less', 'lessc {infile} {outfile}'),
)
wsgi.py¶
You’ll need to change the default DJANGO_SETTINGS_MODULE in your project’s
wsgi.py to myproject.local_settings.
Static HTML¶
You need to create two static HTML files, one for displaying an upgrade message while you’re deploying to your site, and one to serve as a “dummy” health check to keep instances in your load balancer healthy while deploying.
The paths to these files can be configured in the static_html dictionary
in your fabulaws-config.yml:
static_html:
  upgrade_message: deployment/templates/html/503.html
  healthcheck_override: deployment/templates/html/healthcheck.html
The healthcheck.html file can contain anything you’d like. We recommend
something distinctive, e.g., OK (nginx override), so that you can tell whether
your health check is being served by Django or by the “dummy” health check
HTML file.
Similarly, the 503.html file can contain anything you’d like, from
something as simple as Upgrade in progress. Please check back later. to
a complete HTML file with stylesheets and images to display a “pretty”
upgrade-in-progress message.
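For example, minimal versions of the two files might look like this (contents are arbitrary placeholders; only the paths must match fabulaws-config.yml):

```html
<!-- deployment/templates/html/healthcheck.html -->
OK (nginx override)

<!-- deployment/templates/html/503.html -->
<html><body><h1>Upgrade in progress. Please check back later.</h1></body></html>
```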
Basic Auth¶
If you want to add HTTP Basic Auth to a site, add a section to
fabulaws-config.yml like this:
# Any sites that need basic auth
# This is NOT intended to provide very high security.
use_basic_auth:
  testing: True
  anotherenv: True
Add basic_auth_username and basic_auth_password to password_names:
password_names: [a, b, c, ..., basic_auth_username, basic_auth_password]
And add the desired username and password to each environment secrets file:
basic_auth_username: user1
basic_auth_password: password1
You’ll need to add these entries to all secrets files; just set them to an empty string for environments where you are not using basic auth.
Then in the testing and anotherenv environments, fabulaws will apply
basic auth to the sites. For testing, user user1 will be able to use
password password1, and so forth.
Note
Fabulaws will also turn off Basic Auth for the health check URL so that the load balancer
can access it. It assumes that the health check URL is /healthcheck.html
and that Django will be serving the health check URL (rather than it being served as a
static file directly by Nginx, for example). If either of those assumptions is not
correct, you will need to tweak it by copying and modifying the template for nginx.conf.
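The effect is roughly equivalent to the following nginx configuration (a sketch only; the actual template fabulaws renders differs, and the htpasswd path shown is hypothetical):

```nginx
location / {
    auth_basic "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
}

location = /healthcheck.html {
    # health check stays open so the load balancer can reach it
    auth_basic off;
}
```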
Health Check¶
You’ll need to configure a health check within Django as well. Following is a sample you can use.
Add to views.py
:
import logging
from django.db import connections
from django.http import HttpResponse, HttpResponseServerError
def health_check(request):
    """
    Health check for the load balancer.
    """
    logger = logging.getLogger('fabutest.views.health_check')
    db_errors = []
    for conn_name in connections:
        conn = connections[conn_name]
        try:
            cursor = conn.cursor()
            cursor.execute('SELECT 1')
            row = cursor.fetchone()
            assert row[0] == 1
        except Exception as e:
            # note that there doesn't seem to be a way to pass a timeout to
            # psycopg2 through Django, so this will likely not raise a timeout
            # exception
            logger.warning('Caught error checking database connection "{0}"'
                           ''.format(conn_name), exc_info=True)
            db_errors.append(e)
    if not db_errors:
        return HttpResponse('OK')
    else:
        return HttpResponseServerError('Configuration Error')
Add lines similar to the following to your urls.py:
from django.conf.urls import include, url
from django.contrib import admin
from fabutest import views as fabutest_views
urlpatterns = [
    url(r'^admin/', include(admin.site.urls)),
    url(r'^healthcheck.html$', fabutest_views.health_check),
]
Python Requirements¶
The following are the minimum Python requirements for deploying a web application using FabulAWS (update version numbers as needed):
Django==1.8.8
psycopg2==2.6.1
pytz==2015.7
django-celery==3.1.17
celery==3.1.19
gunicorn==19.4.5
django-balancer==0.4
boto==2.39.0
django-storages==1.1.8
django-compressor==2.0
python-memcached==1.57
redis==2.10.5
django-redis-cache==1.6.5
django-cache-machine==0.9.1
newrelic==2.60.0.46
In addition, the following requirements are needed for deployment:
fabric==1.10.2
boto==2.39.0
pyyaml==3.11
argyle==0.2.1
First Deployment¶
Once you have your EC2 environment and project configured, it’s time to create your initial server environment.
To create a new instance of the testing environment, you can use the
create_environment Fabric command, like so:
fab create_environment:myproject,testing
In addition to the console, be sure to inspect the log files generated (*.out
in the current directory) to troubleshoot any problems that may arise.
For more information, please refer to the Deployment documentation.
Deployment¶
FabulAWS uses Fabric for deployment, with which some familiarity is strongly recommended. This page assumes that you’ve completed the necessary setup described in Server Architecture. You will also need a local development environment set up as described in New Project Setup.
Important
When deploying to an environment, your local copy of the code
should be up to date and must also have a checkout of the correct branch.
Environments can be mapped to branches in the fabulaws-config.yml
file.
Testing environment¶
The testing environment is usually a temporary environment that can be used for load testing, deployment testing, or other activities that would otherwise negatively impact the primary environment for testing new features (see Staging environment below).
To create a new instance of the testing environment, you can use the
create_environment Fabric command, like so:
fab create_environment:myproject,testing
Prior to running this command, be certain that all prior instances of testing servers have been terminated via the AWS console. This command will create all the required servers in parallel. To avoid difficulty with determining which server failed to be created when a problem is encountered, the logs for server creation are saved to separate files. Always check these files to ensure that the servers were created successfully.
Staging environment¶
The staging environment is typically used for testing and quality assurance of new features. It also serves as a testing ground for doing the deployment itself. New features (even small bug fixes) should be deployed to and tested on the staging environment prior to being deployed to the production environment.
The staging environment is usually a copy of the production environment running on smaller (cheaper) virtual machines at EC2. It also typically contains a recent snapshot of the production database, so any issues specific to the production environment can be tested on staging without affecting usability of the production site.
Production environment¶
The production environment typically hosts the live servers used by the application’s end-users.
Deployment methods¶
Since FabulAWS uses Amazon Autoscaling, special care must be taken to update the autoscaling image at the same time as new code is deployed.
Autoscaling: Updating the image¶
Because AMI creation can be a time-intensive part of the process, it can be done separately ahead of time to prepare for a deployment.
To create an autoscaling AMI and launch configuration based on the current version of the code (from the appropriate branch - see above), run the following command:
fab create_launch_config_for_deployment:myproject,<environment>
This command will print out the name of the created launch configuration, which can be passed into the associated autoscaling deployment methods below. If needed, the launch configuration names and associated images can also be found via the AWS console.
Autoscaling: Full deployment¶
A “full” deployment should be used any time there are backwards-incompatible updates to the application, i.e., when having two versions of the code running simultaneously on different servers might have damaging results or raise errors for users of the site. Note that this type of deployment requires downtime, which may need to be scheduled ahead of time depending on which environment is impacted.
With autoscaling, a full deployment works as follows:
- First, the autoscaling group’s ability to add new instances to the load balancer is suspended, a new launch configuration for the new version of the code is installed, and the desired number of instances for the group is doubled. This has the effect of spinning up all the new required instances without adding them to the load balancer.
- Once those instances have been created, the “upgrade in progress” message is displayed on all the servers, deploy_worker is run to update the database schema and any static media, and the autoscaling group’s ability to add instances to the load balancer is resumed. The process then waits for all instances to be healthy in the load balancer.
- Finally, the old instances in the group are terminated, and the “upgrade in progress” message is removed from the new servers.
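The full-deployment sequence above can be outlined in code. Every helper below is a hypothetical stand-in for FabulAWS's internal Fabric tasks; the sketch only demonstrates the ordering of the steps, not the real AWS calls:

```python
# Illustrative outline of the full-deployment steps described above.
# Every helper is a hypothetical stand-in; it only records the step taken.

steps = []

def suspend_add_to_elb():      steps.append("suspend AddToLoadBalancer")
def set_launch_config(name):   steps.append("switch ASG to launch config " + name)
def double_desired_capacity(): steps.append("double desired capacity")
def begin_upgrade():           steps.append("display upgrade message")
def deploy_worker():           steps.append("deploy_worker: migrations + static media")
def resume_add_to_elb():       steps.append("resume AddToLoadBalancer")
def wait_until_healthy():      steps.append("wait for instances healthy in ELB")
def terminate_old_instances(): steps.append("terminate old instances")
def end_upgrade():             steps.append("remove upgrade message")

def deploy_full(launch_config):
    # 1. Spin up the new instances without adding them to the load balancer:
    suspend_add_to_elb()
    set_launch_config(launch_config)
    double_desired_capacity()
    # 2. Show the upgrade message, migrate, and let instances register:
    begin_upgrade()
    deploy_worker()
    resume_add_to_elb()
    wait_until_healthy()
    # 3. Retire the old instances and go live again:
    terminate_old_instances()
    end_upgrade()

deploy_full("myproject-production-example-lc")
print(len(steps))  # 9
```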
The syntax for completing a full deployment is as follows:
fab deploy_full:myproject,<environment>[,<launch config name>]
The launch configuration name is optional, and one will be created automatically if not specified.
Note
This command does not update secrets from your local file to the servers. If you want to do that, explicitly run fab <environment> update_server_passwords before running this command.
Autoscaling: Serial deployment¶
A “serial” deployment can be used any time the changes being deployed are minimal enough that having both versions of the code running simultaneously will not cause problems. This is usually the case any time there are minor, code-only (non-schema) updates. Each server points to a separate copy of the static media specific to the version of the code that it’s running, so backwards incompatible CSS and JavaScript changes can safely be deployed serially.
Serial deployments with autoscaling work by gradually marking instances in the autoscaling group as unhealthy, and then waiting for the group to create a new, healthy instance before proceeding. A serial deployment can be started as follows:
fab deploy_serial:myproject,<environment>[,<launch config name>]
Again, the launch config is optional and one will be created automatically if not specified.
Note
This command does not update secrets from your local file to the servers. If you want to do that, explicitly run fab <environment> update_server_passwords before running this command.
Note
You may see errors that look like this while running a serial deployment:
400 Bad Request
<ErrorResponse xmlns="http://elasticloadbalancing.amazonaws.com/doc/2012-06-01/">
<Error>
<Type>Sender</Type>
<Code>InvalidInstance</Code>
<Message>Could not find EC2 instance i-1bb70c35.</Message>
</Error>
<RequestId>9b3dc6a5-850e-11e3-9e35-b9e8294315ba</RequestId>
</ErrorResponse>
These errors are expected and simply mean that the elastic load balancer is not yet aware of the newly created instance.
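Tooling that polls the load balancer can treat InvalidInstance as transient and simply retry. A minimal sketch, with no real AWS calls; the exception class and health-check callable are invented for illustration:

```python
import time

class InvalidInstance(Exception):
    """Stand-in for the ELB API's InvalidInstance error."""

def wait_for_registration(get_health, instance_id, retries=30, delay=0):
    """Poll instance health, tolerating InvalidInstance until the
    load balancer learns about the newly created instance."""
    for _ in range(retries):
        try:
            return get_health(instance_id)
        except InvalidInstance:
            time.sleep(delay)
    raise RuntimeError("instance %s never registered" % instance_id)

# Simulated ELB that only learns about the instance on the third poll:
polls = {"count": 0}
def fake_health(instance_id):
    polls["count"] += 1
    if polls["count"] < 3:
        raise InvalidInstance(instance_id)
    return "InService"

print(wait_for_registration(fake_health, "i-1bb70c35"))  # InService
```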
Suspending and restarting autoscaling processes¶
If for any reason autoscaling needs to be suspended, this can be accomplished through Fabric. To suspend all autoscaling processes, simply run:
fab suspend_autoscaling_processes:myproject,<environment>
To resume autoscaling once any issues have been resolved, run:
fab resume_autoscaling_processes:myproject,<environment>
A note about usernames¶
If you get a prompt that looks something like this when you attempt to deploy, it’s quite possible that you’re giving the remote server the wrong username (or you don’t have access to the servers to begin with):
[ec2-23-22-145-188.compute-1.amazonaws.com] Passphrase for private key:
When deploying to any environment, if your local username is different from the username you use to login to the remote server, you need to give Fabric a username on the command line, like so:
fab -u <remoteusername> <environment> <commands>
Maintenance Tasks¶
Describing an environment¶
While performing maintenance on an environment, it’s sometimes helpful to know
exactly what servers are in that environment and what their load balancer
status is, if any. To get a list of all servers in a given environment and
print some basic metadata about those servers, you can use the describe
command via Fabric, like so:
fab describe:myproject,<environment>
Adding new sysadmin users¶
If you don’t have access to the servers yet, add your SSH public key in the deployment/users/ directory. To avoid having to pass a -u argument to fabric on every deploy, make the name of the file identical to your local username. Then ask someone who has access to run this command:
fab staging update_sysadmin_users
Updating New Relic keys¶
To update the New Relic API and License keys, first find the new keys from the new account. The License Key can be found from the main account page, and the API key can be found via these instructions: https://docs.newrelic.com/docs/apis/api-key
Next, make sure your local fabsecrets_<environment>.py file is up to date:
fab production update_local_fabsecrets
Next, update the newrelic_license_key and newrelic_api_key values inside the fabsecrets_<environment>.py file with the new values. Then, update the keys on the servers:
fab staging update_server_passwords
fab production update_server_passwords
Finally, update the configuration files containing the New Relic keys and restart the Celery and Gunicorn processes:
fab update_newrelic_keys:myproject,staging
fab update_newrelic_keys:myproject,production
Note this short method of updating the configuration files involves a brief moment of downtime (10-20 seconds). If no downtime is desired, you can achieve the same result by repeating the following commands for each environment, as needed (but it will take much longer, i.e., 30-60 minutes):
fab production upload_newrelic_sysmon_conf
fab production upload_newrelic_conf
fab deploy_serial:myproject,production
Copying the database from production to staging or testing¶
To copy the production database on the staging server, run the following command:
fab staging reload_production_db
This will drop the current staging DB, create a new database, load it with a copy of the current production data, and then run any migrations not yet run on that database. The same command will work on the testing environment by replacing “staging” with “testing”. Internally, autoscaling is suspended and an upgrade message is displayed on the servers while this command is in progress.
Fixing an issue with broken site icons¶
If the button icons on the site appear as text rather than as images, there is probably an issue with the CORS configuration for the underlying S3 bucket that serves the font used to show these icons. To correct this, follow these steps:
First, navigate to the S3 bucket in the AWS Console, and click the Properties tab
Next, expand the Permissions section and then click Add CORS Configuration. The text in the popup should look something like this:
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
<AllowedOrigin>*</AllowedOrigin>
<AllowedMethod>GET</AllowedMethod>
<MaxAgeSeconds>3000</MaxAgeSeconds>
<AllowedHeader>Authorization</AllowedHeader>
</CORSRule>
</CORSConfiguration>
Finally, click the Save button to add the configuration. This step is important; while it may appear that the configuration is already correct, it needs to be saved before it will be added by S3.
Stopping EC2 machines while not in use¶
Some types of instances, including db-master, db-slave, and worker servers, can be stopped via the AWS console, later restarted, and then reconfigured by running the following commands (in order):
fab <environment> mount_encrypted:roles=db-master
fab <environment> mount_encrypted:roles=db-slave
fab <environment> mount_encrypted:roles=worker
The cache server, due to an intricacy with how RabbitMQ stores its data and configuration files, must be completely terminated and recreated (it does not support changing the host’s IP address). For more information, see: http://serverfault.com/questions/337982/how-do-i-restart-rabbitmq-after-switching-machines
Web servers are managed via Amazon Auto Scaling. To terminate all web servers, simply navigate to the AWS Auto Scaling Group and set the Minimum, Desired, and Maximum number of instances to zero. Failure to complete this step may result in the Auto Scaling Group perpetually attempting to bring up new web servers and failing because no database servers exist.
Resizing servers or recreating an environment¶
An entire environment can be recreated, optionally with different server sizes, with a single command. Note that this command takes a long time to run (30-60 minutes or even several hours, depending on the size of the database). For this reason, it is beneficial to clean out the database (see above) before downsizing the servers because copying the database from server to server takes a significant portion of this time. That said, the environment will not be down or inaccessible for this entire time; rather, the script does everything in an order that minimizes the downtime required. For a typical set of smaller servers and an empty database, the downtime will usually be less than 2 minutes.
If you’d like to resize an environment, first edit the instance_types dictionary in fabulaws-config.yml to the sizes you’d like for the servers.
Here are the minimum sizes for each server type:
- cache: m1.small
- db-master: m1.small
- db-slave: m1.small
- web: m1.small
- worker: m1.medium
Once the sizes have (optionally) been adjusted, you can recreate the environment like so:
fab recreate_servers:myproject,production
Updating Dependencies¶
To circumvent the inevitable issues with PyPI during deployment, sdists for all dependencies needed in the staging and production environments must be added to the requirements/sdists/ directory. This means that, whenever you change requirements/apps.txt, you should make a corresponding change to the requirements/sdists/ directory.
Adding or updating a single package¶
To download a single sdist for a new or updated package, run the following command, where package-name==0.0.0 is a copy of the line that you added to requirements/apps.txt:
pip install package-name==0.0.0 -d requirements/sdists/
(On recent versions of pip, the -d/--download flag has been removed; use pip download -d requirements/sdists/ package-name==0.0.0 instead.)
After downloading the new package, remove the outdated version from version control, and add the new one along with the change to apps.txt.
Repopulating the entire sdists/ directory¶
You can also repopulate the entire sdists directory as follows:
cd requirements/
mkdir sdists_new/
pip install -r apps.txt -d sdists_new/
rm -rf sdists/
mv sdists_new/ sdists/
Upgrading system packages¶
Since the site uses Amazon Auto Scaling, to ensure the servers have the latest versions of Ubuntu packages we first need to update the web server image. This can be done by running a new deployment, like so:
fab deploy_serial:myproject,<environment>
Upgrading Ubuntu packages on the persistent (non-web) servers can be done with the upgrade_packages Fabric command. Before upgrading, it’s best to take the site offline and put it in upgrade mode to avoid any unexpected error pages while services are restarted:
fab <environment> begin_upgrade
Once the site is in upgrade mode, you can update packages on the servers as follows:
fab <environment> upgrade_packages
This command will connect to the servers one by one, run apt-get update, install any new packages needed by the web servers, and then run apt-get upgrade. You will be prompted to accept any upgrades that need to take place, so you will have the opportunity to cancel the upgrade if needed for any reason.
After verifying that the packages have installed successfully, you can bring the site back online like so:
fab <environment> end_upgrade
Note that upgrading may take some time, depending on the number of servers and size of the upgrades, so it’s best to schedule this during an off-hours maintenance window.
Troubleshooting server issues¶
Managing SSH host keys¶
Amazon will regularly reuse IP addresses for servers, which can cause conflicts with your local ssh host keys (~/.ssh/known_hosts on most systems). If you see a message like this while creating a server, you’ll know you’re affected by this:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
If you see this message, you will need to terminate the server being created, delete the host key (for both the hostname and the IP address) from your known_hosts file, and re-create the server.
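Editing known_hosts by hand is error-prone; `ssh-keygen -R <host>` removes the entry for you (run once for the hostname and once for the IP address). The filtering itself is straightforward; this rough sketch handles plain (unhashed) entries only, whereas ssh-keygen also handles hashed ones:

```python
# Rough sketch of what removing a host key entails, for plain-text
# (unhashed) known_hosts entries; `ssh-keygen -R` also handles hashed ones.

def remove_host_keys(known_hosts_text, hosts):
    """Drop any known_hosts lines whose first field names one of `hosts`.
    The first field may be comma-separated, e.g. "hostname,ip-address"."""
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split(None, 1)
        names = fields[0].split(",") if fields else []
        if not any(h in names for h in hosts):
            kept.append(line)
    return "\n".join(kept)

sample = (
    "ec2-23-22-145-188.compute-1.amazonaws.com,23.22.145.188 ssh-rsa AAAA...\n"
    "github.com ssh-rsa BBBB..."
)
cleaned = remove_host_keys(
    sample,
    ["ec2-23-22-145-188.compute-1.amazonaws.com", "23.22.145.188"],
)
print(cleaned)  # github.com ssh-rsa BBBB...
```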
Resetting a troubled environment¶
If all the servers and services in an environment appear to be running properly but the web frontend or worker still isn’t functioning, one of the quickest ways to “reset” an environment is to run a deployment. Before attempting more extreme measures, you can run a deployment to get all the config files in sync and restart many of the services that make up the application, like so:
fab <environment> begin_upgrade deploy_worker deploy_web end_upgrade
Fixing Celery when it won’t stop¶
Celery occasionally gets into a state where it won’t stop and may be pegging the CPU. If this happens (e.g., if the deploy_worker command hangs indefinitely while stopping the Celery workers), you may need to SSH to the worker server manually and run:
sudo killall -9 celery
If this doesn’t work, you can revert to manually finding the PIDs of the stuck Celery processes in ps auxww or top and killing them with:
sudo kill -9 <PID>
After doing this, be sure to run deploy_worker (or if it was already running, let it complete) so as to restore Celery to a running state again.
Master database goes down¶
If the master database goes down, manually verify that it’s permanently lost before converting a slave into the master. At this point you probably also want to enable the pretty upgrade message on all the web servers:
fab <environment> begin_upgrade
Any slave can be “promoted” to the master role via fabric, as follows:
fab <environment> promote_slave
This will tag the old master as “decommissioned” in AWS, tag the slave as the new master, and then run the Postgres command to promote a slave to the master role.
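Conceptually, the re-tagging is just a role swap. A toy sketch follows; the tag keys and values are illustrative (not FabulAWS's actual tag schema), and the real command edits EC2 instance tags through the API before running the PostgreSQL promotion:

```python
# Toy model of the role swap promote_slave performs. The tag keys and
# values here are illustrative, not FabulAWS's actual tag schema.

def promote_slave(master_tags, slave_tags):
    master_tags["role"] = "decommissioned"  # old master taken out of rotation
    slave_tags["role"] = "db-master"        # slave takes over the master role
    # ...then PostgreSQL promotion runs on the slave, ending recovery
    # mode so it begins accepting writes.

master = {"Name": "myproject_production_db-master", "role": "db-master"}
slave = {"Name": "myproject_production_db-slave", "role": "db-slave"}
promote_slave(master, slave)
print(master["role"], slave["role"])  # decommissioned db-master
```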
After promoting a slave, you need to reconfigure all the web servers to use the new master database. The easiest way to do that is through a deployment:
fab <environment> deploy_worker deploy_web
If you had more than one slave database before promoting a slave, the additional slaves need to be reset to stream from the new master. This can be accomplished with the reset_slaves command:
fab <environment> reset_slaves
Once complete, you can disable the upgrade message and resume usage of the site:
fab <environment> end_upgrade
Slave database goes down¶
If a slave database goes down, first enable the pretty upgrade message on all the web servers:
fab <environment> begin_upgrade
The site can operate in a degraded state with only a master database. To do that, navigate to the AWS console and stop or re-tag the old slave server so it can no longer be discovered by Fabric. Then, run a deployment to update the local settings files on all the web servers:
fab <environment> deploy_worker deploy_web
Once complete, you can disable the upgrade message and resume usage of the site:
fab <environment> end_upgrade
Adding a new slave¶
If a slave database is lost (either due to promotion to the master role or because it was itself lost), it is desirable to return the application to having two or more database servers as soon as possible. To add a new slave database to the Postgres cluster, first create a new server as follows:
fab new:myproject,<environment>,db-slave,X
where X is the availability zone in which you wish to create the server (it should be created in a zone that doesn’t already have a database server, or has the fewest database servers).
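The zone-choice rule above is easy to automate. This small sketch uses hypothetical sample data; a real version would query EC2 for the zones of the currently tagged database servers:

```python
# Pick the availability zone with the fewest existing database servers,
# per the rule above. The zone lists here are hypothetical sample data.
from collections import Counter

def pick_zone(all_zones, db_server_zones):
    counts = Counter(db_server_zones)
    # Fewest DB servers first; ties broken alphabetically for determinism.
    return min(all_zones, key=lambda z: (counts[z], z))

zones = ["us-east-1a", "us-east-1b", "us-east-1d"]
db_zones = ["us-east-1a", "us-east-1b"]  # where current DB servers live
print(pick_zone(zones, db_zones))  # us-east-1d
```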
Next, configure the web servers to begin using the new slave by doing a serial deployment:
fab deploy_serial:myproject,<environment>
This will take the web servers down one at a time, deploy the latest code, and update the settings file to use the newly added database server.
Slave database loses replication connection¶
While PostgreSQL administration is outside the scope of this guide, if you have determined that a slave database has lost the replication connection to the master database and you prefer not to simply create a new slave database server, you can re-sync the slave(s) with the master with the following command:
fab <environment> reset_slaves
Web server dies¶
Web servers are disposable, and are automatically recreated via autoscaling if they become unhealthy.
Worker server dies¶
Worker servers are also disposable, so the easiest way to recover from one dying is simply to destroy it and create another. To destroy the instance, make sure that it’s really dead (try SSHing to it and/or rebooting it from the AWS console). If all else fails, you can terminate the instance from the console (unless you want to leave it around to troubleshoot what went wrong).
Adding a new worker server¶
Creating a new worker server works the same as creating a web server:
fab new:myproject,<environment>,worker,X
where X is the availability zone in which you wish to create the server.
After creating the worker, you will also need to update it with the correct settings file and start the worker processes. This can be done by running:
fab <environment> deploy_worker
Cache service goes down¶
If one of the services (e.g., RabbitMQ or Redis) simply dies on the cache server, SSH to that machine and attempt to start it by hand. RabbitMQ has been known on at least one occasion to have shutdown by itself for no apparent reason.
Cache server (RabbitMQ and Redis) fails¶
If the cache server fails, the web site will be inaccessible until a new server is created because the site relies on using Redis as a session store. As such, first display the pretty upgrade message on the servers:
fab <environment> begin_upgrade
Now, create a new cache server as follows:
fab new:myproject,<environment>,cache,X
where X is the availability zone in which you wish to create the server. Typically this should be one of the two zones that the web servers reside in.
While the new server is being created, navigate to the AWS console and stop or re-tag the old cache server so it can no longer be discovered by Fabric.
Once the new server has finished building, update the configuration on all the servers by running a deployment:
fab <environment> deploy_worker deploy_web
When that’s complete, disable the upgrade message on the web servers:
fab <environment> end_upgrade
Web servers churning during a deploy¶
If you see web servers being launched, but then being terminated before they come into service, this is usually due to a problem with the load balancer not receiving a healthy response from the health check. If the web server is returning a 500 error, you should hopefully get an error email, which will help you debug the problem. If you get a 4xx error, you may not, so you might not even be aware that the web servers are churning. Once you are aware, suspend autoscaling:
fab suspend_autoscaling_processes:myproject,<environment>
SSH into the web server in question. Look at the /home/myproject/www/{environment}/log/access.log file and see what HTTP status code is being returned to the load balancer.
- 401 errors mean the load balancer is getting a Basic Auth check which it is failing.
- 404 errors mean the health check URL is incorrectly configured, either due to a misconfiguration in Nginx or in Django.
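If the log is large, a few lines of scripting beat eyeballing it. This sketch tallies the statuses returned for the health-check path from nginx "combined"-style log lines; the log format and path are assumed from the configuration described earlier, and the sample lines are invented:

```python
# Tally the HTTP statuses returned for the health-check URL from
# nginx "combined"-style access log lines. Sample lines are invented.
import re
from collections import Counter

STATUS_RE = re.compile(r'"\w+ (\S+) HTTP/[\d.]+" (\d{3})')

def healthcheck_statuses(log_lines, path="/healthcheck.html"):
    counts = Counter()
    for line in log_lines:
        m = STATUS_RE.search(line)
        if m and m.group(1) == path:
            counts[m.group(2)] += 1
    return counts

log = [
    '10.0.0.5 - - [28/Jan/2014:10:00:00] "GET /healthcheck.html HTTP/1.1" 401 0 "-" "ELB-HealthChecker/1.0"',
    '10.0.0.5 - - [28/Jan/2014:10:00:30] "GET /healthcheck.html HTTP/1.1" 401 0 "-" "ELB-HealthChecker/1.0"',
    '10.0.0.9 - - [28/Jan/2014:10:00:31] "GET /about/ HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
print(healthcheck_statuses(log))  # Counter({'401': 2})
```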
Remember to resume autoscaling once you have fixed the problem:
fab resume_autoscaling_processes:myproject,<environment>
Using FabulAWS in your fab file¶
FabulAWS uses Fabric internally to communicate with newly-created servers, so it follows naturally that you can use FabulAWS in your fab files to create new servers and deploy code to them.
Simple Fabric example¶
Adding fabulaws to an existing fab file can be as simple as importing and instantiating an EC2 instance class, e.g.:
from fabric.api import *
from fabulaws.ec2 import MicroUbuntuInstance

def new_instance():
    i = MicroUbuntuInstance()
    env.hosts = [i.hostname]

def bootstrap():
    run('git clone %s' % env.repo)
The new_instance method creates a new Amazon EC2 instance and gives you access to the hostname of that newly created instance, so running:
fab new_instance bootstrap
Connecting to EC2...
would create a new copy of that instance on Amazon, using the API key in your shell environment.
Tagging instances¶
To make it easier to keep track of your instances on EC2, you can tag them with your environment (e.g., 'staging' or 'production'), as well as something that identifies the product or group of servers that you’re deploying:
from fabric.api import *
from fabulaws.ec2 import MicroUbuntuInstance

def new_instance(environment):
    tags = {'environment': environment, 'product': 'caktus-website'}
    i = MicroUbuntuInstance(tags=tags)
    env.hosts = [i.hostname]

def bootstrap():
    run('git clone %s' % env.repo)
Now, you can pass in the environment that you’re creating when you run fab:
fab new_instance:staging bootstrap
Connecting to EC2...
Retrieving tagged instances¶
To retrieve and use tagged instances from your fab file, use the ec2_hostnames method in fabulaws.api to retrieve the hostnames for the instances tagged with the appropriate tags, e.g.:
from fabric.api import *
from fabulaws.api import *

def staging():
    filters = {'tag:environment': 'staging', 'tag:product': 'caktus-website'}
    env.hosts = ec2_hostnames(filters=filters)

def update():
    run('git pull')
Then, you can run fab as you normally would from the command line, and it will reach out to EC2 to retrieve the hostname(s) for your server(s) before running commands on them:
$ fab staging update
Connecting to EC2...
Useful commands¶
Mega-commands¶
- fab describe:deployment,environment - Show the config and existing servers
- fab create_environment:deployment,environment - create all the initial servers. After this, you might need to bump up the instances in the autoscaling group to get web servers going.
- fab update_environment:deployment,environment - run update_sysadmin_users, update_server_passwords, and upgrade_packages
- fab environment update_services - does upload_newrelic_conf, upload_supervisor_conf, upload_pgbouncer_conf, and upload_nginx_conf
Deploys¶
All-in-one commands:
- fab deploy_serial:deployment,environment[,launch_config_name] - Create a new launch config if a name is not provided. Update the ASG to use the provided or new launch config. Take web servers down one at a time and bring up new ones, so you end up with all new ones without downtime.
- fab deploy_full:deployment,environment[,launch_config_name] - Create a new launch config if a name is not provided. Update the ASG to use the provided or new launch config. Take all the web servers down and bring up new ones. This is faster than deploy_serial but does cause downtime.
Note
Neither of the above commands updates secrets on the servers. Be sure to explicitly run fab environment update_server_passwords if you want to change secrets at the time of deployment.
More low-level commands:
- fab create_launch_config_for_deployment:deployment,environment - Create a new launch config and print its name, but do not use it for anything. Typically you could use this and then follow with one of the deploy commands, providing the launch_config_name that was output from this command.
- fab environment begin_upgrade - puts up a maintenance page for all requests (the deploy_xxx commands do this for you)
- fab environment deploy_web[:changeset] - deploy to web servers (update them in place, do not update LC or ASG), restart processes
- fab environment deploy_worker[:changeset] - deploy to worker (update in place), restart processes
- fab environment flag_deployment - sends a message to New Relic that the current code revision has just been deployed
- fab environment end_upgrade - reverses begin_upgrade
Misc¶
- fab environment update_sysadmin_users - create or update dev users on servers
- fab environment upgrade_packages - upgrade all Ubuntu packages on servers
- fab environment mount_encrypted - see source
EC2 instances¶
fab new:deployment,environment,role[,avail_zone[,count]]
Handling secrets¶
- fab environment update_local_fabsecrets
- fab environment update_server_passwords - push secrets from local file to servers (except luks_passphrase)
Supervisor¶
- fab environment upload_supervisor_conf
- fab environment supervisor:command,group[,process] - runs ‘supervisorctl command environment-group:environment-process’, or ‘supervisorctl command environment-group’
Examples:
- fab testing supervisor:stop,web
- fab testing supervisor:stop,celery
- fab testing supervisor:stop,pgbouncer
- fab testing supervisor:start,pgbouncer etc.
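Based on the description above, the supervisor task presumably composes its supervisorctl target from the environment, group, and optional process names. A hypothetical reconstruction of that string-building (inferred from the description, not taken from FabulAWS source):

```python
# Hypothetical reconstruction of how the supervisor task builds its
# supervisorctl invocation; inferred from the description above.

def supervisorctl_args(environment, command, group, process=None):
    target = "%s-%s" % (environment, group)
    if process:
        target += ":%s-%s" % (environment, process)
    return ["supervisorctl", command, target]

print(" ".join(supervisorctl_args("testing", "stop", "web")))
# supervisorctl stop testing-web
print(" ".join(supervisorctl_args("testing", "restart", "celery", "celeryd")))
# supervisorctl restart testing-celery:testing-celeryd
```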
Python/Django¶
web servers & worker:
- fab environment update_requirements - does a pip install (without -U) (on all webs & worker)
- fab environment update_local_settings - render local settings template and install it on the servers (but does not restart services) (on all webs & worker)
- fab environment bootstrap - clones source repo, updates services, creates a virtual env, and installs Python packages (on all webs & worker)
- fab environment clone_repo - clones the source repo (on all webs & worker)
- fab environment update_source - updates checked-out source (on all webs & worker)
- fab environment current_changeset - check latest code from repo (on all webs & worker)
worker only:
- fab environment managepy:command - run a manage.py command on worker
- fab environment migrate - run a migrate command on worker
- fab environment collectstatic - run a collectstatic command on worker
- fab environment dbbackup - run a database backup using dbbackup
- fab environment dbrestore - run a database restore (see code for more info)
Databases¶
- fab environment upload_pgbouncer_conf
- fab reload_production_db[:prod_env[,src_env]]
- fab reset_local_db:dbname
- fab environment reset_slaves - this resets the config & data on the db slaves and is a good way to get things back into a working state if the replication seems broken
- fab environment promote_slave[:index] - change slave index to be the master. After this, run update_local_settings to make the web servers use the new settings.
Nginx¶
- fab environment upload_nginx_conf
- fab environment restart_nginx
Newrelic¶
- fab environment upload_newrelic_conf
- fab update_newrelic_keys:deployment,environment - especially useful because it restarts the Django processes, even if you don’t need to change the New Relic config.