An Introduction to the /opt Directory

<center>

# An Introduction to the /opt Directory

*Originally published 2018-09-10 by Nick Sweeting on [docs.sweeting.me/s/blog](https://docs.sweeting.me/s/blog).*

</center>

This post is about standardizing your code, data, and config folder locations on your servers as a sysadmin by using the `/opt` directory on UNIX-style systems.

This post assumes you are comfortable with basic UNIX shell commands, `ssh`, `rsync`, `etc` files, and `bash` syntax. It also assumes you have some prior experience setting up services like web backends and databases on *nix systems.

---

When setting up a new machine, where do you usually put the code you want to have running on the machine? What about the dotfiles and etc files? Do you use Ansible, Puppet, Chef, etc.? Do you need to install packages and set up the machine?

But the real question is: **Do you have a standard way to do this for every machine that you set up?**

The core things to keep track of in each project are:

- code
- config
- data
- executables

Consistency across your environments will help you provision and debug setups efficiently, no matter which machine you're on.

The most effective way to have consistency is to write down your process, and stick to it. With a written process you now have a declarative description of the setup process that you can follow every time you set up a new machine. It also serves as documentation if you ever need to remember how something was set up.

Try to reduce Turing-completeness in your setup wherever possible, i.e. prefer yaml/json/ansible/docker configs describing the desired outcome to Puppet/Chef/bash scripts that imperatively run steps to get there.

The documentation of your process can be broken down into "standards". Think of them like RFCs describing a particular aspect of your sysadmin duties, except the standards committee only has one member: you!

> *For a similar approach, see: [The Twelve-Factor App Methodology](https://12factor.net)*

With that in mind as you read ahead, I encourage you to draft out your own "folder locations standard" that suits your needs, and use it when you create new projects to deploy to your servers.

## Start Simple: Where should we put our project root?

One of the best things to standardize early on is where you locate your projects on your servers. Luckily UNIX can help us out here: there are already some standard folders given to us, like `/opt`, `/etc`, `/var`, `/bin`. You might know that `/etc` is for configuration files, but do you know what all the other ones are for?

One particularly useful one for us today is `/opt`.

> `/opt`: **[Add-on application software packages](http://www.pathname.com/fhs/pub/fhs-2.3.html)**
>
> For the installation of add-on application software packages.
> ___
> Generally, all data required to support a package on a system must be present within `/opt/<package>`, including files intended to be copied into `/etc/opt/<package>` and `/var/opt/<package>` as well as reserved directories in `/opt`.

Following the spirit of that piece of UNIX philosophy, I like to put all my projects inside `/opt/<project-name>`. For example, if I were setting up a WordPress site `myblog.com`, I would put all the files here:

```
/opt/
    myblog.com/
        wordpress/
            index.php
            wp-config.php
            ...
```

## Where should we put config?

I really like to keep all the files relevant to a certain project in one place, and luckily our `/opt` structure above lends itself well to that: we can put our config files in our `/opt/<project-name>/etc` folder, and symlink them into place in the system `/etc` folders under the same paths.

```
/opt/
    myblog.com/
        ...
        etc/
            nginx/
                sites-enabled/
                    myblog.com.conf
                certs/
                    myblog.com.crt
                    myblog.com.key
            mysql/
                my.cnf
            cron.d/
                update_wordpress
```

## What about application state?
Our example WordPress site likely has a database, some logs, and other mutable state such as user-uploaded files. We can keep those in our project under a `data/` directory:

```
/opt/
    myblog.com/
        ...
        data/
            wp-content/     # user-installed plugins, themes, and uploaded files
            database/       # mysql data folder
            logs/
                access.log
                error.log
                database.log
        wordpress/
            wp-content/ -> ../data/wp-content (symlink)
            ...
```

You'll have to put these logfile locations into your `my.cnf` and `nginx.conf` files to get those processes writing data to this custom location instead of the system defaults. Symlinks are your friend: you can always make `/var/log/nginx/access.log` a symlink pointing into your `/opt` folder if you want to have logfiles accessible from the system default locations.

If you'd like to follow strict UNIX philosophy you can symlink `/var/opt/<project-name>` to your opt data folder: http://www.pathname.com/fhs/pub/fhs-2.3.html#VAROPTVARIABLEDATAFOROPT

## Executables and helpers: how do we administer our app?

In my projects I like to create 5 standard executables for the most common admin tasks:

```
/opt/
    myblog.com/
        ...
        bin/
            setup.sh
            start.sh
            stop.sh
            backup.sh
            update.sh
```

- `start.sh`: starts the app and all its necessary services, e.g. `systemctl start nginx`
- `stop.sh`: stops the app cleanly and ends all processes
- `setup.sh`: installs any necessary packages and symlinks all our `/opt/<project-name>` files into place on the host system
- `backup.sh`: dumps any running databases or message queues to static files in `data/` which are safe to copy as a snapshot-in-time backup
- `update.sh`: pulls the latest version of the code and does any package updates/migrations necessary

These executables are useful in many contexts: you can call them from cron jobs, use them in deploy scripts, or just call them manually when SSH'ed into the server and debugging stuff.

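
As a rough sketch of the symlinking step that `setup.sh` is described as doing, the snippet below walks a project's `etc/` folder and links each file to the same relative path under a target `etc` directory. The function name `link_etc_files` and the throwaway demo directories are assumptions made for illustration, not part of the original scripts. It runs against a temporary directory so it can be tried without root, and it assumes file names contain no newlines.

```bash
#!/usr/bin/env bash
set -eu

# link_etc_files PROJECT_ETC SYSTEM_ETC  (hypothetical helper)
# Symlink every regular file under PROJECT_ETC to the same relative
# path under SYSTEM_ETC, creating parent directories as needed.
link_etc_files() {
    local project_etc="$1" system_etc="$2" src rel
    find "$project_etc" -type f | while IFS= read -r src; do
        rel=${src#"$project_etc"/}                 # path relative to the project etc/
        mkdir -p "$system_etc/$(dirname "$rel")"   # make sure the target folder exists
        ln -sfn "$src" "$system_etc/$rel"          # create or replace the symlink
    done
}

# Demo in a throwaway directory so nothing on the real system is touched.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/opt/myblog.com/etc/nginx/sites-enabled" "$demo_root/etc"
echo "# placeholder config" > "$demo_root/opt/myblog.com/etc/nginx/sites-enabled/myblog.com.conf"

link_etc_files "$demo_root/opt/myblog.com/etc" "$demo_root/etc"

# Show the resulting link and where it points.
ls -l "$demo_root/etc/nginx/sites-enabled"
```

On a real server the call would presumably look like `link_etc_files /opt/myblog.com/etc /etc`, run as root. Because `ln -sf` replaces whatever already exists at the target path, rerunning it is harmless for existing links, but it will also replace regular files at those paths, so back those up first.
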
## Tying it all together

I like to put a small README.md in the root of each project explaining the setup and how to do common admin tasks.

Isn't this starting to look suspiciously like a typical GitHub project dir? We can actually add this whole folder into git or other version control, as long as we exclude the big mutable `data` folder. This lets us track our folder structure, config files, and everything else in version control so we can revert our project to any given version.

Our final folder structure looks like this:

```
/opt/
    myblog.com/
        .git
        .gitignore
        README.md
        bin/
            setup.sh
            start.sh
            stop.sh
            backup.sh
            update.sh
        etc/
            nginx/
                ...
            mysql/
                ...
            cron.d/
                ...
            ...
        wordpress/
            index.php
            wp-config.php
            ...
        data/
            wp-content/
                ...
            database/
                ...
            logs/
                ...
```

## Docker-based projects

Luckily Docker-based projects work really well with this folder layout because we can mount the `data/` directory as a volume in our containers:

```
/opt/
    myblog.com/
        docker-compose.yml
        wordpress/
            Dockerfile
            ...
        data/
            ...
        bin/
            ...
```

For a Docker project, the `start.sh` and `stop.sh` helper bin files could just be shortcuts for `docker-compose up -d` and `docker-compose down`, and `update.sh` could be as simple as `git pull; docker-compose pull` to get a new version of the docker-compose file and Docker images.

## Multiple projects on a server

If you have multiple services running on a server you can put them all into `/opt` and use an init system to orchestrate starting and stopping them. Which init system you choose is up to you, but personally I like to use supervisord, a simple Python-based init system.

```
/opt/
    supervisord.conf
    myblog.com/
        ...
    otherblog.com/
        ...
    vpn.mysite.com/
        ...
    mail.mysite.com/
        ...
```

An example `supervisord.conf` might look like this:

```ini
[program:myblog.com]
command=/opt/myblog.com/bin/start.sh
stderr_logfile=/opt/myblog.com/data/logs/wordpress.log
user=www-data

[program:otherblog.com]
command=/opt/otherblog.com/bin/start.sh
stderr_logfile=/opt/otherblog.com/data/logs/wordpress.log
user=www-data

...
```

Then you can start and stop various projects by doing this:

```
supervisorctl start myblog.com
supervisorctl stop myblog.com
supervisorctl restart otherblog.com
...
```

## Backups: Is tar+gzipping the whole folder enough?

You can tar+gzip/snapshot the whole folder, but that's not the best idea if you have running stateful services like databases or message queues like Redis. Copying a database's raw data folder while the service is running can create an inconsistent backup, since the files may change while they are being copied.

For small projects you can create a helper script that exports any running databases or message queues before you tar+gzip the whole folder:

`/opt/<project>/bin/backup.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

# run from the bin/ folder so the relative paths below resolve
cd "$(dirname "$0")"

# optionally stop the app (but not the database) before dumping state
# ./stop.sh

# dump mysql db to file
mysqldump -u backups wp_myblog | gzip -9 > ../data/database/dump.sql.gz

# or dump postgres db to file
pg_dump wp_myblog | gzip -9 > ../data/database/dump.sql.gz

# dump redis to file
redis-cli SAVE
gzip -9 < /var/lib/redis/dump.rdb > ../data/redis/dump.rdb.gz
```

## Offsite backups are a piece of cake now

When preparing to back up an entire project to an offsite backup location, you can do it in two steps like so:

```bash
ssh myblog.com /opt/myblog.com/bin/backup.sh
rsync --archive --progress myblog.com:/opt/myblog.com/ /Backups/myblog.com
```

1. Freeze the state of the app in time by dumping any running services to dump files
2. rsync the entire project folder, with everything inside: data, config, code, etc.

Now `/Backups/myblog.com` contains a perfect replica of myblog.com at that point in time.
It contains everything you would need to deploy that project on a new server.

## My 2 Cents

This particular folder structure outlined above may not float your boat; maybe you prefer to keep everything in `/tank` or manage backups in a different way. That's ok! **Having** a standard is often more important than **which** standard you pick.

The core things to keep track of are: `code`, `config`, `data`, `backups`, and `executables`. As long as you have these files in predictable places, and your projects have a consistent structure across different machines, you'll be able to debug and manage everything simply. Danger lies in the unexpected.

---

Further reading:

- https://grahamc.com/blog/erase-your-darlings
- https://12factor.net