# Don't distribute packages via `curl | sh` <br/>(but not for the reasons you think)
(or if you do, at least provide major releases via package manager as well)
*Originally published 2019-05-22 on [docs.sweeting.me](https://docs.sweeting.me/s/blog).*
A rant about why package manager distributions are important.
I feel like lately the number of projects I encounter that recommend `curl | sh` as the *one true way* to install seems to have increased. Maybe people looked around and saw major packages like Docker, Rust, and Python pushing this a few years ago, and then felt empowered to do the same and reject the "tyranny of package managers"... whatever the cause, it's not a trend I want to see continue for a number of reasons.
Let's start by saying I'm by no means perfect. I'm definitely guilty of releasing projects with only a `setup.sh` or `curl | sh` install method. It's quick and easy, and solves most of my needs for simple projects without the overhead of package distribution. Usually, though, once the project reaches a certain size, I've felt pressure from various directions to figure out better packaging solutions. Sometimes I just stopped responding to issues or archived the repo on GitHub entirely to avoid the burden of changing it; other times I gave in and started providing packaged releases.
## A tale as old as time
I've seen the same pattern play out across a number of packages, and judging by blog posts from the '90s and early 2000s, it's been repeating for decades. Hopefully some day we'll collectively become self-aware enough to break the cycle and forget about packaging minutiae in favor of greener pastures.
*The 7 steps to packaging nirvana:*
1. A developer comes up with an idea, runs `git init`, codes up a proof-of-concept, and shares it with a few friends/colleagues
2. It grows in scope and eventually requires some dependencies. They start managing it with an informal system like a `setup.sh` script, a `Makefile`, a `package.json`, or multiple methods cobbled together making subcalls to a few different packaging systems (e.g. a `Makefile` that runs `apt install` or `brew install` internally)
3. The userbase grows, and suddenly they have users in Tunisia, Tuscany, and Taipei. To improve the install UX and help get people up and running, a `setup.sh` or `curl | sh` is thrown in the README, and they *hope* 🤞 it covers all cases.
4. Users start requesting package manager support, and the developer initially resists because distributing packages is a pain in the butt. After all, `apt` doesn't even guarantee reproducible builds!
> "the `curl | sh` is working fine on most systems, why not just use that? aren't most package managers pretty flawed anyway?..."
5. As the package becomes more mainstream, more users demand packaging support, and many "unofficial" packages appear. The unofficial ppas/dockerfiles/etc. often contain numerous bugs and quirks specific to the use case of the creator, or end up abandoned several versions behind the main project repo, frequently missing critical security fixes (see python, nginx, caddy, docker, etc. 2 years ago; thankfully several of them now have official PPAs and casks, or recent versions in mainline debian/brew).
6. The maintainer then gets tired of dealing with the support burden of all these "unofficial" packages, so they cave in and create an official one. Everyone is relieved and the vast majority of people switch to using the official package. They install updates more frequently, and new users have fewer issues with environment quirks or program bugs permanently borking the installed version.
7. The maintainer discovers their life is somehow much easier than before, despite the overhead of package distribution that they were so afraid of. Install support load is lighter and can be scoped more easily to particular OS/package manager combinations. The maintainer then begins to demote their other install methods and starts promoting the OS package manager installation instructions as the best way to obtain their software (see PostgreSQL as the ultimate multi-decade example of this pattern).
* Alternate #7b: The maintainer gets entirely fed up with life and writes a new OS and package manager from scratch. "this time we'll get it right" they mutter (see Nix).
I think python3 is at stage #5-6 right now, and both docker and nginx are at stage #6-7 (2020 UPDATE: docker is now solidly at #7, as predicted). ArchiveBox, a package I maintain, is now on step #7 :) ~~currently on step #4, but I find myself fantasizing about jumping to #7 much sooner than with previous projects I've released, maybe this is what personal growth feels like...~~
Different package managers have different tradeoffs, which are perfectly reasonable to gripe about. Some packages prefer a language-specific package manager that has faster/easier release cycles than the OS-level one. For example, both `youtube-dl` and `gunicorn` are installable via `pip` and `apt`, but release faster on `pip` than on `apt`. I'm just trying to show that both `apt` and `pip` are objectively better than `curl | sh`, simply by virtue of being structured, standardized install methods with centralized update checking, version history, and maintainer history.
## Assorted Arguments
#### Your package is a dependency in other people's projects
Your package may be a static binary with no dependencies... that's great, but *your project is a dependency in other people's projects*, so eschewing the entire package manager ecosystem forces every project that depends on yours to build buggy workarounds and run nondeterministic shell scripts to get your project set up in their particular environment. If *every* project is a special snowflake that can't plug-and-play with the standard methods for fetching and running other people's code, the entire ecosystem falls apart and we go back to the stone ages of posting source directories with 2000-line Makefiles on Usenet and telling the user to "figure it out".
#### Root of trust
I'm not arguing anything about the root of trust, if you trust your HTTPS download of Chrome, then you should trust a `curl | sh` script coming in via HTTPS just the same. My argument is centered around making dependency management easier for users, not around the authentication of the install script content.
#### The "security theater" dismissal
Some people describe `curl | sh` griping as playing "security theater", and claim that using package managers with unsandboxed post-install scripts is no different than running `curl | sh`. That's a strawman, because it focuses only on the HTTPS root-of-trust non-issue and totally ignores the real root argument about whether it's ok for packages to only be distributed via hand-written install scripts as opposed to via package managers that provide automated update checking, centralized ownership and versioning history, and easy uninstallability.
#### Transparency & Inspectability
I'm not making any case that apt, brew, or even nix are more or less transparent than `curl | sh`, I think both are easy enough to inspect in theory, just download the source. But manual code review of any downloaded source is usually impractical when the administrator is installing dozens of packages, all with their respective dependencies.
Punting manual code review responsibility to the end user for everything they install is labor-intensive, and unreliable as a security measure. Packages can still be pwned with obfuscated code that often goes unnoticed even by skilled code reviewers, and the same goes for bash scripts hosted on webservers.
#### Public, explicit chain of trust with revocation built-in
With `curl | sh`, you are forced to trust a random person's HTTPS server with unknown reputation to provide a good install script on every connection. With apt, brew, or nix, you're delegating your trust to a reputable repo maintainer and package maintainer who have verified accounts and signing keys that can be revoked in the event of a breach. The trust chain all the way from commit author, to reviewer, release tagger, maintainer, hoster, and patcher is laid bare and made explicit + public in a standardized way.
#### Enforced public changelog with immutable releases
One of the greatest benefits of package management systems is the ability to access a centralized, versioned database of all package versions with immutable, monotonically increasing hashed builds, like PyPI, NPM, Debian, or Homebrew.
With `curl | sh` not only can you not request a previous version or a specific version of the install script, but the script contents can also [change mid-request](https://www.idontplaydarts.com/2016/04/detecting-curl-pipe-bash-server-side/) right under your nose!
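
To make the "immutable releases" property concrete, here's a toy sketch of an append-only release index. It is not how PyPI, NPM, Debian, or Homebrew are actually implemented; the `ReleaseIndex` class and its method names are made up for illustration. The only point is that once a version is published, its hash is recorded and can never be swapped out, which is exactly the guarantee a bare install script on a webserver can't give you.

```python
import hashlib


class ReleaseIndex:
    """Toy append-only index (hypothetical): a published version is never replaced."""

    def __init__(self):
        # Maps a version string to the SHA-256 hex digest recorded at publish time.
        self._releases = {}

    def publish(self, version: str, artifact: bytes) -> str:
        """Record a new release; refuse to overwrite an existing version."""
        if version in self._releases:
            raise ValueError(f"version {version} already published; releases are immutable")
        digest = hashlib.sha256(artifact).hexdigest()
        self._releases[version] = digest
        return digest

    def verify(self, version: str, artifact: bytes) -> bool:
        """Check a downloaded artifact against the digest recorded at publish time."""
        expected = self._releases.get(version)
        if expected is None:
            return False
        return hashlib.sha256(artifact).hexdigest() == expected
```

With something like this in the middle, a user can ask for "version 1.0.0" next month and check that they got the same bytes as everyone else, instead of whatever the server felt like sending on that particular request.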
#### Package managers still support private repos
With public repos there's transparent history and explicit ownership of contributions by maintainers, but you can still opt out of that centralized system and run your own repo with whatever rules you want (e.g. cask, ppa/cydia repo/sources.list, docker hub, etc.).
Package managers don't restrict the space of possibilities, they can provide both the benefit of accountability and historical accuracy, along with the flexibility to install 3rd-party, "unapproved" packages directly from private ppas or git repos.
#### Auto-updating is hard, don't write auto-updating logic yourself
If the custom auto updater in your app ever has a bug, the upgrade path breaks forever and you have to resort to telling your users to manually update by running a new command. Hardware programmers learn this the hard way when they accidentally ship a device with broken firmware and have to get users to physically return their devices to exchange for a new one. But software programmers have the luxury of being able to rely on a system-provided auto-updater that works for all packages, and is user-configurable in a central, standardised way. **DON'T SHUN THIS FREE BENEFIT!** Security people often say "never write your own crypto"; for all the same reasons, you should avoid writing your own package manager.
Users have a hard enough time keeping packages up-to-date as-is, if every little app has its own auto-updater, the surface area of things that can break is drastically increased, and there's no recourse for errors other than reinstalling from scratch or staying on the stale version. A single unified packaging interface that allows easy dependency management, installation, upgrade, and removal is the bare minimum requirement for a decent devops UX. Easy, automatic updates are extremely important for security-critical packages.
#### Package managers allow users to pick their packaging style at the OS-level
You may not agree with the tradeoffs or design decisions of a particular package manager, but *that's the whole point*: package managers are inherently opinionated, and users have a choice to use the OS and package managers that align with their preferences (e.g. apt, brew, pkg, nix, pip, cargo, etc. all have slightly different philosophies and fit slightly different use cases). Debian likes releases pinned to every LTS, with no new feature releases on packages until the next major release, whereas NPM is a free-for-all where package versions are sometimes bumped 10+ times in less than a week.
Package releasers like to have complete control over packaging behavior, but overall system UX suffers greatly when each app tries to override standard, commonly accepted packaging behavior for the host OS. You don't want to fight your user's system-wide standard packaging behavior or you'll endlessly annoy your userbase for being that one "sore thumb" project they have to keep track of manually. If variation between OS environments and install processes is unacceptable, then ship your releases via Docker instead.
#### Why install dependencies manually when package managers exist?
DEPENDENCIES. Why would you ever want to manage the dependencies for your code yourself? Isn't writing your project enough work already? Why implement an entire dependency manager + auto updater + usable install script that somehow has to work across all of your users' OSes, and that either has no way to install its dependencies automatically or, worse yet, tries to "automagically" install them using the system package manager? (If you're going to do that, just use the system package manager for it all directly!!) Pip, apt, yarn, brew, etc. all have ways of easily specifying required dependencies, with a full range of granularity from entirely unpinned to pinned-versions-with-hashes-for-everything like pipenv or nix. If you must, you can always vendor your sub-dependencies within your package, but there is no need to leave the package manager ecosystem behind entirely!
#### Optional modules, dylibs, and plugins, oh my!
Package managers definitely struggle with some things. Dylibs, untrusted plugins, OS-specific binary dependencies, compile-time options, etc.: these are all hard problems, but they have all been solved before in different ways. Packaging around them may take some creativity.
I propose that the default package manager distribution should be the "all batteries included" version, with most add-ons/modules that users might want precompiled in.
Forcing users to recompile your package from source to get even the most common addons like `nginx_http_perl_module` or `caddy:http-cache` drastically reduces usability for the majority of users who would've been ok with the extra kb needed to include them by default.
A separate "minimal" distribution can be created with no add-ons for the users that have resource limitations (e.g. bandwidth, CPU), but it shouldn't be the default unless you have the stats to show that the majority of your users are on dial-up!
The crux of this argument is that usability for the majority of users should not suffer under the tyranny of the minority with bandwidth limitations. Both parties can be satisfied if a minimal alternative version is provided alongside the "batteries included" default distribution.
#### Making your own package manager
The last resort would be to bootstrap your own package manager from the system package manager to install the program, manage its plugins, and self-update (e.g. `apt install python3.7-pip; pip3 install setuptools`). This is almost always the wrong decision though, unless your program relies heavily on user-contributed plugins and the namespace is too big to install everything by default. FreeNAS solves this by shipping an index of all user-contributed plugins by default, and only pulling the packages and enabling them when the user toggles them on.
#### Users who want small binaries are an edge case
This argument is weak: "my package can't be installed via package manager because some users want all the plugins and some users don't want any".
Disk space is cheap, most people don't care about small binaries, and those that do can be provided a custom build command that installs only the parts they need. Manually assembling the program every time from a list of untrusted user-provided plugins is not a great UX though; even `pip install package[optionaldep,otherdep]` is rarely used because people so heavily favor just doing `pip install package` and getting everything right out of the box.
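
For anyone who hasn't run into the `package[optionaldep,otherdep]` syntax, here's a rough sketch of what it means. This is not pip's actual resolver: `parse_requirement`, `dependencies_for`, `BASE_DEPS`, and `EXTRAS` are all hypothetical names, the dependency names are invented, and rejecting unknown extras is just this sketch's choice. It only illustrates that a bare `package` gets the base dependencies and nothing else, which is why opt-in extras so often go unused.

```python
import re

# Matches "name" or "name[extra1,extra2]"; a simplification of real requirement syntax.
_REQUIREMENT = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[([^\]]*)\])?\s*$")

# Hypothetical metadata for an imaginary package; none of these names are real.
BASE_DEPS = ["corelib"]
EXTRAS = {"optionaldep": ["pluginlib"], "otherdep": ["cachelib"]}


def parse_requirement(spec):
    """Split 'package[a,b]' into ('package', ['a', 'b']) with extras sorted."""
    match = _REQUIREMENT.match(spec)
    if match is None:
        raise ValueError(f"cannot parse requirement: {spec!r}")
    name = match.group(1)
    raw_extras = match.group(2) or ""
    extras = sorted({e.strip() for e in raw_extras.split(",") if e.strip()})
    return name, extras


def dependencies_for(spec):
    """Return the base dependencies plus those pulled in by each requested extra."""
    _name, extras = parse_requirement(spec)
    deps = list(BASE_DEPS)
    for extra in extras:
        if extra not in EXTRAS:
            raise ValueError(f"unknown extra: {extra!r}")
        deps.extend(EXTRAS[extra])
    return deps
```

In this sketch, `dependencies_for("package")` returns only the base list, while `dependencies_for("package[optionaldep,otherdep]")` adds the optional pieces. Flipping the default so the plain name gets everything is the "batteries included" distribution I'm arguing for.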
## My Packaging Background
"Do you even package, what authority do you have on this topic?" you may ask...
~~It's true, I don't even package on a daily basis, bro. I have a few projects up on NPM and PyPI, but the biggest open source project I've released doesn't even have an `apt` or `brew` version available, but at least my appetite for this particular brand of dogfood is growing, so I hope to upload some releases for those platforms soon.~~* I have been a sysadmin, library maintainer, and application developer for 10+ years at this point, and I've dealt with maintaining and managing thousands of dependencies for large-scale projects across dozens of companies. The crux of my argument is that the UX for the end user who has to install and update the project is paramount, because if it's difficult to update or install, the package won't get installed, and if it is installed, it won't be updated reliably and security vulns will lie unfixed for longer than they need to. With care and consideration, both the end users and the package creators can be made happy.
\* (this is no longer the case, I now package code regularly for `apt`/`deb`, `brew`, `pip`, `npm`, and `docker`)
Plz shred my arguments on Twitter (mention [@theSquashSH](https://twitter.com/theSquashSH)), let me know where my assumptions are wrong.
## Hash Verification of Curl'ed Scripts
I think this is mostly useless, but if you want a slightly better `curl | sh` process you can verify the hashes of curl'ed scripts before executing with a tool like:

    curl https://... | hashpipe sha256 e8ad3b2b10fa2a5778950ba028f70ca4e5401ea23b52412227eaf908eb7d9b3e | sh -

It's only good for asserting that a script stays at a certain version though, and will break on any changes. It doesn't fix any of the other problems with custom install scripts.
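
If you're curious what a check like that boils down to, here's a minimal sketch in the spirit of the command above. It is not a description of how `hashpipe` itself is implemented, and `verify_script` is a hypothetical helper name. The important detail is that the whole script is buffered and hashed before any of it is handed to a shell, so a mismatch means nothing runs at all.

```python
import hashlib
import hmac


def verify_script(script: bytes, expected_sha256: str) -> bytes:
    """Return the script unchanged if its SHA-256 matches, otherwise raise.

    The caller must pass the complete script; hashing a partial download
    would defeat the purpose of the check.
    """
    actual = hashlib.sha256(script).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    if not hmac.compare_digest(actual, expected_sha256.strip().lower()):
        raise ValueError(f"SHA-256 mismatch: expected {expected_sha256}, got {actual}")
    return script
```

A wrapper around this would read all of stdin (for example with `sys.stdin.buffer.read()`), call `verify_script`, and only then write the bytes to stdout for `sh` to consume.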
## Docker is `curl ... | sh` all over again
I don't really agree with this viewpoint, but it's an interesting take to think about.
> Back then, years ago, Linux distributions were trying to provide you with a safe operating system. With signed packages, built from a web of trust. Some even work on reproducible builds.
> »Docker is the new ‘curl | sudo bash‘«. That’s right, but it’s now pretty much mainstream to download and run untrusted software in your “datacenter”. That is bad, really bad. Before, admins would try hard to prevent security holes, now they call themselves “devops” and happily introduce them to the network themselves!
## Further Reading