Supply Chain Intrigue, or The Spy Who Shagged My Repo

Update 1, 3/31/2024: Lasse Collin has published an update on the xz backdoor on his own site, tukaani.org. The DNS CNAME "xz.tukaani.org" has been temporarily removed.

Ladies and gentlemen, fellow SysAdmins, distinguished guests - we've got a serious problem on our hands. For those who are new to the party, on 3/29/2024, a backdoor was discovered in the open source package xz. Luckily for us all, the malicious dev's code was sloppy and drew the attention of Andres Freund, a longtime PostgreSQL developer, now working at Microsoft.

I was going to do a detailed timeline, but honestly, it's already being covered fantastically:

Evan Boehs - "Everything I Know About the xz Backdoor"
lcamtuf - "Technologist vs spy: the xz backdoor debate"
Sam James (thesamesam on GitHub) - "FAQ on the xz-utils backdoor"

For anyone who's not up to speed, I strongly encourage you to read those links.

However, the xz intrigue is only a symptom of a larger problem in the world of software development: securing the supply chain. In just the past few months, we've seen issues with malicious packages on both PyPI and npm, the main repositories of third-party libraries for Python and JavaScript respectively. Fortunately, the malicious packages were relatively unsophisticated and were discovered before significant damage could be done, but they hint at an issue that is only going to grow in significance. The xz backdoor is probably the most sophisticated attack we've seen so far (that we know of) - in terms of planning and execution, if not necessarily the source code involved - with almost 3 years of preparation and lead time behind it.

The widespread adoption of git and the popularity of GitHub fostered a revolution in software development. Open source repositories were certainly not new; I was using Perl packages from CPAN - arguably the first open source repository for third-party language libraries - in the late '90s (check out this episode of the CoRecursive podcast for a fascinating history of CPAN). But by the mid-2000s, due to stagnation and other issues, Perl was falling out of favor. The rise of PHP, Ruby, Python, and JavaScript (via jQuery and Ajax) in web development, and Python, Rust, JavaScript (via Node.js), and Go on the systems side - all languages with vibrant developer communities and, most importantly, extensive built-in support for external package repositories - had begun in earnest by the late 2000s. In April 2005, in response to the revocation of the free-of-charge BitKeeper license used for Linux development, Linus Torvalds developed git (see the Wikipedia article for a detailed history) and moved Linux kernel versioning to it that same year. In 2008, GitHub debuted and the revolution truly began. According to the 2023 developer survey by Stack Overflow, 93.9% of respondents used git for version control, compared to 69.3% in 2015. In the same survey, JavaScript was the most used language by far, used by 65.8% of respondents. For someone like me who's been in technology for over 25 years and developing software as a major part of my job since 2008, the change in both languages and methods of development was nothing short of revolutionary.

The increased speed of development, widespread adoption of open source, and extensive usage of third-party libraries also brought with them a set of problems unique in the history of software development. As a developer, I can now create sophisticated programs at light speed without having to reinvent the wheel, thanks to the use of external libraries. However, it also makes me nervous when I dig into the package source and see 2,000 folders in the node_modules/ directory. And that's really the crux of the issue: attacks on software supply chains will only become more prevalent and harder to detect over time, and it's becoming nearly impossible for even well-staffed development teams to keep a handle on their application dependencies. As we've seen before with the left-pad package saga, wittingly or unwittingly, a single developer can inflict a huge amount of damage. Some organizations have tried freezing their dependency chains at a certain "known good" version to combat the issue, but that's at best a stopgap solution. Ultimately, if a group of nation-state actors or blackhats had both the patience and the programming chops to pull off a "low and slow" attack - gaining the trust of package maintainers by committing quality code, being active in the communities over a long period of time, and spreading malicious code out over multiple repositories such that it's only functional when all the pieces are combined on a given system - it would be nearly impossible to prevent.
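For anyone who hasn't seen what freezing at a "known good" version looks like in practice, here's a minimal sketch in Python. The idea is to record a SHA-256 digest of each artifact at the time you reviewed it, then refuse anything that doesn't match. The function names, the file name, and the in-memory "lockfile" dictionary are all made up for illustration; real package managers have their own lockfile formats and verification steps.

```python
import hashlib
import hmac
from typing import Mapping


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(name: str, data: bytes, pinned: Mapping[str, str]) -> bool:
    """Return True only if `name` is pinned and its digest matches the pin.

    `pinned` is a hypothetical lockfile: artifact name -> hex digest
    recorded when that exact artifact was reviewed.
    """
    expected = pinned.get(name)
    if expected is None:
        # Anything we never reviewed is rejected rather than trusted.
        return False
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(data), expected)


if __name__ == "__main__":
    # Stand-in bytes for a release tarball; the name is illustrative only.
    reviewed = b"contents of the release we audited"
    lockfile = {"example-lib-1.0.0.tar.gz": sha256_hex(reviewed)}

    print(verify_artifact("example-lib-1.0.0.tar.gz", reviewed, lockfile))
    print(verify_artifact("example-lib-1.0.0.tar.gz", reviewed + b"!", lockfile))
    print(verify_artifact("example-lib-2.0.0.tar.gz", reviewed, lockfile))
```

Note what this does and doesn't buy you: a matching hash only proves the bytes haven't changed since you pinned them. If the artifact was already compromised when it was pinned, the check passes happily, which is exactly why freezing is a stopgap.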

As the xz backdoor has shown, it's entirely possible that this is already happening, and sophisticated backdoors could already be in place in packages considered "critical software infrastructure", e.g., OpenSSH and/or Linux kernel modules. It's long been known that a sizable portion of the global tech infrastructure is maintained by volunteers who do it simply because they're passionate about their craft (h/t xkcd.org):

It's especially shameful how many tech companies have made their vast fortunes on the backs of these developers while placating the community by hiring a few programmers or donating a token amount of money to open source projects. The open source software revolution in general (cue Cathedral & Bazaar reference) has been the engine powering the rise of the Internet, smartphones, and the Internet of Things, and is largely responsible for kicking off the ongoing revolution that is AI and LLMs. But as any tech founder can relate, "move fast and break stuff" isn't really compatible with code audits and the principle of least privilege. We've made HUGE advancements in the 21st century, including changing our entire software development model multiple times, but we've arrived at a point where our future success will depend on our ability to secure what we've built and keep it secure going forward. In my opinion, this is where the rubber meets the road in the debate about AI being used for good or evil. We're faced with a massive challenge; time will tell if we're able to rise to it.

Tip of the Hat, Wag of the Finger

Tip of the Hat to the sources referenced above, from which all the other information in this post flows. Big time Tip of the Hat to Andres Freund for discovering and analyzing the xz backdoor:

And I know I'm not the only person to feel this way, but the biggest Tip of the Hat in this post rightfully belongs to Lasse Collin (Larhzu on GitHub). In addition to being completely innocent of anything surrounding the backdoor, he thanklessly maintained an unexciting, but crucial, package that's installed in pretty much every Linux distro in existence. This included having to defend himself against complaints that he wasn't working fast enough while dealing with issues in his personal life. Reminder: this was NOT Lasse's full-time job, nor was he paid at all as far as I'm aware.

And now a Wag of the Finger to GitHub for closing down the xz repository and suspending the accounts involved, including Lasse Collin's account, which is absolutely shameful.

GitHub: restore his account NOW!

A second Wag of the Finger (or maybe just "the finger") to all the armchair admins out there who will take the time to criticize open source maintainers - who, again, for those who are new to this, are not paid a single dime for their work - for not working fast enough, for not implementing whichever major feature would make their (the admins') job easier this week, on and on and on. Hey, fellow admins - if the package maintainers aren't working fast enough for you, why don't you learn to fscking code and fix it yourself? Here, let me help you get started: