A blog about software development, written by Daniel Diekmeier. Archive.

Streaming Files from SvelteKit

November 6, 2023

Disclaimer: I’m pretty sure this only applies if you use @sveltejs/adapter-node, because it uses Node.js-specific APIs.

For Eintrittskarten.io, I wanted to build an endpoint that returns access-controlled files. I didn’t want to load the complete files into RAM at once; instead, I looked for a way to stream the files from the server to the client.

SvelteKit does not have this “built in”, but I put the following code together and it seemed to work fine:

import fs from 'node:fs'
import path from 'node:path'
import { Readable } from 'node:stream'

// MY_UPLOADS_PATH and `template` come from elsewhere in the app.
export async function GET({ params, url, locals }) {
  const filePath = path.join(MY_UPLOADS_PATH, `${params.file}-${template}.pdf`)
  // DO NOT COPY THIS INTO YOUR APP. READ ON.
  const fileStream = Readable.toWeb(fs.createReadStream(filePath)) as BodyInit
  return new Response(fileStream)
}

Create a good old Node.js ReadStream, use the (at the time of this writing) “experimental” Readable.toWeb to transform it into a web ReadableStream, and send it off in the Response. Easy! Too easy!

After deploying this, every now and then, I’d get this error delivered to my Sentry instance:

TypeError: Invalid state: Controller is already closed
  File "node:internal/errors", line 406, col 5, in new NodeError
  File "node:internal/webstreams/readablestream", line 1056, col 13, in ReadableStreamDefaultController.close
  File "node:internal/webstreams/adapters", line 454, col 16, in ReadStream.<anonymous>
  File "node:internal/util", line 531, col 12, in ReadStream.<anonymous>
  File "node:internal/streams/end-of-stream", line 162, col 14, in ReadStream.onclose
  File "node:events", line 514, col 28, in ReadStream.emit
  File "node:domain", line 488, col 12, in ReadStream.emit
  File "node:internal/streams/destroy", line 132, col 10, in emitCloseNT
  File "node:internal/process/task_queues", line 81, col 21, in process.processTicksAndRejections

I could not make sense of it. This error would of course be more helpful if my application code appeared somewhere in the stack trace, but no luck.

Googling for it brought me to (unrelated issues in) Undici. Undici is the new HTTP client in Node.js, so I understand that it’s related to my trying to stream a file, but it was completely unclear to me why this error was happening or what I could do to prevent it. I was also never able to reproduce the error locally.

I think what is happening is this: if a user stops the file download mid-stream (e.g. by closing the browser), the stream is closed once, but when the Node.js stream has finished reading the whole file from disk, it tries to close the stream again – leading to the error. But this is just a guess.
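If that guess is correct, the failure is easy to provoke in isolation. This is a minimal sketch (requiring Node 18+, where ReadableStream is global), not the actual adapter code:

```javascript
// Closing a web ReadableStream's controller a second time is an
// invalid state transition – the same TypeError from the Sentry report.
function closeTwice() {
  new ReadableStream({
    start(controller) {
      controller.close()
      controller.close() // throws: Invalid state: Controller is already closed
    },
  })
}

try {
  closeTwice()
} catch (error) {
  console.log(error instanceof TypeError) // true
}
```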

Anyway, after clicking deeper and deeper into the Undici source code, I found a nice convenience (which at first looked like a terrible hack) that you can/should use instead:

-import { Readable } from 'node:stream'
 
 export async function GET({ params, url, locals }) {
   const filePath = path.join(MY_UPLOADS_PATH, `${params.file}-${template}.pdf`)
-  const fileStream = Readable.toWeb((fs.createReadStream(filePath))) as BodyInit
+  const fileStream = fs.createReadStream(filePath) as unknown as BodyInit
   return new Response(fileStream)
 }

You can pass a Node.js stream directly to Response, because Undici has a special case for this! (Even though this is not part of the spec for Response.)

(Yes, I’m sorry about the as unknown as BodyInit, but it does work. Something is wrong with the types.)

I haven't had the error pop up in Sentry even a single time since deploying this change. So, I guess by using the platform, SvelteKit does have streaming built in. Really makes you think!

Finally Understanding finally in JavaScript

May 7, 2023

I've been writing JavaScript since around 1973, but I've never understood the point of the finally keyword. The following two implementations of coolFunction are equivalent, so why bother with the additional curly braces?

function coolFunction() {
  try {
    thisFunctionThrows()
  } catch (error) {
    console.log('whoops')
  } finally {
    console.log('dont worry, i got u')
  }
}

function coolFunction() {
  try {
    thisFunctionThrows()
  } catch (error) {
    console.log('whoops')
  }

  console.log('dont worry, i got u')
}

// both of these print the same thing:
coolFunction()
// 'whoops'
// 'dont worry, i got u'

What I didn't understand was that you can use it to run some other code at the last moment before leaving the function – for example, when you return early or throw an error. This comes in handy when you need to clean up other resources, no matter how you leave the function.
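To make that concrete, here is a tiny self-contained sketch (the resource counter is purely illustrative): the decrement in finally runs whether the callback returns normally or throws.

```javascript
let openResources = 0

function withResource(fn) {
  openResources++
  try {
    return fn()
  } finally {
    // Runs on normal return, early return, and throw alike.
    openResources--
  }
}

try {
  withResource(() => { throw new Error('boom') })
} catch (error) {
  // The error still propagates – finally doesn't swallow it.
}

console.log(openResources) // 0 – the cleanup ran despite the throw
```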

While this can be extremely useful, finally is also not completely straightforward. For example, the order of execution gets a little weird. Consider this example, which surprised me:

function hello() {
  try {
    console.log('hello')
    return 'world'
  } finally {
    console.log('finally')
  }
}

console.log(hello())
// 'hello'
// 'finally'
// 'world'

And while I'm at it, this double-return also feels weird:

function hello() {
  try {
    return "world"
  } finally {
    return "finally"
  }
}

console.log(hello())
// 'finally'
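And it gets weirder still: a return inside finally doesn't just beat a return from try – it even swallows a thrown error completely. A small sketch:

```javascript
function risky() {
  try {
    throw new Error('boom')
  } finally {
    // Returning from finally discards the in-flight exception.
    return 'all good?'
  }
}

console.log(risky()) // 'all good?' – the error is never seen by the caller
```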

Maybe don't overdo the esoteric stuff.

Time for a Practical Use Case

Here's an example from Eintrittskarten.io (edited for length and width) where we render a PDF from a URL:

async function renderPDF(url, filePath) {
  const browser = await puppeteer.launch({ headless: true })
  const page = await browser.newPage()
  const response = await page.goto(url)
  await page.pdf({ path: filePath })
  await browser.close()
}

Rendering PDFs from URLs is a thankless job. In the past, we accidentally sent automated emails to which we had attached a PDF of our webapp showing an error message. (I actually find this hilarious.) To protect us from these mistakes in the future, we added a check that throws an error if we don't get a response from the server, or if the HTTP status is not 200.

async function renderPDF(url, filePath) {
  const browser = await puppeteer.launch({ headless: true })
  const page = await browser.newPage()
  const response = await page.goto(url)

+ if (!response) {
+   throw new Error(`PDF Renderer: No response from ${url}`)
+ }

+ if (response.status() !== 200) {
+   throw new Error(
+     `PDF Renderer: Expected page status to be 200, ` +
+     `was ${response.status()} for ${url}`
+   )
+ }

  await page.pdf({ path: filePath })
  await browser.close()
}

But sadly, bugs are often fractal: while this check does save us from sending incorrect PDFs to customers, it introduces a new bug. Whenever we throw an error, the browser we start with puppeteer.launch(...) does not get closed. This can be a problem if your server does not have infinite RAM or is expected to work. Sadly, both of these are true for us.

Combined with automatic retry in the case of errors, I built a magnificent machine that crashes itself whenever there is a problem with the PDFs. Annoyingly, this bug only surfaced while I was at a ramen restaurant, trying to enjoy a nice bowl of Tantan Ramen while rebooting the server from my iPhone.

This is when the usefulness of finally finally dawned on me. This is the perfect use case: I want to make sure that I close the browser, but I also want to be able to throw errors if something goes wrong!

async function renderPDF(url, filePath) {
  const browser = await puppeteer.launch({ headless: true })

+ try {
    const page = await browser.newPage()
    const response = await page.goto(url)

    if (!response) {
      throw new Error(`PDF Renderer: No response from ${url}`)
    }

    if (response.status() !== 200) {
      throw new Error(
        `PDF Renderer: Expected page status to be 200, ` +
        `was ${response.status()} for ${url}`
      )
    }

    await page.pdf({ path: filePath })
+ } finally {
+   // We have to close the browser or it will
+   // keep running in the background.
    await browser.close()
+ }
}

In my local testing, this worked great. (Thankfully, the bug was very reproducible, so I'm pretty confident I fixed it.) This feels pretty good! I'm excited to find out which even tinier bug hides in the new code, but I'll be ready whenever it shows itself.

Using the Built In Test Runner to make Node.js projects more sustainable

May 5, 2023

TL;DR: I converted four old Node.js projects to use the brand new built in test runner. This removed hundreds of dependencies and will make it easier to maintain these projects in the future.

I own a few npm packages that basically nobody uses. Despite this fact, Github sends me a lot of Dependabot alerts about critical security problems in some ancient version of minimist in a dependency of a dev dependency of a package I barely use.

I don't really care. This adds a lot of noise to my Github life and makes me more likely to miss important alerts. I already turned off Dependabot Alerts for a lot of old projects – especially if I don't use them myself.

Also, I'm currently reading Sustainable Web Development with Ruby on Rails by David Bryant Copeland. A big topic is the carrying cost of decisions you make while building your projects. (The book is very insightful and I recommend reading it.)

A carrying cost in this case is additional work you have to put in while trying to work on the project. For example, updating dependencies when they stop working or have security problems. Or fixing tests that break because of a change in a dependency. This is closely related to technical debt, but not quite the same. Technical debt is like paying off a loan you took out in the past. Carrying costs are more like the rent you pay for the house you live in.

One of my takeaways from the book is that whenever possible, you should try to make decisions that reduce the carrying cost of your project.

While not enormous, the carrying cost of my old Node.js projects is also not zero. I have to deal with Dependabot alerts, and I have to make sure that the tests still run.

Thankfully, the stars have finally aligned to make this a bit easier: Node.js v20 has just been released, and with it, Node.js now has a stable built in test runner: node:test.

I think this is great news. In my older packages, I often installed the Ava test runner. This pulled in about 300 gazillion transitive dependencies, which, down the line, would get marked as terrible security problems by Dependabot. With the new node:test module, I was able to remove this dependency completely.

The changes boiled down to this:

# Switching from ava to node:test
- import test from 'ava'
+ import { test } from 'node:test'
+ import assert from 'node:assert'

# Switching from t.is to assert.strictEqual
  test('converts to string', (t) => {
   const result = Spoons.convert(1)
-  t.is(result, '1 tsp')
+  assert.strictEqual(result, '1 tsp')
  })

And running the tests is now as easy as node --test.

So, over the course of the weekend, I dusted off four old projects and migrated them to modern alternatives.

  • @danieldiekmeier/async-worker: I converted the project to ESM and removed all dependencies. Commit 8b24337, +40/-4,468 lines
  • todo-or-die: I converted the project to ESM and removed all dependencies. Commit 2e6b948, +39/-4,588 lines
  • Salt Bae: I switched from Parcel to SvelteKit and from Ava to the built in test runner. Commit d986fa0, +671/-5,956 lines, 3b4eb4b, +5,987/-8,600 lines
  • Spotify Latest Releases: I switched from Vue, Koa and Webpack to SvelteKit. I moved from axios to native fetch and from moment.js to plain Date. Commit 5100758, +1,330/-5,011 lines

Most of these deletions (around 20,000 lines!) stem from package-lock.json files (or equivalent files from yarn or pnpm). This is amazing! It means I probably have hundreds, if not thousands, fewer dependencies than when I started.

I don't know what the future will bring, but I'm pretty sure that I will have to do less work to keep these projects running. If you have small Node.js projects that don't benefit from dedicated test runners, I recommend you give node:test a try.

Why does RuboCop want me to use has_many :through instead of has_and_belongs_to_many?

March 19, 2023

A few weeks ago, I had to implement a Label feature. Picture Github’s labels for Issues and Pull Requests:

Github Labels

This is a classic many-to-many relationship: A label can be assigned to many issues, and an issue can have many labels.

For this project, we use Rails, so I created a Label model and connected it to our existing Issue model with has_and_belongs_to_many:

class Label < ApplicationRecord
  has_and_belongs_to_many :issues
end

class Issue < ApplicationRecord
  has_and_belongs_to_many :labels
end

This worked fine to create the many-to-many relationship I was after. I implemented the UI and was ready to call it a day.

I’m not sure how I missed it for so long, but I finally noticed that RuboCop complained about the has_and_belongs_to_many. According to the rule Rails/HasAndBelongsToMany, you should never use it and always prefer has_many ... through.

I found that surprising! The rule itself does not explain a reason, so I looked around and found a few answers on StackOverflow.

In most cases, people were concerned that has_many :through will probably be needed everywhere at some point, which makes it the future-proof choice:

You should use has_many :through if you need validations, callbacks, or extra attributes on the join model.

From my experience it's always better to use has_many :through because you can add timestamps to the table.

If you decided to use has_and_belongs_to_many, and want to add one simple datapoint or validation 2 years down the road, migrating this change will be extremely difficult and bug-prone. To be safe, default to has_many :through

While writing this post, I found the corresponding entry in the Rails Style Guide, which indeed explains:

Using has_many :through allows additional attributes and validations on the join model.

For myself, I ended up with the following conclusion: If you ever want to talk about the relationship itself (and you probably will!), you should use has_many :through. I expect this will even help me outside of Rails, because many-to-many relationships are a common pattern in many projects.

I rewrote my code to has_many :through and it was no problem – especially because RuboCop helped me to catch this so early. I even ended up adding timestamps to the join table, which wouldn’t have been possible with has_and_belongs_to_many.
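For completeness, the rewrite looked roughly like this – a sketch, where IssueLabel is my stand-in name for the join model (the real project may call it something else):

```ruby
class Label < ApplicationRecord
  has_many :issue_labels, dependent: :destroy
  has_many :issues, through: :issue_labels
end

# The explicit join model – this is what makes timestamps,
# validations and extra attributes possible later on.
class IssueLabel < ApplicationRecord
  belongs_to :issue
  belongs_to :label
end

class Issue < ApplicationRecord
  has_many :issue_labels, dependent: :destroy
  has_many :labels, through: :issue_labels
end
```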

Deploying SvelteKit on Uberspace

February 19, 2023

This blog is built with SvelteKit and their Static Adapter. Running npm run build gives me a bunch of what we in the business call: “files”. Now I just need to upload them somewhere.

Uberspace is my favourite web host. I’ve been using them since 2012! The Lesetagebuch has been running on Uberspace since 2013. They taught me to use the command line before I ever thought about becoming a software developer. Give them a try!

Over the years, they have made it incredibly easy to deploy a website – especially if the website only consists of some static files.

Nevertheless, there were a few things I had to figure out, and now I’m telling you about them!

Always add trailing slashes

When I first deployed the site, I could open the root https://diekmeier.de, but none of the articles. I could navigate to the articles from the root, thanks to SvelteKit’s client side navigation, but that was it. If I tried to open the URLs directly, I got a “Permission Denied” error.

Thank god this is not my first rodeo. After looking into the build output, I noticed that the different pages were saved in the form /posts/example.html. I’d assumed they’d be saved as /posts/example/index.html. Indeed, with this hint, I found the relevant part of the documentation:

You must ensure SvelteKit's trailingSlash option is set appropriately for your environment. If your host does not render /a.html upon receiving a request for /a then you will need to set trailingSlash: 'always' to create /a/index.html instead.

That’s exactly what I did, and everything started working immediately.

// +layout.ts
export const prerender = true;
export const trailingSlash = 'always';

Don’t forget your cache headers

I have a habit of running most of the sites that I manage through Lighthouse every now and again. (Probably a little too often.) After deploying the first version of this blog, Lighthouse was furious about missing cache headers on all my files.

Luckily, both SvelteKit and Uberspace are prepared for this. SvelteKit actually produces a dedicated immutable folder that contains most assets and generated files. Fittingly, Uberspace makes it incredibly easy to add custom headers for any path:

$ uberspace web header set diekmeier.de/_app/immutable \
    Cache-Control "max-age=31536000"

With this small change, all my immutable assets are cached for a year!

Automatic™ Deployment

I don’t need a Deployment Pipeline for a blog. I can execute a script, like it’s the 80s! The script builds the static files, and then uses rsync to yeet them onto my server. I love rsync. It is one of those tools in my toolbelt that becomes useful again and again and again.

# scripts/sync.sh
npm run build
rsync --verbose --recursive --delete -e ssh ./build/ danjel7@diekmeier.de:/var/www/virtual/danjel7/diekmeier.de

If I time this, it runs in just under 5 seconds:

$ time scripts/sync.sh
# lots of output later
scripts/sync.sh  5.42s user 0.58s system 128% cpu 4.665 total

Not bad!

Check all your Rails Routes at once

February 18, 2023

I think sometimes it’s a good idea to make broad, sweeping checks across your whole application. Sometimes, you want to say categorically: “This is the way we do things. Time to draw an RSpec-shaped line in the sand.”

At work, I recently stumbled across a test case that was hidden deep inside an integration test, which looked like this:

# hundreds of lines above

it "does not include umlauts" do
  expect(confirmation_path =~ URI::UNSAFE).to be_nil
end

# hundreds of lines below

This had a good historic reason: The path in question is /bestaetigen (the German word for “confirm”) and, very much on purpose, does not include an ä (even though that would be theoretically possible).

Someone almost changed it to /bestätigen once, but stopped at the last second because nobody was sure if this would work with our external CRM, HubSpot. Maybe the CRM doesn’t support umlauts? Maybe some other part of our system doesn’t? Who knows?

In any case, the developer decided to revert the change and to cement this decision with a test: This special URL should never contain an umlaut.

This is not a bad idea, but personally, I feel it’s a bit short sighted. If we’re afraid of this change, then we should probably also be afraid for all our paths!

I decided to delete the test case and add a new file:

# spec/routing/route_path_naming_spec.rb

require "rails_helper"

describe "list of all paths" do
  it "does not contain characters that need to be escaped" do
    # Get an array of all paths in the whole Rails app.
    # No, `.routes.routes` is not a typo. 🙃
    paths = Rails.application.routes.routes.map do |route|
      route.path.spec.to_s
    end

    paths.each do |path|
      expect(path).not_to match(URI::UNSAFE)
    end
  end
end

This spec runs through all our static routes in milliseconds. It even found two offenders that had slipped through, which was fun.

I’m really happy with how this turned out. I think there is value in specs that assert something about the system as a whole.

“We don’t use umlauts in paths“ could have been a rule in the README.md, or a sentence in a long forgotten Confluence document. But now, it’s part of the app and will always be enforced. This is a kind of automation I really enjoy.

Undo your last commit with git reset --soft "HEAD^" for a better™ stash workflow

January 4, 2023

At work, I switch between different Git branches all the time. Often, I’m working on a larger feature, but need to switch to some other branches for some quick fixes or because I’m reviewing someone else’s code.

When I do that, I want to save my progress to the current branch.

I know, I know: I could use git stash for this, but I don’t really like it. It feels like throwing all your clothes on that chair in the corner of your room, hoping that you’ll find them again when you need them.

Instead, I got into the habit of committing all my changes to the current branch as a WIP commit.

git add . && git commit -m "WIP"

⚠️ Note: Don’t push this to the remote repository! It’s just a temporary commit to save your current progress.

This way, I can easily switch to another branch, do whatever needs to be done, and then bring back my changes:

git reset --soft "HEAD^"

By using --soft here, the changes from the undone commit are kept and stay staged in the index, and with "HEAD^", we only undo the last commit.


Update, February 18, 2023

Timo saw this post and created a Git Alias so he can easily WIP and un-WIP his changes:

wip = !"[[ $(git log -1 --pretty=%B) == \"WIP\" ]] && (git reset --soft \"HEAD^\") || (git add . && git commit -m \"WIP\")"

At first glance, this seemed like magic outside my comfort zone (when I see [[ in bash, I’m out), but it merely checks whether your latest commit was already called WIP. If it was, it does the soft reset, otherwise it creates the commit. So you can use git wip to toggle between the two states.

Personally, I am not a big fan of toggling, so I adapted his idea into two different aliased commands:

wip = !"git add . && git commit -m \"WIP\""
unwip = !"git reset --soft \"HEAD^\" && git status --short"

Through the git status, I can even see what my working directory looks like after un-WIPping:

$ git wip
[main 2836fcf] WIP
 2 files changed, 12 insertions(+), 37 deletions(-)

$ git unwip
M  README.md
M  src/posts/2023-01-04-git-reset-soft/post.md

Compensate for Missing Changelogs with npm-diff

January 3, 2023

I need to tell you about a cool tool that I found: It’s npm-diff by Julian Gruber.

At work, we use Depfu to keep our dependencies up to date. (I guess we could also use Dependabot? But somehow Depfu seems to be a little bit less annoying? I'm actually not sure!)

Anyway. Depfu creates a pull request for each new version of your dependencies. More importantly (for this blog post), it also tries to show you the changelog. But sometimes the changelog is not very helpful. Often, Depfu can't find it at all – this happens if the dependency is part of a monorepo, if it moved, or if Depfu just doesn't feel well. In other cases, the changelog just contains a list of commits, which is noisy and hard to understand.

The Google API Ruby Client is especially bad at this. The Changelog only says that the API Client was automatically regenerated.

Wouldn't it be nice if we could see the actual changes in the new version?

An example is probably worth a thousand words. This command shows the changes between the 4.1.3 and 4.1.4 versions of the @splidejs/splide package:

npx npm-diff @splidejs/splide 4.1.3 4.1.4
--- 4.1.3/package.json	1985-10-26 09:15:00
+++ 4.1.4/package.json	1985-10-26 09:15:00
@@ -1,6 +1,6 @@
 {
   "name": "@splidejs/splide",
-  "version": "4.1.3",
+  "version": "4.1.4",
@@ -91,15 +91,16 @@
   ],
   "exports": {
     ".": {
+      "types": "./dist/types/index.d.ts",
       "require": "./dist/js/splide.cjs.js",
       "import": "./dist/js/splide.esm.js",
       "default": "./dist/js/splide.esm.js"
     },

(The actual output is a bit longer, but this is enough to get the idea.)

Now we know: It looks like the main change is that the package now exports its types. If CI passes, this looks like a safe change to me.

Bonus: If you’re using Ruby, there is also a Bundler plugin that does the same thing for Ruby Gems: https://github.com/readysteady/bundle-diff.

SvelteKit: Don't Run Transitions When Navigating

January 2, 2023

Hello! I just added a few transitions to my SvelteKit app. They run when adding or removing an element from a list. The effect is quite nice and I was (once again) impressed by how easy it is to add transitions to Svelte.

This is all it took:

<script>
import { fade } from 'svelte/transition'
</script>

{#each items as item}
  <div transition:fade>
    {item.name}
  </div>
{/each}

Now, whenever the list changes, items fade in or out.

But I noticed that the transitions also run when navigating to a different page. This was a bit unexpected.

This is especially annoying when navigating away, because the navigation only happens after the transition has finished. The navigation is delayed by a few hundred milliseconds while I look at all my list items fading away. That's so weird!

Thankfully, this was easy to fix. I just needed to add a |local modifier to the transition:

{#each items as item}
  <div transition:fade|local>
    {item.name}
  </div>
{/each}

Now I can leave pages immediately, and don’t have to wait for transitions! Incredible!