www.peterbe.com
Submitted URL: http://www.peterbe.com/
Effective URL: https://www.peterbe.com/
Submission: On October 27 via manual from US — Scanned from DE
PETERBE.COM
Peter Bengtsson's blog

COMPARING DIFFERENT EFFORTS WITH WEBP IN SHARP
October 5, 2023 | 0 comments | Node, JavaScript

When you, in a Node program, use sharp to convert an image buffer to a WebP buffer, you have an option called effort. The higher the number, the longer the conversion takes, but the smaller the resulting image. I wanted to put some realistic numbers on this, so I wrote a benchmark and ran it on my Intel MacBook Pro.

THE BENCHMARK

It looks like this:

    import fs from "fs/promises";
    import sharp from "sharp";

    async function e6() {
      return await f("screenshot-1000.png", 6);
    }
    async function e5() {
      return await f("screenshot-1000.png", 5);
    }
    async function e4() {
      return await f("screenshot-1000.png", 4);
    }
    async function e3() {
      return await f("screenshot-1000.png", 3);
    }
    async function e2() {
      return await f("screenshot-1000.png", 2);
    }
    async function e1() {
      return await f("screenshot-1000.png", 1);
    }
    async function e0() {
      return await f("screenshot-1000.png", 0);
    }

    // Convert the PNG at `fp` to WebP with the given effort and
    // return the resulting size in bytes, the width, and the options used.
    async function f(fp, effort) {
      const originalBuffer = await fs.readFile(fp);
      const image = sharp(originalBuffer);
      const { width } = await image.metadata();
      const buffer = await image.webp({ effort }).toBuffer();
      return [buffer.length, width, { effort }];
    }

Then, I ran each function in serial and measured how long it took. Then, I did that whole thing 15 times. So, in total, each function is executed 15 times. The numbers are collected and the median (P50) is reported.

A 2000X2000 PIXEL PNG IMAGE

1. e0: 191ms 235KB
2. e1: 340.5ms 208KB
3. e2: 369ms 198KB
4. e3: 485.5ms 193KB
5. e4: 587ms 177KB
6. e5: 695.5ms 177KB
7. e6: 4811.5ms 142KB

What it means is that if you use {effort: 6}, the conversion of a 2000x2000 PNG took 4.8 seconds, but the resulting WebP buffer became 142KB instead of the 235KB you get with the least effort.

This graph demonstrates how the (blue) time goes up the more effort you put in, and how the final size (red) goes down the more effort you put in.

A 1000X1000 PIXEL PNG IMAGE

1. e0: 54ms 70KB
2. e1: 60ms 66KB
3. e2: 65ms 61KB
4. e3: 96ms 59KB
5. e4: 169ms 53KB
6. e5: 193ms 53KB
7. e6: 1466ms 51KB

A 500X500 PIXEL PNG IMAGE

1. e0: 24ms 23KB
2. e1: 26ms 21KB
3. e2: 28ms 20KB
4. e3: 37ms 19KB
5. e4: 57ms 18KB
6. e5: 66ms 18KB
7. e6: 556ms 18KB

CONCLUSION

Up to you, but clearly {effort: 6} is to be avoided if you're worried about the conversion taking a huge amount of time.

Perhaps the takeaway is that if you run these operations in a build step, so that you never have to do them again, it's worth the maximum effort. Beyond that, find a sweet spot for your particular environment and challenge.

Please post a comment if you have thoughts or questions

ZIPPING FILES IS APPENDING BY DEFAULT - WATCH OUT!
October 4, 2023 | 0 comments | Linux

This is not a bug in the age-old zip Linux program. It's maybe a bug in its intuitiveness.

I have a piece of automation that downloads a zip file from a file storage cache (GitHub Actions actions/cache in this case). Then, it unpacks it, and plucks some of the files from it into another fresh new directory. Lastly, it creates a new .zip file with the same name. The same name because that way, when the process is done, it uploads the new .zip file into the file storage cache. But be careful; does it really create a new .zip file?

To demonstrate the surprise:

    $ cd /tmp/
    $ mkdir somefiles
    $ touch somefiles/file1.txt
    $ touch somefiles/file2.txt
    $ zip -r somefiles.zip somefiles
      adding: somefiles/ (stored 0%)
      adding: somefiles/file1.txt (stored 0%)
      adding: somefiles/file2.txt (stored 0%)

Now we have a somefiles.zip to work with. It has 2 files in it.

Next session. Let's say it's another day with a fresh new /tmp directory, and the previous somefiles.zip has been downloaded from the first session. This time we want to create a new somefiles directory that contains only file2.txt from before and a new file, file3.txt.
    $ rm -fr somefiles
    $ unzip somefiles.zip
    Archive:  somefiles.zip
       creating: somefiles/
     extracting: somefiles/file1.txt
     extracting: somefiles/file2.txt
    $ rm somefiles/file1.txt
    $ touch somefiles/file3.txt
    $ zip -r somefiles.zip somefiles
    updating: somefiles/ (stored 0%)
    updating: somefiles/file2.txt (stored 0%)
      adding: somefiles/file3.txt (stored 0%)

And here comes the surprise. Let's peek into the newly zipped up somefiles.zip (which was made from the somefiles/ directory, which only contained file2.txt and file3.txt):

    $ rm -fr somefiles
    $ unzip -l somefiles.zip
    Archive:  somefiles.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
            0  2023-10-04 16:06   somefiles/
            0  2023-10-04 16:05   somefiles/file1.txt
            0  2023-10-04 16:06   somefiles/file2.txt
            0  2023-10-04 16:06   somefiles/file3.txt
    ---------                     -------
            0                     4 files

I did not see that coming! The command zip -r somefiles.zip somefiles/ doesn't create a fresh new .zip file based on recursively walking the somefiles directory. It does an append by default: file1.txt, which was deleted from the directory, is still in the archive.

The solution is easy. Right before the zip -r somefiles.zip somefiles command, do a rm somefiles.zip.

INTRODUCING HYLITE - A NODE CODE-SYNTAX-TO-HTML HIGHLIGHTER WRITTEN IN BUN
October 3, 2023 | 0 comments | Node, Bun, JavaScript

hylite is a command line tool for syntax highlighting code into HTML. You feed it a file or some snippet of code (plus what language it is) and it returns a string of HTML.
Suppose you have:

    ❯ cat example.py
    # This is example.py
    def hello():
        return "world"

When you run this through hylite you get:

    ❯ npx hylite example.py
    <span class="hljs-keyword">def</span> <span class="hljs-title function_">hello</span>():
        <span class="hljs-keyword">return</span> <span class="hljs-string">"world"</span>

Now, if installed with the necessary CSS, it can finally render this:

    # This is example.py
    def hello():
        return "world"

(Note: At the time of writing this, npx hylite --list-css or npx hylite --css don't work unless you've git cloned the github.com/peterbe/hylite repo)

HOW I USE IT

This originated because I loved how highlight.js works. It supports numerous languages, can even guess the language, is fast as heck, and the HTML output is compact.

Originally, my personal website, whose backend is in Python/Django, was using Pygments to do the syntax highlighting. The problem with that is it doesn't support JSX (or TSX). For example:

    export function Bell({ color }: {color: string}) {
      return <div style={{ backgroundColor: color }}>Ding!</div>
    }

The catch is that Python != Node, so to call out to hylite I use a sub-process. At the moment, I can't use bunx or npx because that depends on $PATH and stuff that the server doesn't have. Here's how I call hylite from Python:

    command = settings.HYLITE_COMMAND.split()
    assert language
    command.extend(["--language", language, "--wrapped"])
    process = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        cwd=settings.HYLITE_DIRECTORY,
    )
    process.stdin.write(code)
    output, error = process.communicate()

The settings are:

    HYLITE_DIRECTORY = "/home/django/hylite"
    HYLITE_COMMAND = "node dist/index.js"

HOW I BUILT HYLITE

What's different about hylite compared to other JavaScript packages and CLIs like this is that the development requires Bun.
It's lovely because it has a built-in test runner and a TypeScript transpiler, and it's just so lovely and fast at starting for anything you do with it.

In my current view, I see Bun as an equivalent of TypeScript. It's convenient when developing, but once stripped away it's just good old JavaScript and you don't have to worry about compatibility.

So I use bun for manual testing, like bun run src/index.ts < foo.go, but when it comes time to ship, I run bun run build (which executes, with bun, the src/build.ts), which then builds a dist/index.js file that you can run with either node or bun anywhere.

By the way, the README has a section on Benchmarking. It concludes two things:

1. node dist/index.js has the same performance as bun run dist/index.js
2. bunx hylite is 7 times faster than npx hylite, but it's bullcrap because bunx doesn't check the network for a new version (...until you restart your computer)

SHALLOW CLONE VS. DEEP CLONE, IN NODE, WITH BENCHMARK
September 29, 2023 | 0 comments | Node, JavaScript

A very common way to create a "copy" of an Object in JavaScript is to copy all things from one object into an empty one. Example:

    const original = {foo: "Foo"}
    const copy = Object.assign({}, original)
    copy.foo = "Bar"
    console.log([original.foo, copy.foo])

This outputs

    [ 'Foo', 'Bar' ]

Obviously the problem with this is that it's a shallow copy, best demonstrated with an example:

    const original = { names: ["Peter"] }
    const copy = Object.assign({}, original)
    copy.names.push("Tucker")
    console.log([original.names, copy.names])

This outputs:

    [ [ 'Peter', 'Tucker' ], [ 'Peter', 'Tucker' ] ]

which is arguably counter-intuitive, especially since the variable was named "copy".

Generally, I think Object.assign({}, someThing) is often a red flag because, if not today, maybe in some future the thing you're copying might have mutables within.

The "solution" is to use structuredClone which has been available since Node 16.
Actually, it was introduced within minor releases of Node 16, so be a little bit careful if you're still on Node 16.

Same example:

    const original = { names: ["Peter"] };
    // const copy = Object.assign({}, original);
    const copy = structuredClone(original);
    copy.names.push("Tucker");
    console.log([original.names, copy.names]);

This outputs:

    [ [ 'Peter' ], [ 'Peter', 'Tucker' ] ]

Another deep copy solution is to turn the object into a string, using JSON.stringify, and turn it back into a (deeply copied) object using JSON.parse. It works like structuredClone but is full of caveats, such as unpredictable precision loss on floating point numbers, not to mention date objects ceasing to be date objects and instead becoming strings.

BENCHMARK

structuredClone is "better" in that it's more intuitive and therefore less dangerous for sneaky nested mutation bugs. But is it fast? Before even running a benchmark: no, structuredClone is slower than Object.assign({}, ...) because of course. It does more! Perhaps the question should be: how much slower is structuredClone?
Here's my benchmark code:

    import fs from "fs"
    import assert from "assert"

    import Benchmark from "benchmark"

    const obj = JSON.parse(fs.readFileSync("package-lock.json", "utf8"))

    // Shallow copy
    function f1() {
      const copy = Object.assign({}, obj)
      copy.name = "else"
      assert(copy.name !== obj.name)
    }

    // Deep copy with structuredClone
    function f2() {
      const copy = structuredClone(obj)
      copy.name = "else"
      assert(copy.name !== obj.name)
    }

    // Deep copy with a JSON round trip
    function f3() {
      const copy = JSON.parse(JSON.stringify(obj))
      copy.name = "else"
      assert(copy.name !== obj.name)
    }

    new Benchmark.Suite()
      .add("f1", f1)
      .add("f2", f2)
      .add("f3", f3)
      .on("cycle", (event) => {
        console.log(String(event.target))
      })
      .on("complete", function () {
        console.log("Fastest is " + this.filter("fastest").map("name"))
      })
      .run()

The results:

    ❯ node assign-or-clone.js
    f1 x 8,057,542 ops/sec ±0.84% (93 runs sampled)
    f2 x 37,245 ops/sec ±0.68% (94 runs sampled)
    f3 x 37,978 ops/sec ±0.85% (92 runs sampled)
    Fastest is f1

In other words, Object.assign({}, ...) is roughly 200 times faster than structuredClone.

By the way, I re-ran the benchmark with a much smaller object (using the package.json instead of the package-lock.json) and then Object.assign({}, ...) is only 20 times faster.

Mind you! They're both ridiculously fast in the grand scheme of things.

If you do this...

    for (let i = 0; i < 10; i++) {
      console.time("f1")
      f1()
      console.timeEnd("f1")

      console.time("f2")
      f2()
      console.timeEnd("f2")

      console.time("f3")
      f3()
      console.timeEnd("f3")
    }

the last bit of output of that is:

    f1: 0.006ms
    f2: 0.06ms
    f3: 0.053ms

which means that it took 0.06 milliseconds for structuredClone to make a convenient deep copy of an object that is 5KB as a JSON string.

CONCLUSION

Yes, Object.assign({}, ...) is ridiculously faster than structuredClone, but structuredClone is a better choice.
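
To make the JSON round-trip caveats mentioned above a bit more concrete, here is a minimal sketch that compares what survives each deep-copy approach. The object and its property names (when, tags, missing) are made up for illustration, and it assumes a Node version where structuredClone is available as a global.

```javascript
// Hypothetical object with values that JSON cannot represent faithfully.
const original = {
  when: new Date(0),
  tags: new Set(["a"]),
  missing: undefined,
};

const viaJson = JSON.parse(JSON.stringify(original));
const viaClone = structuredClone(original);

// Dates: JSON turns them into ISO strings, structuredClone keeps Date objects.
console.log(typeof viaJson.when); // string
console.log(viaClone.when instanceof Date); // true

// Sets: JSON serializes a Set as an empty object, structuredClone keeps the Set.
console.log(viaJson.tags); // {}
console.log(viaClone.tags instanceof Set); // true

// Properties whose value is undefined are dropped by JSON.stringify.
console.log("missing" in viaJson); // false
console.log("missing" in viaClone); // true
```

So even though the two approaches benchmark about the same here, they are not interchangeable as soon as the object holds anything other than plain JSON-compatible values.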

PIP-OUTDATED.PY WITH INTERACTIVE UPGRADE
September 21, 2023 | 0 comments | Python

Last year I wrote a nifty script called Pip-Outdated.py (see "Pip-Outdated.py - a script to compare requirements.in with the output of pip list --outdated"). It basically runs pip list --outdated but filters based on the packages mentioned in your requirements.in. For people familiar with Node, it's like checking whether any of the packages installed in node_modules have upgrades, but filtered down to only those mentioned in your package.json.

I use this script often enough that I added a little interactive input that asks, for each possible upgrade, whether it should edit requirements.in for you. It looks like this:

    ❯ Pip-Outdated.py
    black             INSTALLED: 23.7.0    POSSIBLE: 23.9.1
    click             INSTALLED: 8.1.6     POSSIBLE: 8.1.7
    elasticsearch-dsl INSTALLED: 7.4.1     POSSIBLE: 8.9.0
    fastapi           INSTALLED: 0.101.0   POSSIBLE: 0.103.1
    httpx             INSTALLED: 0.24.1    POSSIBLE: 0.25.0
    pytest            INSTALLED: 7.4.0     POSSIBLE: 7.4.2

    Update black from 23.7.0 to 23.9.1? [y/N/q] y
    Update click from 8.1.6 to 8.1.7? [y/N/q] y
    Update elasticsearch-dsl from 7.4.1 to 8.9.0? [y/N/q] n
    Update fastapi from 0.101.0 to 0.103.1? [y/N/q] n
    Update httpx from 0.24.1 to 0.25.0? [y/N/q] n
    Update pytest from 7.4.0 to 7.4.2? [y/N/q] y

and then,

    ❯ git diff requirements.in | cat
    diff --git a/requirements.in b/requirements.in
    index b7a246e..0e996e5 100644
    --- a/requirements.in
    +++ b/requirements.in
    @@ -9,7 +9,7 @@ python-decouple==3.8
     fastapi==0.101.0
     uvicorn[standard]==0.23.2
     selectolax==0.3.16
    -click==8.1.6
    +click==8.1.7
     python-dateutil==2.8.2
     gunicorn==21.2.0
     # I don't think this needs `[secure]` because it's only used by
    @@ -18,7 +18,7 @@ requests==2.31.0
     cachetools==5.3.1
     # Dev things
    -black==23.7.0
    +black==23.9.1
     flake8==6.1.0
    -pytest==7.4.0
    +pytest==7.4.2
     httpx==0.24.1

That's it.
Then if you want to actually make these upgrades you run:

    ❯ pip-compile --generate-hashes requirements.in && pip install -r requirements.txt

To install it, download the script from https://gist.github.com/peterbe/a2b158c39f1f835c0977c82befd94cdf, put it in your ~/bin, and make it executable.
Now go into a directory that has a requirements.in and run Pip-Outdated.py

PARSE A CSV FILE WITH BUN
September 13, 2023 | 0 comments | Bun

I'm really excited about Bun and look forward to trying it out more and more.

Today I needed a quick script to parse a CSV file to compute some simple arithmetic on some numbers in it.

To do that, here's what I did:

    bun init
    bun install csv-simple-parser
    code index.ts

And the code:

    import parse from "csv-simple-parser";

    console.time("total");
    const numbers: number[] = [];
    const file = Bun.file(process.argv.slice(2)[0]);
    type Rec = {
      Pageviews: string;
    };
    const csv = parse(await file.text(), { header: true }) as Rec[];
    for (const row of csv) {
      numbers.push(parseInt(row["Pageviews"] || "0", 10));
    }
    console.timeEnd("total");
    console.log("Mean  ", numbers.reduce((a, b) => a + b, 0) / numbers.length);
    // Sort numerically; the default sort() compares elements as strings.
    console.log(
      "Median",
      numbers.sort((a, b) => a - b)[Math.floor(numbers.length / 2)]
    );

And running it:

    ❯ wc -l file.csv
       13623 file.csv

    ❯ /usr/bin/time bun run index.ts file.csv
    [8.20ms] total
    Mean   7.205534757395581
    Median 1
            0.04 real         0.03 user         0.01 sys

(On my Intel MacBook Pro...)

Reading in the file and parsing the 13k lines took 8.2 milliseconds. The whole execution took 0.04 seconds. Pretty neat.

© peterbe.com 2003 - 2023