URL: https://www.da.vidbuchanan.co.uk/blog/hello-png.html



HELLO, PNG!

By David Buchanan, 16th January 2023

PNG is my favourite file format of all time. Version 1.0 of the specification
was released in 1996 (before I was born!) and the format remains widely used to
this day. I think the main reasons it stuck around for so long are:

 * It's "Good enough" at lossless image compression.
 * It builds on existing technologies (zlib/DEFLATE compression).
 * It's simple to implement (helped by the above point).
 * It supports a variety of modes and bit-depths, including "true color" (24-bit
   RGB) and transparency.
 * It isn't patented.

There are other similarly-old and similarly-ubiquitous formats (cough ZIP cough)
that are disgusting to deal with due to legacy cruft, ad-hoc extensions, spec
ambiguities, and mutually incompatible implementations. On the whole, PNG is not
like that at all, and it's mostly due to its well-thought-out design and careful
updates over the years.

I'm writing this article to fulfil my role as a PNG evangelist, spreading the
joy of good-enough lossless image compression to every corner of the internet.
Similar articles already exist, but this one is mine.

I'll be referencing the Working Draft of the PNG Specification (Third Edition)
released in October 2022 (!), but every feature I mention here should still be
present in the 1.0 spec. I'll aim to update this article once the Third Edition
releases officially.


WRITING A PNG FILE

I think the best way to get to grips with a file format is to write code for
reading or writing it. In this instance we're going to write a PNG, because we
can choose to focus on the simplest subset of PNG features.

A minimum-viable PNG file has the following structure:

PNG signature || "IHDR" chunk || "IDAT" chunk || "IEND" chunk


The PNG signature (aka "magic bytes") is defined as:

"89 50 4E 47 0D 0A 1A 0A" (hexadecimal bytes)


Or, expressed as a Python bytes literal:

b'\x89PNG\r\n\x1a\n'


These magic bytes must be present at the start of every PNG file, allowing
programs to easily detect the presence of a PNG.
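As a quick illustration, a program can sniff for a PNG by checking those first eight bytes (a minimal sketch; `looks_like_png` is a name of my own invention, not from any library):

```python
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

def looks_like_png(path):
	# A file "looks like" a PNG if it begins with the 8 signature bytes
	with open(path, "rb") as f:
		return f.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE
```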


PNG CHUNKS

After the signature, the rest of the PNG is just a sequence of Chunks. They each
have the same overall structure:

Length      - A 31-bit unsigned integer (the number of bytes in the Chunk Data field)
Chunk Type  - 4 bytes of ASCII upper or lower-case characters
Chunk Data  - "Length" bytes of raw data
CRC         - A CRC-32 checksum of the Chunk Type + Chunk Data


PNG uses Network Byte Order (aka "big-endian") to encode integers as bytes.
"31-bit" is not a typo - PNG defines a "PNG four byte integer", which is limited
to the range 0 to 2^31 - 1, to defend against the existence of C programmers.

If you're not familiar with these concepts, don't worry - Python will handle all
the encoding for us.
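For example, Python's int.to_bytes handles the big-endian conversion directly:

```python
# The length field of a 258-byte chunk, encoded in network byte order
# (most significant byte first):
encoded = (258).to_bytes(4, "big")
print(encoded.hex())  # 00000102

# ...and decoded back again, for whenever we get around to *reading* PNGs:
assert int.from_bytes(encoded, "big") == 258
```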

The Chunk Type, in our instance, will be one of IHDR, IDAT, or IEND (more on
these later).

The CRC field is a CRC-32 checksum. The spec gives a terse mathematical
definition, but we can ignore all those details and use a library to handle it
for us.

The meaning of data within a chunk depends on the chunk's type, and potentially,
context from prior chunks.
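Although this article focuses on writing PNGs, the same fixed layout makes reading a chunk just as mechanical. Here's a hedged sketch of the inverse operation (the function name `read_png_chunk` is my own, not from the spec or any library):

```python
import zlib

def read_png_chunk(stream):
	# Length and type are always present, even for empty chunks
	chunk_length = int.from_bytes(stream.read(4), "big")
	chunk_type = stream.read(4)
	chunk_data = stream.read(chunk_length)
	crc = int.from_bytes(stream.read(4), "big")
	# The CRC covers the type and data fields, but not the length
	if crc != zlib.crc32(chunk_type + chunk_data):
		raise ValueError("Chunk CRC mismatch!")
	return chunk_type, chunk_data
```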

Putting all that together, here's a Python script that generates a vaguely
PNG-shaped file:

import zlib

# https://www.w3.org/TR/2022/WD-png-3-20221025/#5PNG-file-signature
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

# https://www.w3.org/TR/2022/WD-png-3-20221025/#5Chunk-layout
def write_png_chunk(stream, chunk_type, chunk_data):
	# https://www.w3.org/TR/2022/WD-png-3-20221025/#dfn-png-four-byte-unsigned-integer
	chunk_length = len(chunk_data)
	if chunk_length > 2**31 - 1:  # This is unlikely to ever happen!
		raise ValueError("This chunk has too much chonk!")
	
	# https://www.w3.org/TR/2022/WD-png-3-20221025/#5CRC-algorithm
	# Fortunately, zlib's CRC32 implementation is compatible with PNG's spec:
	crc = zlib.crc32(chunk_type + chunk_data)

	stream.write(chunk_length.to_bytes(4, "big"))
	stream.write(chunk_type)
	stream.write(chunk_data)
	stream.write(crc.to_bytes(4, "big"))


if __name__ == "__main__":
	"""
	This is not going to result in a valid PNG file, but it's a start
	"""

	ihdr = b"\0" * 13  # TODO: populate real values!
	idat = b""  # ditto

	with open("samples/out_0.png", "wb") as f: # open file for writing
		f.write(PNG_SIGNATURE)
		write_png_chunk(f, b"IHDR", ihdr)
		write_png_chunk(f, b"IDAT", idat)
		write_png_chunk(f, b"IEND", b"")


The write_png_chunk() function is complete and fully functional. However, we
don't have any real data to put in the chunks yet, so the script's output is not
a valid PNG.

Running the unix file tool against it gives the following output:

$ file samples/out_0.png 
samples/out_0.png: PNG image data, 0 x 0, 0-bit grayscale, non-interlaced


It correctly recognises a PNG file (due to the magic bytes), and the rest of the
summary corresponds to the 13 zeroes I packed into the IHDR chunk as a
placeholder. Since we haven't populated the chunks with any meaningful data yet,
image viewers will refuse to load it and give an error (there is nothing to
load!).


IMAGE INPUT

Before we continue, we're going to need some actual image data to put inside our
PNG. Here's an example image I came up with:



Funnily enough, it's already a PNG file, but we don't have a way to read PNGs
yet - how can we get the pixel data into our script? One simple method is to
convert it into a raw bitmap, which is something ImageMagick can help us with. I
used the following command:

$ convert ./samples/hello_png_original.png ./samples/hello_png.rgb


hello_png.rgb now contains the raw uncompressed RGB pixel data, which we can
trivially read as-is from Python. For every pixel in every row, it stores a
3-byte value corresponding to the colour of that pixel. Each byte is in the
range 0-255, corresponding to the brightness of each RGB sub-pixel respectively.
To be pedantic, these values represent coordinates in the sRGB colourspace, but
that detail is not strictly necessary to understand.

This .rgb file isn't a "real" image file format, and we need to remember certain
properties to be able to make sense of it: the width and height (in this case
320x180), the pixel format (24-bit RGB, as described above), and the colourspace
(sRGB). The PNG file that we generate will contain all this metadata in its
headers, but since the input file doesn't contain them, we will hardcode the
values in our Python script.


THE IHDR (IMAGE HEADER) CHUNK

The IHDR Chunk contains the most important metadata in a PNG - and in our
simplified case, all the metadata of the PNG. It encodes the width and height of
the image, the pixel format, and a couple of other details:

Name                Size

Width               4 bytes
Height              4 bytes
Bit depth           1 byte
Colour type         1 byte
Compression method  1 byte
Filter method       1 byte
Interlace method    1 byte


There isn't much to say about it, but here's the relevant section of the spec.
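For reference, the 13-byte layout above maps neatly onto a single struct.pack call (a sketch equivalent to what the full script builds by hand):

```python
import struct

# ">" = big-endian, "II" = two 4-byte uints, "BBBBB" = five single bytes
ihdr = struct.pack(">IIBBBBB",
	320,  # width
	180,  # height
	8,    # bit depth (bits per sample)
	2,    # colour type: 2 = "Truecolour" (RGB)
	0,    # compression method (0 is the only specified value)
	0,    # filter method (ditto)
	0)    # interlace method: 0 = no interlacing

assert len(ihdr) == 13  # the IHDR data field is always exactly 13 bytes
```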

I mentioned earlier that our RGB values are in the sRGB colourspace. PNG has
ways to signal this information explicitly (through "Ancillary Chunks"), but in
practice, sRGB is assumed to be the default, so for our minimum-viable PNG
implementation we can just leave it out. Colour spaces are a complex topic, and
if you want to learn more I recommend watching this talk as an introduction: Guy
Davidson - Everything you know about colour is wrong


THE IDAT (IMAGE DATA) CHUNK

The IDAT chunk contains the image data itself, after it's been Filtered and then
Compressed (to be explained shortly).

The data may be split over multiple consecutive IDAT chunks, but for our
purposes, it can just go in one big chunk.


THE IEND (IMAGE TRAILER) CHUNK

This chunk has length 0, and marks the end of the PNG file. Note that a
zero-length chunk must still have all the same fields as any other chunk,
including the CRC.
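In fact, since the length, type, and CRC of IEND are all fixed, every valid PNG ends with the exact same 12 bytes. We can construct them ourselves and check against the well-known constant:

```python
import zlib

# length 0, type "IEND", no data, then the CRC of the type field alone
iend = (0).to_bytes(4, "big") + b"IEND" + zlib.crc32(b"IEND").to_bytes(4, "big")
print(iend.hex())  # 0000000049454e44ae426082
```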


FILTERING

The idea of filtering is to make the image data more readily compressible.

You may recall that the IHDR chunk has a "Filter method" field. The only
specified filter method is method 0, called "adaptive filtering" (the others are
reserved for future revisions of the PNG format).

In Adaptive Filtering, each row of pixels is prefixed by a single byte that
describes the Filter Type used for that particular row. There are 5 possible
Filter Types, but for now, we're only going to care about type 0, which means
"None".

If we had a tiny 3x2 pixel image made up entirely of white pixels, the filtered
image data would look something like this (byte values expressed in decimal):

0   255 255 255  255 255 255  255 255 255
0   255 255 255  255 255 255  255 255 255


I've added whitespace and a newline to make it more legible. The zero byte at
the start of each row encodes the filter type, and each "255 255 255" encodes
a white RGB pixel (with each sub-pixel at maximum brightness).
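The layout above can be built directly: for a 3x2 all-white image, each row is one filter-type byte followed by 3 pixels x 3 bytes.

```python
WIDTH, HEIGHT = 3, 2
white_pixel = bytes([255, 255, 255])

filtered = b""
for _ in range(HEIGHT):
	filtered += b"\x00"              # filter type 0: "None"
	filtered += white_pixel * WIDTH  # the raw pixel bytes, unmodified

assert len(filtered) == HEIGHT * (1 + 3 * WIDTH)  # 20 bytes in total
```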

This is the simplest possible way of "filtering" PNG image data. Of course, it
doesn't do anything especially useful since we're only using the "None" filter,
but it's still a requirement to have a valid PNG file. I've implemented it in
Python like so:

# This is all the code required to read subpixel values from an ".rgb" file.
# subpixel 0=R, 1=G, 2=B
def read_rgb_subpixel(rgb_data, width, x, y, subpixel):
	return rgb_data[3 * ((width * y) + x) + subpixel]

# Note: This function assumes RGB pixel format!
# Note: This function could be written more concisely by simply concatenating
# slices of rgb_data, but I want to use approachable syntax and keep things
# abstracted neatly.
def apply_png_filters(rgb_data, width, height):
	# we'll work with an array of ints, and convert to bytes at the end
	filtered = []
	for y in range(height):
		filtered.append(0) # Always filter type 0 (none!)
		for x in range(width):
			filtered += [
				read_rgb_subpixel(rgb_data, width, x, y, 0), # R
				read_rgb_subpixel(rgb_data, width, x, y, 1), # G
				read_rgb_subpixel(rgb_data, width, x, y, 2)  # B
			]
	return bytes(filtered)



COMPRESSION

Once the image data has been filtered, it needs to be compressed. You may recall
that the IHDR chunk has a "Compression method" field. The only compression
method specified is method 0 - a similar situation to the Filter Method field.
Method 0 corresponds to DEFLATE-compressed data stored in the "zlib" format. The
zlib format adds a small header and a checksum (adler32), but the details of
this are outside the scope of this article - we're just going to use the zlib
library (part of the Python standard library) to handle it for us.

If you do want to understand the intricacies of zlib and DEFLATE, check out this
article.

Implementing this in Python is dead simple:

idat = zlib.compress(filtered, level=9) # level 9 is maximum compression!


As noted, level 9 is the maximum compression level offered by the zlib library
(and also the slowest). Other tools such as zopfli can offer even better
compression ratios, while still conforming to the zlib format.
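One handy property worth remembering: whatever compression level (or external tool) produced the zlib stream, decompression is always the same operation, so different encoders remain interchangeable:

```python
import zlib

data = bytes(range(256)) * 4
for level in (0, 1, 9):
	compressed = zlib.compress(data, level=level)
	# different levels give different sizes, but all decompress identically
	assert zlib.decompress(compressed) == data
```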


PUTTING IT ALL TOGETHER

Here's what our minimum-viable PNG writer looks like in full:

import zlib

# https://www.w3.org/TR/2022/WD-png-3-20221025/#5PNG-file-signature
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

# https://www.w3.org/TR/2022/WD-png-3-20221025/#dfn-png-four-byte-unsigned-integer
# Helper function to pack an int into a "PNG 4-byte unsigned integer"
def encode_png_uint31(value):
	if value > 2**31 - 1:  # This is unlikely to ever happen!
		raise ValueError("Too big!")
	return value.to_bytes(4, "big")

# https://www.w3.org/TR/2022/WD-png-3-20221025/#5Chunk-layout
def write_png_chunk(stream, chunk_type, chunk_data):
	# https://www.w3.org/TR/2022/WD-png-3-20221025/#5CRC-algorithm
	# Fortunately, zlib's CRC32 implementation is compatible with PNG's spec:
	crc = zlib.crc32(chunk_type + chunk_data)

	stream.write(encode_png_uint31(len(chunk_data)))
	stream.write(chunk_type)
	stream.write(chunk_data)
	stream.write(crc.to_bytes(4, "big"))

def encode_png_ihdr(
		width,
		height,
		bit_depth=8,           # bits per sample
		colour_type=2,         # 2 = "Truecolour" (RGB)
		compression_method=0,  # 0 = zlib/DEFLATE (only specified value)
		filter_method=0,       # 0 = "adaptive filtering" (only specified value)
		interlace_method=0):   # 0 = no interlacing (1 = Adam7 interlacing)

	ihdr = b""
	ihdr += encode_png_uint31(width)
	ihdr += encode_png_uint31(height)
	ihdr += bytes([
		bit_depth,
		colour_type,
		compression_method,
		filter_method,
		interlace_method
	])

	return ihdr

# This is all the code required to read subpixel values from an ".rgb" file.
# subpixel 0=R, 1=G, 2=B
def read_rgb_subpixel(rgb_data, width, x, y, subpixel):
	return rgb_data[3 * ((width * y) + x) + subpixel]

# Note: This function assumes RGB pixel format!
# Note: This function could be written more concisely by simply concatenating
# slices of rgb_data, but I want to use approachable syntax and keep things
# abstracted neatly.
def apply_png_filters(rgb_data, width, height):
	# we'll work with an array of ints, and convert to bytes at the end
	filtered = []
	for y in range(height):
		filtered.append(0) # Always filter type 0 (none!)
		for x in range(width):
			filtered += [
				read_rgb_subpixel(rgb_data, width, x, y, 0), # R
				read_rgb_subpixel(rgb_data, width, x, y, 1), # G
				read_rgb_subpixel(rgb_data, width, x, y, 2)  # B
			]
	return bytes(filtered)


if __name__ == "__main__":
	# These values are hardcoded because the .rgb "format" has no metadata
	INPUT_WIDTH = 320
	INPUT_HEIGHT = 180
	# read entire file as bytes
	input_rgb_data = open("./samples/hello_png.rgb", "rb").read()

	ihdr = encode_png_ihdr(INPUT_WIDTH, INPUT_HEIGHT)

	filtered = apply_png_filters(input_rgb_data, INPUT_WIDTH, INPUT_HEIGHT)

	# Apply zlib compression
	idat = zlib.compress(filtered, level=9) # level 9 is maximum compression!

	with open("samples/out_1.png", "wb") as f: # open file for writing
		f.write(PNG_SIGNATURE)
		write_png_chunk(f, b"IHDR", ihdr)
		write_png_chunk(f, b"IDAT", idat)
		write_png_chunk(f, b"IEND", b"")


That's only 87 lines of liberally commented and spaced-out Python code. If we
run it, we get this output:



It's... exactly the same as the one I showed earlier, which means it worked! We
made a PNG from scratch! (Well, not quite from scratch - we used zlib as a
dependency).

Verifying it using the pngcheck utility results in the following:

$ pngcheck ./samples/out_1.png 
OK: ./samples/out_1.png (320x180, 24-bit RGB, non-interlaced, 15.6%).


Looks good! Now let's have a look at some file sizes:

hello_png_original.png       128286 bytes
hello_png.rgb                172800 bytes
out_1.png                    145787 bytes


We started off with a 128286-byte PNG file, exported from GIMP using the default
settings.

We converted it to a raw RGB bitmap using ImageMagick, resulting in 172800 bytes
of data. Taking this as the "original" image size, that means GIMP's PNG encoder
was able to compress it to 74% of its original size.

Our own PNG encoder only managed to compress it down to 145787 bytes, which is
84% of the original size. How did we end up 10% worse?

It's because we cheaped out on our Filtering implementation. GIMP's encoder
chooses a filter type for each row adaptively, probably based on heuristics (I
haven't bothered looking at the specifics). If we implemented the other filter
types, and used heuristics to pick between them, we'd probably get results as
good as GIMP's, or better. This is left as an exercise for the reader - or maybe
a future blog post from me!

As a quick example, Adaptive Filter type 2 subtracts the byte values of the
pixel above from those of the "current" pixel. If one row was identical (or
similar) to the row above it, the filtered version of that row would compress
very efficiently (because it would be all or mostly zeroes).
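As an illustrative sketch (not part of the script above), filtering one row with the "Up" filter means subtracting each byte of the previous row, modulo 256; per the spec, the row above the first row is treated as all zeroes:

```python
def up_filter_row(row, prev_row):
	# Filter type 2 ("Up"): output each byte minus the byte directly
	# above it, wrapping around modulo 256.
	return bytes((cur - above) % 256 for cur, above in zip(row, prev_row))

row = bytes([10, 20, 30])
# identical rows filter down to all zeroes, which DEFLATE compresses very well
assert up_filter_row(row, row) == b"\x00\x00\x00"
```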

Full source code and example files are available on my Git repo:
https://github.com/DavidBuchanan314/hello_png


THINGS I DIDN'T MENTION

Things I didn't mention in this article, which you may still want to know,
include:

 * Support for other bit-depths.
 * Indexed colour (i.e. using a palette).
 * Further metadata, and other chunk types.
 * Interlacing.
 * The other filter types.
 * APNG.

...and probably a few other things I forgot. I might update this list when I
remember them.


PNG DEBUGGING TIPS

If you're trying to generate or parse your own PNGs and running into opaque
errors, here are a couple of tips.

Try using ImageMagick to convert the PNG into another format (the destination
format doesn't matter). This is useful because it gives specific errors about
what went wrong. For example, if I try to convert the initial out_0.png image we
generated (which had the basic file structure but no data), we get this:

$ convert samples/out_0.png /tmp/bla.png
convert: insufficient image data in file `samples/out_0.png' @ error/png.c/ReadPNGImage/4270.


This error makes sense, because IDAT was empty. You could probably track down
the specific line of png.c if you wanted even more details.

My next tip is to try using an advanced hex-editor like ImHex to inspect the
file. ImHex supports a "pattern" for PNG, which effectively gives you byte-level
syntax highlighting, as well as letting you view the parsed structures of the
file.


RELATED MATERIALS AND PNG TRICKS

I recently made a PNG/MD5 hashquine which various people wrote about and
discussed, including myself (I do plan on writing a proper blog post on it,
eventually).

I also found a bug in Apple's PNG decoder, due to a poorly thought-out
proprietary extension they made to the format. They've since fixed that instance
of the bug, although it's still possible to trigger it using a slightly
different approach. There was also related discussion and articles.

I made a proposal for a backwards-compatible extension to the PNG file format
that enables the PNG decoding process to be parallelised. Others have made
similar proposals, and it is likely that some variation will make it into a
future version of the official PNG specification.

I found an edge-case in Twitter's image upload pipeline that allows PNG/ZIP
polyglot files to be hosted on their CDN. Related article. I abused the same
trick to upload web-streamable 4K 60fps video (a feature Twitter is yet to
officially support!).

PNG also supports "Adam7" interlacing, which I abused to create a crude form of
animated PNG (without using APNG, heh). Related discussion.

Maybe now you believe me when I say it's my favourite file format?
