Glitch Machine | Diana's Domain







Glitch Machine


At bottom, all files are binary. So what happens if you mess with the bits?

The Glitch Machine is a simple Python script which steps through the nibbles (half-bytes) of a given file. At each step, there is a chance of mutating the current nibble to a random new one.

A nibble, being 4 bits, can be represented by a single hexadecimal digit. Hexadecimal is base-16, which means that by borrowing some symbols from the alphabet one can count all the way to 15 on a single digit. Base-10 gives us only 0-9 to work with, hence we must add a second digit upon reaching 10.

In base-16, we would count "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A (10), B (11), C (12), D (13), E (14), F (15), 10 (16), 11 (17), ..., 19 (25), 1A (26), 1B (27), ..." and so on. We do not need a second digit until we reach 16, represented in hexadecimal by "10".

So what does hexadecimal have to do with the binary code of a file? Basically, hexadecimal is numerically convenient for representing binary numbers. Below are the 16 digits of hexadecimal with their binary equivalents:
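The same table can be printed with a couple of lines of Python. This is a sketch for illustration only, not part of the Glitch Machine script, and the name `hex_to_binary` is made up here:

```python
# Map each hexadecimal digit to its 4-bit binary equivalent.
hex_to_binary = {format(n, "X"): format(n, "04b") for n in range(16)}

for digit, bits in hex_to_binary.items():
    print(digit, bits)
```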

The number 16 is a power of two: 2^4 = 16. Hence base-16 converts neatly to and from binary, or base-2, and can be used to box up binary numbers into nice little packages. 4-bit nibbles need just one hexadecimal digit to represent them, and any 8-bit byte you could imagine can be encapsulated in just two hexadecimal symbols. That's 256 possibilities wound up in two characters!

Higher bases allow for greater density of expression with the tradeoff of needing many different symbols. Electronic computers use base-2 in part because all one needs to implement two different symbolic states is an electrical signal with either high or low voltage. DNA, on the other hand, is a base-4 system of informational representation. This allows a 3 "digit" codon to represent up to 4^3 = 64 possibilities, more than enough for the 20-something different amino acids being encoded. In comparison, 3 binary digits ("bits") can encode just 2^3 = 8 possibilities.
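The arithmetic behind these counts is just the base raised to the number of digits. A quick check in Python, where the helper name `combinations` is invented for this example:

```python
def combinations(base, digits):
    """Number of distinct values that `digits` symbols in `base` can represent."""
    return base ** digits

print(combinations(16, 2))  # two hexadecimal digits, i.e. one byte
print(combinations(4, 3))   # one three-letter DNA codon
print(combinations(2, 3))   # three bits
```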

Here's a useful number conversion tool to see what various decimal (base-10) values look like in other bases: unitconversion.org
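Python can also do these conversions with nothing but built-ins. A small sketch, using the value 27 from the counting example earlier:

```python
value = 27

as_binary = format(value, "b")   # base-2 digits as a string
as_hex = format(value, "X")      # base-16 digits, upper case
back_to_decimal = int("1B", 16)  # parse a hexadecimal string back into a number

print(as_binary, as_hex, back_to_decimal)
```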

The main reason for using hexadecimal in the glitchMachine program is that it makes binary code easier for humans to comprehend and simpler to program against. This is ironic, since it makes this explanatory introduction more verbose than it really needs to be. The workings of the hexadecimal encoding itself have little relevance to understanding how this script works.

Anyway, the script first uses the binascii Python module's hexlify() function to get a hexadecimal representation of all the file's bits. It then reads through these hexadecimal digits and pauses at each one.
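Here is a minimal sketch of what hexlify() hands back. The three bytes below are a stand-in for the contents of a real file, which would normally come from something like open(filename, "rb").read():

```python
import binascii

data = b"Hi!"                        # stand-in for the bytes read from a file
hex_digits = binascii.hexlify(data)  # two hexadecimal digits per byte

print(hex_digits)
print(len(data), "bytes ->", len(hex_digits), "nibbles")
```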

At each digit, the script rolls an imaginary die in order to decide whether or not that digit (again, itself equivalent to a specific nibble, a string of 4 bits) will be mutated. The result of the die roll (whether or not to move forward with mutation) is based on a certain probability. Setting the mutation probability higher results in more severe glitch effects, although a file's sensitivity to mutation varies by format.

If the script decides to mutate the current hexadecimal symbol to another hexadecimal symbol, it rolls another imaginary die in order to select the new symbol. Because the mutation probability has to be a small enough number to not totally break the file, the first "die roll" as described above would have to use a die with thousands of sides. This second "die roll", however, could be performed with a die with 16 faces. The computer could be rolling an imaginary custom d16, an octagonal bipyramid with one of the 16 hexadecimal digits painted on each of the 16 sides. A CPU can dream. As it is, the script uses the random Python module's handy dandy choice() function to pseudo-randomly pick the new symbol out of a list of the 16 options.
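The two die rolls can be sketched roughly as follows. This is an illustration under assumptions, not the script's actual code: the function name `maybe_mutate`, the `rng` parameter, and the use of random.random() for the first roll are all invented here. Only random.choice() is named in the text.

```python
import random

HEX_DIGITS = "0123456789abcdef"

def maybe_mutate(digit, probability, rng=random):
    """Return a (possibly identical) replacement for one hexadecimal digit."""
    # First roll: decide whether this digit gets mutated at all.
    if rng.random() < probability:
        # Second roll: the imaginary d16 picks the new symbol.
        return rng.choice(HEX_DIGITS)
    return digit

print(maybe_mutate("7", 0.5))
```

Because the second roll draws from all 16 digits, the "new" digit can turn out to be the old one.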

So, for any given hexadecimal symbol, "2" might be mutated into "A", a "D" might be turned into an "F", a "5" could become an "8", and sure, a "7" could be selected for mutation only to be mutated into a "7". What this is really doing is changing a sequence of "0010" into "1010", "1101" into "1111", "0101" into "1000", and of course lucky number "0111" just gets to stay as "0111".
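To see those bit patterns side by side, here is a tiny sketch (the helper `nibble_bits` is a made-up name):

```python
def nibble_bits(hex_digit):
    """Return the 4-bit binary string for a single hexadecimal digit."""
    return format(int(hex_digit, 16), "04b")

# The example mutations from the paragraph above.
for before, after in [("2", "A"), ("D", "F"), ("5", "8"), ("7", "7")]:
    print(before, nibble_bits(before), "->", after, nibble_bits(after))
```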

The script then converts the hexadecimal representation of the file back into binary using binascii's unhexlify() function and writes the binary code to a new file, a glitched version of the input.
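Putting the steps together, the whole process might look something like the sketch below. It is a reconstruction from the description on this page, not the real script: the function name `glitch_bytes`, its parameters, and the way the mutation count is kept are assumptions. The `div` parameter stands in for the invulnerable buffer at the start of the file, which is discussed further down.

```python
import binascii
import random

HEX_DIGITS = b"0123456789abcdef"

def glitch_bytes(data, probability, div=20, rng=random):
    """Return (glitched_bytes, mutated_count) for the given input bytes."""
    hex_digits = bytearray(binascii.hexlify(data))
    # Assumption: the first 1/div of the nibbles is left untouched.
    buffer_end = len(hex_digits) // div
    mutated = 0
    for i in range(buffer_end, len(hex_digits)):
        if rng.random() < probability:
            new_digit = rng.choice(HEX_DIGITS)
            # Only count the nibble if it actually changed.
            if new_digit != hex_digits[i]:
                mutated += 1
            hex_digits[i] = new_digit
    return binascii.unhexlify(bytes(hex_digits)), mutated

glitched, count = glitch_bytes(b"All files are binary." * 10, 0.05)
print(count, "nibbles mutated")
```

To glitch a real file, one would read it with open(name, "rb"), pass the bytes through the function, and write the result out with open("glitched_" + name, "wb").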

Interesting Results

The effects of the Glitch Machine vary greatly depending on the file format of the input. Regarding the sources for these examples, some are in the public domain/creative commons while for the rest I claim fair use by way of educational/research purposes. Or parody.

Images

Here is a JPG of Jabba The Hutt run through the Glitch Machine.


A glitched JPG of Jabba The Hutt

JPG glitching is most often characterized by colorful, horizontal, chunky stripes. The glitching sometimes causes a visual "cascade effect", showing how mutations early on in the file can have an influence which ripples down throughout the rest of the image.


Red Holstein from Les Meloures at Luxembourgish Wikipedia.

A second type of JPG glitch effect has many thinner stripes which don't seem to disrupt as much of the rest of the image. A JPG which glitches in this manner will always do so—it will not display the chunkier effects. Whether this has to do with the image dimensions or the manner of encoding or something else entirely I don't know.


Some JPGs glitch into thinner, right-justified stripes.

JPGs and PNGs mutate differently. Below is a PNG portrait of Chester Arthur, the 21st president of the United States, run through the script with a mutation probability of 0.0001%.


Gentleman Boss! May your auroral whiskers always glow, unlike your forgotten legacy.

For lots of detail about PNG glitching specifically, check out this cool webpage by Tokyo glitch artist UCNV: The Art of PNG Glitch.

In order to demonstrate both the differences in JPG and PNG glitching as well as the difference in effects at various rates of mutation, here are some more mutated bovine images:

Mutation Rate

JPG: 1010944 nibbles (493 kb)

  Rate       Nibbles mutated
  0.0001%    1
  0.0002%    4
  0.0003%    4
  0.001%     11
  0.002%     20
  0.003%     24
  0.01%      92
  0.02%      192
  0.03%      257

PNG: 2034262 nibbles (993 kb)

  Rate       Nibbles mutated
  0.0001%    2
  0.0002%    6
  0.0003%    9
  0.001%     14
  0.002%     24
  0.003%     44
  0.01%      173
  0.02%      365
  0.03%      555

A Polish cow (krowa) from Krzysztof at Wikimedia Commons.

A few things:

First, PNGs are more sensitive than JPGs. There are also frequently instances where the number of mutated nibbles is low, but the resulting effect is very intense. We can assume that in these cases it is the specific nibbles which ended up being mutated that make the difference. JPGs will do this as well, but it is more pronounced in PNG glitching. The coolest PNG glitching effect is the more subtle, diagonal hazy color striping, but in the cow testing this only happened on its own when one or two nibbles were mutated, and even then it was still a lot less common than the colorful noise effect.

Second, the strip at the tops of the images that remains unglitched is a consequence of the fact that the Glitch Machine has a buffer of bits at the start of the file which it ignores. This is to avoid screwing up certain encoding information which I suppose comes at the beginning of the file in most formats. Adding the buffer seemed to result in fewer unreadable, totally broken files, especially for video files. In order to get good results over a variety of formats and files, sometimes it is helpful to adjust the code to make this invulnerable buffer larger or smaller.

Third, if you're looking at the number of mutated nibbles compared to the total number of nibbles in the original file and thinking "You take me for a fool—173 is not 0.01% of 2034262. It should be more like 203!", there is an explanation for that. There are two reasons:

  1. Every time the script decides that a nibble is about to be mutated, there is a 1 in 16 chance that the nibble will be 'mutated' into the same hexadecimal digit which it was to begin with. These nibbles are not added to the mutated nibble count. I guess the script could just as easily pick from the remaining 15 digits instead of including the current digit, or re-roll until it mutates the digit into something else...but whatever!
  2. There is a buffer of invulnerable nibbles. These nibbles are passed over and are never candidates for mutation.

So in the case of a file with 2034262 nibbles, we can subtract 156481 nibbles to account for the buffer (during cow tests, set to be the first 1/13th of the file). We then have 1877781 nibbles up for possible mutation. 0.01% of this would be 187.7781 nibbles. 187.7781 * (15/16) = ~176, which is closer to 173.
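The same back-of-the-envelope calculation, written out in Python so the numbers can be swapped for another file. The variable names are just for this example:

```python
total_nibbles = 2034262
buffer_nibbles = total_nibbles // 13          # first 1/13th is invulnerable
candidates = total_nibbles - buffer_nibbles   # nibbles that can be mutated
rate = 0.01 / 100                             # 0.01% as a fraction

# 15 of every 16 selected nibbles actually change to a different digit.
expected_mutations = candidates * rate * 15 / 16

print(buffer_nibbles, candidates, round(expected_mutations))
```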

Hence the number of mutated nibbles tends to end up a little less than the mutation rate would initially suggest. Chance is still at play, however, so there are always instances where the number of mutated nibbles ends up higher or lower than average.
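For the curious, the alternative floated in the first reason above (picking only from the other 15 digits) could look like this. It is a hypothetical variation, not what the script does, and `mutate_to_different` is an invented name:

```python
import random

HEX_DIGITS = "0123456789abcdef"

def mutate_to_different(digit, rng=random):
    """Pick a replacement digit that is guaranteed to differ from the original."""
    others = [d for d in HEX_DIGITS if d != digit.lower()]
    return rng.choice(others)

print(mutate_to_different("7"))
```

With this variation every selected nibble would count as mutated, so the 15/16 correction would no longer apply.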


Two more for the road

Gifs

For gifs, a few steps had to be taken to re-save the actual glitched result because Firefox gives up when handed a corrupt gif file to display. Internet Explorer is much braver and actually displays the glitched result. Hence, in order to display a glitched result on all browsers, the glitched gif as played back in IE was recorded with screen-capturing software (in this case CamStudio, though I don't recommend it), saved as an .avi file, decomposed into frames, and re-saved as a new, uncorrupted .gif file.


Gif source unknown, sadly.

In Internet Explorer, a gif run through the script tends to look like a busted VHS tape. In fact, with many of the glitched files, the nature of the distortions looks different depending on the program used to view them.

I have not tried viewing glitched gifs on many other browsers, but you can see what your browser does with the raw file here.

There is another kind of possible glitched gif—recall that a gif is made up of many frames. One can thus run the glitch machine on a bunch of static image files separately, and then put them together into one gif file. This results in an animation of the JPG or PNG effects.


Never enough cows

Videos

The greatest challenge with video glitching is getting messed up files with enough integrity remaining to not crash the video player part-way through playback. Distortions vary between formats (avi, mp4), different encoding schemes within the same formats, and the programs used to view the files. Because there is so much variability, the aim here is to get files that can be displayed in browsers without playback error.

Maintaining consistency of distortions across programs (and reducing outright errors which crash the player programs) can be achieved by re-encoding the glitched video. What is presented below are re-encoded versions of three glitched video files which play back successfully in Firefox, but during testing would not play back in their entirety in Internet Explorer, Chrome, or Midori.

In order to display a consistent, non-broken result, the freeware program OBS Studio was used to screen-record the videos as they play back in Firefox. This process (similar to the process used for the gif in the previous section) results in an uncorrupted file with the visual and audio glitches "baked-in". Since this is a process of re-encoding by recording, we can call it re-encording.







If you would like to test your browser's mettle against the raw, glitched files themselves (or download them to try them in various media player software packages), they are available below:
C'est La Vie: B*Witched (1998)
No Scrubs: TLC (1999)
Whether or not the whole videos play, the distortions are sure to render a bit differently from their Firefox forms as presented above.

MP3s

The Glitch Machine reveals something interesting about MP3 files: Compared to image and video files, they're very resilient to mutation! While visual files only need a bit of mutation to render them either totally broken or rather wacky, to get a very noticeable distortion in an MP3 file requires bumping up the mutation rate quite a bit. Even very intense mutations still tend to sound like the same song played at an underwater jam session.




T-Swizzle mutated at a rate of 0.8%.

The snaps, crackles, and pops of the glitched mp3 appear in the audio signature as intense spikes of short duration, the amplitudes of which hit saturation:


The resilience of audio data is also demonstrated by the fact that in the prior section's glitched video examples, the audio tracks remain relatively intact.

Video Game ROMs

Thousands of video games have been ripped from their original media forms and can be run on "emulators", software designed to behave like specific video game consoles. If the game was originally stored on a cartridge, the ripped game files are generally called ROMs, for "Read-Only Memory". Running the glitch machine script on video game ROMs is a hoot in those instances where the file remains runnable.

I have tried running some more complex game formats through the script (such as PS1 ISOs, which are files ripped from discs), but the glitched outcomes are generally too corrupt to play or otherwise freeze up as early as the splash screens. Maybe the difference in format (cartridge ROM chips vs. optical media) means more places where a single bit out of place screws up the whole thing. Hence the results shown here are for cartridge-based media around 10-35 years old.

Mario:

We will start with a classic: Super Mario Bros for the NES (1985).

To run NES ROMs one needs an NES emulator such as Nestopia or FCEUX.

If you live under a rock, inexplicably hate video games, or are 9 years old or something, here is how the opening level of Super Mario Bros. works without glitches:


The game in its normal state

Fun Fact: The flickering goombas at ~0:42 are not a glitch or an imperfection of emulation but are actually true to the limitations of the NES hardware.

Here are four different results of running the Super Mario Bros. ROM through the Glitch Machine:


From top left to bottom right:

Stuck Mod: We see a version of the game where Mario is stuck in place. He can jump, but walking and running get him nowhere. He actually spawns a bit further to the left than he is supposed to, which is maybe why he can't move. ROM

Fatalist Mod: Mario dies immediately after the level begins. He falls off the stage and uses up all his lives in quick succession. ROM

Cycle Mod: Running to the right side of the screen deposits Mario back at the left side. It's as if he were running across a cylindrical surface which connects the right side of the screen with his spawn point. Mario also cannot jump in this version, as pressing the A button resets the game instead. ROM

Very Unusual Mod: Outside of pressing "Start" to begin the level (which also works to pause the game), controls do nothing here. The music only plays the first few bars. The world itself communicates its dissipation while Mario twirls around the letters as if they were an end-level flagpole. The color scheme and floating arches make the game look more like Castlevania than Mario. The goomba prowls eternally, just beyond the plumber's reach. Glitching can be truly nightmarish. ROM

This technique of randomized ROM Hacking gives varied and interesting results with little effort. Intentional ROM hacking requires specific knowledge of how a certain game is encoded in order to get a desired effect. For one example of an intentional Super Mario Bros Hack executed on the physical cartridge, see Arcangel's Super Mario Clouds.

The Legend of Zelda:

Here are some assorted screenshots and gifs from glitching The Legend of Zelda for the NES (1986):
(most images linked to ROM download)

There was another result where the naming/registration screen freaked out, flooded with 0's and other characters, and then started flashing "THANKS LINK, YOU'RE THE HERO OF HYRULE", which is the message the game displays when you finish the quest. I guess you win the game before you even name your save file. Sadly I overwrote this one without thinking. D'oh!

Below are two results which share something in common. A little background info:

When the game first begins, Link starts in an area with three exits and a cave to the northwest. If he enters that cave, he meets an old man who gives Link a sword and a warning before mysteriously disappearing. It is the first of many useful items Link will obtain over the course of his quest—unless the person controlling him is a masochist/expert and doing a swordless run or something crazy like that.

Pretty straightforward. The Glitch Machine, however, has other ideas:

In the result on the left, instead of receiving the sword from the old man, the player receives the boomerang! In the course of the normal game, the boomerang is found within the first dungeon (the area Link enters towards the end of the video). The environmental graphics are also very distorted.

In the result on the right, the food/bait/meat sprite follows Link around as if it were a returning boomerang. The overworld also traps Link within a mountain staircase area.

I was wondering whether the fact that both of these results did something related to the boomerang item meant that perhaps the mutations had affected the same areas of the code. A simple comparison script revealed that none of the same exact nibbles were affected. Nor did any two alterations fall within the same byte.
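A comparison along those lines might be sketched as below. The original comparison script isn't shown on this page, so the function names and the approach (diffing each glitched file against the unglitched original, then intersecting the changed positions) are assumptions:

```python
import binascii

def differing_nibbles(original, glitched):
    """Return the set of nibble positions at which two byte strings differ."""
    a = binascii.hexlify(original)
    b = binascii.hexlify(glitched)
    return {i for i, (x, y) in enumerate(zip(a, b)) if x != y}

def shared_changes(original, glitch_a, glitch_b):
    """Return (shared nibble positions, shared byte positions) of two glitches."""
    nibbles_a = differing_nibbles(original, glitch_a)
    nibbles_b = differing_nibbles(original, glitch_b)
    bytes_a = {i // 2 for i in nibbles_a}   # two nibbles per byte
    bytes_b = {i // 2 for i in nibbles_b}
    return nibbles_a & nibbles_b, bytes_a & bytes_b

# Toy data standing in for three ROM files.
clean = bytes([0xE0, 0x11, 0x22])
left = bytes([0x20, 0x11, 0x22])
right = bytes([0xE0, 0x11, 0x2F])
print(shared_changes(clean, left, right))
```

Two empty sets mean the glitches have no changed nibbles or bytes in common, which is the situation described above.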

Going into the unglitched ROM and manually changing the byte at memory address 6C00 to 20 instead of E0 (as this was just one of the many changes the script had made in the glitched version in the video to the left above) made it so Link picked up the boomerang in the cave [ROM]. Also, killing the bats in the west room of the first dungeon (which require only the boomerang to kill; most other enemies are only stunned) and picking up the key which then appears gives the player a dungeon map instead of the key. One single nibble can have quite an effect, and it's not immediately clear what that particular byte is controlling.
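A hex editor is the easy way to make a one-byte change like this, but it can also be done in a few lines of Python. In this sketch, 6C00 is assumed to be a byte offset into the ROM file, `patch_byte` is a made-up helper, and the data is a block of zeros rather than a real ROM:

```python
def patch_byte(data, offset, new_value):
    """Return a copy of `data` with the byte at `offset` replaced."""
    patched = bytearray(data)
    patched[offset] = new_value
    return bytes(patched)

# Stand-in data, NOT a real ROM: zeros with E0 placed at offset 0x6C00.
fake_rom = bytearray(0x8000)
fake_rom[0x6C00] = 0xE0

patched = patch_byte(fake_rom, 0x6C00, 0x20)
print(format(fake_rom[0x6C00], "02X"), "->", format(patched[0x6C00], "02X"))
```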

For a cool example of some real-time ROM hacking of Zelda (involving the use of tools for sleuthing out the meaning of various memory addresses), see Double-Fine developer Brandon Dillon's video here.

The NES examples show that video game glitches can affect not only images and sounds but also the behavior of the game and the rules which govern the virtual world. The next several examples are of glitched GBA ROMs. When GBA games remain playable, more of the glitching seems to be with the visuals of the game as opposed to its procedural behavior. Perhaps this is because proportionally more of the game file is dedicated to graphics than with an NES ROM. Since NES files are significantly smaller, it's possible that randomly glitching one is more likely to hit something encoding the actual rules of how the game works.

A decent GBA emulator is VisualBoyAdvance (although it is no longer updated).

Kirby:

In this result, it appears (without progressing terribly far into the game) that the script only substantially affected some of the environmental tiles of Kirby: Nightmare in Dream Land (2002).


Glitches in both the foreground and background components of Kirby's parallax scrolling.


Harvest Moon:

Here are glitched title screens of two (very similar) Harvest Moon games:


Harvest Moon: More Friends of Mineral Town (2005) and Harvest Moon: Friends of Mineral Town (2003)

Final Fantasy:

Final Fantasy VI was originally released on the SNES in 1994, and there exists a fine 2006 GBA port. It is the last mainline entry in the long-running JRPG series to feature 2D, sprite-based graphics and appear first on a Nintendo console.

We can actually use an endgame save file from the unglitched version and rename it so it can be loaded by the emulator for use with the glitched version of the ROM. This way, we can skip to a part of the game where there is overworld and airship access.

Mild spoilers below, but then again this game is as old as I am.


The World of Ruin


The World of Ruin looking extra ruinous.

In this result, the overworld map is covered in a near-homogeneous glitch soup. As if the world weren't in bad enough shape already following Kefka's wrath! Now it's suffered heat death, too. Battles still work but all the sprites are invisible. Oddly, while they're a heck of a lot harder to locate, the enterable town areas escape basically unscathed.


I feel you brah

The overworld of the World of Balance is similarly distorted:


The examples above give a taste of random ROM glitching possibilities.

Text files and Code

Text Files

Text files (.txt) are a good case study to make the point that those nibbles and bytes and 1's and 0's actually mean something. Character encoding systems are in many ways more transparent and straightforward than the encoding systems found within more complex (or proprietary) file types.

Do not confuse simple, portable .txt files (which contain bytes interpretable as just ASCII characters, a.k.a. "plain text") with something like Word documents (.doc), which contain layout and formatting data and are meant to be opened and displayed as text only within a specific word processor program.

Here is a nifty .txt file consisting of various ASCII characters open in the text editor Notepad:

In order to see the hexadecimal representations of these characters, we can use a program called a "hex editor". A simple, freeware hex editor is HxD.


The hexadecimal representation of nifty.txt

ASCII characters are each represented by 1 byte, a.k.a. 8 bits*, or 2 nibbles. Hence, the hex editor displays two hexadecimal characters for each ASCII character in the file.

*Technically the standard ASCII specification only takes 7 bits and consists of 128 different characters. "Extended ASCII" specifications use all 8 bits and can hence represent 256 characters.

The encoding for each printable symbol is unique. The exclamation point, for example, has the hex-code "21", which means it is encoded in memory as the byte "00100001".
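This is easy to check from a Python prompt. A minimal sketch using only built-ins:

```python
char = "!"

hex_code = format(ord(char), "02X")  # the character's code as two hex digits
bits = format(ord(char), "08b")      # the same code as eight bits

print(char, hex_code, bits)
```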

For a table of standard ASCII character codes which includes the most common extended encoding, see ascii-code.com.

With that in mind, running this .txt file through the glitch machine such that 5 nibbles are reported to be mutated results in something like this:


puni nifly

And when viewed in the hex editor:


A methodical comparison confirms that 5 hexadecimal digits have indeed been altered. This is precisely what the glitch machine does! We see that the letter "t" in the word "Nifty" has been turned into a vertical bar. The hex code for the ASCII character "t" is 74, binary 01110100. The hex code for the ASCII character "|" is 7C, binary 01111100. When the glitch machine stepped through the nibbles of this file, it randomly changed that specific hexadecimal 4 into a hexadecimal C. In binary, 0100 became 1100.
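That single change can be replayed by hand. The sketch below swaps the low nibble of "t" with bit operations, which is a shortcut for this example and not how the script itself works (it operates on hexadecimal digits):

```python
original = ord("t")                  # 0x74
# Keep the high nibble (7) and replace the low nibble (4) with C.
glitched = (original & 0xF0) | 0x0C

print(format(original, "02X"), format(original, "08b"), chr(original))
print(format(glitched, "02X"), format(glitched, "08b"), chr(glitched))
```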

The .txt file glitching works similarly on ASCII art, too:


at.txt and glitched_at.txt as rendered in Notepad.

The unglitched version is made only of standard ASCII characters. The glitched version contains characters outside the 7-bit standard ASCII range but within the extended 8-bit Windows-1252 encoding, as well as some unaccounted-for symbols which really have no business being there. For some reason Notepad (on my Windows 7 system, at least) displays certain non-printable ASCII characters (i.e., control characters) as symbols outside the ASCII specification or its ANSI extension. Hex code "10" indicates the control character "Data Link Escape" and renders as what looks like an inverted cross/dagger or a box-drawing symbol. It's very strange.



Source Code:

Source code is just a text file which another program parses and does something with. For high-level, portable programming languages like Python or Java, this would be an interpreter or a compiler (respectively), which turns nice, human readable code into machine-level instructions that the computer can execute. Thank Grace Hopper. For low-level programming languages close to the machine, this code-reading program would be an assembler. In the case of a mark-up language like HTML, the code-reading program is the web browser, which uses the various tags and syntactical information provided to display text and other media elements on the screen as specified.

.py file

Time to get self-referential. This is computer science, after all. We'll get quite perverse and run the script on itself.

The interpreter doesn't mind. The exact python file being run can apparently also serve as an input stream at the same time. The files which are outputted are brand new files—that is, the program doesn't mutate itself in place.

Running the script through the script, the script ends up with various mutated characters, misspelled functions/variables, and extra, irrelevant symbols.


A section of a mutant script.


Because programming languages are justifiably persnickety, trying to use a glitched version of the script predictably fails:


Incontrovertible proof that computers cannot appreciate music

It's really weird that Windows (in both Notepad and IDLE) likes to interpret the nonprintable ASCII control character "Shift-Out" (hexcode 0E) as a musical note, even when apparently using the ANSI standard/1252 code page. This particular symbol looks to come from the IBM-PC Code Page 437, which offers an alternative ASCII extension and furthermore substitutes actual characters for the nonprintable control codes. In this case the byte "0E" actually *does* stand for the musical symbol within Code Page 437 and renders as such in Notepad. However, as mentioned above, other control characters render as somewhat wonky looking box-drawing symbols, which, while present in some form on Code Page 437, are not within the hex code range of 00-1F. While it kind of makes sense that a text editor would substitute printable characters for what should be non-printable bytes within the stated encoding, if Notepad or IDLE were using the symbols of Code Page 437 only for the codes nonprintable within 1252, the byte "0E" would indeed be the eighth-notes but the byte "10" would not be what looks kind of like an inverted dagger or a box-drawing symbol but rather a black triangle pointing to the right! 'Tis a mystery. If anyone in the world has any idea what's going on, please let me know.



.html file

This very webpage which appears before you is also underpinned by a mutatable text file with the file extension ".html".

Unlike programming languages, mark-up languages are linguistically fault tolerant. This means a browser can take an .html file with incorrect syntax and erroneously placed symbols and still render it in some capacity. When the syntactical elements of the HTML tags get messed up, the browser can just interpret the broken tags as content instead of markup. Here is a glitched version of this page with an invulnerable buffer of 1/100th and a mutation rate of 5%:


Click to enter a parallel universe of incoherent web design

If we retain even a modestly sized invulnerable buffer, the basic page formatting present at the top of the file will be untouched while the rest of the page is incessantly mangled. In the example below, the buffer is 1/40th of the file and the mutation rate is a whopping 99%. Pretty much every nibble not present within the buffer gets mutated:


Click to enter a parallel universe of meaningless symbols

A non-plain text file glitched to such an extent would almost certainly not be openable by its intended program. When it comes to textual symbols, however, even bytes with arbitrary bit arrangements can have an interpretable meaning under a specific encoding system.


It may not look like well-formed HTML, but it runs okay.

Presidents

These glitching techniques were applied to all the Glitch-manders in Chief:

Hacking Democracy:
Experiments in Presidential Corruption


How To Run

The script is available to download here, or at GitHub.

This script is run from your operating system's command line or terminal. You must have Python installed on your system. If you are running Windows, you'll have to install Python yourself if you don't have it already. If you are on a Mac, your system already has Python. If you are using Linux, I do not need to be explaining this. The script should work in both Python 3 and Python 2.

The script takes up to three arguments, the second and third being optional. The first argument is the input filename. The second argument is the mutation rate. The third argument is the number of results the script will output. By default, the mutation rate is 0.005% and the script will output a single glitched copy of the input file with "glitched_" prepended to the original file name. If you want to use the third argument (that is, tell the script to make n glitched files instead of just 1), you must also provide it with the second argument (i.e., give it a mutation rate as well).
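The argument handling described above could be sketched like this. It is a guess at the behavior, not the script's code: the function name `parse_args`, the script filename `glitchMachine.py`, and the assumption that the rate is typed as a percentage are all mine.

```python
DEFAULT_RATE_PERCENT = 0.005  # default mutation rate stated above
DEFAULT_COPIES = 1            # default number of glitched outputs

def parse_args(argv):
    """Return (filename, rate_percent, copies) from a sys.argv-style list."""
    if len(argv) < 2:
        raise SystemExit("usage: glitchMachine.py FILE [RATE] [COPIES]")
    filename = argv[1]
    # Assumption: the rate is given as a percentage, e.g. 0.01 for 0.01%.
    rate = float(argv[2]) if len(argv) > 2 else DEFAULT_RATE_PERCENT
    copies = int(argv[3]) if len(argv) > 3 else DEFAULT_COPIES
    return filename, rate, copies

# In a real script the list would come from sys.argv.
print(parse_args(["glitchMachine.py", "cow.png"]))
print(parse_args(["glitchMachine.py", "cow.png", "0.01", "3"]))
```

Since the arguments are positional, the number of copies can only be reached by supplying the rate first, which matches the restriction described above.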


Example command for running the script from Windows CMD using all three arguments.

If one felt so inclined, the size of the invulnerable buffer can be adjusted in the code by changing the variable called "div". It is currently set to 20, so it ignores the first 1/20th of the file.

Development

The Glitch Machine was one of a handful of simple media-manipulation scripts originally written as a final project for the Fall 2013 course Intro to Media with Prof. Maria Cecire. Here is a link to our class blog (from 4+ years ago...that's like 20 years in internet time).

As a group, the simple scripts were intended to demonstrate the equivalence of all digital files on the level of binary code. The many different results of the Glitch Machine script were interesting enough to present at length.

Another one of the simple Python scripts was called Media Mapping: Image To Sound. This program "translates" an image file into a sound file. It was made possible by Mark Wirt's MidiUtil library in conjunction with the PIL library for importing images. Check out the sound files below to hear the changes in color. Time moves left to right, with each beat of the sound basing its tones off the colors of an entire vertical column of pixels. Listen to the difference between the lossless format of the PNG and the lossy image compression of the JPG:



Lossless PNG



Compressed, lossy JPG


v.1.1  Sept.  5th, 2018.
v.1.0 June 11th, 2018.