Oric minigamesProcedural imageChangelogDay One (2020-05-28)Day Two (2020-05-29)Day Three (2020-05-30)Implementation notesSize coding is an interesting challenge: After spending a day at work using a computer that counts memory in gigabytes and storage in terabytes, it's refreshing to go back to basics.
Back in the days, people HAD to write compact and efficient code, simply because the hardware would not allow for anything else.
These days it's more like an intellectual puzzle game, which I find deeply satisfying.
Mind Bender, my entry to Sommarhack 2020's 256 bytes intro competition is not my first attempt at size coding.
Oric minigamesIf you've followed my productions over time you probably already know three of my Oric minigames:
- 4K Kong
- 3K FreeCell
They are all stored on disk as compressed executables, and decompressed in memory before they run.
Additionally, they are all written for the 6502 CPU, which may not be very advanced in term of supported operations, but has the benefit of using 8 bit opcodes, resulting in quite compact code compared to the 68000 cpu which supports a rich instruction set and quite advanced addressing modes, but requires 16 bit opcodes for each instruction.
If your target size is quite large (all things considered), then you can definitely pay the cost of having the depacker, but when targeting very small sizes, the trade off become very hard to justify.
If you have to drop the idea of using compressed executables, then the set of tricks you can use changes drastically!
Obviously this "copy pasted"1 code takes room, but if your executable is compressed, things like sequences of "move.l (a0)+,(a1)+" will be encoded in just a few bytes.
When using compressed executables, size coding become a game of "making patterns that can be repeated efficiently".
Procedural imageThe "Mind Bender" idea started when Wietze contacted me on IRC to ask me if I would participated with a remote entry for Sommarhack 2020.
The main topic was "medium resolution", which was kind of inspiring to me since I've done trickery in both Low and High resolution, but not quite anything in Med rez, so I said yes.
Then some time later there started to be people playing with bootsectors, like in the good old days, and somebody suggested that if there was enough entries then they could add a 256 bytes intro category to Sommarhack... and since I was starting to get a bit short on time, I decided to try that first, and if I still had time, then eventually the Med Rez challenge.
Then at about the same time, Patapom shared this picture on Facebook, with the following comment
(Take that, shitty brain!!!)
It's one of these typical pictures that seems to move despite being totally static, and I wondered if it was possible to redo in 256 bytes, because after all it's just a chessboard with an alternate set of colors and some small flowery patterns at the intersection between tiles.
How hard can that be?
ChangelogLooking at the change log, it actually took about three days.
I basically followed by usual process, which starts by trying to get something running at all, even if it's ugly, slow, too big, etc... It's pointless to spend time optimizing at the start if you don't even have a rough idea of what a basic implementation would look like!
Then, when you have something running, look at it in detail (in this case, by tracing the code line by line in the debugger) to get a better idea of what is actually taking room or could be differently.
So, let's see how the process day by day
Day One (2020-05-28)
So, the first implementation took 1207 bytes, which is quite far from the 256 bytes target!
Here is what it looks like:
It's not particularly awesome, the black bars are horrible, the dots are truncated and not centered, but the visual effect kind of work, so there's still some hope :D
Looking at the code history, the drop from 1207 bytes to 1051 bytes was caused by removing some of the uncessary system setup and restore code which I reused from a previous intro, as well as disabling debug information (symbols) from the final executable file format.
To drop from 1051 to 482 I introduced conditional assembly to switch between "development" and "release" modes: In the final version it was not necessary to be able to properly handle the keyboard and cleanly return to the operating system, so in "final release" mode I simply skipped all this part!
Simple gain really :)
The drop to 478 bytes was calling an alternative function of the operating system to switch to Supervisor mode2which did the same thing in less bytes.
; Development mode: We call the main routine in supervisor mode
; and when we come back, we return to the caller
move.w #$26,-(sp) ; XBIOS: SUPEXEC
clr.w -(sp) ; GEMDOS: PTERM(0)
; Final Release: We switch to supervisor, but never come back
; so no need to correct the stack pointer
move.w #$20,-(sp) ; GEMDOS: SUPER
To achieve 463 bytes I stopped setting the values of the MFP IMRA and IMRB registers (since I was not relying on any timer).
To go down to 331 bytes, I replaced a sequences of nops by a timed loop of equivalent duration:
And finally to go down to 296 bytes, I replaced a small table of bytes used to define the alternate color patterns of dots from
This basically achieve the same result of having a list of values that change3.
Day Two (2020-05-29)
To go down from 296 to 284 bytes, I basically replaced the remaining blocks of nops by neutral instructions that took the same amount of time to execute but using less room in memory, such as
which takes 20 clock cycles so is the equivalent of
where each of the 5 nop take 4 cycle to execute.
Obviously you better have to be sure that d3 can be modified at this place in the code!
Regarding the move.b #0 and move.w #0, each of these takes 4 bytes: 2 bytes for the move instruction opcode, and 2 bytes for the "0" immediate payload, which neither SF (Set False) or CLR require - easy gains.
At this point, we are down to 256 bytes (from the original 1207) and it still look exactly the same.
So the big question is: Now that we have something that technically fits the constraints and could be sent to the Sommarhack people, is it possible to reduce even more the size so we can make it prettier?
Day Three (2020-05-30)
The big win was achieve on day 3, when I decided to use the screen information setup in the system instead of defining my own screen.
A typical screen setup would be something either involving a call to a XBIOS function, or manual setup of screen registers:
; Set the screen on the buffer
save_screen_addr_1 ds.b 1
save_screen_addr_2 ds.b 1
save_vbaselo ds.b 1
screen ds.l 32000/4
while in practice we can just read the current screen address in page 4 system's variables:
move.l $44e.w,a0 ; screenpt
The main issue, is that you inherit of the screen as it was, which means in our case, a big ugly green image with a title bar with the name of our executable!
So the question become: How can I efficiently erase the screen?
Obviously we can just use brute force and some code to loop around to erase the 32000 bytes we found by reading $44e, but I decided to go for something a little bit fancy, and instead use the print function of the Gemdos to display a text!
; Print the Dbug title
dc.b 27,"b",3 ; Set 1 as Ink color
dc.b 27,"E" ; Erase Screen
In case you did not know, the Atari ST system print function supports VT52 control codes, which means a single print out can do things like changing the color of the text, of the background, enable inverted video, move the cursor, etc... so I used that to select the color of my "Dbug" text while also deleting the entire image, just in a few additional bytes.
The rest of the optimisation was just some small things, like replacing the setting of two colors from two instructions to one, from
move.w #$707,$ffff8242.w ; The Purple crosses
move.w #$777,$ffff8246.w ; The White crosses
move.l #$07770707,$ffff8242.w ; The White and Purple crosses
And the rest is some small tweaks here and there, like playing with the color registers during the display to generate some gradient colors which are more satisfying to the eye.
Anyway, here is the final result:
Implementation notesIf you don't know the Atari ST, it may not be obvious what I did.
The image generated by my program appears to occupy the entire screen, including the borders, which normally cannot contain any graphics and only show the content of the first entry of the color palette (register 0 at address $ffff8240).
The normal display area is 320x200, which is much smaller than the actual viewable area - in this case 384x270 but it's more or less variable depending of the machines and screens -.
You can visualize the two areas on this annotated screenshot:
If you look at the inner part of the image, you will notice that the "Dbug @ Sommarhack 2020" text, as well as all the small white and purple patterns are inside the 320x200 rectangle: The reason is that I'm not doing any of the usual "fullscreen/overscan" trickery normally used to coerce and Atari ST in displaying actual bitmap in the borders.
The reason why you can see the green chessboard tiles over all the screen, is that it's not made of bitmap, it's only done by changing the background color register during the entire display:
if you do that often enough, and using the right timings, you can easily do vertical bands of alternative colors, and if every "n" lines, you add a few more additional delays, you can shift your entire pattern of column by one, resulting in the chessboard pattern you are seeing.
The nice thing about the Atari ST, is that the timings of the screen and instructions is 100% reliable and predictable, which makes such a code relatively easy to do, so here are some numbers for you:
- The CPU runs at 8mhz (8 millions cycles per second)
- At 50hz (screen refresh) we have 160256 cycles per frame (roughly 8mz/50hz)
- These 160256 cycles are spread of 313 scanlines, with 512 cycles per scanline
Anyway, here is the complete source code of what is actually assembled in the final release4.
I hope that was interesting for you, if you have questions don't hesitate to ask :)
; Program start
move.w #$20,-(sp) ; GEMDOS: SUPER
move.l #$07770727,$ffff8242.w ; The White and Purple crosses
; Print the Dbug title
; Draw the cross patterns
move.l $44e.w,a0 ; screenpt
; VSync and disable interrupts
move.w #37,-(sp) ; VSYNC
; Main loop
move.w #$345,(a6) ; Light blue-gray background
move.w #$370,d6 ; Color of first tile
move.w #$263,d7 ; Color of second tile
; Line zero synchronisation
; To leave room for the "Dbug" line
; 64 nops*8 =
; The alternated color grid
move.w d6,(a6) ; 8/2 background color change
lsl.l #8,d3 ; 24/6 (DELAY)
move.w d7,(a6) ; 8/2 background color change
move.w (a6),(a6) ; (DELAY)
dbra d0,loop_squares ; 12/3 if branches / 16/4 if not taken
dc.b 27,"b",4 ; Set 4 as Ink color
dc.b 27,"E" ; Erase Screen
dc.b "Dbug @ Sommarhack 2020"
1. In practice we tend to use macro functionalities in our assemblers, like REPT-ENDR in Devpac↩
2. On the Atari ST, you are not allowed to access hardware registers and system variables as a simple user, you need to escalate to supervisor mode↩
3. I originally choose to have a table because I was not certain I only have two values there↩