Flying Shark: Drawing

Tom Grove
Jan 21, 2021

The approach Flying Shark takes to drawing the screen is quite involved. It is a kind of "virtual" tile map approach similar to that used in "Uridium". The main structures and their relationships are summarized in the diagram below:

The Background map holds pointers to the tiles that make up the map - these are the pointers that are generated when the tile map is decompressed. There is one pointer for every 8x8 cell in the play area, i.e. 36 by 18 pointers.

A pointer stores the address of the top row ( i.e. the row that will form the first row on the display ). This pointer advances every 8 frames as new cell pointers are generated. When the pointer reaches the last row it wraps around to the top.

Every frame, the 24 rows following the background pointer are copied to the pointer map. These cells will be the cells rendered and this copy effectively initialises this map with the background cells.
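The wrap-around copy described above can be modelled in a few lines of Python. This is only a sketch, not the game's code: the sizes come from the text, but the names `visible_rows`, `background` and `pointer_map`, the use of tuples in place of real tile addresses, and the assumption that "following" means increasing row index with wrap-around are all illustrative.

```python
ROWS, COLS, VISIBLE = 36, 18, 24  # sizes taken from the text

def visible_rows(background, top_row):
    """Return copies of the 24 rows starting at top_row, wrapping past the last row."""
    return [list(background[(top_row + i) % ROWS]) for i in range(VISIBLE)]

# Hypothetical background map: each "pointer" is just a (row, col) tuple here.
background = [[(r, c) for c in range(COLS)] for r in range(ROWS)]

# Per-frame initialisation of the pointer map from the background map.
pointer_map = visible_rows(background, top_row=30)

# Overwriting a cell (as an object would) leaves the background untouched,
# which is why the next frame's copy "clears" the object again.
pointer_map[0][0] = "object cell"
```

Because the rows are copied rather than shared, nothing written into `pointer_map` during a frame survives into the next one.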

Some of these pointers will be replaced when objects are rendered. Objects are rendered into a further map. This is structured as 18 columns of 192 bytes. Each column starts on a page boundary ( i.e. at 256 byte intervals ). This is efficient for sprite rendering, as the high part of the address can simply be found by adding the index of the sprite's leftmost column to the MSB of the buffer address ( 0x88 ). The low part is given by the sprite's y position. Pointers into this map overwrite the background pointers in the cell map.
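As a worked example of that address calculation, here is a small Python model. The 0x88 base and the 18 x 192 layout are from the text; the function name `buffer_address` and the range checks are assumptions added for illustration.

```python
BUFFER_MSB = 0x88     # high byte of the buffer's base address (from the text)
COLUMNS = 18
COLUMN_HEIGHT = 192

def buffer_address(column, y):
    """Address of byte y in the given column of the page-aligned object buffer."""
    if not (0 <= column < COLUMNS and 0 <= y < COLUMN_HEIGHT):
        raise ValueError("outside the buffer")
    # High byte: base page plus column index. Low byte: the y position.
    return ((BUFFER_MSB + column) << 8) | y
```

For instance, column 3 at y = 100 gives high byte 0x8B and low byte 0x64. Stepping down a column only ever changes the low byte, so a single 8-bit increment moves to the next row.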

The buffer is ultimately rendered using this fragment of code. The stack is pointed at the pointer map and the tile addresses are popped. The stack pointer is then pointed at these tile addresses and the cell is filled using a series of pops and writes.

        LD      E,0x18          ; 24 rows
        LD      B,0x6           ; inner loop draws 3 tiles; 3x6 = 18
Row     POP     HL              ; stack set up earlier to point at
                                ; tilemap; this is first tile addr
        POP     IX              ; second tile
        POP     IY              ; third tile
        LD      (StackSave),SP
        LD      SP,HL           ; point at image data for first tile
        EXX
LAB_ram_c79f+1
        LD      D,0x40          ; patched
        POP     BC
        LD      (HL),C
        INC     H
        LD      (HL),B
        INC     H
        POP     BC
        LD      (HL),C
        INC     H
        LD      (HL),B
        INC     H
        POP     BC
        LD      (HL),C
        INC     H
        LD      (HL),B
        INC     H
        POP     BC
        LD      (HL),C
        INC     H
        LD      (HL),B
        INC     H
        EX      DE,HL
        :

 and another 2 times, taking image data from IX and IY

This writes the pointer map to the screen, but how is smooth scrolling achieved? The trick is to maintain pointers ( in HL and DE ) to both the current tile and the following tile. The code above is then patched to switch one of the INC H instructions for an EX DE,HL. E.g. if the buffer position modulo 8 was 4, then the code above would be:

        POP     BC
        LD      (HL),C
        INC     H
        LD      (HL),B
        INC     H
        POP     BC
        LD      (HL),C
        INC     H
        LD      (HL),B
        EX      DE,HL           ; <-- swap to next character cell
        POP     BC
        LD      (HL),C
        INC     H
        LD      (HL),B
        INC     H
        POP     BC
        LD      (HL),C
        INC     H
        LD      (HL),B
        INC     H
        EX      DE,HL
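To make the patching rule concrete, the following Python sketch generates the instruction sequence for one tile given a scroll offset. It is a model of the listing above, not the game's patching code: the function name `patched_ops` is invented, and extending the rule from the single offset-4 example to every offset from 1 to 7 is an assumption.

```python
def patched_ops(offset):
    """Mnemonics for one 8-byte tile, swapping cells after `offset` writes."""
    if not 1 <= offset <= 7:
        raise ValueError("this sketch only models offsets 1 to 7")
    ops = []
    written = 0
    for _ in range(4):                 # four POPs, two bytes each
        ops.append("POP BC")
        for reg in ("C", "B"):         # low byte first, then high byte
            ops.append(f"LD (HL),{reg}")
            written += 1
            # The patched slot switches to the next cell instead of stepping down.
            ops.append("EX DE,HL" if written == offset else "INC H")
    ops.append("EX DE,HL")             # unpatched swap that ends every tile
    return ops

listing = patched_ops(4)
```

With an offset of 4, `listing` has the same 21 instructions as the patched listing, with the swap taking the place of the fourth INC H.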

One feature of the way that this is written is that the first row - which won't be fully on screen - will write into the ROM. This wouldn't have done any harm on a real Spectrum but would cause mischief in any scenario where these addresses were writable.

This might seem a relatively convoluted way of managing a back buffer and it was by no means the most common approach at the time. On the ZX Spectrum, smooth vertical scrolling was often achieved by creating a linear back buffer - a buffer where a sequence of bytes representing one row is followed by the spatially adjacent row - and copying from an offset in this buffer to the screen. The translation from the linear buffer to the Spectrum's more exotic screen memory organisation can be done during the copy. By updating the source address by a line from frame to frame, smooth scrolling can be achieved for "free". The approach taken in Flying Shark, however, is more complex, needs more memory and is slightly slower.
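For comparison, the linear back buffer technique can be sketched in Python as below. The address formula is the standard ZX Spectrum display file layout; everything else (the names `screen_address` and `copy_linear`, the 64K `bytearray` standing in for memory, and the test pattern) is illustrative and not taken from any particular game.

```python
SCREEN_BASE = 0x4000  # start of the ZX Spectrum display file

def screen_address(x_byte, y):
    """Address of the byte at column x_byte (0-31) on pixel row y (0-191)."""
    return (SCREEN_BASE
            | ((y & 0xC0) << 5)   # which third of the screen
            | ((y & 0x07) << 8)   # pixel row within the character cell
            | ((y & 0x38) << 2)   # character row within the third
            | x_byte)

def copy_linear(memory, linear, first_row, rows=192, width=32):
    """Copy `rows` lines from a linear buffer, starting at line first_row."""
    for y in range(rows):
        src = (first_row + y) * width
        dst = screen_address(0, y)
        memory[dst:dst + width] = linear[src:src + width]

memory = bytearray(0x10000)                                 # hypothetical 64K address space
linear = bytes((i // 32) & 0xFF for i in range(32 * 200))   # line n is filled with n
copy_linear(memory, linear, first_row=5)                    # scrolled by 5 pixel lines
```

Changing `first_row` by one each frame scrolls by one pixel line. The formula also shows why the tile code uses INC H: within a character cell, consecutive pixel rows are exactly 0x100 bytes apart.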

These criticisms would only apply if the only thing you were interested in doing was scrolling the screen. For an actual game, though, this approach has several desirable features:

Static objects can be rendered simply by replacing the tiles that they overlap. This is used for the static gun emplacements.
Tiles can be animated by changing their image data. This allows for effects like animated water.
It's possible to distinguish between background and foreground cells and - if desired - run sprites behind the foreground cells. This is used throughout the game to allow tanks to emerge from buildings.

Furthermore, the sprite rendering is quite efficient. There is no need to track dirty regions of the background since the initial copy effectively clears the sprites. A linear buffer would also likely require sprite code to perform an ADD every few bytes to advance to the next row, which is significantly more costly than the INC r instructions that are used in the majority of the sprite code.

Objects themselves are drawn in a variety of ways. The simplest and cheapest method is to simply overwrite the background tile. This is used for the gun emplacements. The second simplest are aircraft and bullets - these first clip themselves against the play area extents then initialise part of the data buffer with a copy of the data pointed to by the cell buffer. This will either be pointers to background cells or pointers to other addresses within the data map if this sprite is overwriting another. A rendering routine is then chosen based on the sprite's orientation: there are a number of routines that handle the four combinations of reflections about the horizontal or vertical axis. The innermost ops are:

        POP     BC              ; mask/byte pair
        LD      A,(DE)          ; byte at dest
        AND     B               ; and with mask
        LD      L,C             ; low byte of reflection table
        XOR     (HL)            ; xor image byte into A
        LD      (DE),A          ; store in dst
        INC     E               ; inc low-byte of buffer addr
                                ; - dec if reflected

Due to the structure of the data buffer, sprites are laid out in columns of bytes rather than rows. The code above pops a mask/byte pair from the stack and reflects the image ( non-mask ) part of the pair using a 256 byte reflection table. The game stores the mask byte reflected relative to the image byte, so only one reflected look-up needs to be done within the inner loop. While the look-up is slightly more expensive than drawing pre-reflected sprites, storing pre-reflected copies would be unacceptable memory-wise, given that most of the sprites can be oriented.
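The per-byte operation of the inner loop can be modelled in Python as follows. This is a sketch under assumptions: the table is built here as a bit-reversal table (a left/right mirror), and the names `REFLECT`, `plot_byte` and `draw_column` are invented for illustration. Only the AND-with-mask then XOR-with-table-entry sequence is taken from the listing.

```python
# 256-entry table mapping each byte to its bit-reversed (mirrored) value.
REFLECT = [int(f"{b:08b}"[::-1], 2) for b in range(256)]

def plot_byte(dest, mask, image):
    """One inner-loop step: keep masked background bits, XOR in the reflected image."""
    # The mask is assumed to be stored already reflected, so only the image
    # byte goes through the table.
    return (dest & mask) ^ REFLECT[image]

def draw_column(buffer, start, pairs):
    """Apply (mask, image) pairs down one column of a bytearray buffer."""
    for i, (mask, image) in enumerate(pairs):
        buffer[start + i] = plot_byte(buffer[start + i], mask, image)

column = bytearray([0xAA] * 4)                       # hypothetical background bytes
draw_column(column, 1, [(0x0F, 0x0F), (0x00, 0x81)])
```

In the first pair, the mask 0x0F keeps the low nibble of the background while the image 0x0F is mirrored to 0xF0 and lands in the high nibble, which is the relationship between stored mask and stored image that the text describes.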

Ground-based enemies use a slightly different combination of operations. They only initialise the cells in the pointer map if the cells are "background" cells. Tanks also render two sprites - one for the body and one for the turret. Composing objects this way once again saves memory. Ground-based enemies can also go through another pathway if the "OR" composition flag is set in the object instance record. In this case, the sprite is first written to a small additional buffer which is then bitwise OR'ed into the data buffer. This simulates tanks being beneath trees. The script code includes an op to set this flag - presumably the idea being that after tanks have emerged from trees they might need to mask themselves to avoid disappearing. However, this is never used.

In total, there are 16 copies of the unrolled sprite drawing code: 8 and 16 byte versions of the masked routine ( each with 4 reflection combinations ), 4 routines that copy ( with reflection ) into the small composition buffer and 4 which mask a sprite onto the composition buffer. There are also two versions of the buffer initialisation - one for background and foreground cells and one for background cells only. This is a lot of code - about half of all the code in the game - but necessary to draw things in an efficient way on the Spectrum's limited hardware.

Once again, code is here https://github.com/tomgrove/FlyingSharkBlog
