1

Is there any reason to NOT use pre-shifted hardware coordinates for sprites in Neo Geo games?

I have been optimising the homebrew game Neo Thunder, and it has annoyed me how much CPU time is spent converting the X and (especially) Y screen coordinates of sprites to hardware coordinates. This is done to make sure they are in the right form to update VRAM correctly. The shifts alone (7 places to the left) take 20 CPU cycles each. With a large number of sprites, quite a lot of time is "wasted" doing this.
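For reference, the per-sprite conversion being discussed looks roughly like the sketch below. The bit layout (a 9-bit position in bits 15-7 of the SCB3/SCB4 words, Y stored as 496 minus the screen Y, sprite height in the low bits of SCB3) is an assumption based on the usual description of the Neo Geo sprite control blocks, not something taken from Neo Thunder's source. The function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed layout: X position lives in bits 15-7 of the SCB4 word. */
static inline uint16_t screen_x_to_hw(int16_t x)
{
    /* The 7-place left shift; truncating to 16 bits wraps X modulo 512 px. */
    return (uint16_t)((uint16_t)x << 7);
}

/* Assumed layout: SCB3 holds (496 - y) in bits 15-7 and the height in
   tiles in bits 5-0 (sticky bit 6 left clear here). */
static inline uint16_t screen_y_to_hw(int16_t y, uint16_t height_tiles)
{
    uint16_t y9 = (uint16_t)(496 - y) & 0x1FF;
    return (uint16_t)((y9 << 7) | (height_tiles & 0x3F));
}
```

Storing the results of these functions in the game's object lists, instead of the screen values, is what "pre-shifted" means in this thread.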




I think using native hardware coordinates would also make a sprite update buffer (where you save sprite updates for the vertical blank period) much simpler and faster to implement. For many sprites a separate buffer would no longer be needed at all, since the X and Y values can be taken *directly* from the bullet list, alien list etc. with no need for duplication or pointers.

Negatives (for Neo Thunder) :

- With the current version of Neo Thunder, doing it this way would mean I can no longer use ADDQ and SUBQ to quickly make a bounding box for a bullet in the player bullets vs aliens collision detection routine (the most time-intensive routine, which is written in inline assembler).

- I also won't be able to use the limit-check trick that works with unsigned numbers where you only need one limit check instead of 2.
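For readers who have not met it, the limit-check trick referred to here is usually written along the lines of the sketch below: subtract the lower limit and do one unsigned compare against the width of the range. The helper name `in_range_u16` is hypothetical.

```c
#include <stdint.h>

/* Returns nonzero when lo <= v <= lo + span, using one unsigned compare.
   If v is below lo, the 16-bit subtraction wraps to a large value,
   so the same compare also rejects it. */
static inline int in_range_u16(uint16_t v, uint16_t lo, uint16_t span)
{
    return (uint16_t)(v - lo) <= span;
}
```

For example, `in_range_u16(x, 3, 4)` accepts 3 through 7 and rejects everything else with a single comparison.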


However, I think overall it would be much faster to do it this way because I would save a lot of time throughout the program.

Neo Thunder is maybe unique in that every sprite is only 1 character in height and has no sticky bits set - so I can keep these lower bits permanently set and they won't affect comparisons etc. But even in games that don't have this advantage, it would be fast to just OR in the correct settings. (actually thinking about it more, some comparisons will work anyway so the OR is not needed for those)
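A small sketch of why constant low bits are harmless: if every stored value carries the same low 7 bits, ordering between two stored values depends only on the position bits. The constant `SIZE_ONE_TILE` and both helper names are assumptions for illustration (height of one tile in bits 5-0, sticky bit clear).

```c
#include <stdint.h>

#define Y_SHIFT       7
#define SIZE_ONE_TILE 0x0001u   /* assumed: 1 tile high, sticky bit clear */

/* Pre-shifted Y with the size bits permanently set. */
static inline uint16_t y_hw(uint16_t y9)
{
    return (uint16_t)((y9 << Y_SHIFT) | SIZE_ONE_TILE);
}

/* For games with varying sizes: keep the position bare and OR in the
   per-sprite attribute bits only when writing to VRAM. */
static inline uint16_t with_attrs(uint16_t y_hw_bare, uint16_t attrs)
{
    return (uint16_t)(y_hw_bare | (attrs & 0x7F));
}
```

Two values built with `y_hw` compare in the same order as their 9-bit positions, because the low bits are identical on both sides.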

I don't *need* to do this in order to get the game running smoothly, but it has made me think about the best way to store X and Y values for future projects. And it is always good to save time if possible

What do people think about doing it like this? Are there downsides that I am not seeing?

Thank you 👍

2

I just looked up Sega Megadrive sprites and they are so much easier 🙂 No need for shifts, and Y coords increase as they go down the screen. The only downside is that the top left of the screen is at 128,128.


3

I am 60% of the way through converting the game code to use "pre-shifted hardware coordinates" now, so I thought I would report back in case anyone else is interested in doing this in future.

Firstly I made a mistake about the range trick not being possible - it obviously is, since it actually works with *unsigned numbers*

Single range checks can be slightly trickier and slower though.

e.g

If (JoyLeft and (x <= -24))
... stop player movement to left

becomes something like

if (JoyLeft and ((x + 3072) >= 44800))

Please note this is equivalent to : (before pre-shifting)

if (JoyLeft and ((x + 24) >= 350))
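The two forms above can be checked against each other in C. This is only a sketch: it assumes X is a 9-bit value (0-511) that wraps, stored pre-shifted in a 16-bit word with the low 7 bits clear, and the names `HW`, `left_limit_hw` and `left_limit_px` are hypothetical.

```c
#include <stdint.h>

#define HW_SHIFT 7
/* Convert a pixel count to pre-shifted units, truncated to 16 bits. */
#define HW(px) ((uint16_t)((uint16_t)(px) << HW_SHIFT))

/* Pre-shifted form: (x + 3072) >= 44800, with 16-bit wraparound. */
static inline int left_limit_hw(uint16_t x_hw)
{
    return (uint16_t)(x_hw + HW(24)) >= HW(350);
}

/* Unshifted form: (x + 24) >= 350, with the assumed 9-bit wraparound. */
static inline int left_limit_px(uint16_t x9)
{
    return ((x9 + 24) & 0x1FF) >= 350;
}
```

Because the 16-bit wrap of the shifted value lines up with the 9-bit wrap of the pixel value, both functions agree for every X from 0 to 511.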


The other main issue I correctly identified is that ADDQ and SUBQ (which can only be used with the numbers 1-8) are no longer possible in many cases. I think this adds 4 CPU cycles for each ADDI or SUBI instruction that is substituted for them. I am mostly using C, but I believe the C compiler uses ADDQ, SUBQ etc. where it can.

Some examples :

-It's common in NeoThunder to subtract 1 from every alien and background tile position to move them to the left

-Player bullets are moved by 5 pixels to the right each time.

-And as mentioned in my first post, bounding boxes for the player and enemy bullets are constructed this way. (4 operations per bullet)


So these were all suitable for ADDQ, SUBQ use
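In C the same movements look like the sketch below once coordinates are pre-shifted. The point is only that a 1-pixel or 5-pixel step becomes 128 or 640 in stored units, which no longer fits the 1-8 immediate range of ADDQ/SUBQ. The macro `PX` and the function names are hypothetical, and what the compiler actually emits is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Pixels to pre-shifted units (position in bits 15-7). */
#define PX(n) ((uint16_t)((n) << 7))

/* Move every alien or background tile 1 pixel to the left. */
static void scroll_left_1px(uint16_t *x_hw, size_t count)
{
    for (size_t i = 0; i < count; i++)
        x_hw[i] -= PX(1);   /* subtracts 128; wraps from 0 to 511 px */
}

/* Move a player bullet 5 pixels to the right. */
static inline uint16_t bullet_step(uint16_t x_hw)
{
    return (uint16_t)(x_hw + PX(5));   /* adds 640 */
}
```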

The extra time taken does add up for loops with a large number of elements



BUT the saving of doing it this new way, is very large. e.g. each Y screen coordinate to VRAM conversion takes 44 CPU cycles and each X coord takes 20 CPU cycles (just the left shift 7 places needed for X coords!)

With up to 250 individual sprites on screen that is 250 x (44 + 20) = 16,000 cycles, which adds up to about 21 display lines saved (at roughly 768 CPU cycles per display line).


*So far* I think I have made the correct decision to do this. But maybe I will still find a bigger problem, as I continue with the conversion

It's also been quite annoying doing this, since gaps in my understanding made me make several mistakes that caused bugs which took me a while to fix. But I am learning more as I go on.

4

I have realized there is a big issue when collision boxes are off the left side of the screen, e.g. x now becomes 500 instead of what was previously a screen coordinate of -12.

I can't really think of a way round this, other than to add an offset to all x coords to make sure all checks are *always* done on positive values. Then when I update VRAM I will have to subtract this offset. This seems like the best way round to do it, because I don't want to slow down the player bullets vs enemies collision detection routine by doing extra arithmetic in those loop(s), e.g. 60 enemies x 20 player bullets = 1200 checks, whereas changing x every time I update 200+ sprites is faster.
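A minimal sketch of the offset idea, assuming a bias of 64 pixels (`X_BIAS_PX` is a made-up value; it only needs to be larger than the furthest any box can hang off the left edge). All names are hypothetical. Game logic works on biased values, and the bias is removed once per sprite when writing to VRAM.

```c
#include <stdint.h>

#define HW_SHIFT  7
#define X_BIAS_PX 64   /* hypothetical margin, assumes px >= -64 */

/* Screen X to the biased, pre-shifted value kept in the object lists. */
static inline uint16_t screen_x_to_biased(int16_t px)
{
    return (uint16_t)((uint16_t)(px + X_BIAS_PX) << HW_SHIFT);
}

/* Remove the bias when updating VRAM; wraps to the hardware's 9-bit X. */
static inline uint16_t biased_to_vram_x(uint16_t x_biased)
{
    return (uint16_t)(x_biased - (X_BIAS_PX << HW_SHIFT));
}

/* Horizontal overlap test on left edge + width, all in pre-shifted units.
   The sums are widened so they cannot wrap at 16 bits. */
static inline int overlap_x(uint16_t a_left, uint16_t a_w,
                            uint16_t b_left, uint16_t b_w)
{
    return (uint32_t)a_left < (uint32_t)b_left + b_w &&
           (uint32_t)b_left < (uint32_t)a_left + a_w;
}
```

With the bias, a box at screen x = -12 is stored as 52 pixels and compares correctly against a box at x = 0. Without it, the raw hardware value (500 pixels) makes the same test fail.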

Does anyone have any better solutions to this? Thank you

5

FINAL UPDATE: This turned out to be a great way to speed up my program. I have tried to max it out and I am getting 340+ total sprites moving on screen, at 60fps with collision detection on too. Not bad!

Doing it this way - using "hardware coordinates" for everything - was tricky for me to fully understand at first and there are a few things it makes slower. But for the simple shooter I am working on - this gave a nice speed boost overall. Would recommend doing it if you need some extra speed in your game.

6

Thanks for sharing this. Just posting to say it's being read with interest.