Cosmic RikoLe 11/05/2025 at 15:35
It turned out that -O3 was the best to use. They all used pretty much the same technique, but -O3 inlined the sprite function whereas the others didn't (although I could possibly force this).
I made a simple example to show this. It just updates the x coordinate of each player bullet, since player bullets only move horizontally.
C code:
typedef struct
{
// where the sprite is placed on the screen
short x, y;
// hardware sprite to use
short spr;
// padding to make struct 8 bytes in size for speed of access
short pad1;
} pbullet_t;
for(i=0; i<=blp_nd; i++) // blp_nd = number of player bullets + 1
{
change_spritex_pos(bulletpsprites[i].spr, bulletpsprites[i].x);
}
Compiles to this (-O3):
001104: 97CB suba.l A3, A3
001106: 3839 0010 000C move.w $10000c.l, D4
00110C: B84B cmp.w A3, D4
00110E: 6D34 blt $1144
001110: 41F9 0010 0446 lea $100446.l, A0 // load base address of spr field in player bullet struct
001116: 45E8 FFFC lea (-$4,A0), A2 // load base address of x field in player bullet struct
*LOOP START*
00111A: 300B move.w A3, D0 // (A3 = i, the loop index) These 3 lines calculate offset of current element from base address. Struct is 8 bytes in size
00111C: 48C0 ext.l D0
00111E: E788 lsl.l #3, D0
001120: 3630 0800 move.w (A0,D0.l), D3 // Get Sprite number from struct
001124: 3432 0800 move.w (A2,D0.l), D2 // Get x coord from struct
*CHANGE SPRITE X COORD ROUTINE*
001128: 43F9 003C 0002 lea $3c0002.l, A1
00112E: 3003 move.w D3, D0
001130: 0640 8400 addi.w #-$7c00, D0
001134: 3340 FFFE move.w D0, (-$2,A1) // write correct address for that sprite number to VRAM port
001138: 3202 move.w D2, D1
00113A: EF49 lsl.w #7, D1
00113C: 3281 move.w D1, (A1) // write new x coordinate of sprite to VRAM
*END OF SPRITE ROUTINE*
00113E: 524B addq.w #1, A3 // increment loop index
001140: B84B cmp.w A3, D4
001142: 6CD6 bge $111a
*END LOOP*
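For reference, the inlined sprite routine in the listing above corresponds roughly to the C below. This is a reconstruction from the disassembly, not the original source: the port macros and the fake_vram_ports array are hypothetical names so that it builds on a host machine, and the always_inline attribute assumes the compiler is GCC (one possible way to force the inlining at lower -O levels).

```c
#include <stdint.h>

/* On the target these would be the memory-mapped ports seen in the
   listing ($3C0000 and $3C0002). For a host build they point at a
   plain array instead. All three names are hypothetical. */
static volatile uint16_t fake_vram_ports[2];
#define VRAM_ADDR_PORT (&fake_vram_ports[0])   /* $3C0000 in the listing */
#define VRAM_DATA_PORT (&fake_vram_ports[1])   /* $3C0002 in the listing */

/* GCC-specific: ask for inlining regardless of the optimization level. */
static inline __attribute__((always_inline))
void change_spritex_pos(short spr, short x)
{
    /* addi.w #-$7c00 is the same as adding $8400 modulo 2^16 */
    *VRAM_ADDR_PORT = (uint16_t)(spr + 0x8400);
    /* lsl.w #7: shift as unsigned to avoid shifting a negative value */
    *VRAM_DATA_PORT = (uint16_t)((uint16_t)x << 7);
}
```

Calling change_spritex_pos(3, 100) here writes $8403 to the address port and 100 << 7 to the data port, matching the two move.w stores in the listing.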
Why not just load the base address of the struct into A0 before the loop starts, and then just do:
LEA 8(A0),A0 each loop to increment A0 by 8 bytes
Then to access the fields of the struct, do MOVE.W (A0),D2 (for the x coord) and MOVE.W 4(A0),D3 (for the sprite number)
To me, this just seems the simple way. But it's currently calculating the address from the index every time. It even did this when I used to have structs of 24 bytes and it had to do more work to get there (with a shift and some adds). Back then the indexed addressing modes used were even slower, with extra unnecessary offsets.
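One way to express the "load the base once, then step by 8 bytes" idea at the C level is to walk a pointer over the array instead of indexing it. The sketch below is only that: update_bullets_x, last_index and the recording stub are hypothetical names, change_spritex_pos is replaced by a host-side stand-in, and whether a given compiler version then emits the LEA / (A0) sequence is not guaranteed.

```c
typedef struct
{
    short x, y;   /* where the sprite is placed on the screen */
    short spr;    /* hardware sprite to use */
    short pad1;   /* padding to make the struct 8 bytes */
} pbullet_t;

/* Host-side stand-in for the real routine (hypothetical): it just
   records the last x written for each sprite number. */
static short last_x_for_sprite[64];
static void change_spritex_pos(short spr, short x)
{
    last_x_for_sprite[spr] = x;
}

/* last_index is the last element visited, as in the original
   i <= blp_nd loop; -1 means there is nothing to update. */
static void update_bullets_x(pbullet_t *bullets, short last_index)
{
    pbullet_t *p = bullets;
    pbullet_t *end = bullets + last_index + 1;   /* one past the last bullet */

    for (; p < end; p++)                         /* p advances sizeof(pbullet_t) bytes */
    {
        change_spritex_pos(p->spr, p->x);
    }
}
```

In the game loop this would be called as update_bullets_x(bulletpsprites, blp_nd). The pointer form gives the compiler the same information as the indexed form, so it is a hint rather than a guarantee; comparing the listings for both is the only way to know which one it prefers.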