switch jump table?

Other misc things
Post Reply
alank2
Member
Posts: 116
Joined: Wed Mar 01, 2017 7:24 pm

switch jump table?

Post by alank2 »

I've got an emulation project with a large switch for opcodes that is switching near 256 ways for an unsigned char. I build it both ways (z80) and (8080) and I noticed that it doesn't build a jump table and instead builds a long stream of:

240 04d6 cad709 jp z,i_42 ;
240 04d9 fe01 cp +(1% 256)
240 04db cad709 jp z,i_43 ;
240 04de fe02 cp +(2% 256)
240 04e0 cad709 jp z,i_44 ;
240 04e3 fe03 cp +(3% 256)

I wondered if this was an 8080 thing, but I built for z80 and it also does the same. Can either build a jump table?

If not, what would be much faster would be a binary search where it could determine the code path with 8 comparisons (2^8). Implementing something like that would probably be a tall ask! I did see some webpages on Z80 and jump table though when doing a search.
User avatar
dom
Well known member
Posts: 2304
Joined: Sun Jul 15, 2007 10:01 pm

Re: switch jump table?

Post by dom »

For a large switch both mechanisms we use (the cp/jp as used for uint8_t and the value/address map used for other types) is not at all efficient. I'm not sure of the number of case statements where a binary search mechanism becomes more efficient - I'm guessing it might be around about 30 case statements?

Effectively what you need is a computed goto, but I have a hunch that I've not implemented that which is a pain.

Either way I need to some work, but there's a couple of ways to improve performance as is.

The first one is minor structural change - instead of having one large 256 case switch, split into chunks of 16 using the high nibble, and then switch inside that.

Alternatively create 256 functions, stick their addresses in an array and dispatch on that.
alank2
Member
Posts: 116
Joined: Wed Mar 01, 2017 7:24 pm

Re: switch jump table?

Post by alank2 »

So casting the switch to an unsigned short will force a jump table then?
Timmy
Well known member
Posts: 425
Joined: Sat Mar 10, 2012 4:18 pm

Re: switch jump table?

Post by Timmy »

As someone who sometimes use switch for all kind of stuff, I'd prefer not to cast something as flexible as a switch statement into a jump table.
Especially if the switch case has sparse numbers. Or if you have only 32 cases but you still need to make a jump table with 256 entries.

It might be easier and convenient if you just make an array of jump table yourself and jump it with jump_table[number].
(That said, my GAC interpreter had about 32 cases (or 48 or 64, can't remember) and I still used switch.)

On the other hand, I don't know how (in)efficient switch statements are translated, so 512 bytes might be better than lots of comparison cases.
alank2
Member
Posts: 116
Joined: Wed Mar 01, 2017 7:24 pm

Re: switch jump table?

Post by alank2 »

The cast only pushed it into a different method of check and jmp where it iterates a list of values and addresses to determine what to jump to.

Is there a jump table capability for z80 and/or 8080 with the instructions available?

I'm going to try a binary style if array to split the options down (8 ifs can be 2^8 or 256 separate "jumps").
alank2
Member
Posts: 116
Joined: Wed Mar 01, 2017 7:24 pm

Re: switch jump table?

Post by alank2 »

The if tree that first splits it by above or below 0x80, then 0x40, then 0x20, and so on definitely gave a solid performance increase (3.25X). I'm writing an i4004 emulator and testing it under CP/M, DOS, and WIN32. Here are some performance benchmarks. 0.95 used switch, 0.96 uses the if array method:

CP/M testing (2MHz 8080) v0.95 = 4133 Hz, v0.96 = 13470 Hz (3.25 times faster)
DOS testing (7.16MHz 8086) v0.95 = 92297 Hz, v0.96 = 103391 Hz (1.12 times faster)
DOS testing (VM workstation modern PC) v0.95 = 525615560 Hz, v0.96 = 880065334 Hz (1.67 times faster)
WIN32 testing (modern PC) v0.95 = 600929291 Hz, v0.96 = 2340454835 (3.89 times faster)
User avatar
jorgegv
Well known member
Posts: 312
Joined: Wed Nov 18, 2020 5:08 pm

Re: switch jump table?

Post by jorgegv »

dom wrote: Thu Feb 22, 2024 12:22 pm Alternatively create 256 functions, stick their addresses in an array and dispatch on that.
I do It this way in the Code that dispatches rule checks and actions in RAGE1. But to be space efficient, it's best that your switch values start at 0 and are contiguous (in RAGE1, they are).

Oh, and one advantage of this schema is that when you start having more than 255 switch cases, just bump the switch var to uint16_t, and no Code needs to be changed.
Post Reply