
bytecode generation / VM for C ?

Posted: Fri Oct 26, 2018 12:27 pm
by nippur72
I was reading this stackexchange post about why Z80 is a poor target for C and I was thinking... why not go in the opposite direction instead? I mean, instead of striving for optimization, why not produce bytecode that is executed on a sort of C virtual machine? There would be a loss of execution speed, but at least the code would be very compact. Has anyone ever investigated this direction? What are your thoughts? Is it worth exploring?
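To make the idea concrete, here is a minimal sketch of a stack-based bytecode interpreter in C. The opcode names, the 16-entry stack and the encoding are all invented for illustration; nothing here comes from z88dk or any existing compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes, invented for this sketch. */
enum { OP_PUSH8, OP_ADD, OP_MUL, OP_HALT };

/* Interprets `code` and returns the value on top of the stack at OP_HALT. */
int16_t vm_run(const uint8_t *code)
{
    int16_t stack[16];
    int sp = 0;                     /* number of values currently stacked */
    const uint8_t *pc = code;

    for (;;) {
        switch (*pc++) {
        case OP_PUSH8:              /* operand: one signed byte */
            assert(sp < 16);
            stack[sp++] = (int8_t)*pc++;
            break;
        case OP_ADD:                /* pop two values, push their sum */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] = (int16_t)(stack[sp - 1] + stack[sp]);
            break;
        case OP_MUL:                /* pop two values, push their product */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] = (int16_t)(stack[sp - 1] * stack[sp]);
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}

/* (2 + 3) * 4, encoded in 9 bytes of bytecode. */
const uint8_t demo_program[] = {
    OP_PUSH8, 2, OP_PUSH8, 3, OP_ADD, OP_PUSH8, 4, OP_MUL, OP_HALT
};
```

Calling vm_run(demo_program) evaluates the expression one opcode at a time. The dispatch loop is a fixed cost that is paid once, while each compiled expression only costs its bytecode, which is where the size saving would come from.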

Posted: Fri Oct 26, 2018 5:53 pm
by dom
That aralbrec person seems to know a lot about z80 C compilers...

I've not seen any projects that do as you suggest (apart from p-code of course), but there are some interesting ideas out there:

https://github.com/lefticus/x86-to-6502

There's a z80 fork that converts the x86 output of llvm to z80, subject to all the usual caveats over large datatypes. I'm not sure how well it would cope with the complex codebases that z88dk is compiling.

For me the following is more interesting (I've turned into a Java/Kotlin developer to earn money):

https://github.com/mikeakohn/java_grinder

It takes JVM bytecode and converts it to assembler; the caveat here is library support of course. I think there are interceptors for string operations so they can be handled natively.

The interpreting route will be slower. Take Simon Owen's vic20 emulator: https://simonowen.com/spectrum/vic20emu/ - it runs at about 1/10 of the expected speed, so I guess the equivalent of a 6502 at somewhere between 100 and 300 kHz (depending on how you interpret 1/10) on a 3.5 MHz z80. I guess a carefully chosen bytecode (with high-level, more complex instructions) would be faster when interpreted than that though.
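As a rough illustration of what a "high-level" instruction could look like, the sketch below has a single hypothetical OP_FILL opcode that fills a whole block of memory in one dispatch, so the interpreter overhead is paid once instead of once per byte. The opcode, its operand layout and the dispatch counter are assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical opcodes, invented for this sketch. */
enum { OP_FILL, OP_END };

/* Runs `code` against `mem` and returns how many opcodes were dispatched. */
unsigned vm_run_fill(const uint8_t *code, uint8_t *mem)
{
    unsigned dispatched = 0;

    for (;;) {
        dispatched++;
        switch (*code++) {
        case OP_FILL: {             /* operands: offset, length, value */
            uint8_t off = *code++;
            uint8_t len = *code++;
            uint8_t val = *code++;
            memset(mem + off, val, len);   /* the whole block in one step */
            break;
        }
        case OP_END:
            return dispatched;
        default:
            assert(0 && "unknown opcode");
            return dispatched;
        }
    }
}
```

A program such as { OP_FILL, 4, 16, 0xAA, OP_END } fills 16 bytes with only two trips through the dispatch loop, whereas an emulated CPU would run its fetch/decode cycle for every instruction of the equivalent loop.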

Posted: Sat Oct 27, 2018 8:11 am
by nippur72
Java_grinder is interesting, I have to try it out one of these days...

Also LLVM IR is interesting; there is an experimental Z80 backend, but nothing about turning it into p-code.

My idea is to have the C compiler emit compact bytecode by default and occasionally switch to pure machine language when speed is needed (e.g. with the use of #pragma).
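One way the mixed approach might work at run time is an "escape" opcode that calls into a table of natively compiled routines. The sketch below is only an assumption about how that could look: OP_NATIVE, native_table and square_native are invented names, and the #pragma that would tell a compiler which functions go into the table is not shown because no such pragma is known to exist.

```c
#include <assert.h>
#include <stdint.h>

/* A routine the compiler would have emitted as machine code (hypothetical). */
typedef int16_t (*native_fn)(int16_t);

static int16_t square_native(int16_t x) { return (int16_t)(x * x); }

static const native_fn native_table[] = { square_native };

/* Hypothetical opcodes, invented for this sketch. */
enum { OP_PUSH8, OP_ADD, OP_NATIVE, OP_HALT };

int16_t vm_run_mixed(const uint8_t *code)
{
    int16_t stack[16];
    int sp = 0;                     /* number of values currently stacked */

    for (;;) {
        switch (*code++) {
        case OP_PUSH8:              /* operand: one signed byte */
            assert(sp < 16);
            stack[sp++] = (int8_t)*code++;
            break;
        case OP_ADD:                /* pop two values, push their sum */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] = (int16_t)(stack[sp - 1] + stack[sp]);
            break;
        case OP_NATIVE: {           /* operand: index into native_table */
            uint8_t idx = *code++;
            assert(sp >= 1);
            assert(idx < sizeof native_table / sizeof native_table[0]);
            stack[sp - 1] = native_table[idx](stack[sp - 1]);
            break;
        }
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}

/* square(5 + 2): the addition is interpreted, the squaring runs natively. */
const uint8_t mixed_program[] = {
    OP_PUSH8, 5, OP_PUSH8, 2, OP_ADD, OP_NATIVE, 0, OP_HALT
};
```

With this layout a native call costs two bytes of bytecode, so the speed-critical parts can stay in machine code while the rest of the program remains compact.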