GPU in Main… Science!

This can be somewhat of a taboo in the Jaguar world, and it seems to crop up every once in a while, sometimes heralded as the ultimate fix, sometimes just mentioned as an interesting quirk.  The RISC CPUs in the Jag have their fair share of bugs.  One of these affects the GPU executing its code from the system’s main RAM, which officially restricts it to running code out of the limited 4K of local RAM built onto the chip.  Naturally no one ever abides by manufacturers’ rules, and it was soon discovered that it is in actual fact possible to run code from main memory!  There are a few caveats about address restrictions when it comes to jumps, but nothing too complex.  It is most likely a simple cock-up that snuck past in the final design of the chip, and Atari at the time thought it easier to simply say “do not do this” rather than having to come up with workarounds.  Needless to say, there are a few commercially released games on the Jag that actually run code from main RAM (Rayman being one of them).

Anyway, that’s all by the by.  There is a lot of passion around this technique, and unfortunately the FUD that comes with passion.  So I sat down and decided to try and shine some sciency light on it after all! (I may as well put that BSc Computer Science (Hons) to use I guess 😀 )

So here are a few facts:

  1. The Atari Jaguar has a single shared bus between all of its devices and the main memory
  2. Main memory is 2MB of DRAM (120ns)
  3. The local RAM on the RISC devices is SRAM, 32 bits wide, and has its own local bus to the RISC core
  4. If the GPU is accessing main RAM it is tying up the bus, so unless a higher-priority device comes along and nabs it, the GPU has the bus and nothing else gets to play with the main RAM.

What does this mean performance-wise?  Well, DRAM is significantly slower than SRAM, and requires regular refreshing.  So instruction fetches are going to be slower, and that is assuming nothing else has the bus (there are 4 other devices that could grab it or want it).

The performance aspects, however, always seem to be overlooked.  Some rules suggest avoiding “tight loops” in main RAM, but to be honest this is irrelevant, as everything you run from main RAM will take longer anyway.  To prove this point (here comes the science) I have crafted a simple little piece of code.

My aim is to accurately time the GPU running code in local RAM and also the exact same code in main RAM.  To do this I am using the programmable timers available within the Jag: setting the JPIT counters causes them to decrement based on the ticks from the system clock (~25MHz).  The idea is simple:

  1. Set up a counter
  2. Read the counter’s value at the start
  3. Do some busy work (taking care not to access registers in a way that causes a pipeline stall)
  4. Read the counter’s value at the end
  5. Save both counter values and subtract one from the other

The final value will be the number of ticks of the counter to complete the busy work.
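
The subtraction in step 5 can be sketched off-target in a few lines of Python.  This is only an illustration, not part of the test itself: the function name `elapsed_ticks` is made up here, and it assumes the readable counter is 16 bits wide (it is read with `loadw` in the listing further down) and counts down, so the start reading is normally the larger of the two.  Masking the result means a reading that wraps through zero between the two samples still gives the right answer, provided the busy work takes less than one full wrap of the counter.

```python
def elapsed_ticks(start, end, width_bits=16):
    """Ticks elapsed on a down-counting timer between two readings.

    Assumes (for this sketch) a counter that decrements and wraps
    around modulo 2**width_bits, and that fewer than 2**width_bits
    ticks passed between the two readings.
    """
    mask = (1 << width_bits) - 1
    return (start - end) & mask


# First local RAM run from the results in this post.
print(elapsed_ticks(0xD44C, 0xCAE2))

# A pair of readings that wraps through zero still works.
print(elapsed_ticks(0x0010, 0xFFF0))
```
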

To remove any question about loops etc I made a nice simple flat piece of code for the testing:

gpucode:
.GPU
.ORG G_RAM

    movei    #$F10036,r0     ; The JPIT Readable counter
    movei    #startval,r1    ; where we are going to store our start counter
    movei    #endval,r2      ; where we are going to store our end counter
    moveq    #0,r3           ; start counter value reg
    moveq    #0,r4           ; end counter value reg

    ; get the current counter value
    loadw    (r0),r3         ; save this in the start counter reg

    ; now for some busy work
    rept    400           ; 400 repetitions
        moveq    #4,r10
        move    r12,r13
        moveq    #6,r11
        move    r14,r15
    endr

    ; get the counter now
    loadw    (r0),r4
    nop
    nop
    nop

    ; save our counters
    store    r3,(r1)
    store    r4,(r2)

    ; lots of pointless faffing just to make sure the writes have completed
    nop
    nop
    nop
    nop

    ; change the screen colour so we know we have finished faffing
    movei    #BG,r20
    movei    #$4400000,r21
    nop
    nop
    store    r21,(r20)

    moveq    #0,r5        ; stop the GPU
    movei    #G_CTRL,r6
    nop
    nop
    store    r5,(r6)
    nop
    nop
    nop

As you can see, nothing amazingly complex.  The busy work itself performs no memory reads or writes; these are pure and simple register instructions which should each complete in a single operation.  The results from this little test are quite telling, but not surprising really:

I ran the test 3 times for each.  The values output are the hex values of the timer; as I simply reset the Jaguar with the jcp -r command, the JPIT counter doesn’t actually reset but carries on regardless! (I didn’t know that until now! learning! isn’t science great! 😀 )  This is why the values move around, but the interesting part is the difference between the two values, which represents how long it took to complete our 1600 instructions of busy work (4*400).  So first up, running the code in local RAM on the chip:

$d44c – $cae2 = $96a = 2410
$988c – $8f4f = $93e = 2531
$8d48 – $83e6 = $962 = 2402

Average of about 2448 ticks to complete 1600 instructions

And now EXACTLY the same code in Main RAM

$f519 – $a335 = $51e4 = 20964
$7567 – $23a5 = $51c2 = 20930
$86e9 – $3519 = $51d0 = 20944

Average of about 20946!!
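
For anyone who wants to check the sums, here is a small Python sketch that works the averages through.  It takes the decimal tick counts exactly as listed above and assumes the busy work is the 4 × 400 = 1600 instructions generated by the rept block; the variable names are just for illustration.

```python
# Busy work: 4 instructions repeated 400 times by the rept block.
INSTRUCTIONS = 4 * 400

# Decimal tick counts as reported in this post.
local_ticks = [2410, 2531, 2402]
main_ticks = [20964, 20930, 20944]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


local_avg = mean(local_ticks)
main_avg = mean(main_ticks)

# How many times longer the main RAM run took.
slowdown = main_avg / local_avg

# Rough cost of one instruction in timer ticks.
local_per_instr = local_avg / INSTRUCTIONS
main_per_instr = main_avg / INSTRUCTIONS

print(f"local RAM: {local_avg:.0f} ticks, {local_per_instr:.2f} per instruction")
print(f"main RAM:  {main_avg:.0f} ticks, {main_per_instr:.2f} per instruction")
print(f"slowdown:  {slowdown:.2f}x")
```

On these figures the main RAM run works out at a little over 8.5 times the local RAM run, or roughly 13 ticks per instruction against about 1.5.
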

That is roughly 8.5 times slower!!  And these instructions don’t really do ANYTHING!  And this is on a system where the only other thing running is the 68K, which is sat patiently waiting for results to appear.  If additional padding nops were added to the code to make jumps work, or there were instructions that actually accessed other areas of main RAM, or perhaps even WRITE to main RAM.. well, things are going to get slower and I dare say more messy as the RAM page is flipped back and forth..

So my verdict.. run it in local, people.  There may be some situations where it is necessary to run in main, but I would view these as the edge cases, the minorities.  It should be possible to run pretty much everything in local; a bit of thought and some paging of code if required should be all that’s needed to keep your GPU code running in a tip-toppety fashion.

Hopefully people will find this an informative and useful read.  At the end of the day this is a hobby, if you want to run your code in main, go for it! have fun! enjoy what you are doing! but just don’t expect it to be the most snappy code.

SoundEngine 0.21 released

A much smaller update than originally planned; I was just too excited to get the new pad reading release out there, as well as the updated manual.

As well as the release of this version I have also updated the website to include a list of known and resolved bugs.  If you find a bug that’s not already listed here, please let me know.

As always the latest version can be downloaded here

Enjoy

 

Manual work

Another night spent working my way through the SoundEngine manual, version 0.21 nears launch, so there is a new feature to document, some old features I had forgotten to document (Thanks CJ for pointing those out 🙂 ) and a general tidy up here and there.

Still hoping to have the next version out this month (February), so keep checking back, or the usual places on AtariAge and Jagware (which I will update as and when it’s released).

Freeing up RAM and bus

Resources can be quite tight within the Jaguar system.  All those CPUs fighting for time on the bus, only 2MB of RAM, 4MB (mostly) of cart space.  Compromising assets to make a game fit or work isn’t something you want to do, especially for the people that have created those assets 🙂

With that in mind I am always looking for ways to free up ROM/RAM, and keep the SoundEngine off the main bus as much as possible.  I have spent the last few days pondering a few tweaks that could benefit both storage capacity and bus use, and from the initial very basic experiments tonight these are looking quite promising.  I shouldn’t really count my bytes before they have hatched, but I am pretty sure that there could be some significant savings coming soon in regards to music playback on the Jaguar..

Watch this space…

Inclusion in Rebooteroids furthering feature development

It’s quite an exciting time in the U-235 bunker at the moment, work has been progressing nicely these last few weeks adding some extra features to the SoundEngine.  U-235 are very proud to have had our SoundEngine selected for use by Reboot in their RAPTOR Engine and the subsequent games based on this technology.  So it’s obviously a great honour to have our work also be ported into the upcoming release Rebooteroids!

Working closely with the guys in Reboot to meet the needs of Rebooteroids highlighted a new feature for the SoundEngine, Pad & spinner code!  Now the SoundEngine takes care of polling pad 1 & 2, presenting the state of the pads as two individual bitmaps.  Spinner code has also been added to allow the reading of rotary controllers.

At this time the code is only present in the custom build of the SoundEngine, but these features will of course be ported into full general releases of the SoundEngine.  Pad read code is planned to be included in the next release, with spinner reading code coming in a release after that most likely.  The next release of the SoundEngine will hopefully be in February this year assuming no big hitches with the development or available time.

Big thanks to Cyrano Jones of Reboot for the suggestion and various code fragments and feedback.

Jaguar Demo

Not something produced by U-235, but I have always loved demos on whatever platform.  Always wanted to make one myself 🙂

It’s good to see them starting to appear on the Jag too! Given the toys under the hood it should be interesting to see what the demo boffins can produce on the cat!

Check out J_ by Checkpoint over at pouet

Or watch the YouTube video uploaded by Reboot:

e-Jagfest

Amazing how quickly time flies when you are having fun (moving house etc).. e-Jagfest 2013 is nearly upon us!  Celebrating 20 years of the Jag!  Has to be the best Atari Jaguar event in the whole world!  A full 64 bits of awesomeness, interactive, multimedia…. ahem…

We (U-235) are going to be there, along with fellow Jaguar fans, to wish the Jaguar a happy 20th birthday.  Why not come along, join in the fun, talk to devs, fans, collectors.. eat tasty German sausage, drink their beer & coffee and generally have a bloody good time!

Join the event page on Facebook here

New project…

We decided to try and get something together for an up-coming show, time has been quite short and there have been a few sticking points but progress has been made and things are moving along nicely.  If you follow my twitter feed (@link2076) in amongst all the other random nonsense you will find the odd tid-bit of info as I work on the project.

It is going to be a game, and yup, poor ole GazTee is the hero for this game 🙂  Let’s hope he packed some warm clothing…..

 

Jaguar Archives

They’re back!

I just noticed that the Archive of files we host on our site wasn’t as readily accessible as I thought! OOPS! my bad! sorry!  I have now (hopefully) rectified this, point your pointing thing over the ‘Jaguar Archives‘ page and fill yer boots.  🙂

Sound Engine 0.20 RELEASED!

It’s been far far too long, and not all of the features I hoped to get in this release are there, still plenty to do and add.  A significant bug was identified by Matmook of Jagware which would result in small looped samples sounding terribly off key.  This release fixes that bug.

In addition to this fix there is now a pseudo-random number generator built into the Sound Engine!  Whilst it is loaded into the DSP and the DSP is running, it will merrily generate 16 bits of pseudo rubbish.  Hopefully this will be of use and free up some ticks elsewhere in your projects.

Support for the Vibrato & Volume slide effects has also been added.

The majority of changes have mostly been around some internal code tidying and documentation to aid in future development.

Head here to download the latest version, enjoy!