Best Way to Stream to Multiple Channels on the Sega CD?

Discussion in 'Discussion & Q&A' started by Devon, Oct 20, 2018.

  1. Devon

    Devon I'm a loser, baby, so why don't you kill me? Member

    Joined:
    Aug 26, 2013
    Messages:
    1,376
    Location:
    your mom
    I know this forum may be kind of a silly place to ask, but I know it has some activity and users that MAY know.

    Anyways, I've been trying to stream PCM data to each of the 8 channels on the Sega CD, and while it works for the most part, my issue is that it's been causing lag. What I mean by that is that it's been slowing down my driver by quite a bit and I am unsure what the cause is. Is the PCM chip really just that slow or could there be a problem?

    This is what I have.

    I guess it will help to say that I am using INT3 and have the timer set to 1. Could that be causing anything?
     
  2. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    You will have to elaborate on what this subroutine does. I have a suggestion, but I won't make it until I am sure the subroutine is doing what I think it's doing:
    Code:
        bsr.w   WaitPCM
    I can make a few suggestions with the rest of the code though:
    Code:
    .Copy:
        move.b  (a0)+,d0            ; Get sample byte
        cmpi.b  #$FF,d0             ; Is it the termination flag?
        bne.s   .Write              ; If not, branch
        movea.l tSampLoop(a4),a0        ; Reset sample position
        move.b  (a0)+,d0            ; Get byte from there
    
    .Write:
        move.b  d0,(a1)+            ; Store in wave bank
        addq.w  #1,a1               ; Fix wave bank address
        dbf     d1,.Copy            ; Copy until done
    You want the conditional branch for FF the exact opposite way around.

    PCM wave samples are very long, as you well know. Assume you have a sample that is $800 bytes long, with an FF end marker (loop marker) as byte $801. From a probability perspective, the chance of any given byte you read being the FF is below 1%. Even a sample which has just 1 byte of PCM data followed by the FF would only have a 50% chance of reading an FF (and this percentage only drops the larger the sample is). For a short branch, not branching is quicker than branching (8 cycles versus 10 on the 68000), and since not getting an FF is by far the most likely case, you'll want to branch only if you get an FF:
    Code:
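    .Reset:
    		movea.l	tSampLoop(a4),a0	; Reset sample position (rare case)
    
    .Copy:
    		move.b	(a0)+,d0		; Get sample byte
    		cmpi.b	#$FF,d0			; Is it the termination flag?
    		beq.s	.Reset			; Branch only in the rare case that it is
    		move.b	d0,(a1)+		; Store in wave bank
    		addq.w	#$01,a1			; Fix wave bank address
    		dbf	d1,.Copy		; Copy until done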
    You will save 2 cycles for every PCM byte you load, so if a sample is $800 bytes, you will have saved yourself $1000 CPU cycles. Another saving is to place an FF into a register and then compare register with register instead of using an immediate value. The compare instruction is then a word shorter and saves you 4 cycles for every PCM byte, which is $2000 cycles over the same sample.
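
    As a rough sketch of the register compare described above, the loop could look something like this. It assumes d2 is free to hold the end marker (that register choice is not from the original code) and that the rest of the routine is unchanged:

    ```
    		moveq	#-1,d2			; d2.b = $FF, the end marker (d2 is an assumed free register)
    		bra.s	.Copy
    
    .Reset:
    		movea.l	tSampLoop(a4),a0	; Reset sample position
    
    .Copy:
    		move.b	(a0)+,d0		; Get sample byte
    		cmp.b	d2,d0			; Register compare: 4 cycles instead of 8 for cmpi.b #$FF,d0
    		beq.s	.Reset			; Branch only if it is the end marker
    		move.b	d0,(a1)+		; Store in wave bank
    		addq.w	#1,a1			; Fix wave bank address
    		dbf	d1,.Copy		; Copy until done
    ```

    The moveq only runs once before the loop, so its 4 cycles are negligible next to the per-byte saving.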

    A more complicated way of saving time (given that your buffers are more than likely going to be a power of 2 in size) is to copy a long-word at a time:
    Code:
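    		moveq	#$FFFFFFFF,d1		; d1.b = $FF, the end marker
    		bra.s	Next
    
    Reset:
    		movea.l	tSampLoop(a4),a0	; Reset sample position
    
    Next:
    		cmp.b	(a0),d1			; Does this long-word start with the end marker?
    		beq.s	Reset			; If so, reload the loop position
    		move.l	(a0)+,d0		; Get 4 sample bytes
    		movep.l	d0,$00(a1)		; Store them in every other byte of the wave bank
    		addq.w	#$08,a1			; Advance wave bank address past them
    		dbf	d2,Next			; d2 counts long-words, not bytes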
    This will cost you some padding on every sample, to ensure that the sample's FF end marker lands on the first byte of a long-word. It also requires the sample to be aligned to an even address, since a move.l from an odd address causes an address error on the 68000, but it'll be worth the CPU time you would save.
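
    To picture the layout this needs, here is a sketch of how a 6-byte sample might be stored. The label, the sample bytes and the $00 padding value are all made up for illustration, and `even` is assumed to be your assembler's directive for aligning to an even address:

    ```
    		even				; Assumed directive: align to an even address
    ExampleSample:					; Hypothetical label
    		dc.b	$10,$20,$30,$40		; Long-word 0: 4 sample bytes
    		dc.b	$50,$60,$00,$00		; Long-word 1: last 2 sample bytes + 2 padding bytes (value assumed)
    		dc.b	$FF			; End marker on the first byte of the next long-word
    		even				; Keep whatever follows on an even address
    ```

    Note that the padding bytes do get copied into the wave bank along with the real data, so they should be something quiet. The loop position in tSampLoop would need the same treatment: an even address that sits a whole number of long-words before the FF, since only the first byte of each long-word is checked.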

    One final thought: when a channel is not playing a sample, do you simply not write data to the PCM RAM buffer area and mute the channel, or do you give your code an address to a blank sample so it appears to be playing nothing? If it's the latter, then how large is this blank sample? If it is a single 00 byte followed by an FF or something ridiculous like that, then you will lose time, because the loop has to reload the sample address far more often (no matter which method above is used, including yours). I would then suggest holding a large enough blank sample of 00's to give your loop fewer chances of reloading, say at least a buffer size's worth.

    There are a few minor things here and there, but nothing I think is going to cause you major time loss. I am still curious about "WaitPCM" though. If it is what I think it is, then you may have another potential problem on your hands.
     
  3. Devon

    Thanks for the tips.

    My WaitPCM routine is like so:
    Code:
    WaitPCM:
        move.l  d0,-(sp)            ; Save d0
        moveq   #20,d0              ; Delay count
        dbf     d0,*                ; Spin in place until the count runs out
        move.l  (sp)+,d0            ; Restore d0
        rts
    I could probably find a way to make it not have to save d0 to the stack, but what I am unsure about is whether storing 20 into d0 is appropriate. I have it as that because every other PCM wait function I've seen uses that value. With that, it's probably exactly what you thought :V
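
    For a rough idea of what each call costs, here is the same routine annotated with standard 68000 cycle counts. This is only an estimate: it assumes the call is made with bsr.w and that there are no extra wait states on the Sega CD side.

    ```
    ; Approximate cost of one WaitPCM call (standard 68000 timings, no wait states assumed)
    ;	bsr.w	WaitPCM			; 18 cycles (in the caller)
    WaitPCM:
    	move.l	d0,-(sp)		; 12 cycles
    	moveq	#20,d0			;  4 cycles
    	dbf	d0,*			; 20 x 10 (looping) + 14 (exit) = 214 cycles
    	move.l	(sp)+,d0		; 12 cycles
    	rts				; 16 cycles
    ; Total: 18 + 12 + 4 + 214 + 12 + 16 = 276 cycles per call
    ```

    So how much this hurts depends on how often it gets called per interrupt.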

    Also, when a channel isn't playing a sample, I just mute it and don't stream to it.
     
    Last edited: Oct 20, 2018