Rev01 Labyrinth Zone deformation slowdowns

Discussion in 'Discussion and Q&A Archive' started by Selbi, Aug 17, 2014.

Thread Status:
Not open for further replies.
  1. Selbi

    Selbi The Euphonic Mess Member

    Joined:
    Jul 20, 2008
    Messages:
    2,429
    Location:
    Northern Germany
    I have no hack projects going on at the moment, so this is just a question out of curiosity.

    MrCat posted this video, and like many other people he copied the fancy underwater Labyrinth Zone background deformation over to Marble Zone. I remember doing this myself some years ago; it is a nice effect to simulate heat, after all. However, wherever I see it, it always comes with terrible slowdowns for the game.

    What always bothered me about my own hack is that the first few seconds of Labyrinth Zone are horribly slow because of this.

    Not that it matters anymore, but I want to know: Is the LZ rev01 deformation code terribly unoptimized or is it simply because of this effect being so incredibly resource-demanding?

    The code in question. Apparently, my Firefox doesn't want me to use tabs and spaces. Gah.



    Deform_LZ:
            move.w  ($FFFFF73A).w,d4
            ext.l   d4
            asl.l   #7,d4
            move.w  ($FFFFF73C).w,d5
            ext.l   d5
            asl.l   #7,d5
            bsr.w   ScrollBlock1
            move.w  ($FFFFF70C).w,($FFFFF618).w
            lea     (LZ_Wave_Data).l,a3
            lea     (Obj0A_WobbleData).l,a2
            move.b  ($FFFFF7D8).w,d2
            move.b  d2,d3
            addi.w  #$80,($FFFFF7D8).w
            add.w   ($FFFFF70C).w,d2
            andi.w  #$FF,d2
            add.w   ($FFFFF704).w,d3
            andi.w  #$FF,d3
            lea     ($FFFFCC00).w,a1
            move.w  #$DF,d1
            move.w  ($FFFFF700).w,d0
            neg.w   d0
            move.w  d0,d6
            swap    d0
            move.w  ($FFFFF708).w,d0
            neg.w   d0
            move.w  ($FFFFF646).w,d4
            move.w  ($FFFFF704).w,d5

    Deform_LZ_1: ; XREF: Deform_LZ
            cmp.w   d4,d5
            bge.s   Deform_LZ_2
            move.l  d0,(a1)+
            addq.w  #1,d5
            addq.b  #1,d2
            addq.b  #1,d3
            dbf     d1,Deform_LZ_1
            rts
    ; ===========================================================================

    Deform_LZ_2: ; XREF: Deform_LZ
            move.b  (a3,d3.w),d4
            ext.w   d4
            add.w   d6,d4
            move.w  d4,(a1)+
            move.b  (a2,d2.w),d4
            ext.w   d4
            add.w   d0,d4
            move.w  d4,(a1)+
            addq.b  #1,d2
            addq.b  #1,d3
            dbf     d1,Deform_LZ_2
            rts
    ; End of function Deform_LZ
    ; ===========================================================================

    LZ_Wave_Data:
            dc.b 1, 1, 2, 2, 3, 3, 3, 3, 2, 2, 1, 1, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b $FF,$FF,$FE,$FE,$FD,$FD,$FD,$FD,$FE,$FE,$FF,$FF, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 1, 1, 2, 2, 3, 3, 3, 3, 2, 2, 1, 1, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            dc.b 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

    ; ||||||||||||||| S U B R O U T I N E |||||||||||||||||||||||||||||||||||||||
     
    Last edited by a moderator: Aug 17, 2014
  2. Clownacy

    Clownacy Retired Staff lolololo Member

    Joined:
    Aug 15, 2014
    Messages:
    1,020
    qiuu brought up that a certain optimisation could be made, but for me, it was bugged. It caused some RAM elsewhere to get overwritten when the water occupies half of the screen (though, this was Sonic 2). The concept is interesting though: Instead of checking every single line of the screen, there could just be a calculation of where the water begins. But it doesn't really help when the full screen is always deformed. I've tried to figure out S3K's method, but it all went over my head.
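
The "calculate where the water begins" idea can be modelled in a few lines of Python, which makes the edge cases easier to reason about than in assembly. This is only a sketch of the concept, not qiuu's routine: the function name is made up for illustration, and the 224-line screen height is an assumption passed in as a default.

```python
def split_scanlines(water_y, camera_y, screen_lines=224):
    """Return (above, below): how many scanlines sit above and below the water line."""
    above = water_y - camera_y                   # water line relative to the top of the screen
    above = max(0, min(above, screen_lines))     # clamp to the visible range
    return above, screen_lines - above


print(split_scanlines(0x100, 0x90))   # water line partway down the screen
```

The clamp is the important part. If the count is allowed to go negative or exceed the screen height, a fill loop driven by it can run past the end of the scroll buffer, which is one plausible (unconfirmed) way to end up overwriting unrelated RAM.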
     
    Last edited by a moderator: Aug 17, 2014
  3. Pacca

    Pacca Having an online identity crisis since 2019 Member

    Joined:
    Jul 5, 2014
    Messages:
    1,175
    Location:
    Limbo
    This effect does slow down the engine tremendously. In fact, they removed it from Sonic 2 final because it caused so much lag. I'm not quite sure how one would optimize beyond porting things like the S3K object manager to speed up the rest of the game.
     
  4. Varion Icaria

    Varion Icaria Well-Known Member Member

    Joined:
    Apr 26, 2012
    Messages:
    77
    The method that S3K uses to handle the drawing functions allows for much faster processing, which eliminates most of the slowdown caused by the water ripple effect. Your best bet for removing slowdown is either to optimize how S1/2 draw their levels or simply to port the S3K draw functions, which is doable, but if you don't know how they work your levels may not draw/update correctly.
     
  5. redhotsonic

    redhotsonic Also known as RHS Member

    Joined:
    Aug 10, 2007
    Messages:
    2,969
    Location:
    England
    Porting S3K's Priority, Object, Touch_Response and Build_Sprites managers into my S2R (Sonic 2), as well as other optimisations in places like layer deformation, is what took the slowdown away, to the point that I could add water deformation without any slowdown. I don't think the water deformation code itself can really be optimised.
     
  6. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    The major excessive lag that's happening at the beginning of your level is more than likely due to the decompression of Nemesis data as part of its PLC system (not to say the rest of the engine does not play a role in this, mind). The lag in general throughout the level via the wave effect, however, is indeed caused by unoptimised code:

    Deform_LZ_2: ; XREF: Deform_LZ
    move.b (a3,d3.w),d4
    ext.w d4
    add.w d6,d4
    move.w d4,(a1)+
    move.b (a2,d2.w),d4
    ext.w d4
    add.w d0,d4
    move.w d4,(a1)+
    addq.b #1,d2
    addq.b #1,d3
    dbf d1,Deform_LZ_2
    rts
    I mean, look at it, it's horrendous. One pass through that is one single scanline; now imagine it having to do that E0 times. To optimise for speed, use "word" tables instead of "byte" tables; otherwise you're looking at an additional "moveq #$00,d#" or "ext.w d#" to prepare the register you're using for word operations. Additionally, since the table values are all consistent, I would suggest having the tables repeat far enough to cover E0 scanlines' worth, so that each value can be loaded directly from the table via "move.w (a3)+,d4" without the need for incrementing counters. For example (AND ONLY FOR EXAMPLE):
    Code:
    Deform_LZ_2:
    		move.w	(a3)+,d4
    		add.w	d6,d4
    		swap	d4
    		move.w	(a2)+,d4
    		add.w	d0,d4
    		move.l	d4,(a1)+
    		dbf	d1,Deform_LZ_2
    		rts
    Further optimisation for speed can be achieved by sacrificing further memory: remove the "dbf" instruction and repeat the routine body E0 times, one copy directly after another. You then use the value of "d1" to calculate a jump to the correct starting position in the chain.
    Of course, if you're really clever and have the sinewave values in the table written correctly, you could further increase the speed by means of:

    Code:
    Deform_LZ_2:
    		add.w	(a3)+,d6
    		move.w	d6,(a1)+
    		add.w	(a3)+,d0
    		move.w	d0,(a1)+
    		dbf	d1,Deform_LZ_2
    		rts
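
For a rough sense of scale, the three loop bodies in this post can be compared by adding up per-instruction cycle counts. The figures below are an assumption: they are the values I recall from the standard 68000 instruction timing tables, and they ignore the final dbf fall-through, memory refresh and DMA contention, so check them against the manual before relying on them. The list names are made up for this sketch.

```python
# (instruction, cycles) pairs; cycle counts are assumed from the 68000 timing tables.
ORIGINAL = [
    ("move.b (a3,d3.w),d4", 14), ("ext.w d4", 4), ("add.w d6,d4", 4),
    ("move.w d4,(a1)+", 8),
    ("move.b (a2,d2.w),d4", 14), ("ext.w d4", 4), ("add.w d0,d4", 4),
    ("move.w d4,(a1)+", 8),
    ("addq.b #1,d2", 4), ("addq.b #1,d3", 4), ("dbf d1,loop", 10),
]
WORD_TABLE = [
    ("move.w (a3)+,d4", 8), ("add.w d6,d4", 4), ("swap d4", 4),
    ("move.w (a2)+,d4", 8), ("add.w d0,d4", 4), ("move.l d4,(a1)+", 12),
    ("dbf d1,loop", 10),
]
DELTA_TABLE = [
    ("add.w (a3)+,d6", 8), ("move.w d6,(a1)+", 8),
    ("add.w (a3)+,d0", 8), ("move.w d0,(a1)+", 8),
    ("dbf d1,loop", 10),
]


def cycles_per_frame(loop, scanlines=0xE0):
    """Cycles spent if the loop body runs once for every scanline."""
    return sum(cycles for _, cycles in loop) * scanlines


for name, loop in (("original", ORIGINAL), ("word table", WORD_TABLE),
                   ("delta table", DELTA_TABLE)):
    print(name, cycles_per_frame(loop))
```

Under those assumptions the byte-table loop costs 78 cycles per scanline against 50 and 42 for the two rewrites, so the table changes cut the loop by roughly a third to nearly a half.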
    I would also like to stress that Sonic 3 & Knuckles does not hold all the answers for you. What I'm trying to say is, you guys will get further by studying 68k assembly in general instead of "doing what S3K does". There are lots of techniques that do not exist in the games you seek, and I would love for you all to be the best that you can by working on new ways to get the job done quicker.
     
    Last edited by a moderator: Aug 18, 2014
  7. Clownacy

    Clownacy Retired Staff lolololo Member

    Joined:
    Aug 15, 2014
    Messages:
    1,020
    Well whaddya know, I finally found S3's water deform routines, and they actually do employ an optimisation of yours. The word-sized table one. Just that one.

    You're right, S3K doesn't hold all the answers, since it missed some right here.
     
  8. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    Bump: http://pastebin.com/W9588Rww (This can also be used on a standard Rev00 disassembly).

    ; ===========================================================================
    ; ---------------------------------------------------------------------------
    ; Scroll routine for Labyrinth Zone - optimised by MarkeyJester
    ; ---------------------------------------------------------------------------

    Deform_LZ:
    moveq #$07,d0 ; prepare multiplication 100 / 2 for BG scrolling
    move.w ($FFFFF73A).w,d4 ; load horizontal movement distance (Since last frame)
    ext.l d4 ; extend to long-word signed
    asl.l d0,d4 ; align as fixed point 16, but divide by 2 for BG
    move.w ($FFFFF73C).w,d5 ; load vertical movement distance (Since last frame)
    ext.l d5 ; extend to long-word signed
    asl.l d0,d5 ; align as fixed point 16, but divide by 2 for BG
    bsr.w ScrollBlock1 ; adjust BG scroll positions (and set draw code direction flags)
    move.w ($FFFFF70C).w,($FFFFF618).w ; set BG V-scroll position
    lea ($FFFFCC00).w,a1 ; load H-scroll buffer
    move.w ($FFFFF700).w,d0 ; load FG X position
    neg.w d0 ; reverse
    swap d0 ; send to upper word
    move.w ($FFFFF708).w,d0 ; load BG X position
    neg.w d0 ; reverse
    moveq #$00,d3 ; clear d3
    move.b ($FFFFF7D8).w,d3 ; load wave-scroll timer
    addi.w #$0080,($FFFFF7D8).w ; increase wave-scroll timer
    move.w #$00E0,d2 ; prepare water-line count
    move.w ($FFFFF646).w,d1 ; load water line position
    sub.w ($FFFFF704).w,d1 ; minus FG Y position
    bmi.s DLZ_Water ; if the screen is already underwater, branch
    cmp.w d2,d1 ; is the water line below the screen?
    ble.s DLZ_NoWater ; if not, branch
    move.w d2,d1 ; set at maximum

    DLZ_NoWater:
    sub.w d1,d2 ; subtract from water-line count
    add.b d1,d3 ; advance scroll wave timer to correct amount
    subq.b #$01,d1 ; decrease above water count
    bcs.s DLZ_Water ; if finished, branch

    DLZ_Above:
    move.l d0,(a1)+ ; save scroll position to buffer
    dbf d1,DLZ_Above ; repeat for all above water lines

    DLZ_Water:
    subq.b #$01,d2 ; decrease below water count
    bcs.s DLZ_Finish ; if finished, branch
    move.w d0,d1 ; copy BG position back to d1
    swap d0 ; move FG position back to lower word in d0
    move.w d3,d4 ; copy scroll timer for BG use
    add.b ($FFFFF704+$01).w,d3 ; add FG Y position
    add.b ($FFFFF70C+$01).w,d4 ; add BG Y position
    add.w d3,d3 ; multiply by word size (2)
    add.w d4,d4 ; ''
    lea (DLZ_WaveBG).l,a3 ; load beginning of BG wave data
    adda.w d4,a3 ; advance to correct starting point
    move.b (a3),d4 ; get current position byte
    asr.b #$02,d4 ; get only the position bits
    ext.w d4 ; extend to word
    add.w d4,d1 ; adjust BG's current position
    lea DLZ_WaveFG(pc,d3.w),a2 ; load correct starting point of FG wave data
    move.b (a2),d4 ; get current position byte
    asr.b #$02,d4 ; get only the position bits
    ext.w d4 ; extend to word
    add.w d4,d0 ; adjust FG's current position

    DLZ_Below:
    add.w (a2)+,d0 ; alter FG horizontal position
    move.w d0,(a1)+ ; save to scroll buffer
    add.w (a3)+,d1 ; alter BG horizontal position
    move.w d1,(a1)+ ; save to scroll buffer
    dbf d2,DLZ_Below ; repeat for all below water lines

    DLZ_Finish:
    rts ; return

    ; ---------------------------------------------------------------------------
    ; Scroll data for the FG
    ; ---------------------------------------------------------------------------

    DLZ_WaveFG:
    rept $02
    dc.w $0001,$0400,$0401,$0800,$0801,$0C00,$0C00,$0C00,$0FFF,$0800,$0BFF,$0400,$07FF,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $03FF,$FC00,$FFFF,$F800,$FBFF,$F400,$F400,$F400,$F401,$F800,$F801,$FC00,$FC01,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0001,$0400,$0401,$0800,$0801,$0C00,$0C00,$0C00,$0FFF,$0800,$0BFF,$0400,$07FF,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    endr

    ; ---------------------------------------------------------------------------
    ; Scroll data for the BG
    ; ---------------------------------------------------------------------------

    DLZ_WaveBG: rept $04
    dc.w $FC01,$0000,$0000,$0000,$0000,$0000,$0001,$0400,$0400,$0400,$0400,$0401,$0800,$0800,$0800,$0800
    dc.w $0800,$0800,$0801,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00
    dc.w $0C01,$13FF,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0FFF
    dc.w $0800,$0800,$0800,$0800,$0800,$0800,$0BFF,$0400,$0400,$0400,$0400,$07FF,$0000,$0000,$0000,$0000
    dc.w $0000,$03FF,$FC00,$FC00,$FC00,$FC00,$FFFF,$F800,$F800,$F800,$F800,$FBFF,$F400,$F400,$F400,$F400
    dc.w $F400,$F400,$F7FF,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000
    dc.w $F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F001
    dc.w $F400,$F400,$F400,$F400,$F400,$F400,$F401,$F800,$F800,$F800,$F800,$F801,$FC00,$FC00,$FC00,$FC00
    endr

    ; ===========================================================================
    I've had this tested by your average hack players, and they've reported nothing suspiciously wrong, so I can only assume that it's solid. This is an optimised version of the LZ scroll code; it's been pretty much rewritten from the ground up and uses a few of the optimisation methods I mentioned above.
    Not all optimisation methods I mentioned have been used, just the vital ones that don't get ridiculously out of hand in terms of ROM space usage. A few methods to speed this up further may include unrolling the dbf loops by removing the dbf instruction entirely, having the instructions repeated E0 times over, and jumping to the appropriate start position. Another optimisation can be done by removing the "current position" values from the word wave tables and having them in their own separate tables calculated correctly; I didn't do this because that calculation only occurs once and probably wouldn't be worth having a large table for.

    But that's for you to decide, the best optimisation method is the one that works well for your project (for example, one method may not be appropriate for another project because it may not have enough space for the job).

    The "rept" and "endr" functions are asm68k specific for repeating the section a certain number of times, if you are using another assembler, you may need to replace them with the appropriate function. If one does not exist, then you'll need to copy and paste the table the set number of times one after another. It needs the table repeating for overflow/repeat sections.
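
To see why adding whole table words in DLZ_Below gives the same picture as the old per-line lookup, here is a small Python model. It assumes, as far as I understand the hardware, that only the low 10 bits of each H-scroll entry matter, so anything the adds leave in the top six bits is harmless. The function names are made up, the two tables are the first 16 entries of LZ_Wave_Data and DLZ_WaveFG from this thread, and the modulo indexing stands in for the repeated ("rept") table.

```python
# First row of the original byte table and of the word table from this thread.
WAVE_BYTES = [1, 1, 2, 2, 3, 3, 3, 3, 2, 2, 1, 1, 0, 0, 0, 0]
WAVE_WORDS = [0x0001, 0x0400, 0x0401, 0x0800, 0x0801, 0x0C00, 0x0C00, 0x0C00,
              0x0FFF, 0x0800, 0x0BFF, 0x0400, 0x07FF, 0x0000, 0x0000, 0x0000]


def scroll_by_lookup(values, base, start, lines):
    """Old method: fetch an absolute offset for every scanline."""
    n = len(values)
    return [(base + values[(start + i) % n]) & 0x3FF for i in range(lines)]


def scroll_by_deltas(words, base, start, lines):
    """New method: seed once from the top six bits, then keep adding whole words."""
    n = len(words)
    seed = words[start % n] >> 10
    if seed >= 0x20:                 # sign-extend the 6-bit value (asr.b #2, ext.w)
        seed -= 0x40
    pos = (base + seed) & 0xFFFF
    out = []
    for i in range(lines):
        pos = (pos + words[(start + i) % n]) & 0xFFFF   # add.w wraps at 16 bits
        out.append(pos & 0x3FF)                         # keep the assumed 10 visible bits
    return out


print(scroll_by_deltas(WAVE_WORDS, 0, 0, 16) == scroll_by_lookup(WAVE_BYTES, 0, 0, 16))
```

The one-off seed is the "current position" calculation mentioned above: it happens once per frame, which is why moving it into its own table would save very little.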
     
    Last edited by a moderator: Sep 7, 2014
  9. Selbi

    Selbi The Euphonic Mess Member

    Joined:
    Jul 20, 2008
    Messages:
    2,429
    Location:
    Northern Germany
    Overwriting the default deformation code with what you provided there barely gave the start of the level any performance boost, so the main root of the slowdown problem must originate from somewhere else in my hack. In any case, thanks for the code, as it helped me understand a few things I didn't quite get previously, and now I also know what the rept and endr instructions are for!
     
  10. Clownacy

    Clownacy Retired Staff lolololo Member

    Joined:
    Aug 15, 2014
    Messages:
    1,020
    So for you cool kids who're using water deformation in Sonic 2, here's a modified version of MJ's code that works as an extension to another zone's SwScrl data (replace rts of target zone's SwScrl with a branch to this code, and maybe a check for if the current act has water, like how CPZ needs it):


    SwScrl_Water:
    lea (Horiz_Scroll_Buf).w,a1 ; load H-scroll buffer
    moveq #0,d3 ; clear d3
    move.b (Wobble_timer).w,d3 ; load wave-scroll timer
    addi.w #$80,(Wobble_timer).w ; increase wave-scroll timer
    move.w #224,d2 ; prepare water-line count
    move.w (Water_Level_1).w,d1 ; load water line position
    sub.w (Camera_Y_pos).w,d1 ; minus FG Y position
    bmi.s .water ; if the screen is already underwater, branch
    cmp.w d2,d1 ; is the water line below the screen?
    ble.s .nowater ; if not, branch
    move.w d2,d1 ; set at maximum

    .nowater:
    sub.w d1,d2 ; subtract from water-line count
    add.b d1,d3 ; advance scroll wave timer to correct amount
    subq.b #1,d1 ; decrease above water count
    bcs.s .water ; if finished, branch

    add.w d1,d1
    add.w d1,d1 ; multiply by 4
    adda.w d1,a1 ; offset into correct water line area of buffer
    moveq #0,d1 ; clear d1

    .water:
    subq.b #1,d2 ; decrease below water count
    bcs.s .finish ; if finished, branch
    move.w d3,d4 ; copy scroll timer for BG use
    add.b (Camera_Y_pos+1).w,d3 ; add FG Y position
    add.b (Camera_BG_Y_pos+1).w,d4 ; add BG Y position
    add.w d3,d3 ; multiply by word size (2)
    add.w d4,d4 ; ''

    lea Lz_Scroll_DataBG(pc),a3 ; load beginning of BG wave data
    adda.w d4,a3 ; advance to correct starting point
    move.b (a3),d1 ; get current position byte
    asr.b #2,d1 ; get only the position bits
    ext.w d1 ; extend to word

    lea Lz_Scroll_DataFG(pc,d3.w),a2 ; load correct starting point of FG wave data
    move.b (a2),d0 ; get current position byte
    asr.b #2,d0 ; get only the position bits
    ext.w d0 ; extend to word

    .below:
    add.w (a2)+,d0 ; alter FG horizontal position
    add.w d0,(a1)+ ; save to scroll buffer
    add.w (a3)+,d1 ; alter BG horizontal position
    add.w d1,(a1)+ ; save to scroll buffer
    dbf d2,.below ; repeat for all below water lines

    .finish:
    rts ; return

    ; ---------------------------------------------------------------------------
    ; Scroll data for the FG
    ; ---------------------------------------------------------------------------

    Lz_Scroll_DataFG:
    rept 2
    dc.w $0001,$0400,$0401,$0800,$0801,$0C00,$0C00,$0C00,$0FFF,$0800,$0BFF,$0400,$07FF,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $03FF,$FC00,$FFFF,$F800,$FBFF,$F400,$F400,$F400,$F401,$F800,$F801,$FC00,$FC01,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0001,$0400,$0401,$0800,$0801,$0C00,$0C00,$0C00,$0FFF,$0800,$0BFF,$0400,$07FF,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    dc.w $0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000,$0000
    endm

    ; ---------------------------------------------------------------------------
    ; Scroll data for the BG
    ; ---------------------------------------------------------------------------

    Lz_Scroll_DataBG:
    rept 4
    dc.w $FC01,$0000,$0000,$0000,$0000,$0000,$0001,$0400,$0400,$0400,$0400,$0401,$0800,$0800,$0800,$0800
    dc.w $0800,$0800,$0801,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00
    dc.w $0C01,$13FF,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0C00,$0FFF
    dc.w $0800,$0800,$0800,$0800,$0800,$0800,$0BFF,$0400,$0400,$0400,$0400,$07FF,$0000,$0000,$0000,$0000
    dc.w $0000,$03FF,$FC00,$FC00,$FC00,$FC00,$FFFF,$F800,$F800,$F800,$F800,$FBFF,$F400,$F400,$F400,$F400
    dc.w $F400,$F400,$F7FF,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000
    dc.w $F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F000,$F001
    dc.w $F400,$F400,$F400,$F400,$F400,$F400,$F401,$F800,$F800,$F800,$F800,$F801,$FC00,$FC00,$FC00,$FC00
    endm

    Now, how about adding your own (or ported) wave tables? MJ's code requires a new format for the wave tables, so here's how to convert them:

    This format is complicated, to say the least. I figured it out, but to write it down is... a challenge.

    So, you have your S1/S3-style table, let's convert it to this new format. First, change the 'dc.b's to 'dc.w's (only S1), then make each entry four digits.

    Now, the table needs to be converted to a list of additions. For example, what would be this:
     

    $0001,$0001,$0002,$0002,$0003,$0003,$0003,$0003,$0002,$0002,$0001,$0001,$0000,$0000,$0000,$0000
    becomes this (relative to $0000):
     


    $0001,$0000,$0001,$0000,$0001,$0000,$0000,$0000,$00FF,$0000,$00FF,$0000,$00FF,$0000,$0000,$0000
    I can't put the entire thing into words, but if the value remains constant, then make the new value zero, if the value changes, make the new value that of how the value changed. Because of rollover, $FF is technically -1, so make use of that.

    Now for the hard bit: the format of the other byte of each entry's word. Let's break the whole word down in binary as this:
     


    AAAAAABBCCCCCCCC
    CCCCCCCC is the addition value we've already discussed, but AAAAAA and BB, on the other hand...

    AAAAAA appears to be the distance from the default value ($0000) that the current value is.

    BB looks to be a subtraction flag: whenever the number decrements, BB is guaranteed to be set to 11.

    Let's try to convert our table to incorporate the AAAAAA values: In the beginning, the current value is 0. The default value is 0, so AAAAAA should be zero. How about the next value in the table, $0001? Think of this as more of a 'previous events' log; even though $0001 means that the current value will be changed, AAAAAA only shows this on the following word. So $0001 stays.

    Now for the word after it... Be sure to set AAAAAA to 1. Doing this will change the second nybble of the words after the increment to be 4. You can see this pattern in MarkeyJester's example: whenever the number increments, that nybble goes from 0, to 4, to 8, to $C, and so forth. Decrementing works as you'd expect it to: AAAAAA will contain a lower number, to the point of becoming negative. As can be seen in the example, AAAAAA is sometimes set to 111111, then 111110, then 111101, etc.

    Now to work on adding the BB value. In MarkeyJester's example, you can see how whenever the other byte is $FF (-1), a decrement, BB is set to 11. So let's do that ourselves. Unlike AAAAAA, there's no lag between an increment and BB being different: whenever there's a decrement, that very same word will have a BB of 11.

    So, let's recap. Your bytes become words; your words become addition values, each one relative to the one before it; the difference of the current value from 0 is contained in AAAAAA; and BB is set to 11 when the current value is a decrement, and not an increment or neither ($0000). Applying all of this to your table should convert it to this new format. I've applied this to S3K's AIZ1 underwater wave tables and can confirm that this does indeed work.
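
The recap above can be sketched as a small Python converter. It rests on one interpretation that is inferred from the posted tables rather than stated by either author: AAAAAA is the previous entry's value as a signed 6-bit number, and BB plus CCCCCCCC together form one signed 10-bit step (so -1 is $3FF, which is why BB reads 11 on a decrement). The function names are made up for illustration.

```python
def convert_wave_table(values):
    """Convert signed per-scanline offsets (S1-style dc.b table) to the word format."""
    words = []
    for i, cur in enumerate(values):
        prev = values[i - 1]              # i == 0 wraps round to the last entry
        delta = (cur - prev) & 0x3FF      # BB + CCCCCCCC: signed 10-bit step
        absolute = (prev & 0x3F) << 10    # AAAAAA: value before this step
        words.append(absolute | delta)
    return words


def as_dc_w(words, per_line=16):
    """Format the words as dc.w lines ready to paste into a disassembly."""
    lines = []
    for i in range(0, len(words), per_line):
        chunk = words[i:i + per_line]
        lines.append("dc.w " + ",".join("$%04X" % w for w in chunk))
    return lines


row = [1, 1, 2, 2, 3, 3, 3, 3, 2, 2, 1, 1, 0, 0, 0, 0]
print("\n".join(as_dc_w(convert_wave_table(row))))
```

Run on the first row of the S1 table, this reproduces the first row of Lz_Scroll_DataFG, and the $FF,$FF,$FE,... row converts to the $03FF,$FC00,$FFFF,... row. Values from a dc.b table need to be read as signed first ($FF as -1), and the range is limited to what fits in six signed bits (-32 to 31).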
     
    Last edited by a moderator: Sep 13, 2014
  11. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    WOW.

    I... I just... don't know what to say. I am pretty mightily impressed that not only have you made different variants of the code, but you took the time to understand it, and you actually tried to explain it too!

    Lz_Scroll_DataBG(pc)
    Does that really reach?! If so, then I've underestimated it...
    There is a much simpler way of explaining the table format, and there is also one flaw with your design that impairs the speed, but I don't want to trample on your hard work here, so I'll leave it for another time.

    Either way, I like the cut of your jib.
     
    Last edited by a moderator: Sep 13, 2014
  12. Clownacy

    Clownacy Retired Staff lolololo Member

    Joined:
    Aug 15, 2014
    Messages:
    1,020
    Now I don't know what to say... I don't often get that kind of response. It means a lot to hear that from you. Thanks.

    About that one flaw, if the addq in a dbf loop was what you're talking about, I weeded that one out a bit ago with some code to multiply the counter into an offset. It won't help if d1 is low, but for the higher numbers I think it should work well. I haven't done any cycle counting, but it seems solid enough.

    The check at the beginning of the code for if the level has water could stand to be moved to before the branch to SwScrl_Water, I'll do that too.
     
  13. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    I was actually talking about the fact that the water scrolling is performed after the level's main scrolling, rather than in conjunction with, thus meaning passing through the scroll table twice. Of course, that may depend on how complex the level's main scrolling is. So it's really a matter of specifics based on the level at hand, thus an exceptional sacrifice.
     