How to work with background deformation

Discussion in 'Discussion & Q&A' started by Pacca, Aug 5, 2016.

  1. Pacca

    Pacca Having an online identity crisis since 2019 Member

    Joined:
    Jul 5, 2014
    Messages:
    1,175
    Location:
    Limbo
    One of the most interesting, impressive and hard to modify features of the original Sonic games is the background scrolling. It gives a level's background a sense of depth, and the background layer can even be used in unique ways that allow for awesome effects (like in Hill Top and Marble Garden Zone). I recently decided that I wanted to do something along the lines of Marble Garden's effects in one of my levels, so I set to work trying to figure out how to mess with the level scrolling.

    However, I've had trouble figuring out what I should even do! The code for the deformation scripts (at least in the Sonic 2 disassemblies) isn't commented, and uses commands I'm not quite comfortable with. When I try to dive in and fiddle with the code, it always seems to result in the foreground getting messed up in rather trippy ways. And although Selbis' Deformation Generator produces very well commented and easy to modify code, it doesn't allow for movement on the y axis, which the effect I have in mind desperately needs.

    Does anyone have any tips on how to get started on creating advanced background effects? I've gotten this far by reusing the original code, using Selbis' previously mentioned generator, and fiddling with the output of that program, but that is very limited in comparison to what the games normally do.
     
  2. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    There are two ways we could do this.
    1. I could release some source code that lets you edit the scrolling with minimal stress and problems, provided you accept certain limitations (limitations or not, it'll serve you better than what you have now).
    2. I could explain in full detail how it works, but you'd need an understanding of assembly language in general, along with an open mind to the fact that what you're dealing with is data manipulation that creates an illusion (it's not just about scroll positions, it's about map tile redrawing too in many cases, which in turn requires knowledge of the VDP hardware).
    Make a decision and I'll quote you.
     
  3. Pacca

    After some serious thought, I think I might have to go with the first option. Although I love me a good explanation of how things work, and have a rather firm grip on the basics of 68000 assembly code, I know practically nothing about how the VDP works, and that likely isn't going to change overnight. I also tend to learn far more from well commented code than I get from an explanation of how things work, so with a good example in front of me, I might be able to go further from there.

    Thank you for taking the time to further my knowledge; I know it must take quite some time to put something together for a purpose like this.
     
  4. ProjectFM

    ProjectFM Optimistic and self-dependent Member

    Joined:
    Oct 4, 2014
    Messages:
    912
    Location:
    Orono, Maine
    I would appreciate learning how it works. It seems that doing cool stuff with deformation is how a lot of the better programmers show off and I'd like to know what exactly is happening so I can figure out how to make the code do what I want to.

    However, code would help out almost anyone and it's Pacguy's choice, anyway. If anyone could point me to a good explanation on how the Genesis does it, I'd be happy to see it.

    Anyway, if you can't wait for MarkeyJester to release his code, here's what I figured out in order to get working x and y scrolling in Sonic 1:

    Code:
    Deform_SYZ:
            move.w    ($FFFFF73C).w,d5
            ext.l    d5
            asl.l    #4,d5
            move.l    d5,d1
            asl.l    #1,d5
            add.l    d1,d5
            bsr.w    ScrollBlock2
            move.w    ($FFFFF70C).w,($FFFFF618).w
            lea    ($FFFFA800).w,a1
            move.w    ($FFFFF700).w,d2
            neg.w    d2
            move.w    d2,d0
            asr.w    #3,d0
            sub.w    d2,d0
            ext.l    d0
            asl.l    #3,d0
            divs.w    #8,d0
            ext.l    d0
            asl.l    #4,d0
            asl.l    #8,d0
            moveq    #0,d3
            move.w    d2,d3
            asr.w    #1,d3
            move.w    #7,d1
    
    Deform_SYZ_1:                ; XREF: Deform_SYZ
            move.w    d3,(a1)+
            swap    d3
            add.l    d0,d3
            swap    d3
            dbf    d1,Deform_SYZ_1
            move.w    d2,d0
            asr.w    #3,d0
            move.w    #4,d1
    
    Deform_SYZ_2:                ; XREF: Deform_SYZ
            move.w    d0,(a1)+
            dbf    d1,Deform_SYZ_2
            move.w    d2,d0
            asr.w    #2,d0
            move.w    #5,d1
    
    Deform_SYZ_3:                ; XREF: Deform_SYZ
            move.w    d0,(a1)+
            dbf    d1,Deform_SYZ_3
            move.w    d2,d0
            move.w    d2,d1
            asr.w    #1,d1
            sub.w    d1,d0
            ext.l    d0
            asl.l    #4,d0
            divs.w    #$E,d0
            ext.l    d0
            asl.l    #4,d0
            asl.l    #8,d0
            moveq    #0,d3
            move.w    d2,d3
            asr.w    #1,d3
            move.w    #$D,d1
    
    Deform_SYZ_4:                ; XREF: Deform_SYZ
            move.w    d3,(a1)+
            swap    d3
            add.l    d0,d3
            swap    d3
            dbf    d1,Deform_SYZ_4
            lea    ($FFFFA800).w,a2
            move.w    ($FFFFF70C).w,d0
            move.w    d0,d2
            andi.w    #$1F0,d0
            lsr.w    #3,d0
            lea    (a2,d0.w),a2
            bra.w    Deform_All
    ; End of function Deform_SYZ
    This is Spring Yard Zone's REV01 deformation code.
    Code:
            move.w    d2,d0
            asr.w    #3,d0    ; Scrolling Speed; Lower = Faster
            move.w    #4,d1    ; Height in 16-pixel blocks, minus 1 (dbf loops d1+1 times)
    
    Deform_SYZ_2:                ; XREF: Deform_SYZ
            move.w    d0,(a1)+
            dbf    d1,Deform_SYZ_2
    These lines of code can be duplicated and modified in order to create custom background deformation. It could be changed into something like this:
    Code:
    Deform_SYZ:
            move.w    ($FFFFF73C).w,d5
            ext.l    d5
            asl.l    #4,d5
            move.l    d5,d1
            asl.l    #1,d5
            add.l    d1,d5
            bsr.w    ScrollBlock2
            move.w    ($FFFFF70C).w,($FFFFF618).w
            lea    ($FFFFA800).w,a1
            move.w    ($FFFFF700).w,d2
            neg.w    d2
            move.w    d2,d0
            asr.w    #1,d0
            move.w    #7,d1
    Deform_SYZ_1:                ; XREF: Deform_SYZ
            move.w    d0,(a1)+
            dbf    d1,Deform_SYZ_1
            move.w    d2,d0
            asr.w    #3,d0
            move.w    #3,d1
    Deform_SYZ_2:                ; XREF: Deform_SYZ
            move.w    d0,(a1)+
            dbf    d1,Deform_SYZ_2
            move.w    d2,d0
            asr.w    #1,d0
            move.w    #5,d1
    Deform_SYZ_3:                ; XREF: Deform_SYZ
            move.w    d0,(a1)+
            dbf    d1,Deform_SYZ_3
            move.w    d2,d0
            asr.w    #2,d0
            move.w    #$12,d1
    Deform_SYZ_4:                ; XREF: Deform_SYZ
            move.w    d0,(a1)+
            dbf    d1,Deform_SYZ_4
            lea    ($FFFFA800).w,a2
            move.w    ($FFFFF70C).w,d0
            move.w    d0,d2
            andi.w    #$1F0,d0
            lsr.w    #3,d0
            lea    (a2,d0.w),a2
            bra.w    Deform_All
    ; End of function Deform_SYZ
    
    It's limited and probably not the most efficient method, but it works. Going upwards or downwards far enough will cause the code to reset vertically, so if you're using the y-wrap, you need to repeat some of the code so the player doesn't notice when the screen resets. I'm sure edits could be made to this to bypass these limitations.
     
  5. GenesisDoes

    GenesisDoes What Nintendont Member

    Joined:
    Jan 2, 2016
    Messages:
    161
    Location:
    Pittsburgh, PA
    May I ask what the purpose of those unnamed variables used in most deform code is? They are some of the few variables in the S1 Github disasm that do not have user-friendly names in variables.asm, and knowing what they do would make the code easier to understand.

    Also, I would be interested too in learning more about deformation, especially on how to autoscroll certain deform blocks or how to make a custom deform so that both the FG & BG chunks are aligned with each other horizontally.
     
  6. Clownacy

    Clownacy Retired Staff lolololo Member

    Joined:
    Aug 15, 2014
    Messages:
    1,021
    FYI, I've documented at least one of the BG deformation routines in the S2 Git disasm before: Swscrl_ARZ. They make sense once you get your head around them.
     
  7. MarkeyJester

    This is quite lengthy, but please bear with me, it might not seem relevant to begin with, but if you read enough, you'll understand how relevant it all actually is...

    --- Plane Mappings ---

    The VDP has two scroll planes (Plane A and Plane B). Each scroll plane is nothing more than a series of word-sized mappings, and each word consists of the following in binary (PCCV HTTT TTTT TTTT):
    • P = Priority (0 = Low plane | 1 = High plane)
    • CC = Colour palette to display with (00 = Line 1 | 01 = Line 2 | 10 = Line 3 | 11 = Line 4)
    • V = Vertical flip (0 = No | 1 = Yes)
    • H = Horizontal flip (0 = No | 1 = Yes)
    • T... = Tile index (hexadecimal 000 to 7FF)
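    The PCCV HTTT TTTT TTTT layout can be checked with a few lines of Python. This is only a sketch for experimenting with the bit fields; the function names pack_tile and unpack_tile are made up for illustration and do not come from any disassembly.

```python
def pack_tile(priority: bool, palette: int, vflip: bool, hflip: bool, tile: int) -> int:
    """Build one plane mapping word in the PCCV HTTT TTTT TTTT layout."""
    if not 0 <= palette <= 3:
        raise ValueError("palette must be 0-3 (0 = Line 1 ... 3 = Line 4)")
    if not 0 <= tile <= 0x7FF:
        raise ValueError("tile index must be $000-$7FF")
    return (priority << 15) | (palette << 13) | (vflip << 12) | (hflip << 11) | tile


def unpack_tile(word: int) -> dict:
    """Split a mapping word back into its fields."""
    return {
        "priority": bool(word & 0x8000),
        "palette": (word >> 13) & 3,
        "vflip": bool(word & 0x1000),
        "hflip": bool(word & 0x0800),
        "tile": word & 0x07FF,
    }


# High priority, palette Line 3, horizontally flipped, tile $123
word = pack_tile(priority=True, palette=2, vflip=False, hflip=True, tile=0x123)
print(f"${word:04X}")  # prints $C923
```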
    These map planes are stored inside VRAM itself, at locations specified by two VDP registers:
    Code:
    		move.w	#$82XX,($C00004).l	; 00FE D000 - Scroll Plane A Map Table VRam address
    		move.w	#$84XX,($C00004).l	; 0000 0FED - Scroll Plane B Map Table VRam address
    82 and 84 are the registers, and XX is the value being sent to the register. XX is to be broken into binary: for 82 (Plane A) the binary format is 00FE D000, and for 84 (Plane B) the binary format is 0000 0FED. 0 indicates a null bit which has no effect, while the bits D to F set the address. In simple terms, both planes' addresses can be set to any VRAM address in multiples of 2000 (e.g. VRAM 0000, 2000, 4000, 6000, 8000, A000, etc...); the bit positions are just a tad different.

    Sonic 1 & Sonic 2 set the Plane A address to C000, and the Plane B address to E000.
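    To see how those addresses turn into register values, here is a small Python sketch that applies the 00FE D000 and 0000 0FED layouts described above. The helper names are hypothetical; the only inputs taken from the post are the bit layouts and the C000/E000 addresses.

```python
def _check(vram_addr: int) -> None:
    # Plane tables can only sit on $2000 boundaries inside the 64KB of VRAM
    if not 0 <= vram_addr <= 0xFFFF or vram_addr % 0x2000:
        raise ValueError("address must be a multiple of $2000 within $0000-$FFFF")


def plane_a_register(vram_addr: int) -> int:
    """Command word for register 82: address bits 15-13 land in data bits 5-3."""
    _check(vram_addr)
    return 0x8200 | ((vram_addr >> 13) << 3)


def plane_b_register(vram_addr: int) -> int:
    """Command word for register 84: address bits 15-13 land in data bits 2-0."""
    _check(vram_addr)
    return 0x8400 | (vram_addr >> 13)


print(f"${plane_a_register(0xC000):04X}")  # prints $8230
print(f"${plane_b_register(0xE000):04X}")  # prints $8407
```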

    --- Plane Size ---

    These planes have a certain width/height, which is controlled through register 90:
    Code:
    		move.w	#$90XX,($C00004).l	; 00VV 00HH - Plane Y Size (00-20|01-40|11-80) | Plane X size (00-20|01-40|11-80)
    Where XX in binary is 00VV 00HH:
    • VV = Plane Y size (00 = $20 tiles | 01 = $40 tiles | 11 = $80 tiles)
    • HH = Plane X size (00 = $20 tiles | 01 = $40 tiles | 11 = $80 tiles)
    • Setting 10 is designated as prohibited
    Now obviously, the larger you have these planes, the more VRAM for them will be required. Sonic 1 & 2 have the Plane Y size set to $20 tiles and the Plane X size set to $40 tiles. So, there are $40 tiles horizontally, multiply by $20 tiles vertically, equals $800, each map entry is word sized (as stated in the previous section), so $800 x 2 = $1000 bytes of VRAM.

    Sonic 1 & 2 have Plane A at VRAM C000 to CFFF (1000 bytes), and Plane B at VRAM E000 to EFFF (1000 bytes).
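    The size arithmetic above can be reproduced in Python. This is a sketch only: it encodes the 00VV 00HH layout from the post and the "tiles x tiles x 2 bytes" sum, and the names are invented for the example.

```python
# Size field encodings from the post (10 is prohibited, so it is not listed)
SIZE_BITS = {0x20: 0b00, 0x40: 0b01, 0x80: 0b11}


def plane_size_register(width_tiles: int, height_tiles: int) -> int:
    """Command word for register 90 in the 00VV 00HH layout."""
    return 0x9000 | (SIZE_BITS[height_tiles] << 4) | SIZE_BITS[width_tiles]


def plane_bytes(width_tiles: int, height_tiles: int) -> int:
    """VRAM needed for one plane: one word (2 bytes) per tile."""
    return width_tiles * height_tiles * 2


# Sonic 1 & 2: $40 tiles wide, $20 tiles tall
print(f"${plane_size_register(0x40, 0x20):04X}")  # prints $9001
print(f"${plane_bytes(0x40, 0x20):X}")            # prints $1000
```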

    --- Scrolling ---

    First things first: there are two types of V-Scroll and three types of H-Scroll, all controlled by register 8B:
    Code:
    		move.w	#$8BXX,($C00004).l	; 0000 EVHH - External Interrupt (0N|1Y) | V-Scroll: (0-Full|1-2Celled) | H-Scroll: (00-Full|10-Celled|11-Sliced)
    Where XX in binary is 0000 EVHH, of which E we're not interested in right now.
    • V = V-Scroll mode (0 = Whole plane | 1 = Every two tiles)
    • HH = H-Scroll mode (00 = Whole plane | 01 = Prohibited | 10 = Every tile | 11 = Every pixel/scanline)
    V-Scroll data is not actually stored in VRAM; there is a designated memory space within the VDP chip itself that we'll call VSRAM (Vertical Scroll RAM). Many sources suggest this is $50 bytes of memory, though I'm under the impression it's $80 bytes. Regardless, only $50 are accessed/used.

    VSRAM consists of two words per entry; e.g. $AAAA $BBBB $AAAA $BBBB $AAAA $BBBB $AAAA $BBBB etc... Where $AAAA is Plane A's V-Scroll position, and $BBBB is Plane B's V-Scroll position.
    • When in "Whole plane" mode, only the first two words are read, meaning the first $AAAA $BBBB will control the entire plane's V-Scroll position.
    • When in "Every two tiles" mode, each $AAAA $BBBB will control two tile vertical strips of the V-Scroll position, from left to right of the display.
    H-Scroll data is stored in VRAM, at an address set by register 8D:
    Code:
    		move.w	#$8DXX,($C00004).l	; 00FE DCBA - Horizontal Scroll Table VRam address
    0 indicates a null bit which has no effect, while the bits A to F set the address. In simple terms, the H-Scroll address can be set to any VRAM address in multiples of 400 (e.g. VRAM 0000, 0400, 0800, 0C00, 1000, 1400, etc...).

    Sonic 1 & Sonic 2 have the H-Scroll table set to VRAM FC00.

    H-Scroll consists of two words per entry; e.g. $AAAA $BBBB $AAAA $BBBB $AAAA $BBBB $AAAA $BBBB etc... Where $AAAA is Plane A's H-Scroll position, and $BBBB is Plane B's H-Scroll position.
    • When in "Whole plane" mode, only the first two words are read, meaning the first $AAAA $BBBB will control the entire plane's H-Scroll position.
    • When in "Every tile" mode, every 8th $AAAA $BBBB entry will control a tile horizontal strip of the H-Scroll position, from top to bottom of the display.
    • When in "Every pixel/scanline" mode, each $AAAA $BBBB will control a pixel/scanline horizontal strip of the H-Scroll position, from top to bottom of the display.
    Sonic 1 & 2 have the V-Scroll mode set to "Whole plane", and the H-Scroll mode set to "Every pixel/scanline".
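    As a way to picture the three H-Scroll modes, the following Python sketch works out which $AAAA $BBBB pair a given scanline reads, as a byte offset from the start of the H-Scroll table. It assumes each pair is 4 bytes and that "every 8th entry" in tile mode means entries 0, 8, 16 and so on; the function name is made up.

```python
def hscroll_entry_offset(mode: int, scanline: int) -> int:
    """Byte offset into the H-Scroll table of the pair used for a scanline.

    mode is the HH field of register 8B.
    """
    if mode == 0b00:                    # Whole plane: only the first pair is read
        return 0
    if mode == 0b10:                    # Every tile: every 8th pair, one per 8 scanlines
        return (scanline // 8) * 8 * 4
    if mode == 0b11:                    # Every pixel/scanline: one pair per scanline
        return scanline * 4
    raise ValueError("HH = 01 is prohibited")


for mode in (0b00, 0b10, 0b11):
    print(f"mode {mode:02b}: scanline 9 reads offset ${hscroll_entry_offset(mode, 9):X}")
```

    With the table at VRAM FC00 and scanline mode, scanline 9 would therefore read its pair from FC24.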

    --- Scrolling Out ---

    Now, as I stated, the planes have a certain width/height to them; in Sonic's case, this is $40 x $20 tiles ($200 pixels x $100 pixels). If a scroll position in either V-Scroll or H-Scroll should scroll such that the plane scrolls out of display, the plane is repeated/redisplayed on the other side (in a sort of "wrapping" form). The best way to imagine this is that the plane is meshed together in an endless repeating pattern:

    [​IMG]
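    The wrapping can also be expressed as plain modulo arithmetic. A minimal Python sketch, assuming the $40 x $20 tile plane used by Sonic 1 & 2 and 8x8 pixel tiles (the names are illustrative only):

```python
PLANE_W_PX = 0x40 * 8   # $200 pixels wide
PLANE_H_PX = 0x20 * 8   # $100 pixels tall


def wrap(x: int, y: int) -> tuple:
    """Map any pixel coordinate onto the plane, as if the plane repeated forever."""
    return x % PLANE_W_PX, y % PLANE_H_PX


print(wrap(0x205, -1))  # prints (5, 255)
```

    A coordinate 5 pixels past the right edge lands on column 5 again, and one pixel above the top lands on the bottom row, which is why tiles have to be rewritten off-screen before they scroll back into view.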

    Now, what this means is that Sonic games have to write new map tiles into the plane's VRAM space (outside of the screen/display) while the game is scrolling. This means there needs to be communication between the scroll positions and the tile redrawing, and this is where people will often screw up: they'll understand the scrolling fine, but then have trouble with the draw code not drawing the tiles in correct succession.

    --- Regarding Sonic 1 & 2 ---

    Sonic 1 uses a method of dividing the background into individual scroll sections (each one scrolling to a different horizontal position). Unfortunately, this was only used in one level (Green Hill Zone), and it only used two sections out of a planned four. The first section is the clouds/top mountains, and the second section is the bottom mountains and sea/lake. The lake is repetitive enough that it doesn't need to scroll in correct succession with the tile redrawing (Sonic Team even made the sea a bright blue to hide it, which is why it's no longer dark blue like the beta/title screen). Other levels in Sonic 1 (Labyrinth Zone, Spring Yard Zone, etc.) have really "bland" and basic horizontal/vertical scrolling with no slicing at all, so the scroll section method is useless there.

    Star Light Zone is a very interesting one though: there is NO horizontal tile redrawing for the background, only vertical redrawing. They are using the very nature of the VDP in this case, which is the whole "plane wrapping effect" I explained in the previous section, where the plane literally repeats the pattern over and over (Star Light Zone has a horizontally repetitive background, making it ideal), and since the vertical scrolling isn't sliced (it's "Whole plane"), having it redraw vertically in succession with V-Scroll is easy to handle. In fact, this method was so favoured in Sonic 1 that it was reused in Spring Yard Zone in revision 01. Unfortunately, the Spring Yard Zone background isn't repetitive enough horizontally in one area in the mountains, and as a result, you can see it cut off. Apart from that, it works.

    (Interestingly, the Star Light Zone revision 00/Spring Yard Zone revision 01 method that I talked about is the very same method that ProjectFM has shown above).

    Sonic 2 uses mostly the same method, though it has opted for a different approach in some cases. For example, Chemical Plant Zone has a table of values to signify which scroll position it should use for which block it needs to draw horizontally. This makes redrawing in succession with scrolling easier to control along with a vertical scroll, something Sonic 1 could have done if they had finished writing the "scroll sections" method, but I guess they ran out of time.

    --- My verdict ---

    I would suggest using a method similar to the one used in Star Light Zone revision 00/Spring Yard Zone revision 01. Your limit will be that you'd have to make the background repetitive horizontally, but it'll allow you to scroll the horizontal lines in various unique ways without worrying about redrawing, whilst still allowing you to scroll vertically for a change in background as you progress up/down the level. A bonus is that you save processing time as there's no need for horizontal redrawing (only the foreground will need redrawing).

    This does NOT limit you to scrolling per block. Star Light Zone and Spring Yard Zone used blocks, but that doesn't mean you cannot use scanlines directly.

    --- Explaining the scrolling in depth ---

    OK, so going back to "Scrolling" I mentioned that:
    Sonic games write this scroll data to a RAM space called a "buffer", temporarily, and transfer it to VRAM later during V-blank, so the format is the same, but you'll be writing to 68k RAM.

    I'm going to use Sonic 1, since that's what I'm most familiar with, but this will work for Sonic 2 too, with just name changes/RAM address changes...

    First thing is initialisation. In Sonic 1 these routines are called "BgScroll_XXX", where XXX is the zone name (be it GHZ, LZ, etc). What we need to set up here is the starting X and Y positions when the level starts. d0 contains the screen's Y position on start up, so we can use that to set up the background position.
    Code:
    BgScroll_GHZ:
    		move.w	#$0100,($FFFFF708).w			; force X position to 2nd chunk's position (So redraw always occurs at beginning correctly...)
    		asr.w	#$02,d0					; divide by 4
    		move.w	d0,($FFFFF70C).w			; save as BG Y position
    		rts
    
    Now, I'm going to have the background scroll vertically 1/4 of the speed of the foreground. I've used the asr instruction to shift the position right by 2 (this will cause it to divide by 4, since we need 1/4), you can get a variety of different divisions through shifting/adding/subtraction, etc. Here's an example of getting 1/6 of the speed:
    Code:
    BgScroll_GHZ:
    		move.w	#$0100,($FFFFF708).w			; force X position to 2nd chunk's position (So redraw always occurs at beginning correctly...)
    		asr.w	#$02,d0					; divide by 4
    		move.w	d0,d1					; copy to d1
    		asr.w	#$02,d1					; divide by another 4 (division 16)
    		sub.w	d1,d0					; subtract /16 from /4 (will get /6)
    		move.w	d0,($FFFFF70C).w			; save as BG Y position
    		rts
    You could use the div instruction to get a specific division if you really wanted, though obviously, the div instruction is extremely slow for the 68k to process, so it should be avoided unless necessary.

    Now, in both cases, you may have noticed that I moved 100 into the background X position. The reason is that the draw code works by reading tiles outside of the screen, so if the screen is at the very left of the level, it'll try to read tiles from outside of the level (presumably the far right of the level). The 100 setting will help you to keep the background chunks in the background layout together, consistently, making it easier to edit (if you are working with 128x128 chunks, you'll want to move 80 instead of 100).

    We now have the initialisation set up. Next is the main scrolling itself, "Deform_XXX", where XXX is the zone name (be it GHZ, LZ, etc). The first thing we'll do is set up the vertical redraw flag/scroll:
    Code:
    Deform_GHZ:
    		moveq	#$00,d4					; set no X movement redraw
    		move.w	($FFFFF73C).w,d5			; load Y movement
    		ext.l	d5					; extend to long-word
    		asl.l	#$08-2,d5				; multiply by 100, then divide by 4
    		bsr.w	ScrollBlock2				; perform redraw for Y
    		move.w	($FFFFF70C).w,($FFFFF618).w		; save as VSRAM BG scroll position
    		...
    The above "$FFFFF73C" will contain the distance that the foreground has moved vertically by (this is in the format of QQ.DD, 8-bit fixed point: the upper byte is the whole number of pixels, the lower byte is the fraction). This is sign-extended to a long-word, and the asl instruction will shift it 6 bits left. This is the same as shifting it 8 bits left (to multiply it by 100), and then shifting it right 2 bits (to divide by 4, the 1/4 speed we need). The position MUST be considered as a division of 100 to begin with.
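    The shift arithmetic can be tried out in Python. This sketch assumes the shifted value is then accumulated as a 16.16 fixed-point number (whole pixels in the upper word), which is what the "multiply by 100" step implies; the function names are hypothetical.

```python
def bg_y_step(fg_pixels_moved: int) -> int:
    """Model of: move.w ($FFFFF73C).w,d5 / ext.l d5 / asl.l #$08-2,d5."""
    delta = fg_pixels_moved << 8   # QQ.DD value as held in $FFFFF73C
    return delta << 6              # asl.l #6 = multiply by $100, then divide by 4


def to_pixels(fixed_16_16: int) -> float:
    """Interpret a 16.16 fixed-point value as pixels (assumed format)."""
    return fixed_16_16 / 0x10000


print(to_pixels(bg_y_step(4)))   # prints 1.0
print(to_pixels(bg_y_step(1)))   # prints 0.25
print(to_pixels(bg_y_step(-4)))  # prints -1.0
```

    Changing the final shift count changes the speed: 7 would give 1/2 speed and 5 would give 1/8.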

    You could separate the asl instruction into two if you wanted, one asl of 8 and one asr of 2, but that's up to you. It won't take a lot of processing time, but it's something to consider.

    ScrollBlock2 is designed to update the X and Y positions (this will update F70C, the background Y position). This subroutine will ONLY set the Y drawing flags, not the X, which is all we want for this. The background Y position in F70C is copied to F618, which is a single-position V-Scroll buffer; this will be transferred to VSRAM later on during V-blank.

    Now the actual scrolling itself...

    The H-Scroll buffer is held at CC00, we'll load this address to a1 and start manipulating it with a1.
    Code:
    		lea	($FFFFCC00).w,a1
    Now remember, every two words is a single scanline, $AAAA $BBBB, you don't want the FG position to be tampered with, so $AAAA always needs to be set to the FG X position. Sonic Team often move the FG X position into a data register, but keep it on the upper word:
    Code:
    		move.w	($FFFFF700).w,d0			; load FG X position
    		neg.w	d0					; reverse direction (for H-Scroll)
    		swap	d0					; send to upper word of d0
    Now d0 = AAAA#### (Where A is the FG position, and # is the "soon to be" BG position).
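    For anyone who wants to check the AAAA#### packing by hand, here is a Python sketch of the long-word that ends up in the H-Scroll buffer. The function name is invented; the negation and the word layout are the ones described above.

```python
def hscroll_long(fg_x: int, bg_scroll: int) -> int:
    """Build the AAAA BBBB long-word for one scanline.

    fg_x is the FG X position before negation; bg_scroll is the
    already-negated, already-scaled BG value for the lower word.
    """
    fg = (-fg_x) & 0xFFFF                  # neg.w, kept as a 16-bit word
    return (fg << 16) | (bg_scroll & 0xFFFF)


fg_x = 0x120
bg = (-fg_x) >> 3                          # neg.w then asr.w #3 (1/8 speed)
print(f"${hscroll_long(fg_x, bg):08X}")    # prints $FEE0FFDC
```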

    We'll change the lower word of d0 to be the BG X position that we want before moving the whole long-word of d0 to the H-Scroll table. We'll pretend for a moment we want the first $70 scanlines to scroll at one speed, and then the bottom $70 scanlines to scroll at a different speed:
    Code:
    	; --- Top 70 Scroll ---
    
    		move.w	($FFFFF700).w,d0			; load FG X position
    		neg.w	d0					; reverse direction
    		asr.w	#$03,d0					; divide by 8 (/8 the speed of the FG)
    		move.w	#$0070-1,d1				; set number of scanlines to write
    
    DGHZ_TopLoop:
    		move.l	d0,(a1)+				; write AAAA BBBB
    		dbf	d1,DGHZ_TopLoop				; repeat for all $70 scanlines
    
    	; --- Bottom 70 Scroll ---
    
    		move.w	($FFFFF700).w,d0			; load FG X position
    		neg.w	d0					; reverse direction
    		asr.w	#$02,d0					; divide by 4 (/4 the speed of the FG)
    		move.w	#$0070-1,d1				; set number of scanlines to write
    
    DGHZ_NextLoop:
    		move.l	d0,(a1)+				; write AAAA BBBB
    		dbf	d1,DGHZ_NextLoop			; repeat for all $70 scanlines
    
    		rts						; return (finished)
    Now keep in mind that obviously the background needs to be repetitive horizontally, but it can be complex vertically.

    What you should notice is that the background will always be scrolled vertically 1/4 the position of the foreground, likewise, you'll find that the top $70 pixels will always be scrolled horizontally 1/8 the position of the foreground, as well as the bottom $70 pixels being scrolled 1/4 the position of the foreground.

    Using a combination of instructions to multiply/divide the FG X position, you'll be able to get almost any speed desirable on a scanline basis.

    There is one concern of course: scrolling vertically doesn't cause the H-Scroll positions to move up or down... Well, you have to account for that too. Take the background Y position, and use that to determine the number of scanlines to write.

    ....I hope your math is OK, because it's gonna require a bit of it....

    We need to write E0 scroll positions (this is the display's current vertical size), and we've broken it up into 70 scroll positions for the top, and 70 different scroll positions for the bottom, but it needs to move vertically when the background moves vertically.

    So, if the background moves down 1 pixel, we only need to write 6F scroll positions for the top, but 71 for the bottom (70 - 1 = 6F | 70 + 1 = 71). 6F + 71 = E0.

    If the background moves down say... 14 pixels, we only need to write 5C scroll positions for the top, but 84 for the bottom (70 - 14 = 5C | 70 + 14 = 84). 5C + 84 = E0.

    So the idea here is to subtract the number of top scroll positions, but add to the number of bottom scroll positions:
    Code:
    	; --- Top 70 Scroll ---
    
    		move.w	($FFFFF700).w,d0			; load FG X position
    		neg.w	d0					; reverse direction
    		asr.w	#$03,d0					; divide by 8 (/8 the speed of the FG)
    		move.w	#$0070-1,d1				; set number of scanlines to write
    		sub.w	($FFFFF70C).w,d1	; *NEW*		; subtract Y position from scanline count
    		blo.s	DGHZ_NoTop		; *NEW*		; if there is no top to scroll, branch
    
    DGHZ_TopLoop:
    		move.l	d0,(a1)+				; write AAAA BBBB
    		dbf	d1,DGHZ_TopLoop				; repeat for all remaining top scanlines
    
    DGHZ_NoTop:  ; *NEW*
    
    	; --- Bottom 70 Scroll ---
    
    		move.w	($FFFFF700).w,d0			; load FG X position
    		neg.w	d0					; reverse direction
    		asr.w	#$02,d0					; divide by 4 (/4 the speed of the FG)
    		move.w	#$0070-1,d1				; set number of scanlines to write
    		add.w	($FFFFF70C).w,d1	; *NEW*		; add Y position to the scanline count
    		cmpi.w	#$00E0-1,d1		; *NEW*		; are there too many scanlines to scroll now?
    		blo.s	DGHZ_NextLoop		; *NEW*		; if not, branch
    		move.w	#$00E0-1,d1		; *NEW*		; force to the maximum (don't go higher)
    
    DGHZ_NextLoop:
    		move.l	d0,(a1)+				; write AAAA BBBB
    		dbf	d1,DGHZ_NextLoop			; repeat for all remaining bottom scanlines
    
    		rts						; return (finished)
    I've marked the new lines with "*NEW*" so you can see what's going on. Of course, the more complex you want it, the more you need to account for, and this can be considerably tedious, but that should pretty much explain what you've got to do in general.
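    The scanline counts from the routine above can be checked in Python. This is a sketch of the counting only, assuming the background Y position is zero or positive (the 68k code above does not handle negative values either); band_counts is a made-up name.

```python
SCREEN_LINES = 0xE0   # visible scanlines
SPLIT = 0x70          # size of the top band when the background has not moved


def band_counts(bg_y: int) -> tuple:
    """Scanlines written for the top and bottom bands at a given BG Y position."""
    top = max(SPLIT - bg_y, 0)                 # top band shrinks, never below 0
    bottom = min(SPLIT + bg_y, SCREEN_LINES)   # bottom band grows, capped at E0
    return top, bottom


for y in (0x00, 0x01, 0x14, 0x70, 0x80):
    top, bottom = band_counts(y)
    print(f"Y={y:02X}: top={top:02X} bottom={bottom:02X} total={top + bottom:02X}")
```

    Every row of the output totals E0, matching the 6F + 71 and 5C + 84 examples worked through earlier.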

    --- Code ---

    I'm going to provide some code which should take the tedious part out of all of this, but I really do suggest having a good read; perhaps you could improve it, or make your own system for handling it. For example, you could make a system whereby the FG and BG positions are in separate buffers and are transferred separately, to avoid worrying about the FG when dealing with the BG, and other crap like that.

    You'll need the code from above, up until just before loading the H-Scroll table buffer CC00 to a1, because this is where the change will occur.

    You'll need this subroutine somewhere in your source as well:
    Code:
    ; ===========================================================================
    ; ---------------------------------------------------------------------------
    ; Deform scanlines correctly using a list
    ; ---------------------------------------------------------------------------
    
    DeformScroll:
    		lea	($FFFFCC00).w,a2			; load H-scroll buffer
    		move.w	#$00E0,d7				; prepare number of scanlines
    		move.w	($FFFFF70C).w,d6			; load Y position
    		move.l	($FFFFF700).w,d1			; prepare FG X position
    		neg.l	d1					; reverse position
    
    DS_FindStart:
    		move.w	(a0)+,d0				; load scroll speed address
    		beq.s	DS_Last					; if the list is finished, branch
    		movea.w	d0,a1					; set scroll speed address
    		sub.w	(a0)+,d6				; subtract size
    		bpl.s	DS_FindStart				; if we haven't reached the start, branch
    		neg.w	d6					; get remaining size
    		sub.w	d6,d7					; subtract from total screen size
    		bmi.s	DS_EndSection				; if the screen is finished, branch
    
    DS_NextSection:
    		subq.w	#$01,d6					; convert for dbf
    		move.w	(a1),d1					; load X position
    
    DS_NextScanline:
    		move.l	d1,(a2)+				; save scroll position
    		dbf	d6,DS_NextScanline			; repeat for all scanlines
    		move.w	(a0)+,d0				; load scroll speed address
    		beq.s	DS_Last					; if the list is finished, branch
    		movea.w	d0,a1					; set scroll speed address
    		move.w	(a0)+,d6				; load size
    
    DS_CheckSection:
    		sub.w	d6,d7					; subtract from total screen size
    		bpl.s	DS_NextSection				; if the screen is not finished, branch
    
    DS_EndSection:
    		add.w	d6,d7					; get remaining screen size and use that instead
    
    DS_Last:
    		subq.w	#$01,d7					; convert for dbf
    		bmi.s	DS_Finish				; if finished, branch
    		move.w	(a1),d1					; load X position
    
    DS_LastScanlines:
    		move.l	d1,(a2)+				; save scroll position
    		dbf	d7,DS_LastScanlines			; repeat for all scanlines
    
    DS_Finish:
    		rts						; return
    
    ; ===========================================================================
    This is something that I wrote several years ago that'll automatically adjust H-Scroll positions based on the Y position.
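    For readers who find the flow of DeformScroll hard to follow in assembly, here is a Python model of the same idea. It is a sketch, not a line-for-line port: it only works out which list entry each scanline uses, it takes the scroll value directly instead of a RAM address, and it assumes a non-empty list and a non-negative Y position. The name deform_scroll and the (value, size) tuples are illustrative.

```python
def deform_scroll(sections, bg_y, screen_lines=0xE0):
    """Return the scroll value used by each visible scanline.

    sections is a list of (value, size) pairs, top to bottom, standing in
    for the dc.w address/size list; bg_y is the background Y position.
    """
    out = []
    last = None
    skip = bg_y                          # scanlines scrolled off the top
    for value, size in sections:
        last = value
        if skip >= size:                 # whole section is above the screen
            skip -= size
            continue
        visible = size - skip            # part of this section still on screen
        skip = 0
        take = min(visible, screen_lines - len(out))
        out.extend([value] * take)
        if len(out) == screen_lines:     # screen is full, stop early
            return out
    # List ended early: keep using the last section's value to the bottom
    out.extend([last] * (screen_lines - len(out)))
    return out


lines = deform_scroll([("slow", 0x70), ("fast", 0x70)], bg_y=0x14)
print(f"{lines.count('slow'):X} {lines.count('fast'):X}")  # prints 5C 84
```

    The result matches the hand-worked case from earlier: with the background moved down 14 (hex), the top speed covers 5C scanlines and the bottom speed covers the remaining 84.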

    So, here we have "Deform_GHZ" again, but using the subroutine:
    Code:
    Deform_GHZ:
    		moveq	#$00,d4					; set no X movement redraw
    		move.w	($FFFFF73C).w,d5			; load Y movement
    		ext.l	d5					; extend to long-word
    		asl.l	#$06,d5					; multiply by 100, then divide by 4
    		bsr.w	ScrollBlock2				; perform redraw for Y
    		move.w	($FFFFF70C).w,($FFFFF618).w		; save as VSRAM BG scroll position
    
    		move.w	($FFFFF700).w,d0			; load X position
    		neg.w	d0					; reverse direction
    		asr.w	#$03,d0					; divide by 8
    		move.w	d0,($FFFFA800).w			; set speed 1
    
    		move.w	($FFFFF700).w,d0			; load X position
    		neg.w	d0					; reverse direction
    		asr.w	#$02,d0					; divide by 4
    		move.w	d0,($FFFFA802).w			; set speed 2
    
    		lea	DGHZ_Act1(pc),a0			; load scroll data to use
    		bra.w	DeformScroll				; continue
    
    ; ---------------------------------------------------------------------------
    ; Scroll data
    ; ---------------------------------------------------------------------------
    
    DGHZ_Act1:	dc.w	$A800,  $70				; top 70 scroll
    		dc.w	$A802,  $70				; bottom 70 scroll
    		dc.w	$0000
    
    ; ===========================================================================
    Now, I believe Sonic 3 did something similar, but don't quote me on that. Basically, what you have now is a list of speeds, each followed by the number of scanlines to write. The idea is, you take the FG X position, divide it/multiply it, whatever... and plop the position as a word somewhere in RAM (in Sonic 1 you can use A800 - AA00, which is used as the block buffer, so it's free here). You put that RAM address in the list, followed by the number of scanlines you want to scroll at that speed.

    In the example above, I've put the FG X position divided by 8 into RAM A800-A801, then I've put the FG X position divided by 4 into RAM A802-A803. Now the list uses A800 for $70 scanlines (the FG X / 8 we made earlier), then A802 for the next $70 scanlines (the FG X / 4 we made earlier).

    You can change it any way you want, for example:
    Code:
    DGHZ_Act1:	dc.w	$A800,  $01
    		dc.w	$A802,  $02
    		dc.w	$A800,  $04
    		dc.w	$A802,  $08
    		dc.w	$A800,  $10
    		dc.w	$A802,  $20
    		dc.w	$A800,  $40
    		dc.w	$A802,  $80
    		dc.w	$0000
    I think it should be clear what's going on now.

    There are a few things to note. Firstly, the list MUST end with $0000. Secondly, if the list ends before the bottom of the screen is reached, the subroutine will continue using the last scroll position (this means the A802 in the case above will continue to the bottom of the screen).
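
    As a rough sketch of how a third speed could be added: write one more word to RAM before the "lea", then reference it in the list. This assumes $FFFFA804 is as free as the other two words, and the stripe heights are just made-up values that add up to $E0 (224) scanlines:
    Code:
    		move.w	($FFFFF700).w,d0			; load X position
    		neg.w	d0					; reverse direction
    		asr.w	#$01,d0					; divide by 2
    		move.w	d0,($FFFFA804).w			; set speed 3 (assumed free word)

    DGHZ_Act1:	dc.w	$A800,  $40				; top $40 scanlines at FG X / 8
    		dc.w	$A802,  $40				; next $40 scanlines at FG X / 4
    		dc.w	$A804,  $60				; last $60 scanlines at FG X / 2
    		dc.w	$0000					; end of list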

    You could, however, use the "Block" method as used by Star Light Zone/Spring Yard Zone (and explained by ProjectFM). The benefit of that one is that it'll save on processing time; the downside is that you'll only be able to scroll a single block space at a time.
     
  8. redhotsonic

    redhotsonic Also known as RHS Member

    Joined:
    Aug 10, 2007
    Messages:
    2,969
    Location:
    England
    Pinned.
     
    HackGame likes this.
  9. nineko

    nineko I am the Holy Cat Member

    Joined:
    Mar 24, 2008
    Messages:
    1,902
    Location:
    italy
    That's impressive, even I could understand some of it.

    I have a question, though. Maybe I am misreading it, but (1/4) - (1/16) = (4/16) - (1/16) = 3/16, which isn't 1/6.
     
  10. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    Well, what I actually meant was getting a near division by 6, not really a fraction of 1/6 per se. But it's a moot point anyway since I did the calculation slightly wrong: the first division was meant to be / 8, not / 4, and it should have been added, not subtracted.

    4000 / 8 = 800, 800 / 4 = 200, 800 + 200 = A00, and A00 x 6.666666666666...... = 4000. But you can't really have fractions without setting aside fixed-point space for them, so it's truncated to 6.

    We could go into mathematical philosophy, but the point is to demonstrate quicker ways of getting near divisions using simple powers of two division/multiplication, without resorting to using the div instruction.
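
    For reference, the corrected calculation could look something like this as code. It's only a sketch, and it assumes the value to divide is already in d0 and that d1 is free to trash:
    Code:
    		asr.w	#$03,d0					; d0 = X / 8
    		move.w	d0,d1					; copy it
    		asr.w	#$02,d1					; d1 = (X / 8) / 4 = X / 32
    		add.w	d1,d0					; d0 = X / 8 + X / 32 (a near division by 6)
    With $4000 in d0, that gives $0800 + $0200 = $0A00, matching the working above.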
     
    AURORA☆FIELDS and nineko like this.
  11. nineko

    nineko I am the Holy Cat Member

    Joined:
    Mar 24, 2008
    Messages:
    1,902
    Location:
    italy
    Of course! Please don't misread my attitude, I didn't want to imply you were wrong, I was just confused by that calculation. Since your skills are far beyond mine, I always have to reread your posts :U
    Thanks for taking the time to explain it with further detail :)

    Also, it's 6.4, not 6.6, this time I'm sure, since it's 16384/2560. I got the point :U
     
  12. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    I'm sorry, I didn't mean for my post to come across as aggressive, you were technically right. The problem is, I'm writing it in hex, and you're reading it in decimal.

    It's 6.4 in decimal, 6.6666... in hexadecimal (hex 0.6666... means 6/16 + 6/256 + 6/4096 + ..., which adds up to 0.4 in decimal).

    16384 / 2560 = 6.4
    4000 / A00 = 6.6666
     
    nineko likes this.
  13. nineko

    nineko I am the Holy Cat Member

    Joined:
    Mar 24, 2008
    Messages:
    1,902
    Location:
    italy
    Sometimes a written medium isn't the best place to express our emotions, because I never said you were aggressive, and vice versa I apologise if I seemed disrespectful even though I prefixed my post with "don't misread my attitude". It should be clear that my respect towards you is extremely high, and once again I thank you for taking the time to clear up something that wasn't clear to me; sometimes even alleged "Pro Users" like myself can be even more annoying than the latest newcomers. Hexadecimal numbers with a fractional part aren't common, hence my confusion about your 6.6; therefore, your clarification is probably for the best in absolute terms, as other people will probably benefit from it. Once again, your posts are always an interesting read :)
     
    MarkeyJester likes this.
  14. StephenUK

    StephenUK Working on a Quackshot disassembly Member

    Joined:
    Aug 5, 2007
    Messages:
    1,026
    Deviating slightly from the topic, but still somewhat relevant, is there ever a point to using MUL and DIV commands or should they always be avoided? I've never looked into how many cycles these instructions use, but I was just wondering if there was ever a point where they should be used instead of long hand arithmetic like you've used in the deformation examples. I never use MUL and DIV myself simply because I have read numerous times that they take up a substantial amount of processing time, but could you possibly put this into perspective against the method you used in your example?
     
  15. Devon

    Devon DROWN, DROWN, DROWN MYSELF! Member

    Joined:
    Aug 26, 2013
    Messages:
    1,400
    Location:
    your mom
    Effective address calculations from here.

    I guess if the total amount of cycles when using long hand arithmetic somehow builds up to be greater than using MUL and DIV, you can use those. Other than that, try to avoid them, since they take up quite the number of cycles.
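
    To put rough numbers on that, here's a sketch of a multiply by 10 done both ways. The cycle counts in the comments are taken from the standard 68000 instruction timing tables and ignore anything hardware-specific, so treat them as a guide only. The two options are alternatives, not meant to be run one after the other:
    Code:
    		; Option 1: mulu (32-bit result in d0)
    		mulu.w	#10,d0					; 38 + 2 per set bit in the source (2 here) + 4 for the immediate = 46 cycles

    		; Option 2: shifts and adds (16-bit result in d0, d1 is trashed)
    		add.w	d0,d0					; d0 = X * 2  (4 cycles)
    		move.w	d0,d1					; d1 = X * 2  (4 cycles)
    		lsl.w	#$02,d0					; d0 = X * 8  (6 + 2 per shift = 10 cycles)
    		add.w	d1,d0					; d0 = X * 10 (4 cycles)
    Option 2 comes to 22 cycles against 46, but that gap closes quickly as the multiplier needs more shifts and adds to build.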
     
    Last edited: Aug 12, 2016
  16. MarkeyJester

    MarkeyJester ♡ ! Member

    Joined:
    Jun 27, 2009
    Messages:
    2,867
    As Ralakimus said, an example would be if the scroll division/multiplication factor is very precise, but far away from a power of two:
    Code:
            muls.w   #$0497,d0
    The quickest way I could think of, without using muls, or using look-up tables, is:
    Code:
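            move.w   d0,d1          ; d1 = X * $0001
            add.w    d0,d0          ; d0 = X * $0002
            add.w    d0,d1          ; d1 = X * $0003
            add.w    d0,d0          ; d0 = X * $0004
            add.w    d0,d1          ; d1 = X * $0007
            clr.w    -(sp)          ; push a blank word onto the stack
            move.b   d0,(sp)        ; low byte of d0 into its upper byte (X * $0400)
            add.w    (sp)+,d1       ; d1 = X * $0407
            add.w    d0,d0          ; d0 = X * $0008
            add.w    d0,d0          ; d0 = X * $0010
            add.w    d0,d1          ; d1 = X * $0417
            lsl.w    #$03,d0        ; d0 = X * $0080
            add.w    d0,d1          ; d1 = X * $0497 (result in d1, low word only)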
    The muls instruction has proved quicker in this instance...

    Another example is where the source operand is variable, that is to say, the value changes based on circumstance:
    Code:
            muls.w    d1,d0
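    As a sketch of where that might turn up in scrolling code: suppose the speed factor is kept in RAM as an 8.8 fixed-point word (so $0100 is full speed and $0080 is half speed). The address $FFFFA810 is purely hypothetical, and the factor has to stay below $8000 since muls treats it as signed:
    Code:
    		move.w	($FFFFF700).w,d0			; load X position
    		neg.w	d0					; reverse direction
    		move.w	($FFFFA810).w,d1			; load 8.8 speed factor (hypothetical RAM address)
    		muls.w	d1,d0					; d0 = position * factor (32-bit result)
    		asr.l	#$08,d0					; drop the 8 fractional bits
    		move.w	d0,($FFFFA800).w			; set speed 1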
     
  17. Crash

    Crash Well-Known Member Member

    Joined:
    Jul 15, 2010
    Messages:
    302
    Location:
    Australia
    yeah, as far as i've figured out, if you know beforehand how much you need to multiply/divide by, it's probably faster to use a combination of bit rotation and addition/subtractions, but if the amount you need to multiply/divide by needs to be calculated then you should probably go with the built in instructions.

    it's close enough that after inserting your code i could port the marble garden zone act 1 deformation code over with almost no modification :p

    Staff edit: Merged posts
     
    Last edited by a moderator: Aug 13, 2016