Sonic 2 Disassembly

StephenUK · Sep 28, 2014

As some of you may be aware, I am currently working on a restructuring of the Sonic 2 disassembly. This will involve regrouping some of the files for quicker access as well as splitting some elements of the assembly file out into separate files, and reintegrating some of the currently split files back in.

To do this, I am basing it off the Git disassembly. Now, before anybody starts posting about how the Git disassembly is a bitch to work with and how the Xenowhirl one is better, let me just stop you there before you start. The Git disassembly does have a lot of improvements over the Xenowhirl disassembly, although for beginners it can be a lot more daunting to navigate and work with. The aim of this new disassembly will be to make an assembly file that is not only current, but is also friendly to beginners.

That said, no disassembly will necessarily be friendly to the majority unless I take some ideas and direction from that majority. So this is your chance to voice your opinion. What changes would you like to see made? What are the other disassemblies lacking?

In terms of file structure itself, I pretty much have an idea of how to lay it out, and it will involve grouping of related items for easy locating and replacing. For example, zone art, mappings, palettes, layouts and direct code such as deformation and DLE are now all located under a single folder (such as EHZ Data). Object art, mappings, and code are located in an object data folder, with subfolders for each object.

The reason for this is based around something I had a problem with in the past. I wanted to port a level from a hack I'd made into another hack I was working on, and had to open a lot of folders to pull out and copy data one file at a time. Now, imagine you could just mass replace a zone's data folder and have it all working immediately. Copy a data folder (e.g. EHZ Data) over the top of the same folder in another copy of the disassembly, and (provided you haven't renamed anything in that folder previously) once overwritten you can build and run. Obviously custom objects would have to be taken into account in layouts, but objects would be just as easy to duplicate and alter in bulk without working out which file corresponds to each object.
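The "copy the folder over the top" step could also be scripted. The following is only a minimal sketch in Python, assuming a layout where each copy of the disassembly has a folder with the same name (such as EHZ Data); the function name `port_zone_folder` and the paths are hypothetical and not part of any existing disassembly.

```python
import shutil
from pathlib import Path


def port_zone_folder(src_disasm, dst_disasm, folder="EHZ Data"):
    """Copy one zone's data folder over the same folder in another disassembly.

    Returns the relative paths of the files now present in the destination
    folder. Assumes the files inside the folder kept their original names.
    """
    src = Path(src_disasm) / folder
    dst = Path(dst_disasm) / folder
    if not src.is_dir():
        raise FileNotFoundError(f"no such zone folder: {src}")
    # dirs_exist_ok=True (Python 3.8+) overwrites existing files in place.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file())
```

Files that exist only in the destination are left alone, so anything renamed beforehand would linger next to the new copy.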

Anyway, those are my ideas at the moment. What are your thoughts?

DumbLemon · Sep 28, 2014

As a beginner who wants to move to Sonic 2 sometime, I can see this being useful for me. Will definitely use this when it comes out.

redhotsonic · Sep 28, 2014

StephenUK said: ↑

For example, zone art, mappings, palettes, layouts and direct code such as deformation and DLE are now all located under a single folder (such as EHZ Data). Object art, mappings, and code are located in an object data folder, with subfolders for each object.

Please don't. I like all my code in one single ASM file. That way, if I ever need to replace a RAM name or something, I can just use the Ctrl+H (Replace) and replace it all. Rather than open several files and replace them. Main reason why I don't hack Sonic 1 and never ported the priority manager into it. Was able to change all priority RAMs in Xeno's S2 within 5 seconds. With Sonic 1's SVN with hundreds of ASM files everywhere and making sure not to miss one, I couldn't be arsed. Plus, I'll have hundreds of files open just to replace stuff and locate certain codes, which is exactly what you're trying to avoid.

Or am I missing the point?

StephenUK · Sep 28, 2014

Change a single equate for the RAM address. Job done.

As for opening up lots of asm files, not really. The major bulk of the game code will all remain within a single ASM file. The only things that will be split will be things that are somewhat self contained, such as deformation code, DLE code, objects and stuff like that. The majority of what you've done in your hack involves things that will still remain in a single ASM file, like the main managers and stuff. The new file structure also allows people to work in groups on hacks and not have to worry too much about changes other people make. With everything in a single assembly file, if someone modified the sound driver and someone else modified the Sonic object, but in their own files, then you have the tedious task of having to pool the changes all into one disassembly. With the way I'm setting it up, the two could be altered by people separately and brought together later by simply copying and overwriting the relevant files. Of course, an SVN is also an option for that kind of stuff, but this should be able to achieve a similar kind of scenario.

amphobius · Sep 28, 2014

Yes, you are missing the point; having separate code files for different things is pretty much common practice in the real world. Splitting things off is better for self-management and organisation, but only if it's done well. Sonic 1's current disassembly is a bad example of splitting off, but if it's done well it's very nice for handling things.

Having one big file that has everything makes it a pain in the ass to handle. I'd much rather have the main file be the main core of everything, with everything split off into where it's needed - Sonic's code in a folder with all his art and stuff, all the music code in a chain of music folders (like ./Sound/Music, ./Sound/Z80, ./Sound/SFX, etc). I think that's the general idea - something that's very organised and self-explanatory to use.

(was directed at RHS, Stephen sniped me whilst writing this post)

redhotsonic · Sep 28, 2014

StephenUK said: ↑

Change a single equate for the RAM address. Job done.

Not true. All priorities had to be changed from .b to .w so I was able to do that with a single ASM file with the replace command. Rather than open every single object ASM file and replace the .b with .w
For example, it means opening up Sonic.asm, then spring.asm, then monitor.asm, blah blah blah, just to change the priority RAM from .b to .w. But as long as the main code and general shit is in one place, I'm happy.

Also, I'm guessing you are, since you told me to change the RAM equate in reply to my previous post; make sure the RAM stays like:

Code:
Sonic_Dust = ramaddr( $FFFFD100 )
Tails_Dust = ramaddr( $FFFFD140 )
...
Tails_respawn_counter = ramaddr( $FFFFF704 )
Rather than:
Code:
Sonic_Dust:			; Sonic's spin dash dust
				ds.b	object_size
Tails_Dust:			; Tails' spin dash dust
				ds.b	object_size
...
Tails_respawn_counter:		ds.w	1
Like, what the fuck does that even mean? What RAM address is that? I had to look into Xenowhirl's ASM file to find out what RAM address it actually was.
If it wasn't for these two problems, I'd probably use SVN/Git rather than Xenowhirl.

EDIT: Sorry, just realised I was moaning about which disassembly was better then. Whoops! Anyway, those are my only two suggestions for your disassembly. Just don't go nuts with the separate files, and make the RAM equates clear.
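The ".b to .w across every object file" chore described in this post can be done in one pass even when the code is split up. Below is a rough sketch in Python, not a tested tool for any particular disassembly: it assumes the operand is spelled `priority` and that every source file ends in `.asm`, and the function name `widen_accesses` is made up for illustration. A blind rewrite like this would still need reviewing by hand afterwards.

```python
import re
from pathlib import Path


def widen_accesses(root, symbol="priority"):
    """Change '<op>.b' to '<op>.w' on every code line that mentions `symbol`.

    Walks all .asm files under `root` and returns {relative_path: lines_changed}.
    The symbol name and the .asm extension are assumptions for this sketch.
    """
    root = Path(root)
    # Optional label, then a mnemonic with a ".b" size suffix.
    size_suffix = re.compile(r"^(\s*(?:\S+:\s*)?[A-Za-z]+)\.b\b")
    symbol_use = re.compile(rf"\b{re.escape(symbol)}\b")
    changed = {}
    for path in sorted(root.rglob("*.asm")):
        # latin-1 round-trips any byte values unchanged.
        lines = path.read_bytes().decode("latin-1").splitlines(keepends=True)
        count = 0
        for i, line in enumerate(lines):
            code = line.split(";", 1)[0]  # ignore trailing comments
            if symbol_use.search(code) and size_suffix.search(code):
                lines[i] = size_suffix.sub(r"\1.w", line, count=1)
                count += 1
        if count:
            path.write_bytes("".join(lines).encode("latin-1"))
            changed[path.relative_to(root).as_posix()] = count
    return changed
```

Most editors with a "replace in files" feature do the same job; the point is only that split files do not force opening each one by hand.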

Pacca · Sep 28, 2014

Please make sure every file name and label makes sense. Particularly with files, having to go to Sonic Retro to find out the object IDs for the Hivebrain disassembly's mappings is painful.

StephenUK · Sep 28, 2014

The equates are one of my biggest problems with the Git disassembly, and I am more inclined to use the method you've shown unless someone can prove to me why the Git version is the better way to go. Another thing in the equates section that I don't see the point of is using constants in place of shared art locations in VRAM. It's all well and good if you are just building stock S2, but when you are hacking and might want to shift art around on a per zone basis, it ends up becoming a completely pointless addition. Depending on how much of a pain it is to change the equate system, I may build my disassembly over the Xenowhirl one, but incorporating the better features of the Git disassembly.

Hitaxas · Sep 28, 2014

redhotsonic said: ↑

Also, I'm guessing you are, since you told me to change the RAM equate in reply to my previous post; make sure the RAM stays like:

Code:
Sonic_Dust = ramaddr( $FFFFD100 )
Tails_Dust = ramaddr( $FFFFD140 )
...
Tails_respawn_counter = ramaddr( $FFFFF704 )
Rather than:
Code:
Sonic_Dust:			; Sonic's spin dash dust
				ds.b	object_size
Tails_Dust:			; Tails' spin dash dust
				ds.b	object_size
...
Tails_respawn_counter:		ds.w	1
Like, what the fuck does that even mean? What RAM address is that? I had to look into Xenowhirl's ASM file to find out what RAM address it actually was.

I second this. It is much easier (in some ways) to work with the RAM when you can actually see what the address is rather than having to add all the byte sizes up to figure it out. This is what turns me away from the Git disassembly so much.

However, the ds method also makes it easier to move a variable without having to figure out exactly where it needs to go (which can lead to messing the game up if not done right, obviously!) So it's a double-edged sword...
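The "adding all the byte sizes up" step is mechanical, which is also why inserting a variable in the ds style shifts everything after it automatically. Here is a small Python sketch of that arithmetic. The value `OBJECT_SIZE = 0x40` is inferred from the two addresses quoted above ($FFFFD140 - $FFFFD100), and `Example_Word` is a hypothetical variable added purely for illustration.

```python
OBJECT_SIZE = 0x40  # inferred from the quoted equates: $FFFFD140 - $FFFFD100


def assign_addresses(start, declarations):
    """Walk (name, unit_size, count) entries in order, like consecutive ds lines.

    Returns ({name: address}, first_free_address).
    """
    addresses = {}
    cursor = start
    for name, unit_size, count in declarations:
        addresses[name] = cursor
        cursor += unit_size * count
    return addresses, cursor


layout = [
    ("Sonic_Dust", 1, OBJECT_SIZE),  # ds.b object_size
    ("Tails_Dust", 1, OBJECT_SIZE),  # ds.b object_size
]
addresses, end = assign_addresses(0xFFFFD100, layout)
for name, address in addresses.items():
    print(f"{name} = ${address:08X}")
```

Inserting a hypothetical `("Example_Word", 2, 1)` between the two entries moves `Tails_Dust` up by two bytes without any address being edited by hand, which is both the convenience and the risk being described.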

nineko · Sep 28, 2014

If you do a good job I might reconsider the chance to work on a Sonic 2 hack.

The one thing I really wouldn't want to see are those confusing +/- labels.

Hitaxas · Sep 28, 2014

The +/- labels are awesome, are you kidding? They do exactly what they need to do: branching to a + goes to the next + down the line, and branching to a - goes back to the last - up the line. (Of course there are many different things you can do with them, but that is the most convenient example I could make at this time.) >.>
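The lookup rule described in this post (next + downwards, last - upwards) can be modelled in a few lines. This Python sketch is a simplified model and not the assembler's actual implementation: it only handles a single `+` or `-`, it treats a label as a `+` or `-` in the first column, and the sample source lines are invented for the example.

```python
def is_anonymous_label(line, symbol):
    """True if the line defines `symbol` ('+' or '-') as a label in column 0."""
    code = line.split(";", 1)[0]  # drop any trailing comment
    tokens = code.split()
    return bool(tokens) and tokens[0] == symbol and not code[0].isspace()


def resolve_anonymous(lines, branch_index, operand):
    """Return the index of the line that a branch to '+' or '-' lands on."""
    if operand == "+":
        search = range(branch_index + 1, len(lines))  # next one down
    elif operand == "-":
        search = range(branch_index - 1, -1, -1)  # last one up
    else:
        raise ValueError("only single '+' and '-' are modelled here")
    for i in search:
        if is_anonymous_label(lines[i], operand):
            return i
    raise LookupError(f"no {operand!r} label reachable from line {branch_index}")


source = [
    "-\tmove.b\t(a1)+,(a2)+",  # 0: loop start
    "\tdbf\td0,-",             # 1: branches back to line 0
    "\ttst.b\td1",             # 2
    "\tbeq.s\t+",              # 3: branches forward to line 5
    "\tmoveq\t#0,d1",          # 4
    "+",                       # 5
    "\trts",                   # 6
]
```

Here `resolve_anonymous(source, 1, "-")` gives line 0 and `resolve_anonymous(source, 3, "+")` gives line 5.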

MainMemory · Sep 28, 2014

StephenUK said: ↑

The equates are one of my biggest problems with the Git disassembly, and I am more inclined to using the method you shown unless someone can prove to me why the Git version is the better way to go.

I doubt I can do that, but here is why I prefer it:

Every variable has its size attached to it by the attribute and number on the ds directives.

There are extra labels allowing you to see when a range of RAM is treated specially, such as the area of dynamic object RAM that isn't used in 2P mode, or the misc level variables that are cleared when the level starts.

It's much more difficult to inadvertently clobber a variable with another, such as assigning a word variable to a RAM address that is occupied by the low word of a longword variable.

Moving variables around or inserting new variables is as easy as cut and paste, and everything else adjusts automatically.

Areas of unused RAM are clearly marked as unused right in the declarations.

Getting the RAM addresses for debugging is as easy as telling AS to generate a listing file (add -L to the AS command line and change all instances of 'listing off' to 'listing purecode') (this may become the default behavior for the Git disassembly in the future).
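The clobbering point above (a word variable landing on the low word of a longword) is the kind of mistake fixed-address equates cannot catch on their own, but a small check can. This is a sketch in Python under the assumption that each equate's size is known or annotated somewhere; the variable names and addresses are hypothetical.

```python
def find_overlaps(variables):
    """Return pairs of names whose RAM ranges overlap.

    `variables` is an iterable of (name, address, size_in_bytes).
    """
    ordered = sorted(variables, key=lambda v: v[1])
    overlaps = []
    for i, (name_a, addr_a, size_a) in enumerate(ordered):
        end_a = addr_a + size_a
        for name_b, addr_b, _ in ordered[i + 1:]:
            if addr_b >= end_a:
                break  # sorted by address, so nothing later can overlap
            overlaps.append((name_a, name_b))
    return overlaps


# Hypothetical equates: a longword at $FE10 and a word placed on its low word.
equates = [
    ("Example_Long", 0xFFFFFE10, 4),
    ("Example_Word", 0xFFFFFE12, 2),
    ("Example_Byte", 0xFFFFFE14, 1),
]
```

With the ds style the same guarantee comes for free, because each declaration starts where the previous one ended.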

nineko · Sep 28, 2014

+/- labels are annoying and confusing because I have to look for them and it's always possible to miscount. Names like sloc_1234 are much better even if they're meaningless. I like to know at first sight which label I'm going to branch to instead of having to figure it out.

StephenUK · Sep 28, 2014

MainMemory said: ↑

I doubt I can do that, but here is why I prefer it:

Every variable has its size attached to it by the attribute and number on the ds directives.

While I can see what you mean with this, I think unless the RAM area is of a significant size, it doesn't really make much of a difference. If the RAM addresses in the equates are listed in order, then if you see FE12 followed by FE14, then it's clear to see the size of the RAM value. Unused RAM variables can also be listed in the same way.

MainMemory said: ↑

There are extra labels allowing you to see when a range of RAM is treated specially, such as the area of dynamic object RAM that isn't used in 2P mode, or the misc level variables that are cleared when the level starts.

I can see the benefit of this, but is there no way of handling these in conjunction with the fixed variable method? Looking at some of the equates, it starts to look a lot more complicated than it probably is. It's not too bad for experienced hackers, but from a beginner's perspective it starts to look quite daunting.

MainMemory said: ↑

It's much more difficult to inadvertently clobber a variable with another, such as assigning a word variable to a RAM address that is occupied by the low word of a longword variable.

I actually fully agree with this point. It's one of the redeeming features for me with this version of the equates table.

MainMemory said: ↑

Moving variables around or inserting new variables is as easy as cut and paste, and everything else adjusts automatically.

I can see what you mean to some extent, but for the most part variables could be moved by changing the attached addresses, and inserting new ones is a copy and paste effort, with an altered RAM variable. I understand you run the risk of causing a clash, but if the unused variables are clearly outlined then there really shouldn't be any reason why you would move stuff around so much and insert stuff in a way that would break the game.

MainMemory said: ↑

Areas of unused RAM are clearly marked as unused right in the declarations.

Unused RAM could still be defined within the equates.

MainMemory said: ↑

Getting the RAM addresses for debugging is as easy as telling AS to generate a listing file (add -L to the AS command line and change all instances of 'listing off' to 'listing purecode') (this may become the default behavior for the Git disassembly in the future).

Not had any experience with that personally so I can't really comment on it either way.

I've broken down the points above and shared my view on them. While I agree with some of them, I do have a few issues. Thanks for sharing your views though, I need more things like this to see which is the best way to take this.

ThomasThePencil · Sep 29, 2014

I'm kinda on both sides of the room here.

Firstly, the +'s and -'s. In all honesty, I actually find these easier to use. Why? Because with them, you don't have to worry about "symbol double defined" errors. With "normal" names (such as byte_21F200), you could very well end up writing a label name that is already used, and after a while you realize it can be a bit tedious to figure out which ones are used and which ones are not.

Secondly, RAM equates, in which Xenowhirl gets my vote. The fixed variable system allows for simple modification of variables, since you know exactly where something is, whereas the ds method, while more idiot-proof (for lack of a nicer term), is kind of daunting to beginners. I've experienced the difficulties of working with the Git disasm of S2 firsthand, so I know the ins and outs of it.

Point the third: OMG TEH ERRORS WHEN USING GIT. 'Nuff said.

I'm leaning towards the non-Git side, but eh.

I'm liking the idea of a new disasm which kind of does everything the Xenowhirl one does, plus the better features of the Git disasm.

MarkeyJester · Sep 29, 2014

StephenUK said: ↑

While I can see what you mean with this, I think unless the RAM area is of a significant size, it doesn't really make much of a difference. If the RAM addresses in the equates are listed in order, then if you see FE12 followed by FE14, then it's clear to see the size of the RAM value. Unused RAM variables can also be listed in the same way.

While I have no comment regarding the decisions themselves, I think it's important for me to point out a minor flaw, as it would be in everyone's best interests for future reference. I agree that it seems clear that "FE12 followed by FE14" would indicate the size of the "RAM value" (in this case it is assumed to be a word). However, you are not guaranteed that, and I'll explain why...

Let's say, as an example, that Sonic Team used FE12 for a byte value, but then needed space for a word value afterwards. They would not have been able to fit the word value into FE13 - FE14 due to a limitation of the 68k; that is, the address error upon attempting a word write/read from an odd address. They would therefore align it to the next even address, FE14 - FE15, leaving FE13 free but unused. Of course, you wouldn't know that by simply seeing the gap, as it would require scrutinising the code that accesses these addresses.

It is also a possibility that FE13 was equated for use with something during development, and that the use was later removed but the equate was left in, leaving an unused byte overlooked by Sonic Team themselves. I'm not implying such a situation would or would not occur in Sonic 2, though I do know there are various places in Sonic 1 that your idea would have implications for; I'm just making sure that you understand the flaw here.

As a compromise, could we not have the size of it in a comment on the end, or have the size indicated in the equate name? (Not that I want to promote such an idea, I believe that'll lead to a larger mess, but I'm playing devil's advocate here, so cut me some slack in advance).
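The alignment argument can be shown with a few lines of arithmetic. This Python sketch places variables one after another and pads word and longword variables up to the next even address, as the 68k requires; the function name `place` is made up for the example, and the addresses are the ones used in this post.

```python
def place(start, sizes):
    """Place variables of the given sizes (1, 2 or 4 bytes) in order.

    Word and longword variables are moved up to the next even address.
    Returns ([(address, size), ...], [padding_byte_addresses]).
    """
    placements, padding = [], []
    cursor = start
    for size in sizes:
        if size > 1 and cursor % 2:
            padding.append(cursor)  # odd address skipped for alignment
            cursor += 1
        placements.append((cursor, size))
        cursor += size
    return placements, padding


byte_then_word = place(0xFE12, [1, 2])
word_then_word = place(0xFE12, [2, 2])
```

Both layouts put the second variable at FE14, so an address list alone cannot tell a word at FE12 from a byte at FE12 followed by a padding byte at FE13.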

Spanner · Sep 29, 2014

StephenUK said: ↑

Depending on how much of a pain it is to change the equate system, I may build my disassembly over the Xenowhirl one, but incorporating the better features of the Git disassembly.

I think this would probably be the better thing to do. Add the improvements, don't bother with the stuff that doesn't need to be changed.

StephenUK · Sep 29, 2014

MarkeyJester said: ↑

While I have no comment regarding the decisions themselves, I think it's important for me to point out a minor flaw, as it would be in everyone's best interests for future reference. I agree that it seems clear that "FE12 followed by FE14" would indicate the size of the "RAM value" (in this case it is assumed to be a word). However, you are not guaranteed that, and I'll explain why...

Let's say, as an example, that Sonic Team used FE12 for a byte value, but then needed space for a word value afterwards. They would not have been able to fit the word value into FE13 - FE14 due to a limitation of the 68k; that is, the address error upon attempting a word write/read from an odd address. They would therefore align it to the next even address, FE14 - FE15, leaving FE13 free but unused. Of course, you wouldn't know that by simply seeing the gap, as it would require scrutinising the code that accesses these addresses.

It is also a possibility that FE13 was equated for use with something during development, and that the use was later removed but the equate was left in, leaving an unused byte overlooked by Sonic Team themselves. I'm not implying such a situation would or would not occur in Sonic 2, though I do know there are various places in Sonic 1 that your idea would have implications for; I'm just making sure that you understand the flaw here.

As a compromise, could we not have the size of it in a comment on the end, or have the size indicated in the equate name? (Not that I want to promote such an idea, I believe that'll lead to a larger mess, but I'm playing devil's advocate here, so cut me some slack in advance).

Fair comment, I can see how this may cause confusion at some points. I also don't want to be going overboard by commenting every equate with a size. That said, do you have any ideas on what could be used as a happy medium? The Git method is clearly causing a lot of negativity towards it, but at the same time the Xenowhirl method has its flaws.

Thinking about it though, I'll probably stick with the Xenowhirl method. My reasoning being that if people like the Git method, then the Git disassembly is perfectly suited to their needs. The Xenowhirl disassembly is a bit too outdated, which is one of the main reasons I started work on this project, so I think an updated version based around that will be the better option, for a starting point at least.

Cinossu · Sep 29, 2014

While it may add further complication and work initially with the making of the disassembly, why not have both solutions, with a simple equate that selects which one is loaded? That way, when someone first downloads this disassembly, they get the choice of which to use and can ignore the other entirely. Both should be possible to use throughout a disassembly without issue, seeing as the labels for either can just stay the same.
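One way to offer both styles without maintaining two lists by hand would be to generate them from a single table. That is a variation on the suggestion above rather than the assembly-time switch it describes. The Python sketch below is only an illustration: the output formats copy the two styles quoted earlier in the thread, the $40 object size is inferred from the quoted addresses, and the `render` helper is hypothetical.

```python
UNIT_SIZE = {"b": 1, "w": 2, "l": 4}  # bytes per ds.b / ds.w / ds.l unit


def render(table, start, fixed_addresses):
    """Render (name, suffix, count) entries in one of the two equate styles.

    fixed_addresses=True  -> 'Name = ramaddr( $XXXXXXXX )'
    fixed_addresses=False -> 'Name:<tab>ds.<suffix><tab>$count'
    """
    lines = []
    cursor = start
    for name, suffix, count in table:
        if fixed_addresses:
            lines.append(f"{name} = ramaddr( ${cursor:08X} )")
        else:
            lines.append(f"{name}:\tds.{suffix}\t${count:X}")
        cursor += UNIT_SIZE[suffix] * count
    return lines


table = [("Sonic_Dust", "b", 0x40), ("Tails_Dust", "b", 0x40)]
fixed_style = render(table, 0xFFFFD100, fixed_addresses=True)
ds_style = render(table, 0xFFFFD100, fixed_addresses=False)
```

Because the labels are identical in both outputs, the rest of the code would not need to know which style was chosen.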

redhotsonic · Sep 29, 2014

MarkeyJester said: ↑

Of course, you wouldn't know that by simply seeing the gap, as it would require scrutinising the code that accesses these addresses.

Another reason why I prefer Xenowhirl's dis. All code is in one file. Do a search for FE13. No results? It's not being used (do a search for FE12 and FE10 in case they're using .w or .l, to be on the safe side). If you have a lot of split files, like SVN/Git, then I see your point, as I don't want to search every single file to see if a RAM address is being used or not. That's why I like all my code in one file, but that's down to personal preference, and me moaning again =P
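The "also search FE12 and FE10" rule of thumb follows from which word and longword accesses can cover a given byte. A short Python sketch of that reasoning, assuming word and longword accesses always start on even addresses; the function name is hypothetical.

```python
def accesses_covering(address):
    """List (start_address, size) pairs for accesses that would touch `address`.

    Covers byte, word and longword accesses, keeping word and longword
    accesses on even start addresses.
    """
    hits = [(address, 1)]
    for size in (2, 4):
        for start in range(address - size + 1, address + 1):
            if start % 2 == 0:
                hits.append((start, size))
    return hits


search_terms = sorted({f"{start:04X}" for start, _ in accesses_covering(0xFE13)})
```

For FE13 this yields FE10, FE12 and FE13 as the addresses to search for. A plain text search can still miss accesses made through an address register or a loop that clears a whole range, so an empty result is a strong hint rather than proof.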
