RTA Retiming

This page documents RTA retiming, and is intended for load retiming specifically.

RTA Retiming | Start + End
RTA Retiming | Load Visuals
- In Transitions
- Out Transitions
RTA Retiming | FMVs

RTA Retiming | Start + End

An Any% run starts on the first frame of no text after starting a file, and ends on the first frame of the FLUDD textbox after the last platform is hit in the final boss. The end frame is the right-hand frame in each of the two examples below (each of which shows consecutive frames):

^ Normal end frame

^ Glitched end frame

Note that some leaderboards still use the old convention, which takes the last frame of text as the start frame rather than the first frame of no text. Mixing first and last frames within one duration is incorrect in general (e.g. it biases estimates of continuous quantities and gives the wrong result at 60fps). Runs timed with the old convention come out 1 frame longer (at the recording frame rate) than runs timed with the correct convention.
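As a rough illustration of why the two conventions differ by exactly one frame, here is a minimal sketch. The function name and the frame indices are made up for the example; indices are counted at the recording frame rate.

```python
def duration_frames(start_frame: int, end_frame: int) -> int:
    """Duration between two frames that are both marked 'first frame of X'."""
    return end_frame - start_frame


# Hypothetical indices: text is last visible on frame 99,
# so the first frame of no text is frame 100.
last_text_frame = 99
first_no_text_frame = last_text_frame + 1
end_frame = 1000  # hypothetical first frame of the FLUDD textbox

correct = duration_frames(first_no_text_frame, end_frame)  # first-to-first
old = duration_frames(last_text_frame, end_frame)          # last-to-first (old convention)

print(correct, old, old - correct)
```

The difference is always one frame regardless of the indices chosen, because the old start frame is always the frame immediately before the correct one.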

RTA Retiming | Load Visuals

Gameplay segments and shine-selects are all timed using reference visuals. This doc covers start and end frames for video (for load timing), but not stop frames (for IL timing). “NTSC” below refers to NTSC-J and NTSC-U simultaneously.

In Transitions

First frames always match the reference.

In Circle

Levels (non-initial); Delfino (from Delfino).

Peephole shape, consistent size worth remembering.

In Spiral

Delfino (from non-Noki levels).

Sliver with edge pointing about 30° below right; following frame is more like 70°.

In Fade (White)

Levels (initial).

These are normally clear.

ℹ️ Check: frames (7,8) are the first pair with black bars that don’t move (where frame 0 is the first frame of the fade, pictured above).

In Fade (Black)

^ PAL

^ NTSC

Delfino (from Noki levels).

On common Any% routes, the Shadow Mario Turbo event cutscene will accompany this transition. The first frame has Shadow Mario pointing the paintbrush slightly below horizontal, whereas later frames have it above horizontal.

ℹ️ Check: frames (7,8) are the first pair with black bars that don’t move (where frame 0 is the first frame of the fade, pictured above).

⚠️ NTSC fades: much darker than PAL. The example above is from Twitch; with YouTube encoding, this frame will often be invisible. Compare: PAL YouTube example.

In Fade (Shine Select)

Shine Selects.

This is vague and quite variable, but always identifiable via a slight discolouration, some diagonal bands, and/or being the only frame with the shine sprites either out-of-place or missing.

ℹ️ Check: frame 14 is the first alignment of the shine sprite – the three spheres on its top spikes line up (where frame 0 is the first frame of the fade, pictured above).

Out Transitions

Last frames are always the first all-black/white frame following a segment, so references are given for the frame right before.

Out Circle

Level segments (all terminal ones and all but 11 non-terminal ones).

Pinhole shape, consistent size worth remembering. This one is often too dark to see, so it’s worth learning the shape of the frame 1f before (smaller than peephole) and the frame 2f before (larger than peephole). Failing that:

ℹ️ Check: circle out transitions are 23f, starting with the first appearance of the black corners (top two black, bottom two not black). Labelling this as frame 0, frame 20 should be slightly larger than the peephole (see circle in transitions), frame 21 slightly smaller, and frame 22 is the pinhole (pictured above). Frame 23 is then the first blackout frame, which is the one we’re trying to identify.
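The frame counting in this check can be sketched as a small helper. The function and dictionary names are made up for illustration; the offsets are the ones given in the check above, and the input is the video frame index where the black corners first appear.

```python
# Offsets relative to frame 0 (first appearance of the black corners).
CIRCLE_OUT_OFFSETS = {
    20: "slightly larger than the peephole",
    21: "slightly smaller than the peephole",
    22: "pinhole (reference frame)",
    23: "first blackout frame (segment end)",
}


def circle_out_landmarks(first_corner_frame: int) -> dict:
    """Map absolute video frame indices to the landmark expected on them."""
    return {
        first_corner_frame + offset: label
        for offset, label in CIRCLE_OUT_OFFSETS.items()
    }


# Hypothetical example: corners first appear on video frame 500.
for frame, label in sorted(circle_out_landmarks(500).items()):
    print(frame, label)
```

If the pinhole itself is too dark to see, the two frames before it can still be checked against the expected shapes.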

Out Spiral

Delfino (non-Noki level entrance; corona entrance).

Claw shape (edge pointing NW).

Out Fade (White)

Delfino (Noki level entrance).

These are normally clear.

ℹ️ Check: standard PAL fadeouts are 10f and NTSC 12f. If the first faded frame is frame 0, the first whiteout frame is frame 10/12 (PAL/NTSC), so the reference frame is frame 9/11 (PAL/NTSC).

Out Fade (Black)

^ PAL

^ NTSC

11 level segments: fmv (ae, b2e, r1e, cc), Sirena hotel entry (×5), p1y, n3e; title screen.

ℹ️ Check: standard PAL fadeouts are 10f and NTSC 12f. If the first faded frame is frame 0, the first blackout frame is frame 10/12 (PAL/NTSC), so the reference frame is frame 9/11 (PAL/NTSC). The visibility of the first faded frame varies; it helps to stare at light parts of the level (n3e sky, cc clouds, etc.). For Sirena hotel entry and p1y specifically, the first faded frame of the last textbox is frame -3 relative to the above numbering, which can be used instead (on Sirena hotel entry in particular, the first faded frame is very hard to see).

⚠️ NTSC fades: much darker than PAL. The example above is from Twitch; with YouTube encoding, this frame will often be almost or completely invisible. Compare: PAL YouTube example.
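The counting in this check (including the textbox fallback) can be sketched as below. The names are hypothetical; the fade lengths and the -3 offset are taken from the check above, and the inputs are absolute video frame indices.

```python
FADE_OUT_LENGTH = {"PAL": 10, "NTSC": 12}  # frames, per the check above
TEXTBOX_OFFSET = -3  # textbox's first faded frame, relative to fade frame 0


def fade_out_frames(first_faded_frame: int, region: str) -> tuple:
    """Return (reference_frame, first_blackout_frame) for a standard fadeout."""
    blackout = first_faded_frame + FADE_OUT_LENGTH[region]
    return blackout - 1, blackout


def fade_out_frames_from_textbox(textbox_fade_frame: int, region: str) -> tuple:
    """Same result, counted from the last textbox's first faded frame
    (Sirena hotel entry and p1y only)."""
    first_faded_frame = textbox_fade_frame - TEXTBOX_OFFSET
    return fade_out_frames(first_faded_frame, region)


# Hypothetical frame indices:
print(fade_out_frames(100, "PAL"))
print(fade_out_frames(100, "NTSC"))
print(fade_out_frames_from_textbox(97, "PAL"))
```

Both routes should land on the same blackout frame; if they disagree on a given sample, that is a sign one of the anchor frames was misidentified.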

Out Fade (Shine Select Solid)

Shine Selects (new episodes).

Pattern of radial lines; consistent and easy enough to see if you compare with the following whiteout frame and scan your eyes in a circle to confirm no lines are visible.

ℹ️ Check: shine select fadeouts are most easily counted from the frame the white circle appears over the shine sprite to indicate it’s been selected. The shine revolves through 1 full turn in 30f. The white-circle frame is also the first frame of the fadeout (though the fade is very subtle on NTSC, so not worth noting). Marking this as frame 0, the first whiteout frame is frame 25/30 (PAL/NTSC), so the reference frame is frame 24/29 (PAL/NTSC). The reference frames look similar on PAL and NTSC.

Out Fade (Shine Select Dashed)

Shine Selects (old episodes, usually 3YG in Any%).

Pattern of radial dashes; fairly hard to see.

ℹ️ Check: same as Out Fade (Shine Select Solid).

⚠️ YouTube: the last frame is very vulnerable to YouTube encoding.

Out Swish

Death.

You can’t miss this one (you also can’t miss death).

Out Diamond

Savewarp.

Easy to see owing to blue savebox background.

RTA Retiming | FMVs

Theory

Unskipped FMVs can be easily retimed by pinpointing the first blackout frame at their end (this is preferred to their start frames, which NTSC fades and YouTube often make invisible). This is done to distinguish FMV loads from level loads, though it’s not necessary for a total load timing. The duration of an unskipped FMV is the length of the video file.

Skipped FMVs must be retimed because the number of frames late an FMV is skipped affects the length of that FMV in the run, which must be deducted from loads. The theoretical model used for this is described below. I don’t have too much confidence in it but it best explains the evidence I’ve seen so far.

The frame a skipped FMV starts on is the frame on which it displays (or would have displayed, if it had been visible) its first frame; the frame it ends on is the start frame + 2 × ∆fs, where ∆fs is the number of frames late the FMV is skipped. Optimally, an FMV starts and ends on the same frame (meaning its duration is zero); a ∆fs=1 FMV ends 2f after it starts, and so on. The timeloss incurred by the FMV is thus 2 × ∆fs (this breaks down if the FMV is allowed to fully fade in, which takes over 15f IIRC).
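A minimal sketch of this model, with made-up function names. It only encodes the end = start + 2 × ∆fs rule from the paragraph above, and does not try to handle the fully-faded-in case where the model breaks down.

```python
def skipped_fmv_end(start_frame: int, dfs: int) -> int:
    """End frame of a skipped FMV; dfs is the number of frames late it was skipped."""
    return start_frame + 2 * dfs


def skipped_fmv_timeloss(dfs: int) -> int:
    """Frames lost to a skipped FMV under the same model."""
    return 2 * dfs


# Hypothetical run with three skipped FMVs, skipped 0, 1 and 2 frames late.
dfs_values = [0, 1, 2]
total_loss = sum(skipped_fmv_timeloss(d) for d in dfs_values)
print(total_loss)
```

The total is the number of frames to deduct from loads for skipped FMVs in that hypothetical run.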

Non-Fluddless FMV Visuals

The FMV is visible according to the following pattern (which continues while the FMV is still fading in). This applies to every Any% FMV except the Fluddless FMV:

∆fs	turn	end
0	0	0
1	1	1
2	2	3
3	3	5
4	4	7

The numbers in the data columns are frame numbers relative to frame 0, defined as the first frame the FMV is visible. The turn frame is the first frame where the FMV does not get brighter relative to the previous frame; the end frame is the first frame of blackout. This means the visible FMV brightens for [turn] frames and lasts for [end] frames. Since the FMV duration is 2∆fs, the FMV is visible for one fewer frame than it lasts, unless it lasts 0 frames. Hence, the first frame of blackout is one frame before the frame the FMV actually ends on (compare this to gameplay timing, where these always coincide).
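The table rows above fit a simple closed form: turn = ∆fs and end = 2∆fs - 1 (or 0 when ∆fs = 0). That closed form is my reading of the table rather than something confirmed beyond ∆fs = 4, so treat this as a sketch; the function name is made up.

```python
def non_fluddless_visuals(dfs: int) -> tuple:
    """Return (turn, end) relative to the first visible frame.

    Closed form inferred from the tabulated rows (dfs 0..4); it is an
    assumption that it keeps holding while the FMV is still fading in.
    """
    turn = dfs
    end = max(0, 2 * dfs - 1)  # visible for one fewer frame than the 2*dfs duration
    return turn, end


print("dfs\tturn\tend")
for dfs in range(5):
    turn, end = non_fluddless_visuals(dfs)
    print(f"{dfs}\t{turn}\t{end}")
```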

Fluddless FMV Visuals

The Fluddless FMV is the only one that starts with a black frame, so it obeys this pattern instead:

∆fs	turn	end
0	0	0
1	0	0
2	1	2
3	2	4
4	3	6

Hence, in this case, the first visible frame is one after the FMV starts (rather than coinciding), and the last visible frame is one before the FMV ends (as with non-Fluddless).
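The Fluddless rows fit the closed form turn = ∆fs - 1 and end = 2∆fs - 2, both floored at 0. As before, this is inferred from the table rows (∆fs 0 to 4) and the function name is hypothetical.

```python
def fluddless_visuals(dfs: int) -> tuple:
    """Return (turn, end) relative to the first visible frame of the Fluddless FMV.

    The leading black frame costs one visible frame at the start, on top of
    the one lost at the end, hence 2*dfs - 2 visible frames.
    """
    turn = max(0, dfs - 1)
    end = max(0, 2 * dfs - 2)
    return turn, end


print("dfs\tturn\tend")
for dfs in range(5):
    turn, end = fluddless_visuals(dfs)
    print(f"{dfs}\t{turn}\t{end}")
```

Note that ∆fs = 0 and ∆fs = 1 give the same output, which is why they can’t be told apart from video alone.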

Audiovisual Timing

Because NTSC fades and YouTube together make the first few frames of skipped FMVs impossible to see, audio can be used to give more information. However, audio and video analyses each give self-inconsistent results sometimes, and can also contradict each other, so skipped FMV timings should always be treated as estimates.

Audio can be analysed by opening the video in Audacity and looking for sound cues (note that every FMV is first visible when ∆fs = 1, except Fluddless, which is first visible when ∆fs = 2):

Plane: airplane white noise (first audible when ∆fs = 2)
FLUDD Meet: beep (first audible when ∆fs = 2)
FLUDD Tutorial: white noise (first audible when ∆fs = 1)
Court: Peach voice (first audible when ∆fs = 2)
Jail: (inaudible)
Cop: percussion (first audible when ∆fs = 0, i.e. always)
FLUDDless: (first audible when ∆fs = 2)
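The list above can be turned into bounds on ∆fs. This sketch assumes that "first audible when ∆fs = n" means the cue is audible exactly when ∆fs ≥ n; that reading, and the names below, are my assumptions.

```python
# Smallest dfs at which each FMV's audio cue is audible; None = never audible.
FIRST_AUDIBLE = {
    "Plane": 2,
    "FLUDD Meet": 2,
    "FLUDD Tutorial": 1,
    "Court": 2,
    "Jail": None,
    "Cop": 0,
    "FLUDDless": 2,
}


def dfs_bounds_from_audio(fmv: str, audible: bool) -> tuple:
    """Return (lowest, highest) possible dfs; highest is None if unbounded."""
    threshold = FIRST_AUDIBLE[fmv]
    if threshold is None:
        return (0, None)  # audio carries no information for this FMV
    if audible:
        return (threshold, None)
    if threshold == 0:
        raise ValueError(f"{fmv} should always be audible; recheck the audio")
    return (0, threshold - 1)


print(dfs_bounds_from_audio("Plane", audible=False))
print(dfs_bounds_from_audio("Court", audible=True))
```

These bounds are only a starting point; measuring the audio length narrows things further.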

Furthermore, the length of the audio can be identified visually in Audacity. Rounding to frames usually works well, though half-frames sometimes fit better. Amplify the sound (Effects > Amplify) to make it more visible and audible. All sounds above are continuous except FLUDD Meet (beep lasts 1f, then has a quiet tail lasting 0.5f, then silence for 1.5f until second beep; that’s why it’s usually reported as 1.5f length).
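Audacity reports selection lengths in seconds, so a small conversion helps when rounding to frames or half-frames. This is a sketch with made-up names; `fps` is an assumption and should be whatever frame rate the rest of your frame counts use.

```python
import math


def audio_length_in_frames(seconds: float, fps: float) -> float:
    """Convert an audio selection length to frames, rounded to the nearest 0.5f."""
    half_frames = math.floor(seconds * fps * 2 + 0.5)  # round half up
    return half_frames / 2


# FLUDD Meet beep pattern from the text above, in frames.
BEEP, TAIL, SILENCE = 1.0, 0.5, 1.5
beep_period = BEEP + TAIL + SILENCE  # start of first beep to start of second

# Hypothetical measurement: a 0.05 s selection in a video counted at 30 fps.
print(audio_length_in_frames(0.05, 30), beep_period)
```

`math.floor(x + 0.5)` is used instead of `round` so that exact ties always round up rather than to the nearest even number.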

Based on emulator testing, these are the patterns expected to be observed, though again this isn’t completely consistent, so audio and video might indicate different values of ∆fs for the same sample. v denotes the visible video length (the “end” columns above), whereas a is the measured length of the audio rounded to the nearest frame (or 0.5f in the case of FLUDD Meet, per the above).

Here are two worked examples of a/v analyses, with abridged video files available here: 1:14:03, 1:14:13.

I put estimates for ∆fs in the “t” column and marked in red the one FMV in the sample that doesn’t fit (notably, in this sample, the menubox appears 2f after it’s expected to, suggesting a ∆fs=2 FMV skip – menuboxes usually start 2f before they appear, on the same frame the preceding FMV ends). Estimates of 0.5 are used for Fluddless FMVs that are invisible and inaudible, since ∆fs=0 and ∆fs=1 are indistinguishable for those.
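For the video side only, the "end" columns of the two visuals tables can be inverted to estimate ∆fs from the visible length v. This is a sketch under the assumption that those table patterns hold; it ignores audio, and the function name is made up.

```python
def estimate_dfs_from_video(visible_frames: int, fluddless: bool = False) -> float:
    """Estimate dfs from the visible FMV length v (the 'end' column).

    Inverts the patterns inferred from the tables: v = 2*dfs - 1 for
    non-Fluddless FMVs and v = 2*dfs - 2 for Fluddless, both floored at 0.
    """
    if visible_frames < 0:
        raise ValueError("visible_frames must be non-negative")
    if visible_frames == 0:
        # Invisible Fluddless could be dfs 0 or 1, so use the 0.5 convention.
        return 0.5 if fluddless else 0.0
    if fluddless and visible_frames % 2 == 0:
        return visible_frames / 2 + 1
    if not fluddless and visible_frames % 2 == 1:
        return (visible_frames + 1) / 2
    raise ValueError("visible length does not fit the tabulated pattern")


print(estimate_dfs_from_video(3))
print(estimate_dfs_from_video(0, fluddless=True))
```

A `ValueError` here corresponds to the kind of sample that would be marked as not fitting, and is a cue to fall back on audio or on neighbouring evidence such as menubox timing.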