Version 7 (modified by r2d, 18 years ago) (diff) |
---|
stream version 7 format specification
The data stream is to be seen as a stream of 32 bit words, which is however decoded starting from the MSB. If you regard it as an octet stream, then the order of bits is as follows: (Counting starts from 1, MSB on the left)
v-----------------------v-----------------------v-----------------------v-----------------------v-----------------------v--- |25|26|27|28|29|30|31|32|17|18|19|20|21|22|23|24| 9|10|11|12|13|14|15|16| 1| 2| 3| 4| 5| 6| 7| 8|57|58|59|60|61|62|63|64|49| ^-----------------------^-----------------------^-----------------------^-----------------------^-----------------------^---
Because of different means of access one has to determine the stream version through another mechanism, because you can access the magic number too late (sic) through this structure. Furthermore, since there are nearly no stream versions below 7 out in the field, and even if they were out there, their quality is suboptimal, only SV7 is described here.
This bit structure mainly causes problems on 16 bit and 64 bit CPUs because one always has to decode with 32 bit words.
=============================== BASIC STRUCTURE ============================ Header ~~~~~~ -0------------------------------------------------------------------------- StreamMinorVersion 4 bit 0...1 Currently 0 (PNS not used) or 1 (PNS used) StreamMajorVersion 4 bit 7 Streamversion 7 Signature 24 bit 0x2B504D Signature "MP+" -1------------------------------------------------------------------------ FrameCount 32 bit 0...0xFFFFFFFF Number of frames, every frame contains 1152 samples per channel, the last frame contains 1 to 1152 samples per channel. Furthermore, one has to consider the latency of the analysis and the synthesis filterbank of 481 samples. See note 2. -2------------------------------------------------------------------------- IntensityStereo 1 bit 0...1 usually 0, when using intensity stereo coding (IS) 1. Not used by any encoder right now. See note 3. MidSideStereo 1 bit 0...1 When MidSideStereo is in use, this bit is 1, otherwise 0. MaxBand 6 bit 0...32 last subband used in the whole file. Typical values range from 23 to 29. Profile 4 bit 0, 7...13 Used profile 0: no profile 1: Unstable/Experimental 2: unused 3: unused 4: unused 5: below Telephone (q= 0.0) 6: below Telephone (q= 1.0) 7: Telephone (q= 2.0) 8: Thumb (q= 3.0) 9: Radio (q= 4.0) 10: Standard (q= 5.0) 11: Xtreme (q= 6.0) 12: Insane (q= 7.0) 13: BrainDead (q= 8.0) 14: above BrainDead (q= 9.0) 15: above BrainDead (q=10.0) Link 2 bit 00: Title starts or ends with a very low level (no live or classical genre titles) 01: Title ends loudly 10: Title starts loudly 11: Titel starts loudly and ends loudly SampleFreq 2 bit 00: 44100 Hz CD 01: 48000 Hz DAT, DVC, ADR 10: 37800 Hz CD-ROM/XA 11: 32000 Hz DSR, DAT-LP, DVC-LP MaxLevel 16 bit 0...32768 Maximum level of the coded PCM input signal See note 4. -3------------------------------------------------------------------------- TitleGain 16 bit -32768...+32767 Change in the replay level. Value is treated as signed 16 bit value and the level is changed by that many mB (Millibel). Thus level changes of -327.68 dB to +327.67 dB are possible. TitlePeak 16 bit 0...65535 Maximum level of the decoded title 16422: -6 dB 32767: 0 dB 65379: +6 dB -4------------------------------------------------------------------------- AlbumGain 16 bit -32768...32767 Change in the replay level if the whole cd is supposed to be played with the same level change for all tracks. Value is treated as signed 16 bit value and the level is attenuated by that many mB (Millibel) Thus, level changes of -327.68 dB to +327.67 dB are possible. AlbumPeak 16 bit 0...65535 Maximum level of the whole decoded CD 16422: -6 dB 32767: 0 dB 65379: +6 dB -5------------------------------------------------------------------------- TrueGapless 1 bit Is True Gapless in use? 0: no 1: yes LastFrameLength 11 bit 0, 1...1152 Used Samples of the Last Frame. TrueGapless = 0: always 0 TrueGapless = 1: 1...1152 1 bit Can fast seeking can be used safely ? 0: no 1: yes 19 bit unused (must be 0) -6------------------------------------------------------------------------- EncoderVersion 8 bit Encoder version * 100 (106 = 1.06) EncoderVersion % 10 == 0 Release (1.0) EncoderVersion % 2 == 0 Beta (1.06) EncoderVersion % 2 == 1 Alpha (1.05a...z)
Audio Data (in total FrameCount times)
LengthOfFrame
20 bit
FrameData
? bit
See sv7 frame decoding for more info on the content of audio data
End
LastFrameUsedSamples
11 bit (See furthermore note 2)
LengthOfFrame
20 bit
FrameData
? bit
FillBits
The last data word needs to be filled, even if only one bit of it is in use, because this bit is in the last byte of the word in this case.
0...31 bit
Note 1 (Order of bits)
The bit order can only be decoded on a 16 or 64 bit CPU with difficulties.
Note 2 (Effects of the filter bank delay)
448
Note 3 (Intensity Stereo)
Intensity Stereo, if in use, has to be used from 2.75 upwards. In every frame. In SV4 to 6, one was able to choose between 5.5, 8.25 or 11 kHz, but only for the whole file. Which was unflexible enough. Furthermore, with higher bitrates, there is the possibility that the decoder assumes undefined states, because variables on which the result of decoding heavily depends aren't initialized then.
Note 4 (Clipping)
Clipping Prevention right now works in a way that the biggest sample of the PCM input
is saved in the mpegplus file and upon replaying one assumes that the peak level
is no more than 18% over the level of the PCM input (differences caused by added noise or encoding
fluctuations of the level).
This is however a very crude approximation which sometimes attenuates the level too much or not enough.
The exact maximum replay level can only be determined by doing a decode.
Furthermore, note the "FS+>0 dB" problem: There are
overdrives of DA converters up to 3 dB possible to be caused by non-clipped samples, if
interpolated samples between those samples become too big.
32000 32000 -32000 -32000
becomes after 2x oversampling:
32000 45255 32000 0 -32000 -45255 -32000 0
A second problem are level changes on live albums between the tracks, if those
have different peak levels. In these cases, the whole album should be attenuated
to the required peak level.