9. Standard ASF media types

ASF files store a wide variety of digital media content. Implementations are expected to define media types of their own. However, this section defines a rich set of commonly supported standard media types to enable compatibility between diverse implementations.

The purpose of this section is to define a set of standard ASF media types. The explicit intention of this section is that if an implementation supports a media type defined within this section, that type must be supported in the manner described within this section if the implementation is to be considered content-compliant with the ASF specification. This commonality will define a minimum subset of digital media within which multi-vendor interoperability is possible. No restrictions are placed on how implementations support nonstandard media types (in other words, types other than those covered in this section).

The following subsections define the core media types for audio, video, commands, images, and file transfer and binary data.




9.1 Audio media type

When the Stream Type of the Stream Properties Object has the value ASF_Audio_Media, the ASF audio media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure (the WAVEFORMATEX structure).


Field name                            Field type    Size (bits)
Codec ID / Format Tag                 WORD          16
Number of Channels                    WORD          16
Samples Per Second                    DWORD         32
Average Number of Bytes Per Second    DWORD         32
Block Alignment                       WORD          16
Bits Per Sample                       WORD          16
Codec Specific Data Size              WORD          16
Codec Specific Data                   BYTE          varies


The fields are defined as follows:


Codec ID / Format Tag

Specifies the unique ID of the codec used to encode the audio data. There is a registration procedure for new codecs. Defined as the wFormatTag field of a WAVEFORMATEX structure.

Number of Channels

Specifies the number of audio channels. Monaural data uses one channel and stereo data uses two channels. 5.1 audio uses six channels. Defined as the nChannels field of a WAVEFORMATEX structure.

Samples Per Second

Specifies a value in Hertz (cycles per second) that represents the sampling rate of the audio stream. Defined as the nSamplesPerSec field of a WAVEFORMATEX structure.

Average Number of Bytes Per Second

Specifies the average number of bytes per second of the audio stream. Defined as the nAvgBytesPerSec field of a WAVEFORMATEX structure.

Block Alignment

Specifies the block alignment, or block size, in bytes of the audio codec. Defined as the nBlockAlign field of a WAVEFORMATEX structure.

Bits Per Sample

Specifies the number of bits per sample of monaural data. Defined as the wBitsPerSample field of a WAVEFORMATEX structure.

Codec Specific Data Size

Specifies the size, in bytes, of the Codec Specific Data buffer. Defined as the cbSize field of a WAVEFORMATEX structure. This value should be 0 when Codec ID is 1 (WAVE_FORMAT_PCM).

Codec Specific Data

Specifies an array of codec-specific data bytes.


For more information about the WAVEFORMATEX structure, see the MSDN Library documentation.
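As an illustration, the following Python sketch decodes the structure above with the standard struct module. It assumes little-endian byte order (the convention used throughout ASF, although it is not restated in this section). The names AudioMediaType and parse_audio_media_type are hypothetical and are not defined by this specification.

```python
import struct
from dataclasses import dataclass

# Fixed part of the structure: 18 bytes.
# Assumption: all fields are stored little-endian.
_WAVEFORMATEX = struct.Struct("<HHIIHHH")


@dataclass
class AudioMediaType:
    format_tag: int
    channels: int
    samples_per_second: int
    avg_bytes_per_second: int
    block_alignment: int
    bits_per_sample: int
    codec_specific_data: bytes


def parse_audio_media_type(data: bytes) -> AudioMediaType:
    """Decode the Type-Specific Data of an ASF_Audio_Media stream."""
    if len(data) < _WAVEFORMATEX.size:
        raise ValueError("audio media type is shorter than 18 bytes")
    (tag, channels, rate, avg_bytes,
     align, bits, cb_size) = _WAVEFORMATEX.unpack_from(data)
    # Codec Specific Data Size gives the length of the trailing byte array.
    extra = data[_WAVEFORMATEX.size:_WAVEFORMATEX.size + cb_size]
    if len(extra) != cb_size:
        raise ValueError("Codec Specific Data is truncated")
    return AudioMediaType(tag, channels, rate, avg_bytes, align, bits, extra)


# Example input: 16-bit stereo PCM at 44.1 kHz (Codec ID 1, so the
# Codec Specific Data Size is 0).
sample = struct.pack("<HHIIHHH", 1, 2, 44100, 176400, 4, 16, 0)
audio = parse_audio_media_type(sample)
```

A reader would typically pass the Type-Specific Data bytes of the Stream Properties Object directly to such a function.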



9.1.1 Spread audio

One Error Correction Type is spread audio. This error correction approach minimizes the impact of lost audio data by spreading audio over a span of packets. If lost payload data cannot be recreated, the compressed silence stored in the Silence Data field is used for silence injection. This approach works well for fixed bit rate audio codecs that have no interframe dependencies.


The Error Correction Data field is represented using the following structure.


Field name               Field type    Size (bits)
Span                     BYTE          8
Virtual Packet Length    WORD          16
Virtual Chunk Length     WORD          16
Silence Data Length      WORD          16
Silence Data             BYTE          varies


The fields are defined as follows:

Span

Specifies the number of packets over which audio will be spread. Typically, this value should be set to 1.

Virtual Packet Length

Specifies the virtual packet length. The value of this field should be set to the size of the largest audio payload found in the audio stream.

Virtual Chunk Length

Specifies the virtual chunk length. The value of this field should be set to the size of the largest audio payload found in the audio stream.

Silence Data Length

Specifies the number of bytes stored in the Silence Data field. This value should be set to 1. It is also valid for this value to equal the Block Alignment value (from the Audio Media Type).

Silence Data

Specifies an array of silence data bytes. Each of the Silence Data Length bytes in this array should be set to 0.
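The recommended values above can be sketched in Python as follows. This is a minimal illustration that assumes little-endian fields. The function names and the 1484-byte payload size are hypothetical.

```python
import struct

# Span (BYTE), then Virtual Packet Length, Virtual Chunk Length and
# Silence Data Length (WORDs): 7 bytes. Little-endian is an assumption.
_SPREAD_FIXED = struct.Struct("<BHHH")


def build_spread_audio(largest_payload: int, span: int = 1,
                       silence_length: int = 1) -> bytes:
    """Build Error Correction Data using the recommended field values."""
    fixed = _SPREAD_FIXED.pack(span, largest_payload, largest_payload,
                               silence_length)
    return fixed + bytes(silence_length)  # every silence byte is 0


def parse_spread_audio(data: bytes) -> dict:
    """Split Error Correction Data into its named fields."""
    if len(data) < _SPREAD_FIXED.size:
        raise ValueError("spread audio data is shorter than 7 bytes")
    span, packet_len, chunk_len, silence_len = _SPREAD_FIXED.unpack_from(data)
    silence = data[_SPREAD_FIXED.size:_SPREAD_FIXED.size + silence_len]
    if len(silence) != silence_len:
        raise ValueError("Silence Data is truncated")
    return {
        "span": span,
        "virtual_packet_length": packet_len,
        "virtual_chunk_length": chunk_len,
        "silence_data": silence,
    }


# Hypothetical stream whose largest audio payload is 1484 bytes.
blob = build_spread_audio(1484)
fields = parse_spread_audio(blob)
```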



9.1.2 Audio payload sizes

Audio payloads do not need to be of equal size. However, they need to be a multiple of the Block Alignment field of the WAVEFORMATEX structure defined at the beginning of this section.
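A writer or validator might check this rule with a helper like the one below. The function name is hypothetical, and the sizes are example values only.

```python
def is_valid_audio_payload_size(payload_size: int, block_alignment: int) -> bool:
    """Return True if the payload size is a multiple of Block Alignment."""
    if block_alignment <= 0:
        raise ValueError("Block Alignment must be positive")
    return payload_size % block_alignment == 0


# With a Block Alignment of 4, payloads of different sizes are allowed
# as long as each one is a multiple of 4.
sizes_ok = [is_valid_audio_payload_size(n, 4) for n in (1484, 2968, 1486)]
```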



9.2 Video media type

When the Stream Type of the Stream Properties Object has the value ASF_Video_Media, the ASF video media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure.


Field name              Field type    Size (bits)
Encoded Image Width     DWORD         32
Encoded Image Height    DWORD         32
Reserved Flags          BYTE          8
Format Data Size        WORD          16
Format Data             See below     varies


The fields are defined as follows:


Encoded Image Width

Specifies the width of the encoded image in pixels.

Encoded Image Height

Specifies the height of the encoded image in pixels.

Reserved Flags

Specifies reserved flags, and shall be set to 2.

Format Data Size

Specifies the size of the Format Data field in bytes.

Format Data

Specifies the details of the format of the image data. This format is structured as follows (the BITMAPINFOHEADER structure):

Field name                     Field type    Size (bits)
Format Data Size               DWORD         32
Image Width                    LONG          32
Image Height                   LONG          32
Reserved                       WORD          16
Bits Per Pixel Count           WORD          16
Compression ID                 DWORD         32
Image Size                     DWORD         32
Horizontal Pixels Per Meter    LONG          32
Vertical Pixels Per Meter      LONG          32
Colors Used Count              DWORD         32
Important Colors Count         DWORD         32
Codec Specific Data            BYTE          varies

The fields are defined as follows:

Format Data Size

Specifies the number of bytes stored in the Format Data field. Defined as the biSize field of a BITMAPINFOHEADER structure.

Image Width

Specifies the width of the encoded image in pixels. Defined as the biWidth field of a BITMAPINFOHEADER structure. This should be equal to the Encoded Image Width field defined previously.

Image Height

Specifies the height of the encoded image in pixels. Defined as the biHeight field of a BITMAPINFOHEADER structure. This should be equal to the Encoded Image Height field defined previously.

Reserved

Reserved. Shall be set to 1. Defined as the biPlanes field of a BITMAPINFOHEADER structure.

Bits Per Pixel Count

Specifies the number of bits per pixel. Defined as the biBitCount field of a BITMAPINFOHEADER structure.

Compression ID

Specifies the type of the compression, using a four-character code. For ISO MPEG-4 video, this contains MP4S, mp4s, M4S2, or m4s2. In the Compression ID, the first character of the four-character code appears as the least-significant byte; for instance MP4S uses the Compression ID 0x5334504D. Defined as the biCompression field of a BITMAPINFOHEADER structure.

Image Size

Specifies the size of the image in bytes. Defined as the biSizeImage field of a BITMAPINFOHEADER structure.

Horizontal Pixels Per Meter

Specifies the horizontal resolution of the target device for the bitmap in pixels per meter. Defined as the biXPelsPerMeter field of a BITMAPINFOHEADER structure.

Vertical Pixels Per Meter

Specifies the vertical resolution of the target device for the bitmap in pixels per meter. Defined as the biYPelsPerMeter field of a BITMAPINFOHEADER structure.

Colors Used Count

Specifies the number of color indexes in the color table that are actually used by the bitmap. Defined as the biClrUsed field of a BITMAPINFOHEADER structure.

Important Colors Count

Specifies the number of color indexes that are required for displaying the bitmap. If this value is zero, all colors are required. Defined as the biClrImportant field of a BITMAPINFOHEADER structure.

Codec Specific Data

Specifies an array of codec specific data bytes. The size of this array is equal to the Format Data Size field minus the size of the Format Data fields listed previously.


For more information about the BITMAPINFOHEADER structure, see the MSDN Library documentation.
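The following Python sketch shows one way to decode the video media type and to compute a Compression ID from a four-character code. It assumes little-endian fields, and all function names are hypothetical. The 320x240 sample stream is invented for illustration.

```python
import struct

# Assumption: all fields are stored little-endian.
_VIDEO_FIXED = struct.Struct("<IIBH")              # 11 bytes
_BITMAPINFOHEADER = struct.Struct("<IiiHHIIiiII")  # 40 bytes


def fourcc_to_compression_id(code: str) -> int:
    """Place the first character of the code in the least-significant byte."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("expected exactly four ASCII characters")
    return int.from_bytes(raw, "little")


def parse_video_media_type(data: bytes) -> dict:
    """Decode the Type-Specific Data of an ASF_Video_Media stream."""
    if len(data) < _VIDEO_FIXED.size:
        raise ValueError("video media type is shorter than 11 bytes")
    width, height, flags, format_size = _VIDEO_FIXED.unpack_from(data)
    format_data = data[_VIDEO_FIXED.size:_VIDEO_FIXED.size + format_size]
    if format_size < _BITMAPINFOHEADER.size or len(format_data) != format_size:
        raise ValueError("Format Data is truncated")
    (bi_size, bi_width, bi_height, planes, bit_count, compression,
     image_size, x_ppm, y_ppm, clr_used, clr_important) = \
        _BITMAPINFOHEADER.unpack_from(format_data)
    return {
        "encoded_width": width,
        "encoded_height": height,
        "reserved_flags": flags,
        "image_width": bi_width,
        "image_height": bi_height,
        "bits_per_pixel": bit_count,
        "compression_id": compression,
        # Whatever follows the 40 fixed bytes is codec-specific data.
        "codec_specific_data": format_data[_BITMAPINFOHEADER.size:bi_size],
    }


# Hypothetical 320x240, 24-bit MP4S stream with no codec-specific data.
header = _BITMAPINFOHEADER.pack(40, 320, 240, 1, 24,
                                fourcc_to_compression_id("MP4S"),
                                320 * 240 * 3, 0, 0, 0, 0)
video = parse_video_media_type(
    _VIDEO_FIXED.pack(320, 240, 2, len(header)) + header)
```

Note that the structure carries two size fields with the same name: the 16-bit Format Data Size of the outer structure and the 32-bit Format Data Size (biSize) inside the Format Data. The sketch uses the outer one to delimit Format Data and the inner one to delimit Codec Specific Data.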



9.3 Command media type

When the Stream Type of the Stream Properties Object has the value ASF_Command_Media, the ASF command media type that populates the Type-Specific Data field of the Stream Properties Object shall be null and the value of Type-Specific Data Length shall be 0.


Whereas the name-value pairs associated with the commands can be any value, system-defined command types include URL, Filename, and Text. The URL command type indicates that the URL is to be opened by a client in an HTML window or frame. The Filename command type indicates that the digital media file indicated is to be played immediately. The Text command type indicates that the data strings should be interpreted as captioned text to go along with the presentation.


For commands that are not stored in the Script Command Object (see section 3.6), each digital media sample is composed of a nul-terminated command type string followed by a nul-terminated command string. (Note that previous versions of this specification indicated that there should be an extra nul WCHAR between the two strings. This was not correct; there should be exactly one nul WCHAR between the strings, and that should be the nul-terminating character for the first string.)
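The sample layout described above can be sketched in Python as follows. The sketch assumes that each WCHAR is a 16-bit little-endian UTF-16 code unit. The function names and the example URL are hypothetical.

```python
from typing import Tuple


def build_command_sample(command_type: str, command: str) -> bytes:
    """Encode a command sample as two nul-terminated WCHAR strings."""
    # Exactly one nul WCHAR terminates each string; there is no extra
    # nul between the command type and the command.
    return (command_type + "\x00" + command + "\x00").encode("utf-16-le")


def parse_command_sample(sample: bytes) -> Tuple[str, str]:
    """Return (command type, command) from a command media sample."""
    parts = sample.decode("utf-16-le").split("\x00")
    # Two nul-terminated strings split into [type, command, ""].
    if len(parts) != 3 or parts[2] != "":
        raise ValueError("expected exactly two nul-terminated strings")
    return parts[0], parts[1]


sample = build_command_sample("URL", "http://example.com/")
command_type, command = parse_command_sample(sample)
```

A sample written with an extra nul WCHAR between the two strings, as earlier versions of the specification suggested, is rejected by this parser. A tolerant reader may prefer to accept it instead.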



9.4 Image media type

ASF supports both a JFIF image type and a degradable JPEG image type. The former is a media type indicating that the stream is in the JFIF format; the latter is a media type indicating that it is a loss-tolerant stream of JPEG images. To encode to or decode from the degradable JPEG type, you must use the Windows Media Format SDK.


9.4.1 JFIF/JPEG media type


When the Stream Type of the Stream Properties Object has a value equal to ASF_JFIF_Media, the ASF media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure.



Field name      Field type    Size (bits)
Image width     DWORD         32
Image height    DWORD         32
Reserved        DWORD         32


The fields are defined as follows:

Image width

Specifies the width of the encoded image in pixels.

Image height

Specifies the height of the encoded image in pixels.

Reserved

Reserved, must be 0.




9.4.2 Degradable JPEG media type

When the Stream Type of the Stream Properties Object has a value equal to ASF_Degradable_JPEG_Media, the ASF media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure.



Field name                 Field type    Size (bits)
Image width                DWORD         32
Image height               DWORD         32
Reserved                   WORD          16
Reserved                   WORD          16
Reserved                   WORD          16
Interchange data length    WORD          16
Interchange data           BYTE          varies


The fields are defined as follows:

Image width

Specifies the width of the encoded image in pixels.

Image height

Specifies the height of the encoded image in pixels.

Reserved

These three fields must be, respectively, 0, 2, and 4.

Interchange data length

Specifies the number of bytes in the Interchange data field. If this value is 0, then the Interchange data field shall consist of the single byte value 0x00.

Interchange data

Specifies the interchange data for this stream. If Interchange data length is set to 0, then this field shall still be present and shall consist of the single byte 0x00.
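The zero-length case is easy to get wrong, so the following Python sketch shows one way to handle it. It assumes little-endian fields, and the function names are hypothetical.

```python
import struct

# Image width, Image height, three Reserved WORDs, Interchange data length.
# Assumption: all fields are stored little-endian.
_DJPEG_FIXED = struct.Struct("<IIHHHH")  # 16 bytes


def build_degradable_jpeg_type(width: int, height: int,
                               interchange_data: bytes = b"") -> bytes:
    """Build the Type-Specific Data for a degradable JPEG stream."""
    fixed = _DJPEG_FIXED.pack(width, height, 0, 2, 4, len(interchange_data))
    # A length of 0 is still followed by the single byte 0x00.
    return fixed + (interchange_data or b"\x00")


def parse_degradable_jpeg_type(data: bytes) -> tuple:
    """Return (width, height, interchange data)."""
    if len(data) < _DJPEG_FIXED.size:
        raise ValueError("degradable JPEG type is shorter than 16 bytes")
    width, height, r1, r2, r3, length = _DJPEG_FIXED.unpack_from(data)
    if (r1, r2, r3) != (0, 2, 4):
        raise ValueError("reserved fields must be 0, 2 and 4")
    body_length = length if length else 1
    body = data[_DJPEG_FIXED.size:_DJPEG_FIXED.size + body_length]
    if len(body) != body_length:
        raise ValueError("Interchange data is truncated")
    if length == 0 and body != b"\x00":
        raise ValueError("empty Interchange data must be the byte 0x00")
    return width, height, (body if length else b"")


blob = build_degradable_jpeg_type(640, 480)
parsed = parse_degradable_jpeg_type(blob)
```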


9.5 File transfer and binary media types

When the Stream Type of the Stream Properties Object has a value equal to either ASF_File_Transfer_Media or ASF_Binary_Media, the ASF media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure.


Field name              Field type    Size (bits)
Major media type        GUID          128
Media subtype           GUID          128
Fixed-size samples      DWORD         32
Temporal compression    DWORD         32
Sample size             DWORD         32
Format type             GUID          128
Format data size        DWORD         32
Format data             See below     varies


The fields are defined as follows:

Major media type

This value must be equal to the Stream Type value in the Stream Properties Object.

Media subtype

Indicates the media subtype. Can be set to 0 if not relevant. ASF_Web_Stream_Media_Subtype is a possible value for file transfer streams that are Web streams.

Fixed-size samples

Valid values are 0 and 1. This value shall be set to 1 if this stream has fixed-size samples.

Temporal compression

Valid values are 0 and 1. This value shall be set to 1 if compression of media object N might depend on media object N-1. In general, this value should be set to 0.

Sample size

If the Fixed-size samples field has a value of 1, then this value is the fixed sample size. Otherwise, the value is ignored and should be 0.

Format type

If there is no additional media type information, this field, along with the value in the Format data size field, should be set to 0. For a Web stream, this can be set to ASF_Web_Stream_Format. Custom non-standard format types can also be defined, but they will not necessarily be understood across implementations.

Format data size

This is the number of bytes in Format data. If there is no format data, this field, along with the value in the Format type field, should be set to 0.

Format data

This is the additional format data for this media type. This shall be present only if Format data size is greater than 0. Custom format types can define how these bytes are formatted. If the Format type is equal to ASF_Web_Stream_Format, then there is a standard format for these bytes, detailed in section 9.5.1.
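A parser for this structure might look like the Python sketch below. It assumes little-endian DWORDs and treats each GUID as an opaque 16-byte value. The function name is hypothetical, and the GUID bytes in the example are placeholders, not the real ASF_File_Transfer_Media or ASF_Binary_Media values.

```python
import struct

# Three GUIDs, four DWORDs: 64 fixed bytes. Little-endian is an assumption.
_FIXED = struct.Struct("<16s16sIII16sI")
NULL_GUID = bytes(16)


def parse_file_transfer_type(data: bytes, stream_type: bytes) -> dict:
    """Decode a file transfer or binary media type.

    stream_type is the 16-byte Stream Type GUID taken from the
    Stream Properties Object.
    """
    if len(data) < _FIXED.size:
        raise ValueError("media type is shorter than 64 bytes")
    (major, subtype, fixed_size, temporal, sample_size,
     format_type, format_size) = _FIXED.unpack_from(data)
    if major != stream_type:
        raise ValueError("Major media type must equal the Stream Type")
    if fixed_size not in (0, 1) or temporal not in (0, 1):
        raise ValueError("flag fields must be 0 or 1")
    format_data = data[_FIXED.size:_FIXED.size + format_size]
    if len(format_data) != format_size:
        raise ValueError("Format data is truncated")
    return {
        "subtype": subtype,
        "fixed_size_samples": bool(fixed_size),
        "temporal_compression": bool(temporal),
        # Sample size is only meaningful for fixed-size samples.
        "sample_size": sample_size if fixed_size else None,
        "format_type": format_type,
        "format_data": format_data,
    }


# Placeholder GUID bytes for illustration only (not a real ASF GUID).
FAKE_STREAM_TYPE = bytes(range(16))
blob = _FIXED.pack(FAKE_STREAM_TYPE, NULL_GUID, 1, 0, 512, NULL_GUID, 0)
info = parse_file_transfer_type(blob, FAKE_STREAM_TYPE)
```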


9.5.1 Web streams

Web streams are a subtype of file transfer streams. The media type should be expressed using the structure defined in section 9.5. The Media subtype shall be set to ASF_Web_Stream_Media_Subtype. The Format type shall be set to ASF_Web_Stream_Format. The Format data size shall be set to 8. The Format data should use the values in the following table.


Field name                     Field type    Size (bits)
Web stream format data size    WORD          16
Fixed sample header size       WORD          16
Version number                 WORD          16
Reserved                       WORD          16


The fields are defined as follows:

Web stream format data size

This shall be set to 8.

Fixed sample header size

This shall be set to 10. See below for a description of the Web stream header.

Version number

This shall be set to 1.

Reserved

Reserved, must be 0.

In addition, all media objects for a Web stream need to begin with 10 bytes formatted as follows:



Field name             Field type    Size (bits)
Total header length    WORD          16
Part number            WORD          16
Total part count       WORD          16
Sample type            WORD          16
URL string             WCHAR         varies


The fields are defined as follows:

Total header length

This is the total size of the media object header. This value should be set to 10 plus the length (not including the nul-terminating character) of the URL string field.

Part number

Specifies the current part of the file (0-based). Valid values are from 0 to Total part count - 1.

Total part count

Number of parts in the file.

Sample type

Valid values for this field are 1, which indicates the sample type is “file”, and 2, which indicates the sample type is “render” (which is essentially a command to render the data).

URL string

This is a nul-terminated string containing the URL for the file being transferred.
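The Web stream structures above can be sketched in Python as follows. The sketch assumes little-endian fields and a UTF-16 little-endian URL string. It locates the end of the header by scanning for the nul WCHAR rather than trusting Total header length, because the text does not say whether the URL length is counted in bytes or in characters. The example counts it in bytes, which is an assumption. All names and the example URL are hypothetical.

```python
import struct

# Format data for a Web stream: size 8, header size 10, version 1, reserved 0.
WEB_STREAM_FORMAT_DATA = struct.pack("<HHHH", 8, 10, 1, 0)

_HEADER_FIXED = struct.Struct("<HHHH")  # 8 bytes before the URL string


def parse_web_stream_header(media_object: bytes) -> dict:
    """Decode the header at the start of a Web stream media object."""
    if len(media_object) < 10:
        raise ValueError("media object is shorter than the 10-byte minimum")
    total_len, part, part_count, sample_type = \
        _HEADER_FIXED.unpack_from(media_object)
    if sample_type not in (1, 2):
        raise ValueError("Sample type must be 1 (file) or 2 (render)")
    if part >= part_count:
        raise ValueError("Part number must be less than Total part count")
    # Scan WCHAR by WCHAR for the nul terminator of the URL string.
    end = _HEADER_FIXED.size
    while end + 2 <= len(media_object) and media_object[end:end + 2] != b"\x00\x00":
        end += 2
    if end + 2 > len(media_object):
        raise ValueError("URL string is not nul-terminated")
    return {
        "total_header_length": total_len,
        "part_number": part,
        "total_part_count": part_count,
        "sample_type": sample_type,
        "url": media_object[_HEADER_FIXED.size:end].decode("utf-16-le"),
        "payload_offset": end + 2,  # first byte after the nul WCHAR
    }


# Hypothetical single-part "file" sample.
url_bytes = "http://example.com/a.htm".encode("utf-16-le")
payload = b"<html>"
media_object = (
    # Assumption: the URL length in Total header length is counted in bytes.
    _HEADER_FIXED.pack(10 + len(url_bytes), 0, 1, 1)
    + url_bytes + b"\x00\x00" + payload
)
header = parse_web_stream_header(media_object)
```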