ASF files store a wide variety of digital media content. It is anticipated that implementations will produce unique media types of their own creation. However, a rich set of commonly supported standard media types is defined to enable compatibility between diverse implementations.
The purpose of this section is to define a set of standard ASF media types. The explicit intention of this section is that if an implementation supports a media type defined within this section, that type must be supported in the manner described within this section if the implementation is to be considered content-compliant with the ASF specification. This commonality will define a minimum subset of digital media within which multi-vendor interoperability is possible. No restrictions are placed on how implementations support nonstandard media types (in other words, types other than those covered in this section).
The following subsections define the core media types for audio, video, commands, images, and file transfer and binary data.
9.1 Audio media type
When the Stream Type of the Stream Properties Object has the value ASF_Audio_Media, the ASF audio media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure (the WAVEFORMATEX structure).
Field name | Field type | Size (bits)
Codec ID / Format Tag | WORD | 16
Number of Channels | WORD | 16
Samples Per Second | DWORD | 32
Average Number of Bytes Per Second | DWORD | 32
Block Alignment | WORD | 16
Bits Per Sample | WORD | 16
Codec Specific Data Size | WORD | 16
Codec Specific Data | BYTE | varies
The fields are defined as follows:
Codec ID / Format Tag
Specifies the unique ID of the codec used to encode the audio data. There is a registration procedure for new codecs. Defined as the wFormatTag field of a WAVEFORMATEX structure.
Number of Channels
Specifies the number of audio channels. Monaural data uses one channel and stereo data uses two channels. 5.1 audio uses six channels. Defined as the nChannels field of a WAVEFORMATEX structure.
Samples Per Second
Specifies a value in Hertz (cycles per second) that represents the sampling rate of the audio stream. Defined as the nSamplesPerSec field of a WAVEFORMATEX structure.
Average Number of Bytes Per Second
Specifies the average number of bytes per second of the audio stream. Defined as the nAvgBytesPerSec field of a WAVEFORMATEX structure.
Block Alignment
Specifies the block alignment, or block size, in bytes of the audio codec. Defined as the nBlockAlign field of a WAVEFORMATEX structure.
Bits Per Sample
Specifies the number of bits per sample of monaural data. Defined as the wBitsPerSample field of a WAVEFORMATEX structure.
Codec Specific Data Size
Specifies the size, in bytes, of the Codec Specific Data buffer. Defined as the cbSize field of a WAVEFORMATEX structure. This value should be 0 when Codec ID is 1 (WAVE_FORMAT_PCM).
Codec Specific Data
Specifies an array of codec-specific data bytes.
For more information about the WAVEFORMATEX structure, see the MSDN Library documentation.
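As an illustration of the layout above, the following is a minimal Python sketch that unpacks the audio Type-Specific Data. It assumes little-endian byte order and no padding between fields; the names `AudioMediaType` and `parse_audio_media_type` are hypothetical and are not defined by this specification.

```python
import struct
from typing import NamedTuple


class AudioMediaType(NamedTuple):
    """Hypothetical container for the fields of the audio media type."""
    format_tag: int
    channels: int
    samples_per_second: int
    avg_bytes_per_second: int
    block_alignment: int
    bits_per_sample: int
    codec_specific_data: bytes


# Assumption: little-endian, unpadded.
# WORD, WORD, DWORD, DWORD, WORD, WORD, WORD = 18 bytes of fixed fields.
_AUDIO_FIXED = struct.Struct("<HHIIHHH")


def parse_audio_media_type(data: bytes) -> AudioMediaType:
    """Unpack the fixed fields, then slice out Codec Specific Data."""
    if len(data) < _AUDIO_FIXED.size:
        raise ValueError("Type-Specific Data too short for audio media type")
    tag, channels, rate, avg, align, bits, cb_size = _AUDIO_FIXED.unpack_from(data, 0)
    start = _AUDIO_FIXED.size
    extra = data[start:start + cb_size]
    if len(extra) != cb_size:
        raise ValueError("Codec Specific Data is truncated")
    return AudioMediaType(tag, channels, rate, avg, align, bits, extra)
```

For PCM data (Codec ID 1), Codec Specific Data Size should be 0, so the parsed `codec_specific_data` would be empty.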
9.1.1 Spread audio
One Error Correction Type is spread audio. This refers to an error correction approach that minimizes the impact of lost audio data by spreading audio over a span of packets. Compressed silence (stored in the Silence Data field) is used for silence injection if lost payload data cannot be recreated. This approach works well for fixed bit rate audio codecs that have no interframe dependencies.
The Error Correction Data field is represented using the following structure.
Field name | Field type | Size (bits)
Span | BYTE | 8
Virtual Packet Length | WORD | 16
Virtual Chunk Length | WORD | 16
Silence Data Length | WORD | 16
Silence Data | BYTE | varies
The fields are defined as follows:
Span
Specifies the number of packets over which audio will be spread. Typically, this value should be set to 1.
Virtual Packet Length
Specifies the virtual packet length. The value of this field should be set to the size of the largest audio payload found in the audio stream.
Virtual Chunk Length
Specifies the virtual chunk length. The value of this field should be set to the size of the largest audio payload found in the audio stream.
Silence Data Length
Specifies the number of bytes stored in the Silence Data field. This value should be set to 1. It is also valid for this value to equal the Block Alignment value (from the Audio Media Type).
Silence Data
Specifies an array of silence data bytes. This value should be set to 0 for the length of Silence Data Length.
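The Error Correction Data layout for spread audio can be read in the same way. The sketch below assumes little-endian byte order and no padding after the one-byte Span field; `SpreadAudio` and `parse_spread_audio` are hypothetical names used only for illustration.

```python
import struct
from typing import NamedTuple


class SpreadAudio(NamedTuple):
    """Hypothetical container for the spread audio Error Correction Data."""
    span: int
    virtual_packet_length: int
    virtual_chunk_length: int
    silence_data: bytes


# Assumption: little-endian, unpadded. BYTE, WORD, WORD, WORD = 7 bytes.
_SPREAD_FIXED = struct.Struct("<BHHH")


def parse_spread_audio(data: bytes) -> SpreadAudio:
    """Unpack the fixed fields, then read Silence Data Length bytes."""
    if len(data) < _SPREAD_FIXED.size:
        raise ValueError("Error Correction Data too short for spread audio")
    span, packet_len, chunk_len, silence_len = _SPREAD_FIXED.unpack_from(data, 0)
    start = _SPREAD_FIXED.size
    silence = data[start:start + silence_len]
    if len(silence) != silence_len:
        raise ValueError("Silence Data is truncated")
    return SpreadAudio(span, packet_len, chunk_len, silence)
```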
9.1.2 Audio payload sizes
Audio payloads do not need to be of equal size. However, the size of each payload must be a multiple of the value of the Block Alignment field of the WAVEFORMATEX structure defined at the beginning of this section.
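The payload size rule amounts to a simple divisibility check. A minimal sketch follows; the function name is hypothetical.

```python
def is_valid_audio_payload_size(payload_size: int, block_alignment: int) -> bool:
    """Return True if payload_size is a whole number of audio blocks."""
    if block_alignment <= 0:
        raise ValueError("Block Alignment must be positive")
    return payload_size % block_alignment == 0
```

For example, with a Block Alignment of 4, payloads of 400 and 1024 bytes pass the check, while a payload of 402 bytes does not.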
9.2 Video media type
When the Stream Type of the Stream Properties Object has the value ASF_Video_Media, the ASF video media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure.
Field name | Field type | Size (bits)
Encoded Image Width | DWORD | 32
Encoded Image Height | DWORD | 32
Reserved Flags | BYTE | 8
Format Data Size | WORD | 16
Format Data | See below | varies
The fields are defined as follows:
Encoded Image Width
Specifies the width of the encoded image in pixels.
Encoded Image Height
Specifies the height of the encoded image in pixels.
Reserved Flags
Specifies reserved flags, and shall be set to 2.
Format Data Size
Specifies the size of the Format Data field in bytes.
Format Data
Specifies the details of the format of the image data. This format is structured as follows (the BITMAPINFOHEADER structure):
Field name | Field type | Size (bits)
Format Data Size | DWORD | 32
Image Width | LONG | 32
Image Height | LONG | 32
Reserved | WORD | 16
Bits Per Pixel Count | WORD | 16
Compression ID | DWORD | 32
Image Size | DWORD | 32
Horizontal Pixels Per Meter | LONG | 32
Vertical Pixels Per Meter | LONG | 32
Colors Used Count | DWORD | 32
Important Colors Count | DWORD | 32
Codec Specific Data | BYTE | varies
The fields are defined as follows:
Format Data Size
Specifies the number of bytes stored in the Format Data field. Defined as the biSize field of a BITMAPINFOHEADER structure.
Image Width
Specifies the width of the encoded image in pixels. Defined as the biWidth field of a BITMAPINFOHEADER structure. This should be equal to the Encoded Image Width field defined previously.
Image Height
Specifies the height of the encoded image in pixels. Defined as the biHeight field of a BITMAPINFOHEADER structure. This should be equal to the Encoded Image Height field defined previously.
Reserved
Reserved. Shall be set to 1. Defined as the biPlanes field of a BITMAPINFOHEADER structure.
Bits Per Pixel Count
Specifies the number of bits per pixel. Defined as the biBitCount field of a BITMAPINFOHEADER structure.
Compression ID
Specifies the type of the compression, using a four-character code. For ISO MPEG-4 video, this contains MP4S, mp4s, M4S2, or m4s2. In the Compression ID, the first character of the four-character code appears as the least-significant byte; for instance MP4S uses the Compression ID 0x5334504D. Defined as the biCompression field of a BITMAPINFOHEADER structure.
Image Size
Specifies the size of the image in bytes. Defined as the biSizeImage field of a BITMAPINFOHEADER structure.
Horizontal Pixels Per Meter
Specifies the horizontal resolution of the target device for the bitmap in pixels per meter. Defined as the biXPelsPerMeter field of a BITMAPINFOHEADER structure.
Vertical Pixels Per Meter
Specifies the vertical resolution of the target device for the bitmap in pixels per meter. Defined as the biYPelsPerMeter field of a BITMAPINFOHEADER structure.
Colors Used Count
Specifies the number of color indexes in the color table that are actually used by the bitmap. Defined as the biClrUsed field of a BITMAPINFOHEADER structure.
Important Colors Count
Specifies the number of color indexes that are required for displaying the bitmap. If this value is zero, all colors are required. Defined as the biClrImportant field of a BITMAPINFOHEADER structure.
Codec Specific Data
Specifies an array of codec specific data bytes. The size of this array is equal to the Format Data Size field minus the size of the Format Data fields listed previously.
For more information about the BITMAPINFOHEADER structure, see the MSDN Library documentation.
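The two-level video layout, and the byte order of the four-character Compression ID, can be sketched in Python as follows. The sketch assumes little-endian byte order and no padding, and it uses the inner Format Data Size (biSize) to find the end of Codec Specific Data. The names `VideoMediaType`, `fourcc_to_id`, and `parse_video_media_type` are hypothetical.

```python
import struct
from typing import NamedTuple


class VideoMediaType(NamedTuple):
    """Hypothetical container for selected video media type fields."""
    encoded_width: int
    encoded_height: int
    reserved_flags: int
    image_width: int
    image_height: int
    bits_per_pixel: int
    compression_id: int
    codec_specific_data: bytes


# Assumption: little-endian, unpadded.
_VIDEO_OUTER = struct.Struct("<IIBH")              # 11 bytes
_BITMAPINFOHEADER = struct.Struct("<IiiHHIIiiII")  # 40 bytes


def fourcc_to_id(code: str) -> int:
    """First character becomes the least-significant byte."""
    return int.from_bytes(code.encode("ascii"), "little")


def parse_video_media_type(data: bytes) -> VideoMediaType:
    if len(data) < _VIDEO_OUTER.size + _BITMAPINFOHEADER.size:
        raise ValueError("Type-Specific Data too short for video media type")
    enc_w, enc_h, flags, _fmt_size = _VIDEO_OUTER.unpack_from(data, 0)
    start = _VIDEO_OUTER.size
    (bi_size, width, height, _planes, bpp, compression,
     _image_size, _xppm, _yppm, _used, _important) = _BITMAPINFOHEADER.unpack_from(data, start)
    end = start + bi_size
    if bi_size < _BITMAPINFOHEADER.size or end > len(data):
        raise ValueError("Format Data Size is inconsistent with the data")
    # Codec Specific Data is whatever follows the 40 fixed bytes.
    extra = data[start + _BITMAPINFOHEADER.size:end]
    return VideoMediaType(enc_w, enc_h, flags, width, height, bpp, compression, extra)
```

With this helper, `fourcc_to_id("MP4S")` evaluates to 0x5334504D, matching the example given for the Compression ID field.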
9.3 Command media type
When the Stream Type of the Stream Properties Object has the value ASF_Command_Media, the ASF command media type that populates the Type-Specific Data field of the Stream Properties Object shall be null and the value of Type-Specific Data Length shall be 0.
Whereas the name-value pairs associated with the commands can be any value, system-defined command types include URL, Filename, and Text. The URL command type indicates that the URL is to be opened by a client in an HTML window or frame. The Filename command type indicates that the digital media file indicated is to be played immediately. The Text command type indicates that the data strings should be interpreted as captioned text to go along with the presentation.
For commands that are not stored in the Script Command Object (see section 3.6), each digital media sample is composed of a nul-terminated command type string followed by a nul-terminated command string. (Note that previous versions of this specification indicated that there should be an extra nul WCHAR between the two strings. This was not correct; there should be exactly one nul WCHAR between the strings, and that should be the nul-terminating character for the first string.)
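The sample layout described above (two nul-terminated strings with exactly one nul WCHAR between them) can be split as in the following sketch. It assumes that a WCHAR is a 16-bit little-endian UTF-16 code unit; `parse_command_sample` is a hypothetical name.

```python
from typing import Tuple


def parse_command_sample(sample: bytes) -> Tuple[str, str]:
    """Split a command sample into (command type, command)."""
    # Assumption: strings are UTF-16LE with a 16-bit nul terminator.
    text = sample.decode("utf-16-le")
    parts = text.split("\x00")
    # Two terminators yield at least three pieces (the last may be empty).
    if len(parts) < 3:
        raise ValueError("sample does not contain two nul-terminated strings")
    return parts[0], parts[1]
```

Decoding the whole sample before splitting keeps the search for the terminator aligned to WCHAR boundaries.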
9.4 Image media type
ASF supports both a JFIF image type and a degradable JPEG image type. The former is a media type indicating that the stream is in the JFIF format; the latter is a media type indicating that it is a loss-tolerant stream of JPEG images. To encode to or decode from the degradable JPEG type, you must use the Windows Media Format SDK.
9.4.1 JFIF/JPEG media type
When the Stream Type of the Stream Properties Object has a value equal to ASF_JFIF_Media, the ASF media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure.
Field name | Field type | Size (bits)
Image width | DWORD | 32
Image height | DWORD | 32
Reserved | DWORD | 32
The fields are defined as follows:
Image width
Specifies the width of the encoded image in pixels.
Image height
Specifies the height of the encoded image in pixels.
Reserved
Reserved, must be 0.
9.4.2 Degradable JPEG media type
When the Stream Type of the Stream Properties Object has a value equal to ASF_Degradable_JPEG_Media, the ASF media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure.
Field name | Field type | Size (bits)
Image width | DWORD | 32
Image height | DWORD | 32
Reserved | WORD | 16
Reserved | WORD | 16
Reserved | WORD | 16
Interchange data length | WORD | 16
Interchange data | BYTE | varies
The fields are defined as follows:
Image width
Specifies the width of the encoded image in pixels.
Image height
Specifies the height of the encoded image in pixels.
Reserved
These three fields must be, respectively, 0, 2, and 4.
Interchange data length
Specifies the number of bytes in the Interchange data field. If this value is 0, then Interchange data field shall consist of the single byte value 0x00.
Interchange data
Specifies the interchange data for this stream. If Interchange data length is set to 0, then this field shall still be present and shall consist of the single byte 0x00.
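The zero-length case is easy to overlook: the Interchange data field still occupies one byte even when Interchange data length is 0. The following sketch shows one way to handle it, assuming little-endian byte order and no padding; `DegradableJpegMediaType` and `parse_degradable_jpeg_media_type` are hypothetical names.

```python
import struct
from typing import NamedTuple


class DegradableJpegMediaType(NamedTuple):
    """Hypothetical container for the degradable JPEG media type."""
    image_width: int
    image_height: int
    interchange_data: bytes


# Assumption: little-endian, unpadded. DWORD, DWORD, WORD x 4 = 16 bytes.
_DJPEG_FIXED = struct.Struct("<IIHHHH")


def parse_degradable_jpeg_media_type(data: bytes) -> DegradableJpegMediaType:
    if len(data) < _DJPEG_FIXED.size:
        raise ValueError("Type-Specific Data too short for degradable JPEG")
    width, height, r1, r2, r3, length = _DJPEG_FIXED.unpack_from(data, 0)
    if (r1, r2, r3) != (0, 2, 4):
        raise ValueError("reserved fields must be 0, 2, and 4")
    # A length of 0 still stores a single 0x00 byte.
    stored = length if length > 0 else 1
    start = _DJPEG_FIXED.size
    interchange = data[start:start + stored]
    if len(interchange) != stored:
        raise ValueError("Interchange data is truncated")
    return DegradableJpegMediaType(width, height, interchange)
```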
9.5 File transfer and binary media types
When the Stream Type of the Stream Properties Object has a value equal to either ASF_File_Transfer_Media or ASF_Binary_Media, the ASF media type that populates the Type-Specific Data field of the Stream Properties Object is represented using the following structure.
Field name | Field type | Size (bits)
Major media type | GUID | 128
Media subtype | GUID | 128
Fixed-size samples | DWORD | 32
Temporal compression | DWORD | 32
Sample size | DWORD | 32
Format type | GUID | 128
Format data size | DWORD | 32
Format data | See below | varies
The fields are defined as follows:
Major media type
This value must be equal to the Stream Type value in the Stream Properties Object.
Media subtype
Indicates the media subtype. Can be set to 0 if not relevant. ASF_Web_Stream_Media_Subtype is a possible value for file transfer streams that are Web streams.
Fixed-size samples
Valid values are 0 and 1. This value shall be set to 1 if this stream has fixed-size samples.
Temporal compression
Valid values are 0 and 1. This value shall be set to 1 if compression of media object N might depend on media object N-1. In general, this value should be set to 0.
Sample size
If the Fixed-size samples field has a value of 1, then this value is the fixed sample size. Otherwise, the value is ignored and should be 0.
Format type
If there is no additional media type information, this field, along with the value in the Format data size field, should be set to 0. For a Web stream, this can be set to ASF_Web_Stream_Format. Custom non-standard format types can also be defined, but they will not necessarily be understood across implementations.
Format data size
This is the number of bytes in Format data. If there is no format data, this field, along with the value in the Format type field, should be set to 0.
Format data
This is the additional format data for this media type. This shall be present only if Format data size is greater than 0. Custom format types can define how these bytes are formatted. If the Format type is equal to ASF_Web_Stream_Format, then there is a standard format for these bytes, detailed in section 9.5.1.
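A reader for this structure might look like the sketch below. It assumes little-endian byte order, no padding, and GUIDs stored in the Windows layout that Python's `uuid.UUID(bytes_le=...)` accepts. `GenericMediaType` and `parse_generic_media_type` are hypothetical names, and no actual ASF GUID values are assumed.

```python
import struct
import uuid
from typing import NamedTuple


class GenericMediaType(NamedTuple):
    """Hypothetical container for the file transfer / binary media type."""
    major_type: uuid.UUID
    subtype: uuid.UUID
    fixed_size_samples: bool
    temporal_compression: bool
    sample_size: int
    format_type: uuid.UUID
    format_data: bytes


# Assumption: little-endian, unpadded. 16+16+4+4+4+16+4 = 64 bytes.
_GENERIC_FIXED = struct.Struct("<16s16sIII16sI")


def parse_generic_media_type(data: bytes) -> GenericMediaType:
    if len(data) < _GENERIC_FIXED.size:
        raise ValueError("Type-Specific Data too short")
    major, sub, fixed, temporal, sample_size, fmt_type, fmt_size = \
        _GENERIC_FIXED.unpack_from(data, 0)
    if fixed not in (0, 1) or temporal not in (0, 1):
        raise ValueError("flag fields must be 0 or 1")
    start = _GENERIC_FIXED.size
    # Format data is present only when Format data size is greater than 0.
    fmt_data = data[start:start + fmt_size]
    if len(fmt_data) != fmt_size:
        raise ValueError("Format data is truncated")
    return GenericMediaType(
        uuid.UUID(bytes_le=major), uuid.UUID(bytes_le=sub),
        bool(fixed), bool(temporal), sample_size,
        uuid.UUID(bytes_le=fmt_type), fmt_data)
```

A caller would still need to compare `major_type` against the Stream Type value in the Stream Properties Object, since the two are required to be equal.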
9.5.1 Web streams
Web streams are a subtype of file transfer streams. The media type should be expressed using the structure defined in section 9.5. The Media subtype shall be set to ASF_Web_Stream_Media_Subtype. The Format type shall be set to ASF_Web_Stream_Format. The Format data size shall be set to 8. The Format data should use the values in the following table.
Field name | Field type | Size (bits)
Web stream format data size | WORD | 16
Fixed sample header size | WORD | 16
Version number | WORD | 16
Reserved | WORD | 16
The fields are defined as follows:
Web stream format data size
This shall be set to 8.
Fixed sample header size
This shall be set to 10. See below for a description of the Web stream header.
Version number
This shall be set to 1.
Reserved
Reserved, must be 0.
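Because every field of the Web stream Format data has a required value, a reader can validate the eight bytes directly. A minimal sketch, assuming little-endian byte order (`validate_web_stream_format` is a hypothetical name):

```python
import struct

# Assumption: little-endian. Four WORD fields = 8 bytes.
_WEB_FORMAT = struct.Struct("<HHHH")


def validate_web_stream_format(format_data: bytes) -> None:
    """Raise ValueError unless the fields hold the required values."""
    if len(format_data) != _WEB_FORMAT.size:
        raise ValueError("Web stream Format data must be 8 bytes")
    size, header_size, version, reserved = _WEB_FORMAT.unpack(format_data)
    if (size, header_size, version, reserved) != (8, 10, 1, 0):
        raise ValueError("unexpected Web stream Format data values")
```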
In addition, every media object for a Web stream must begin with a header formatted as follows. The header is at least 10 bytes long: four WORD fields followed by a nul-terminated URL string.
Field name | Field type | Size (bits)
Total header length | WORD | 16
Part number | WORD | 16
Total part count | WORD | 16
Sample type | WORD | 16
URL string | WCHAR | varies
The fields are defined as follows:
Total header length
This is the total size of the media object header. This value should be set to 10 plus the length (not including the nul-terminating character) of the URL string field.
Part number
Current part of the file (0-based). Valid values are from 0 to Total part count – 1.
Total part count
Number of parts in the file.
Sample type
Valid values for this field are 1, which indicates the sample type is “file”, and 2, which indicates the sample type is “render” (which is essentially a command to render the data).
URL string
This is a nul-terminated string containing the URL for the file being transferred.
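The Web stream media object header can be read as in the sketch below. It assumes little-endian byte order and 16-bit UTF-16LE WCHARs, and it locates the end of the URL by scanning for the nul WCHAR rather than trusting Total header length. It also assumes that the URL length counted in Total header length is measured in bytes, which the text does not state explicitly. `WebStreamHeader` and `parse_web_stream_header` are hypothetical names.

```python
import struct
from typing import NamedTuple


class WebStreamHeader(NamedTuple):
    """Hypothetical container for the Web stream media object header."""
    total_header_length: int
    part_number: int
    total_part_count: int
    sample_type: int
    url: str
    payload_offset: int  # offset of the first byte after the header


# Assumption: little-endian. Four WORD fields = 8 bytes.
_WEB_FIXED = struct.Struct("<HHHH")


def parse_web_stream_header(obj: bytes) -> WebStreamHeader:
    if len(obj) < _WEB_FIXED.size + 2:
        raise ValueError("media object too short for a Web stream header")
    total_len, part, count, sample_type = _WEB_FIXED.unpack_from(obj, 0)
    if sample_type not in (1, 2):
        raise ValueError("Sample type must be 1 (file) or 2 (render)")
    # Scan on 2-byte boundaries for the nul WCHAR that ends the URL.
    pos = _WEB_FIXED.size
    while pos + 2 <= len(obj) and obj[pos:pos + 2] != b"\x00\x00":
        pos += 2
    if pos + 2 > len(obj):
        raise ValueError("URL string is not nul-terminated")
    url = obj[_WEB_FIXED.size:pos].decode("utf-16-le")
    return WebStreamHeader(total_len, part, count, sample_type, url, pos + 2)
```

Under the byte-length assumption, `payload_offset` equals `total_header_length` for a well-formed media object, which gives a reader a simple consistency check.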