
The URL functions deal with URL manipulation and access. These functions operate in a task-oriented manner. The content and format of the URL passed to a function are not verified. The calling application should track its use of these functions to ensure that the data is in the intended format. For example, with no flags set, InternetCanonicalizeUrl converts the unsafe character "%" into the escape sequence "%25". If InternetCanonicalizeUrl is then called on the canonicalized URL, the escape sequence "%25" is converted into "%2525", which no longer decodes back to the original character in a single pass.
| InternetCanonicalizeUrl |
| InternetCombineUrl |
| InternetCrackUrl |
| InternetCreateUrl |
| InternetOpenUrl |
BOOL InternetCanonicalizeUrl(
IN LPCSTR lpszUrl,
OUT LPSTR lpszBuffer,
IN OUT LPDWORD lpdwBufferLength,
IN DWORD dwFlags
);
Canonicalizes a URL, which includes converting unsafe characters and spaces into escape sequences.
| ERROR_BAD_PATHNAME | The URL could not be canonicalized. |
| ERROR_INSUFFICIENT_BUFFER | The canonicalized URL is too large to fit in the buffer provided. The *lpdwBufferLength parameter is set to the size, in bytes, of the buffer required to hold the resulting canonicalized URL. |
| ERROR_INTERNET_INVALID_URL | The format of the URL is invalid. |
| ERROR_INVALID_PARAMETER | Bad string, buffer, buffer size, or flags parameter. |
| ICU_BROWSER_MODE | Does not encode or decode characters after "#" or "?", and does not remove trailing white space after "?". If this value is not specified, the entire URL is encoded, and trailing white space is removed. |
| ICU_DECODE | Converts all %XX sequences to characters, including escape sequences, before the URL is parsed. |
| ICU_ENCODE_SPACES_ONLY | Encodes spaces only. |
| ICU_NO_ENCODE | Does not convert unsafe characters to escape sequences. |
| ICU_NO_META | Does not remove meta sequences (such as "." and "..") from the URL. |
If no flags are specified (dwFlags = 0), the function converts all unsafe characters and meta sequences (such as "\.", "\..", and "\...") to escape sequences.
InternetCanonicalizeUrl always encodes by default, even if the ICU_DECODE flag has been specified. To decode without re-encoding, use ICU_DECODE | ICU_NO_ENCODE. If the ICU_DECODE flag is used without ICU_NO_ENCODE, the URL is decoded before being parsed, and unsafe characters are then re-encoded after parsing. This function handles arbitrary protocol schemes, but to do so it must make inferences from the unsafe character set.
The application calling InternetCanonicalizeUrl should track its use of this function on a particular URL. If unsafe characters in a URL have already been converted to escape sequences, calling InternetCanonicalizeUrl on the URL again (with no flags) escapes those escape sequences a second time. For example, a blank space in a URL is converted to the escape sequence "%20". Calling InternetCanonicalizeUrl again on that URL converts "%20" to "%2520", because the "%" sign is itself an unsafe character, reserved for escape sequences, and the function replaces it with "%25".
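The double-escaping effect described above can be reproduced with a small portable sketch. The helper names (demo_escape, demo_url, demo_escape_once) are hypothetical and are not part of the Win32 Internet functions. The sketch escapes only spaces and "%" rather than the full unsafe character set, and the "escaped" flag is one possible way for an application to track whether a URL has already been processed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the WinInet implementation: escapes only
   spaces and '%' so the double-escaping effect is easy to see.
   Output is silently truncated if dst is too small. */
size_t demo_escape(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t out = 0;

    assert(src != NULL && dst != NULL && dst_size > 0);

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        if (c == ' ' || c == '%') {
            if (out + 3 >= dst_size)    /* need 3 chars plus terminator */
                break;
            dst[out++] = '%';
            dst[out++] = hex[c >> 4];
            dst[out++] = hex[c & 0x0F];
        } else {
            if (out + 1 >= dst_size)    /* need 1 char plus terminator */
                break;
            dst[out++] = (char)c;
        }
    }
    dst[out] = '\0';
    return out;                         /* characters written, without NUL */
}

/* Hypothetical wrapper that remembers whether a URL was already escaped. */
typedef struct {
    char text[256];
    int  escaped;   /* nonzero once demo_escape has been applied */
} demo_url;

void demo_escape_once(demo_url *u)
{
    char tmp[sizeof u->text];

    assert(u != NULL);
    if (u->escaped)
        return;                         /* a second call is a no-op */
    demo_escape(u->text, tmp, sizeof tmp);
    strcpy(u->text, tmp);
    u->escaped = 1;
}
```

Calling demo_escape twice on a URL containing a space reproduces the "%20" to "%2520" progression, while demo_escape_once leaves an already escaped URL unchanged.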
BOOL InternetCombineUrl(
IN LPCSTR lpszBaseUrl,
IN LPCSTR lpszRelativeUrl,
OUT LPSTR lpszBuffer,
IN OUT LPDWORD lpdwBufferLength,
IN DWORD dwFlags
);
Combines a base and relative URL into a single URL. The resulting URL is canonicalized (see InternetCanonicalizeUrl).
| ERROR_BAD_PATHNAME | The URLs could not be combined. |
| ERROR_INSUFFICIENT_BUFFER | The buffer supplied to the function was insufficient or NULL. The DWORD pointed to by the lpdwBufferLength parameter contains the number of bytes required to hold the resulting combined URL. |
| ERROR_INTERNET_INVALID_URL | The format of the URL is invalid. |
| ERROR_INVALID_PARAMETER | Bad string, buffer, buffer size, or flags parameter. |
| ICU_BROWSER_MODE | Does not encode or decode characters after "#" or "?", and does not remove trailing white space after "?". If this value is not specified, the entire URL is encoded and trailing white space is removed. |
| ICU_DECODE | Converts all %XX sequences to characters, including escape sequences, before the URL is parsed. |
| ICU_ENCODE_SPACES_ONLY | Encodes spaces only. |
| ICU_NO_ENCODE | Does not convert unsafe characters to escape sequences. |
| ICU_NO_META | Does not remove meta sequences (such as "." and "..") from the URL. |
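The ERROR_INSUFFICIENT_BUFFER behavior listed for InternetCombineUrl suggests a two-call pattern: call once to learn the required size, allocate, then call again. The following is a minimal portable sketch of that pattern. demo_combine is a hypothetical stand-in that only concatenates its inputs; it does not resolve relative URLs or canonicalize anything, and the value it stores in the length on success is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a function with the buffer contract described
   above. It only concatenates; it does NOT resolve or canonicalize URLs. */
int demo_combine(const char *base, const char *rel,
                 char *buf, unsigned long *len)
{
    unsigned long needed;

    assert(base != NULL && rel != NULL && len != NULL);
    needed = (unsigned long)(strlen(base) + strlen(rel) + 1); /* with NUL */

    if (buf == NULL || *len < needed) {
        *len = needed;      /* report the required size in bytes */
        return 0;           /* caller treats this as "insufficient buffer" */
    }
    strcpy(buf, base);
    strcat(buf, rel);
    *len = needed - 1;      /* sketch's choice: length without the NUL */
    return 1;
}

/* Two-call pattern: ask for the size, allocate, then call again.
   Returns a malloc'd string the caller must free, or NULL on failure. */
char *demo_combine_alloc(const char *base, const char *rel)
{
    unsigned long len = 0;
    char *buf;

    if (demo_combine(base, rel, NULL, &len))
        return NULL;        /* unexpected: a NULL buffer cannot succeed */
    buf = malloc(len);
    if (buf == NULL)
        return NULL;
    if (!demo_combine(base, rel, buf, &len)) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

With the real function, the first call would fail and the application would check for ERROR_INSUFFICIENT_BUFFER before allocating; the sketch collapses that check into the return value.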
BOOL InternetCrackUrl(
IN LPCSTR lpszUrl,
IN DWORD dwUrlLength,
IN DWORD dwFlags,
IN OUT LPURL_COMPONENTS lpUrlComponents
);
Cracks a URL into its component parts.
| ICU_DECODE | Converts encoded characters back to their normal form. This can be used only if the user provides buffers in the URL_COMPONENTS structure to copy the components into. |
| ICU_ESCAPE | Converts all escape sequences (%xx) to their corresponding characters. This can be used only if the user provides buffers in the URL_COMPONENTS structure to copy the components into. |
The required components are indicated by members of the URL_COMPONENTS structure. Each component has a pointer to the value and has a member that stores the length of the stored value. If both the value and the length for a component are equal to zero, that component is not returned. If the pointer to the value of the component is NULL and the value of its corresponding length member is nonzero, the address of the first character of the corresponding component in the lpszUrl string is stored in the pointer, and the length of the component is stored in the length member.
If the pointer contains the address of a user-supplied buffer, the length member must contain the size of the buffer. InternetCrackUrl copies the component into the buffer, and the length member is set to the length of the copied component, not counting the trailing string terminator.
For InternetCrackUrl to work properly, the size of the URL_COMPONENTS structure must be stored in the dwStructSize member.
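The three pointer and length cases described above can be sketched in portable C. The demo_component type and demo_return_component function are hypothetical simplifications of a single value/length pair; they are not the URL_COMPONENTS structure or the InternetCrackUrl implementation, and the sketch assumes the caller has already located the component inside the URL string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one value/length pair of
   URL_COMPONENTS. Field names are illustrative only. */
typedef struct {
    char         *value;   /* NULL, or a caller-supplied buffer */
    unsigned long length;  /* 0, a nonzero request marker, or buffer size */
} demo_component;

/* Applies the three cases to one component that occupies
   part[0..part_len) inside the original URL string.
   Returns 1 on success, 0 if a supplied buffer is too small. */
int demo_return_component(const char *part, unsigned long part_len,
                          demo_component *c)
{
    assert(part != NULL && c != NULL);

    if (c->value == NULL && c->length == 0)
        return 1;                   /* case 1: component not requested */

    if (c->value == NULL) {
        c->value = (char *)part;    /* case 2: point into the URL itself */
        c->length = part_len;       /* text is not NUL-terminated here */
        return 1;
    }

    if (c->length < part_len + 1)
        return 0;                   /* case 3 needs room for the NUL */
    memcpy(c->value, part, part_len);
    c->value[part_len] = '\0';
    c->length = part_len;           /* copied length, terminator excluded */
    return 1;
}
```

In the pointer case the returned value is not terminated at the component boundary, so the caller has to use the length member rather than treating the pointer as a standalone string.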
See also FtpOpenFile, InternetCloseHandle, InternetFindNextFile, InternetSetStatusCallback
BOOL InternetCreateUrl(
IN LPURL_COMPONENTS lpUrlComponents,
IN DWORD dwFlags,
OUT LPSTR lpszUrl,
IN OUT LPDWORD lpdwUrlLength
);
Creates a URL from its component parts.
| ICU_ESCAPE | Converts all unsafe characters to their corresponding escape sequences (%xx). |
| ICU_USERNAME | When adding the user name, uses the name that was specified at logon time. |
HINTERNET InternetOpenUrl(
IN HINTERNET hInternetSession,
IN LPCSTR lpszUrl,
IN LPCSTR lpszHeaders,
IN DWORD dwHeadersLength,
IN DWORD dwFlags,
IN DWORD dwContext
);
Begins reading a complete FTP, Gopher, or HTTP URL. Use InternetCanonicalizeUrl first if the URL being used contains a relative URL and a base URL separated by blank spaces.
This is a general function that an application can use to retrieve data over any of the protocols that the Win32 Internet functions support. This function is particularly useful when the application does not need to access the particulars of a protocol, but only requires the data corresponding to a URL. The InternetOpenUrl function parses the URL string, establishes a connection to the server, and prepares to download the data identified by the URL. The application can then use InternetReadFile (for files) or InternetFindNextFile (for directories) to retrieve the URL data. It is not necessary to call InternetConnect before InternetOpenUrl.
InternetOpenUrl disables Gopher on ports less than 1024, except for port 70 (the standard Gopher port) and port 105 (typically used for Central Services Organization [CSO] name searches).
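The port rule above can be restated as a small predicate. demo_gopher_port_allowed is a hypothetical name used only for illustration; it is not a Win32 Internet function, and it encodes nothing beyond the rule as stated.

```c
#include <assert.h>

/* Hypothetical predicate restating the rule above; not a WinInet function.
   Returns nonzero if a Gopher URL on this port would not be disabled. */
int demo_gopher_port_allowed(unsigned int port)
{
    assert(port <= 65535u);     /* valid TCP port range */

    if (port == 70 || port == 105)
        return 1;               /* the two exceptions below 1024 */
    return port >= 1024;        /* all other low ports are disabled */
}
```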
Use InternetCloseHandle to close the handle returned from InternetOpenUrl. Note, however, that closing the handle before all the URL data has been read terminates the connection.
© 1997 Microsoft Corporation. All rights reserved.