Would you like to retrieve and parse the contents of a remote web page with ASP, perhaps to extract and index all of its links? Maybe you're planning to build your own search engine and be the next big Google competitor ;). This article shows you how to do it with a short ASP function.
You will need one or two things first. To retrieve the pages, you'll be using MSXML 4.0. If you use an older version, you may run into a problem with responseText, where all special, foreign, and accented characters come back replaced by '?' question marks. This is an encoding issue, and MSXML 4.0 solves it.
If you are behind a proxy server and you use ServerXMLHTTP, you may get the error "Access Denied" or "The server name or address cannot be resolved". You need to run the proxycfg utility. Run it from the command line like this: "proxycfg -u", and it will copy your proxy settings from Internet Explorer.
So here's the function to get the remote page:
'=== grab a web page, return its contents as a string
function getPage(strURL)
    dim strBody, objXML
    set objXML = Server.CreateObject("MSXML2.ServerXMLHTTP.4.0")
    objXML.Open "GET", strURL, False   '=== False = synchronous request
    'objXML.setRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"   '=== falsify the agent
    'objXML.setRequestHeader "Content-Type", "text/html; charset=ISO-8859-1"
    'objXML.setRequestHeader "Content-Type", "text/html; charset=UTF-8"
    objXML.Send
    strBody = objXML.responseText
    set objXML = nothing
    getPage = strBody
end function
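Once you have the page back as a string, extracting the links is a small extra step. Here's a minimal sketch using VBScript's built-in RegExp object to list every href value found in the page. Note this is a simplification, not a full HTML parser: the pattern assumes quoted href attributes and will miss unquoted or malformed ones, and the URL shown is just a placeholder.

```vbscript
'=== list all href values found in a remote page (simplified sketch)
dim strHTML, objRegExp, colMatches, objMatch

strHTML = getPage("http://www.example.com/")   '=== placeholder URL

set objRegExp = New RegExp
objRegExp.IgnoreCase = True                    '=== match HREF, href, Href...
objRegExp.Global = True                        '=== find every link, not just the first
objRegExp.Pattern = "href\s*=\s*[""']([^""']+)[""']"

set colMatches = objRegExp.Execute(strHTML)
for each objMatch in colMatches
    Response.Write objMatch.SubMatches(0) & "<br>"
next
set objRegExp = nothing
```

From here you could resolve relative URLs against the page's address and store each link in a database table, then feed those URLs back into getPage to crawl further.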