Parse HTML Text Object Version 2.00
Description
The Parse HTML Text Object parses an HTML text string or an HTML file into its basic components: tags and text. It also provides methods that identify tags and return their names and attributes.
Syntax
ParseHTML.Text
Remarks
The Parse HTML Text Object can be used to:
- Create a search engine by extracting text information and hyperlinks from HTML pages.
- Build MHTML email messages and add functions such as "Click Here to email this page to a friend".
- Extract data, such as stock information, from an HTML page.
This object can be especially useful when combined with components like HTTP Get/Post or AspHTTP.
The following is an example of code which can extract all hyperlinks from an HTML Web page.
Set objHTTPUtils = Server.CreateObject("qInternet.HTTPUtils")
'Note: qInternet.HTTPUtils is part of HTTP Get/Post
objHTTPUtils.Webserver = "www.microsoft.com"
Set objParseHTMLText = Server.CreateObject("ParseHTML.Text")
'Load the page returned by HTTP Get/Post into the parser
objParseHTMLText.HTMLString = objHTTPUtils.GetIt
'Fetch the first component (tag or text)
strParsedText = objParseHTMLText.GetParsedText
Do While strParsedText <> ""
    'Write out only anchor tags
    If objParseHTMLText.ParsedTextIsTag("a") Then
        Response.Write Server.HTMLEncode(strParsedText) & "<br>" & vbCrLf
    End If
    strParsedText = objParseHTMLText.GetParsedText
Loop
In the code shown above, the CreateObject function returns a Parse HTML Text Object (objParseHTMLText). The HTMLString property initializes the object with an HTML text string obtained from the GetIt method of the HTTP Get/Post object. GetParsedText begins processing the HTML text; each time it is called, it returns the next component (tag or text) from the HTML string. A Do While loop processes each component returned by GetParsedText. A zero-length string ("") indicates that the end of the HTML text has been reached. ParsedTextIsTag checks the last parsed text string returned to see whether it is a tag with the given name. In this example we are looking for all anchor tags (e.g. <a href="http://www.microsoft.com">) in the HTML document and writing them to our Web page.
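The loop described above writes out each complete anchor tag. If only the link target is wanted, the same loop can pass each anchor tag through a small helper, sketched below. GetHref and strHref are hypothetical names introduced for this illustration and are not part of the Parse HTML Text Object. The object's own attribute methods are not shown on this page, so the sketch falls back on the built-in VBScript RegExp object. It assumes that objParseHTMLText has already been created and loaded as in the example above, and that href values are quoted.

```vbscript
' Hypothetical helper (not part of the Parse HTML Text Object).
' Returns the quoted href value found in a tag string, or "" if none is found.
Function GetHref(strTag)
    Dim objRegExp, objMatches
    GetHref = ""
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    ' Matches href="..." or href='...' and captures the value between the quotes
    objRegExp.Pattern = "href\s*=\s*[""']([^""']*)[""']"
    Set objMatches = objRegExp.Execute(strTag)
    If objMatches.Count > 0 Then
        GetHref = objMatches(0).SubMatches(0)
    End If
End Function

' Assumes objParseHTMLText was created and given an HTMLString as shown earlier.
Dim strParsedText, strHref
strParsedText = objParseHTMLText.GetParsedText
Do While strParsedText <> ""
    If objParseHTMLText.ParsedTextIsTag("a") Then
        strHref = GetHref(strParsedText)
        ' Skip anchors without an href (for example, named anchors)
        If strHref <> "" Then
            Response.Write Server.HTMLEncode(strHref) & "<br>" & vbCrLf
        End If
    End If
    strParsedText = objParseHTMLText.GetParsedText
Loop
```

Unquoted href values are not matched by this pattern. A production page would likely use the object's own attribute methods instead, where available.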
Copyright (c) 1999-2002 by Cimarron Ravine, L.L.C. All Rights Reserved.