Parse HTML Text Object
Version 2.00


Description

The Parse HTML Text Object is designed to parse either an HTML text string or an HTML file into its basic components of tags and text.  The object goes further by providing methods that identify tags and return their names and attributes.

Syntax

ParseHTML.Text

Remarks

The Parse HTML Text Object can be especially useful when combined with components like HTTP Get/Post or AspHTTP.

The following example extracts all hyperlinks (anchor tags) from an HTML Web page.

Set objHTTPUtils = Server.CreateObject("qInternet.HTTPUtils")
'Note: qInternet.HTTPUtils is part of HTTP Get/Post
objHTTPUtils.Webserver = "www.microsoft.com"

Set objParseHTMLText = Server.CreateObject("ParseHTML.Text")

'Load the retrieved page, then fetch the first component (tag or text)
objParseHTMLText.HTMLString = objHTTPUtils.GetIt
strParsedText = objParseHTMLText.GetParsedText

'A zero-length string signals the end of the HTML text
Do While strParsedText <> ""
  'Write out only the anchor tags
  If objParseHTMLText.ParsedTextIsTag("a") Then
    Response.Write Server.HTMLEncode(strParsedText) & "<br>" & vbCrLf
  End If
  strParsedText = objParseHTMLText.GetParsedText
Loop

In the code shown above, the CreateObject function returns a Parse HTML Text Object (objParseHTMLText).  The HTMLString property is used to initialize the object with an HTML text string obtained from the HTTP Get/Post object method GetIt.  GetParsedText is used to begin processing the HTML text.  Each time it is called, it returns the next component (tag or text) from the HTML string.  A Do While loop is used to process each component returned by GetParsedText.  A zero-length string ("") indicates that the end of the HTML text has been reached.  ParsedTextIsTag checks the last parsed text string returned to see whether it is the named tag.  In this example we are looking for all anchor tags (i.e. <a href="http://www.microsoft.com">) in the HTML document and writing them to our Web page.
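
For readers who want to see the idea outside of ASP, the following is a minimal Python sketch of the same pattern: split an HTML string into its tag and text components, then keep only the opening tags with a given name. It is not the component's implementation and does not call its API. The names split_components, is_tag, and extract_tags are hypothetical, and the regular expressions are a simplifying assumption that ignores comments, scripts, and attribute values containing ">".

```python
import re

# Hypothetical stand-in for the behaviour described above; the real
# component's internals are not documented here.
# A component is either one tag (<...>) or one run of text between tags.
_COMPONENT = re.compile(r"<[^>]*>|[^<]+")
# An opening tag starts with "<", optional whitespace, then the tag name.
_TAG_NAME = re.compile(r"<\s*([A-Za-z][A-Za-z0-9]*)")


def split_components(html):
    """Return the tags and text runs of html, in document order."""
    return _COMPONENT.findall(html)


def is_tag(component, name):
    """True if component is an opening tag called name (case-insensitive)."""
    match = _TAG_NAME.match(component)
    return match is not None and match.group(1).lower() == name.lower()


def extract_tags(html, name):
    """Collect every opening tag called name, as the ASP loop does for "a"."""
    return [c for c in split_components(html) if is_tag(c, name)]


sample = '<p>See <a href="http://www.microsoft.com">Microsoft</a> now.</p>'
components = split_components(sample)
anchors = extract_tags(sample, "a")
```

Here components holds each tag and each text run as a separate string, and anchors holds only the opening anchor tag. Closing tags such as </a> are deliberately not counted as matches in this sketch; whether ParsedTextIsTag treats them the same way is not stated in this document.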


Copyright (c) 1999-2002 by Cimarron Ravine, L.L.C.
All Rights Reserved.