HTML Analysis


Question:

Has anyone written a WinBatch script to analyze an HTML document, similar to what the analysis.wbt script distributed with WinBatch does with window hierarchies? Not wanting to reinvent the wheel, I thought I'd check first. Didn't see anything in the tech database.

Answer:

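I've written a script that takes a simple inventory of an HTML document (every object on the page) so you can see which tags are present and how many of each. It drives Internet Explorer through OLE, counts the tag names, displays the totals, and copies them to the clipboard.
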
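;   Start Internet Explorer, load the page, and wait for it to finish.
;   #DefineSubRoutine shares the caller's variables, so Browser, browserDoc
;   and all remain available to the main script after the call.
#DefineSubRoutine startMSIE(url)
   Browser = ObjectOpen("InternetExplorer.Application")
   Browser.addressbar = @FALSE
   Browser.statusbar = @FALSE
   Browser.menubar = @FALSE
   Browser.toolbar = @FALSE
   Browser.visible = @TRUE
   Browser.navigate(url)
   WaitForPageLoad()
   ;   set up the document object and its collection of every element...
   browserDoc = Browser.Document
   all = browserDoc.all
   Return(Browser)
#EndSubRoutine

;   Wait until the browser is idle and the document is fully loaded.
;   Relies on the caller's Browser variable.
#DefineSubRoutine WaitForPageLoad()
   While Browser.busy || Browser.readystate == 1   ; 1 = still loading
      TimeDelay(0.5)
   EndWhile
   While Browser.Document.ReadyState != "complete"
      TimeDelay(0.5)
   EndWhile
   Return
#EndSubRoutine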



;   set up what to do if the browser is closed before the script ends...
IntControl(73, 1, 0, 0, 0)

;   start the browser...
url = "C:\Program Files\RPC\Tidy_Selection.html"
br = startMSIE(url)

;hfiles = fileitempath("C:\Writing\Law and Order\test.html.html")
;"C:\Writing\Rifts\html\*.html")

;url = askitemlist("HTML Inventory", hfiles, @tab, @sorted, @single)
;browser.navigate(url)
;WaitForPageLoad()


;message("Debug", browser.locationURL)
Message("Total Objects on Page", all.length)

objlist = ""
objcount= ""

For x = 0 To all.length-1
;   dp = all.%x%
   dp = all.item(x)
;   if dp.tagname == "" then message("Debug", dp.innerhtml)
   location = ItemLocate(dp.tagname, objlist, @TAB)
   If location == 0
      objlist = ItemInsert(dp.tagname, -1, objlist, @TAB)
      objcount= ItemInsert("1", -1, objcount, @TAB)
   Else
      count   = ItemExtract(location, objcount, @TAB)
      objcount= ItemReplace(count+1, location, objcount, @TAB)
   EndIf
Next

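;   give any object that reported an empty tag name a visible label
;   (the placeholder text is arbitrary)
For x = 1 To ItemCount(objlist, @TAB)
   thisobj = ItemExtract(x, objlist, @TAB)
   If thisobj == "" Then objlist = ItemReplace("(no tag name)", x, objlist, @TAB)
Next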

finaltxt = ""
sortlist = ItemSort(objlist, @TAB)

For x = 1 To ItemCount(objlist, @TAB)
   thisobj = ItemExtract(x, sortlist, @TAB)
   usloc   = ItemLocate(thisobj, objlist, @TAB)
   thiscnt = ItemExtract(usloc, objcount, @TAB)
   finaltxt = StrCat(finaltxt, thisobj, @TAB, thiscnt, @CRLF)
Next

Message("Debug", finaltxt)

;sortlist = strcat("exit", @tab, sortlist)
;while @true
;   tag = AskItemList("Tag Details", sortlist, @tab, @sorted, @single)
;   if tag == "exit" then break
;   gosub showTAGS
;endwhile

ClipPut(finaltxt)

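br.quit

;   normal completion falls through to this label; the IntControl(73) call
;   above also sends the script here if an error occurs
:WBERRORHANDLER

Exit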

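;   Optional detail pass, called by the commented-out AskItemList loop above:
;   appends the name of every element with the chosen tag to finaltxt.
:showTAGS
Taglist = ""
objcoll = browserdoc.GetElementsByTagname(tag)
For tt = 0 To objcoll.length-1
   thistag = objcoll.item(tt)
   Taglist = ItemInsert(thistag.name, -1, Taglist, @TAB)
   ObjectClose(thistag)
Next
Taglist = ItemSort(Taglist, @TAB)
;   one name per line, using the same line ending as the rest of finaltxt
Taglist = StrReplace(Taglist, @TAB, @CRLF)
finaltxt = StrCat(finaltxt, @CRLF, @CRLF, tag, " <----------", @CRLF, Taglist)
Return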
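
The counting step in the script above keeps two parallel tab-delimited lists, one holding the distinct tag names and one holding their counts. If you want to try that technique without starting a browser, here is a minimal sketch. The sample tag list and the variable names (tags, names, counts, report) are made up for illustration and are not part of the original script.

```winbatch
;   Hypothetical sample data standing in for the tag names read from a page
tags = StrCat("P", @TAB, "A", @TAB, "P", @TAB, "IMG", @TAB, "A", @TAB, "P")

names  = ""   ; distinct tag names seen so far
counts = ""   ; counts, kept in the same order as names

For i = 1 To ItemCount(tags, @TAB)
   tag = ItemExtract(i, tags, @TAB)
   pos = ItemLocate(tag, names, @TAB)
   If pos == 0
      ;   first time this tag is seen: append it with a count of 1
      names  = ItemInsert(tag, -1, names, @TAB)
      counts = ItemInsert("1", -1, counts, @TAB)
   Else
      ;   tag already listed: bump the count stored at the same position
      n = ItemExtract(pos, counts, @TAB)
      counts = ItemReplace(n + 1, pos, counts, @TAB)
   EndIf
Next

;   build one "tag <tab> count" line per distinct tag
report = ""
For i = 1 To ItemCount(names, @TAB)
   report = StrCat(report, ItemExtract(i, names, @TAB), @TAB, ItemExtract(i, counts, @TAB), @CRLF)
Next
Message("Tag counts", report)
```

Because the two lists share positions, sorting one of them on its own breaks the pairing. That is why the full script sorts a copy of the name list and then looks each name up again in the unsorted list to find its count.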



Article ID:   W16637
File Created: 2005:02:18:12:21:42
Last Updated: 2005:02:18:12:21:42