Detailed HTML Page Inventory.wbt is the script that produced the analysis I gave you.
I had to write a web crawler program last weekend for one of the posters here, and the detailed version below is what I came up with.
It's not finished and not purty (I still have to add the CSS/formatting), but it helped me in this instance. You can
customize it for your needs.
MSIE INVENTORY.WBT
Simple Inventory
#definesubroutine startMSIE(url)
   Browser = ObjectOpen("InternetExplorer.Application")
   Browser.addressbar = @false
   Browser.statusbar = @false
   Browser.menubar = @false
   Browser.toolbar = @false
   browser.visible = @true
   browser.navigate(url)
   timedelay(1)
   while browser.busy
      timedelay(0.5)
   endwhile
   ; setup the document object...
   browserDoc = Browser.Document
   all = browserdoc.all
   return(browser)
#endsubroutine

; set up what to do if the browser is closed before the script ends...
IntControl(73, 1, 0, 0, 0)

; start the browser...
url = "C:\Program Files\RPC\Tidy_Selection.html" ;<--- change to your URL or local file name...
;url = "C:\Test\coding\typesetting test.html"
br = startMSIE(url)
message("Total Objects on Page", all.length)

objlist = ""
objcount= ""
for x = 0 to all.length-1
   ; dp = all.%x%
   dp = all.item(x)
   ; if dp.tagname == "" then message("Debug", dp.innerhtml)
   location = itemlocate(dp.tagname, objlist, @tab)
   if location == 0
      objlist = iteminsert(dp.tagname, -1, objlist, @tab)
      objcount= iteminsert("1", -1, objcount, @tab)
   else
      count = itemextract(location, objcount, @tab)
      objcount= itemreplace(count+1, location, objcount, @tab)
   endif
next

for x = 1 to itemcount(objlist, @tab)
   thisobj = itemextract(x, objlist, @tab)
   if thisobj == "" then objlist = itemreplace("<blank>", x, objlist, @tab)
next

finaltxt = ""
sortlist = itemsort(objlist, @tab)
for x = 1 to itemcount(objlist, @tab)
   thisobj = itemextract(x, sortlist, @tab)
   usloc = itemlocate(thisobj, objlist, @tab)
   thiscnt = itemextract(usloc, objcount, @tab)
   finaltxt = strcat(finaltxt, thisobj, @tab, thiscnt, @crlf)
next
message("Debug", finaltxt)

;sortlist = strcat("exit", @tab, sortlist)
;while @true
;   tag = AskItemList("Tag Details", sortlist, @tab, @sorted, @single)
;   if tag == "exit" then break
;   gosub showTAGS
;endwhile

clipput(finaltxt)
br.quit

:WBERRORHANDLER
exit

:showTAGS
Taglist = ""
objcoll = browserdoc.GetElementsByTagname(tag)
for tt = 0 to objcoll.length-1
   thistag = objcoll.item(tt)
   Taglist = iteminsert(thistag.name, -1, Taglist, @tab)
   objectclose(thistag)
next
Taglist = itemsort(Taglist, @tab)
Taglist = strreplace(Taglist, @tab, @cr)
finaltxt = strcat(finaltxt, @crlf, @crlf, tag, " ----------", @crlf, Taglist)
return
DETAILED HTML PAGE INVENTORY.WBT
Detailed Inventory
#definesubroutine startMSIE(url)
   Browser = objectopen("InternetExplorer.Application")
   Browser.addressbar = @false
   Browser.statusbar = @false
   Browser.menubar = @false
   Browser.toolbar = @false
   browser.visible = @true
   browser.navigate(url)
   ; wait until page loads...
   while Browser.busy || Browser.readystate <> 4
      timedelay(0.5)
   endwhile
   ; setup the document object...
   browserDoc = Browser.Document
   all = browserdoc.all
   return(browser)
#endsubroutine

#defineSubroutine WaitForBrowser()
   while Browser.busy || Browser.readystate <> 4
      timedelay(0.5)
   endwhile
   return
#endSubroutine

#defineFunction BuildPageHeader(HList)
   HeaderHTML = `<div id="Header" align="center">`
   HeaderHTML = strcat(HeaderHTML, `<table class="blanktable" id="ButtonTable" border="0"><tr>`, @crlf)
   for x = 1 to itemcount(HList, "|")
      ThisValue = itemextract(x, HList, "|")
      InputHTML = strcat(`<input type="submit" value="`, ThisValue, `" onclick="DetailDivs('`, ThisValue, `')"/>`)
      HeaderHTML = strcat(HeaderHTML, `<td class="blanktd">`, InputHTML, `</td>`)
   next
   HeaderHTML = strcat(HeaderHTML, `</tr></table></div>`, @crlf)
   return(HeaderHTML)
#endFunction

#defineFunction BuildDetailDivs(HList)
   DetailHTML = `<div id="DetailDiv">`
   for x = 1 to itemcount(HList, "|")
      ThisValue = itemextract(x, HList, "|")
      DetailHTML = strcat(DetailHTML, `<div style="display:none;" id="`, ThisValue, `">%ThisValue%</div>`, @crlf)
   next
   DetailHTML = strcat(DetailHTML, `</div>`, @crlf)
   return(DetailHTML)
#endFunction

#defineFunction GetFormsData(doc, page)
   FormsCollection = doc.GetElementsByTagName("FORM")
   FormsHTML = strcat(`<p>There are `, FormsCollection.length, ` forms on <i>`, page, `</i></p><br>`, @crlf)
   for x = 0 to FormsCollection.length-1
      FormsHTML = strcat(FormsHTML, `<table cellspacing="0" cellpadding="2" style="border-collapse:collapse" border="1">`, @crlf)
      FormsHTML = strcat(FormsHTML, `<caption>Form #`, x+1, `</caption>`, @crlf)
      FormsHTML = strcat(FormsHTML, `<tr><td><b>ID</b></td><td>`, FormsCollection.item(x).id, `</td></tr>`, @crlf)
      FormsHTML = strcat(FormsHTML, `<tr><td><b>Name</b></td><td>`, FormsCollection.item(x).name, `</td></tr>`, @crlf)
      FormsHTML = strcat(FormsHTML, `<tr><td><b>Method</b></td><td>`, FormsCollection.item(x).method, `</td></tr>`, @crlf)
      FormsHTML = strcat(FormsHTML, `<tr><td><b>Action</b></td><td>`, FormsCollection.item(x).action, `</td></tr>`, @crlf)
      FormsHTML = strcat(FormsHTML, `</table><br>`, @crlf)
   next
   objectclose(FormsCollection)
   return(FormsHTML)
#endFunction

#defineFunction GetTablesData(doc, page)
   TablesCollection = doc.GetElementsByTagName("TABLE")
   TablesHTML = strcat(`<p>There are `, TablesCollection.length, ` tables on <i>`, page, `</i></p><br>`, @crlf)
   for x = 0 to TablesCollection.length-1
      TablesHTML = strcat(TablesHTML, `<table cellspacing="0" cellpadding="2" style="border-collapse:collapse" border="1">`, @crlf)
      TablesHTML = strcat(TablesHTML, `<caption>Table #`, x+1, `</caption>`, @crlf)
      TablesHTML = strcat(TablesHTML, `<tr><td><b>ID</b></td><td>`, TablesCollection.item(x).id, `</td></tr>`, @crlf)
      TablesHTML = strcat(TablesHTML, `<tr><td><b>Name</b></td><td>`, TablesCollection.item(x).name, `</td></tr>`, @crlf)
      TablesHTML = strcat(TablesHTML, `<tr><td><b>Rows</b></td><td>`, TablesCollection.item(x).rows.length, `</td></tr>`, @crlf)
      TablesHTML = strcat(TablesHTML, `</table><br>`, @crlf)
   next
   objectclose(TablesCollection)
   return(TablesHTML)
#endFunction

#defineFunction GetInputsData(doc, page)
   InputsCollection = doc.GetElementsByTagName("INPUT")
   InputsHTML = strcat(`<p>There are `, InputsCollection.length, ` inputs on <i>`, page, `</i></p><br>`, @crlf)
   for x = 0 to InputsCollection.length-1
      InputsHTML = strcat(InputsHTML, `<table cellspacing="0" cellpadding="2" style="border-collapse:collapse" border="1">`, @crlf)
      InputsHTML = strcat(InputsHTML, `<caption>Input #`, x+1, `</caption>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>ID</b></td><td>`, InputsCollection.item(x).id, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>Name</b></td><td>`, InputsCollection.item(x).name, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>Value</b></td><td>`, InputsCollection.item(x).value, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>Type</b></td><td>`, InputsCollection.item(x).type, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>OnClick</b></td><td>`, InputsCollection.item(x).attributes.onclick.value, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>OnChange</b></td><td>`, InputsCollection.item(x).attributes.onchange.value, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>OnBlur</b></td><td>`, InputsCollection.item(x).attributes.onblur.value, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `</table><br>`, @crlf)
   next
   objectclose(InputsCollection)
   return(InputsHTML)
#endFunction

#defineFunction GetSelectsData(doc, page)
   InputsCollection = doc.GetElementsByTagName("SELECT")
   InputsHTML = strcat(`<p>There are `, InputsCollection.length, ` selects on <i>`, page, `</i></p><br>`, @crlf)
   for x = 0 to InputsCollection.length-1
      InputsHTML = strcat(InputsHTML, `<table cellspacing="0" cellpadding="2" style="border-collapse:collapse" border="1">`, @crlf)
      InputsHTML = strcat(InputsHTML, `<caption>Select #`, x+1, `</caption>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>ID</b></td><td>`, InputsCollection.item(x).id, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>Name</b></td><td>`, InputsCollection.item(x).name, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>Value</b></td><td>`, InputsCollection.item(x).value, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>OnClick</b></td><td>`, InputsCollection.item(x).attributes.onclick.value, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>OnChange</b></td><td>`, InputsCollection.item(x).attributes.onchange.value, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `<tr><td><b>OnBlur</b></td><td>`, InputsCollection.item(x).attributes.onblur.value, `</td></tr>`, @crlf)
      InputsHTML = strcat(InputsHTML, `</table><br>`, @crlf)
   next
   objectclose(InputsCollection)
   return(InputsHTML)
#endFunction

;
path = "c:\test\coding\sharret\"
htmlfiles = fileitemize(strcat(path, "s*.htm"))
thisfile = strcat(path, AskItemList("HTML Details", htmlfiles, @tab, @sorted, @single))
br = startMSIE(thisfile)
;
;------------ Begin Inventory
;
FormsHTML = GetFormsData(br.document, thisfile)
TablesHTML = GetTablesData(br.document, thisfile)
InputsHTML = GetInputsData(br.document, thisfile)
SelectsHTML = GetSelectsData(br.document, thisfile)
;
;------------ Begin Report
;
HList = "Forms|Tables|Inputs|Selects"
;br = startMSIE("about:blank")
browser.navigate("about:blank")
WaitForBrowser()
;
vbscript = ""
vbscript = strcat(vbscript, `<script language="vbscript">`, @crlf)
vbscript = strcat(vbscript, ` Sub DetailDivs(DivName)`, @crlf)
vbscript = strcat(vbscript, ` dim x, DetailCollection`, @crlf)
vbscript = strcat(vbscript, ` set DetailCollection = document.GetElementByID("DetailDiv").GetElementsByTagName("DIV")`, @crlf)
vbscript = strcat(vbscript, ` For x = 0 to document.GetElementByID("DetailDiv").GetElementsByTagName("DIV").length-1`, @crlf)
vbscript = strcat(vbscript, ` if DetailCollection.item(x).id = DivName then`, @crlf)
vbscript = strcat(vbscript, ` DetailCollection.item(x).style.display = ""`, @crlf)
vbscript = strcat(vbscript, ` else`, @crlf)
vbscript = strcat(vbscript, ` DetailCollection.item(x).style.display = "none"`, @crlf)
vbscript = strcat(vbscript, ` end if`, @crlf)
vbscript = strcat(vbscript, ` Next`, @crlf)
vbscript = strcat(vbscript, ` End Sub`, @crlf)
vbscript = strcat(vbscript, `</script>`, @crlf)
;
browser.document.writeln(vbscript)
;
browserdoc.writeln(`<style>`)
browserdoc.writeln(`caption, .tdbold {font-weight: bold}`)
browserdoc.writeln(`.blanktable, .blanktd {border: none}`)
browserdoc.writeln(`td, th, table, caption {border: .25mm solid black}`)
browserdoc.writeln(`td {background: whitesmoke}`)
browserdoc.writeln(`table, p {font-size: 9pt}`)
browserdoc.writeln(`caption, th {color: yellow; background: black}`)
browserdoc.writeln(`</style>`)
;
browserdoc.title = strcat("Detailed HTML for ", thisfile)
;
browser.document.writeln(BuildPageHeader(HList))
browser.document.writeln(`<br><br>`)
browser.document.writeln(BuildDetailDivs(HList))
;
; now insert the data for each...
browser.document.GetElementByID("Forms").innerHTML = FormsHTML
browser.document.GetElementByID("Tables").innerHTML = TablesHTML
browser.document.GetElementByID("Inputs").innerHTML = InputsHTML
browser.document.GetElementByID("Selects").innerHTML = SelectsHTML
;
if askyesno("HTML Details", "Export Details?")
   outfile = "C:\test\coding\Detailed HTML.html"
   fileput(outfile, browser.document.GetElementsByTagName("HTML").item(0).outerHTML)
endif
exit
Article ID: W17180
File Created: 2007:07:03:14:28:36
Last Updated: 2007:07:03:14:28:36