Wednesday, December 10, 2008

What's wrong with Internet Explorer Automation?

The Microsoft Office products (Word, Excel, Power Point, Access, Outlook) allow their users to manipulate Office documents from Visual Basic or Visual Basic for Applications (VBA) code. It is possible to write a VBA macro in Excel that initializes a series of cells, and uses the cells to display a chart for instance.

Automation is the process of controlling one product from another product with the result that the client product can use the objects, methods, and properties of the server product. The client has access to the object model of the server.

Though Internet Explorer browser is not part of the Office suite, it supports automation. Here is a short sample:

// Create an IE automation object.
var ie = new ActiveXObject("InternetExplorer.Application");

// Make it visible and navigate to a given URL.
ie.Visible = true;

// Give it some time to load the page and then get the document.
var doc = ie.Document;

// Fill out search field.
var edit = doc.getElementsByName("q").item(0);
edit.value = "codecentrix";

// ... and press the submit button.
var submit = doc.getElementsByName("btnG").item(0);;

Here is ie_auto.js file for download.
However there are problems with Internet Explorer automation:
  • it may not work at all on Windows Vista unless the script is running at the same integrity level as iexplore.exe process. Simply clicking the js file won't do it. The script will run at medium integrity level and Internet Explorer has low integrity level and as result the script fails. If you run the script at high integrity level the newly started IE instance will have the same high integrity level and the script works (but this is not the best option from a security point of view). Changing the integrity level of the running script (or application) is not always the most desirable or easiest thing to do.
  • no support to "connect" to already existing IE documents.
  • difficult search of elements across all sub-documents inside frames/iframes (and sometimes impossible, see the point above).
  • difficult and time consuming search of HTML elements on attributes other than id or name (getElementById and getElementsByName are the only methods I know that search elements directly wihtout browsing element collections which might be very slow when performed out of process).
  • no direct support for synchronizing input actions (clicks, keys) with the HTML document loading (it could be implemented by registering to IE events like document complete or looping while the browser becomes ready to accept inputs).
  • no advanced search criteria like regular expression or searching on multiple attributes.
If you are interested in solving the issues above, let me introduce a project I've been working on for some time now. Here's Twebst, web automation library for Internet Explorer!

Get it FREE!

(to be continued)

1 comment:

Nilpo said...

Hello Adrian,

I wanted to drop by and thank you for including a link to my site in your post. I've just posted a suggestion to my readers to give your COM object a try.

A Clipboard Control for WSH

Happy Coding!

Nilpo, the Windows Guru