Monday, December 15, 2008

Free Web Macros for Internet Explorer

As I explained in my previous post, automating Internet Explorer can be a difficult task.
The Twebst Web Automation Library can make things easier.

It gives full programmatic control over the Internet Explorer browser. Twebst is a library of COM objects that can be used within any environment that supports COM, from scripting languages (JScript, VBScript) to high-level programming languages (C#, C++). For more information, see the Twebst Library Online Documentation. And yes, it's free!

Get it FREE!

What can Twebst do?

  • increase productivity by automating repetitive web tasks
  • automate regression testing of web applications
  • automate web actions and data-entry
  • automatically log in to different web sites
  • fill out web-forms automatically
  • extract data from web pages (web scraping)
  • monitor web pages

Twebst features

  • Start new browsers and navigate to a specified URL.
  • Connect to existing browsers.
  • Search and access HTML elements and frames inside browsers.
  • Intuitive names for HTML elements using the text that appears on the screen.
  • Advanced search of browsers and HTML elements using regular expressions.
  • Perform actions on all HTML controls (button, combo-box, list-box, edit-box etc).
  • Simulate user behavior by generating hardware or browser events.
  • Get access to native interfaces exposed by Internet Explorer, so you don't need to learn new things if you already know IE web programming.
  • Synchronize web actions and navigation by waiting for the page to finish loading within a specified timeout.
  • Available from any programming or scripting language that supports COM.
  • Optimized search methods and collections.

Wednesday, December 10, 2008

What's wrong with Internet Explorer Automation?

The Microsoft Office products (Word, Excel, PowerPoint, Access, Outlook) allow their users to manipulate Office documents from Visual Basic or Visual Basic for Applications (VBA) code. For instance, it is possible to write a VBA macro in Excel that initializes a series of cells and then uses those cells to display a chart.

Automation is the process of controlling one product (the server) from another product (the client), so that the client can use the objects, methods, and properties of the server. In other words, the client has access to the object model of the server.

Though the Internet Explorer browser is not part of the Office suite, it supports automation. Here is a short sample:

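// Create an IE automation object.
var ie = new ActiveXObject("InternetExplorer.Application");

// Make it visible and navigate to a given URL.
// (The URL is an assumption: the field names "q" and "btnG" used below
// suggest the Google search page.)
ie.Visible = true;
ie.Navigate("http://www.google.com/");

// Give it some time to load the page and then get the document.
// 4 is READYSTATE_COMPLETE; WScript.Sleep requires Windows Script Host.
while (ie.Busy || ie.ReadyState != 4) {
    WScript.Sleep(100);
}
var doc = ie.Document;

// Fill out search field.
var edit = doc.getElementsByName("q").item(0);
edit.value = "codecentrix";

// ... and press the submit button.
var submit = doc.getElementsByName("btnG").item(0);
submit.click();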

Here is the ie_auto.js file for download.
However, there are problems with Internet Explorer automation:
  • it may not work at all on Windows Vista unless the script runs at the same integrity level as the iexplore.exe process. Simply double-clicking the .js file won't do: the script runs at medium integrity level while Internet Explorer runs at low integrity level, so the script fails. If you run the script at high integrity level, the newly started IE instance gets the same high integrity level and the script works, but this is not the best option from a security point of view. Changing the integrity level of the running script (or application) is not always desirable or easy.
  • no support for "connecting" to already open IE documents.
  • searching for elements across all sub-documents inside frames/iframes is difficult (and sometimes impossible, see the point above).
  • searching for HTML elements by attributes other than id or name is difficult and time-consuming (getElementById and getElementsByName are the only methods I know of that find elements directly, without walking element collections, which can be very slow when done out of process).
  • no direct support for synchronizing input actions (clicks, keys) with the loading of the HTML document (it can be implemented by subscribing to IE events such as document complete, or by looping until the browser is ready to accept input).
  • no advanced search criteria such as regular expressions or searching on multiple attributes.
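To make the search-related points more concrete, here is a rough sketch of the kind of helper you end up writing by hand with plain IE automation: it walks a document and every sub-document it can reach, and matches several attributes against regular expressions. The function names (matchesAll, findElements) and the shape of the criteria object are hypothetical, made up for this illustration. The sketch assumes the document exposes getElementsByTagName and a frames collection with length and item(), and that a frame from another domain throws when its document is accessed.

```javascript
// Hypothetical helper: true when every pattern in `criteria` matches the element.
// `criteria` maps attribute names to regular expressions (avoid the "g" flag,
// because test() on a global regex keeps state between calls).
function matchesAll(el, criteria) {
    for (var name in criteria) {
        var value = el.getAttribute(name);
        if (value === null || value === undefined ||
            !criteria[name].test(String(value))) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: collects matching elements from `doc` and, recursively,
// from every sub-document reachable through its frames collection.
function findElements(doc, tagName, criteria, found) {
    found = found || [];

    // Every element is visited one by one; out of process this is the slow part.
    var elements = doc.getElementsByTagName(tagName);
    for (var i = 0; i < elements.length; i++) {
        var el = elements.item(i);
        if (matchesAll(el, criteria)) {
            found.push(el);
        }
    }

    var frames = doc.frames;
    var count = frames ? frames.length : 0;
    for (var j = 0; j < count; j++) {
        try {
            findElements(frames.item(j).document, tagName, criteria, found);
        } catch (e) {
            // Assumption: the frame denied access (e.g. another domain); skip it.
        }
    }
    return found;
}
```

With the ie object from the sample above, a call could look like findElements(ie.Document, "input", { name: /^btn/, type: /^submit$/i }). Frames that refuse access are silently skipped, which is exactly why the search is "sometimes impossible".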
If you are interested in solving the issues above, let me introduce a project I've been working on for some time now. Here's Twebst, a web automation library for Internet Explorer!

Get it FREE!

(to be continued)