Monday, December 15, 2008

Free Web Macros for Internet Explorer

As I presented in my previous post, automating Internet Explorer can be a difficult task.
Twebst Web Automation Library can make things easier.

It gives full programmatic control over the Internet Explorer browser. Twebst is a library of COM objects that can be used within any environment that supports COM, from scripting languages (JScript, VBScript) to high-level programming languages (C#, C++). For more information, see the Twebst Library Online Documentation. And yes, it's free!

Get it FREE!

What can Twebst do?

  • increase productivity by automating repetitive web tasks
  • automate regression testing of web applications
  • automate web actions and data-entry
  • automatically log in to different web sites
  • fill out web-forms automatically
  • extract data from web pages (web scraping)
  • monitor web pages

Twebst features

  • Start new browsers and navigate to a specified URL.
  • Connect to existing browsers.
  • Search and access HTML elements and frames inside browsers.
  • Intuitive names for HTML elements using the text that appears on the screen.
  • Advanced search of browsers and HTML elements using regular expressions.
  • Perform actions on all HTML controls (button, combo-box, list-box, edit-box etc).
  • Simulate user behavior by generating hardware or browser events.
  • Get access to native interfaces exposed by Internet Explorer so you don't need to learn new things if you already know IE web programming.
  • Synchronize web actions and navigation by waiting for the page to complete within a specified timeout.
  • Available from any programming or scripting language that supports COM.
  • Optimized search methods and collections.

Wednesday, December 10, 2008

What's wrong with Internet Explorer Automation?

The Microsoft Office products (Word, Excel, PowerPoint, Access, Outlook) allow their users to manipulate Office documents from Visual Basic or Visual Basic for Applications (VBA) code. It is possible, for instance, to write a VBA macro in Excel that initializes a series of cells and uses those cells to display a chart.

Automation is the process of controlling one product from another product with the result that the client product can use the objects, methods, and properties of the server product. The client has access to the object model of the server.

Though the Internet Explorer browser is not part of the Office suite, it supports automation too. Here is a short sample:

// Create an IE automation object.
var ie = new ActiveXObject("InternetExplorer.Application");

// Make it visible and navigate to a given URL.
ie.Visible = true;

// Give it some time to load the page and then get the document.
var doc = ie.Document;

// Fill out search field.
var edit = doc.getElementsByName("q").item(0);
edit.value = "codecentrix";

// ... and press the submit button.
var submit = doc.getElementsByName("btnG").item(0);
submit.click();

Here is the ie_auto.js file for download.
However, there are problems with Internet Explorer automation:
  • it may not work at all on Windows Vista unless the script runs at the same integrity level as the iexplore.exe process. Simply double-clicking the .js file won't do: the script runs at medium integrity level while Internet Explorer runs at low integrity level, and as a result the script fails. If you run the script at high integrity level, the newly started IE instance gets the same high integrity level and the script works, but this is not the best option from a security point of view. Changing the integrity level of the running script (or application) is not always desirable or easy.
  • no support for "connecting" to already existing IE documents.
  • searching for elements across all sub-documents inside frames/iframes is difficult (and sometimes impossible, see the point above).
  • searching for HTML elements on attributes other than id or name is difficult and time consuming (getElementById and getElementsByName are the only methods I know of that search elements directly, without browsing element collections, which can be very slow when performed out of process).
  • no direct support for synchronizing input actions (clicks, keys) with HTML document loading (it could be implemented by subscribing to IE events such as document complete, or by looping until the browser is ready to accept input).
  • no advanced search criteria such as regular expressions or searching on multiple attributes.
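
The "loop until the browser is ready" idea from the list above can be sketched as a small polling helper with a timeout. This is only a portable C++ illustration, not IE or Twebst code: WaitUntil and FakeBrowser are hypothetical names, and in real automation code the predicate would query the browser's busy/ready state instead.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: polls isReady until it returns true or the timeout expires.
// Returns true if the condition was met in time, false on timeout.
bool WaitUntil(const std::function<bool()>& isReady,
               std::chrono::milliseconds timeout,
               std::chrono::milliseconds pollInterval = std::chrono::milliseconds(10))
{
    assert(static_cast<bool>(isReady));   // a predicate must be supplied

    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;)
    {
        if (isReady())
            return true;
        if (std::chrono::steady_clock::now() >= deadline)
            return false;
        std::this_thread::sleep_for(pollInterval);
    }
}

// Hypothetical stand-in for a browser that reports "ready" only after
// it has been polled a given number of times.
struct FakeBrowser
{
    int pollsUntilReady;
    bool IsReady() { return pollsUntilReady-- <= 0; }
};
```

An input action would then be issued only after a call such as WaitUntil([&] { return browser.IsReady(); }, std::chrono::milliseconds(5000)) returns true.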
If you are interested in solving the issues above, let me introduce a project I've been working on for some time now. Here's Twebst, a web automation library for Internet Explorer!

Get it FREE!

(to be continued)

Tuesday, November 25, 2008

Creating shortcuts to Quick Launch Toolbar with WSH

I ran into the problem of creating shortcuts on the Quick Launch Toolbar while working on the Script Of The Day application. This small product is almost entirely written in JScript on top of Windows Script Host (WSH).

The SpecialFolders property of the WScript.Shell object provides the full path for some special folders like Desktop and Favorites, but the Quick Launch Toolbar directory is not supported. To get it I used the %userprofile% environment variable like this:
var shell          = WScript.CreateObject("WScript.Shell");
var quickLaunchDir = shell.ExpandEnvironmentStrings("%userprofile%") +
                     "\\Application Data\\Microsoft\\Internet Explorer\\Quick Launch";
var oShellLink     = shell.CreateShortcut(quickLaunchDir + "\\Codecentrix.lnk");

oShellLink.TargetPath   = "";
oShellLink.IconLocation = "";
oShellLink.WindowStyle  = 1;
oShellLink.Description  = "Web Site";

// Write the shortcut to disk.
oShellLink.Save();

Saturday, August 09, 2008

focus vs fireEvent("onfocus")

While working on Twebst web automation library I encountered this problem: how to simulate setting the focus on HTML edit controls in Internet Explorer? There are two ways to do this.

  1. Call the IHTMLElement2::focus() method on the target element, which "causes the element to receive the focus and executes the code specified by the onfocus event".
  2. Raise the onfocus event on the target element by calling the IHTMLElement3::fireEvent() method.

The two approaches are quite similar but there are some interesting differences.

  1. fireEvent("onfocus") does not actually set the focus on the element; it just executes the code of the onfocus event handler.
  2. Calling the focus method sets the focus on the target element and calls the onfocus event handler, but not immediately. The onfocus event seems to be placed in a queue and its handler is executed asynchronously after the current handler has finished.
  3. If the focus method is called from inside the onfocus handler, nothing happens if the control already has the focus (that prevents infinite recursion).


<script type="text/javascript" language="javascript">
function BtnFocusClick()
{
     document.getElementById("editTest").focus();
     window.status += "b";
}

function BtnOnFocusClick()
{
     document.getElementById("editTest").fireEvent("onfocus");
     window.status += "c";
}

function EditOnFocus()
{
     window.status += "a";
}
</script>

     <input type="text" onfocus="EditOnFocus();" id="editTest"/><br/>
     <input type="button" value="focus" id="btnFocus" onclick="BtnFocusClick();"/>
     <input type="button" value="fire onfocus" id="btnOnFocus" onclick="BtnOnFocusClick();"/>

When the "fire onfocus" button is pressed, the message in the Internet Explorer status bar is the expected one: "ac". When the "focus" button is pressed, the message is in the reverse of the expected order: "ba". That suggests that the EditOnFocus handler is called after BtnFocusClick exits.
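
The ordering observed above can be reproduced with a toy model in which focus() only queues the onfocus handler while fireEvent("onfocus") calls it directly. This is a sketch of the observed behaviour in portable C++, not a description of how IE is implemented; ToyPage and all of its members are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>

// Toy model of the behaviour observed above; not how IE is implemented.
struct ToyPage
{
    std::string status;                          // plays the role of window.status
    std::queue<std::function<void()> > pending;  // events waiting to be dispatched

    void EditOnFocus() { status += "a"; }

    // Models focus(): the onfocus handler is queued, not called right away.
    void Focus() { pending.push([this] { EditOnFocus(); }); }

    // Models fireEvent("onfocus"): the handler runs synchronously.
    void FireOnFocus() { EditOnFocus(); }

    // Runs a click handler to completion, then dispatches whatever it queued.
    void Click(const std::function<void()>& handler)
    {
        assert(static_cast<bool>(handler));
        handler();
        while (!pending.empty())
        {
            std::function<void()> next = pending.front();
            pending.pop();
            next();
        }
    }
};
```

With this model, a click handler that calls Focus() and then appends "b" leaves the status at "ba", while one that calls FireOnFocus() and then appends "c" leaves it at "ac".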

Thursday, June 19, 2008

IHTMLDocument3::getElementsByTagName and IHTMLElementCollection

A common task when writing Internet Explorer extensions is to browse a collection of objects based on a specified element tag-name. To work with collections, IE provides the IHTMLElementCollection interface, which represents a collection of elements in an HTML document.

Usually a collection is retrieved by calling methods of the IHTMLDocument2 interface. For some tag-names there are specialized methods to retrieve the collection (IHTMLDocument2::get_anchors, IHTMLDocument2::get_applets, IHTMLDocument2::get_forms, IHTMLDocument2::get_images, IHTMLDocument2::get_links, IHTMLDocument2::get_scripts).

To get all elements collection there is IHTMLDocument2::get_all.
One way to get a collection of elements having a specified tag-name is:
// CComQIPtr<IHTMLDocument2> spDocument is a document object.

CComQIPtr<IHTMLElementCollection> spAllCollection;
HRESULT hRes = spDocument->get_all(&spAllCollection);
_ASSERTE(SUCCEEDED(hRes) && (spAllCollection != NULL));

// Get the sub-collection of elements that have the "input" tag name.
CComVariant varTagName(CComBSTR("input"));
CComQIPtr<IDispatch> spDispCollection;
hRes = spAllCollection->tags(varTagName, &spDispCollection);

CComQIPtr<IHTMLElementCollection> spInputCollection = spDispCollection;
_ASSERTE(spInputCollection != NULL);

// Now you can browse spInputCollection using
// IHTMLElementCollection::item and IHTMLElementCollection::get_length methods.

The second method is:
// CComQIPtr<IHTMLDocument2> spDocument is a document object.
// Query for IHTMLDocument3 interface
CComQIPtr<IHTMLDocument3> spDoc3 = spDocument;
_ASSERTE(spDoc3 != NULL);

// Get the collection of elements that have the "input" tag name.
CComQIPtr<IHTMLElementCollection> spInputCollection;
HRESULT hRes = spDoc3->getElementsByTagName(CComBSTR("input"), &spInputCollection);

// Now you can browse spInputCollection using
// IHTMLElementCollection::item and IHTMLElementCollection::get_length methods.

Monday, May 19, 2008

ATL thunks and Windows DEP story

When using older ATL versions the program may generate an access violation due to Data Execution Prevention. This is basically a memory protection feature that prevents executing code from memory pages marked as non-executable.

The ATL window implementation (CWindowImpl) uses a technique called thunking. A thunk is a small piece of code that ATL generates in a region of memory allocated on the heap. Older versions of ATL did not set the execute flag on the allocated memory pages, which causes a crash on Windows machines where DEP is enabled.

ATL is commonly used in Internet Explorer extensions. If DEP is enabled, the result is a browser crash. This is a good reason to upgrade to Visual Studio 2005 or later. Find more on how to activate DEP in IE7 here.

Monday, April 21, 2008

ATL + STL = CAdapt

I didn't know about CAdapt class until I tried to convert a VS2003 project to VS2005. Everything worked OK but I couldn't compile it! The compiler complained about:
std::list<CComQIPtr<IHTMLElement> >.
I found that what the VC++ 2003 compiler accepted is rejected by the VC++ 2005 compiler. The reason is the address-of operator overloaded by the CComQIPtr class: std::list needs the address of a CComQIPtr object, but the & operator returns the address of the wrapped IHTMLElement* pointer instead.

Here's where the CAdapt class comes to the rescue. The list of smart pointers becomes: std::list<CAdapt<CComQIPtr<IHTMLElement> > >
You also need to go through the m_T member wherever the wrapped smart pointer itself is needed.
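
The idea behind the adapter can be sketched without ATL. In the portable C++ below, SmartPtr, Adapt, Element and SumIds are hypothetical stand-ins for CComQIPtr, CAdapt, IHTMLElement and user code; it is a simplified illustration of the technique, not ATL's actual implementation.

```cpp
#include <cassert>
#include <list>

// Stand-in for a COM-style smart pointer whose operator& is overloaded
// to return the address of the wrapped raw pointer.
template <typename T>
class SmartPtr
{
public:
    SmartPtr(T* p = nullptr) : p_(p) {}
    T** operator&() { return &p_; }      // hides the address of the SmartPtr itself
    T* operator->() const { return p_; }
    T* get() const { return p_; }
private:
    T* p_;
};

// Simplified adapter in the spirit of CAdapt: it stores the wrapped object
// in a public m_T member and does not overload operator&.
template <typename T>
struct Adapt
{
    Adapt(const T& t) : m_T(t) {}
    operator T&() { return m_T; }
    operator const T&() const { return m_T; }
    T m_T;
};

struct Element { int id; };

// The wrapped smart pointer is reached through m_T.
int SumIds(std::list<Adapt<SmartPtr<Element> > >& elements)
{
    int sum = 0;
    for (auto& item : elements)
    {
        assert(item.m_T.get() != nullptr);
        sum += item.m_T->id;
    }
    return sum;
}
```

Because Adapt leaves operator& alone, the container always sees the real address of each stored object, whatever the wrapped type does with that operator.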

Tuesday, April 15, 2008

Did you know that? fatal error C1091: compiler limit

It's not every day that you come across a new compiler error message. Here's one I didn't expect:

fatal error C1091: compiler limit: string exceeds 65535 bytes in length

I was about to complete my programming task when this error struck, and I really needed a very long string. On the VC++ 2003 compiler the limit is even smaller: 16 KB.

The conclusion of this story: don't put the whole story of your life inside a C++ string constant!
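
If a program really does need a string longer than the limit, one possible workaround is to keep the text in several smaller literals and join them at run time. A minimal sketch in portable C++; kChunks and JoinChunks are hypothetical names and the chunk contents are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Each piece is a separate literal that stays well below the compiler limit.
// The contents here are placeholders.
static const char* const kChunks[] = {
    "first part of a very long text, ",
    "second part, ",
    "third part."
};

// Joins the pieces at run time into a single std::string.
std::string JoinChunks(const char* const* chunks, std::size_t count)
{
    std::string result;
    for (std::size_t i = 0; i < count; ++i)
    {
        assert(chunks[i] != nullptr);
        result += chunks[i];
    }
    return result;
}
```

The joined text exists only at run time, so no single string constant has to exceed the limit; loading the text from a resource or a file is another option.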

Monday, March 31, 2008

.Net and COM interop story

.Net allows programmers to reuse COM components in their managed code. To make this possible a managed wrapper object around the native object is needed. Besides that, one can use the COM object like any other managed object. Even if it sounds simple, you have to be aware of the differences between the CLR's object lifetime management and the COM version of object lifetime management.

COM programmers have to call Release on every interface that has been AddRef'ed. For C# programmers using COM objects that means AddRef is called when:
- a COM object is created.
- a COM object is returned by calling a method or a property.
- a COM object is cast to another COM interface type.

To release a COM object in C# there are two options:
- leave the GC to collect managed wrappers and to call their finalizers that will call Release on native COM object.
- manually call Marshal.ReleaseComObject on every interface used in the code.

Let's see a short example using COM objects exposed by IE. The code below changes the color of every link in an HTML document.

// IHTMLDocument2 doc;
foreach (IHTMLElement elem in doc.all)
{
    IHTMLAnchorElement anchor = elem as IHTMLAnchorElement;
    if (anchor != null)
    {
        elem.style.color = "red";
    }
}

This first approach leaves the task of releasing COM objects to the garbage collector. Let's manually release COM objects now:

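// IHTMLDocument2 doc;
IHTMLElementCollection allCollection = doc.all;
foreach (IHTMLElement crntElem in allCollection)
{
    IHTMLAnchorElement anchor = crntElem as IHTMLAnchorElement;
    if (anchor != null)
    {
        IHTMLStyle style = crntElem.style;
        style.color = "red";
        Marshal.ReleaseComObject(style);
    }

    Marshal.ReleaseComObject(crntElem);
}

Marshal.ReleaseComObject(allCollection);
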
As you can see, the number of code lines doubles! I personally prefer to leave the task of releasing COM objects to the GC, even though they are then released only some time later, when the GC kicks in.

Some might be tempted to call GC.Collect after a large chunk of code that works with COM objects, but this could be even worse: other managed objects could be promoted to the next GC generation, so their lifespan becomes longer than necessary.

In theory it is possible to create a lot of large COM objects that exhaust the native heap while the managed heap still has plenty of available memory, because the managed wrappers are much smaller. The GC won't be triggered in this scenario, so the native heap won't be freed.

If your application suffers from this kind of memory allocation problem, maybe using COM objects from managed code is not the best approach for you.

Monday, February 25, 2008

The game of programming. Programming the game.

The first computer I ever saw was a Z80 Spectrum. That was back in 1988. Like any kid my age, I was amazed by computer games. I started to learn programming in the hope that one day I would create my own game.

That happened many years later, in 1999, and here's the result. I wrote these two little games just to learn some Java. Looking back at this old code of mine now, it's nice to see that I wasn't that bad after all :-)


Saturday, February 02, 2008

When IHTMLWindow2.document throws UnauthorizedAccessException

This is basically a C# translation of one of my older articles, "When IHTMLWindow2::get_document returns E_ACCESSDENIED". Some .Net people had difficulty using it, so I decided to make their lives easier.

The main problem is the confusion created by the System.IServiceProvider .Net interface, because it has the same name as the COM interface. Once past this issue, the code translation is straightforward. Here's the interop code to declare the COM interface IServiceProvider.
// This is the COM IServiceProvider interface, not System.IServiceProvider .Net interface!
[ComImport(), ComVisible(true), Guid("6D5140C1-7436-11CE-8034-00AA006009FA"),
InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)]
public interface IServiceProvider
{
    [return: MarshalAs(UnmanagedType.I4)][PreserveSig]
    int QueryService(ref Guid guidService, ref Guid riid, [MarshalAs(UnmanagedType.Interface)] out object ppvObject);
}

You can find the full source code of the sample assembly here.

This technique was successfully implemented and tested in Twebst web automation library.

Wednesday, January 09, 2008

NUnit and STAThread story

I use the NUnit unit-testing framework to test my pet project Twebst. Being a collection of COM objects, Twebst can be used within any environment that supports COM. That means it can be used from .Net languages like C#.

I started by creating an assembly to be used from the NUnit GUI. Some tests failed for no obvious reason. After some research I understood that the tests must run on a thread in a single-threaded COM apartment (STAThread). The apartment model must be set before the thread is started, but I don't have access to the NUnit GUI main thread from my assembly.

One possible solution to this problem is to transform the assembly into an EXE application that uses the NUnit framework like this:

[STAThread]
public static void Main(string[] args)
{
    // Run the tests in this EXE through the NUnit 2.x console runner.
    NUnit.ConsoleRunner.Runner.Main(
        new string[] { System.AppDomain.CurrentDomain.BaseDirectory + "MyExe.exe", "/nothread" });
}

When the /nothread command line flag is used, the tests are executed by the main thread, which already has the right COM apartment set.