Monday, January 05, 2009

WSH and clipboard access

I did some Windows Script Host programming recently and was pleasantly surprised by its power, features and flexibility. One thing I couldn't accomplish was accessing the clipboard from WSH. Searching the internet, I found some solutions, like this one based on Internet Explorer automation. There are several problems with this approach, as you can read in my article about Internet Explorer automation: What's wrong with Internet Explorer Automation?

My solution for scripting the clipboard content in WSH is a regular COM object created with VC++ and ATL.

Download full source code and compiled DLL: WSH_clipboard.zip
To install the COM object, run register.bat.

I found scripting the clipboard useful enough to add this feature to the next release of Twebst Web Automation Library.

Monday, December 15, 2008

Free Web Macros for Internet Explorer

As I presented in my previous post, automating Internet Explorer can be a difficult task.
Twebst Web Automation Library can make things easier.

It gives full programmatic control over the Internet Explorer browser. Twebst is a library of COM objects that can be used within any environment that supports COM, from scripting languages (JScript, VBScript) to high-level programming languages (C#, C++). For more information, see Twebst Library Online Documentation. And yes, it's free!

Get it FREE!

What can Twebst do?

  • increase productivity by automating repetitive web tasks
  • automate regression testing of web applications
  • automate web actions and data-entry
  • automatically log in to different web sites
  • fill out web-forms automatically
  • extract data from web pages (web scraping).
  • monitor web pages

Twebst features

  • Start new browsers and navigate to a specified URL.
  • Connect to existing browsers.
  • Search and access HTML elements and frames inside browsers.
  • Intuitive names for HTML elements using the text that appears on the screen.
  • Advanced search of browsers and HTML elements using regular expressions.
  • Perform actions on all HTML controls (button, combo-box, list-box, edit-box etc).
  • Simulate user behavior by generating hardware or browser events.
  • Get access to native interfaces exposed by Internet Explorer so you don't need to learn new things if you already know IE web programming.
  • Synchronize web actions and navigation by waiting for the page to complete within a specified timeout.
  • Available from any programming or scripting language that supports COM.
  • Optimized search methods and collections.

Wednesday, December 10, 2008

What's wrong with Internet Explorer Automation?

The Microsoft Office products (Word, Excel, PowerPoint, Access, Outlook) allow their users to manipulate Office documents from Visual Basic or Visual Basic for Applications (VBA) code. It is possible, for instance, to write a VBA macro in Excel that initializes a series of cells and uses them to display a chart.

Automation is the process of controlling one product from another product with the result that the client product can use the objects, methods, and properties of the server product. The client has access to the object model of the server.

Though the Internet Explorer browser is not part of the Office suite, it supports automation too. Here is a short sample:

// Create an IE automation object.
var ie = new ActiveXObject("InternetExplorer.Application");

// Make it visible and navigate to a given URL.
ie.Visible = true;
ie.Navigate("http://www.google.com/");

// Give it some time to load the page and then get the document.
WScript.Sleep(3000);
var doc = ie.Document;

// Fill out search field.
var edit = doc.getElementsByName("q").item(0);
edit.value = "codecentrix";

// ... and press the submit button.
var submit = doc.getElementsByName("btnG").item(0);
submit.click();


Here is ie_auto.js file for download.
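The fixed three-second sleep in the sample above is fragile: the page may load faster or slower than that. Here is a minimal sketch of an alternative, assuming the Busy and ReadyState properties of the InternetExplorer.Application object are enough to tell when the page has loaded. The helper names waitUntil and isBrowserReady are hypothetical, introduced only for this illustration, and the WSH-specific part runs only when WScript is available.

```javascript
// Polls isReady() until it returns true or timeoutMs has been spent sleeping.
// The sleep function is injected so the helper is not tied to WScript.Sleep.
function waitUntil(isReady, timeoutMs, stepMs, sleep) {
    var waited = 0;
    while (!isReady()) {
        if (waited >= timeoutMs) {
            return false; // gave up: the condition never became true
        }
        sleep(stepMs);
        waited += stepMs;
    }
    return true;
}

// Hypothetical helper: treats the browser as ready when it is not busy
// and its ReadyState is READYSTATE_COMPLETE (4).
function isBrowserReady(ie) {
    var READYSTATE_COMPLETE = 4;
    return !ie.Busy && ie.ReadyState == READYSTATE_COMPLETE;
}

// WSH-only usage; skipped in any other JavaScript host.
if (typeof WScript !== "undefined") {
    var ie = new ActiveXObject("InternetExplorer.Application");
    ie.Visible = true;
    ie.Navigate("http://www.google.com/");

    var loaded = waitUntil(
        function () { return isBrowserReady(ie); },
        30000,
        100,
        function (ms) { WScript.Sleep(ms); });

    WScript.Echo(loaded ? "Page loaded." : "Timed out waiting for the page.");
}
```

Polling like this only covers the top-level document; pages that keep loading content after the browser reports complete would still need extra checks.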
However, there are problems with Internet Explorer automation:
  • it may not work at all on Windows Vista unless the script runs at the same integrity level as the iexplore.exe process. Simply double-clicking the .js file won't do it: the script runs at medium integrity level while Internet Explorer runs at low integrity level, so the script fails. If you run the script at high integrity level, the newly started IE instance gets the same high integrity level and the script works (but this is not the best option from a security point of view). Changing the integrity level of the running script (or application) is not always desirable or easy.
  • no support to "connect" to already existing IE documents.
  • difficult search of elements across all sub-documents inside frames/iframes (and sometimes impossible, see the point above).
  • difficult and time-consuming search of HTML elements on attributes other than id or name (getElementById and getElementsByName are the only methods I know that search elements directly without browsing element collections, which might be very slow when performed out of process).
  • no direct support for synchronizing input actions (clicks, keys) with the HTML document loading (it could be implemented by registering for IE events like document complete or by looping until the browser becomes ready to accept input).
  • no advanced search criteria like regular expression or searching on multiple attributes.
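To make the search limitations concrete, here is a rough sketch of the manual workaround: walk an element collection and test an attribute against a regular expression. It assumes only that the collection exposes length and item(i) and that elements expose getAttribute, as IE element collections do. The findByAttribute name is hypothetical. When run out of process, every item() and getAttribute() call crosses the process boundary, which is why this approach gets slow on large pages.

```javascript
// Returns the elements of an IE-style collection whose attribute value
// matches the given regular expression. Pass a pattern without the "g"
// flag, because test() on a global pattern keeps state between calls.
function findByAttribute(collection, attrName, pattern) {
    var matches = [];
    for (var i = 0; i < collection.length; i++) {
        var elem = collection.item(i);
        var value = elem.getAttribute(attrName);
        if (value != null && pattern.test(String(value))) {
            matches.push(elem);
        }
    }
    return matches;
}

// WSH-only usage; skipped in any other JavaScript host.
// Assumes the page has already finished loading.
if (typeof WScript !== "undefined") {
    var browser = new ActiveXObject("InternetExplorer.Application");
    browser.Visible = true;
    browser.Navigate("http://www.google.com/");
    WScript.Sleep(3000);

    var links = browser.Document.getElementsByTagName("a");
    var found = findByAttribute(links, "href", /google/i);
    WScript.Echo("Matching links: " + found.length);
}
```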
If you are interested in solving the issues above, let me introduce a project I've been working on for some time now. Here's Twebst, a web automation library for Internet Explorer!

Get it FREE!


(to be continued)

Tuesday, November 25, 2008

Creating shortcuts to Quick Launch Toolbar with WSH

I ran into the problem of creating shortcuts on the Quick Launch Toolbar while working on the Script Of The Day application. This small product is almost entirely created using JScript and Windows Script Host (WSH).

The SpecialFolders method of the WScript.Shell object provides the full path for some special folders like Desktop and Favorites, but the directory for the Quick Launch Toolbar is not supported. To get it I used the %userprofile% environment variable like this:
var shell          = WScript.CreateObject("WScript.Shell");
var quickLaunchDir = shell.ExpandEnvironmentStrings("%userprofile%") +
"\\Application Data\\Microsoft\\Internet Explorer\\Quick Launch";
var oShellLink = shell.CreateShortcut(quickLaunchDir + "\\Codecentrix.lnk");

oShellLink.TargetPath = "http://www.codecentrix.com/";
oShellLink.IconLocation = "http://www.codecentrix.com/favicon.ico";
oShellLink.WindowStyle = 1;
oShellLink.Description = "Web Site";
oShellLink.Save();
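The snippet above hardcodes the "Application Data" folder name under the user profile. As a sketch of a variation, the %APPDATA% environment variable already points at the per-user application data folder, so the path can be built from it instead. The quickLaunchDir helper is a hypothetical name introduced here for illustration.

```javascript
// Builds the Quick Launch directory path from the application data folder.
// Trailing backslashes are stripped so the result has no doubled separator.
function quickLaunchDir(appDataDir) {
    var base = String(appDataDir).replace(/\\+$/, "");
    return base + "\\Microsoft\\Internet Explorer\\Quick Launch";
}

// WSH-only usage; skipped in any other JavaScript host.
if (typeof WScript !== "undefined") {
    var wshShell = WScript.CreateObject("WScript.Shell");
    var appData = wshShell.ExpandEnvironmentStrings("%APPDATA%");
    WScript.Echo(quickLaunchDir(appData));
}
```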
Downloads:

Saturday, August 09, 2008

focus vs fireEvent("onfocus")

While working on Twebst web automation library I encountered this problem: how to simulate setting the focus on HTML edit controls in Internet Explorer? There are two ways to do this.

  1. Call the IHTMLElement2::focus() method on the target element, which "causes the element to receive the focus and executes the code specified by the onfocus event".
  2. Raise the onfocus event on the target element by calling the IHTMLElement3::fireEvent() method.

The two approaches are quite similar but there are some interesting differences.

  1. fireEvent("onfocus") does not actually set the focus on the element; it just executes the code of the onfocus event handler.
  2. Calling the focus method sets the focus on the target element and calls the onfocus event handler, but not immediately. The onfocus event seems to be inserted in a queue and its handler is executed asynchronously after the current handler has finished.
  3. If the focus method is called from inside the onfocus handler, nothing happens if the control already has the focus (that prevents infinite recursion).

Example:


<html>
<script type="text/javascript" language="javascript">
function BtnFocusClick()
{
     document.getElementById('editTest').focus();
     window.status += "b";
}

function BtnOnFocusClick()
{
     document.getElementById('editTest').fireEvent('onfocus');
     window.status += "c";
}

function EditOnFocus()
{
     window.status += "a";
}
</script>

<body>
     <input type="text" onfocus="EditOnFocus();" id="editTest"/><br/>
     <input type="button" value="focus" id="btnFocus" onclick="BtnFocusClick();"/>
     <input type="button" value="fire onfocus" id="btnOnFocus" onclick="BtnOnFocusClick();"/>
</body>
</html>

Pressing the "fire onfocus" button produces the expected message in the Internet Explorer status bar: "ac". Pressing the "focus" button produces the message in the reverse of the expected order: "ba". That suggests the EditOnFocus handler is called after BtnFocusClick exits.
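The ordering can be reproduced with a toy model. This is not IE code and makes no claim about how IE is implemented. It only assumes, as the observation above suggests, that fireEvent runs the handler synchronously while focus queues it until the current handler returns. All names in the sketch are hypothetical.

```javascript
// Toy model of the two behaviors; it does not touch any browser API.
var statusText = "";
var pending = [];   // handlers queued to run after the current handler returns

function editOnFocus() { statusText += "a"; }

// Models fireEvent("onfocus"): the handler runs immediately.
function fireEventModel(handler) { handler(); }

// Models focus(): the handler is queued, not run.
function focusModel(handler) { pending.push(handler); }

function btnFocusClick() {
    focusModel(editOnFocus);
    statusText += "b";
}

function btnOnFocusClick() {
    fireEventModel(editOnFocus);
    statusText += "c";
}

// Runs one click handler, then drains the queue, as an event loop would.
function simulate(clickHandler) {
    statusText = "";
    pending = [];
    clickHandler();
    while (pending.length > 0) {
        pending.shift()();
    }
    return statusText;
}
```

In this model simulate(btnOnFocusClick) yields "ac" and simulate(btnFocusClick) yields "ba", matching what the status bar shows.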


Thursday, June 19, 2008

IHTMLDocument3::getElementsByTagName and IHTMLElementCollection

A common task when writing Internet Explorer extensions is to browse a collection of objects based on a specified element tag-name. To work with collections IE provides IHTMLElementCollection interface that represents a collection of elements in an HTML document.

Usually a collection is retrieved by calling methods of the IHTMLDocument2 interface. For some tag-names there are specialized methods to retrieve the collection (IHTMLDocument2::get_anchors, IHTMLDocument2::get_applets, IHTMLDocument2::get_forms, IHTMLDocument2::get_images, IHTMLDocument2::get_links, IHTMLDocument2::get_scripts).

To get all elements collection there is IHTMLDocument2::get_all.
One way to get a collection of elements having a specified tag-name is:
// CComQIPtr<IHTMLDocument2> spDocument is a document object.

CComQIPtr<IHTMLElementCollection> spAllCollection;
HRESULT hRes = spDocument->get_all(&spAllCollection);
_ASSERTE(SUCCEEDED(hRes) && (spAllCollection != NULL));

// Get the sub-collection of elements that have the "input" tag name.
CComVariant varTagName(CComBSTR("input"));
CComQIPtr<IDispatch> spDispCollection;
hRes = spAllCollection->tags(varTagName, &spDispCollection);

CComQIPtr<IHTMLElementCollection> spInputCollection = spDispCollection;
_ASSERTE(spInputCollection != NULL);

// Now you can browse spInputCollection using
// IHTMLElementCollection::item and IHTMLElementCollection::get_length methods.

The second method is:
// CComQIPtr<IHTMLDocument2> spDocument is a document object.
// Query for IHTMLDocument3 interface
CComQIPtr<IHTMLDocument3> spDoc3 = spDocument;
_ASSERTE(spDoc3 != NULL);

// Get the collection of elements that have the "input" tag name.
CComQIPtr<IHTMLElementCollection> spInputCollection;
HRESULT hRes = spDoc3->getElementsByTagName(CComBSTR("input"), &spInputCollection);
_ASSERTE(SUCCEEDED(hRes));

// Now you can browse spInputCollection using
// IHTMLElementCollection::item and IHTMLElementCollection::get_length methods.


Monday, May 19, 2008

ATL thunks and Windows DEP story

When using older ATL versions the program may generate an access violation due to Data Execution Prevention. This is basically a memory protection feature that prevents executing code from memory pages marked as non-executable.

The ATL implementation of the CWindow class uses a technique called thunking. A thunk is a small piece of code that ATL generates in a region of memory allocated on the heap. Older versions of ATL did not set the execute flag on the allocated memory pages, which causes a crash on Windows machines where DEP is enabled.

Usually ATL is used in Internet Explorer extensions. If DEP is enabled then the result is a browser crash. This is a good reason to upgrade to Visual Studio 2005 or later. Find more on how to activate DEP in IE7 here.

Monday, April 21, 2008

ATL + STL = CAdapt

I didn't know about CAdapt class until I tried to convert a VS2003 project to VS2005. Everything worked OK but I couldn't compile it! The compiler complained about:
std::list<CComQIPtr<IHTMLElement> >.
I found that what was accepted by the VC++ 2003 compiler is not accepted by the VC++ 2005 compiler. The reason is the address-of operator overloaded by the CComQIPtr class. std::list needs the address of a CComQIPtr object, but operator& returns the address of the wrapped IHTMLElement* pointer instead.

This is where the CAdapt class comes to the rescue. The list of smart pointers becomes: std::list<CAdapt<CComQIPtr<IHTMLElement> > >
You also need to go through the m_T member wherever the wrapped smart pointer is needed.

Tuesday, April 15, 2008

Did you know that ? fatal error C1091: compiler limit

It doesn't happen every day to find a new compiler error message. Here's one I didn't expect:

fatal error C1091: compiler limit: string exceeds 65535 bytes in length

I was about to complete my programming task when this error struck, and I really needed a very long string. On the VC++ 2003 compiler the limit is even smaller: 16 KB.

The conclusion of this story: don't put the whole story of your life inside a C++ string constant!

Monday, March 31, 2008

.Net and COM interop story

.Net allows programmers to reuse COM components in their managed code. To make this possible a managed wrapper object around the native object is needed. Besides that, one can use the COM object like any other managed object. Even if it sounds simple, you have to be aware of the differences between the CLR's object lifetime management and the COM version of object lifetime management.

COM programmers have to call Release on every interface that has been AddRef'ed. For C# programmers using COM objects, that means AddRef is called when:
- a COM object is created.
- a COM object is returned by calling a method or a property.
- a COM object is cast to another COM interface type.

To release a COM object in C# there are two options:
- let the GC collect the managed wrappers and run their finalizers, which call Release on the native COM object.
- manually call Marshal.ReleaseComObject on every interface used in the code.

Let's see a short example using COM objects exposed by IE. The code below changes the color of every link in an HTML document.

// IHTMLDocument2 doc;
foreach (IHTMLElement elem in doc.all)
{
    IHTMLAnchorElement anchor = elem as IHTMLAnchorElement;
    if (anchor != null)
    {
        elem.style.color = "red";
    }
}
This first approach leaves the task of releasing COM objects to the garbage collector. Let's manually release COM objects now:

// IHTMLDocument2 doc;
IHTMLElementCollection allCollection = doc.all;
foreach (IHTMLElement crntElem in allCollection)
{
    IHTMLAnchorElement anchor = crntElem as IHTMLAnchorElement;
    if (anchor != null)
    {
        IHTMLStyle style = crntElem.style;
        style.color = "red";

        Marshal.ReleaseComObject(style);
        Marshal.ReleaseComObject(anchor);
    }

    Marshal.ReleaseComObject(crntElem);
}

Marshal.ReleaseComObject(allCollection);

As you can see, the number of code lines doubles! I personally prefer to leave the task of releasing COM objects to the GC, even though they will only be released some time later, when the GC comes into action.

Some might be tempted to call GC.Collect after a large chunk of code that works with COM objects, but this could be even worse because other managed objects could be promoted to the next GC generation, making their lifespan longer than necessary.

In theory it is possible to create a lot of large COM objects that will exceed the native heap while the managed heap has a lot of available memory because managed wrappers are smaller in size. GC won't be called in this scenario so the native heap won't be freed.

If your application suffers from this kind of memory allocation problem, maybe using COM objects from managed code is not the best approach for you.

Monday, February 25, 2008

The game of programming. Programming the game.

The first computer I've ever seen was a Z80 Spectrum. That was back in 1988. Like any kid of my age I was amazed by computer games. I started to learn programming with the hope that one day I will create my own game.

That happened many years later, in 1999, and here's the result. I wrote these two little games just to learn some Java. Looking back now at this old code of mine, it's nice to see that I wasn't that bad after all :-)

Source code:
Click on images below to play!


Saturday, February 02, 2008

When IHTMLWindow2.document throws UnauthorizedAccessException

This is basically a C# translation of one of my older articles, "When IHTMLWindow2::get_document returns E_ACCESSDENIED". Some .Net people had difficulty using it, so I decided to make their life easier.

The main problem is the confusion created by the System.IServiceProvider .Net interface, because it has the same name as the COM interface. Once past this issue, the code translation is straightforward. Here's the interop code to declare the COM interface IServiceProvider.
// This is the COM IServiceProvider interface, not System.IServiceProvider .Net interface!
[ComImport(), ComVisible(true), Guid("6D5140C1-7436-11CE-8034-00AA006009FA"),
 InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)]
public interface IServiceProvider
{
    [return: MarshalAs(UnmanagedType.I4)][PreserveSig]
    int QueryService(ref Guid guidService, ref Guid riid, [MarshalAs(UnmanagedType.Interface)] out object ppvObject);
}
You can find the full source code of the sample assembly here.

This technique was successfully implemented and tested in Twebst web automation library.

Wednesday, January 09, 2008

NUnit and STAThread story

I use NUnit unit-testing framework to test my pet project Twebst. Being a collection of COM objects, Twebst can be used within any environment that supports COM. That means it can be used from .Net languages like C#.

I started by creating an assembly to be used from the NUnit GUI. Some tests failed for no obvious reason. After some research I understood that the tests must run in a single-threaded COM apartment (STA). The threading model must be set before the thread is started, but I don't have access to the NUnit GUI main thread from my assembly.

One possible solution to this problem is to transform the assembly into an EXE application that uses the NUnit framework like this:

[STAThread]
public static void Main(string[] args)
{
    NUnit.ConsoleRunner.Runner.Main(
        new string[] { System.AppDomain.CurrentDomain.BaseDirectory + "MyExe.exe", "/nothread" });
}
When the /nothread command line flag is used, the tests are executed by the main thread, which already has the right COM apartment set.

Thursday, December 20, 2007

Repairing Internet Explorer

Creating Internet Explorer extensions is not always an easy task. Erroneous extensions can make IE crash, become unstable or behave unpredictably.

Even worse, uninstalling extensions can leave IE in a broken state where some things stop working properly. This usually happens when IE components are unregistered, or when IE registry keys are altered and not properly restored on uninstall. Here's a short list of damages I personally encountered:

1). 'Open in New Window' Command Does Not Work in Internet Explorer

I'll come back later with examples of how an extension can break IE and how to avoid it. For now let's concentrate on how we can fix it. Download the RepairIE.zip file, extract the files inside and run "fixie.cmd".

Points of interest:
  • Internet Explorer components are registered using regsvr32.exe tool.
  • For IE7, mshtml.dll cannot be registered as explained above; instead, mshtml.tlb is registered using the regtlib.exe tool.
  • Registering shdocvw.dll using regsvr32.exe is not a good idea on IE7 because it damages the [HKEY_CLASSES_ROOT\Typelib\{EAB22AC0-30C1-11CF-A7EB-0000C05BAE0B}\1.1\0\win32] registry key.
  • reg_ieframe.reg file fixes the broken key above.

Sunday, December 09, 2007

Allow local scripted HTML files to run in IE7

While running automated tests for the Twebst library, I was very annoyed by IE7 refusing to properly open local HTML files that contain scripts. The following message is displayed:

"To help protect your security, Internet Explorer has restricted this web page from running scripts or ActiveX controls that could access your computer. Click here for options..."

It took me several minutes to find the hidden option that turns off this warning. First I looked for it in Security tab options but it was actually in Advanced tab. Here's how you find it:

1). Go to:
Tools > Internet Options > Advanced > Security
2). Check:
Allow active content to run in files on My computer
3). Restart IE





Wednesday, November 14, 2007

How to get a handle to current TabWindowClass tab in IE7

The IWebBrowser2::get_HWND method gets the handle of the Internet Explorer 7 main window. Sometimes the tab window handle is needed. Here's some sample code that I recently found in MSDN. It shows how to get the handle of the tab window starting from an IWebBrowser2 object.
#include <atlbase.h>   // CComQIPtr
#include <exdisp.h>    // IWebBrowser2
#include <shlguid.h>   // SID_SShellBrowser

HWND GetTabWnd(CComQIPtr<IWebBrowser2> spBrowser)
{
    HWND hwndTab = NULL;
    CComQIPtr<IServiceProvider> spServiceProvider = spBrowser;

    if (spServiceProvider != NULL)
    {
        // Ask the shell browser service for its IOleWindow interface.
        CComQIPtr<IOleWindow> spWindow;
        if (SUCCEEDED(spServiceProvider->QueryService(
                SID_SShellBrowser,
                IID_IOleWindow,
                (void**)&spWindow)))
        {
            spWindow->GetWindow(&hwndTab);
        }
    }

    return hwndTab;
}
I think the code is supposed to work on top level IWebBrowser2 objects. You can read more about top browser objects in my previous article.

This technique was successfully implemented and tested in Twebst web automation library.

Monday, November 12, 2007

When IWebBrowser2::get_HWND returns E_FAIL

IWebBrowser2::get_HWND "gets the handle of the Microsoft Internet Explorer main window". Like any COM method, get_HWND returns an HRESULT value. According to MSDN, the method "returns S_OK if successful, or an error value otherwise".

It was hard for me to imagine how this method could fail, but I still got an E_FAIL return value. This happened because the IWebBrowser2 object was not the top-level browser. A web page containing frames/iframes is represented by a hierarchy of IHTMLWindow2 objects. Each window has an associated IHTMLDocument2 object exposed by IHTMLWindow2::get_document. An IHTMLWindow2 can also be converted to an IWebBrowser2 object. Here's my solution to get the main window handle starting from a non-top-level browser object (this is a common scenario when adding your custom menu item to the IE context menu).

// IHTMLWindow2 to IWebBrowser2
CComQIPtr<IWebBrowser2> IHTMLWindow2ToIWebBrowser2(CComQIPtr<IHTMLWindow2> spHTMLWindow)
{
    ATLASSERT(spHTMLWindow != NULL);

    // Query for a service provider.
    CComQIPtr<IWebBrowser2> spBrowser;
    CComQIPtr<IServiceProvider> spServiceProvider = spHTMLWindow;

    if (spServiceProvider != NULL)
    {
        // Ask the service provider for a IWebBrowser2 object.
        spServiceProvider->QueryService(IID_IWebBrowserApp, IID_IWebBrowser2, (void**)&spBrowser);
    }

    return spBrowser;
}

// IWebBrowser2 to IHTMLWindow2
CComQIPtr<IHTMLWindow2> IWebBrowserToIHTMLWindow(CComQIPtr<IWebBrowser2> spBrowser)
{
    ATLASSERT(spBrowser != NULL);
    CComQIPtr<IHTMLWindow2> spWindow;

    // Get the document of the browser.
    CComQIPtr<IDispatch> spDisp;
    spBrowser->get_Document(&spDisp);

    // Get the window of the document.
    CComQIPtr<IHTMLDocument2> spDoc = spDisp;
    if (spDoc != NULL)
    {
        spDoc->get_parentWindow(&spWindow);
    }

    return spWindow;
}


CComQIPtr<IWebBrowser2> TopBrowser(CComQIPtr<IWebBrowser2> spBrowser)
{
    ATLASSERT(spBrowser != NULL);

    // Retrieve IHTMLWindow2 from browser.
    CComQIPtr<IHTMLWindow2> spHTMLWnd = IWebBrowserToIHTMLWindow(spBrowser);
    if (spHTMLWnd != NULL)
    {
        // Find top window.
        CComQIPtr<IHTMLWindow2> spTopWindow;
        HRESULT hResult = spHTMLWnd->get_top(&spTopWindow);

        if (SUCCEEDED(hResult) && (spTopWindow != NULL))
        {
            // Convert the top window back to a browser object.
            return IHTMLWindow2ToIWebBrowser2(spTopWindow);
        }
    }

    return CComQIPtr<IWebBrowser2>();
}

This technique was successfully implemented and tested in Twebst web automation library.

Monday, November 05, 2007

How to programmatically find if XP theme is active?

Registry key:
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\ThemeManager

Value:
ThemeActive is "1" when the Windows XP theme is active and "0" when Windows Classic is used.
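A minimal WSH sketch of reading that value, assuming the RegRead method of WScript.Shell and the HKCU abbreviation it accepts for HKEY_CURRENT_USER. The helper names are hypothetical, and treating a missing value as "Windows Classic" is my assumption, not something the registry key guarantees.

```javascript
var THEME_ACTIVE_VALUE =
    "HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\ThemeManager\\ThemeActive";

// Interprets the raw registry value: "1" means a theme is active.
function isThemeActiveValue(value) {
    return String(value) === "1";
}

// Reads the value through a WScript.Shell-like object.
// RegRead throws when the value does not exist; assume Classic in that case.
function isXpThemeActive(shell) {
    try {
        return isThemeActiveValue(shell.RegRead(THEME_ACTIVE_VALUE));
    } catch (e) {
        return false;
    }
}

// WSH-only usage; skipped in any other JavaScript host.
if (typeof WScript !== "undefined") {
    var themeShell = WScript.CreateObject("WScript.Shell");
    WScript.Echo(isXpThemeActive(themeShell) ? "Windows XP theme" : "Windows Classic");
}
```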

Sunday, October 28, 2007

Open TextRange selection as URL

On many forums, people post URLs as plain text and this forces me to:
  • select the text URL
  • ctrl+C to copy it in the clipboard
  • ctrl+T to open a new tab
  • paste the URL in the address bar of the newly created tab
  • press "Enter" key to start navigation

I found this whole process very annoying, and it seems I have to go through it many times a day. Here's my attempt to automate it by creating an extension for Internet Explorer similar to the Linkification extension (for the Firefox browser). I called my extension "Make Link".

How it works:

  • right click on text URL
  • choose "MakeLink: use clicked text as link" menu item (the URL is opened in a new tab automatically)

or you can

  • fully select the URL text or only a part of it
  • right click on selected text
  • choose "MakeLink: use clicked text as link" menu item (the text link is opened in a new tab automatically)

DOWNLOADS:

Points of interest:

  • HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\MenuExt registry key to add an item to the IE context menu
  • gain access to window object passed in external.menuArguments
  • document.selection property to get the selected TextRange
  • TextRange methods: moveStart, moveEnd, moveToPoint, select
  • The extension is entirely written in JScript (no BHO, no COM).
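The points above can be sketched as a context-menu script. This is not the actual MakeLink source, only an illustration of the selected-text case. It assumes the script runs from a page registered under the MenuExt key, where external.menuArguments is the window in which the menu was opened. The textToUrl helper and its rule of prepending "http://" when no scheme is present are hypothetical.

```javascript
// Turns a piece of selected text into something navigable.
// Returns null for empty text; prepends "http://" when no scheme is given.
function textToUrl(text) {
    var trimmed = String(text).replace(/^\s+|\s+$/g, "");
    if (trimmed === "") {
        return null;
    }
    if (/^[a-z][a-z0-9+.\-]*:\/\//i.test(trimmed)) {
        return trimmed;
    }
    return "http://" + trimmed;
}

// Runs only inside an IE context-menu script, where "external" exists.
if (typeof external !== "undefined" && external.menuArguments) {
    var sourceWindow = external.menuArguments;   // window where the menu was opened
    var range = sourceWindow.document.selection.createRange();
    var url = textToUrl(range.text);
    if (url) {
        // Whether this opens a tab or a window depends on IE settings.
        sourceWindow.open(url);
    }
}
```

The clicked-text case (no selection) would additionally need moveToPoint and moveStart/moveEnd to grow the range around the click position, which is left out here.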

Known issues:

  • on some web pages with horizontal scroll, simply right-clicking a text URL won't work. You need to select a part of the URL and then right-click it.



Wednesday, October 17, 2007

How to properly catch RBN_CHEVRONPUSHED notification?

This is actually a thread I started on MSDN forum but unfortunately it remained unanswered:

Virtually any IE toolbar needs a chevron to happily live along with other toolbars in the same re-bar. So does my toolbar. To implement chevron functionality in IE toolbars I need to handle RBN_CHEVRONPUSHED. According to MSDN, when the chevron button is pushed, the notification is sent by the rebar in the form of WM_NOTIFY message to its parent. Here is the windows hierarchy in IE7:

WorkerW <- ReBarWindow32 <- ToolbarWindow32

where the last toolbar window is my toolbar. So I need to catch notifications from ReBarWindow32 that are sent to WorkerW window. To do that the first idea that came to my mind was to subclass the WorkerW window. I don’t like this idea because:
  • I subclass a window that does not belong to me, it was created by IE.
  • I don’t know what is the best time to subclass it: on IObjectWithSite.SetSite or on IDockingWindow.ShowDW ? (Those functions are implemented by my toolbar component)
  • I don’t know what is the best time to un-subclass it.
  • I don’t know when other toolbars might subclass/un-subclass the same window (I actually got a conflict with another toolbar, resulting in an IE stack overflow crash because of the order of subclassing/un-subclassing).
My second approach uses RB_SETPARENT to modify the parent of ReBarWindow32 window to be one of my windows. I process the RBN_CHEVRONPUSHED notification for my chevron button and send the other notifications to the original parent window (that is WorkerW). I change the parent on toolbar initialization/un-init (IObjectWithSite.SetSite). It seems a safer approach but I’m still worried about other toolbars using the same technique and the possibility of conflicts.

Watch out for the standard IE "Links" toolbar, which also sends WM_COMMAND, WM_DRAWITEM and WM_MEASUREITEM messages to the WorkerW window (and now you'll get those messages too). On IE6, the "Go" button does the same.

So the question remains: what is the best way to catch RBN_CHEVRONPUSHED notification when creating an IE toolbar extension?