Sunday, November 08, 2009

Automate HTML File Upload Control - Tutorial (1)

If you have ever tried to programmatically change the value of an HTML File Upload Control, you probably noticed that the <input type="file" /> element is read-only! This happens for good security reasons: web page scripts should not be able to upload arbitrary files without user consent. Unfortunately, the control is read-only for browser extensions and add-ons too (BHOs, toolbars, side bars, etc.).

For IE6 and IE7 the upload control can be automated by setting the focus to the "Internet Explorer_Server" window and generating Win32 keyboard events. In IE8 the upload control is read-only: the user can set a value only by pressing the "Browse" button and choosing a file.

With Twebst Automation Studio, automating the HTML File Upload Control could not be easier. Here's a short VBScript sample that does it:

Option Explicit
Dim core
Dim browser
Set core = CreateObject("Twebst.Core")
Set browser = core.StartBrowser("http://www.2shared.com/")

Call browser.FindElement("input file", "id=upField").InputText("C:\somepath\photo1.jpg")



Tuesday, November 03, 2009

New Web Macro Recorder released!

After several months of complete radio silence, things have been moving forward for Twebst Library which turned into a more powerful and mature product.
Please welcome Twebst Web Automation Studio!

Twebst Automation Studio is an advanced web automation framework for Internet Explorer that can be used within any environment that supports COM, from scripting languages (JScript, VBScript) to high level programming languages (C#, VB.Net, C++).

The framework includes two components: Twebst Web Recorder, which automatically generates most of the automation code, and Twebst Library, a collection of programmable objects that form a web automation API.

But a picture is worth a thousand words...

Web Macro Recorder in Action

Twebst Web Recorder features:

  • Easily create web macros through an intuitive graphical interface.
  • Generate web automation code using the language of your choice: JScript, VBScript, C#, VB.Net, C++
  • Record web actions on all HTML controls (button, combo-box, list-box, edit-box)
And one more thing: FREE version available!

Tuesday, May 26, 2009

Web Data Extraction with C++ Web Macro

Web data extraction, or web scraping, can be implemented in various ways. Today I will use Twebst Web Automation Library to extract search results from Google using DOM parsing and Internet Explorer automation (you need to install Twebst Library first).

Here are the steps that the C++ web macro will perform in order to extract results from a Google search:
  • Open an Internet Explorer browser and navigate to the Google site.
  • Find the search edit box and type in the search term.
  • Find the submit button and click it.
  • Wait until the page is loaded and find a DIV with id=res
  • Find the collection of all H3 elements inside the DIV element.
  • Extract the text and URL and display it.

Enough talk! Let the code speak for itself.

// pCore is assumed to be an already created Twebst core object.
// Start a new Internet Explorer instance and navigate to a given URL.
IBrowserPtr pBrowser = pCore->StartBrowser("http://www.google.com/");

// Find search edit box in page and type some text into it.
IElementPtr pSearchEdit = pBrowser->FindElement("input text", SearchCondition("name=q"));
pSearchEdit->InputText("codecentrix");

// Find search button and click it.
IElementPtr pSearchBtn = pBrowser->FindElement("input submit", SearchCondition("text=Google Search"));
pSearchBtn->Click();

// Find the DIV element where the results are displayed.
IElementPtr pResultDiv = pBrowser->FindElement("div", SearchCondition("id=res"));

// Get all found results and print them in console.
IElementListPtr pResultList = pResultDiv->FindAllElements("h3", SearchCondition());

// Display only the header result (text and url).
for (int i = 0; i < pResultList->length; ++i)
{
    // Get current H3 in the list.
    IElementPtr pCrntResult = pResultList->Getitem(i);

    // Find first and only anchor inside H3
    IElementPtr pCrntAnchor = pCrntResult->FindElement("a", SearchCondition());
    CComQIPtr<IHTMLAnchorElement> spCrntAnchor = pCrntAnchor->nativeElement;

    // Get URL from IHTMLAnchorElement.
    CComBSTR bstrURL = "";
    spCrntAnchor->get_href(&bstrURL);

    // Display results.
    wcout << pCrntResult->text << L"\n" << bstrURL.m_str << L"\n\n";
}


Tuesday, May 19, 2009

IE Web Login Automation

One highly repetitive web task is logging on to a web site. This is a common scenario where Twebst Web Automation Library really shines. Here is a short web macro written in JScript that automatically logs you on to the Yahoo Mail site. All you have to do is replace "UUUUUUUUUU" and "PPPPPPPPPP" with your user name and password in the code below.

// Open a browser and navigate to yahoo mail login page.
var core = new ActiveXObject("Twebst.Core");
var browser = core.StartBrowser("https://login.yahoo.com/config/mail?.intl=us");

// Find login fields.
var u = browser.FindElement("input text");
var p = browser.FindElement("input password");
var s = browser.FindElement("input submit");

// Log on to the site by filling in the user name and password fields and then clicking the submit button.
u.InputText("UUUUUUUUUU");
p.InputText("PPPPPPPPPP");
s.Click();

FindElement searches through the whole frame/iframe hierarchy for the first input element of type text/password/submit. Additional search conditions can be specified (such as searching for an element by id, name or any other HTML attribute). Search conditions can make use of regular expressions if needed.

Another important point is that the FindElement method waits for the web page to be completely loaded before searching for the element (the timeout can be set through the core.loadTimeout property). Read more about Twebst Library...


Monday, May 18, 2009

Twebst Web Automation Library v1.40 released

Twebst version 1.40 is launched!
The main changes include IE8 compatibility, better support for working with the embedded IE browser control, support for modal and modeless HTML dialogs, and functions for clipboard access.

Here is the list of new features and enhancements:
- NEW: IE8 is now supported
- ENH: core.AttachToNative* methods work now with hosted IE browser control
- BUG: various fixes
- NEW: core.foregroundBrowser property
- NEW: core.productName property
- NEW: core.productVersion property
- NEW: core.GetClipboardText method
- NEW: core.SetClipboardText method
- NEW: core.AttachToWnd method
- NEW: core.NativeWindowToNativeBrowser method
- NEW: core.NativeWindowToNativeDocument method
- NEW: browser.FindModalHtmlDialog method
- NEW: browser.FindModelessHtmlDialog method
- NEW: element.GetAttribute method
- NEW: element.SetAttribute method
- NEW: element.RemoveAttribute method
- NEW: element.tagName property
- NEW: element.FindParentElement method
- NEW: core.RightClick method
- Find more ...

Free Download Twebst Library 1.40

Wednesday, April 29, 2009

Homemade Handcrafted Help System

Good documentation is very important for any serious project. There comes a moment in life when programmers find themselves working on the help system. That is what happened to me during the later stages of the Twebst Web Automation Library project. Even though this project is not open source, I try to make as many parts of it public as possible. Today I will present the Twebst Help System and how the tedious and annoying task of creating it was automated.

Twebst is a library of COM objects used to automate the Internet Explorer browser. The objects and their supported properties and methods have to be documented. The page structure is the same for every object/method/property, and each page also contains code samples that need syntax highlighting. This is good news because it leaves a lot of room for templates and automation.

Here is the solution:
  • The template is an XML document. When documenting an object/method or property the focus is on the content rather than on formatting the text. There is one XML file for each object/method/property.
  • A WSH script written in JScript parses the XML document and adds syntax highlighting to the sample code in the documentation page. Regular expressions are used for parsing.
  • Cross references are added automatically by the same script.
  • Then an XSL transformation is applied to convert the XML source to an HTML document that is eventually written to disk.
  • The whole process is optimized by removing unnecessary operations, such as generating the HTML when it already exists and is newer than its XML source.
  • Finally, the HTML documents refer to a CSS style sheet so the look can be changed easily.

It goes like this:
XML + JScript -> XML with color syntax and cross references + XSL -> HTML + CSS -> CHM
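
The syntax-highlighting step can be pictured as a small regular-expression pass over the sample code. The sketch below is written in C++ with std::regex rather than the JScript used by the real build script; the keyword list, the "kw" CSS class and the function names are illustrative assumptions, not the actual Build.js code.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Wraps a handful of (assumed) keywords in <span class="kw"> tags.
// \b keeps the match on whole words, so "variable" is left alone.
std::string HighlightKeywords(const std::string& code)
{
    static const std::regex kKeyword(
        R"(\b(var|function|if|else|for|while|return|new)\b)");
    return std::regex_replace(code, kKeyword, R"(<span class="kw">$1</span>)");
}

// Small self-check showing the expected shape of the output.
void DemoHighlight()
{
    const std::string html = HighlightKeywords("var x = new Foo();");
    assert(html == "<span class=\"kw\">var</span> x = "
                   "<span class=\"kw\">new</span> Foo();");
}
```

A real highlighter would also have to HTML-escape the code first and handle comments and string literals, which this sketch leaves out.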

For local help, the CHM compiler is invoked as a final step and a CHM help file is generated. All you have to do is launch the Build.js script that you will find in the archive below.


Downloads: TwebstHelp.zip

Prerequisites: In order to build the CHM file you'll need HTML Help Workshop from Microsoft.

Wednesday, March 18, 2009

SQL CE 3.5 SP1 and MSVCR80

I have been involved in the development of Web Replay (the best password manager on the market) since its inception. Recently, we upgraded the SQL CE engine to the latest 3.5 SP1 version. Web Replay is built against Visual C++ runtime version 9, which is deployed by including the Microsoft Visual C++ 2008 Redistributable Package (x86) in the setup. But that was not enough, because Microsoft SQL CE 3.5 SP1 depends on Visual C++ runtime version 8.0, which is NOT included in that redistributable kit!


If you encounter this problem you have no option but to include the Visual C++ 2005 redistributable in your setup. You might think that giving up the upgrade to version 3.5 and continuing to use SQL CE 3.1 is an option. Well, it isn't! SQL CE 3.1 crashes randomly on Windows XP SP3 (even Microsoft SQL Server Management Studio reports a memory corruption error before freezing completely).

Friday, February 20, 2009

RegisterHotKey in C#

I like .Net framework for its huge class library covering almost anything you'll ever need. One thing I could not find was registering a windows hot-key in C#. It seems the framework does not provide a solution for system-wide hot keys. So I had to use the good old RegisterHotKey Win32 function.

To call it from C#, you need to use the P/Invoke service to invoke the unmanaged function residing in user32.dll.

using System.Runtime.InteropServices;

[DllImport("user32.dll")]
private static extern bool RegisterHotKey(IntPtr hWnd, int id, int fsModifiers, int vk);

The hWnd parameter is the handle of the form that will receive the WM_HOTKEY message. To get the handle of a Form, simply use its Handle property. In order to process the WM_HOTKEY message, your form needs to override the WndProc method.

Just one more thing you should know: RegisterHotKey and the ShowInTaskbar property don't mix well. It seems that a new window is created each time the ShowInTaskbar property is changed, so your WndProc method will stop receiving hot-keys. You need to call RegisterHotKey again, because the Handle property returns a new window handle after ShowInTaskbar has been changed.
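
Once WM_HOTKEY reaches the window procedure, its parameters still have to be unpacked. According to the Win32 documentation, wParam carries the hot-key id passed to RegisterHotKey, the low-order word of lParam carries the modifier flags and the high-order word carries the virtual-key code. Here is a minimal, platform-neutral C++ sketch of that decoding; the HotKeyInfo struct and the DecodeHotKey function are hypothetical helpers, and the constants are copied by value so the sketch builds without windows.h.

```cpp
#include <cassert>
#include <cstdint>

// Values as documented for the Win32 API (normally taken from <windows.h>).
constexpr std::uint32_t kWmHotKey   = 0x0312;  // WM_HOTKEY
constexpr std::uint32_t kModAlt     = 0x0001;  // MOD_ALT
constexpr std::uint32_t kModControl = 0x0002;  // MOD_CONTROL

// Hypothetical helper type, not part of any framework.
struct HotKeyInfo
{
    int           id;          // id given to RegisterHotKey
    std::uint32_t modifiers;   // MOD_* flags
    std::uint32_t virtualKey;  // virtual-key code
};

// Returns true and fills 'info' when the message is WM_HOTKEY.
bool DecodeHotKey(std::uint32_t msg, std::uintptr_t wParam,
                  std::intptr_t lParam, HotKeyInfo& info)
{
    if (msg != kWmHotKey)
        return false;

    const std::uint32_t packed = static_cast<std::uint32_t>(lParam);
    info.id         = static_cast<int>(wParam);
    info.modifiers  = packed & 0xFFFF;          // low-order word
    info.virtualKey = (packed >> 16) & 0xFFFF;  // high-order word
    return true;
}

// Self-check: Ctrl+Alt+A registered with id 1.
void DemoDecodeHotKey()
{
    HotKeyInfo info{};
    const bool handled = DecodeHotKey(kWmHotKey, 1, (0x41 << 16) | 0x0003, info);
    assert(handled && info.id == 1 && info.virtualKey == 0x41);
    assert(info.modifiers == (kModAlt | kModControl));
}
```

In a C# form the same unpacking would sit inside the WndProc override, reading the message's WParam and LParam.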

Monday, January 05, 2009

WSH and clipboard access

I did some Windows Script Host programming recently and I was pleasantly surprised by its power, features and flexibility. One thing that I couldn't accomplish was accessing the clipboard from WSH. Digging around the internet I found some solutions, like this one based on Internet Explorer Automation. There are several problems with this approach, as you can read in my article about Internet Explorer Automation: What's wrong with Internet Explorer Automation?

My solution for scripting the clipboard content in WSH is a regular COM object created with VC++ and ATL.

Download full source code and compiled DLL: WSH_clipboard.zip
To install the COM object run register.bat

I found scripting the clipboard useful enough to add this feature to the next release of Twebst Web Automation Library.

Monday, December 15, 2008

Free Web Macros for Internet Explorer

As I presented in my previous post, automating Internet Explorer can be a difficult task.
Twebst Web Automation Library can make things easier.

It gives full programmatic control over the Internet Explorer browser. Twebst is a library of COM objects that can be used within any environment that supports COM, from scripting languages (JScript, VBScript) to high level programming languages (C#, C++). For more information, see the Twebst Library Online Documentation. And yes, it's free!

Get it FREE!

What can Twebst do?

  • increase productivity by automating repetitive web tasks
  • automate regression testing of web applications
  • automate web actions and data-entry
  • automatically log in to different web sites
  • fill out web-forms automatically
  • extract data from web pages (web scraping).
  • monitor web pages

Twebst features

  • Start new browsers and navigate to a specified URL.
  • Connect to existing browsers.
  • Search and access HTML elements and frames inside browsers.
  • Intuitive names for HTML elements using the text that appears on the screen.
  • Advanced search of browsers and HTML elements using regular expressions.
  • Perform actions on all HTML controls (button, combo-box, list-box, edit-box etc).
  • Simulate user behavior by generating hardware or browser events.
  • Get access to native interfaces exposed by Internet Explorer so you don't need to learn new things if you already know IE web programming.
  • Synchronize web actions and navigation by waiting for the page to complete within a specified timeout.
  • Available from any programming or scripting language that supports COM.
  • Optimized search methods and collections.

Wednesday, December 10, 2008

What's wrong with Internet Explorer Automation?

The Microsoft Office products (Word, Excel, PowerPoint, Access, Outlook) allow their users to manipulate Office documents from Visual Basic or Visual Basic for Applications (VBA) code. It is possible, for instance, to write a VBA macro in Excel that initializes a series of cells and uses the cells to display a chart.

Automation is the process of controlling one product from another product with the result that the client product can use the objects, methods, and properties of the server product. The client has access to the object model of the server.

Though Internet Explorer browser is not part of the Office suite, it supports automation. Here is a short sample:

// Create an IE automation object.
var ie = new ActiveXObject("InternetExplorer.Application");

// Make it visible and navigate to a given URL.
ie.Visible = true;
ie.Navigate("http://www.google.com/");

// Give it some time to load the page and then get the document.
WScript.Sleep(3000);
var doc = ie.Document;

// Fill out search field.
var edit = doc.getElementsByName("q").item(0);
edit.value = "codecentrix";

// ... and press the submit button.
var submit = doc.getElementsByName("btnG").item(0);
submit.click();


Here is ie_auto.js file for download.
However there are problems with Internet Explorer automation:
  • it may not work at all on Windows Vista unless the script runs at the same integrity level as the iexplore.exe process. Simply double-clicking the js file won't do it: the script runs at medium integrity level while Internet Explorer runs at low integrity level, and as a result the script fails. If you run the script at high integrity level, the newly started IE instance will have the same high integrity level and the script works (but this is not the best option from a security point of view). Changing the integrity level of the running script (or application) is not always the most desirable or easiest thing to do.
  • no support to "connect" to already existing IE documents.
  • difficult search of elements across all sub-documents inside frames/iframes (and sometimes impossible, see the point above).
  • difficult and time-consuming search of HTML elements on attributes other than id or name (getElementById and getElementsByName are the only methods I know of that search for elements directly, without browsing element collections, which can be very slow when performed out of process).
  • no direct support for synchronizing input actions (clicks, keys) with HTML document loading (it could be implemented by subscribing to IE events such as document complete, or by looping until the browser is ready to accept input).
  • no advanced search criteria like regular expression or searching on multiple attributes.
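
The "looping until the browser is ready" idea from the synchronization point can be sketched as a generic poll-with-timeout helper. This is a plain C++ illustration, not IE or Twebst code: WaitUntil and its parameters are hypothetical names, and the readiness test is whatever callable the caller supplies (for IE it would wrap a check of the browser's busy/ready state).

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Polls 'isReady' until it returns true or 'timeout' elapses.
// Returns true if the condition was met in time, false on timeout.
template <typename Predicate>
bool WaitUntil(Predicate isReady,
               std::chrono::milliseconds timeout,
               std::chrono::milliseconds pollInterval = std::chrono::milliseconds(100))
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!isReady())
    {
        if (std::chrono::steady_clock::now() >= deadline)
            return false;
        std::this_thread::sleep_for(pollInterval);
    }
    return true;
}

// Self-check: a fake "browser" that reports ready on the third poll.
void DemoWaitUntil()
{
    int polls = 0;
    const bool ok = WaitUntil([&polls] { return ++polls >= 3; },
                              std::chrono::milliseconds(1000),
                              std::chrono::milliseconds(1));
    assert(ok && polls == 3);
}
```

Compared with the fixed WScript.Sleep(3000) in the sample above, a loop like this returns as soon as the page is ready and fails explicitly when it never is.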
If you are interested in solving the issues above, let me introduce a project I've been working on for some time now. Here's Twebst, web automation library for Internet Explorer!

Get it FREE!


(to be continued)

Tuesday, November 25, 2008

Creating shortcuts to Quick Launch Toolbar with WSH

I ran into the problem of creating shortcuts in the Quick Launch Toolbar while working on the Script Of The Day application. This small product is almost entirely created using JScript and Windows Script Host (WSH).

The SpecialFolders property of the WScript.Shell object provides the full path for some special folders like Desktop and Favorites, but the Quick Launch Toolbar directory is not supported. To get it I used the %userprofile% environment variable like this:
var shell          = WScript.CreateObject("WScript.Shell");
var quickLaunchDir = shell.ExpandEnvironmentStrings("%userprofile%") +
"\\Application Data\\Microsoft\\Internet Explorer\\Quick Launch";
var oShellLink = shell.CreateShortcut(quickLaunchDir + "\\Codecentrix.lnk");

oShellLink.TargetPath = "http://www.codecentrix.com/";
oShellLink.IconLocation = "http://www.codecentrix.com/favicon.ico";
oShellLink.WindowStyle = 1;
oShellLink.Description = "Web Site";
oShellLink.Save();

Saturday, August 09, 2008

focus vs fireEvent("onfocus")

While working on Twebst web automation library I encountered this problem: how to simulate setting the focus on HTML edit controls in Internet Explorer? There are two ways to do this.

  1. Call the IHTMLElement2::focus() method on the target element, which "causes the element to receive the focus and executes the code specified by the onfocus event".
  2. Raise the onfocus event on the target element by calling the IHTMLElement3::fireEvent() method.

The two approaches are quite similar but there are some interesting differences.

  1. fireEvent("onfocus") does not actually set the focus on the element; it just executes the code of the onfocus event handler.
  2. Calling the focus method sets the focus on the target element and calls the onfocus event handler, but not immediately. The onfocus event seems to be placed in a queue and its handler is executed asynchronously, after the current handler has finished.
  3. If the focus method is called from inside the onfocus handler, nothing happens if the control already has the focus (which prevents infinite recursion).

Example:


<html>
<script type="text/javascript" language="javascript">
function BtnFocusClick()
{
     document.getElementById('editTest').focus();
     window.status += "b";
}

function BtnOnFocusClick()
{
     document.getElementById('editTest').fireEvent('onfocus');
     window.status += "c";
}

function EditOnFocus()
{
     window.status += "a";
}
</script>

<body>
     <input type="text" onfocus="EditOnFocus();" id="editTest"/><br/>
     <input type="button" value="focus" id="btnFocus" onclick="BtnFocusClick();"/>
     <input type="button" value="fire onfocus" id="btnOnFocus" onclick="BtnOnFocusClick();"/>
</body>
</html>

If you press the "fire onfocus" button, the message in the Internet Explorer status bar is the expected one: "ac". If you press the "focus" button, the message is in the reverse of the expected order: "ba". That suggests that the EditOnFocus handler is called after BtnFocusClick exits.
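
The observed ordering can be reproduced with a toy model in which fireEvent runs the handler on the spot while focus only queues it until the current handler returns. This C++ sketch is purely illustrative: ToyPage and its members are invented names, and it models only the ordering described above, not how Internet Explorer is actually implemented.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>

// Toy stand-in for the test page: 'status' plays the role of window.status.
struct ToyPage
{
    std::string status;
    std::queue<std::function<void()>> pending;  // deferred events

    void EditOnFocus() { status += "a"; }

    // fireEvent("onfocus"): the handler runs synchronously.
    void FireOnFocus() { EditOnFocus(); }

    // focus(): the handler is queued and runs after the current handler.
    void Focus() { pending.push([this] { EditOnFocus(); }); }

    // Runs one click handler, then drains the queued events.
    void RunHandler(const std::function<void()>& handler)
    {
        handler();
        while (!pending.empty())
        {
            std::function<void()> next = pending.front();
            pending.pop();
            next();
        }
    }
};

std::string ClickFocusButton()
{
    ToyPage page;
    page.RunHandler([&page] { page.Focus(); page.status += "b"; });
    return page.status;
}

std::string ClickFireOnFocusButton()
{
    ToyPage page;
    page.RunHandler([&page] { page.FireOnFocus(); page.status += "c"; });
    return page.status;
}

// Self-check matching the status bar messages seen in the browser.
void DemoFocusOrdering()
{
    assert(ClickFocusButton() == "ba");
    assert(ClickFireOnFocusButton() == "ac");
}
```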


Thursday, June 19, 2008

IHTMLDocument3::getElementsByTagName and IHTMLElementCollection

A common task when writing Internet Explorer extensions is to browse a collection of objects based on a specified element tag-name. To work with collections IE provides IHTMLElementCollection interface that represents a collection of elements in an HTML document.

Usually a collection is retrieved by calling methods of the IHTMLDocument2 interface. For some tag-names there are specialized methods that retrieve the collection (IHTMLDocument2::get_anchors, IHTMLDocument2::get_applets, IHTMLDocument2::get_forms, IHTMLDocument2::get_images, IHTMLDocument2::get_links, IHTMLDocument2::get_scripts).

To get all elements collection there is IHTMLDocument2::get_all.
One way to get a collection of elements having a specified tag-name is:
// CComQIPtr<IHTMLDocument2> spDocument is a document object.

CComQIPtr<IHTMLElementCollection> spAllCollection;
HRESULT hRes = spDocument->get_all(&spAllCollection);
_ASSERTE(SUCCEEDED(hRes) && (spAllCollection != NULL));

// Get the sub-collection of elements that have the "input" tag name.
CComVariant varTagName(CComBSTR("input"));
CComQIPtr<IDispatch> spDispCollection;
hRes = spAllCollection->tags(varTagName, &spDispCollection);

CComQIPtr<IHTMLElementCollection> spInputCollection = spDispCollection;
_ASSERTE(spInputCollection != NULL);

// Now you can browse spInputCollection using
// IHTMLElementCollection::item and IHTMLElementCollection::get_length methods.

The second method is:
// CComQIPtr<IHTMLDocument2> spDocument is a document object.
// Query for IHTMLDocument3 interface
CComQIPtr<IHTMLDocument3> spDoc3 = spDocument;
_ASSERTE(spDoc3 != NULL);

// Get the collection of elements that have the "input" tag name.
CComQIPtr<IHTMLElementCollection> spInputCollection;
HRESULT hRes = spDoc3->getElementsByTagName(CComBSTR("input"), &spInputCollection);
_ASSERTE(SUCCEEDED(hRes));

// Now you can browse spInputCollection using
// IHTMLElementCollection::item and IHTMLElementCollection::get_length methods.


Monday, May 19, 2008

ATL thunks and Windows DEP story

When using older ATL versions the program may generate an access violation due to Data Execution Prevention. This is basically a memory protection feature that prevents executing code from memory pages marked as non-executable.

The ATL implementation of CWindow class uses a technique called thunk. A thunk is a small piece of code that ATL generates in a region of memory allocated on the heap. Older versions of ATL did not set the execution flag on the allocated memory pages and that generates a crash on Windows machines where DEP is enabled.

Usually ATL is used in Internet Explorer extensions. If DEP is enabled then the result is a browser crash. This is a good reason to upgrade to Visual Studio 2005 or later. Find more on how to activate DEP in IE7 here.

Monday, April 21, 2008

ATL + STL = CAdapt

I didn't know about CAdapt class until I tried to convert a VS2003 project to VS2005. Everything worked OK but I couldn't compile it! The compiler complained about:
std::list<CComQIPtr<IHTMLElement> >.
I found that what was accepted by the VC++ 2003 compiler is not accepted by the VC++ 2005 compiler. The reason is the address-of operator overloaded by the CComQIPtr class: std::list needs the address of the CComQIPtr object itself, but the & operator returns the address of the wrapped interface pointer (an IHTMLElement**) instead.

Here's where CAdapt class comes to save us. The list of smart pointers becomes: std::list<CAdapt<CComQIPtr<IHTMLElement> > >
You also need to go through the m_T member wherever you access the wrapped smart pointer.
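
The wrapper idea can be shown without ATL. In the sketch below, ToySmartPtr is a hypothetical stand-in for CComQIPtr (its operator& hands out the address of the raw pointer) and ToyAdapt mimics what CAdapt does: it stores the smart pointer in a public m_T member and does not overload operator&. Modern standard libraries use std::addressof internally, so this sketch does not reproduce the original compile error; it only illustrates the pattern.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for CComQIPtr<T>: '&ptr' yields T**, not ToySmartPtr*.
template <typename T>
class ToySmartPtr
{
public:
    explicit ToySmartPtr(T* p = nullptr) : p_(p) {}
    T** operator&() { return &p_; }
    T*  get() const { return p_; }
private:
    T* p_;
};

// Same idea as CAdapt: hide the overloaded operator& behind a plain wrapper.
template <typename T>
class ToyAdapt
{
public:
    ToyAdapt(const T& value) : m_T(value) {}
    T m_T;  // the wrapped object, reached explicitly by the caller
};

// Reads the first element of the list through the m_T member.
int FirstValue(const std::list<ToyAdapt<ToySmartPtr<int>>>& items)
{
    return *items.front().m_T.get();
}

// Self-check: store a smart pointer in a list and read it back.
void DemoAdapt()
{
    int value = 42;
    std::list<ToyAdapt<ToySmartPtr<int>>> items;
    items.push_back(ToySmartPtr<int>(&value));  // converted to ToyAdapt implicitly
    assert(FirstValue(items) == 42);
    assert(*(&items.front().m_T) == &value);    // overloaded & gives int**
}
```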

Tuesday, April 15, 2008

Did you know that? fatal error C1091: compiler limit

It's not every day that you come across a new compiler error message. Here's one I didn't expect:

fatal error C1091: compiler limit: string exceeds 65535 bytes in length

I was about to complete my programming task when this error struck, and I really needed a very long string. On the VC++ 2003 compiler the limit is even smaller: 16 KB.

The conclusion of this story: don't put the whole story of your life inside a C++ string constant!
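
If a very long text really has to live in the source code, one common workaround (not something the original project necessarily used) is to keep it as several ordinary literals and join them at run time, so no single string constant gets near the limit. A minimal sketch with made-up content; kChunks and BuildLongText are hypothetical names:

```cpp
#include <cassert>
#include <string>

// Each entry is a separate, comfortably short string constant.
static const char* const kChunks[] = {
    "Chapter 1: the early years. ",
    "Chapter 2: the compiler strikes back. ",
    "Chapter 3: the end."
};

// Joins the chunks into one std::string at run time.
std::string BuildLongText()
{
    std::string text;
    for (const char* chunk : kChunks)
        text += chunk;
    return text;
}

// Self-check: the pieces come back in order as one string.
void DemoBuildLongText()
{
    assert(BuildLongText() ==
           "Chapter 1: the early years. "
           "Chapter 2: the compiler strikes back. "
           "Chapter 3: the end.");
}
```

For truly large data, loading it from a file or a resource at run time is usually the more comfortable choice.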

Monday, March 31, 2008

.Net and COM interop story

.Net allows programmers to reuse COM components in their managed code. To make this possible a managed wrapper object around the native object is needed. Besides that, one can use the COM object like any other managed object. Even if it sounds simple, you have to be aware of the differences between the CLR's object lifetime management and the COM version of object lifetime management.

COM programmers have to call Release on every interface that has been AddRef'ed. For C# programmers using COM objects that means AddRef is called when:
- a COM object is created.
- a COM object is returned by calling a method or a property.
- a COM object is cast to another COM interface type.

To release a COM object in C# there are two options:
- let the GC collect the managed wrappers and call their finalizers, which will call Release on the native COM object.
- manually call Marshal.ReleaseComObject on every interface used in the code.
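
To make the counting rule concrete, here is a toy C++ model of the native side. ToyComObject is a hypothetical class, not a real COM interface: it starts with one reference for its creator, QueryInterface hands out another one, and every reference must be matched by a Release.

```cpp
#include <cassert>

// Toy model of COM reference counting (no real IUnknown involved).
class ToyComObject
{
public:
    unsigned long AddRef()  { return ++refs_; }

    // A real COM object would destroy itself when the count reaches zero.
    unsigned long Release() { return --refs_; }

    // Handing out another interface pointer adds a reference.
    ToyComObject* QueryInterface() { AddRef(); return this; }

    unsigned long RefCount() const { return refs_; }

private:
    unsigned long refs_ = 1;  // creation gives the caller the first reference
};

// Self-check: two references taken, two releases needed.
void DemoRefCount()
{
    ToyComObject obj;                          // created: count is 1
    ToyComObject* other = obj.QueryInterface();  // "cast": count is 2
    assert(obj.RefCount() == 2);

    other->Release();
    obj.Release();
    assert(obj.RefCount() == 0);               // balanced
}
```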

Let's see a short example using COM objects exposed by IE. The code below changes the color of every link in an HTML document.

// IHTMLDocument2 doc;
foreach (IHTMLElement elem in doc.all)
{
    IHTMLAnchorElement anchor = elem as IHTMLAnchorElement;
    if (anchor != null)
    {
        elem.style.color = "red";
    }
}
This first approach leaves the task of releasing COM objects to the garbage collector. Let's manually release the COM objects now:

// IHTMLDocument2 doc;
IHTMLElementCollection allCollection = doc.all;
foreach (IHTMLElement crntElem in allCollection)
{
    IHTMLAnchorElement anchor = crntElem as IHTMLAnchorElement;
    if (anchor != null)
    {
        IHTMLStyle style = crntElem.style;
        style.color = "red";

        Marshal.ReleaseComObject(style);
        Marshal.ReleaseComObject(anchor);
    }

    Marshal.ReleaseComObject(crntElem);
}

Marshal.ReleaseComObject(allCollection);

As you can see, the number of code lines doubles! I personally prefer to leave the task of releasing COM objects to the GC, even though they will only be released some time later, when the GC comes into action.

Some might be tempted to call GC.Collect after a large chunk of code that works with COM objects, but this could be even worse, because other managed objects could be promoted to the next GC generation and live longer than necessary.

In theory it is possible to create a lot of large COM objects that will exceed the native heap while the managed heap has a lot of available memory because managed wrappers are smaller in size. GC won't be called in this scenario so the native heap won't be freed.

If your application suffers from this kind of memory allocation problem, maybe using COM objects from managed code is not the best approach for you.

Monday, February 25, 2008

The game of programming. Programming the game.

The first computer I've ever seen was a Z80 Spectrum. That was back in 1988. Like any kid of my age I was amazed by computer games. I started to learn programming with the hope that one day I will create my own game.

That happened many years later, in 1999, and here's the result. I wrote these two little games just to learn some Java. Now, looking back at this old code of mine, it's nice to see that I wasn't that bad after all :-)

Click on images below to play!


Saturday, February 02, 2008

When IHTMLWindow2.document throws UnauthorizedAccessException

This is basically a C# translation of one of my older articles, "When IHTMLWindow2::get_document returns E_ACCESSDENIED". Some .Net people had difficulty using it, so I decided to make their life easier.

The main problem is the confusion created by the System.IServiceProvider .Net interface, because it has the same name as the COM interface. Once past this issue, the code translation is straightforward. Here's the interop code that declares the COM interface IServiceProvider.
// This is the COM IServiceProvider interface, not System.IServiceProvider .Net interface!
[ComImport(), ComVisible(true), Guid("6D5140C1-7436-11CE-8034-00AA006009FA"),
InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)]
public interface IServiceProvider
{
    [return: MarshalAs(UnmanagedType.I4)][PreserveSig]
    int QueryService(ref Guid guidService, ref Guid riid, [MarshalAs(UnmanagedType.Interface)] out object ppvObject);
}
You can find the full source code of the sample assembly here.

This technique was successfully implemented and tested in Twebst web automation library.