Getting Started

 


Register GetWord Library

The GetWord library must be registered before use. On Windows Vista, register the library by following the steps shown in Pic.1 and Pic.2.
On Windows 2000/XP/Server 2003, run register.bat to register the library directly.

   

Pic.1

Pic.2

Basic Usage

1. Calling Interfaces

GetWord supports two calling interfaces: ActiveX Calling Interface and Raw Dll Calling Interface.

The ActiveX Calling Interface is the RECOMMENDED calling interface: it is more concise and easy to integrate into your application. Samples using the ActiveX Calling Interface can be found in Samples\ActiveX_Demo.

The Raw Dll Calling Interface is DEPRECATED and is kept only for backward compatibility. If your development environment does not support ActiveX (which is very rare nowadays), you can use the Raw Dll Calling Interface. Samples using the Raw Dll Calling Interface can be found in Samples\Dll_Demo.

 

2. Text Capturing Modes

GetWord supports three text capturing modes: Point Text Capturing, Rectangle Text Capturing and Selected (Highlighted) Text Capturing.

2.1  Point Text Capturing

In this capturing mode, you pass in the point where you want to capture text, and GetWord returns two items for each capture:
    1. the total string on the line where the given point is located
    2. the position of the point (zero-based) in that string

Suppose you want to capture the text at the mouse cursor, and the line text is "Many people use Google to search things. Google is a great searching engine." If you put the mouse cursor on the first 'o' of the second Google, GetWord will return:
    1. the total string: Many people use Google to search things. Google is a great searching engine.
    2. the cursor position: 42
In our demo program (available at http://www.textcapture.com), these two items appear as 'All Text' and 'Cursor Pos' respectively. 'Cursor Text' is calculated from the total string and the cursor position.
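
How the demo derives 'Cursor Text' is not documented here. As a minimal sketch, assuming a "word" is simply a run of letters and digits, the word under the cursor can be recovered from the two returned items like this (word_at is a hypothetical helper, not part of GetWord):

```python
def word_at(text: str, pos: int) -> str:
    """Return the run of letters/digits that contains index pos, or "" if none."""
    if not 0 <= pos < len(text) or not text[pos].isalnum():
        return ""
    start = pos
    while start > 0 and text[start - 1].isalnum():
        start -= 1  # walk left to the beginning of the word
    end = pos + 1
    while end < len(text) and text[end].isalnum():
        end += 1  # walk right to one past the end of the word
    return text[start:end]


# The two items from the example above.
all_text = "Many people use Google to search things. Google is a great searching engine."
cursor_pos = 42  # zero-based index of the first 'o' in the second "Google"

print(word_at(all_text, cursor_pos))  # prints "Google"
```

A real application would usually replace the isalnum() test with a dictionary lookup or a language-specific tokenizer.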

Generally, to capture the complete word or phrase at a given point, you need a dictionary containing the words or phrases you want to recognize. Once you have the total string and the point position in it, you can look up the dictionary to determine what should be returned. In some simple cases, a regular expression can replace the dictionary. For example, suppose you want to use GetWord to recognize phone numbers on web pages, and the text is "Our phone number is +86-10-80906058". When the mouse cursor is on any character of '+86-10-80906058', you want to get the phone number '861080906058'. This can be done with a simple regular expression search. We provide such a demo, Samples\PlugIn_Demo\Token, which extracts the English or Chinese word at the given point.
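
The phone number case could look roughly like the following sketch. The pattern and the helper name phone_at are assumptions made for illustration; they are not taken from the Token demo.

```python
import re
from typing import Optional

# Assumed pattern: an optional "+" followed by digit groups joined by "-" or a space.
PHONE = re.compile(r"\+?\d+(?:[- ]\d+)*")


def phone_at(text: str, pos: int) -> Optional[str]:
    """Return the digits of the phone number covering index pos, or None."""
    for match in PHONE.finditer(text):
        if match.start() <= pos < match.end():
            return re.sub(r"\D", "", match.group())  # keep digits only
    return None


line = "Our phone number is +86-10-80906058"
cursor_pos = line.index("10")  # any position inside the number works

print(phone_at(line, cursor_pos))  # prints "861080906058"
```

Positions outside the number, such as one inside the word "phone", return None, so the caller can fall back to ordinary word capturing.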
  
2.2  Rectangle Text Capturing
 
In this capturing mode, you pass in a window handle and the rectangular region you want to capture text from, and GetWord returns all the strings in the rectangle.
 
There are two API functions you can use: GetRectString and GetRectStringPairs. GetRectString returns all the strings in the given rectangle, formatted by GetWord's internal formatter. If you want to control the text output or monitor a sub-region of the rectangle, use GetRectStringPairs instead. GetRectStringPairs returns all the substrings in the given rectangle together with their corresponding sub-rectangles, so you can format the text or monitor a specific sub-region yourself.
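
The exact data layout returned by GetRectStringPairs is not described in this guide. The sketch below assumes the pairs have already been converted into (text, rectangle) tuples with rectangles given as (left, top, right, bottom). All names in it are hypothetical. It shows the two uses mentioned above: monitoring a sub-region and custom formatting.

```python
from typing import Dict, List, NamedTuple, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), an assumed layout


class StringPair(NamedTuple):
    text: str
    rect: Rect


def intersects(a: Rect, b: Rect) -> bool:
    """True when the two rectangles overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def strings_in_region(pairs: List[StringPair], region: Rect) -> List[str]:
    """Keep only the substrings whose sub-rectangle overlaps the region."""
    return [p.text for p in pairs if intersects(p.rect, region)]


def format_lines(pairs: List[StringPair]) -> str:
    """Join substrings that share a top coordinate into lines, top to bottom."""
    rows: Dict[int, List[str]] = {}
    for p in sorted(pairs, key=lambda p: (p.rect[1], p.rect[0])):
        rows.setdefault(p.rect[1], []).append(p.text)
    return "\n".join(" ".join(texts) for texts in rows.values())


pairs = [
    StringPair("Total:", (10, 40, 60, 55)),
    StringPair("File", (10, 10, 40, 25)),
    StringPair("Edit", (50, 10, 80, 25)),
    StringPair("42", (70, 40, 90, 55)),
]

print(format_lines(pairs))
print(strings_in_region(pairs, (0, 30, 100, 60)))
```

Grouping by an exact top coordinate is the simplest possible rule. Text with mixed font sizes would need a tolerance when deciding which substrings belong to the same line.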
 
2.3  Selected (Highlighted) Text Capturing
 
In this capturing mode, you need to pass in a window handle, and GetWord will return the selected (highlighted) text in the window.
 
     

3. Enable Text Capturing in Adobe Acrobat/Acrobat Reader

Copy "GetWord.api" into the plug_ins folder of Acrobat or Acrobat Reader to enable the text capturing feature in them.
  The default path for Acrobat: C:\Program Files\Adobe\Acrobat 7.0\Acrobat\plug_ins
  The default path for Acrobat Reader: C:\Program Files\Adobe\Acrobat Reader 7.0\Reader\plug_ins

You can also find the installation path of Adobe Acrobat or Adobe Acrobat Reader in the registry, so that your program can copy the plugin file GetWord.api automatically. The installation path is stored in the registry as follows:
  Adobe Acrobat:  HKEY_LOCAL_MACHINE(HKEY_CURRENT_USER)\SOFTWARE\Adobe\Adobe Acrobat\[VERSION_NUMBER]\InstallPath
            [VERSION_NUMBER] may be 7.0, 8.0, etc., depending on your Acrobat version

  Adobe Acrobat Reader:  HKEY_LOCAL_MACHINE(HKEY_CURRENT_USER)\SOFTWARE\Adobe\Acrobat Reader\[VERSION_NUMBER]\InstallPath
            [VERSION_NUMBER] may be 7.0, 8.0, etc., depending on your Acrobat Reader version
 

For your convenience, we have created a program named "install_plugin.exe" to do this. You can find it in the "GetWord Library" folder. When you run "install_plugin.exe", it automatically copies GetWord.api into the appropriate plug_ins folder of Acrobat or Acrobat Reader.
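
If you prefer to do the lookup in your own program, the registry search can be sketched as follows. This is not how install_plugin.exe is implemented. It assumes that InstallPath is a registry key whose default value holds the installation folder, and that plug_ins sits directly under that folder. The function names are hypothetical.

```python
import ntpath
from typing import Callable, Iterable, Optional

# Registry subkeys as listed above; {version} stands for [VERSION_NUMBER].
ACROBAT_KEY = r"SOFTWARE\Adobe\Adobe Acrobat\{version}\InstallPath"
READER_KEY = r"SOFTWARE\Adobe\Acrobat Reader\{version}\InstallPath"


def read_default_value(subkey: str) -> Optional[str]:
    """Read the default value of subkey from HKLM, then HKCU (Windows only)."""
    import winreg  # imported lazily so the module still loads on other systems

    for hive in (winreg.HKEY_LOCAL_MACHINE, winreg.HKEY_CURRENT_USER):
        try:
            with winreg.OpenKey(hive, subkey) as key:
                value, _value_type = winreg.QueryValueEx(key, "")
                return value
        except OSError:
            continue  # key or value missing in this hive, try the next one
    return None


def find_plugin_dir(
    key_template: str,
    versions: Iterable[str],
    read: Callable[[str], Optional[str]] = read_default_value,
) -> Optional[str]:
    """Return the plug_ins folder for the first installed version found."""
    for version in versions:
        install_path = read(key_template.format(version=version))
        if install_path:
            # Assumption: plug_ins is a direct child of the install path.
            return ntpath.join(install_path, "plug_ins")
    return None
```

On Windows, a call such as find_plugin_dir(READER_KEY, ["8.0", "7.0"]) returns the target folder or None, after which GetWord.api can be copied there with shutil.copy. The read parameter exists so the lookup can be tested without touching the real registry.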

NOTE: PDF capturing works when the text can be selected normally (you do not need to actually select the text when capturing). If the text cannot be selected normally, as on some encrypted or interactive pages, capturing will not work properly.

 

4. Run GetWord on Windows Vista

On Windows Vista, you need to run the application that incorporates GetWord as an administrator (Pic.3). To do this, follow these steps:
1. Right click on the main executable file of your application.
2. Select the "Properties" item on the popup menu.
3. Select the "Compatibility" tab in the dialog that opens.
4. Check "Run this program as an administrator" and click the "OK" button.

In fact, an application based on GetWord still works correctly with most software even when it does not run as an administrator. However, it cannot capture any text from software that itself runs as an administrator, because of Vista's security restrictions. In that case, you need to run your application as an administrator to capture text from such software.
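
An application can also check at startup whether it is running with administrator rights and warn the user if it is not. The sketch below uses the Windows shell function IsUserAnAdmin through ctypes. The helper name is_elevated is hypothetical, and nothing here is part of GetWord.

```python
import ctypes
import sys


def is_elevated() -> bool:
    """Return True when the current process runs with administrator rights."""
    if sys.platform != "win32":
        return False  # the check below only exists on Windows
    try:
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except (AttributeError, OSError):
        return False


if not is_elevated():
    print("Not running as administrator: text in elevated windows may not be captured.")
```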

Pic.3

 

Copyright © 2005-2018 Ruling Technology Co., Ltd. All rights reserved.