Download Image

The Download Image command extracts an image from a web page. The command will download an image, and then save it to the file system, send it to a database, or embed it into Excel - depending on your chosen export target. The web selection path for this command normally points to the image itself, but it could also point to a web element that contains a URL that links to the image.

The figure below shows the Command Properties panel after choosing Download Image from the New Command drop-down:

Command Configuration

The Common tab in the Configure Agent Command panel has three tabs:

File URL - contains the URL for the image.
File Name - contains the name of the downloaded image.
OCR tab - check the box if you want to convert an image into text.

We explain the details of each below.

File URL

The entry in this tab determines the specific URL for the image, and the agent uses this URL to download the image at run time. You can choose the HTML attribute that the command should extract to get the image URL. The default value is the Image URL, which extracts the src HTML attribute.

Click the Transformation Script button to enter a Regular Expression or write a .NET script that will transform the image URL to meet your requirements. This is often useful when you want to extract a large image, but it's easier to select a corresponding thumbnail image. You may then be able to transform the thumbnail image URL into the large image URL, and the agent will then use the transformed image URL to download the large image.

See the Content Transformation Script topic for more information.
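
The thumbnail-to-large-image transformation described above can be sketched as follows. This is an illustration only, written in Python to show the logic; in the product you would enter a Regular Expression or a .NET script. The URL layout (a `_thumb` suffix replaced by `_large`) and the function name `to_large_image_url` are assumptions, so inspect the target website to find its actual naming scheme.

```python
import re

# Hypothetical URL layout: thumbnails end in "_thumb" right before the file extension.
THUMBNAIL_PATTERN = re.compile(r"_thumb(\.\w+)$")


def to_large_image_url(thumbnail_url: str) -> str:
    """Rewrite a thumbnail URL into the matching large image URL.

    URLs that do not match the assumed pattern are returned unchanged.
    """
    return THUMBNAIL_PATTERN.sub(r"_large\1", thumbnail_url)


print(to_large_image_url("https://example.com/images/shoe_thumb.jpg"))
# prints https://example.com/images/shoe_large.jpg
```

Returning non-matching URLs unchanged means the agent still downloads the selected image when no larger variant can be derived.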

File Name

The entry in this tab contains the file name. From the drop-down menu, you can choose the HTML attribute that you want to use as the name.

Click the Transformation Script button to enter Regular Expressions or write a .NET script that will transform the image name to meet your requirements. See the Content Transformation Script topic for more information.

Use the Data Value option to specify that an agent data value will be used as the file name. The agent data can come from a data provider, an input parameter, or captured data.

Use the Detect File Extension option to specify whether the agent should try to detect the file type of the downloaded image, or whether a transformation script or a data value will provide a file name that already includes a file extension.
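
To make the idea of file type detection concrete, here is a minimal Python sketch that guesses an extension from the first bytes of the downloaded data. It is not the product's implementation; the signature table, the function names, and the `fixed_extension` parameter are all assumptions made for illustration.

```python
# Leading "magic" bytes of a few common image formats.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]


def detect_extension(data: bytes):
    """Return a file extension guessed from the image data, or None if unknown."""
    for signature, extension in SIGNATURES:
        if data.startswith(signature):
            return extension
    return None


def build_file_name(base_name: str, data: bytes, fixed_extension=None) -> str:
    """Append a fixed extension if one is given, otherwise a detected one."""
    extension = fixed_extension or detect_extension(data)
    return f"{base_name}.{extension}" if extension else base_name


print(build_file_name("logo", b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
# prints logo.png
```

When detection is turned off, the file name supplied by the transformation script or data value would be used as-is, which corresponds to leaving `base_name` untouched here.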

OCR

Using the OCR tab, you have the option to convert the image into text. For example, you might need to convert CAPTCHA images into text so that the agent can bypass CAPTCHA blocking websites. See the topic CAPTCHA Blocking for more information.

This tab gives you the option to export both the image file and the converted text, or just the converted text. To convert the image, you'll need to check the box and then enter a script to call an external OCR service. Sequentum Enterprise does not include an OCR feature but allows you to integrate with 3rd party services by using this script feature. For more information, see the topic Image OCR Scripts.

Data Fields

If the agent is saving the image to a database, then by default this command will generate two data fields: one for the image data and another for the name of the image. If the agent is saving the image to the file system, the command will generate only one data field containing the full file path to the image. The command property Export URL can be used to also generate a data field that contains the image URL.
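
The rules above can be summarized in a short Python sketch. The field names and the `generated_fields` function are hypothetical and only illustrate which fields appear for each export target; the actual column names are chosen by the product or by properties such as File Name Column Name and URL Column Name.

```python
from typing import List


def generated_fields(export_target: str, export_url: bool = False) -> List[str]:
    """List the data fields the command generates (field names are illustrative)."""
    if export_target == "database":
        # One field for the image data and one for the image name.
        fields = ["image_data", "image_name"]
    else:
        # File system export: a single field with the full file path.
        fields = ["image_file_path"]
    if export_url:
        # Export URL adds a field containing the image URL.
        fields.append("image_url")
    return fields


print(generated_fields("database"))
# prints ['image_data', 'image_name']
print(generated_fields("file_system", export_url=True))
# prints ['image_file_path', 'image_url']
```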

Command Properties

Capture

Act as System Value: Acts as a system value that is guaranteed to be present, and does not participate in an empty data row check. The default value is set to False.

Allow Empty Value: When set to True (the default), empty or missing values are allowed. When set to False, null values are not allowed.

Always Update Design Value: When set to True, the design value is updated whenever possible, and not just when editing the command. The default value is False.

Change Tracking: The default value is ‘Include’, which means that a change in the captured content causes change tracking to record a change. Set the value to ‘Exclude’ to exclude the capture command from change tracking, so that a change in the captured data does not change the last change status of the data entry.

Create Index: Creates an index in the internal database for the column holding this content. This can improve performance when a duplicate check is performed on this content.

Data Consumer: Specifies the input data to use when processing this command.

Captured Data Command: Specifies the previously captured data column name which you want to use as input data.
Data Source: The source of the data consumed.
Data Transformation Script: A script that transforms the input data. The script is disabled by default, which is reflected by its "Enabled" property being False. To enable the data transformation script, set its "Enabled" property to True.
Input Parameter Name: Specifies the input parameter name to use.
Provider Column Name: Specifies a column from the data source that should provide the data to this command.
Provider Container: Specifies a command that provides data to this command. A command can provide data to itself.

Data Format: This property specifies the data format of the captured content.

Data Format Style: This property specifies the style of data format for captured content. The default value is set to ‘None’.

Data Type: The data type of captured content.

Short Text: All content will be captured as Short Text by default. Short Text content can be up to 4000 characters long.
Long Text: Long Text content can be any length, but cannot always be used in comparisons, so you may not be able to include Long Text content in duplicate checks.
Integer: A whole number.
Float: A floating-point number.
Date/Time: A date and/or time value.
Boolean: A value that can be true or false. Boolean values are stored as 1 or 0 integer values.
Binary: A variable-length stream of binary data ranging between 1 and 8,000 bytes.
Big Integer: A 64-bit signed integer.
Decimal: Represents a decimal floating-point number. A fixed precision and scale numeric value between -10^38 - 1 and 10^38 - 1.
GUID: A globally unique identifier (or GUID). A GUID is a 128-bit integer (16 bytes) that can be used across all computers and networks wherever a unique identifier is required. Such an identifier has a very low probability of being duplicated.
Document: The captured data is a document in binary form. Can be used in capture commands that store a downloaded document from the web.
Image: The captured data is an image in binary form. Can be used in capture commands that store a downloaded image from the web.
Temporary: The captured data is not stored in the internal database, and also not exported. Can be used as temporary storage during agent run time.

Date/Time Conversion: This property specifies how date/time values are converted. For example, if it is set to AssumeLocalTime, the value is explicitly treated as local time, whichever time the script has defined in the field (UTC NOW for universal time, or NOW for local time). If it is set to Universal LocalTime, the value is explicitly treated as universal time, again whichever time the script has defined in the field.

Decimal Precision: Specifies the decimal precision. The default value is 19.

Decimal Scale: Specifies the decimal scale. The default value is 4.

Design Value: The value to use for this capture command in the agent editor. This value can be important when testing scripts in the editor if the scripts depend on captured data.

Key Column: The captured content is used to identify a data entry if this option is set to true.

Make Data Available to Parent Commands: Copies the extracted data to all parent data rows, making the data available to parent commands executed after this command.

Max. Data Length: The maximum length of the captured content when using the Short Text data type. The maximum possible length depends on the chosen database. The default value is set to 4000.

Raise Validation Error: When set to True (the default), a page load error is added if value validation fails.

Transformation Script: A script used to transform the captured content.

Use Data Value: Captures a data value instead of a property of the selected web element. Web selection is ignored when this option is true.

User Defined Design Value: When set to True, the design value for this capture command is user-defined instead of being set automatically when the command is saved.

Validation Time: Specifies when data validation takes place. The default value is Runtime. Set this property to “Export” if you want data validation to take place at export time instead of at run time.

Command

Command Description: A custom description for the command. The default value is Empty.

Command Transformation Script: A script used to change command properties at run time. The default value is disabled.

Disabled: This property set to True allows the user to disable the command. A disabled command will be ignored. The default value is set to False.

ID: This property indicates the internal unique ID of the command and is always auto-generated, e.g. 58c8e4ac-e4c0-48f7-a63d-77064945380b.

Increase Data Count: When set to True, the data count is increased every time this command is processed. The default value is False. Set it to True if you want to count the number of times a specific command is executed to extract data. The data count is increased during data extraction, so it is used to measure agent progress, and the agent evaluates its success criteria based on this count.

Name: This property specifies the name of the command.

Notify On Critical Error: A notification email is sent at the end of an agent run if the command encounters a critical error, and the agent has been configured to notify on critical errors. Critical errors include page load errors and missing required web selections. The default value is set to False.

Convert Image

Convert to Text: Uses a script to convert the downloaded image to text and exports the text. This can be used for CAPTCHA processing.

Export Converted Image: Exports the converted image in addition to the text.

Image Conversion Script: A script used to transform an image into text.

C# Script: Specifies C# script.
Enabled: Set this property to True to use the script. The default value is False, which means the script is disabled.
Library Assembly File: The name of a custom assembly file when "Use Default Library" is set to false.
Library Method Name: The method to execute when using the default script library.
Library Method Parameter: A custom parameter passed to the script library method.
Python Script: Specifies Python script.
Regex Script: Specifies Regex script.
Script Language: Specifies the scripting language you want to use, e.g. C#, VB.NET, Python, Script Library, or Regular Expressions.
Template Name: The template name of the referenced template.
Template Reference: Loads this script from a template when the agent is loaded.
Use Default Library: Uses the default script library when Script Language is set to Script Library.
Use Selection: The script is provided with the selected web element. The script will not be provided with the selected web element if this value is False.
Use Shared Library: Uses a script library that is shared among all agents.
VB.NET Script: Specifies VB.NET script.

Service Provider: Specifies the service provider used to convert the image to text.

Service Provider Key: Specifies the key if the conversion provider requires a key.

Service Provider Password: Specifies the password if the conversion provider requires a password.

Service Provider Username: Specifies the username if the conversion provider requires a username.

Thumbnail: Specifies whether the downloaded image is converted to a thumbnail. The default value is ‘No Thumbnail’, which corresponds to “Convert To Thumbnail” set to False. Set “Convert To Thumbnail” to True to convert the downloaded image into a thumbnail.

Convert To Thumbnail: Specifies whether downloaded images should be converted to thumbnail. The default value is set to False.
Maximum Height: Specifies the Maximum Height of a thumbnail. The default value is 100.
Maximum Width: Specifies the Maximum Width of a thumbnail. The default value is 100.
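
As an illustration of how a maximum width and height can bound a thumbnail, the Python sketch below computes a target size that fits inside both limits. It assumes the aspect ratio is preserved and that images already within the limits are not enlarged; the source does not state how the product resizes, so treat both as assumptions. The function name `thumbnail_size` is hypothetical.

```python
def thumbnail_size(width: int, height: int, max_width: int = 100, max_height: int = 100):
    """Return (width, height) scaled to fit within the maximum thumbnail size.

    Assumes the aspect ratio is kept and that small images are not enlarged.
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(thumbnail_size(800, 600))
# prints (100, 75)
print(thumbnail_size(50, 40))
# prints (50, 40)
```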

Debug

Debug BreakPoint: Debugging will break at this command if the breakpoint is set. The default value is set to False.

Debug Disabled: A disabled command will be ignored during debugging. The default value is set to False.

Debug Error Option: Specifies what action to take when an error occurs in the debugger. The default value is Notify, which means you are notified when an error occurs during debugging. Set this property to Ignore if you want errors to be ignored during debugging.

Export

Excel Column Width: Specifies the width of the data column holding the captured data when exporting to Excel or PDF. The default value is 150.

Excel/PDF/CSV Column Format: Specifies the format of the data column holding the captured data when exporting to Excel, PDF or CSV. For Excel and PDF, this format string is the same as the one used in Excel under Custom format when formatting a cell. For CSV, this is a standard .NET format string. This is useful when you need to apply a particular format such as number, date, or currency. This property has no effect when the export target is set to anything other than Excel, CSV or PDF.

Export Enabled: A command with Export Enabled set to False will not save any data to the data output. The default value is True, which means the data is output.

Merge Rows Method: When the parent list Container command option "Export Method" is set to "Add Columns And Merge Rows", this option specifies how to combine row values.

Merge Rows Value Separator: When "Merge Rows Method" is set to "Concatenate", this separator is used to separate the extracted values.
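
The "Concatenate" behavior can be pictured with the short Python sketch below. It only illustrates joining the extracted values with the configured separator; the function name `merge_rows` is hypothetical, and skipping empty values is an assumption rather than documented behavior.

```python
def merge_rows(values, separator):
    """Join the values extracted from several rows into one merged value.

    Empty values are skipped here, which is an assumption for this sketch.
    """
    return separator.join(value for value in values if value)


print(merge_rows(["front.jpg", "side.jpg", "back.jpg"], "|"))
# prints front.jpg|side.jpg|back.jpg
```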

Sort Order: Specifies the order in which the column is listed when exporting to a file format.

File Capture

Auto Detect File Extension: When set to True (the default), the agent automatically detects the file extension of the downloaded file. Clear this option if you want a file name transformation script to set the file extension.

Download Timeout: The maximum amount of time waiting for a file to download (milliseconds). Default value is 50000 milliseconds.

Export URL: When set to True, the URL is exported along with the file. The default value is False, which means the URL is not exported.

File Name Attribute: The web element attribute to use as file name for the downloaded file.

File Name Column Name: Specifies the name of the export column containing the file name. A default column name will be used if this property is empty.

File Name Data Consumer: Specifies the input data to use when "Use Data Value as File Name" is set to true.

Captured Data Command: Specifies the previously captured data column name which you want to use as input data.
Data Source: The source of the data consumed.
Data Transformation Script: A script that transforms the input data. The script is disabled by default, which is reflected by its "Enabled" property being False. To enable the data transformation script, set its "Enabled" property to True.
Input Parameter Name: Specifies the input parameter name to use.
Provider Column Name: Specifies a column from the data source that should provide the data to this command. Specifies a command that provides data to the agent. A command can provide data to itself.

File Name Design Value: The value to use for the file name capture in the agent editor. This value can be important when testing scripts in the editor if the scripts depend on captured data.

File Name Transformation Script: A script used to transform the file name attribute used to name the download file.

Fixed File Extension: Adds a fixed file extension e.g. jpeg, jpg, or gif to the downloaded file.

Try Internet Cache: Retrieves the file from the Internet cache if it exists, instead of downloading it. This can be useful for some CAPTCHA images where the image is first downloaded by a web browser and the website does not allow a second download.

URL Column Name: If the URL is exported, this property specifies the name of the export column containing the URL. A default column name will be used if this property is empty.

Use Data Value as File Name: Uses a data value as file name instead of an attribute of the selected web element.

Use Original File Name: Uses the file name of the document when possible.

HTML Capture

Concatenate Content Separator: The separator such as comma, pipe etc. to use between content from multiple web elements. This property is only applicable when "Concatenate Multiple Web Elements" is set to True.

Concatenate Multiple Web Elements: Concatenates content if multiple web elements are selected. Only the first web element will be used if this value is set to False.

HTML Attribute: The web element attribute to capture.

Web Selection

Selection: The selection XPaths of the web elements associated with this command.

Paths: List of selection XPaths.
- Path: The selection XPath.
Select Hidden Elements: Selects visible and disabled elements when true. Otherwise selects only visible and enabled web elements.
Selection Missing Option: Specifies what happens if this selection does not exist on the current page.
- Default: If this selection does not exist on the current page, an error is logged.
- Ignore Command but Execute Sub-Commands: If this selection does not exist on the current page, the current command is ignored, but its sub-commands are executed.
- Ignore Command: If this selection does not exist on the current page, the current command and its sub-commands are ignored.
- Log Error and Ignore Command: If this selection does not exist on the current page, the current command and its sub-commands are ignored, and an error message is logged.
- Log Warning and Ignore Command: If this selection does not exist on the current page, the current command and its sub-commands are ignored, and a warning message is logged. Note: The warning message is logged if the log level is set to either ‘Low’ or ‘High’.
- Log PageLoad Error and Ignore Command: If this selection does not exist on the current page, the current command and its sub-commands are ignored, and a Page Load error is logged.