
Web Content

The Web Content command is typically the most frequently used command. It captures content from a web element, such as text, a URL or HTML.

Example

The screenshot below shows the simplest example of where the Web Content command can be used.

Capture.jpg

The following options are available to all Web Content commands:

Text: The text that displays in the web browser. This is the most common choice and the default.

Formatted Text: Extracts the entire HTML of the chosen web element and inserts line breaks where appropriate.

HTML: Extracts the entire HTML of the chosen web element, including the HTML of any child elements.

Clean HTML: Extracts the HTML, but removes all attributes that are used to style the HTML.

Styled HTML: Extracts the HTML, but removes all attributes that are used to style the HTML, except attributes used for inline styling.

Inner HTML: The entire HTML of all child elements of the selected web element, but not the tag HTML of the chosen element itself.

Tag Text: The text of the selected web element, excluding the text of any child elements.

Unique ID: A unique identifier. The web selection is ignored if this option is selected.

Default File Name: Returns any file name specified by a response from a web server. If no file name is specified, extracts a file name from an href attribute, or returns a unique identifier if the href does not exist. This attribute is mostly relevant to file capture commands.

Node Position: The position of the web element among all siblings.

Position: The position of the web element among siblings of the same type.
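As a rough illustration of how several of these options differ, the sketch below applies them to a small, hypothetical HTML fragment using Python's standard library. It is not the product's implementation; the fragment and the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one element with a single child element.
fragment = '<div class="price">Total: <b>$400</b> incl. tax</div>'
element = ET.fromstring(fragment)

# HTML: the element itself, including the HTML of its children.
html = ET.tostring(element, encoding="unicode")

# Inner HTML: everything inside the element, without the element's own tag.
inner_html = (element.text or "") + "".join(
    ET.tostring(child, encoding="unicode") for child in element
)

# Text: all text inside the element, including the text of child elements.
text = "".join(element.itertext())

# Tag Text: the element's own text, excluding the text of child elements.
tag_text = (element.text or "") + "".join(child.tail or "" for child in element)

print(html)
print(inner_html)
print(text)
print(tag_text)
```

In this fragment, Tag Text drops "$400" because that text belongs to the child element, while Text keeps it.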

 

In addition to the default options given above, some web elements may have other attributes available, such as Class, Name, ID, Value, URL, etc. If a web element has an attribute that is not shown in the default drop-down box, then you can simply enter the name of the attribute you want to extract.

The figure below shows the Command Properties panel after choosing Web Content from the New Command drop-down:

WebElement_10_D.png

 

Extracting URLs deserves a special mention, because it is more common to extract a link URL than to navigate to the link, or to extract an image URL than to download the image itself. To extract a URL from a web element, select the element and extract the URL or Image URL attribute. You can also enter the actual HTML tag attributes: src for images and href for links. However, the URL and Image URL attributes automatically convert any relative URLs to absolute URLs, which is best in most cases.
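To show what converting a relative URL to an absolute URL means in practice, here is a minimal Python sketch. It is only an illustration of the concept, not how the product performs the conversion; the page address and the helper name to_absolute are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical address of the page on which the web element was found.
page_url = "https://www.example.com/products/list.html"


def to_absolute(base_url: str, raw_value: str) -> str:
    """Resolve a raw href or src value against the URL of the page."""
    return urljoin(base_url, raw_value)


# A raw src value relative to the current folder's parent.
print(to_absolute(page_url, "../images/widget.png"))
# A raw href value relative to the site root.
print(to_absolute(page_url, "/about"))
# A value that is already absolute is returned unchanged.
print(to_absolute(page_url, "https://cdn.example.com/logo.png"))
```

Extracting the raw src or href attribute would return the value as written in the page, so a step like this would be needed before the URL could be used elsewhere.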

Content Transformation Script

The Web Content command allows you to use regular expressions or a .NET script to transform the extracted content. In most cases, we recommend that you write expressions or use a script to clean the data that you extract. You can also split data, such as the elements of a postal address, into separate fields.
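The idea of splitting one extracted value into separate fields can be sketched with a regular expression. The example below is written in Python purely for illustration (the product itself uses regular expressions or .NET scripts), and the address layout, the pattern and the field names are all assumptions.

```python
import re

# Assumed layout: "street, city, two-letter state and five-digit zip code".
ADDRESS_PATTERN = re.compile(
    r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$"
)


def split_address(raw: str) -> dict:
    """Return the address parts as separate fields, or an empty dict if no match."""
    match = ADDRESS_PATTERN.match(raw.strip())
    return match.groupdict() if match else {}


print(split_address("123 Main St, Springfield, IL 62704"))
```

Real addresses vary widely, so a pattern like this would need to be adapted to the layout used by the target website.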

Example: Consider a case in which you want to extract product data that includes a price of $400. You could use a transformation script to strip off the "$" character and leave only the numeric value.
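A transformation of this kind can be sketched as follows. The snippet is Python and is meant only to illustrate the logic, not the syntax of an actual content transformation script; the function name clean_price and the sample inputs are hypothetical.

```python
import re


def clean_price(raw: str) -> str:
    """Keep only the numeric part of a price, dropping symbols and separators."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return ""
    # Remove thousands separators so the value can be stored as a number.
    return match.group(0).replace(",", "")


print(clean_price("$400"))
print(clean_price("Price: $1,299.50"))
```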

Please see the Content Transformation Script topic for more information.

Command Properties

Capture

Act as System Value: Acts as a system value that is guaranteed to be present, and does not participate in an empty data row check. Default value is set to False.

Allow Empty Value: When set to True (the default), empty or missing values are allowed. When set to False, null values are not allowed.

Always Update Design Value: When set to True, the design value is updated whenever possible, not just when editing the command. The default value is False.

Change Tracking: When set to Include (the default), a change in the captured content causes change tracking to record a change. When set to Exclude, the capture command is excluded from change tracking, so a change in the captured data does not change the last change status of the data entry.

Create Index: Creates an index in the internal database for the column holding this content. This can improve performance when a duplicate check is performed on this content.

Data Consumer: Specifies the input data to use when processing this command.

  • Captured Data Command: Specifies the previously captured data column to use as input data.

  • Data Source: The source of the data consumed.

  • Data Transformation Script: The data transformation script. It is disabled by default (its Enabled property is False). To enable it, set the Enabled property to True.

  • Input Parameter Name: Specifies the input parameter name to use.

  • Provider Column Name: Specifies a column from the data source that should provide the data to this command.

  • Provider Container: Specifies a command that provides data to this command. A command can provide data to itself.

Data Format: Specifies the data format of captured content.

Data Format Style: Specifies the style of the data format for captured content. The default value is None.

Data Type: The data type of captured content.

  • Short Text: All content will be captured as Short Text by default. Short Text content can be up to 4000 characters long.

  • Long Text: Long Text content can be any length, but cannot always be used in comparisons, so you may not be able to include Long Text content in duplicate checks.

  • Integer: A whole number.

  • Float: A floating point number.

  • Date/Time: A date and/or time value.

  • Boolean: A value that can be true or false. Boolean values are stored as 1 or 0 integer values.

  • Binary: A variable-length stream of binary data ranging between 1 and 8,000 bytes.

  • Big Integer: A 64-bit signed integer.

  • Decimal: A decimal number with fixed precision and scale, between -10^38 + 1 and 10^38 - 1.

  • GUID: A globally unique identifier (or GUID). A GUID is a 128-bit integer (16 bytes) that can be used across all computers and networks wherever a unique identifier is required. Such an identifier has a very low probability of being duplicated.

  • Document: The captured data is a document in binary form. This can be used in capture commands that store a downloaded document from the web.

  • Image: The captured data is an image in binary form. This can be used in capture commands that store a downloaded image from the web.

  • Temporary: The captured data is not stored in the internal database and is not exported. This can be used as temporary storage during agent run time.
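Some of the limits mentioned in this list can be made concrete with a short Python sketch. It only restates the documented sizes in code; the helper names are hypothetical and the snippet does not reflect how the product stores data internally.

```python
import uuid

SHORT_TEXT_LIMIT = 4000          # default Short Text limit from the list above
BIG_INTEGER_MIN = -(2 ** 63)     # smallest 64-bit signed integer
BIG_INTEGER_MAX = 2 ** 63 - 1    # largest 64-bit signed integer


def text_type_for(content: str) -> str:
    """Pick Short Text when the content fits the limit, otherwise Long Text."""
    return "Short Text" if len(content) <= SHORT_TEXT_LIMIT else "Long Text"


def stored_boolean(value: bool) -> int:
    """Boolean values are stored as the integers 1 or 0."""
    return 1 if value else 0


def fits_big_integer(number: int) -> bool:
    """Check that a whole number fits in a 64-bit signed integer."""
    return BIG_INTEGER_MIN <= number <= BIG_INTEGER_MAX


# A GUID is 128 bits, which is 16 bytes.
guid = uuid.uuid4()
print(len(guid.bytes))
```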

Date/Time Conversion: Specifies how a captured date/time value is interpreted during conversion. For example, if this property is set to AssumeLocalTime, the value is explicitly treated as local time, whatever time a script placed in the field (UTC now/universal or now/local). If it is set to Universal LocalTime, the value is explicitly treated as universal time, again whatever time a script placed in the field.
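The difference between assuming local time and assuming universal time can be illustrated with a date/time value that carries no time zone of its own. The Python sketch below is an analogy only: the function names and the fixed UTC-5 zone are hypothetical, and the product's own conversion logic may differ.

```python
from datetime import datetime, timedelta, timezone

# A captured value with no time zone information attached.
captured = datetime(2024, 1, 15, 12, 0, 0)

# Hypothetical local zone, fixed at UTC-5 so the example is repeatable.
LOCAL_ZONE = timezone(timedelta(hours=-5))


def assume_universal(value: datetime) -> datetime:
    """Treat the value as already being in universal time (UTC)."""
    return value.replace(tzinfo=timezone.utc)


def assume_local(value: datetime, zone: timezone = LOCAL_ZONE) -> datetime:
    """Treat the value as being in the local time zone."""
    return value.replace(tzinfo=zone)


# The same clock reading refers to two different instants.
print(assume_universal(captured).isoformat())
print(assume_local(captured).astimezone(timezone.utc).isoformat())
```

Under these assumptions, the same reading of 12:00 corresponds to 12:00 UTC in the first case and 17:00 UTC in the second.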

Decimal Precision: Specifies the decimal precision. The default value is 19.

Decimal Scale: Specifies the decimal scale. The default value is 4.
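Precision is the total number of digits a value may hold, and scale is the number of those digits that follow the decimal point. The Python sketch below shows what the defaults of 19 and 4 would allow. The function name fit_decimal is hypothetical, and rounding half up is an assumption; the product or the underlying database may round or reject values differently.

```python
from decimal import Decimal, ROUND_HALF_UP


def fit_decimal(raw: str, precision: int = 19, scale: int = 4) -> Decimal:
    """Round a value to the given scale and check it fits the given precision."""
    quantum = Decimal(1).scaleb(-scale)  # 0.0001 when scale is 4
    rounded = Decimal(raw).quantize(quantum, rounding=ROUND_HALF_UP)
    # Count every digit, before and after the decimal point.
    if len(rounded.as_tuple().digits) > precision:
        raise ValueError(f"{raw} needs more than {precision} digits")
    return rounded


print(fit_decimal("400.123456"))
print(fit_decimal("400"))
```

With a scale of 4, at most 15 digits remain for the whole-number part, so a value with 16 digits before the decimal point would be rejected by this check.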

Design Value: The value to use for this capture command in the agent editor. This value can be important when testing scripts in the editor if the scripts depend on captured data.

Key Column: The captured content is used to identify a data entry if this option is set to true.

Make Data Available to Parent Commands: Copies the extracted data to all parent data rows, making the data available to parent commands executed after this command.

Max. Data Length: The maximum length of captured content when using the Short Text data type. The maximum possible length depends on the chosen database. Default value is set to 4000.

Raise Validation Error: Adds a page load error if value validation fails. The default value is True.

Transformation Script: A script used to transform the captured content.

Use Data Value: Captures a data value instead of a property of the selected web element. The web selection is ignored when this option is true.

User Defined Design Value: When set to True, the design value for this capture command is user defined instead of being set automatically when the command is saved.

Validation Time: Specifies when data validation takes place. The default value is Runtime. To validate data at export time instead of run time, set this property to Export.

Command

Command Description: A custom description for the command. Default value is Empty.

Command Transformation Script: A script used to change command properties at run time. The script is disabled by default.

Disabled: When set to True, the command is disabled. A disabled command is ignored. The default value is False.

ID: The internal unique ID of the command. It is always auto-generated, e.g. 58c8e4ac-e4c0-48f7-a63d-77064945380b.

Increase Data Count: When set to True, the data count is increased every time this command is processed. The default value is False. Set it to True to count the number of times this command is executed to extract data. The data count is increased during data extraction, so it is used to measure agent progress, and the agent uses it when evaluating its success criteria.

Name: The name of the command.

Notify On Critical Error: A notification email is sent at the end of an agent run if the command encounters a critical error and the agent has been configured to notify on critical errors. Critical errors include page load errors and missing required web selections. The default value is False.

Debug

Debug Break Point: Debugging will break at this command if the break point is set. Default value is set to False.

Debug Disabled: A disabled command will be ignored during debugging. Default value is set to False.

Debug Error Option: Specifies what action to take when an error occurs in the debugger. The default value is Notify, which reports an error when it occurs during debugging. To ignore errors during debugging, set this property to Ignore.

Export

Excel/PDF/CSV Column Format: Specifies the format of the data column holding the captured data when exporting to Excel, PDF or CSV. For Excel and PDF, this format string is the same as the one used in Excel under Custom format when formatting a cell. For CSV, this is a standard .NET format string. This is useful when a particular format such as number, date or currency must be applied. When the export target is set to anything other than Excel, CSV or PDF, this property has no effect.

Excel Column Width: Specifies the width of the data column holding the captured data when exporting to Excel or PDF.

Export Enabled: When set to False, the command does not save any data to the data output. The default value is True, which means that the data is output.

Merge Rows Method: When the parent list Container command option "Export Method" is set to "Add Columns And Merge Rows", this option specifies how to combine row values.

Merge Rows Value Separator: When "Merge Rows Method" is set to "Concatenate", this separator is used to separate the extracted values.

Sort Order: Specifies the order in which the column is listed when exporting to a file format.

HTML Capture

Concatenate Content Separator: The separator, such as a comma or pipe, to use between content from multiple web elements. This property is only applicable when "Concatenate Multiple Web Elements" is set to True.

Concatenate Multiple Web Elements: Concatenates content if multiple web elements are selected. Only the first web element will be used if this value is set to False.
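The interaction between these two properties can be sketched in a few lines of Python. This is an illustration of the described behavior only; the function name capture_content and the sample values are hypothetical.

```python
from typing import List


def capture_content(values: List[str], concatenate: bool, separator: str = ",") -> str:
    """Combine content from several selected web elements into one value."""
    if not values:
        return ""
    if concatenate:
        # Concatenate Multiple Web Elements = True: join with the separator.
        return separator.join(values)
    # Otherwise only the first web element is used.
    return values[0]


selected = ["Red", "Green", "Blue"]
print(capture_content(selected, concatenate=True, separator="|"))
print(capture_content(selected, concatenate=False))
```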

HTML Attribute: The web element attribute to capture.

Web Selection

Selection: The selection XPaths of the web elements associated with this command.

  • Paths: List of selection XPaths.

    • Path: The selection XPath.

  • Select Hidden Elements: Also selects hidden and disabled elements when true. Otherwise selects only visible and enabled web elements.

  • Selection Missing Option: Specifies what happens if this selection does not exist in the current page.

    • Default: If this selection does not exist in the current page, an error is logged.

    • Ignore Command but Execute Sub-Commands: If this selection does not exist in the current page, the current command is ignored, but its sub-commands are executed.

    • Ignore Command: If this selection does not exist in the current page, the current command and its sub-commands are ignored.

    • Log Error and Ignore Command: If this selection does not exist in the current page, the current command and its sub-commands are ignored and an error message is logged.

    • Log Warning and Ignore Command: If this selection does not exist in the current page, the current command and its sub-commands are ignored and a warning message is logged. Note: The warning message is logged if the log level is set to either ‘Low’ or ‘High’.

    • Log PageLoad Error and Ignore Command: If this selection does not exist in the current page, the current command and its sub-commands are ignored and a Page Load error is logged.
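The Selection Missing Option values above can be summarized as a small lookup table. The Python sketch below only restates the descriptions in code form; the class and field names are hypothetical, and a value of None marks behavior that the descriptions do not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MissingSelectionBehavior:
    runs_command: Optional[bool]       # is the current command executed?
    runs_sub_commands: Optional[bool]  # are its sub-commands executed?
    logs: Optional[str]                # what is logged, if anything


BEHAVIORS = {
    # Only the logging behavior is described for Default.
    "Default": MissingSelectionBehavior(None, None, "error"),
    "Ignore Command but Execute Sub-Commands": MissingSelectionBehavior(False, True, None),
    "Ignore Command": MissingSelectionBehavior(False, False, None),
    "Log Error and Ignore Command": MissingSelectionBehavior(False, False, "error"),
    "Log Warning and Ignore Command": MissingSelectionBehavior(False, False, "warning"),
    "Log PageLoad Error and Ignore Command": MissingSelectionBehavior(False, False, "page load error"),
}

for option, behavior in BEHAVIORS.items():
    print(option, behavior)
```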
