Script Utilities
All scripts have access to a ScriptUtils class through the script arguments. This class contains the following utility functions:
Property or Function | Description |
---|---|
CustomDataStorage LoadCsvFile(string path, bool hasHeaderRow = true) | Loads data from a CSV file into a custom data storage. path - The path to the CSV file. hasHeaderRow - True if the CSV file has a header row. Otherwise false. |
CustomDataStorage LoadCsvFile(string path, char separator, bool hasHeaderRow = true) | Loads data from a CSV file into a custom data storage. path - The path to the CSV file. separator- The separator used by the CSV file. The default separator is comma. hasHeaderRow - True if the CSV file has a header row. Otherwise false. |
DataTable LoadCsvFileIntoDataTable(string path, bool hasHeaderRow = true) | Loads a CSV file from a specified location and returns the data in a DataTable. path - The path to the CSV file. hasHeaderRow - True if the CSV file has a header row. Otherwise false. |
DataTable LoadCsvFileIntoDataTable(string path, char separator, bool hasHeaderRow = true) | Loads a CSV file from a specified location and returns the data in a DataTable. path - The path to the CSV file. separator- The separator used by the CSV file. The default separator is comma. hasHeaderRow - True if the CSV file has a header row. Otherwise false. |
CustomDataStorage LoadCsvFileFromDefaultInputFolder(string filename, bool hasHeaderRow = true) | Loads data from a CSV file into a custom data storage. The CSV file must be located in the agent's default input folder. filename - The filename of the CSV file without the full path. hasHeaderRow - True if the CSV file has a header row. Otherwise false. |
CustomDataStorage LoadCsvFileFromDefaultInputFolder(string filename, char separator, bool hasHeaderRow = true) | Loads data from a CSV file into a custom data storage. The CSV file must be located in the agent's default input folder. filename - The filename of the CSV file without the full path. separator- The separator used by the CSV file. The default separator is comma. hasHeaderRow - True if the CSV file has a header row. Otherwise false. |
DataTable LoadCsvFileFromDefaultInputFolderIntoDataTable(string filename, bool hasHeaderRow = true) | Loads data from a CSV file into a data table. The CSV file must be located in the agent's default input folder. filename - The filename of the CSV file without the full path. hasHeaderRow - True if the CSV file has a header row. Otherwise false. |
DataTable LoadCsvFileFromDefaultInputFolderIntoDataTable(string filename, char separator, bool hasHeaderRow = true) | Loads data from a CSV file into a data table. The CSV file must be located in the agent's default input folder. filename - The filename of the CSV file without the full path. separator- The separator used by the CSV file. The default separator is comma. hasHeaderRow - True if the CSV file has a header row. Otherwise false. |
ISimpleDatabaseStorage GetSimpleDatabaseStorage(IConnection connection, string storageName, CustomDataColumn[] columns, bool allowDropExisting = false) | Returns a simple database storage that automatically creates a database table with the name and columns specified. |
void ExecuteCommandLine(string programFilePath, string inputFilePath, string outputFilePath, string options) | Executes a program on the command line. The parameters inputFilePath, outputFilePath, and options are added as command line parameters. |
void ExecuteCommandLine(string programFilePath, string arguments) | Executes a program on the command line with the specified command line parameters. |
int ClearBrowserCookies(string url) | Clears all the web browser cookies on the computer associated with the specified URL. |
void ResetBrowserSession() | Clears the web browser session cookies. This is equivalent to restarting a web browser. |
string PostData(string url, string postData, int timeout = 0) | Posts data to a web server. url - Data is posted to this URL. postData - The data posted. This data should be in the format key1=value1&key2=value2&key3=value3... timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
string PostData(string url, byte[] postData, int timeout = 0) | Posts data to a web server. url - Data is posted to this URL. postData - The data posted. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
string PostData(string url, string postData, string headers, int timeout = 0) | Posts data to a web server. url - Data is posted to this URL. postData - The data posted. This data should be in the format key1=value1&key2=value2&key3=value3... headers - Custom headers sent with the request. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
string PostData(string url, byte[] postData, string headers, int timeout = 0) | Posts data to a web server. url - Data is posted to this URL. postData - The data posted. headers - Custom headers sent with the request. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
WebResult PostDataWithStatus(string url, string postData, int timeout = 0) | Posts data to a web server and return a status. url - Data is posted to this URL. postData - The data posted. This data should be in the format key1=value1&key2=value2&key3=value3... timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
WebResult PostDataWithStatus(string url, byte[] postData, int timeout = 0) | Posts data to a web server and return a status. url - Data is posted to this URL. postData - The data posted. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
WebResult PostDataWithStatus(string url, string postData, string headers, int timeout = 0) | Posts data to a web server and return a status. url - Data is posted to this URL. postData - The data posted. This data should be in the format key1=value1&key2=value2&key3=value3... headers - Custom headers sent with the request. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
WebResult PostDataWithStatus(string url, byte[] postData, string headers, int timeout = 0, string editValue = "") | Posts data to a web server and return a status. url - Data is posted to this URL. postData - The data posted. headers - Custom headers sent with the request. timeout - Connection timeout in milliseconds. Use 0 for default timeout. editValue - The value to return when designing an agent. |
string GetData(string url, int timeout = 0) | Gets data from a web server. url - The URL to get data from. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
string GetData(string url, string headers, int timeout = 0) | Gets data from a web server. url - The URL to get data from. headers - Custom headers sent with the request. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
WebResult GetDataWithStatus(string url, int timeout = 0) | Gets data from a web server and returns a status. url - The URL to get data from. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
WebResult GetDataWithStatus(string url, string headers, int timeout = 0) | Gets data from a web server and returns a status. url - The URL to get data from. headers - Custom headers sent with the request. timeout - Connection timeout in milliseconds. Use 0 for default timeout. |
string GetCookies(string url) | Gets all web browser cookies associated with the specified URL |
void SetCookies(string url, string cookies) | Sets web browser cookies for a specified URL. url - The URL the cookies should be associated with. cookies - The cookies to associate with the specified URL. |
void SendMessage(int customMessageType, string message) | Sends a message to the host application that started the agent. The application must have had started a named pipe before it started the agent. customMessageType - A custom message type. This can be any integer value which meaning is interpreted by the host application. messsage - The message to send to the host application. |
void SendMessage(CgMessageType messageType) | Sends a message to the host application that started the agent. The application must have had started a named pipe before it started the agent. messageType - A message type. |
void SendMessage(CgMessageType messageType, string message) | Sends a message to the host application that started the agent. The application must have had started a named pipe before it started the agent. messageType - A message type. message - The message to send to the host application. |
void SendMessage(CgMessageType messageType, int customMessageType, string message) | Sends a message to the host application that started the agent. The application must have had started a named pipe before it started the agent. messageType - A message type. customMessageType - A custom message type. This can be any integer value which meaning is interpreted by the host application. message - The message to send to the host application. |
IProxyServer GetCurrentActiveProxy() | Returns the current active proxy. |
bool HasJsonPath(string json, string jsonPath) | Checks if the specified JSON string has the specified JSON path. json - JSON string to parse. jsonPath - JSON path to check |
string GetJsonStringValue(string json, string jsonPath) | Parses the specified JSON string and retrieves the string value specified by the JSON path. json - JSON string to parse. jsonPath - JSON path to retrieve the string value from |
string[] GetJsonStringValues(string json, string jsonPath) | Parses the specified JSON string and retrieves the string values specified by the JSON path. json - JSON string to parse. jsonPath - JSON path to retrieve the string values from |
double? GetJsonDoubleValue(string json, string jsonPath) | Parses the specified JSON string and retrieves the double value specified by the JSON path. json - JSON string to parse. jsonPath - JSON path to retrieve the double value from |
double[] GetJsonDoubleValues(string json, string jsonPath) | Parses the specified JSON string and retrieves the double values specified by the JSON path. json - JSON string to parse. jsonPath - JSON path to retrieve the double values from |
DataTable LoadRegexMatches(string input, string pattern) | Loads Regex matches into DataTable. input - input string to parse. pattern - the regular expression pattern to match. |
DataTable LoadRegexMatches(string input, string pattern, RegexOptions options) | Loads Regex matches into DataTable. input - input string to parse. pattern - the regular expression pattern to match. options - bitwise combination of the enumeration values that modify the regular expression. |
DataTable LoadRegexMatches(string input, Regex regex) | Loads Regex matches into DataTable. input - input string to parse. regex - the regular expression to match. |
string TransformContent(string input, string regexScript) | Transforms data with a Sequentum Enterprise regex script. The regex script must follow the exact format of a Sequentum Enterprise transformation script. Use \r\n for line breaks. |
Example
The following example generates a list of URLs that could be used in an Agent command to provide start URLs:
C#
Python
using System;
using System.Collections.Generic;
using System.Data;
using Sequentum.ContentGrabber.Api;
public class Script
{
public static CustomDataStorage ProvideData(DataProviderArguments args)
{
var data = args.ScriptUtilities.LoadCsvFileFromDefaultInputFolder("inputData.csv");
return data;
}
}
import clr
from Sequentum.ContentGrabber.Api import *
def ProvideData(args):
data = args.ScriptUtilities.LoadCsvFileFromDefaultInputFolder("inputData.csv")
return data
Database Utilities
All scripts have access to the following function through the script arguments:
public IConnection GetDatabaseConnection(string connectionName)
The function returns a predefined database connection. See the topic Database Connections for more information about predefined database connections.
The IConnection, ICommand, and IReader interfaces are Sequentum Enterprise interfaces that are meant to make it easier to write and read data to a database, but you can always use the standard .NET libraries if you prefer.
The following example writes some extracted data to a database:
NOTE: To use the following code in the agent, use this additional library. By default the library is not added in the Execute Script.
C# : using Sequentum.ContentGrabber.Commands;
using System;
using Sequentum.ContentGrabber.Api;
using Sequentum.ContentGrabber.Commands;
public class Script
{
public static CustomScriptReturn CustomScript(CustomScriptArguments args)
{
IConnection connection = args.GetDatabaseConnection("exportDatabase");
connection.OpenDatabase();
try
{
ICommand command = connection.GetNewCommand();
command.SetSql("insert into export_table values
(@title, @description)");
command.AddParameterWithValue("title", args.DataRow["title"], CaptureDataType.ShortText);
command.AddParameterWithValue("description",
args.DataRow["description"], CaptureDataType.ShortText);
command.ExecuteNonQuery();
}
finally
{
connection.CloseDatabase();
}
return CustomScriptReturn.Empty();
}
}
The IConnection interface has the following functions and properties:
Property or Function | Description |
---|---|
IReader GetNewReader(string sql) | Returns an IReader interface to a new Reader object that uses the current connection. |
ICommand GetNewCommand() | Returns an ICommand interface to a new Command object that uses the current connection. |
void OpenDatabase() | Opens the database connection. |
void CloseDatabase() | Closes the database connection. |
object GetConnection() | Return the underlying database connection that is specific to the database type used. For example, if the database is a MySQL database, the object returned is a MySqlConnection. |
void ExecuteNonQuery(string sql) | Executes a SQL statement that doesn't return a result set. |
object ExecuteScalar(string sql) | Executes a SQL statement that returns a single scalar value. |
long ExecuteCount(string sql) | Executes a SQL statement that returns a single long value. This function should be used when using the SQL function "Count". |
bool HasTable(string tableName) | Return true if the specified table exists in the database. |
bool HasColumn(string tableName, string columnName) | Return true if the specified columns exists in the specified table. |
DataRow[] GetTableColumns(string tableName) | Return all columns in the specified database table. |
void DropTable(string tableName) | Drops the specified database table. |
void TruncateTable(string tableName) | Truncates the specified database table. |
void Lock() | Locks the database connection, so it cannot be used by any other running threads. |
void Release() | Releases a lock on the database connection. |
IReader ExecuteReader(string sql, params object[] pars) | Executes a SQL with a list of SQL parameters and returns a IReader. |
DataTable ExecuteDataTale(string sql, params object[] pars) | Executes a SQL with a list of SQL parameters and returns a DataTable. |
void ExecuteNonQuery(string sql, params object[] pars) | Executes a SQL with a list of SQL parameters that doesn't return a result. |
object ExecuteScalar(string sql, params object[] pars) | Executes a SQL with a list of SQL parameters that returns a single scalar value. |
long ExecuteCount(string sql, params object[] pars) | Executes a SQL with a list of SQL parameters that returns a single long value. This function should be used when using the SQL function "Count". |
The ICommand interface has the following functions and properties:
Property or Function | Description |
---|---|
void AddParameterWithValue(string pName, object pValue, CaptureDataType type) | Adds a SQL parameter to the Command object. A matching parameter must exist in the associated SQL statement. The value of a parameter is updated if the parameter already exists. |
void AddParameter(string pName, CaptureDataType type) | Adds a SQL parameter to the Command object. A matching parameter must exist in the associated SQL statement. The value of a parameter is updated if the parameter already exists. |
void SetParameterValue(string pName, object pValue) | Sets the value of an existing SQL parameter. A matching parameter must exist in the associated SQL statement. |
void ExecuteNonQuery(string sql) | Executes a SQL statement that doesn't return a result set. |
void ExecuteNonQuery() | Executes an existing SQL statement that doesn't return a result set. |
object ExecuteScalar(string sql) | Executes a SQL statement that returns a single scalar value. |
object ExecuteScalar() | Executes an existing SQL statement that returns a single scalar value. |
long ExecuteCount(string sql) | Executes a SQL statement that returns a single long value. This function should be used when using the SQL function "Count". |
long ExecuteCount() | Executes an existing SQL statement that returns a single long value. This function should be used when using the SQL function "Count". |
void SetSql(string sql) | Set the SQL statement associated with the SQL command. |
The IReader interface has the following functions and properties:
Property or Function | Description |
---|---|
void SetSql(string sql) | Sets the SQL statement associated with this data reader. |
void AddParameterWithValue(string pName, object pValue) | Adds a SQL parameter to the Command object. A matching parameter must exist in the associated SQL statement. The value of a parameter is updated if the parameter already exists. |
bool Read() | Reads a row of data. Executes an existing SQL statement if it has not already been executed. |
void Close() | Closes the data reader. |
DateTime GetDateTimeValue(int columnIndex) | Gets a DateTime value from the current row. |
string GetStringValue(int columnIndex) | Gets a String value from the current row. |
int GetIntValue(int columnIndex) | Gets a Integer value from the current row. |
Guid GetGuidValue(int columnIndex) | Gets a Guid value from the current row. |
Type GetFieldType(int columnIndex) | Gets the data type of a specified column. |
IDataReader GetDataReader() | Returns the underlying .NET data reader. |
object GetFieldValue(int columnIndex) | Gets a data value from the current data row. |
Extension Methods
Sequentum Enterprise provides a few extension methods that can be used in all scripts:
Property or Function | Description |
---|---|
static DataTable ToDataTable(this string stringValue, string columnName) | Converts a single string value into a DataTable with one data column and one data row. |
static DataTable ToDataTable(this string[] dataRows, string columnName) | Converts an array of strings into a DataTable with one data column and one data row for each string value. |
static DataTable ToDataTable(this string[] dataRows, string columnName, string stringFormat) | Converts an array of strings into a DataTable with one data column and one data row for each string value. A standard .NET format string can be used to format the string value before it is inserted into the DataTable. |
static DataTable ToDataTable(this List<string> dataRows, string columnName) | Converts a list of strings into a DataTable with one data column and one data row for each string value. |
static DataTable ToDataTable(this List<string> dataRows, string columnName, string stringFormat) | Converts a list of strings into a DataTable with one data column and one data row for each string value. A standard .NET format string can be used to format the string value before it is inserted into the DataTable. |
Example
The following example generates a list of URLs that could be used in an Agent command to provide start URLs:
C#
Python
using System;
using System.Collections.Generic;
using System.Data;
using Sequentum.ContentGrabber.Api;
public class Script
{
public static CustomDataStorage ProvideData(DataProviderArguments args)
{
List<string> urls = new List<string>();
for(int i=1; i<100; i++)
{
urls.Add("http://www.domain.com/page.php?ID=” + i.ToString());
}
return new CustomDataStorage(urls.ToArray());
}
}
import clr
from Sequentum.ContentGrabber.Api import *
def ProvideData(args):
cds = CustomDataStorage()
cds.AddStringColumn('URLs')
for i in range(1,100,1):
cds.AddRowValues("http://www.domain.com/page.php?ID=”+str(i))
return cds
Runtime Data
Extracted data can be accessed at run-time using the RuntimeData class available through the script arguments:
Property or Function | Description |
---|---|
IInternalReader GetInternalReaderForRow(string tableName, Guid rowId) | Returns a data reader for a specified row in a specified internal data table. |
GetInternalReader(string tableName, Guid? parentRowId) | Returns a data reader for a specified internal data table with the specified parent row ID. |
IInternalReader GetInternalReader(string tableName, Guid? parentRowId, SortedList<string, object> searchColumnValues) | Returns a data reader for a specified internal data table with the specified parent row ID and column values. |
IInternalReader GetInternalReader(string tableName, SortedList<string, object> searchColumnValues) | Returns a data reader for a specified internal data table matching the specified column values. |
IInternalReader GetInternalReader(string tableName, SortedList<string, object> searchColumnValues, int startIndex, int count, string orderColumnName = null) | Returns a data reader for a specified internal data table matching the specified column values. Selects a specified number of data rows starting at a specified index. The result will be ordered by Row ID if the order column name is not specified. |
IInternalReader GetInternalReader(string tableName) | Returns a data reader for a specified internal data table. |
IInternalReader GetInternalReader(string tableName, int startIndex, int count, string orderColumnName = null) | Returns a data reader for a specified internal data table matching the specified column values. Selects a specified number of data rows starting at a specified index. The result will be ordered by Row ID if the order column name is not specified. |
IInternalReader GetInternalReaderForCurrentRow() | Returns a data reader for the current row in the current internal data table. |
IInternalReader GetInternalReaderForRow(Guid rowId) | Returns a data reader for the specified row in the current internal data table. |
IInternalReader GetInternalReader(Guid? parentRowId) | Returns a data reader for the current internal data table with the specified parent row ID. |
IInternalReader GetInternalReader(Guid? parentRowId, SortedList<string, object> searchColumnValues) | Returns a data reader for the current internal data table with the specified parent row ID and column values. |
IInternalReader GetInternalReader(SortedList<string, object> searchColumnValues) | Returns a data reader for the current internal data table with the specified column values. |
The methods in the RuntimeData class all returns an internal data reader with the following methods and properties:
Property or Function | Description |
---|---|
bool Read() | Reads the next data row. Returns false if there are no more data rows available. |
void Close() | Closes the data reader. |
string GetStringValue(string columnName) | Gets a String value from the current row. |
int? GetIntegerValue(string columnName) | Gets a Integer value from the current row. |
Guid? GetGuidValue(string columnName) | Gets a Guid value from the current row. |
DateTime? GetDateTimeValue(string columnName) | Gets a Date/Time value from the current row. |
double? GetFloatValue(string columnName) | Gets a Float value from the current row. |
double? GetFloatValue(string columnName) | Gets a Boolean value from the current row. |
string GetStringValue(int index) | Gets a String value from the current row. |
int? GetIntegerValue(int index) | Gets a Integer value from the current row. |
Guid? GetGuidValue(int index) | Gets a Guid value from the current row. |
DateTime? GetDateTimeValue(int index) | Gets a Date/Time value from the current row. |
double? GetFloatValue(int index) | Gets a Float value from the current row. |
bool? GetBooleanValue(int index) | Gets a Boolean value from the current row. |
Type GetFieldType(int index) | Gets the data type of a specified column. |
object GetFieldValue(string columnName) | Returns a data value from the current data row. |
object GetFieldValue(int index) | Returns a data value from the current data row. |
Guid GetKey() | Returns the primary key of the current row. |
Guid? GetParentKey() | Returns parent key of the current row or null if the row has no parent data. |
int GetSegment() | Returns the data segment of the current row. |
int GetStatus() | Returns the status of the current row. |
string GetSessionId() | Returns the session ID or null if no session is used. |
object this[string columnName] | Returns a data value from the current data row. |
object this[int index] | Returns a data value from the current data row. |
IInternalReader GetChildTable(string tableName) | Returns a data reader that reads data from a specified child table of the table associated with this data reader. The returned data reader will only read data rows that are child data to the current data row. |
IInternalReader GetChildTable(string tableName, Guid parentRowId, SortedList<string, object> searchColumnValues) | Returns a data reader that reads data from a specified child table of the table associated with this data reader. |
IInternalReader GetChildTable(string tableName, SortedList<string, object> searchColumnValues) | Returns a data reader that reads data from a specified child table of the table associated with this data reader. |
IInternalReader GetChildTable(string tableName, Guid parentRowId) | Returns a data reader that reads data from a specified child table of the table associated with this data reader. |
IInternalReader GetChildTableRow(string tableName, Guid rowId) | Returns a data reader that reads data from a specified child table of the table associated with this data reader. |
IInternalReader GetParentTableRow() | Returns a data reader that reads the parent data row of the table associated with this data reader. |
IInternalReader GetParentTableRow(Guid rowId) | Returns a data reader that reads a specified data row from the parent table of the table associated with this data reader. |
IInternalReader GetParentTable(Guid parentRowId, SortedList<string, object> searchColumnValues) | Returns a data reader that reads data from the parent table of the table associated with this data reader. |
IInternalReader GetParentTable(SortedList<string, object> searchColumnValues) | Returns a data reader that reads data from the parent table of the table associated with this data reader. |
IInternalReader GetParentTable(Guid parentRowId) | Returns a data reader that reads data from the parent table of the table associated with this data reader. |
string[] ChildTableNames | An array containing the names of all child tables of the table associated with this data reader. |
IInternalDataColumn[] Columns | Return all columns in the current table. |
IDataReader GetDataReader() | Returns the underlying .NET data reader. |
string ReadAllToJson(bool isIncludeChildData = false, bool isIncludeFiles = false) | Reads all rows and returns them as a JSON string. |
string ReadAllToJsonList(bool isIncludeChildData = false, bool isIncludeFiles = false) | Reads all rows and returns them as a JSON list. |
string ToJson(bool isIncludeChildData = false, bool isIncludeFiles = false) | Returns the current row as a JSON string. |
Examples
The following example reads all data from the current row, including child data, and returns the data as a JSON string.
IRuntimeData data = args.RuntimeData;
IInternalReader reader = data.GetInternalReaderForCurrentRow();
string json;
if(reader.read())
{
reader.ToJson(true, false);
}
else
{
json = "No data available";
}
return json;
The following example reads all data associated with the current row from the first child table and returns the data as a JSON string.
IRuntimeData data = args.RuntimeData;
IInternalReader reader = data.GetInternalReaderForCurrentRow();
IInternalReader childReader = reader.GetChildTable(reader.ChildTableNames[0], args.DataRow.RowId);
string json = childReader.ReadAllToJsonList(true, false);
return json;