Change Tracking
An agent can keep track of the latest changes that have been made to extracted data.
The agent marks extracted data as added, modified, or deleted. If data that was marked as deleted is extracted again in a later run, it is marked as "returned", or as "returned modified" if it comes back in a modified state.
An agent can be configured to export only data that has changed since the last successful run, or only data that has changed within a specified number of days.
Data will be marked as deleted if it was extracted last time the agent ran, but not during the current run. Data will only be marked as deleted if an agent completes successfully. This prevents data from being incorrectly marked as deleted if an agent fails halfway through a run. The Success Criteria options are used to define when an agent run should be considered successful.
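The status rules above can be sketched in Python. This is a minimal illustration, not Sequentum's implementation: the function name `next_status`, its parameters, and the choice to keep the previous status when nothing has changed are assumptions made for the example. The status names are the ones listed for the last change column.

```python
def next_status(prev_row, prev_status, new_row, run_succeeded):
    """Return the last change status of one data entry after a run.

    prev_row      -- values stored from earlier runs, or None if never seen
    prev_status   -- status stored from earlier runs, or None
    new_row       -- values extracted in this run, or None if not found
    run_succeeded -- True if the run met its success criteria
    """
    if new_row is None:
        # Missing data is only marked as deleted after a successful run.
        if prev_row is not None and run_succeeded:
            return "Deleted"
        return prev_status
    if prev_row is None:
        return "Added"
    changed = new_row != prev_row
    if prev_status == "Deleted":
        return "ReturnedModified" if changed else "Returned"
    # Assumption: an unchanged entry keeps its previous status.
    return "Modified" if changed else prev_status


# A failed run does not mark missing data as deleted.
status_after_failed_run = next_status({"name": "Widget"}, "Added", None, False)
# A successful run does.
status_after_good_run = next_status({"name": "Widget"}, "Added", None, True)
```

In this sketch a failed run leaves the stored status untouched, which mirrors the rule that data is only marked as deleted when the agent completes successfully.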
Default change tracking can be enabled in the Internal Database window, which is available from the Data menu in Sequentum Enterprise.
Change tracking can be fine-tuned by setting advanced change tracking options. These options can be set by clicking the Configure Change Tracking button.
The following advanced change tracking options are available.
Option | Description |
---|---|
Enabled | Enables or disables change tracking. |
Export Method | Specifies how to export data when an agent is not exporting historical data to a database. When exporting historical data to a database, the Historical Data Export Method option is used instead. The following options are available: Export All (exports all data, whether or not it has changed); Since Last Successful Run (exports all data that has changed since the last successful run); Since Number of Days (exports all data that has changed within the specified number of days). |
Export Method Days | When Export Method is set to Since Number of Days, only data that has changed within this number of days is exported. |
Historical Data Export Method | This option is used instead of Export Method when exporting historical data to a database. The following options are available: All Data (exports all data, whether or not it has changed); Changed Data Only (exports all data that has changed since the last agent run). |
Track Deletes | Specifies if an agent should track deleted data. If an agent does not track deleted data, the last change status will not change for data that was not found in the last successful agent run. If an agent is tracking deleted data, the last change status will be set to Deleted for data that was not found in the last successful agent run. |
Delete on Days Not Seen | A data row will be marked as deleted if it has not been seen for the specified number of days. With the default value of 0, a data row is marked as deleted as soon as it is not found in the next run. |
Last Change Enabled | Exports the type of change that was last made to a data row. |
Last Change Column Name | The name of the data column where the type of change is stored. Possible values in this column: Added (a data row has been added); Modified (a data row has changed since its previous run); Deleted (a data row has been deleted); Returned (a data row that was marked as deleted has been seen again); ReturnedModified (a data row that was marked as deleted has been seen again with some columns changed). |
Change Date Enabled | Exports the date a data row was last changed. |
Change Date Column Name | The name of the data column where the change date is stored. |
Insert Date Enabled | Exports the date a data row was first inserted. This is the date the data was first extracted. |
Insert Date Column Name | The name of the data column where the insert date is stored. |
Update Date Enabled | Exports the date a data row was last processed. This is the date the data was last extracted and compared to existing data. Note that the data may not have changed on this date. |
Update Date Column Name | The name of the data column where the update date is stored. |
Identifier Enabled | Exports the object identifier used in the internal database. This value uniquely identifies the data row and will not change unless the internal database is recreated. |
Identifier Column Name | The name of the data column where the object identifier is stored. |
Columns Affected Enabled | Exports a column that contains the names of columns affected by a change. |
Columns Affected Column Name | The name of the data column where the columns affected value is stored. |
Changed Last Run Enabled | Exports a value indicating if a data row changed last time the agent was run. |
Changed Last Run Column Name | The name of the data column to store the value indicating if a data row changed the last time an agent was run. |
NOTE:
When an agent runs with the change tracking Export Method set to Since Last Successful Run, the exported result file contains all added records. However, if you then regenerate the export data for the same run using the Data -> Regenerate Export Data menu, the regenerated file will be blank.
This is intentional; otherwise, you could end up with duplicate data in the target data store.
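As an illustration of the Export Method options in the table above, the following Python sketch filters rows by their change date. It is a simplified model, not the product's implementation: the function `rows_to_export`, the `change_date` field name, and the behaviour when no earlier successful run exists are all assumptions.

```python
from datetime import datetime, timedelta


def rows_to_export(rows, method, last_successful_run=None, days=0, now=None):
    """Select rows for export under a hypothetical model of Export Method.

    Each row is a dict with a "change_date" datetime (assumed field name).
    """
    if method == "Export All":
        return list(rows)
    if method == "Since Last Successful Run":
        if last_successful_run is None:
            # Assumption: with no earlier successful run, export everything.
            return list(rows)
        return [r for r in rows if r["change_date"] > last_successful_run]
    if method == "Since Number of Days":
        cutoff = (now or datetime.now()) - timedelta(days=days)
        return [r for r in rows if r["change_date"] >= cutoff]
    raise ValueError(f"unknown export method: {method}")


rows = [
    {"id": 1, "change_date": datetime(2024, 1, 1)},
    {"id": 2, "change_date": datetime(2024, 1, 9)},
]
recent = rows_to_export(
    rows, "Since Number of Days", days=3, now=datetime(2024, 1, 10)
)
since_run = rows_to_export(
    rows, "Since Last Successful Run", last_successful_run=datetime(2024, 1, 5)
)
```

With the sample data, only the row changed on 9 January falls inside either window; Export All would return both rows.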
Key Columns
When an agent exports data, it compares extracted data with existing data to see if the data has been added, changed or deleted. To see if a data entry has changed, the agent needs to be able to uniquely identify it. By default, an agent uses all captured data to identify a data entry. This means that whenever any captured value changes, the entry no longer matches any existing data entry and is identified as a new data entry instead of a modified one.
For change tracking to identify modified data entries correctly, it's important to mark the capture commands that extract data that uniquely identifies a data entry. For example, when extracting product data, a website may display a product ID that a capture command can extract and the agent can use as a unique identifier. Set the command option Key Column to mark a capture command as a command that extracts data that can be used to uniquely identify a data entry.
Multiple capture commands can be marked as key columns to combine extracted data from multiple commands into a value that uniquely identifies a data entry.
Each container command that is configured to generate a separate data table should have one or more capture commands that are marked as key columns.
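The effect of key columns can be shown with a short Python sketch. The function `entry_key` and the column names `product_id` and `price` are hypothetical and only model the matching behaviour described above.

```python
def entry_key(row, key_columns):
    """Build the value used to match a row against existing data.

    With no key columns, all captured values form the key, which models
    the default behaviour described above.
    """
    columns = key_columns or sorted(row)
    return tuple(row[c] for c in columns)


old = {"product_id": "A1", "price": "10.00"}
new = {"product_id": "A1", "price": "12.00"}

# Default: the price change produces a different key, so the row
# looks like a new data entry.
matches_by_default = entry_key(old, []) == entry_key(new, [])

# With product_id as the key column, the rows match and the price
# change can be detected as a modification.
matches_with_key = entry_key(old, ["product_id"]) == entry_key(new, ["product_id"])
```

Passing several column names to `entry_key` models marking multiple capture commands as key columns, since the key then combines all of their values.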
Exclude From Change Tracking
Capture commands can be excluded from change tracking, so if the captured data changes, it will not cause the last change status for the data entry to change.
Include: If there is a change in the captured data, the data row will update with the new change and the last change status will be updated to reflect this change.
Exclude but Update: If the captured data changes, the data row will update with the new changes but the last change status will NOT be updated.
This option can be useful if an agent is configured with a run_date capture command that returns the date on which the agent ran. The date column will be updated during every run, but the data row will not have its last change status updated if the date column was the only change.
Exclude: If the captured data changes, the last change status for the data entry will not change.
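The effect of these options on the last change status can be sketched as follows. This Python example is an assumption-laden illustration: the function `is_tracked_change` and the `modes` mapping are hypothetical, and the sketch only models whether the status changes, not whether stored values are updated.

```python
def is_tracked_change(old_row, new_row, modes):
    """Return True if a difference between the rows should update the
    last change status.

    modes maps a column name to "Include", "Exclude but Update" or
    "Exclude". Columns missing from modes are treated as "Include"
    (an assumption for this sketch).
    """
    for column, new_value in new_row.items():
        if old_row.get(column) == new_value:
            continue
        if modes.get(column, "Include") == "Include":
            return True
    return False


modes = {"run_date": "Exclude but Update"}
old = {"product_id": "A1", "price": "10.00", "run_date": "2024-01-01"}

# Only run_date differs, so the status is left alone.
only_date = {"product_id": "A1", "price": "10.00", "run_date": "2024-01-02"}
date_only_result = is_tracked_change(old, only_date, modes)

# The price differs too, so the status is updated.
price_too = {"product_id": "A1", "price": "12.00", "run_date": "2024-01-02"}
price_result = is_tracked_change(old, price_too, modes)
```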