Performance Sessions

Sessions allow you to run multiple instances of the same agent at the same time. This can be used to split up large web scraping tasks and have multiple instances of an agent working on the task.

Sequentum Enterprise splits up a large task by dividing list entries into subsets, and each instance of the agent will then work on one of those subsets. For example, if you are processing a long list of start URLs, Sequentum Enterprise could divide the list into two and have one instance of the agent go through the first half of the list and a second instance of the agent go through the second half of the list.

Agents already use multithreading internally to split up work in a similar fashion to performance sessions, but sometimes it's faster to have multiple processes working on a task rather than multiple threads, especially if you run the processes on multiple computers. A single instance of an agent processing a website using multithreading needs to wait for threads to catch up at certain points. For example, when processing pagination with multiple threads going through a list of links on each page, some threads may finish before others, but they'll need to wait for all threads to finish before the agent can move to the next page in pagination. Multiple instances of the same agent run completely independently, so one instance never has to wait for other instances to catch up.

Running an Agent in a Session

To run an agent in a session, you must specify a session ID when you run the agent and the agent must be configured to support sessions. To configure an agent to support sessions, set the agent option Support Sessions to Performance Sessions under the Agent Settings tab.

performanceSessions.png

When using Performance Sessions, the session ID must be in a special format that dictates how work is divided between sessions. The input list associated with the Agent command (start command) will be divided by default, but you can specify any list command in an agent by setting the option Process in Sessions on the list command. You can only set this option on one command in an agent.

The special format of the session ID specifies how many sessions will work on the input list, and the subset of list entries the current instance of the agent should work on. The session ID must be in the following format.

[Subset to Process]/[Total Number of Sessions]

For example, if you have an agent that processes a list of 10 start URLs, and you want 5 instances of the agent to each process 2 URLs, then the session ID "3/5" would start an instance of the agent that processes URLs number 5 and 6.
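The arithmetic behind the "3/5" example can be sketched in Python. This is an illustration only, not Sequentum Enterprise code: the function names are hypothetical, and the rule used for lists that do not divide evenly (contiguous chunks, with earlier sessions taking one extra entry) is an assumption that the article does not confirm.

```python
def parse_session_id(session_id):
    """Split "[subset]/[total]" into two integers, e.g. "3/5" -> (3, 5)."""
    subset_text, total_text = session_id.split("/")
    subset, total = int(subset_text), int(total_text)
    if not 1 <= subset <= total:
        raise ValueError(f"subset must be between 1 and {total}: {session_id!r}")
    return subset, total


def entries_for_session(entries, session_id):
    """Return the slice of `entries` that one session would process.

    Assumption: the list is cut into contiguous chunks of near-equal size.
    The product's real rule for uneven lists may differ.
    """
    subset, total = parse_session_id(session_id)
    base, extra = divmod(len(entries), total)
    # Sessions 1..extra each take one additional entry.
    start = (subset - 1) * base + min(subset - 1, extra)
    size = base + (1 if subset <= extra else 0)
    return entries[start:start + size]


urls = [f"url{i}" for i in range(1, 11)]   # 10 placeholder start URLs
print(entries_for_session(urls, "3/5"))    # ['url5', 'url6']
```

With 10 entries and 5 sessions every chunk has exactly 2 entries, so session 3 receives entries 5 and 6, matching the example above.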

You can start multiple sessions at once by specifying a comma-separated list of numbers, or a range of numbers. Here are a few examples:

"1-10/10" starts all 10 sessions.

"1,3-6/10" starts sessions 1 and 3 to 6.

"1,4,5/10" starts sessions 1, 4 and 5.
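To make the list and range notation concrete, the following Python sketch expands a multi-session specification into the individual session IDs it stands for. The function name expand_sessions is hypothetical, and the parsing rules are inferred from the three examples above rather than taken from the product.

```python
def expand_sessions(spec):
    """Expand e.g. "1,3-6/10" into ["1/10", "3/10", "4/10", "5/10", "6/10"]."""
    selection, total_text = spec.split("/")
    total = int(total_text)
    sessions = []
    for part in selection.split(","):
        if "-" in part:
            # A range such as "3-6" includes both end points.
            first, last = (int(n) for n in part.split("-"))
            sessions.extend(range(first, last + 1))
        else:
            sessions.append(int(part))
    for number in sessions:
        if not 1 <= number <= total:
            raise ValueError(f"session {number} is outside 1..{total}")
    return [f"{number}/{total}" for number in sessions]


print(expand_sessions("1,3-6/10"))  # ['1/10', '3/10', '4/10', '5/10', '6/10']
```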

You can specify a session ID when running an agent from the command line by using the command-line option session_id. The following command runs an agent named sequentum with the session ID "2/10".

RunAgent.exe "sequentum" session_id "2/10"
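If you prefer to launch each session as its own process, for example to spread sessions across several machines, a small wrapper script can generate one command per session. The Python sketch below only reuses the argument shape of the command shown above. It assumes RunAgent.exe can be found on the PATH, and the helper names build_commands and launch_all are hypothetical.

```python
import subprocess


def build_commands(agent_name, total_sessions, exe="RunAgent.exe"):
    """Build one argument list per session, mirroring the command shown above."""
    return [
        [exe, agent_name, "session_id", f"{number}/{total_sessions}"]
        for number in range(1, total_sessions + 1)
    ]


def launch_all(commands):
    """Start every command as a separate process and wait for all of them.

    Not called here, because it requires RunAgent.exe to be installed.
    """
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]


for command in build_commands("sequentum", 3):
    print(" ".join(command))
```

On a single machine, a range such as "1-3/3" in one session ID achieves the same result without a wrapper.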

You can also specify a session ID when you run an agent from the Sequentum Enterprise editor. Open the Run window and enter a session ID or select an existing session ID from the drop-down box, and then press the Get button to make that session active.

NOTICE

The session panel is only visible if the agent supports sessions.

performanceSessions1.png

Run an agent with a session from the Sequentum Enterprise editor.

You can select All Sessions from the session dropdown list and press the Get button to open a window that displays status information for all agent sessions.

performanceSessions2.png

You can delete all sessions by selecting All Sessions from the session dropdown list and then pressing the Delete button. You can also delete a range of sessions by specifying a session range.

Session Data Cleanup

Normally, data generated by sessions is cleaned up when the session expires. Ordinary sessions usually get a new session ID each time, so Sequentum Enterprise removes session data periodically to keep old data from accumulating, unless you specifically tell it not to. Performance sessions are different: they are not always new sessions with new session IDs, so by default Sequentum Enterprise does not remove data generated by performance sessions.

You can manually delete one or more sessions from the Run window in the Sequentum Enterprise editor.

When you delete a session, Sequentum Enterprise will only clean up externally exported session data if the agent is configured to export data to a database. If you are using an Export Script or if you are exporting to a file format, then you are responsible for any cleanup of exported session data.

You may not always want to remove externally exported session data when you delete a session. To prevent externally exported session data from being removed, set the agent option Cleanup External Session Data to False. This option can be found in the Sessions section on the Properties tab of the main agent command (first command).

Increasing Session Limits

You may start seeing various agent errors if you run too many agent sessions on a Windows server. The errors may include scripting errors and system errors, such as Error creating Windows handle. This happens because each agent session opens web browser windows, and Windows stores information about each web browser window in an allocated heap of limited size.

When you run agents from the Agent Control Center or the built-in scheduler, the agents are run as non-interactive programs, and the heap memory size allocated to such programs is small by default. You can increase the size by changing a registry key in Windows.

Use Regedit.exe to open the Windows Registry Editor and find the following Registry key:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\SubSystems\Windows

On a Windows server the default value will look something like this:

%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=1024,20480,768 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=winsrv:ConServerDllInitialization,2 ServerDll=sxssrv,4 ProfileControl=Off MaxRequestThreads=16

The critical bit is:

SharedSection=1024,20480,768

The values are in kilobytes. The second number (20480) is the heap size for interactive programs. The third number (768) is the heap size for non-interactive programs. We recommend you start by increasing the value for non-interactive programs to 2048 or 4096.

SharedSection=1024,20480,2048

Do not increase the value more than necessary, and do not set it above 8192, because each service on your system will then consume more memory.
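Editing the long registry value by hand is error-prone, so the new string can be prepared with a short script first. The Python sketch below is pure string manipulation: it does not read or write the registry, the function name set_noninteractive_heap is hypothetical, and the sample value is a shortened copy of the default shown above.

```python
import re

MAX_RECOMMENDED = 8192  # upper bound stated in this article


def set_noninteractive_heap(value, new_size):
    """Return `value` with the third SharedSection number replaced by `new_size`."""
    if new_size > MAX_RECOMMENDED:
        raise ValueError(f"{new_size} exceeds the recommended maximum of {MAX_RECOMMENDED}")
    match = re.search(r"SharedSection=(\d+),(\d+),(\d+)", value)
    if match is None:
        raise ValueError("no SharedSection=a,b,c entry found")
    shared, interactive, _old_noninteractive = match.groups()
    replacement = f"SharedSection={shared},{interactive},{new_size}"
    # Keep everything before and after the SharedSection entry unchanged.
    return value[:match.start()] + replacement + value[match.end():]


# Shortened sample value, for illustration only.
current = (
    r"%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows "
    r"SharedSection=1024,20480,768 Windows=On SubSystemType=Windows"
)
print(set_noninteractive_heap(current, 2048))
```

Only the third number changes. The first two numbers and the rest of the value are copied through untouched.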

 
