Sticky Sessions

By default, Till randomizes the proxy IP and User-Agent string for every request that passes through it, and gives each request its own dedicated cookie jar.
Sometimes, to avoid anti-scraper detection and to mimic real user behavior, you need to reuse the same values across several requests.

The Sticky Sessions feature allows you to "stick" the Proxy IP, User-Agent, and Cookie Jar to a certain session.
You can then specify a set of requests to use this session.

Used properly, it supports advanced scraping scenarios that let you scale your scrapers and avoid anti-scraper detection.

Manage Cookies

The Sticky Sessions feature lets you manage cookies across your requests.
It lets you "stick" a cookie jar to a session. When multiple requests use this session, they send the cookies stored in that cookie jar. Any cookie that the target server sets or modifies is also saved to the session's cookie jar.

This is useful in advanced scraping scenarios where you need to avoid anti-scraper detection, because it lets you mimic a real user interacting with the target website.

How to use

To use sticky sessions, you need to specify the X-DH-Session-ID header on the HTTP request.

HTTP Header       Value
X-DH-Session-ID   Any string value. For example: foo

Example Using Curl

This is an example of using sticky sessions with curl:

$ curl 'https://fetchtest.datahen.com/echo/request' -H 'X-DH-Session-ID: foo' -H 'X-DH-Cache-Freshness: now' -k --proxy http://localhost:2933

Note: The example above uses foo as the session ID.
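
The same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official Till client: the function names are hypothetical, and the proxy address, URL, and headers are copied from the curl example above. The TLS settings mirror curl's -k flag.

```python
import ssl
import urllib.request

TILL_PROXY = "http://localhost:2933"  # proxy address from the curl example


def build_sticky_request(url, session_id):
    """Build a GET request that carries the sticky-session header."""
    headers = {
        "X-DH-Session-ID": session_id,
        "X-DH-Cache-Freshness": "now",  # copied as-is from the curl example
    }
    return urllib.request.Request(url, headers=headers)


def build_till_opener(proxy=TILL_PROXY):
    """Create an opener that routes HTTP and HTTPS traffic through the proxy."""
    # Equivalent of curl's -k flag: skip TLS certificate verification.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy}),
        urllib.request.HTTPSHandler(context=context),
    )


def fetch(url, session_id, proxy=TILL_PROXY):
    """Send the request through the proxy and return the body as text."""
    request = build_sticky_request(url, session_id)
    with build_till_opener(proxy).open(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

With Till running locally, fetch("https://fetchtest.datahen.com/echo/request", "foo") would correspond to the curl command above. Note that urllib changes the capitalization of header names when sending them; HTTP header names are case-insensitive, so this should not matter.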

Behavior

The following table shows how your requests behave when you use sticky sessions.

In this example, we are mimicking two different users accessing the target website at the same time.

Req #   Session ID   IP Used        User-Agent   Cookie Jar
1       user1        198.51.100.1   chrome       (empty)
2       user2        198.51.100.2   firefox      (empty)
3       user1        198.51.100.1   chrome       some-val: val1
4       user2        198.51.100.2   firefox      some-val: val1
5       user1        198.51.100.1   chrome       some-val: val1, another-val: val2
6       user2        198.51.100.2   firefox      some-val: val1, another-val: val2
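
The behavior in the table can be modeled with a small in-memory store. This is only an illustrative toy model, not how Till is implemented: the SessionStore class, its method names, and the IP and User-Agent pools are all hypothetical. It shows the two properties the table illustrates: a session keeps the identity it was first given, and each session has its own cookie jar that grows as the server sets cookies.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Session:
    ip: str
    user_agent: str
    cookies: dict = field(default_factory=dict)


class SessionStore:
    """Toy model: one sticky identity and one cookie jar per session ID."""

    def __init__(self, ips, user_agents, seed=None):
        self._ips = list(ips)
        self._user_agents = list(user_agents)
        self._rng = random.Random(seed)
        self._sessions = {}

    def request(self, session_id, set_cookies=None):
        """Simulate one request; return (ip, user_agent, cookies_sent)."""
        session = self._sessions.get(session_id)
        if session is None:
            # First use of this ID: pick an identity once and keep it.
            session = Session(
                self._rng.choice(self._ips),
                self._rng.choice(self._user_agents),
            )
            self._sessions[session_id] = session
        sent = dict(session.cookies)  # cookies sent with this request
        session.cookies.update(set_cookies or {})  # cookies set by the server
        return session.ip, session.user_agent, sent
```

For example, after store.request("user1", {"some-val": "val1"}), the next call with the "user1" ID returns the same IP and User-Agent and sends some-val, while a first call with "user2" starts with an empty jar.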

Configuration

Note: Sticky Sessions is a Premium feature. If you've already upgraded your plan, restart Till to turn it on.

The following are the configuration options that you can set:

Configuration       Value
sessions.disabled   Allowed values: true or false (default: false).
sessions.ttl        Time-to-live for the session records. Allowed values: minute, hour, day, week, fortnight, month, year, or forever (default: week).

Note: Till stores the session data inside your Data Directory on your local disk. You can change the TTL setting to save disk space. The shorter the TTL, the less disk space is used.

Configuration Steps

Step 1: Configure Till

If you have already created a config file, add the following configuration to it:

# Sticky Session settings
sessions:
  # Disable the sticky sessions feature.
  # Defaults to false.
  disabled: false
  
  # TTL (Time To Live). How long a session record will be allowed to live before it gets deleted.
  # Defaults to "week".
  ttl: "week"
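
If you generate or edit this configuration from a script, a small pre-check can catch typos before you restart Till. The sketch below is a hypothetical helper, not part of Till: it takes the sessions settings as a plain dictionary (however you loaded them) and applies the allowed values and defaults listed in the table above.

```python
ALLOWED_TTLS = (
    "minute", "hour", "day", "week", "fortnight", "month", "year", "forever",
)
DEFAULTS = {"disabled": False, "ttl": "week"}


def resolve_sessions_config(raw=None):
    """Return the sessions settings with defaults applied, or raise ValueError."""
    settings = {**DEFAULTS, **(raw or {})}
    if not isinstance(settings["disabled"], bool):
        raise ValueError("sessions.disabled must be true or false")
    if settings["ttl"] not in ALLOWED_TTLS:
        raise ValueError(
            "sessions.ttl must be one of: " + ", ".join(ALLOWED_TTLS)
        )
    return settings
```

Calling resolve_sessions_config({"ttl": "day"}) returns the settings with disabled left at its default, while an unsupported TTL such as "decade" raises ValueError.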

Step 2: Verify Till

Now verify that your Till configuration works and that your requests are served using their sticky sessions.

To do that, follow these steps:

  1. Send a request to an echo endpoint
  2. Send a request to a URL that sets a cookie
  3. Send another request to an echo endpoint to verify stickiness

Note: In the following examples, we will use the session ID user1.

Let's get started:

1. Send a request to an echo endpoint

Let's send a request to the echo endpoint so that we can compare the result with later requests.

You can use the following curl command:

$ curl 'https://fetchtest.datahen.com/echo/request' -H 'X-DH-Session-ID: user1' -H 'X-DH-Cache-Freshness: now' -kv --proxy http://localhost:2933
...
> GET /echo/request HTTP/1.1
> Host: fetchtest.datahen.com
> User-Agent: curl/7.58.0
> Accept: */*
> X-DH-Session-ID: user1
> X-DH-Cache-Freshness: now
>
< HTTP/1.1 200 OK
< Transfer-Encoding: chunked
< Alt-Svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400, h3=":443"; ma=86400
< Cf-Cache-Status: DYNAMIC
< Cf-Ray: 674e9ae4abedebc5-LAX
< Connection: keep-alive
< Content-Type: text/plain; charset=utf-8
< Date: Mon, 26 Jul 2021 15:19:13 GMT
< Expect-Ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< Nel: {"report_to":"cf-nel","max_age":604800}
< Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=%2FfZqElU9nSOIMEuhIsctzRRc8xaaNpaG8JtS2%2B9KlZvCCChutSdwabhLWfE2RZfU3fDZHylptfiSRw1ogSceUkj5BbBQk9BFRN4%2Bmol9Ap7UNUjvt7Zfb9K95EjMO49PI%2F6YlXEFcwk%3D"}],"group":"cf-nel","max_age":604800}
< Server: cloudflare
< X-Dh-Gid: fetchtest.datahen.com-144a91f641d36c08dade39f739b05d31
<
GET /echo/request HTTP/2.0
Host: fetchtest.datahen.com
Accept: */*
Accept-Encoding: gzip
Cdn-Loop: cloudflare
Cf-Connecting-Ip: 198.51.100.1
Cf-Ipcountry: US
Cf-Ray: 674e9ae4abedebc5-LAX
Cf-Visitor: {"scheme":"https"}
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183 Safari/537.36
X-Forwarded-For: 198.51.100.1
X-Forwarded-Proto: https

Make a note of the IP and User-Agent from the response body, so that you can compare them with a later request.

Cf-Connecting-Ip: 198.51.100.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183 Safari/537.36
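
If you would rather extract these two values in a script than read them by eye, the echoed request can be parsed as plain text. This is a minimal sketch with hypothetical function names; it assumes the response body has the shape shown above, with a request line followed by one "Name: value" header per line.

```python
def parse_echo_body(body):
    """Parse the echoed request into (request_line, headers)."""
    lines = [line.strip() for line in body.strip().splitlines()]
    if not lines:
        return "", {}
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, since values may contain colons.
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers


def session_identity(body):
    """Return the (proxy IP, User-Agent) pair seen by the target server."""
    _, headers = parse_echo_body(body)
    return headers.get("cf-connecting-ip"), headers.get("user-agent")
```

Header names are lowercased in the returned dictionary so that lookups do not depend on how the endpoint capitalizes them.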

2. Send a request to a URL that sets a cookie

Next, let's check that the sticky session's cookie jar saves cookies.
Let's send a request to a URL endpoint that sets a cookie:

$ curl 'https://fetchtest.datahen.com/cookie' -H 'X-DH-Session-ID: user1' -H 'X-DH-Cache-Freshness: now' -kv --proxy http://localhost:2933
...
> GET /cookie HTTP/1.1
> Host: fetchtest.datahen.com
> User-Agent: curl/7.58.0
> Accept: */*
> X-DH-Session-ID: user1
> X-DH-Cache-Freshness: now
>
< HTTP/1.1 200 OK
< Content-Length: 26
< Alt-Svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400, h3=":443"; ma=86400
< Cf-Cache-Status: DYNAMIC
< Cf-Ray: 674ea161aa2ce50e-LAX
< Connection: keep-alive
< Content-Type: text/plain; charset=utf-8
< Date: Mon, 26 Jul 2021 15:23:39 GMT
< Expect-Ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< Nel: {"report_to":"cf-nel","max_age":604800}
< Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=MyzDNi0GSp7Y4lNYKDHOPbxUpQrI4nrDgdZmQDOz4cQXWKYczOqIpSW4NsG7OYSu%2BIGR5ptTKNkLqN8O1lKl2VPIlVxbePbsfgMtNgicIlVYScmbUYYZ%2BUvMSVWLrUDvlRCqlEYBbSc%3D"}],"group":"cf-nel","max_age":604800}
< Server: cloudflare
< Set-Cookie: cookieA=cookieAhere
< Set-Cookie: cookieB=cookieBhere
< X-Dh-Gid: fetchtest.datahen.com-7147f8e17fdc930f6798549682ecb175
<
* Connection #0 to host localhost left intact
This should set the cookie

The above request sets the following cookies, which Till will save in the cookie jar of the session with the ID user1:

< Set-Cookie: cookieA=cookieAhere
< Set-Cookie: cookieB=cookieBhere
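
To know what to look for in the next step, you can derive the Cookie header that a client holding these cookies would send back. The helper below is a simplified, hypothetical sketch: it keeps only the name=value part of each Set-Cookie value and ignores attributes such as Path, Domain, or Expires, which a real cookie jar would also take into account.

```python
def cookie_header_from_set_cookie(set_cookie_values):
    """Build the Cookie header value a client would send back."""
    jar = {}
    for value in set_cookie_values:
        # Only the first "name=value" pair is the cookie itself; anything
        # after ";" is an attribute such as Path or Expires.
        pair = value.split(";", 1)[0].strip()
        name, sep, cookie_value = pair.partition("=")
        if sep and name.strip():
            jar[name.strip()] = cookie_value.strip()
    return "; ".join(f"{name}={val}" for name, val in jar.items())
```

Passing the two Set-Cookie values from the response above gives the string to expect in the Cookie header of the next request.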

3. Send another request to an echo endpoint to verify stickiness

Now that the cookies set on the previous request have been saved into the cookie jar, let's verify this by doing another request to the echo endpoint.

$ curl 'https://fetchtest.datahen.com/echo/request' -H 'X-DH-Session-ID: user1' -H 'X-DH-Cache-Freshness: now' -kv --proxy http://localhost:2933
...
> GET /echo/request HTTP/1.1
> Host: fetchtest.datahen.com
> User-Agent: curl/7.58.0
> Accept: */*
> X-DH-Session-ID: user1
> X-DH-Cache-Freshness: now
>
< HTTP/1.1 200 OK
< Transfer-Encoding: chunked
< Alt-Svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400, h3=":443"; ma=86400
< Cf-Cache-Status: DYNAMIC
< Cf-Ray: 674eaa759b5831af-LAX
< Connection: keep-alive
< Content-Type: text/plain; charset=utf-8
< Date: Mon, 26 Jul 2021 15:29:51 GMT
< Expect-Ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< Nel: {"report_to":"cf-nel","max_age":604800}
< Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=BLYoAu46fx4eTjPgGS8hjwpz9IpCwloExRM6aYshaWLJQibUxWdcyJSj6wBykofo%2FbQi%2FFv3PSYX2PsbXYTaBn1anbiGJ3MaiGHn%2FLuB0BMv3I8F0QCl94q3DeEoTzRSrvVacyO0HlI%3D"}],"group":"cf-nel","max_age":604800}
< Server: cloudflare
< X-Dh-Gid: fetchtest.datahen.com-144a91f641d36c08dade39f739b05d31
<
GET /echo/request HTTP/1.1
Host: fetchtest.datahen.com
Accept: */*
Accept-Encoding: gzip
Cdn-Loop: cloudflare
Cf-Connecting-Ip: 198.51.100.1
Cf-Ipcountry: US
Cf-Ray: 674eaa759b5831af-LAX
Cf-Visitor: {"scheme":"https"}
Connection: Keep-Alive
Cookie: cookieA=cookieAhere; cookieB=cookieBhere
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183 Safari/537.36
X-Forwarded-For: 198.51.100.1
X-Forwarded-Proto: https

We now need to verify two things:

  • That the proxy IP and User-Agent are the same throughout the requests.
  • That the cookie jar saves cookies across the requests.

Let's verify that the proxy IP and User-Agent are the same. Check that the following values match those from request 1.

Cf-Connecting-Ip: 198.51.100.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183 Safari/537.36

Next, let's verify that the cookies set in request 2 were sent with this request. If you see the following value in the response body, the cookie jar works correctly.

Cookie: cookieA=cookieAhere; cookieB=cookieBhere
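
Both checks can also be expressed as a small comparison function. This is a hypothetical sketch: it assumes you have already turned the echoed headers of request 1 and request 3 into dictionaries with lowercase header names, and it reports what differs instead of stopping at the first mismatch.

```python
def check_stickiness(first, later, expected_cookie):
    """Return a list of problems; an empty list means the session stuck."""
    problems = []
    # The proxy IP and User-Agent must be identical in both requests.
    for name in ("cf-connecting-ip", "user-agent"):
        if first.get(name) != later.get(name):
            problems.append(
                f"{name} changed: {first.get(name)!r} -> {later.get(name)!r}"
            )
    # The later request must carry the cookies saved in the cookie jar.
    if later.get("cookie") != expected_cookie:
        problems.append(f"cookie mismatch: got {later.get('cookie')!r}")
    return problems
```

An empty result for the headers of request 1, the headers of request 3, and the expected value "cookieA=cookieAhere; cookieB=cookieBhere" corresponds to the two manual checks above passing.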

Congratulations! You've set up sticky sessions correctly and used them successfully.