HTTP Request Interceptions
This feature allows you to intercept any HTTP request that passes through Till and respond with whatever response you want.
The following are some examples of useful scenarios:
By using interceptions, you can easily scale and maintain your scrapers without having to build interception logic into your scraper code.
Note: HTTP Request Interception is a Premium feature. If you've already upgraded your plan, you can restart Till and it will be turned on.
Once you have created a config file, you can add a configuration like so:
Note: The following interception is just an example. Your specific interception configuration may be different.
# Request Interception settings.
interceptions:
  # this is an example of one interception
  # you can add as many interceptions as you wish,
  # as long as the `name` is unique.
  - name: foo_bar # name can be anything, must be unique.
    # This interception should be disabled or not
    disabled: true
    # This is the matcher that will be used to determine
    # if a request should be intercepted or not.
    matches:
      # regex pattern of the URLs you're trying to match
      pattern: '.+\.(jpe?g|png|tiff|bmp|gif|webp)'
      # Methods of the URLs you're trying to match
      method: GET,POST
    # This is what will be served on any intercepted requests.
    responds:
      # HTTP status code
      code: 200
      # The HTTP header
      header:
        "Content-Type": "image/png"
        "Foo": "bar"
      # You can either respond with a `body` or a `file`
      body: "foo body"
      # Or, you can have it serve a local file
      file: "/path/to/your/image.png"
  # Add more of your interceptions below this line
The following are the options that you can set for each interception:
Configuration | Value |
---|---|
name | String. Anything as long as it's unique. For example: foo_bar |
disabled | Allowed values: true or false . (default false ) |
matches: | This is used to match an HTTP request in order to intercept it. |
matches.pattern | Regular Expression pattern |
matches.method | Any HTTP method. To match multiple methods, separate them with a comma. (example: GET,POST ) |
responds: | Once intercepted, the interception will respond with the following: |
responds.code | HTTP status code. (example: 200 ) |
responds.header | HTTP response header. (example: "Content-Type": "image/png" ) |
responds.body | Content to serve as the response body. (Example: foo bar ) |
responds.file | To serve a file on the response body. (Example: /path/to/your/file.txt ) |
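To make the options above concrete, here is a minimal Python sketch that models how `disabled`, `matches.pattern`, and `matches.method` could combine to decide whether a request is intercepted. It is an illustration only, not Till's implementation: the function names are hypothetical, and it assumes the pattern may match anywhere in the URL and that method names are compared case-insensitively.

```python
import re


def parse_methods(method_field):
    """Split a comma-separated `matches.method` value such as 'GET,POST'."""
    return {m.strip().upper() for m in method_field.split(",") if m.strip()}


def would_intercept(interception, method, url):
    """Return True if this illustrative model would intercept the request."""
    # A disabled interception never matches (the documented default is false).
    if interception.get("disabled", False):
        return False
    matches = interception["matches"]
    if method.upper() not in parse_methods(matches["method"]):
        return False
    # Assumption: the pattern is unanchored and may match anywhere in the URL.
    return re.search(matches["pattern"], url) is not None


# Same shape as the example configuration, but enabled.
example = {
    "name": "foo_bar",
    "disabled": False,
    "matches": {
        "pattern": r".+\.(jpe?g|png|tiff|bmp|gif|webp)",
        "method": "GET,POST",
    },
}

print(would_intercept(example, "GET", "https://example.com/img/logo.png"))     # True
print(would_intercept(example, "DELETE", "https://example.com/img/logo.png"))  # False
print(would_intercept(example, "GET", "https://example.com/index.html"))       # False
```

Running a check like this against a few representative URLs before editing the config can help catch patterns that are too broad or too narrow.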
The following are starter recipes that you can copy and modify based on your preference.
To intercept the Google Analytics JavaScript URL https://www.google-analytics.com/analytics.js, add a configuration under the interceptions configuration like so:
interceptions:
  # Add the lines below directly under the `interceptions` configuration:
  - name: google_analytics
    disabled: false
    matches:
      pattern: 'google-analytics\.com\/\S*\.js'
      method: GET
    responds:
      code: 200
      header:
        "Content-Type": "application/json"
      body: '{"msg":"Yaay it got intercepted!"}'
Now, confirm that this URL pattern is being intercepted by running the following curl command:
$ curl 'https://www.google-analytics.com/analytics.js' -kv --proxy http://localhost:2933
...
< HTTP/1.1 200 OK
< Connection: close
< Content-Type: application/json
< X-Dh-Gid: www.google-analytics.com-f400c0924cacf5d1918f5a188dfdb2fd
<
* TLSv1.2 (IN), TLS alert, Client hello (1):
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):
{"msg":"Yaay it got intercepted!"}
If you see the above message, your interception worked properly.
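If your scraper is written in Python, a similar check can be made without curl. The following is a minimal sketch using only the standard library. The helper names `build_till_opener` and `fetch_via_till` are hypothetical and not part of Till, the proxy address is the one used in the curl command above, and certificate verification is disabled to mirror curl's `-k` flag, so treat it as a testing aid only.

```python
import ssl
import urllib.request

TILL_PROXY = "http://localhost:2933"  # same proxy address as the curl example


def build_till_opener(proxy=TILL_PROXY):
    """Build a urllib opener that sends HTTP and HTTPS traffic through the proxy."""
    # Mirror curl's -k flag: skip certificate verification (testing only).
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy}),
        urllib.request.HTTPSHandler(context=context),
    )


def fetch_via_till(url, proxy=TILL_PROXY, timeout=10):
    """Return (status, content_type, body_text) for a GET request sent via the proxy."""
    opener = build_till_opener(proxy)
    with opener.open(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, response.headers.get("Content-Type"), body
```

With Till running and the google_analytics interception enabled, calling `fetch_via_till("https://www.google-analytics.com/analytics.js")` should return status 200, the `application/json` content type, and the JSON body configured in the recipe.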
In this recipe, we're going to intercept a request for an image (https://fetchtest.datahen.com/assets/img/cat.jpeg) and replace it with a local text file.
First, let's create a text file. Copy the following content into a file on your computer called replacement.txt:
Hello from replacement.txt
Next, add a configuration under the interceptions configuration like so:
interceptions:
  # Add the lines below directly under the `interceptions` configuration:
  - name: cat_replacement
    disabled: false
    matches:
      pattern: 'fetchtest\.datahen\.com\/\S*\.jpeg'
      method: GET
    responds:
      code: 200
      header:
        "Content-Type": "text/plain"
      file: '/path/to/your/replacement.txt'
Now, confirm that this URL pattern is being intercepted by running the following curl command:
curl 'https://fetchtest.datahen.com/assets/img/cat.jpeg' -kv --proxy http://localhost:2933
...
< HTTP/1.1 200 OK
< Connection: close
< Content-Type: text/plain
< X-Dh-Gid: fetchtest.datahen.com-7c46f04a27de059e8c4eb7bd199dea2c
<
* TLSv1.2 (IN), TLS alert, Client hello (1):
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):
Hello from replacement.txt
If you see the above message, your interception worked properly.
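If the interception does not fire, the pattern is a good first thing to check, because a single missing backslash changes its meaning: `\S*` matches any run of non-whitespace characters, while `S*` matches only a run of the literal letter S. The sketch below tests both spellings against the recipe's URL with Python's `re` module. It assumes Till's regex engine treats these constructs the same way, and `pattern_matches` is a hypothetical helper name.

```python
import re


def pattern_matches(pattern, url):
    """Return True if the pattern matches anywhere in the URL."""
    return re.search(pattern, url) is not None


url = "https://fetchtest.datahen.com/assets/img/cat.jpeg"

# `\S*` consumes the path segment "assets/img/cat" before ".jpeg".
with_backslash = r"fetchtest\.datahen\.com\/\S*\.jpeg"
# `S*` can only consume literal "S" characters, so the path never matches.
without_backslash = r"fetchtest\.datahen\.com\/S*\.jpeg"

print(pattern_matches(with_backslash, url))     # True
print(pattern_matches(without_backslash, url))  # False
```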