Configuration
There are two ways to configure Till: with command-line flags or through a config file.
Some of Till's features, such as Sticky Sessions, Cache, and Request Log, store data in a directory on your local drive.
The name of the data directory follows the <instance name>.data naming convention. For example, if your instance name is foo, then the data directory's name will be foo.data.
Configuration | Value |
---|---|
datadir | Specify the path to the data directory. Defaults to: ~/.config/datahen/till/default.data. |
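The naming convention can be sketched in a few lines of Python. The helper name `default_datadir` is hypothetical and not part of Till. The sketch also assumes that the data directory of a non-default instance sits in the same `~/.config/datahen/till` directory as the documented default, which is not stated explicitly above.

```python
import os
from pathlib import Path

# Base directory taken from the documented default datadir path.
TILL_CONFIG_DIR = "~/.config/datahen/till"


def default_datadir(instance: str = "default") -> Path:
    """Return the path implied by the `<instance name>.data` convention.

    Hypothetical helper for illustration only; Till resolves this path itself.
    """
    if not instance:
        raise ValueError("instance name must not be empty")
    base = Path(os.path.expanduser(TILL_CONFIG_DIR))
    return base / f"{instance}.data"


if __name__ == "__main__":
    print(default_datadir("foo").name)  # foo.data
```

The resulting path is the kind of value you would pass as datadir if you wanted to point an instance at a specific location.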
When you run till serve, you can configure how Till runs by specifying any of the command-line flags.
The help output below describes the flags that you can specify:
$ till serve --help
Using config file: ~/.config/datahen/till/config.yaml
Starts the DataHen Till server in order to listen to and receive HTTP requests and proxy them.
Usage:
till serve [flags]
Flags:
--apiport string Specify the port to run the API server (default "2980")
--ca-cert string Specify the CA certificate file (default is ~/.config/datahen/till/till-ca-cert.pem)
--ca-key string Specify the CA certificate file (default is ~/.config/datahen/till/till-ca-key.pem)
--datadir string Specify the path to the data directory that this instance uses (default is ~/.config/datahen/till/default.data)
--force-user-agent When set to true, will override any user-agent header with a random value based on ua-type (default true)
-h, --help help for serve
-i, --instance string Specify the name of the Till instance. (default "default")
-p, --port string Specify the port to run (default "2933")
--proxy-file string Specify the path to a txt file that contains a list of proxies
-t, --token string Specify the Till auth token. To get your token, sign up for at https://www.datahen.com/till
--ua-type string Specify what kind of browser user-agent to generate. Values can either be "desktop" or "mobile" (default "desktop")
Global Flags:
--config string config file (default is ~/.config/datahen/till/config.yaml)
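As an illustration of combining these flags, the Python sketch below assembles a `till serve` command line. It uses only flags listed in the help output above. The function name `build_serve_command` and the sample values are hypothetical, and the command is only built and printed, not executed.

```python
import shlex
from typing import List, Optional


def build_serve_command(
    token: str,
    instance: str = "default",
    port: str = "2933",
    ua_type: str = "desktop",
    proxy_file: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `till serve` from flags shown in its help output."""
    if ua_type not in ("desktop", "mobile"):
        raise ValueError('ua_type must be "desktop" or "mobile"')
    cmd = [
        "till", "serve",
        "--token", token,
        "--instance", instance,
        "--port", port,
        "--ua-type", ua_type,
    ]
    # Only add --proxy-file when a proxy list was actually provided.
    if proxy_file is not None:
        cmd += ["--proxy-file", proxy_file]
    return cmd


if __name__ == "__main__":
    # Placeholder token; replace it with your own.
    cmd = build_serve_command("replace-with-your-token", instance="foo")
    print(shlex.join(cmd))
```

Assuming the `till` binary is on your PATH, the returned list could be handed to `subprocess.run(cmd, check=True)` to start the server.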
You can specify a more granular configuration for Till by using the config.yaml file.
By default, if you create a YAML file at the path ~/.config/datahen/till/config.yaml, Till will use this file as its configuration when it runs.
The following is a sample config.yaml file that you can use as a starting point:
Note: This config.yaml file contains all default values for Till. You can change the values as necessary.
# Get your auth token at https://till.datahen.com
token: replace-with-your-token
# The Till instance name.
# This can be changed to match other instance names that you've
# created on https://till.datahen.com/instances
instance: default
# The data directory that is used by various features of Till to store data.
# It usually follows the naming convention of `<instance name>.data`
datadir: ~/.config/datahen/till/default.data
# The proxy port that your scraper code connects to.
port: 2933
# The port to the Till UI.
uiport: 2980
# Certificate Authority (CA) settings that Till will use to act as a Man-In-The-Middle (MITM) proxy.
# The path to the CA certificate file.
ca-cert: ~/.config/datahen/till/till-ca-cert.pem
# The path to the CA key file.
ca-key: ~/.config/datahen/till/till-ca-key.pem
# User agent settings.
# When set to true, it will override every user-agent header with a randomly generated one.
force-user-agent: false
# Specify the type of user agent to generate randomly.
ua-type: desktop
# Proxy IP settings.
# Path to the text file that contains a list of proxy IPs.
# If you don't specify this, Till will use your real local IP address.
proxy-file: ~/.config/datahen/till/proxylist.txt
# Sticky Session settings
sessions:
  # Disable the sticky sessions feature.
  # Defaults to false.
  disabled: false
  # TTL (Time To Live). How long a session record will be allowed to live before it gets deleted.
  # Defaults to "week".
  ttl: "week"
# Cache settings
cache:
  # Disable the cache feature.
  # Defaults to false.
  disabled: false
  # TTL (Time To Live). How long a cache record will be allowed to live before it gets deleted.
  # Defaults to "week".
  ttl: "week"
  # Specifies the default freshness required for a cache hit.
  # Defaults to "any".
  freshness: "any"
  # Specifies if Till should serve cached responses of failed HTTP requests (non-2XX statuses).
  # Defaults to false.
  serve-failures: false
# Logger settings
logger:
  # Disable the logger feature.
  # Defaults to false.
  disabled: false
  # TTL (Time To Live). How long a request log record will be allowed to live before it gets deleted.
  # Defaults to "week".
  ttl: "week"
# Request Interception settings
interceptions:
  # This example intercepts various image URLs and responds with a local image.
  - name: replace_images
    disabled: true
    matches:
      pattern: '.+\.(jpe?g|png|tiff|bmp|gif|webp)'
      method: GET
    responds:
      code: 200
      header:
        "Content-Type": "image/png"
      file: "/path/to/your/image.png"
  # This example intercepts a certain URL and responds with a body.
  - name: replace_body
    disabled: true
    matches:
      pattern: 'fetchtest\.datahen\.com\/echo\/request'
      method: GET,POST
    responds:
      code: 200
      header:
        "Content-Type": "application/json"
      body: '{"Hello":"this has been intercepted"}'
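To get a feel for which requests an interception rule would catch, the patterns from the sample can be tried against a few URLs. The Python sketch below is only an approximation. Till's own matching semantics are not described here, so the sketch assumes that `pattern` is an unanchored regular expression searched within the URL and that `method` is a comma-separated list of HTTP methods. The function `rule_matches` and the sample URLs are hypothetical.

```python
import re

# Rules copied from the `interceptions` section of the sample config.
RULES = [
    {
        "name": "replace_images",
        "pattern": r".+\.(jpe?g|png|tiff|bmp|gif|webp)",
        "method": "GET",
    },
    {
        "name": "replace_body",
        "pattern": r"fetchtest\.datahen\.com\/echo\/request",
        "method": "GET,POST",
    },
]


def rule_matches(rule: dict, method: str, url: str) -> bool:
    """Approximate check (assumed semantics, not Till's actual implementation):
    the method is in the rule's list and the pattern is found in the URL."""
    allowed = {m.strip().upper() for m in rule["method"].split(",")}
    return method.upper() in allowed and re.search(rule["pattern"], url) is not None


if __name__ == "__main__":
    sample_requests = [
        ("GET", "https://example.com/assets/logo.png"),
        ("POST", "https://fetchtest.datahen.com/echo/request"),
        ("GET", "https://example.com/index.html"),
    ]
    for method, url in sample_requests:
        names = [r["name"] for r in RULES if rule_matches(r, method, url)]
        print(method, url, "->", names or "no interception")
```

Both rules in the sample config have disabled set to true, so they would need to be set to false before Till applies them.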