# HTTP resources

The [`HttpResource`](HttpResource) class is a concrete implementation of the [`FileResource`](FileResource) class described in the [file resources](file-resources.md) section of the documentation.

This class uses [requests](https://requests.readthedocs.io/en/latest/) to fetch URLs and [tenacity](https://tenacity.readthedocs.io/en/latest/) to implement retrying.

## Retry strategy

By default, [`HttpResource`](HttpResource) is configured with sensible defaults for HTTP retries. For example, an HTTP request whose response has a status code of 500 (Internal Server Error) is retried automatically, but one whose response is 404 (Not Found) is not.

The `HttpResource._default_retrying` property calls the [`create_requests_retrying`](dbnomics_toolbox.retry_utils.requests.retrying.create_requests_retrying) function. This strategy retries the HTTP request when the response status code is one of:

```python
{
    408,  # Request Timeout
    425,  # Too Early
    429,  # Too Many Requests
    500,  # Internal Server Error
    502,  # Bad Gateway
    503,  # Service Unavailable
    504,  # Gateway Timeout
    524,  # A Timeout Occurred (Cloudflare-specific)
    522,  # Connection Timed Out (Cloudflare-specific)
}
```

This strategy waits between retry attempts. The delay is taken from the `Retry-After` HTTP header of the response when it is present (cf. [RFC 6585](https://datatracker.ietf.org/doc/html/rfc6585#section-4)). Otherwise it falls back to an exponential delay following the formula `(2 ** (attempt_index - 1)) * 1.5`: 1.5 seconds before the second attempt (the first attempt has no reason to wait), then 3, 6, etc., limited to a maximum of 5 minutes.

The retry strategy can be customized by passing the `retrying` kwarg to the constructor of the [`HttpResource`](HttpResource) class, or by creating a child class that overrides the `HttpResource._default_retrying` property.
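
To make the delay rule concrete, here is a minimal standalone sketch of it. It is not the code of `create_requests_retrying`: the names `should_retry`, `fallback_delay` and `retry_delay` are hypothetical, and it assumes that `attempt_index` is the number of attempts already made (so `1` gives the wait before the second attempt), that only the integer-seconds form of `Retry-After` is honored, and that the 5-minute cap applies to the exponential fallback only.

```python
from __future__ import annotations

# Status codes listed in this section as retryable.
RETRY_STATUS_CODES = frozenset({408, 425, 429, 500, 502, 503, 504, 522, 524})

BASE_DELAY_SECONDS = 1.5
MAX_DELAY_SECONDS = 5 * 60  # 5 minutes


def should_retry(status_code: int) -> bool:
    """Return True if a response with this status code would be retried."""
    return status_code in RETRY_STATUS_CODES


def fallback_delay(attempt_index: int) -> float:
    """Exponential delay (2 ** (attempt_index - 1)) * 1.5, capped at 5 minutes.

    Assumption: attempt_index is the number of attempts already made.
    """
    return min((2 ** (attempt_index - 1)) * BASE_DELAY_SECONDS, MAX_DELAY_SECONDS)


def retry_delay(attempt_index: int, retry_after: str | None = None) -> float:
    """Prefer the Retry-After header, else use the exponential fallback.

    Assumption: only the integer-seconds form of Retry-After is handled here;
    any other value (for example an HTTP date) triggers the fallback.
    """
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    return fallback_delay(attempt_index)


if __name__ == "__main__":
    # Waits before attempts 2, 3 and 4 when no Retry-After header is sent.
    print([fallback_delay(i) for i in (1, 2, 3)])
    # A server-provided Retry-After value takes precedence.
    print(retry_delay(1, retry_after="10"))
```

Under these assumptions the fallback reaches the 5-minute cap from the ninth failed attempt onward, since `2 ** 8 * 1.5` is 384 seconds.
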
## Validate the response

By default, the `HttpResource._validate_response` method calls the [`Response.raise_for_status`](https://requests.readthedocs.io/en/latest/api/#requests.Response.raise_for_status) method, which raises an exception if the response status code indicates an error.

The response validation can be customized by passing the `validate_response` kwarg to the constructor of the [`HttpResource`](HttpResource) class, or by creating a child class that overrides the `HttpResource._validate_response` method.

## Proxies

If you need to use a proxy, pass the `proxies` argument to the [`HttpResource`](HttpResource) class:

```python
from dbnomics_toolbox.fetcher_utils.resources.http_resource import HttpResource

# Map each URL scheme to the proxy that handles it.
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}

HttpResource(
    proxies=proxies,
    request="http://example.org",
    target_file="test.txt",
)
```

Alternatively, you can configure the proxies once for an entire `Session`:

```python
from dbnomics_toolbox.fetcher_utils.resources.http_resource import HttpResource
from requests import Session

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}

with Session() as session:
    # Every request sent through this session uses these proxies.
    session.proxies.update(proxies)
    resource = HttpResource(
        request="http://example.org",
        session=session,
        target_file="test.txt",
    )
```

Proxies can also be configured with the standard environment variables `http_proxy`, `https_proxy`, `no_proxy`, and `all_proxy`, as documented by the Requests library.

See also: