# File resources

[`FileResource`](FileResource) is an abstract class that provides base features for downloading files, but does not define the `_download` method itself. The [`HttpResource`](HttpResource) class, described in the [HTTP resources](http-resources.md) section of the documentation, is a concrete implementation of the [`FileResource`](FileResource) class.

## Resource flow

The `FileResource._start` method first calls the `FileResource._download` method, making one or more attempts depending on the retry strategy. The `FileResource._download` method is responsible for writing the target file.

After the file is downloaded, the `FileResource._start` method calls the `FileResource._post_process_target_file` method, which in turn calls the `FileResource._validate_mimetype` and `FileResource._reformat_file` methods. See the following sections for more information about them.

## MIME type validation

A common pitfall when downloading files is that the server returns something other than the expected content, the best-known example being a "404 not found" web page.

By default, the [`FileResource`](FileResource) class validates, after the file has been downloaded, that its actual MIME type matches the expected one, which is derived from the file name. For example, a file named `catalog.json` is expected to have a MIME type of `application/json`, and a file named `data.csv` is expected to have a MIME type of `text/csv`.

The `FileResource._validate_mimetype` method calls the [`validate_mimetype`](dbnomics_toolbox.fetcher_utils.mimetype_utils.validate_mimetype) function, which makes use of the [`mimetypes.guess_type`](https://docs.python.org/3/library/mimetypes.html#mimetypes.guess_type) function of the Python standard library.

If the MIME type cannot be guessed from the file name, the [`MimeTypeNotGuessed`](MimeTypeNotGuessed) exception is raised.
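
The file-name-based guess can be illustrated with the standard library alone. This is a minimal sketch, not the toolbox implementation: `expected_mimetype` is a hypothetical helper, and `LookupError` is only a stand-in for the `MimeTypeNotGuessed` exception.

```python
import mimetypes


def expected_mimetype(file_name: str) -> str:
    """Return the MIME type suggested by the file name's extension.

    Hypothetical helper for illustration; it is not part of the toolbox.
    """
    mimetype, _encoding = mimetypes.guess_type(file_name)
    if mimetype is None:
        # Stand-in for the MimeTypeNotGuessed exception described above.
        raise LookupError(f"Could not guess MIME type from file name {file_name!r}")
    return mimetype


print(expected_mimetype("catalog.json"))
```

The result of `mimetypes.guess_type` can depend on the MIME type mappings registered on the system, which is one reason an explicit expected MIME type can be useful.
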
When the MIME type cannot be guessed from the file name, it is still possible to pass the `accept_mimetype` kwarg to the constructor of [`FileResource`](FileResource), which skips guessing the MIME type from the file name.

The actual MIME type of the file is then detected from the file contents by using the [`python-magic`](https://pypi.org/project/python-magic) package.

If the detected MIME type does not match the expected one, the [`InvalidMimeType`](InvalidMimeType) exception is raised. The [`BaseDownloader`](BaseDownloader) then considers the resource as failed and logs the error.

MIME type validation can be disabled by passing the `validate_mimetype=False` kwarg to the constructor of [`FileResource`](FileResource).

## Reformat files

When downloading a text-based file like JSON or XML, the server can send its contents formatted in different ways. For example, a JSON file can be served completely unindented (as a single line), indented with 2 or 4 spaces, etc. The same goes for XML files.

To minimize variations between different versions of the same file, especially when using a version control system like [Git](https://git-scm.com/), it is advised to reformat the file using settings that do not vary over time.

By default, the [`FileResource`](FileResource) class reformats the file after it has been downloaded, using a different method based on the file extension. As of now, the JSON (`.json`) and XML (`.xml`) formats are supported.

The `FileResource._reformat_file` method calls the [`reformat_file`](dbnomics_toolbox.fetcher_utils.file_utils.common.reformat_file) function, which can rely on external tools to actually reformat the files. If a tool is missing, or the reformatting fails, an exception is raised and the resource is considered as failed by the [`BaseDownloader`](BaseDownloader).

Reformatting can be disabled by passing the `reformat_file=False` kwarg to the constructor of [`FileResource`](FileResource).
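
To show why fixed formatting settings reduce noise in version control, here is a minimal sketch that normalizes JSON text with the Python standard library. It does not reproduce how `reformat_file` works (which may rely on external tools); `reformat_json_text` is a hypothetical helper, and the 2-space indentation is an assumption chosen for the example.

```python
import json


def reformat_json_text(text: str) -> str:
    """Re-serialize JSON text with fixed settings (hypothetical helper).

    Key order is preserved; only whitespace and indentation are normalized.
    """
    data = json.loads(text)
    # Fixed settings: 2-space indent, non-ASCII characters kept as-is,
    # and a single trailing newline.
    return json.dumps(data, indent=2, ensure_ascii=False) + "\n"


# Two differently formatted versions of the same document.
single_line = '{"a":1,"b":[1,2]}'
four_spaces = '{\n    "a": 1,\n    "b": [1, 2]\n}'

# After normalization, both produce identical text, so a diff shows no change.
print(reformat_json_text(single_line) == reformat_json_text(four_spaces))
```
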