Data storage¶
This section explains how instances of data model classes, which live in Python memory, can be persisted to a storage system.
The dbnomics_toolbox.storage module defines a BaseStorage class whose abstract methods load and save instances of data model classes.
It also defines a FileSystemStorage class, a concrete implementation (or adapter) of BaseStorage that reads and writes instances of data model classes from and to the file system.
This design leaves room for other adapters that would store converted data in other storage systems (e.g. in-memory only, SQLite, DuckDB, PostgreSQL, etc.).
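To make the idea of an alternative adapter concrete, here is a minimal sketch of what a SQLite-backed adapter could look like. It is not part of dbnomics_toolbox: the class name ToySQLiteStorage, the table layout, and the use of plain dicts in place of data model instances are all assumptions made for illustration. Only the method names load_provider_metadata and save_provider_metadata come from this page.

```python
import json
import sqlite3


class ToySQLiteStorage:
    """Hypothetical adapter, for illustration only (not in dbnomics_toolbox)."""

    def __init__(self, database: str = ":memory:") -> None:
        # ":memory:" keeps the database in RAM; pass a file path to persist it.
        self._connection = sqlite3.connect(database)
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS provider_metadata "
            "(code TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )

    def save_provider_metadata(self, provider_metadata: dict) -> None:
        # Assumption: the metadata is a plain dict with a "code" key.
        with self._connection:  # commits on success, rolls back on error
            self._connection.execute(
                "INSERT OR REPLACE INTO provider_metadata (code, payload) "
                "VALUES (?, ?)",
                (provider_metadata["code"], json.dumps(provider_metadata)),
            )

    def load_provider_metadata(self, provider_code: str) -> dict:
        row = self._connection.execute(
            "SELECT payload FROM provider_metadata WHERE code = ?",
            (provider_code,),
        ).fetchone()
        if row is None:
            raise KeyError(provider_code)
        return json.loads(row[0])
```

A real adapter would subclass BaseStorage and handle actual data model classes rather than dicts; the sketch only shows that the same load and save operations can be backed by a different storage system.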
BaseStorage interface¶
The BaseStorage class defines abstract methods to load and save instances of data model classes, such as load_provider_metadata and save_provider_metadata.
The convert part of fetchers is designed to use BaseStorage as an interface without naming a specific adapter, so that the fetcher is not coupled to any one adapter.
To achieve this, instead of instantiating a concrete adapter directly, the converter process reads the storage URI from the STORAGE_URI environment variable at runtime, then instantiates and uses the corresponding adapter.
Since every adapter implements the BaseStorage interface, the source code of the fetcher remains valid whichever adapter is selected.
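The decoupling described above can be sketched with a toy model. Everything here is hypothetical and is not the dbnomics_toolbox implementation: ToyStorage stands in for BaseStorage, ToyInMemoryStorage is an invented adapter, the "memory" URI scheme and the ADAPTERS registry are assumptions, and data model instances are replaced by plain dicts. Only the STORAGE_URI variable name and the two method names come from this page.

```python
import os
from abc import ABC, abstractmethod


class ToyStorage(ABC):
    """Hypothetical stand-in for BaseStorage (not the real class)."""

    @abstractmethod
    def load_provider_metadata(self, provider_code: str) -> dict: ...

    @abstractmethod
    def save_provider_metadata(self, provider_metadata: dict) -> None: ...


class ToyInMemoryStorage(ToyStorage):
    """Hypothetical adapter keeping everything in a dict."""

    def __init__(self) -> None:
        self._providers = {}

    def load_provider_metadata(self, provider_code: str) -> dict:
        return self._providers[provider_code]

    def save_provider_metadata(self, provider_metadata: dict) -> None:
        self._providers[provider_metadata["code"]] = provider_metadata


# Hypothetical registry mapping URI schemes to adapter classes.
ADAPTERS = {"memory": ToyInMemoryStorage}


def storage_from_env() -> ToyStorage:
    """Pick the adapter from the STORAGE_URI environment variable."""
    uri = os.environ["STORAGE_URI"]
    scheme, _, _ = uri.partition(":")
    if scheme not in ADAPTERS:
        raise ValueError(f"No adapter registered for scheme {scheme!r}")
    return ADAPTERS[scheme]()


def convert(storage: ToyStorage) -> None:
    """Converter code depends only on the interface, never on an adapter."""
    storage.save_provider_metadata({"code": "INSEE", "name": "Example provider"})
```

Because convert only receives a ToyStorage, registering another adapter under a new scheme and changing STORAGE_URI switches the backend without touching the converter code.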
Example¶
# To instantiate a storage adapter from a storage URI:
storage_uri = StorageUri.parse("filesystem:converted-data")
storage = BaseStorage.from_uri(storage_uri)
assert isinstance(storage, FileSystemStorage)
# To load data:
provider_metadata = storage.load_provider_metadata("INSEE")
# To save data:
storage.save_provider_metadata(provider_metadata)
Note: here the storage URI is parsed manually for demonstration purposes; in the converter part of the fetcher, the storage is instantiated automatically.
Storage adapters¶
The adapters are concrete implementations of the BaseStorage abstract class.
File-system adapter¶
As of now, only the FileSystemStorage adapter is available.
It is documented in the File-system storage adapter page.
Revisions¶
The load_* methods of BaseStorage can load entities of the data model at a specific revision.
By default the latest revision is loaded.
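The "latest by default" behaviour can be illustrated with a small toy store. This is only a sketch under stated assumptions: the ToyRevisionedStore class, the revision_id parameter name, and the use of integer revision identifiers are all hypothetical and do not describe the actual BaseStorage signatures.

```python
from typing import Optional


class ToyRevisionedStore:
    """Hypothetical store that keeps every saved version of an entity."""

    def __init__(self) -> None:
        # Maps a provider code to the list of its successive versions.
        self._history = {}

    def save_provider_metadata(self, provider_metadata: dict) -> int:
        versions = self._history.setdefault(provider_metadata["code"], [])
        versions.append(dict(provider_metadata))  # store a copy
        return len(versions) - 1  # hypothetical revision identifier

    def load_provider_metadata(
        self, provider_code: str, revision_id: Optional[int] = None
    ) -> dict:
        versions = self._history[provider_code]
        if revision_id is None:
            # No revision requested: return the latest one.
            return versions[-1]
        return versions[revision_id]


store = ToyRevisionedStore()
first = store.save_provider_metadata({"code": "INSEE", "name": "old name"})
store.save_provider_metadata({"code": "INSEE", "name": "new name"})

latest = store.load_provider_metadata("INSEE")
earlier = store.load_provider_metadata("INSEE", revision_id=first)
```

In this sketch, latest holds the second version and earlier holds the first one.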