4.7. Storage

4.7.1. Description

The Storage is an object responsible for storing the features extracted by the Marker step of the pipeline. If the pipeline is provided with a storage-like object, the extracted features are stored via that object; otherwise, they are kept in memory.

Storage objects are meant to be used inside the DataGrabber context, but you can operate on them outside the context as long as the processed data is still in memory and the Python runtime has not garbage-collected it.

Each Marker defines which storage kind (matrix, vector, timeseries, scalar_table) it supports for each data type by overriding its get_output_type method. The storage object, in turn, declares which storage kinds it supports and provides an implementation for each. For example, SQLiteFeatureStorage supports saving matrix, vector and timeseries via its store_matrix, store_vector and store_timeseries methods, respectively.
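
The contract between a Marker and a storage object can be illustrated with a small standalone sketch. The classes `ToyMarker` and `ToyStorage`, the `supports` and `store` helpers, and the data-type-to-kind mapping are hypothetical stand-ins written for this example; they are not junifer classes and do not reproduce junifer's actual signatures. Only the idea is taken from the text: the marker reports a storage kind through `get_output_type`, and the storage implements a `store_<kind>` method for each kind it supports.

```python
# Illustrative stand-ins only; these are NOT junifer classes.


class ToyMarker:
    """A marker that declares the storage kind of its output per data type."""

    def get_output_type(self, input_type: str) -> str:
        # Hypothetical mapping from data type to storage kind.
        outputs = {"BOLD": "timeseries", "T1w": "vector"}
        return outputs[input_type]


class ToyStorage:
    """A storage that implements only the vector and timeseries kinds."""

    def __init__(self) -> None:
        self.saved = []

    def store_vector(self, data, col_names):
        self.saved.append(("vector", data, col_names))

    def store_timeseries(self, data, col_names):
        self.saved.append(("timeseries", data, col_names))

    def supports(self, kind: str) -> bool:
        # A kind is supported if a matching store_<kind> method exists.
        return callable(getattr(self, f"store_{kind}", None))

    def store(self, kind: str, **kwargs) -> None:
        if not self.supports(kind):
            raise ValueError(f"Storage kind {kind!r} is not supported")
        getattr(self, f"store_{kind}")(**kwargs)


marker = ToyMarker()
storage = ToyStorage()

# The marker decides the kind; the storage dispatches to the matching method.
kind = marker.get_output_type("BOLD")
storage.store(kind, data=[[0.1, 0.2], [0.3, 0.4]], col_names=["roi1", "roi2"])
```

Asking this toy storage to store a `matrix` raises a `ValueError`, mirroring the case where a Marker produces a storage kind that the chosen storage does not implement.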

For storage interfaces not yet supported by junifer, you can either write your own Storage by providing a concrete implementation of BaseFeatureStorage, or open an issue on the junifer GitHub repository and we can help you out.
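
As a rough sketch of what "providing a concrete implementation" means, the example below subclasses an abstract base and fills in one storage method. `SketchBaseStorage` and `DictStorage` are hypothetical names invented for this illustration, and the method signature is an assumption; the real `BaseFeatureStorage` defines its own set of abstract methods and arguments, so check its API reference before writing an actual subclass.

```python
from abc import ABC, abstractmethod


class SketchBaseStorage(ABC):
    """Stand-in for an abstract storage base; NOT junifer's BaseFeatureStorage."""

    @abstractmethod
    def store_vector(self, element: dict, data: list, col_names: list) -> None:
        """Store a 1D row vector of values with column names."""


class DictStorage(SketchBaseStorage):
    """Concrete storage that keeps vectors in a dictionary."""

    def __init__(self) -> None:
        self.vectors = {}

    def store_vector(self, element, data, col_names):
        if len(data) != len(col_names):
            raise ValueError("Need exactly one column name per value")
        # Use the sorted element items as a hashable key.
        key = tuple(sorted(element.items()))
        self.vectors[key] = dict(zip(col_names, data))


storage = DictStorage()
storage.store_vector({"subject": "sub-01"}, [1.5, 2.5], ["roi1", "roi2"])
```

Because `store_vector` is abstract on the base, Python refuses to instantiate `SketchBaseStorage` directly; a subclass becomes usable only once every abstract method has an implementation.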

4.7.2. Storage Types

Storage Type	Description	Options	Reference
`matrix`	A 2D square matrix with row and column names	`col_names`, `row_names`, `matrix_kind`, `diagonal`, `row_header_col_name` (only for `HDF5FeatureStorage.store_matrix()`)	`BaseFeatureStorage.store_matrix()`
`vector`	A 1D row vector of values with column names	`col_names`	`BaseFeatureStorage.store_vector()`
`timeseries`	A 2D square or non-square matrix of scalar values with column names	`col_names`	`BaseFeatureStorage.store_timeseries()`
`scalar_table`	A 2D square or non-square matrix of scalar values with row names, column names and a row header column name	`col_names`, `row_names`, `row_header_col_name`	`BaseFeatureStorage.store_scalar_table()`

4.7.3. Storage Interfaces

Storage class	File extension	File type	Storage kinds
`SQLiteFeatureStorage`	`.sqlite`	SQLite	`matrix`, `vector`, `timeseries`
`HDF5FeatureStorage`	`.hdf5`	HDF5	`matrix`, `vector`, `timeseries`, `scalar_table`
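
The table above can also be read as a lookup from file extension to storage class and supported kinds. The helper below is a hypothetical convenience written for this page, not part of junifer; it only encodes the rows of the table and returns the class name as a string rather than importing anything.

```python
from pathlib import Path

# Rows copied from the storage interfaces table; the helper is hypothetical.
STORAGE_INTERFACES = {
    ".sqlite": ("SQLiteFeatureStorage", {"matrix", "vector", "timeseries"}),
    ".hdf5": (
        "HDF5FeatureStorage",
        {"matrix", "vector", "timeseries", "scalar_table"},
    ),
}


def pick_storage(uri: str, kind: str) -> str:
    """Return the storage class name for a file, checking the storage kind."""
    suffix = Path(uri).suffix.lower()
    if suffix not in STORAGE_INTERFACES:
        raise ValueError(f"No storage interface for extension {suffix!r}")
    class_name, kinds = STORAGE_INTERFACES[suffix]
    if kind not in kinds:
        raise ValueError(f"{class_name} does not support {kind!r}")
    return class_name


chosen = pick_storage("features.hdf5", "scalar_table")
```

With the rows as listed, a `scalar_table` feature can only go to an `.hdf5` file, so `pick_storage("features.sqlite", "scalar_table")` raises a `ValueError`.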