surveydata.storagesystem module
Core interface (informal) for survey data storage systems.
- class surveydata.storagesystem.StorageSystem
Bases: object
Largely-abstract base class for survey data storage systems.
- __init__()
Initialize storage system.
- attachments_supported() → bool
Query whether storage system supports attachments.
- Returns
True if attachments supported, otherwise False
- Return type
bool
- get_attachment(attachment_location: str = '', submission_id: str = '', attachment_name: str = '') → BinaryIO
Get submission attachment from storage.
- Parameters
attachment_location (str) – Attachment location string (as returned when attachment stored)
submission_id (str) – Unique submission ID (in lieu of attachment_location)
attachment_name (str) – Attachment filename (in lieu of attachment_location)
- Returns
Attachment as a file-like object (note: the object does not support seeking)
- Return type
BinaryIO
Must pass either attachment_location or both submission_id and attachment_name.
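The either/or calling convention above can be sketched as a small argument check. Both helpers below are hypothetical and not part of surveydata. How the library behaves when both forms are supplied is not stated here, so preferring attachment_location is an assumption. The second helper shows one way to work around the non-seekable return value by copying it into memory.

```python
import io
from typing import BinaryIO


def check_attachment_args(attachment_location: str = "",
                          submission_id: str = "",
                          attachment_name: str = "") -> str:
    """Return which lookup mode the arguments select, or raise ValueError.

    Hypothetical helper: it only illustrates the documented rule that a
    caller passes attachment_location OR both submission_id and
    attachment_name.
    """
    if attachment_location:
        # Assumption: the location string wins if everything is supplied.
        return "location"
    if submission_id and attachment_name:
        return "id_and_name"
    raise ValueError(
        "pass attachment_location, or both submission_id and attachment_name"
    )


def to_seekable(stream: BinaryIO) -> io.BytesIO:
    """Copy a (possibly non-seekable) binary stream into an in-memory buffer."""
    return io.BytesIO(stream.read())


# Stand-in for an object returned by get_attachment().
buffered = to_seekable(io.BytesIO(b"attachment bytes"))
first_read = buffered.read()
buffered.seek(0)  # rewinding works on the in-memory copy
second_read = buffered.read()
```

Copying into memory is only reasonable for attachments small enough to fit in RAM; larger files are better streamed once.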
- get_data_timezone() → timezone
Get the timezone for timestamps in the data.
- Returns
Timezone for timestamps in the data (defaults to datetime.timezone.utc if unknown)
- Return type
datetime.timezone
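As a rough illustration of how the returned timezone might be used, the sketch below attaches a datetime.timezone to a naive timestamp and converts it to UTC. The localize helper and the sample values are hypothetical; only the default of datetime.timezone.utc comes from the description above.

```python
from datetime import datetime, timedelta, timezone


def localize(naive: datetime, tz: timezone = timezone.utc) -> datetime:
    """Attach the data timezone to a naive timestamp (hypothetical helper)."""
    return naive.replace(tzinfo=tz)


# Stand-in for the value returned by get_data_timezone(): UTC-05:00.
data_tz = timezone(timedelta(hours=-5))

stamp = localize(datetime(2023, 1, 15, 12, 0), data_tz)
as_utc = stamp.astimezone(timezone.utc)

# With no timezone given, the helper falls back to UTC, matching the
# documented default.
default_stamp = localize(datetime(2023, 1, 15, 12, 0))
```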
- get_dataframe(metadata_id: str) → DataFrame
Get Pandas DataFrame from a binary file in storage.
- Parameters
metadata_id (str) – Unique metadata ID (should begin and end with __ and not conflict with any submission ID)
- Returns
Pandas DataFrame loaded from the binary file in storage
- Return type
pd.DataFrame
- get_dataframe_csv(metadata_id: str) → DataFrame
Get Pandas DataFrame from a .csv file in storage.
- Parameters
metadata_id (str) – Unique metadata ID (should begin and end with __ and not conflict with any submission ID)
- Returns
Pandas DataFrame loaded from the .csv file in storage
- Return type
pd.DataFrame
- get_metadata(metadata_id: str) → str
Get metadata string from storage.
- Parameters
metadata_id (str) – Unique metadata ID (should not conflict with any submission ID)
- Returns
Metadata string from storage, or empty string if no such metadata exists
- Return type
str
- get_metadata_binary(metadata_id: str) → bytes
Get metadata bytes from storage.
- Parameters
metadata_id (str) – Unique metadata ID (should not conflict with any submission ID)
- Returns
Metadata bytes from storage, or an empty bytes object if no such metadata exists
- Return type
bytes
- get_submission(submission_id: str) → dict
Get submission data from storage.
- Parameters
submission_id (str) – Unique submission ID
- Returns
Submission data (or empty dictionary if submission not found)
- Return type
dict
- get_submissions() → list
Get all submission data from storage.
- Returns
List of dictionaries, one for each submission
- Return type
list
- get_submissions_df() → DataFrame
Get all submission data from storage, organized into a Pandas DataFrame.
- Returns
Pandas DataFrame containing all submissions currently in storage
- Return type
pandas.DataFrame
- list_attachments(submission_id: str = '') → list
List all attachments currently in storage.
- Parameters
submission_id (str) – Optional submission ID, to list only attachments for specific submission
- Returns
List of attachments, each as dict with name, submission_id, and location_string
- Return type
list
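A short sketch of consuming the returned list, assuming each entry is a dict with the name, submission_id, and location_string keys described above. The sample entries and the group_by_submission helper are hypothetical, and the location strings are made up; real location strings depend on the storage system.

```python
from collections import defaultdict


def group_by_submission(attachments: list) -> dict:
    """Map each submission ID to the names of its attachments."""
    grouped = defaultdict(list)
    for attachment in attachments:
        grouped[attachment["submission_id"]].append(attachment["name"])
    return dict(grouped)


# Stand-in for the result of list_attachments().
sample = [
    {"name": "photo.jpg", "submission_id": "uuid:0001", "location_string": "loc-a"},
    {"name": "audio.m4a", "submission_id": "uuid:0001", "location_string": "loc-b"},
    {"name": "photo.jpg", "submission_id": "uuid:0002", "location_string": "loc-c"},
]

names_by_submission = group_by_submission(sample)
```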
- list_submissions() → list
List all submissions currently in storage.
- Returns
List of submission IDs
- Return type
list
- query_attachment(attachment_location: str = '', submission_id: str = '', attachment_name: str = '') → bool
Query whether specific submission attachment exists in storage.
- Parameters
attachment_location (str) – Attachment location string (as returned when attachment stored)
submission_id (str) – Unique submission ID (in lieu of attachment_location)
attachment_name (str) – Attachment filename (in lieu of attachment_location)
- Returns
True if attachment exists in storage; otherwise False
- Return type
bool
- query_submission(submission_id: str) → bool
Query whether specific submission exists in storage.
- Parameters
submission_id (str) – Unique submission ID
- Returns
True if submission exists in storage; otherwise False
- Return type
bool
- set_data_timezone(tz: timezone)
Set the timezone for timestamps in the data.
- Parameters
tz (datetime.timezone) – Timezone for timestamps in the data
- store_attachment(submission_id: str, attachment_name: str, attachment_data: BinaryIO) → str
Store submission attachment in storage.
- Parameters
submission_id (str) – Unique submission ID
attachment_name (str) – Attachment filename
attachment_data (BinaryIO) – File-like object containing the attachment data
- Returns
Location string for stored attachment
- Return type
str
- store_dataframe(metadata_id: str, df: DataFrame)
Store Pandas DataFrame as binary file in storage.
- Parameters
metadata_id (str) – Unique metadata ID to save as (should begin and end with __ and not conflict with any submission ID)
df (pd.DataFrame) – Pandas DataFrame to store as binary file
- store_dataframe_csv(metadata_id: str, df: DataFrame)
Store Pandas DataFrame as .csv file in storage.
- Parameters
metadata_id (str) – Unique metadata ID to save as (should begin and end with __ and not conflict with any submission ID)
df (pd.DataFrame) – Pandas DataFrame to store as .csv file
- store_metadata(metadata_id: str, metadata: str)
Store metadata string in storage.
- Parameters
metadata_id (str) – Unique metadata ID (should begin and end with __ and not conflict with any submission ID)
metadata (str) – Metadata string to store
- store_metadata_binary(metadata_id: str, metadata: bytes)
Store metadata bytes in storage.
- Parameters
metadata_id (str) – Unique metadata ID (should begin and end with __ and not conflict with any submission ID)
metadata (bytes) – Metadata bytes to store
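The metadata ID convention used by the store methods (begin and end with __, and do not collide with a submission ID) can be checked before storing. The helper below is a hypothetical sketch, not something the library is known to provide or enforce. Requiring at least one character between the underscores is an added assumption.

```python
def is_valid_metadata_id(metadata_id: str, submission_ids=()) -> bool:
    """Check the documented naming convention for metadata IDs."""
    return (
        len(metadata_id) > 4  # assumption: something must sit between the underscores
        and metadata_id.startswith("__")
        and metadata_id.endswith("__")
        and metadata_id not in submission_ids
    )


known_submissions = {"uuid:0001", "__odd_submission__"}

ok = is_valid_metadata_id("__codebook__", known_submissions)
missing_underscores = is_valid_metadata_id("codebook", known_submissions)
collides = is_valid_metadata_id("__odd_submission__", known_submissions)
```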
- store_submission(submission_id: str, submission_data: dict)
Store submission data in storage.
- Parameters
submission_id (str) – Unique submission ID
submission_data (dict) – Submission data to store
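To make the submission and metadata contract concrete, here is a minimal in-memory sketch that mirrors the documented method names and return conventions. InMemoryStorage is a hypothetical stand-in: it does not subclass the real StorageSystem, so the example runs without surveydata installed. A real implementation would presumably derive from the base class and cover attachments and DataFrames as well.

```python
class InMemoryStorage:
    """Hypothetical stand-in mirroring the documented submission/metadata methods."""

    def __init__(self):
        self._submissions = {}
        self._metadata = {}

    def store_submission(self, submission_id: str, submission_data: dict):
        # Store a copy so later changes by the caller do not leak into storage.
        self._submissions[submission_id] = dict(submission_data)

    def get_submission(self, submission_id: str) -> dict:
        # Documented behavior: empty dictionary if the submission is not found.
        return dict(self._submissions.get(submission_id, {}))

    def query_submission(self, submission_id: str) -> bool:
        return submission_id in self._submissions

    def list_submissions(self) -> list:
        return list(self._submissions)

    def get_submissions(self) -> list:
        return [dict(data) for data in self._submissions.values()]

    def store_metadata(self, metadata_id: str, metadata: str):
        self._metadata[metadata_id] = metadata

    def get_metadata(self, metadata_id: str) -> str:
        # Documented behavior: empty string if no such metadata exists.
        return self._metadata.get(metadata_id, "")


storage = InMemoryStorage()
storage.store_submission("uuid:0001", {"age": "34", "consent": "yes"})
storage.store_metadata("__last_sync__", "2023-01-15T12:00:00")
```

Because missing submissions and missing metadata come back as empty values instead of raising, callers that need to tell "absent" from "stored but empty" should check query_submission first.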