pyvrs.reader module

class pyvrs.reader.AsyncVRSReader(path: Path | str | List[Path] | List[str] | Dict[str, int | str | List[str]] | FileSpec | List[FileSpec], auto_read_configuration_records: bool = True, encoding: str = 'utf-8', multi_path: bool = False)[source]

Bases: VRSReader, AsyncIterable[VRSRecord]

filtered_by_fields(stream_ids: Set[str] | str | None = None, record_types: Set[str] | str | None = None, min_timestamp: float | None = None, max_timestamp: float | None = None) AsyncFilteredVRSReader[source]

Filter this reader so that it only reads records matching the given conditions.

Parameters:
  • stream_ids

    Stream IDs that you are interested in.

    Note: filtering can be performed by glob-like wildcard matching.
    For example, if the available stream IDs are {'1010-1', '1011-1', '1011-2', '1011-10'}:
    >>> reader.filtered_by_fields(stream_ids='1011*')
    would keep '1011-1', '1011-2', '1011-10', but would discard '1010-1'.
    >>> reader.filtered_by_fields(stream_ids='1011-?')
    would only keep '1011-1', '1011-2'.
    >>> reader.filtered_by_fields(stream_ids='1011-1')
    would only keep '1011-1'.
    

  • record_types – Record types that you are interested in. The options are {“data”, “configuration”, “state”}.

  • min_timestamp – Minimum timestamp you want to read from.

  • max_timestamp – Maximum timestamp you want to read to.

Returns:

An AsyncFilteredVRSReader that represents the filtered VRS file.
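The wildcard behavior described for stream_ids can be approximated with Python's standard fnmatch module. This is only a sketch of the matching semantics, not pyvrs's actual implementation: the helper name match_stream_ids is hypothetical, and fnmatch also accepts [seq] patterns that pyvrs may not support.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Set, Union


def match_stream_ids(available: Iterable[str], patterns: Union[str, Set[str]]) -> Set[str]:
    """Return the stream IDs that match at least one glob-like pattern.

    Hypothetical helper for illustration; it is not part of pyvrs.
    """
    if isinstance(patterns, str):
        patterns = {patterns}  # a single pattern is treated as a one-element set
    return {
        stream_id
        for stream_id in available
        if any(fnmatchcase(stream_id, pattern) for pattern in patterns)
    }


available = {"1010-1", "1011-1", "1011-2", "1011-10"}

# '*' matches any run of characters, '?' matches exactly one character.
print(sorted(match_stream_ids(available, "1011*")))
print(sorted(match_stream_ids(available, "1011-?")))
print(sorted(match_stream_ids(available, "1011-1")))
```

Note that '1011-?' excludes '1011-10' because '?' stands for a single character, while '1011*' keeps it.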

class pyvrs.reader.SyncVRSReader(path: Path | str | List[Path] | List[str] | Dict[str, int | str | List[str]] | FileSpec | List[FileSpec], auto_read_configuration_records: bool = True, encoding: str = 'utf-8', multi_path: bool = False)[source]

Bases: VRSReader

filtered_by_fields(*, stream_ids: Set[str] | str | None = None, record_types: Set[str] | str | None = None, min_timestamp: float | None = None, max_timestamp: float | None = None) SyncFilteredVRSReader[source]

Filter this reader so that it only reads records matching the given conditions.

Parameters:
  • stream_ids

    Stream IDs that you are interested in.

    Note: filtering can be performed by glob-like wildcard matching.
    For example, if the available stream IDs are {'1010-1', '1011-1', '1011-2', '1011-10'}:
    >>> reader.filtered_by_fields(stream_ids='1011*')
    would keep '1011-1', '1011-2', '1011-10', but would discard '1010-1'.
    >>> reader.filtered_by_fields(stream_ids='1011-?')
    would only keep '1011-1', '1011-2'.
    >>> reader.filtered_by_fields(stream_ids='1011-1')
    would only keep '1011-1'.
    

  • record_types – Record types that you are interested in. The options are {“data”, “configuration”, “state”}.

  • min_timestamp – Minimum timestamp you want to read from.

  • max_timestamp – Maximum timestamp you want to read to.

Returns:

A SyncFilteredVRSReader that represents the filtered VRS file.

class pyvrs.reader.VRSReader(path: Path | str | List[Path] | List[str] | Dict[str, int | str | List[str]] | FileSpec | List[FileSpec], auto_read_configuration_records: bool = True, encoding: str = 'utf-8', multi_path: bool = False)[source]

Bases: BaseVRSReader, ABC

A Pythonic reader for VRS files. Behaves as a filterable list - has a length (number of records), can be indexed to retrieve VRSRecords, and can be iterated over and sliced just like a regular Python list. Significant file reads are only done when record state is queried, so it remains performant even for larger VRS files.

The methods in child classes of VRSReader operate on all records. Calling the filtered_by_fields method creates a FilteredVRSReader that represents a slice of the file.

Example

Basic usage:

>>> from pyvrs.reader import SyncVRSReader
>>> reader = SyncVRSReader("path/to/a.vrs")
>>> print(reader)  # print to get an overview of what's in the file
>>> for record in reader:  # loop over all records
...     print(record)  # print all the state of the record

Filters can be added to only consider certain types of records from certain streams:

>>> filtered_reader = reader.filtered_by_fields(
...     stream_ids={'1010-*', '1001-*'},  # limit to some streams
...     record_types={'configuration'},  # only read config records
... )
>>> for filtered_record in filtered_reader:
...     print(filtered_record)  # only configs from streams matching 1010-*/1001-*

List slicing and indexing is supported:

>>> one_interesting_record = reader[42]  # indexing retrieves a single record
>>> print(one_interesting_record)
>>> for every_100th_record in reader[::100]:
...     print(every_100th_record)
>>> for a_last_five_record in reader[-5:]:
...     print(a_last_five_record)

IMPORTANT NOTE ON CONFIGURATION RECORDS

VRS is a streaming file format, which means the order of record reads matters in one important case: reading configuration records, as they instruct VRS how to parse future data records. It’s possible that a single stream (i.e. the list of records associated with a single Recordable ID) contains multiple records of type ‘configuration’, each of which encodes how future records in this stream should be interpreted. An example of this would be the resolution of an image changing mid-stream: a new configuration record would encode the new image size, and reading linearly through the records would ensure that VRS knows how to parse the images correctly.

pyvrs surfaces a flexible random access API, with list-like behavior. This is very convenient and powerful, but requires careful management of read order of configuration records - if the matching preceding configuration record is not read, a record may not be interpretable at all.

pyvrs can assist users by always ensuring that preceding configuration records are automatically read before any record access. For the vast majority of use cases this ‘just works’, and allows people to read all records in a file without having to think about configuration records. However, for some niche use cases (e.g. badly formatted files with incorrect configuration record placement) this automated behavior could be undesirable.

To avoid any confusion on the matter, pyvrs lets users decide, at instantiation of the VRSReader, whether to opt in to automatic configuration record reading. To opt in explicitly, pass:

auto_read_configuration_records=True

to the constructor. In most cases, this is the behavior non-VRS experts want, and we set auto_read_configuration_records=True by default.

If you would prefer to disable all automatic reading of configuration records, pass:

auto_read_configuration_records=False

Users who pass this must take care to manually read configuration records in the correct order before reading data records.
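To make the read-order requirement concrete, the sketch below finds, for a given record, the most recent configuration record of the same stream at or before it. That is the record that would need to be read first for the later record to be interpretable. Records are modeled as plain (stream_id, record_type) tuples; the function name and the tuple layout are illustrative assumptions, not part of pyvrs.

```python
from typing import List, Optional, Tuple

# (stream_id, record_type): a simplified stand-in for real record metadata.
Record = Tuple[str, str]


def preceding_configuration_index(records: List[Record], index: int) -> Optional[int]:
    """Return the index of the latest configuration record of the same stream
    located at or before `index`, or None if the stream has none yet.

    Hypothetical helper for illustration only.
    """
    stream_id, _ = records[index]
    for i in range(index, -1, -1):  # walk backwards through the file order
        candidate_stream, candidate_type = records[i]
        if candidate_stream == stream_id and candidate_type == "configuration":
            return i
    return None


records = [
    ("1010-1", "configuration"),  # 0
    ("1011-1", "configuration"),  # 1
    ("1010-1", "data"),           # 2
    ("1010-1", "configuration"),  # 3: e.g. the image resolution changes here
    ("1011-1", "data"),           # 4
    ("1010-1", "data"),           # 5
]

for i in (2, 4, 5):
    print(i, "needs configuration record", preceding_configuration_index(records, i))
```

In this toy file, the data record at index 5 depends on the second configuration record (index 3), not the first one, which is why jumping directly to a record without reading its governing configuration can fail.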

If you wish to read multiple VRS files simultaneously, use the following flag (False by default):

multi_path=True

close()[source]

Explicitly close the VRS reader without waiting for Python garbage collection.

property file_tags: Mapping[str, str]

Return a dict of all file tags present in this VRS file.

Returns:

Dictionary of all file tags, in the form {<tag>: <value>}

Return type:

Mapping[str, str]

abstract filtered_by_fields(*, stream_ids: Set[str] | str | None = None, record_types: Set[str] | str | None = None, min_timestamp: float | None = None, max_timestamp: float | None = None) FilteredVRSReader[source]

Filter this reader so that it only reads records matching the given conditions.

Parameters:
  • stream_ids

    Stream IDs that you are interested in.

    Note: filtering can be performed by glob-like wildcard matching.
    For example, if the available stream IDs are {'1010-1', '1011-1', '1011-2', '1011-10'}:
    >>> reader.filtered_by_fields(stream_ids='1011*')
    would keep '1011-1', '1011-2', '1011-10', but would discard '1010-1'.
    >>> reader.filtered_by_fields(stream_ids='1011-?')
    would only keep '1011-1', '1011-2'.
    >>> reader.filtered_by_fields(stream_ids='1011-1')
    would only keep '1011-1'.
    

  • record_types – Record types that you are interested in. The options are {“data”, “configuration”, “state”}.

  • min_timestamp – Minimum timestamp you want to read from.

  • max_timestamp – Maximum timestamp you want to read to.

Returns:

A FilteredVRSReader that represents the filtered VRS file.

find_stream(recordable_type_id: int, tag_name: str, tag_value: str) str[source]

Find stream matching recordable type and tag, and return its stream id.

Parameters:
  • recordable_type_id – The recordable type ID to match. A stream_id has the form <recordable_type_id>-<instance_id>.

  • tag_name – tag name that you are interested in

  • tag_value – tag value that you are interested in

Returns:

The stream ID that starts with recordable_type_id and has the given tag name and value.

find_streams(recordable_type_id: int, flavor: str = '') List[str][source]

Find streams matching recordable type and flavor, and return a list of their stream IDs.

Parameters:
  • recordable_type_id – The recordable type ID to match. A stream_id has the form <recordable_type_id>-<instance_id>.

  • flavor – The flavor that you are interested in. Defaults to an empty string.

Returns:

A list of stream IDs that start with recordable_type_id and have the given flavor.

get_estimated_frame_rate(stream_id: str) float[source]

Get the estimated frame rate for the given stream_id.

Parameters:

stream_id – stream_id that you are interested in.

Returns:

The estimated frame rate.

get_record_index_by_time(stream_id: str, timestamp: float, epsilon: float | None = None, record_type: RecordType | None = None) int[source]

Get index in filtered records by timestamp.

Parameters:
  • stream_id – stream_id that you are interested in.

  • timestamp – timestamp that you are interested in.

  • epsilon – Optional argument. If specified, we search for a record in the range (timestamp-epsilon) to (timestamp+epsilon) and return the nearest record.

  • record_type – Optional argument. If specified we search for record with the record_type.

Returns:

The absolute index of the record corresponding to the stream_id and timestamp.

Raises:
  • TimestampNotFoundError – If epsilon is not None and the record doesn’t exist within the time range.

  • ValueError – If epsilon is None and the record isn’t found using lower_bound.
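The two lookup modes described above (a lower_bound search when epsilon is None, and a nearest-record search within a window otherwise) can be sketched over a sorted list of timestamps with the standard bisect module. This is one reading of the docstring, not the library's actual implementation: index_by_time is a hypothetical helper, and TimestampNotFound is a local stand-in for the error the documentation calls TimestampNotFoundError.

```python
from bisect import bisect_left
from typing import Optional, Sequence


class TimestampNotFound(LookupError):
    """Local stand-in for the documented TimestampNotFoundError."""


def index_by_time(timestamps: Sequence[float], timestamp: float,
                  epsilon: Optional[float] = None) -> int:
    """Find an index in a sorted timestamp list (hypothetical helper).

    epsilon is None: lower_bound, i.e. first index whose timestamp >= `timestamp`.
    epsilon given:   nearest timestamp within [timestamp - epsilon, timestamp + epsilon].
    """
    i = bisect_left(timestamps, timestamp)  # first position with value >= timestamp
    if epsilon is None:
        if i == len(timestamps):
            raise ValueError("no record at or after the requested timestamp")
        return i
    # The nearest timestamp is always one of the two neighbours of position i.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(timestamps)]
    if candidates:
        best = min(candidates, key=lambda j: abs(timestamps[j] - timestamp))
        if abs(timestamps[best] - timestamp) <= epsilon:
            return best
    raise TimestampNotFound(f"no record within {epsilon} of {timestamp}")


timestamps = [0.0, 0.5, 1.0, 1.5]
print(index_by_time(timestamps, 0.6))               # lower_bound mode
print(index_by_time(timestamps, 0.6, epsilon=0.2))  # nearest-within-window mode
```

The two modes can return different indices for the same query: lower_bound moves forward to the next record, while the epsilon search may pick an earlier record if it is closer.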

get_records_count(stream_id: str, record_type: RecordType) int[source]

Get the number of records for the stream_id & record_type.

Parameters:
  • stream_id – stream_id you are interested in.

  • record_type – record type you are interested in.

Returns:

The number of records for stream_id & record type

get_stream_for_flavor(recordable_type_id: int, flavor: str, index_number: int = 0) str[source]

Get a recordable id for a specific recordable type id (device type), flavor and index number

Parameters:
  • recordable_type_id – The recordable type ID to match. A stream_id has the form <recordable_type_id>-<instance_id>.

  • flavor – A flavor of device to look for.

  • index_number – The number of the index of the stream. Defaults to 0.

get_stream_info(stream_id: str) Dict[str, str][source]

Get details about a stream.

Parameters:

stream_id – stream_id you are interested in.

Returns:

A dictionary of information about the stream.

get_timestamp_for_index(index: int) float[source]

Get the timestamp corresponding to the given index.

Parameters:

index – the index for the record

Returns:

The timestamp corresponding to the index.

get_timestamp_list(indices: List[int] | None = None) List[float][source]

Get the list of timestamps corresponding to the given indices.

Parameters:

indices – the list of indices whose timestamps we want.

Returns:

A list of timestamps corresponding to the indices. If indices is None, the full timestamp list is returned.

property max_timestamp: float

Return the maximum timestamp of this VRS file.

might_contain_audio(stream_id: str) bool[source]

Check if the given stream_id might contain audio data.

Parameters:

stream_id – stream_id that you are interested in.

Returns:

Whether the stream might contain audio data, based on the config record.

might_contain_images(stream_id: str) bool[source]

Check if the given stream_id might contain image data.

Parameters:

stream_id – stream_id that you are interested in.

Returns:

Whether the stream might contain image data, based on the config record.

property min_timestamp: float

Return the minimum timestamp of this VRS file.

property n_records: int

Return the number of records in this VRS file.

read_next_record(stream_id: str, record_type: str, index: int) VRSRecord | None[source]

Read the first record that matches stream_id and record_type and whose index is greater than or equal to the given index.

Parameters:
  • stream_id – stream_id that you are interested in.

  • record_type – record_type that you are interested in.

  • index – the absolute index in the file. Based on this index, try to find the next record that matches stream_id & record_type

Returns:

VRSRecord if there is a record, otherwise None

read_prev_record(stream_id: str, record_type: str, index: int) VRSRecord | None[source]

Read the last record that matches stream_id and record_type and whose index is smaller than or equal to the given index.

Parameters:
  • stream_id – stream_id that you are interested in.

  • record_type – record_type that you are interested in.

  • index – the absolute index in the file. Based on this index, try to find the previous record that matches stream_id & record_type

Returns:

VRSRecord if there is a record, otherwise None
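The search direction of read_next_record and read_prev_record, as described above, can be illustrated on a plain list. The sketch below returns indices rather than records and models each record as a (stream_id, record_type) tuple; both function names and the tuple layout are assumptions made for illustration, not pyvrs code.

```python
from typing import List, Optional, Tuple

# (stream_id, record_type): a simplified stand-in for real records.
Record = Tuple[str, str]


def next_matching(records: List[Record], stream_id: str,
                  record_type: str, index: int) -> Optional[int]:
    """First matching position at or after `index`, else None."""
    for i in range(max(index, 0), len(records)):
        if records[i] == (stream_id, record_type):
            return i
    return None


def prev_matching(records: List[Record], stream_id: str,
                  record_type: str, index: int) -> Optional[int]:
    """Last matching position at or before `index`, else None."""
    for i in range(min(index, len(records) - 1), -1, -1):
        if records[i] == (stream_id, record_type):
            return i
    return None


records = [
    ("1010-1", "configuration"),  # 0
    ("1011-1", "configuration"),  # 1
    ("1010-1", "data"),           # 2
    ("1011-1", "data"),           # 3
    ("1011-1", "data"),           # 4
    ("1010-1", "data"),           # 5
]

print(next_matching(records, "1010-1", "data", 3))
print(prev_matching(records, "1010-1", "data", 3))
```

Both searches include the given index itself, matching the "greater than or equal" and "smaller than or equal" wording of the two methods.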

read_record_by_time(stream_id: str, timestamp: float, epsilon: float | None = None, record_type: RecordType | None = None) VRSRecord[source]

Read record by timestamp.

Parameters:
  • stream_id – stream_id that you are interested in.

  • timestamp – timestamp that you are interested in.

  • epsilon – Optional argument. If specified, we search for a record in the range (timestamp-epsilon) to (timestamp+epsilon) and return the nearest record.

  • record_type – Optional argument. If specified we search for record with the record_type.

Returns:

The VRSRecord corresponding to the stream_id and timestamp.

Raises:
  • TimestampNotFoundError – If epsilon is not None and the record doesn’t exist within the time range.

  • ValueError – If epsilon is None and the record isn’t found using lower_bound.

property record_types: Set[str]

Return a set of record types in this VRS file.

set_image_conversion(conversion: ImageConversion) None[source]

Set the default image conversion policy, and clear any stream-specific settings.

Parameters:

conversion – The image conversion you want to apply for all streams.

set_stream_image_conversion(stream_id: str, conversion: ImageConversion) None[source]

Set image conversion policy for a specific stream.

Parameters:
  • stream_id – The stream_id you want to apply image conversion to.

  • conversion – The image conversion you want to apply for a specific stream.

set_stream_type_image_conversion(recordable_type_id: str, conversion: ImageConversion) int[source]

Set image conversion policy for streams of a specific device type.

Parameters:
  • recordable_type_id – The recordable_type_id you want to apply image conversion to. If you specify 1000, streams with id 1000-* are the targets.

  • conversion – The image conversion you want to apply to streams of the given device type.

Returns:

The number of streams affected.
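The interaction between the three image conversion setters can be pictured with a small model: a default policy, per-stream overrides, and a type-wide setter that touches every stream whose ID starts with the given type. The class below is a toy written for illustration only. Its name, the conversion_for lookup, and the string policy values are assumptions (the real API takes ImageConversion values), and the assumption that a per-stream setting takes precedence over the default is inferred from the descriptions above rather than confirmed.

```python
from typing import Dict, Iterable, Set


class ConversionPolicies:
    """Toy model of default and per-stream conversion settings (not pyvrs code)."""

    def __init__(self, stream_ids: Iterable[str], default: str = "none") -> None:
        self.stream_ids: Set[str] = set(stream_ids)
        self.default = default
        self.per_stream: Dict[str, str] = {}

    def set_image_conversion(self, conversion: str) -> None:
        # New default for all streams; stream-specific settings are cleared.
        self.default = conversion
        self.per_stream.clear()

    def set_stream_image_conversion(self, stream_id: str, conversion: str) -> None:
        self.per_stream[stream_id] = conversion

    def set_stream_type_image_conversion(self, recordable_type_id: str, conversion: str) -> int:
        # Streams of type 1000 are those with IDs of the form 1000-*.
        prefix = f"{recordable_type_id}-"
        targets = [s for s in self.stream_ids if s.startswith(prefix)]
        for stream_id in targets:
            self.per_stream[stream_id] = conversion
        return len(targets)  # number of streams affected

    def conversion_for(self, stream_id: str) -> str:
        # Assumed precedence: stream-specific setting first, then the default.
        return self.per_stream.get(stream_id, self.default)


policies = ConversionPolicies({"1000-1", "1000-2", "1001-1"})
affected = policies.set_stream_type_image_conversion("1000", "policy-a")
print(affected, policies.conversion_for("1000-1"), policies.conversion_for("1001-1"))

policies.set_image_conversion("policy-b")  # resets the per-stream overrides
print(policies.conversion_for("1000-1"))
```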

property stream_ids: Set[str]

Return a set of stream ids in this VRS file.

property stream_tags: Mapping[str, Mapping[str, Any]]

Return a dict of all per-stream tags present in this VRS file.

Returns:

Dictionary of all per-stream tags, in the form {<stream_id>: {<tag>: <value>}}

Return type:

Mapping[str, Mapping[str, Any]]

property time_range: float

Return the timestamp range of this VRS file.