unienv_data.replay_buffer.collection_wrapper
ReplayBufferCollectionWrapper – transparently records transitions into replay buffers.
This wrapper sits around any Env and writes every step transition into one
or more ReplayBuffer instances according to the chosen replay_mode.
Environment-space invariant
This wrapper relies on the unienv_interface.env_base.env.Env contract
that when an env is batched, its observation_space, action_space, and
context_space already include the leading batch dimension. The internal
source/target space builders (_build_canonical_source_space and the default
DictSpace builder) therefore use the env-side spaces as-is and never wrap
them in an additional batch_space call. Re-batching those spaces would
double-batch and produce incorrect shapes – this was the root cause of a
previous bug in this module.
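To make the invariant concrete, the following minimal sketch models a space as nothing more than a shape tuple. The `batch_space` function here is a hypothetical stand-in defined in the snippet itself, not the library's implementation; it only illustrates why batching an already-batched env space produces a doubled leading dimension.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def batch_space(shape: Shape, batch_size: int) -> Shape:
    """Hypothetical stand-in: prepend a batch dimension to a shape."""
    return (batch_size,) + shape


batch_size = 4
single_obs_shape: Shape = (3,)

# A batched env already reports spaces with the leading batch dimension.
env_obs_shape = batch_space(single_obs_shape, batch_size)

# Correct: use the env-side shape as-is when building source/target spaces.
source_shape = env_obs_shape

# Incorrect: batching again adds a second leading dimension.
double_batched_shape = batch_space(env_obs_shape, batch_size)

print(source_shape)
print(double_batched_shape)
```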
ReplayBufferCollectionWrapper
ReplayBufferCollectionWrapper(env: Env[BArrayType, ContextType, ObsType, ActType, RenderFrame, BDeviceType, BDtypeType, BRNGType], replay_buffers: Union[ReplayBuffer, Sequence[ReplayBuffer]], replay_mode: Literal['auto', 'shared', 'per_env'] = 'auto', transition_builder: Optional[Union[Callable[..., Any], DataTransformation]] = None, validate_transition: bool = False, dump_on_episode_end: bool = False, dump_on_reset_boundary: bool = False, dump_path: Optional[PathLike] = None)
Bases: Wrapper[BArrayType, ContextType, ObsType, ActType, RenderFrame, BDeviceType, BDtypeType, BRNGType, BArrayType, ContextType, ObsType, ActType, RenderFrame, BDeviceType, BDtypeType, BRNGType]
Automatically record every transition into one or more replay buffers.
Parameters
env : Env
The environment to wrap.
replay_buffers : ReplayBuffer | Sequence[ReplayBuffer]
One replay buffer (shared mode) or one per environment slot (per‑env
mode).
replay_mode : "auto", "shared" or "per_env"
* "shared" - a single flat replay buffer.
* "per_env" - one buffer per batched slot, with episode segments.
* "auto" - resolve based on batch_size and buffer count.
transition_builder : callable or DataTransformation, optional
Either a callable (cached_obs, cached_context, action, next_obs,
reward, terminated, truncated, env_index, rb_single_space) ->
transition, or a DataTransformation that maps a canonical
raw transition dict to the replay buffer's format. When None
the default DictSpace‑based builder is used.
validate_transition : bool
When True, every built transition is validated against the target
replay buffer's space: the single‑instance space for an append, or the
batched space for a shared‑mode extend. Off by default for speed.
dump_on_episode_end : bool
When True, automatically call dumps(path) on the replay
buffer(s) after an episode ends (i.e., when terminated or truncated
is true). Requires dump_path to be set.
dump_on_reset_boundary : bool
When True, automatically call dumps(path) on reset boundaries
(full reset or masked reset). Requires dump_path to be set.
dump_path : str | os.PathLike | None
Filesystem path where the replay buffer(s) should be dumped when
auto-dump is triggered. May be reassigned after construction to
redirect subsequent dumps. Required if dump_on_episode_end or
dump_on_reset_boundary is True. Repeated dumps overwrite
the same path.
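As an illustration of the callable form of transition_builder, here is a minimal sketch that follows the nine-argument signature listed above. The function name, the output key names, and the merged "done" flag are assumptions made for this example; the real target format depends on the replay buffer's space. The sketch also assumes scalar terminated and truncated flags, and does not show whether the wrapper passes the arguments positionally or by keyword.

```python
from typing import Any, Dict


def simple_transition_builder(
    cached_obs: Any,
    cached_context: Any,
    action: Any,
    next_obs: Any,
    reward: Any,
    terminated: Any,
    truncated: Any,
    env_index: Any,
    rb_single_space: Any,
) -> Dict[str, Any]:
    """Pack one step into a flat dict (key names are illustrative only)."""
    return {
        "obs": cached_obs,
        "context": cached_context,
        "action": action,
        "next_obs": next_obs,
        "reward": reward,
        # Assumes scalar flags; batched arrays would need elementwise logic.
        "done": bool(terminated) or bool(truncated),
    }


# Stand-alone call with dummy values; no environment or buffer is involved.
example = simple_transition_builder(
    cached_obs=[0.0, 1.0],
    cached_context=None,
    action=1,
    next_obs=[0.5, 1.5],
    reward=1.0,
    terminated=False,
    truncated=True,
    env_index=0,
    rb_single_space=None,
)
print(example["done"])
```

A callable of this shape would be passed as transition_builder=simple_transition_builder when constructing the wrapper.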
shared
classmethod
shared(env: Env, replay_buffer: ReplayBuffer, **kwargs: Any) -> 'ReplayBufferCollectionWrapper'
Create a wrapper in shared mode with a single replay buffer.
per_env
classmethod
per_env(env: Env, replay_buffers: Sequence[ReplayBuffer], **kwargs: Any) -> 'ReplayBufferCollectionWrapper'
Create a wrapper in per‑env mode with one buffer per slot.
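The two constructors above pin the mode explicitly. For "auto", this page says only that the mode is resolved from batch_size and the buffer count; the exact rule is not stated. The sketch below shows one plausible rule purely as an assumption. `resolve_replay_mode` is a hypothetical helper and is not part of the library.

```python
from typing import Optional, Sequence, Union


def resolve_replay_mode(
    replay_buffers: Union[object, Sequence[object]],
    batch_size: Optional[int],
    replay_mode: str = "auto",
) -> str:
    """Hypothetical rule: one buffer -> shared, one per slot -> per_env."""
    if replay_mode != "auto":
        return replay_mode
    # Assumption: a bare buffer (not a list/tuple) means shared mode.
    if not isinstance(replay_buffers, (list, tuple)):
        return "shared"
    if batch_size is not None and len(replay_buffers) == batch_size:
        return "per_env"
    if len(replay_buffers) == 1:
        return "shared"
    raise ValueError(
        f"cannot resolve replay_mode: {len(replay_buffers)} buffers "
        f"for batch_size={batch_size}"
    )
```

Using the shared or per_env classmethod avoids depending on any such inference.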
reset
reset(*args: Any, mask: Optional[BArrayType] = None, seed: Optional[int] = None, **kwargs: Any) -> Tuple[ContextType, ObsType, Dict[str, Any]]
step
step(action: ActType) -> Tuple[ObsType, Union[SupportsFloat, BArrayType], Union[bool, BArrayType], Union[bool, BArrayType], Dict[str, Any]]
reset_async
reset_async(*args: Any, mask: Optional[BArrayType] = None, seed: Optional[int] = None, **kwargs: Any) -> None
step_wait
step_wait() -> Tuple[ObsType, Union[SupportsFloat, BArrayType], Union[bool, BArrayType], Union[bool, BArrayType], Dict[str, Any]]
set_replay_buffers
set_replay_buffers(replay_buffers: Union[ReplayBuffer, Sequence[ReplayBuffer]], *, replay_mode: Optional[Literal['auto', 'shared', 'per_env']] = None, require_same_space: bool = True, allow_mid_episode: bool = False, finalize_old_segments: bool = False, dump_old: bool = False) -> List[ReplayBuffer]
Swap the replay buffer(s) without rebuilding the wrapper.
Parameters
replay_buffers : ReplayBuffer | Sequence[ReplayBuffer]
The new replay buffer(s) to use.
replay_mode : "auto", "shared", "per_env", or None
If provided, change the replay mode. If None, keep the
current mode.
require_same_space : bool
If True, the new buffers must have the same single_space
as the old buffers. Default is True.
allow_mid_episode : bool
If True, allow swapping even if this splits an episode
across old/new buffers. Default is False.
finalize_old_segments : bool
If True, close any open segments in the old buffers before
swapping. Default is False.
dump_old : bool
If True, dump the old buffers before swapping. Requires
dump_path to be set. Where segments must be finalized first, the
dump happens only after that finalization. Default is False.
Returns
List[ReplayBuffer]
The old replay buffer(s) that were replaced.
Raises
ValueError
If the new buffers don't match the expected layout or space.
RuntimeError
If there are open segments and allow_mid_episode and
finalize_old_segments are both False.
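The RuntimeError condition described above can be sketched as a small guard. `check_swap_allowed` and its has_open_segments argument are hypothetical names used only for illustration; how the wrapper actually detects open segments is not described on this page.

```python
def check_swap_allowed(
    has_open_segments: bool,
    allow_mid_episode: bool = False,
    finalize_old_segments: bool = False,
) -> None:
    """Raise RuntimeError when a swap would silently split an episode."""
    if has_open_segments and not (allow_mid_episode or finalize_old_segments):
        raise RuntimeError(
            "open segments present; pass allow_mid_episode=True or "
            "finalize_old_segments=True to swap buffers"
        )


# No open segments: the swap is always allowed.
check_swap_allowed(has_open_segments=False)

# Open segments are tolerated once either flag is set.
check_swap_allowed(has_open_segments=True, finalize_old_segments=True)
```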