Environments¶
UniEnv exposes two environment styles:
Env: a stateful object that owns its runtime state internallyFuncEnv: a functional interface that passes state explicitly
Both carry the same high-level contract: typed action, observation, and optional context spaces plus the standard reset, step, and render lifecycle.
Env¶
Use Env when you want the familiar environment shape:
reset(...) -> context, observation, infostep(action) -> observation, reward, terminated, truncated, inforender()
This is the most direct fit for interactive control loops, evaluation harnesses, and compatibility layers that expect a stateful object.
Important properties exposed by Env implementations:
action_spaceobservation_spacecontext_spacebackenddevicebatch_size
Env also includes convenience helpers such as sample_action() and sample_observation(), which draw from the declared spaces while updating the environment RNG.
FuncEnv¶
Use FuncEnv when explicit state passing is more important than object-owned mutable state.
The core methods are:
initial(...) -> state, context, observation, inforeset(state, ...) -> state, context, observation, infostep(state, action) -> state, observation, reward, terminated, truncated, info
This style is useful when you want:
- JAX-friendly execution patterns
- easier functional composition
- tighter control over state snapshots, rollouts, or transformations
Bridging The Two¶
FuncEnvBasedEnv adapts a FuncEnv into a stateful Env.
That means you can:
- implement the hard logic once in functional form
- expose a familiar object-oriented environment API to downstream code
- keep wrappers and tooling that expect
Envunchanged
Batched Environments¶
Both environment styles can represent batched execution through batch_size.
When batch_size is set:
- actions, observations, rewards, and done flags are expected to carry a batch dimension
reset(mask=...)can reset only selected batch elements- helper methods such as
update_observation_post_resetmerge masked resets back into a full batch
Space invariant for batched environments¶
A key contract that all Env and FuncEnv implementations share:
If an environment is batched, then its
observation_space,action_space, andcontext_space(when present) already describe batched values.
Concretely:
batch_size |
Meaning | Spaces |
|---|---|---|
None |
Unbatched – single instance, no leading batch axis. | Describe single-instance values. |
N (including 1) |
Batched – leading batch dimension of size N. |
Already include the leading batch axis of size N. |
batch_size == 1 is still a batched environment. Its spaces carry a leading dimension of size 1; they are not the same as the unbatched spaces.
Why this matters¶
Code that builds source/target spaces from the env – replay-buffer builders, data transformations, wrappers that unbatch per-slot data – must use the env-side spaces as-is. Applying an additional batch_space(...) on top of an already-batched env space will double-batch and produce incorrect shapes.
Common patterns:
- Correct:
source_space = env.observation_space(works for both batched and unbatched envs). - Incorrect:
source_space = batch_space(env.observation_space, env.batch_size)when the env is already batched – this adds a second batch dimension. - Unbatching per-slot data: use
get_at(env.observation_space, data, i)to slice slotiout of the batched data; the resulting per-slot space is the single-instance space obtained by unbatchingenv.observation_space, not by re-batching it.
Wrappers¶
UniEnv wrappers work at the Env layer. They can change:
- the action interface
- the observation and context interface
- backend placement
- episode length
- rendering and video export behavior
See Wrappers and Transformations for the main wrapper stack.
When To Use Which¶
Choose Env if:
- you want the most familiar interface
- your simulator already manages mutable runtime state
- you are wrapping an existing imperative system
Choose FuncEnv if:
- you want explicit state passing
- you care about purely functional rollout logic
- you want the same core logic to be easier to test, checkpoint, or stage
Use FuncEnvBasedEnv if you want both.