Environments¶

UniEnv exposes two environment styles:

Env: a stateful object that owns its runtime state internally
FuncEnv: a functional interface that passes state explicitly

Both carry the same high-level contract: typed action, observation, and optional context spaces plus the standard reset, step, and render lifecycle.

`Env`¶

Use Env when you want the familiar environment shape:

reset(...) -> context, observation, info
step(action) -> observation, reward, terminated, truncated, info
render()

This is the most direct fit for interactive control loops, evaluation harnesses, and compatibility layers that expect a stateful object.

Important properties exposed by Env implementations:

action_space
observation_space
context_space
backend
device
batch_size

Env also includes convenience helpers such as sample_action() and sample_observation(), which draw from the declared spaces while updating the environment RNG.

`FuncEnv`¶

Use FuncEnv when explicit state passing is more important than object-owned mutable state.

The core methods are:

initial(...) -> state, context, observation, info
reset(state, ...) -> state, context, observation, info
step(state, action) -> state, observation, reward, terminated, truncated, info

This style is useful when you want:

JAX-friendly execution patterns
easier functional composition
tighter control over state snapshots, rollouts, or transformations

Bridging The Two¶

FuncEnvBasedEnv adapts a FuncEnv into a stateful Env.

That means you can:

implement the hard logic once in functional form
expose a familiar object-oriented environment API to downstream code
keep wrappers and tooling that expect Env unchanged

Batched Environments¶

Both environment styles can represent batched execution through batch_size.

When batch_size is set:

actions, observations, rewards, and done flags are expected to carry a batch dimension
reset(mask=...) can reset only selected batch elements
helper methods such as update_observation_post_reset merge masked resets back into a full batch

Space invariant for batched environments¶

A key contract that all Env and FuncEnv implementations share:

If an environment is batched, then its observation_space, action_space, and context_space (when present) already describe batched values.

Concretely:

`batch_size`	Meaning	Spaces
`None`	Unbatched – single instance, no leading batch axis.	Describe single-instance values.
`N` (including `1`)	Batched – leading batch dimension of size `N`.	Already include the leading batch axis of size `N`.

batch_size == 1 is still a batched environment. Its spaces carry a leading dimension of size 1; they are not the same as the unbatched spaces.

Why this matters¶

Code that builds source/target spaces from the env – replay-buffer builders, data transformations, wrappers that unbatch per-slot data – must use the env-side spaces as-is. Applying an additional batch_space(...) on top of an already-batched env space will double-batch and produce incorrect shapes.

Common patterns:

Correct: source_space = env.observation_space (works for both batched and unbatched envs).
Incorrect: source_space = batch_space(env.observation_space, env.batch_size) when the env is already batched – this adds a second batch dimension.
Unbatching per-slot data: use get_at(env.observation_space, data, i) to slice slot i out of the batched data; the resulting per-slot space is the single-instance space obtained by unbatching env.observation_space, not by re-batching it.

Wrappers¶

UniEnv wrappers work at the Env layer. They can change:

the action interface
the observation and context interface
backend placement
episode length
rendering and video export behavior

See Wrappers and Transformations for the main wrapper stack.

When To Use Which¶

Choose Env if:

you want the most familiar interface
your simulator already manages mutable runtime state
you are wrapping an existing imperative system

Choose FuncEnv if:

you want explicit state passing
you care about purely functional rollout logic
you want the same core logic to be easier to test, checkpoint, or stage

Use FuncEnvBasedEnv if you want both.