
IMdpReward

interface · namespace Hazel

Engine-side reward signal. Computes a scalar reward from MuJoCo/agent state each step. Examples: velocity tracking error, joint acceleration penalty, feet air time.

public interface IMdpReward

Properties

Descriptor

ComponentDescriptor Descriptor { get; }

Capability descriptor for the manifest.

Name

string Name { get; }

Unique identifier used in contracts and manifests.

Methods

Compute(MdpContext, Dictionary)

float Compute(MdpContext ctx, Dictionary<string, string> parameters)

Computes the scalar reward value for the current step. This method is on the hot path (it is called every step) and must not allocate.
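As a rough illustration of the contract above, the sketch below implements a joint acceleration penalty. The real `MdpContext` and `ComponentDescriptor` types are not shown on this page, so the versions here are hypothetical stand-ins. The `JointAccelerations` field, the `"weight"` parameter key, and the `joint_acc_l2` name are likewise assumptions made for the example, not part of Hazel's documented API. The interface itself is redeclared (in a separate `HazelSketch` namespace) only so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

namespace HazelSketch
{
    // Hypothetical stand-in: the real ComponentDescriptor is not documented here.
    public sealed class ComponentDescriptor
    {
        public string Summary { get; }
        public ComponentDescriptor(string summary) { Summary = summary; }
    }

    // Hypothetical stand-in: the real MdpContext exposes MuJoCo/agent state,
    // but its members are not documented here.
    public sealed class MdpContext
    {
        // Assumed field: per-joint accelerations for the current step.
        public float[] JointAccelerations = Array.Empty<float>();
    }

    // Mirrors the member signatures listed on this page.
    public interface IMdpReward
    {
        ComponentDescriptor Descriptor { get; }
        string Name { get; }
        float Compute(MdpContext ctx, Dictionary<string, string> parameters);
    }

    public sealed class JointAccelerationPenalty : IMdpReward
    {
        // Built once at construction, so Compute never has to create it.
        public ComponentDescriptor Descriptor { get; } =
            new ComponentDescriptor("Negative weighted sum of squared joint accelerations");

        public string Name => "joint_acc_l2";

        public float Compute(MdpContext ctx, Dictionary<string, string> parameters)
        {
            // Read the assumed "weight" parameter, defaulting to 1 when absent or malformed.
            float weight = 1.0f;
            if (parameters.TryGetValue("weight", out var raw) &&
                float.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out float parsed))
            {
                weight = parsed;
            }

            // Plain indexed loop over the existing array: no LINQ, no closures,
            // no temporary collections, so no heap allocations on this path.
            float[] acc = ctx.JointAccelerations;
            float sumOfSquares = 0f;
            for (int i = 0; i < acc.Length; i++)
            {
                sumOfSquares += acc[i] * acc[i];
            }

            return -weight * sumOfSquares;
        }
    }
}
```

The loop is written by hand rather than with `acc.Sum(a => a * a)` because LINQ allocates an enumerator and may allocate a delegate, which would violate the no-allocation requirement. An implementation that wants to avoid re-parsing the parameter string every step could cache the parsed weight in a field, assuming the parameters stay constant between calls.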


Source: Hazel/Learn/MdpComponent.cs