IMdpReward
interface · namespace Hazel
Engine-side reward signal. Computes a scalar reward from MuJoCo/agent state each step. Examples: velocity tracking error, joint acceleration penalty, feet air time.
Properties
Descriptor
Capability descriptor for the manifest.
Name
Unique identifier used in contracts and manifests.
Methods
Compute(MdpContext, Dictionary)
Computes the scalar reward for the current step. This method is on the hot path and must not allocate.
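As a rough illustration of what an allocation-free implementation might look like, the sketch below implements a velocity-tracking reward. This page does not show the real Hazel types, so everything here is an assumption: the `MdpContext` fields, the `Dictionary<string, float>` parameter type, the `float` return type, and the `sigma` parameter are hypothetical stand-ins, and the `Descriptor` property is omitted because its type is not documented here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Hazel's MdpContext; the real fields are not shown on this page.
public sealed class MdpContext
{
    public float[] BaseLinearVelocity = new float[3]; // assumed: measured base velocity (x, y, z)
    public float[] CommandedVelocity = new float[3];  // assumed: target base velocity (x, y, z)
}

// Trimmed stand-in for the interface. The Dictionary type arguments and the
// float return type are assumptions; Descriptor is left out.
public interface IMdpReward
{
    string Name { get; }
    float Compute(MdpContext context, Dictionary<string, float> parameters);
}

public sealed class VelocityTrackingReward : IMdpReward
{
    public string Name => "velocity_tracking";

    public float Compute(MdpContext context, Dictionary<string, float> parameters)
    {
        // TryGetValue with a value-type out parameter does not allocate.
        // "sigma" is a hypothetical tuning parameter with an arbitrary default.
        if (!parameters.TryGetValue("sigma", out float sigma))
        {
            sigma = 0.25f;
        }

        // Squared tracking error in the horizontal plane, accumulated in locals
        // so that no temporary arrays, LINQ iterators, or closures are created.
        float errorSquared = 0f;
        for (int i = 0; i < 2; i++)
        {
            float diff = context.CommandedVelocity[i] - context.BaseLinearVelocity[i];
            errorSquared += diff * diff;
        }

        // Exponential kernel: 1 when tracking is perfect, decaying toward 0 as error grows.
        return MathF.Exp(-errorSquared / sigma);
    }
}
```

The loop reads from arrays that already exist and writes only to locals, which is one way to meet the no-allocation requirement. Things to avoid in this method would include `new` on reference types, LINQ, string formatting, and boxing. On runtimes that provide it, comparing `GC.GetAllocatedBytesForCurrentThread()` before and after a batch of calls is one way to check that a given implementation really does not allocate.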
Source: Hazel/Learn/MdpComponent.cs