imitation.regularization

Implements a variety of regularization techniques for neural network (NN) weights.

Modules

imitation.regularization.regularizers

Implements the regularizer base class and some standard regularizers.

imitation.regularization.updaters

Implements parameter scaling algorithms to update the parameters of a regularizer.
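
To make the two ideas above concrete, here is a minimal, library-independent sketch in plain Python. It is not the imitation API: the function names `lp_penalty` and `scale_lambda`, their parameters, and the default values are hypothetical and chosen only for illustration. The first function shows what a weight regularizer typically computes (an Lp penalty added to the training loss); the second shows one plausible parameter scaling rule, assuming the regularization strength is adjusted according to the ratio of validation loss to training loss.

```python
from typing import Sequence, Tuple


def lp_penalty(weights: Sequence[float], lambda_: float, p: int = 2) -> float:
    """Hypothetical helper: return lambda_ * sum(|w|^p) over all weights.

    The result is meant to be added to the training loss, so that larger
    weights make the loss larger.
    """
    return lambda_ * sum(abs(w) ** p for w in weights)


def scale_lambda(
    lambda_: float,
    train_loss: float,
    val_loss: float,
    tolerable_interval: Tuple[float, float] = (0.9, 1.1),
    scaling_factor: float = 0.5,
) -> float:
    """Hypothetical updater: rescale lambda_ from the val/train loss ratio.

    Assumed rule (for illustration only):
      - ratio above the interval: the model looks overfit, so regularize more;
      - ratio below the interval: regularize less;
      - ratio inside the interval: leave lambda_ unchanged.
    """
    ratio = val_loss / train_loss
    lower, upper = tolerable_interval
    if ratio > upper:
        return lambda_ * (1 + scaling_factor)
    if ratio < lower:
        return lambda_ * (1 - scaling_factor)
    return lambda_


# Usage: penalize a toy weight vector, then adapt the regularization strength.
weights = [0.5, -1.0, 2.0]
lambda_ = 0.1
regularized_loss = 1.0 + lp_penalty(weights, lambda_)  # base loss + L2 penalty
lambda_ = scale_lambda(lambda_, train_loss=1.0, val_loss=1.5)  # ratio 1.5 > 1.1
```

For the actual classes and their signatures, consult the imitation.regularization.regularizers and imitation.regularization.updaters module pages.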

Copyright © 2019-2022, Center for Human-Compatible AI