This work proposes a novel problem formulation for preference learning in human-robot interaction (HRI), where human preferences are encoded as soft planning constraints. The authors explore a data-driven method that enables a robot to infer these preferences by querying users, which they instantiate in rearrangement tasks in the Habitat 2.0 simulator.
The key highlights and insights are:
The authors distinguish between hard constraints (essential for task success) and soft constraints (desired but not required robot behavior) in planning, and focus on learning the soft constraints.
The authors represent preferences as a collection of sub-preferences, where each sub-preference corresponds to a specific aspect of the robot's behavior (e.g., order of subtasks, state of receptacles).
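One way to picture this decomposition is as a small discrete schema: each sub-preference names one aspect of robot behavior and the finite set of options it can take, and a full preference assigns one option to every aspect. The sketch below is illustrative only; the aspect names and option values are assumptions, not the paper's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubPreference:
    aspect: str     # behavior aspect, e.g. the order of subtasks
    options: tuple  # discrete values this aspect can take

@dataclass
class Preference:
    values: dict    # aspect name -> chosen option

# Hypothetical sub-preferences for a rearrangement task.
SUB_PREFS = (
    SubPreference("subtask_order", ("by_room", "by_object_type")),
    SubPreference("receptacle_state", ("leave_open", "close_after")),
)

def is_complete(pref: Preference) -> bool:
    """A full preference assigns a valid option to every sub-preference."""
    return all(
        sp.aspect in pref.values and pref.values[sp.aspect] in sp.options
        for sp in SUB_PREFS
    )

pref = Preference({"subtask_order": "by_room", "receptacle_state": "close_after"})
```

Under this view, inferring a user's preference reduces to inferring one option per aspect, which is what makes a per-sub-preference probability distribution a natural prediction target.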
The authors propose a neural network model to predict the user's preferences given a sequence of queries where the user chooses between potential robot behaviors. The model is trained to predict the probability distribution over the sub-preferences, capturing the uncertainty in the user's choices.
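The prediction target can be sketched as one softmax head per sub-preference: the query/answer history is encoded into a feature vector, and each head outputs a distribution over that aspect's options. Everything below is a stand-in sketch (the random "encoder" and linear heads are placeholders for the paper's learned model), meant only to show the shape of the output.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical setup: two sub-preferences, each with two options.
N_OPTIONS = {"subtask_order": 2, "receptacle_state": 2}
FEAT_DIM = 4

def encode_queries(query_answers):
    # Stand-in encoder: a real model would use a learned sequence encoder
    # over the user's query/answer history.
    rng = random.Random(len(query_answers))
    return [rng.uniform(-1, 1) for _ in range(FEAT_DIM)]

_rng = random.Random(0)
WEIGHTS = {
    aspect: [[_rng.uniform(-1, 1) for _ in range(FEAT_DIM)] for _ in range(k)]
    for aspect, k in N_OPTIONS.items()
}

def predict(query_answers):
    """Return, per sub-preference, a distribution over its options."""
    feats = encode_queries(query_answers)
    out = {}
    for aspect, rows in WEIGHTS.items():
        logits = [sum(w * f for w, f in zip(row, feats)) for row in rows]
        out[aspect] = softmax(logits)
    return out

dist = predict([("query_1", "choice_A"), ("query_2", "choice_B")])
```

Predicting a distribution rather than a single label lets the model express how uncertain it still is about each aspect after a given number of queries.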
The authors evaluate their approach under varied levels of noise in the simulated users' choices, and find that models trained with some noise perform better than a perfectly rational baseline, especially when generalizing to different noise levels.
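A common way to simulate a noisy user, shown here as an assumed sketch rather than the paper's exact choice model, is a Boltzmann-rational chooser: the probability of picking an option grows with its utility under the true preference, and a temperature-like parameter `beta` controls how noisy the choices are.

```python
import math
import random

def boltzmann_choice(utilities, beta, rng):
    """Pick option i with probability proportional to exp(beta * utilities[i]).

    Large beta approximates a perfectly rational user; small beta
    yields noisy, occasionally suboptimal choices.
    """
    m = max(utilities)
    weights = [math.exp(beta * (u - m)) for u in utilities]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1

rng = random.Random(0)
# Option 0 better matches the (hypothetical) true preference.
utils = [1.0, 0.0]
noisy = sum(boltzmann_choice(utils, beta=1.0, rng=rng) == 0 for _ in range(1000))
rational = sum(boltzmann_choice(utils, beta=50.0, rng=rng) == 0 for _ in range(1000))
```

With `beta=1.0` the better option is chosen only about 73% of the time, while `beta=50.0` makes mistakes vanishingly rare, mimicking the "perfectly rational" extreme the paper compares against.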
The authors also compare models supervised on the ground-truth preferences versus models supervised on the inferred probability distributions, finding that the latter can outperform the former when the training data has high levels of noise.
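The difference between the two supervision schemes can be made concrete with cross-entropy: a hard target is a one-hot vector on the ground-truth preference, while a soft target is a full distribution inferred from the (noisy) choices. The numbers below are made up for illustration.

```python
import math

def cross_entropy(pred, target):
    """Cross-entropy of a predicted distribution against a target distribution."""
    return -sum(t * math.log(p) for t, p in zip(target, pred) if t > 0)

pred = [0.6, 0.3, 0.1]  # model's predicted distribution over three options
hard = [1.0, 0.0, 0.0]  # one-hot ground-truth preference
soft = [0.7, 0.2, 0.1]  # hypothetical inferred distribution under noisy choices

hard_loss = cross_entropy(pred, hard)
soft_loss = cross_entropy(pred, soft)
```

Soft targets penalize the model for being confidently wrong about options that noisy users do sometimes choose, which is one intuition for why they can help when the training data is very noisy.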
Overall, this work presents a promising approach for learning human preferences as soft constraints in robot planning, paving the way for more adaptable and personalized robot behavior in the future.