Data Retrieval with Importance Weights for Few-Shot Imitation Learning

Stanford
Conference on Robot Learning (CoRL) 2025 (Oral)
Method Figure

Our method consists of three main steps:

(A) Learning a latent space to encode state-action pairs.

(B) Estimating a probability distribution over the target and prior data, and using importance weights for data retrieval.

(C) Co-training on the target data and retrieved prior data.
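
As a concrete illustration of step (A), the sketch below learns such a latent space with a small variational autoencoder (VAE) over concatenated state-action pairs, a common choice in prior retrieval work. The VAE itself, the architecture, and all dimensions are illustrative assumptions, not the exact encoder used here.

import torch
import torch.nn as nn

# Hypothetical state-action VAE: the encoder's mean output serves as the
# latent representation used for retrieval in steps (B) and (C).
class StateActionVAE(nn.Module):
    def __init__(self, state_dim, action_dim, latent_dim=32, hidden=256):
        super().__init__()
        in_dim = state_dim + action_dim
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),  # predicts mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim),
        )

    def forward(self, state, action):
        x = torch.cat([state, action], dim=-1)
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        recon = self.decoder(z)
        # Standard ELBO: reconstruction term plus KL to a unit Gaussian
        recon_loss = (recon - x).pow(2).sum(-1).mean()
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return mu, recon_loss + kl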

By augmenting the high-quality but much smaller target dataset with diverse, relevant prior samples, we learn more robust and performant policies.
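
For step (C), a minimal co-training sketch in PyTorch is shown below. The 50/50 batch mixing between target and retrieved prior data is an illustrative assumption, not a reported ratio.

import torch
from torch.utils.data import ConcatDataset, DataLoader, WeightedRandomSampler

def make_cotraining_loader(target_ds, retrieved_prior_ds, batch_size=256):
    combined = ConcatDataset([target_ds, retrieved_prior_ds])
    # Upweight the small target dataset so each batch is, in expectation,
    # half target data and half retrieved prior data (assumed ratio).
    weights = torch.cat([
        torch.full((len(target_ds),), 0.5 / len(target_ds)),
        torch.full((len(retrieved_prior_ds),), 0.5 / len(retrieved_prior_ds)),
    ])
    sampler = WeightedRandomSampler(weights, num_samples=len(combined), replacement=True)
    return DataLoader(combined, batch_size=batch_size, sampler=sampler)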

Abstract

While large-scale robot datasets have propelled recent progress in imitation learning, learning from smaller task-specific datasets remains critical for deployment in new environments. Retrieval-based imitation learning addresses this by extracting relevant samples from large, widely available prior datasets to augment a limited demonstration dataset. To determine which samples are relevant, retrieval-based approaches commonly compute a prior data point's minimum distance to any point in the target dataset in latent space. While retrieval-based methods have shown success using this metric for data selection, it has two shortcomings. First, it relies on high-variance nearest-neighbor estimates that are susceptible to noise. Second, it does not account for the distribution of the prior data when retrieving.
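
For concreteness, here is a minimal sketch of that min-distance selection rule, assuming state-action latents have already been computed; the array shapes and retrieval fraction are illustrative.

import numpy as np
from scipy.spatial import cKDTree

def min_distance_retrieve(target_latents, prior_latents, fraction=0.1):
    # target_latents: (n_target, d) latents of the few-shot demonstrations
    # prior_latents:  (n_prior, d) latents of the large prior dataset
    tree = cKDTree(target_latents)
    # Distance from each prior point to its single nearest target neighbor:
    # a one-sample estimate, hence the high variance noted above.
    dists, _ = tree.query(prior_latents, k=1)
    n_keep = int(fraction * len(prior_latents))
    return np.argsort(dists)[:n_keep]  # indices of retrieved prior samples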

To address these issues, we introduce Importance Weighted Retrieval (IWR), which retrieves using importance weights, i.e., the ratio between the target and prior data distributions, estimated with Gaussian kernel density estimates (KDEs). By considering this probability ratio, IWR overcomes the bias of previous selection rules, and with reasonable modeling parameters it effectively smooths its estimates using all data points. Across both simulation and real-world evaluations on the Bridge dataset, IWR consistently improves the performance of existing retrieval-based methods while requiring only minor modifications.
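
A minimal sketch of this selection rule, as we read it from the description above: fit Gaussian KDEs to the target and prior latents and rank prior points by the log importance weight log p_target(z) - log p_prior(z). The bandwidth rule and retrieval fraction are illustrative assumptions, not the paper's settings.

import numpy as np
from scipy.stats import gaussian_kde

def iwr_retrieve(target_latents, prior_latents, fraction=0.1, bw="scott"):
    # scipy's gaussian_kde expects data with shape (d, n)
    kde_target = gaussian_kde(target_latents.T, bw_method=bw)
    kde_prior = gaussian_kde(prior_latents.T, bw_method=bw)
    # Log importance weight for each prior point; every data point contributes
    # to both density estimates, smoothing the score compared to a single
    # nearest-neighbor distance.
    log_w = kde_target.logpdf(prior_latents.T) - kde_prior.logpdf(prior_latents.T)
    n_keep = int(fraction * len(prior_latents))
    return np.argsort(-log_w)[:n_keep]  # highest-weight prior samples first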

Video Rollouts

Carrot (move carrot to dish rack)

IWR (Ours)
Behavior Retrieval
BC (no retrieval)

Corn (move corn to plate)

IWR (Ours)
Behavior Retrieval
BC (no retrieval)

Simulation: IWR vs. Behavior Retrieval (BR)

Soup-Sauce (put both the alphabet soup and the tomato sauce in the basket)

IWR (Ours)
Soup Sauce 1
Soup Sauce 2
Soup Sauce 3
Soup Sauce 4
Soup Sauce 5
BR
Soup Sauce 1
Soup Sauce 2
Soup Sauce 3
Soup Sauce 4
Soup Sauce 5


Mug-Mug (put the white mug on the left plate and put the yellow and white mug on the right plate)

IWR (Ours)
Mug Mug 1
Mug Mug 2
Mug Mug 3
Mug Mug 4
Mug Mug 5
BR
Mug Mug 1
Mug Mug 2
Mug Mug 3
Mug Mug 4
Mug Mug 5

Data Retrieval Visualization

Mug-Mug task (put the white mug on the left plate and put the yellow and white mug on the right plate)

IWR (1) retrieves a higher proportion of directly relevant tasks and (2) achieves a more balanced distribution across timesteps.

Target Demo
IWR retrieved samples from prior
BR retrieved samples from prior
Retrieval Distribution Figure
Figure: Difference in retrieval distributions between BR and IWR for the Mug-Pudding task in terms of both tasks (left) and timesteps (right). BR faces two challenges: (1) object similarity across tasks results in retrieval of irrelevant demonstrations, and (2) tasks often share similar starting configurations, which bias retrieval towards initial samples instead of more informative later-stage actions. IWR addresses these limitations by upweighting samples containing underrepresented objects or occurring later in the demonstrations.

Results

Results Figure

Acknowledgments

This work was supported by NSF grants 2132847 and 2218760, DARPA YFA, ONR YIP, and the Cooperative AI Foundation.

BibTeX

@inproceedings{xie2025data,
  title={Data Retrieval with Importance Weights for Few-Shot Imitation Learning},
  author={Xie, Amber and Chand, Rahul and Sadigh, Dorsa and Hejna, Joey},
  booktitle={9th Annual Conference on Robot Learning},
  year={2025},
  url={https://arxiv.org/abs/2509.01657}
}