Data Retrieval with Importance Weights for Few-Shot Imitation Learning

Stanford
Conference on Robot Learning (CoRL) 2025 (Oral)
Method Figure

Our method consists of three main steps:

(A) Learning a latent space to encode state-action pairs.

(B) Estimating a probability distribution over the target and prior data, and using importance weights for data retrieval.

(C) Co-training on the target data and retrieved prior data.
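
As a concrete illustration of step (A), the sketch below learns such a latent space with a small variational autoencoder (VAE) over concatenated state-action pairs, a common choice in prior retrieval work. The VAE itself, the architecture, and all dimensions are illustrative assumptions, not the exact encoder used here.

import torch
import torch.nn as nn

# Hypothetical state-action VAE: the encoder's mean output serves as the
# latent representation used for retrieval in steps (B) and (C).
class StateActionVAE(nn.Module):
    def __init__(self, state_dim, action_dim, latent_dim=32, hidden=256):
        super().__init__()
        in_dim = state_dim + action_dim
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),  # predicts mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim),
        )

    def forward(self, state, action):
        x = torch.cat([state, action], dim=-1)
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        recon = self.decoder(z)
        # Standard ELBO: reconstruction term plus KL to a unit Gaussian
        recon_loss = (recon - x).pow(2).sum(-1).mean()
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return mu, recon_loss + kl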

By augmenting the high-quality but much smaller target dataset with diverse, relevant prior samples, we learn more robust and performant policies.
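
For step (C), a minimal co-training sketch in PyTorch is shown below. The 50/50 batch mixing between target and retrieved prior data is an illustrative assumption, not a reported ratio.

import torch
from torch.utils.data import ConcatDataset, DataLoader, WeightedRandomSampler

def make_cotraining_loader(target_ds, retrieved_prior_ds, batch_size=256):
    combined = ConcatDataset([target_ds, retrieved_prior_ds])
    # Upweight the small target dataset so each batch is, in expectation,
    # half target data and half retrieved prior data (assumed ratio).
    weights = torch.cat([
        torch.full((len(target_ds),), 0.5 / len(target_ds)),
        torch.full((len(retrieved_prior_ds),), 0.5 / len(retrieved_prior_ds)),
    ])
    sampler = WeightedRandomSampler(weights, num_samples=len(combined), replacement=True)
    return DataLoader(combined, batch_size=batch_size, sampler=sampler)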

Abstract

While large-scale robot datasets have propelled recent progress in imitation learning, learning from smaller task-specific datasets remains critical for deployment in new environments. Retrieval-based imitation learning addresses this by extracting relevant samples from large, widely available prior datasets to augment a limited demonstration dataset. To determine which samples are relevant, retrieval-based approaches commonly compute a prior data point's minimum distance to any point in the target dataset in latent space. While retrieval-based methods have shown success using this metric for data selection, it has two shortcomings. First, it relies on high-variance nearest-neighbor estimates that are susceptible to noise. Second, it does not account for the distribution of the prior data when retrieving.
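
For concreteness, here is a minimal sketch of that min-distance selection rule, assuming state-action latents have already been computed; the array shapes and retrieval fraction are illustrative.

import numpy as np
from scipy.spatial import cKDTree

def min_distance_retrieve(target_latents, prior_latents, fraction=0.1):
    # target_latents: (n_target, d) latents of the few-shot demonstrations
    # prior_latents:  (n_prior, d) latents of the large prior dataset
    tree = cKDTree(target_latents)
    # Distance from each prior point to its single nearest target neighbor:
    # a one-sample estimate, hence the high variance noted above.
    dists, _ = tree.query(prior_latents, k=1)
    n_keep = int(fraction * len(prior_latents))
    return np.argsort(dists)[:n_keep]  # indices of retrieved prior samples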

To address these issues, we introduce Importance Weighted Retrieval (IWR), which retrieves using importance weights, i.e., the ratio between the target and prior data distributions, estimated with Gaussian kernel density estimates (KDEs). By considering this probability ratio, IWR overcomes the bias of previous selection rules, and with reasonable modeling parameters it effectively smooths its estimates using all data points. Across both simulation and real-world evaluations on the Bridge dataset, IWR consistently improves the performance of existing retrieval-based methods while requiring only minor modifications.
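
A minimal sketch of this selection rule, as we read it from the description above: fit Gaussian KDEs to the target and prior latents and rank prior points by the log importance weight log p_target(z) - log p_prior(z). The bandwidth rule and retrieval fraction are illustrative assumptions, not the paper's settings.

import numpy as np
from scipy.stats import gaussian_kde

def iwr_retrieve(target_latents, prior_latents, fraction=0.1, bw="scott"):
    # scipy's gaussian_kde expects data with shape (d, n)
    kde_target = gaussian_kde(target_latents.T, bw_method=bw)
    kde_prior = gaussian_kde(prior_latents.T, bw_method=bw)
    # Log importance weight for each prior point; every data point contributes
    # to both density estimates, smoothing the score compared to a single
    # nearest-neighbor distance.
    log_w = kde_target.logpdf(prior_latents.T) - kde_prior.logpdf(prior_latents.T)
    n_keep = int(fraction * len(prior_latents))
    return np.argsort(-log_w)[:n_keep]  # highest-weight prior samples first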

Video Rollouts

Carrot (move carrot to dish rack)

IWR (Ours)
Behavior Retrieval
BC (no retrieval)

Corn (move corn to plate)

IWR (Ours)
Behavior Retrieval
BC (no retrieval)

Simulation: IWR vs. Behavior Retrieval (BR)

Soup-Sauce (put both the alphabet soup and the tomato sauce in the basket)

IWR (Ours)
Soup Sauce 1
Soup Sauce 2
Soup Sauce 3
Soup Sauce 4
Soup Sauce 5
BR
Soup Sauce 1
Soup Sauce 2
Soup Sauce 3
Soup Sauce 4
Soup Sauce 5


Mug-Mug (put the white mug on the left plate and put the yellow and white mug on the right plate)

IWR (Ours)
Mug Mug 1
Mug Mug 2
Mug Mug 3
Mug Mug 4
Mug Mug 5
BR
Mug Mug 1
Mug Mug 2
Mug Mug 3
Mug Mug 4
Mug Mug 5

Data Retrieval Visualization

Mug-Mug task (put the white mug on the left plate and put the yellow and white mug on the right plate)

IWR (1) retrieves a higher proportion of directly relevant tasks and (2) achieves a more balanced distribution across timesteps.

Target Demo
IWR retrieved samples from prior
BR retrieved samples from prior
Retrieval Distribution Figure
Figure: Difference in retrieval distributions between BR and IWR for the Mug-Pudding task in terms of both tasks (left) and timesteps (right). BR faces two challenges: (1) object similarity across tasks results in retrieval of irrelevant demonstrations, and (2) tasks often share similar starting configurations, which bias retrieval towards initial samples instead of more informative later-stage actions. IWR addresses these limitations by upweighting samples containing underrepresented objects or occurring later in the demonstrations.

Results

Results Figure

Acknowledgments

This work was supported by NSF grants 2132847 and 2218760, DARPA YFA, ONR YIP, and the Cooperative AI Foundation.

BibTeX

@inproceedings{xie2025data,
  title={Data Retrieval with Importance Weights for Few-Shot Imitation Learning},
  author={Xie, Amber and Chand, Rahul and Sadigh, Dorsa and Hejna, Joey},
  booktitle={9th Annual Conference on Robot Learning},
  year={2025},
  url={https://arxiv.org/abs/2509.01657}
}