A novel technique is aiding robots in efficiently packing objects into tight spaces. MIT researchers harnessed a family of generative AI models to collaboratively tackle complex robot manipulation challenges that require multiple steps.
Fitting a large amount of luggage into a car’s trunk is a dense packing problem, and a difficult one even for people. Robots struggle with such tasks, too.
For robots, solving the packing problem entails meeting various constraints, such as stacking luggage to prevent it from spilling, ensuring heavier items aren’t atop lighter ones, and preventing collisions between the robot arm and the car’s bumper.
Conventional methods address this issue step by step, guessing partial solutions that meet one constraint at a time and subsequently checking for violations of other constraints. With an extensive sequence of actions and a pile of luggage to handle, this process becomes impractically time-consuming.
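The cost of this guess-then-check loop can be seen in a toy sketch. It is not the researchers' code or any particular planner: the one-dimensional "trunk", the function names, and the two constraints (stay inside the trunk, do not overlap) are all assumptions made for illustration.

```python
import random

def in_trunk(x, width, trunk_len):
    """Constraint A (hypothetical): the box lies fully inside the trunk."""
    return 0.0 <= x and x + width <= trunk_len

def overlaps(a, b):
    """Constraint B (hypothetical): True if two (left_edge, width) boxes collide."""
    (xa, wa), (xb, wb) = a, b
    return xa < xb + wb and xb < xa + wa

def guess_and_check(widths, trunk_len, rng, max_tries=100_000):
    """Guess placements that satisfy constraint A, then check constraint B.

    Returns (placements, tries), or (None, max_tries) if every guess failed.
    """
    n = len(widths)
    for tries in range(1, max_tries + 1):
        # Step 1: sample positions that satisfy only the in-trunk constraint.
        placed = [(rng.uniform(0.0, trunk_len - w), w) for w in widths]
        # Step 2: only afterwards check the other constraint.
        collision = any(
            overlaps(placed[i], placed[j])
            for i in range(n) for j in range(i + 1, n)
        )
        if not collision:
            return placed, tries
    return None, max_tries

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (2, 3, 4):
        # Keep one unit of free space regardless of how many boxes there are.
        _, tries = guess_and_check([1.0] * n, float(n + 1), rng)
        print(f"{n} boxes: accepted after {tries} guesses")
```

In this toy setup the chance that a single guess passes the overlap check shrinks quickly as boxes are added, which is the scaling problem the article describes.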
To address this, MIT researchers turned to a form of generative AI known as a diffusion model. Their approach uses a collection of machine-learning models, each trained to represent one specific type of constraint. These models are combined to generate global solutions to the packing problem, taking all constraints into account at once.
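The idea of combining one model per constraint can be illustrated with a small sketch. This is an assumption-laden stand-in, not the researchers' implementation: hand-written penalty gradients take the place of trained diffusion models, the one-dimensional trunk is invented for illustration, and the sampler is a simple annealed noisy gradient descent. What it does share with the description above is that every constraint contributes to each update of all object positions at once.

```python
import random

def trunk_grad(centers, widths, trunk_len):
    """Gradient of a squared penalty for boxes sticking out of the trunk."""
    grads = []
    for c, w in zip(centers, widths):
        left_out = max(0.0, w / 2 - c)                # overhang past the left wall
        right_out = max(0.0, c + w / 2 - trunk_len)   # overhang past the right wall
        grads.append(-2.0 * left_out + 2.0 * right_out)
    return grads

def overlap_grad(centers, widths, trunk_len):
    """Gradient of a squared penalty for pairwise overlap between boxes."""
    n = len(centers)
    grads = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            diff = centers[i] - centers[j]
            pen = max(0.0, (widths[i] + widths[j]) / 2 - abs(diff))
            direction = 1.0 if diff >= 0 else -1.0
            grads[i] -= 2.0 * pen * direction
            grads[j] += 2.0 * pen * direction
    return grads

def total_violation(centers, widths, trunk_len):
    """Sum of wall overhangs and pairwise penetrations (0 means feasible)."""
    n = len(centers)
    v = sum(max(0.0, w / 2 - c) + max(0.0, c + w / 2 - trunk_len)
            for c, w in zip(centers, widths))
    v += sum(max(0.0, (widths[i] + widths[j]) / 2 - abs(centers[i] - centers[j]))
             for i in range(n) for j in range(i + 1, n))
    return v

def compose_and_sample(widths, trunk_len, constraint_grads, rng,
                       steps=2000, step_size=0.05):
    """Start from random positions and refine them using the summed gradients."""
    centers = [rng.uniform(0.0, trunk_len) for _ in widths]
    for t in range(steps):
        # Noise shrinks to zero over the first 80% of steps, then stays off.
        noise = 0.3 * max(0.0, 1.0 - t / (0.8 * steps))
        total = [0.0] * len(centers)
        for grad_fn in constraint_grads:  # composition: add every constraint
            for k, g in enumerate(grad_fn(centers, widths, trunk_len)):
                total[k] += g
        centers = [c - step_size * g + noise * step_size ** 0.5 * rng.gauss(0.0, 1.0)
                   for c, g in zip(centers, total)]
    return centers

if __name__ == "__main__":
    widths = [1.0, 1.0, 1.0]
    result = compose_and_sample(widths, 4.0, [trunk_grad, overlap_grad],
                                random.Random(0))
    print("centers:", [round(c, 3) for c in result])
    print("remaining violation:", total_violation(result, widths, 4.0))
```

Because the constraints enter only through the list passed as `constraint_grads`, a new combination of constraints can be tried by changing that list, loosely mirroring how separately trained models could be mixed and matched.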
Their approach proved capable of generating effective solutions faster than other techniques and producing a greater number of successful solutions within the same timeframe. Importantly, it could handle problems with novel combinations of constraints and larger sets of objects, which the models hadn’t encountered during training.
Thanks to its adaptability, this technique can teach robots to understand and satisfy the overall constraints of packing problems, such as the importance of avoiding collisions or the desire for specific objects to be adjacent to each other. Robots trained in this manner could be applied to a wide range of complex tasks in various settings, from warehouse order fulfillment to home bookshelf organization.
Zhutian Yang, a graduate student in electrical engineering and computer science at MIT and the lead author of a paper on this new machine-learning technique, envisions pushing robots to tackle more intricate tasks characterized by numerous geometric constraints and continuous decision-making, typical of service robots operating in diverse human environments. She believes that compositional diffusion models can empower robots to address these complex problems and achieve remarkable generalization results.
The research was a collaborative effort involving MIT graduate students Jiayuan Mao and Yilun Du; Jiajun Wu, an assistant professor of computer science at Stanford University; Joshua B. Tenenbaum, a professor in MIT’s Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); Tomás Lozano-Pérez, an MIT professor of computer science and engineering and a member of CSAIL; and senior author Leslie Kaelbling, the Panasonic Professor of Computer Science and Engineering at MIT and a member of CSAIL. The findings will be presented at the Conference on Robot Learning.
While still in the early stages, this approach holds promise for enabling more efficient, safe, and reliable autonomous systems in various applications.