  • New Lego-building AI creates models that actually stand up in real life


    Karlston

    • 77 views
    • 5 minutes

    Carnegie Mellon "LegoGPT" system uses physics checks to ensure models don't collapse.


    Several examples of shapes created by LegoGPT.  

    Credit: Pun et al.

     

    On Thursday, researchers at Carnegie Mellon University unveiled LegoGPT, an AI model that creates physically stable Lego structures from text prompts. The new system not only designs Lego models that match text descriptions (prompts) but also ensures they can be built brick by brick in the real world, either by hand or with robotic assistance.

     

    "To achieve this, we construct a large-scale, physically stable dataset of LEGO designs, along with their associated captions," the researchers wrote in their paper, which was posted on arXiv, "and train an autoregressive large language model to predict the next brick to add via next-token prediction."

     

    This trained model generates Lego designs that match text prompts like "a streamlined, elongated vessel" or "a classic-style car with a prominent front grille." The resulting designs are simple, using just a few brick types to create primitive shapes—but they stand up. As one Ars Technica staffer joked this morning upon seeing the research, "It builds Lego like it's 1974."

     

    A LegoGPT demo video assembled by the research team.

     

    In the paper titled "Generating Physically Stable and Buildable Lego Designs from Text," the research team led by Ava Pun explained that many existing 3D generation models focus on making diverse objects with detailed geometry, but these digital designs often can't be physically made. "Without proper support, parts of the design can collapse, float, or remain disconnected," they wrote.

     

    Unlike previous attempts at autonomous Lego modeling, LegoGPT reportedly produces step-by-step instructions for building Lego creations that don't fall apart. You can see demos of the system in action on the project's website.

    How LegoGPT works

    To build LegoGPT, the Carnegie Mellon team repurposed the technology behind large language models (LLMs), similar to the kind that runs ChatGPT, for "next-brick prediction" instead of next-word prediction. To do so, the team fine-tuned LLaMA-3.2-1B-Instruct, an instruction-following language model from Meta.
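
    As a rough illustration of what "next-brick prediction" could look like, the sketch below writes a design as plain text, one brick per line, so that a language model could emit it one brick at a time. The `Brick` class, the "HxW (x,y,z)" line format, and the function names are assumptions made for this example; the article does not specify the encoding the researchers used.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Brick:
    """Hypothetical brick record: footprint size and grid position, in studs."""
    h: int  # footprint length along x
    w: int  # footprint length along y
    x: int
    y: int
    z: int  # layer index, 0 = ground


# Assumed text format, one brick per line: "2x4 (0,0,0)"
_LINE = re.compile(r"^(\d+)x(\d+) \((\d+),(\d+),(\d+)\)$")


def to_text(bricks):
    """Serialize a build sequence so a text model can predict it line by line."""
    return "\n".join(f"{b.h}x{b.w} ({b.x},{b.y},{b.z})" for b in bricks)


def from_text(text):
    """Parse model output back into bricks, rejecting malformed lines."""
    bricks = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match is None:
            raise ValueError(f"malformed brick line: {line!r}")
        bricks.append(Brick(*map(int, match.groups())))
    return bricks


design = [Brick(2, 4, 0, 0, 0), Brick(2, 4, 0, 2, 1)]
text = to_text(design)
```

    With a format like this, "predicting the next brick" reduces to predicting the next line of text, which is what lets an off-the-shelf language model be fine-tuned for the task.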

     

    The team then augmented the brick-predicting model with a separate software tool that can verify physical stability using mathematical models that simulate gravity and structural forces.

     

    To train the model, the team assembled a new dataset called "StableText2Lego," which contained over 47,000 stable Lego structures paired with descriptive captions generated by a separate AI model, OpenAI's GPT-4o. Each structure underwent physics analysis to ensure it could be built in the real world.

     

    To build the Lego dataset, the team fed images rendered from 24 different viewpoints into GPT-4o and let that model write captions for each Lego structure, asking it to focus on geometric features while omitting color information.
    Credit: Pun et al.

    LegoGPT works by first generating a sequence of precisely placed Lego bricks. For each new brick in the sequence, the system makes sure it doesn't collide with existing bricks and that it fits within the building space. After completing a design, it uses the aforementioned mathematical models to verify that the finished structure can stand upright without falling apart.
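
    The per-brick checks described above can be sketched with a set of occupied grid cells. This is a minimal illustration, not the researchers' code: the tuple layout `(h, w, x, y, z)`, the one-layer-tall bricks, and the helper names are all assumptions, and where the real system would ask the model for another brick, this sketch simply skips an invalid candidate. The grid size of 20 comes from the limitations mentioned later in the article.

```python
GRID = 20  # side length of the cubic building space reported in the article


def cells(brick):
    """Grid cells covered by a brick (h, w, x, y, z); bricks are one layer tall here."""
    h, w, x, y, z = brick
    return {(x + dx, y + dy, z) for dx in range(h) for dy in range(w)}


def is_valid(brick, occupied):
    """True if the brick lies inside the grid and overlaps no occupied cell."""
    h, w, x, y, z = brick
    in_bounds = (
        x >= 0 and y >= 0 and z >= 0
        and x + h <= GRID and y + w <= GRID and z < GRID
    )
    return in_bounds and cells(brick).isdisjoint(occupied)


def place_all(candidates):
    """Keep each candidate that passes both checks; skip the rest."""
    occupied, placed = set(), []
    for brick in candidates:
        if is_valid(brick, occupied):
            placed.append(brick)
            occupied |= cells(brick)
    return placed


candidates = [
    (2, 4, 0, 0, 0),   # fine: first brick on the ground
    (2, 4, 1, 1, 0),   # rejected: overlaps the first brick
    (2, 4, 0, 0, 1),   # fine: sits one layer up
    (2, 4, 19, 0, 0),  # rejected: sticks out of the 20-wide grid
]
placed = place_all(candidates)
```

    Note that these checks only rule out impossible placements. They say nothing about whether the finished structure stands up, which is why a separate stability analysis is still needed.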

     

    If parts would collapse in real life, the system identifies the first unstable brick and backtracks, removing it and all subsequent bricks before trying a different approach. This "physics-aware rollback" method proved essential to the team's approach. Without it, only 24 percent of designs remained standing, compared to 98.8 percent with the full system.
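
    The rollback loop can be sketched as follows. Two loud caveats: the stability test here is a crude stand-in (a brick counts as supported only if it rests on the ground layer or overlaps a supported brick directly beneath it), not the force-based analysis the researchers describe, and `propose` is a hypothetical placeholder for the language model that completes a design from a given prefix of bricks.

```python
def footprint(brick):
    """(x, y) cells under a brick given as (h, w, x, y, z)."""
    h, w, x, y, _ = brick
    return {(x + dx, y + dy) for dx in range(h) for dy in range(w)}


def first_unstable(bricks):
    """Index of the first unsupported brick in build order, or None.

    Stand-in rule, NOT real physics: supported means z == 0, or overlapping
    a supported brick in the layer directly below.
    """
    supported = set()
    # Visit bricks from the ground up so lower layers are resolved first.
    for i in sorted(range(len(bricks)), key=lambda k: bricks[k][4]):
        z = bricks[i][4]
        if z == 0 or any(
            bricks[j][4] == z - 1 and footprint(bricks[i]) & footprint(bricks[j])
            for j in supported
        ):
            supported.add(i)
    unstable = [i for i in range(len(bricks)) if i not in supported]
    return unstable[0] if unstable else None


def generate_with_rollback(propose, max_rollbacks=100):
    """Regenerate from just before the first unstable brick until stable."""
    bricks = []
    for _ in range(max_rollbacks + 1):
        bricks = propose(bricks)      # hypothetical model call: complete the design
        bad = first_unstable(bricks)
        if bad is None:
            return bricks
        bricks = bricks[:bad]         # drop the unstable brick and all later ones
    raise RuntimeError("no stable design found within the rollback budget")
```

    The key design choice is that the stable prefix is kept rather than discarded, so each retry only regenerates the part of the build that came after the failure.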

     

    The LegoGPT system works in three parts, shown in this diagram.  
    Credit: Pun et al.

    The researchers also expanded the system's abilities by adding texture and color options. For example, using an appearance prompt like "Electric guitar in metallic purple," LegoGPT can generate a guitar model, with bricks assigned a purple color.

    Testing with robots and humans

    To prove their designs worked in real life, the researchers had robots assemble the AI-created Lego models. They used a dual-robot arm system with force sensors to pick up and place bricks according to the AI-generated instructions.

     

    Human testers also built some of the designs by hand, showing that the AI creates genuinely buildable models. "Our experiments show that LegoGPT produces stable, diverse, and aesthetically pleasing Lego designs that align closely with the input text prompts," the team noted in its paper.

     

    When tested against other AI systems for 3D creation, LegoGPT stands out through its focus on structural integrity. The team tested against several alternatives, including LLaMA-Mesh and other 3D generation models, and found its approach produced the highest percentage of stable structures.

     

    A video of two robot arms building a LegoGPT creation, provided by the researchers.

     

    Still, there are some limitations. The current version of LegoGPT only works within a 20×20×20 building space and uses a mere eight standard brick types. "Our method currently supports a fixed set of commonly used Lego bricks," the team acknowledged. "In future work, we plan to expand the brick library to include a broader range of dimensions and brick types, such as slopes and tiles."

     

    The researchers also hope to scale up their training dataset to include more objects than the 21 categories currently available. Meanwhile, others can literally build on their work—the researchers released their dataset, code, and models on their project website and GitHub.

     

    Source



