A new collaborative study from Microsoft Research reveals why even the smartest AI chatbots collapse in multi-turn conversations.
Top AI research labs have released sophisticated AI models and subsequent chatbots to cement their brand names in an ever-evolving landscape that is honestly becoming difficult to keep up with. However, users often complain about these offerings, citing hallucinations or outright wrong responses to queries.
A research paper by Microsoft Research and Salesforce analyzed 200,000+ AI conversations with some of the most advanced Large Language Models (LLMs), including GPT-4.1, Gemini 2.5 Pro, Claude 3.7 Sonnet, o3, DeepSeek R1, and Llama 4, and found that these tools often get "lost in conversation" when a task is spread across a natural, multi-turn conversation (via NeuroNad).
Generative AI has seemingly turned into a buzzword in the tech industry; it's what everyone in the business is talking about right now. The technology is gaining widespread adoption worldwide despite claims that it's a bubble on the verge of bursting.
In 2024, Microsoft claimed that ChatGPT wasn't better than Copilot AI. The company indicated that users weren't using the product as intended, while pointing the finger at poor prompt engineering skills.
The new study lends some support to that premise: LLMs tend to deliver better results in single-turn conversations than in multi-turn ones. It also found that the clear disparity in performance doesn't mean the models have miraculously become dumber.
The researchers detail that the models' aptitude only decreased by 15%, but their unreliability skyrocketed by 112%. So what exactly happened here? The researchers indicated that AI models tend to suffer from premature generation, where they attempt to provide a solution to your query before you've finished explaining it.
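To make the two numbers concrete, here is a rough Python sketch of how aptitude and unreliability could be measured from repeated runs of the same task. The article doesn't spell out the formulas, so the definitions here are an assumption: aptitude is taken as the 90th-percentile score (how good the model is on a good run) and unreliability as the gap between the 90th and 10th percentiles (how far a bad run falls from a good one). The scores are made up purely for illustration and are not from the study.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def aptitude(scores):
    # Assumed definition: best-case ability, taken as the 90th percentile.
    return percentile(scores, 90)


def unreliability(scores):
    # Assumed definition: spread between good runs and bad runs.
    return percentile(scores, 90) - percentile(scores, 10)


# Hypothetical scores (0-100) from running one task 11 times per setting.
single_turn = [70, 80, 82, 84, 85, 86, 88, 90, 91, 92, 95]
multi_turn = [20, 40, 45, 50, 60, 65, 70, 72, 75, 78, 85]

for name, scores in (("single-turn", single_turn), ("multi-turn", multi_turn)):
    print(name, "aptitude:", aptitude(scores), "unreliability:", unreliability(scores))
```

In this toy data, the best multi-turn runs score only moderately lower than the best single-turn runs, while the gap between good and bad runs grows much wider. That is the same shape as the pattern the researchers describe: a small dip in aptitude alongside a large jump in unreliability.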