Release

June 12, 20251 Minute Read

AI prompt editor and evaluations tooling now supports multi-turn conversations

You can now save and evaluate multi-turn conversations in the GitHub Models prompt editor and evaluations tooling!

Until now, the evaluations tooling only supported a single user prompt. With this update, you can include up to four rounds of user and assistant messages directly in your .prompt.yml file and test how models respond at the end of a longer interaction. In the API, you can include unlimited pairings.

This is especially useful for:

  • Testing memory and context retention. For example, in the case of a travel bot, does it still recommend snowy places by turn four after the user says “I want a cold destination” in turn two?
  • Ensuring consistent behavior as instructions evolve. For example, a shopping assistant where the user first says “make it under $100,” then later changes it to “under $200,” and the assistant correctly adjusts its recommendations.
  • Evaluating real-world chat flows. For example, a customer support agent that needs to escalate properly after several back-and-forth troubleshooting steps.

Start building AI apps with GitHub Models today

GitHub Models and all our AI development tooling are available now to all GitHub users in public preview. This includes prompt editing and lightweight evaluations. Try our tools out by enabling them in your repository or organization, or learn more in our documentation.

Help us shape what’s next

We’re just getting started, and your feedback helps guide our roadmap. Join the community discussion to share your thoughts and connect with other developers building the future of AI on GitHub.

Subscribe to our developer newsletter

Discover tips, technical guides, and best practices in our biweekly newsletter just for devs.

By submitting, I agree to let GitHub and its affiliates use my information for personalized communications, targeted advertising, and campaign effectiveness. See the GitHub Privacy Statement for more details.

AI prompt editor and evaluations tooling now supports multi-turn conversations - GitHub Changelog