Many users have noticed that ChatGPT’s behavior has changed since its launch in November 2022. This paper examines how the platform’s responses to various prompts have changed over time, some for the better and some for the worse. Its authors stress the need for continuous monitoring of the quality of LLM services.
In the authors’ words: “Overall, our findings show that the behavior of the same LLM service can change substantially in a relatively short amount of time, highlighting the need for continuous monitoring of LLM quality.”