The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community

Shachar Don-Yehiya, Leshem Choshen, Omri Abend

2024-08-16


Summary

This paper introduces the ShareLM collection and its accompanying plugin, which allows users to contribute their conversations with language models to help improve AI development and research.

What's the problem?

While companies often gather user data to enhance their AI models, the open-source and research communities lack similar resources. This means that researchers do not have access to enough real-world conversations to help train and improve language models effectively.

What's the solution?

The authors created the ShareLM collection, a unified set of human conversations with language models, along with a plugin that enables users to easily share their interactions. The plugin allows users to rate their conversations and delete any they want to keep private before sharing, ensuring user control over their data.

Why does it matter?

This research is important because it encourages community involvement in AI development by providing a platform for sharing valuable conversation data. By contributing to the ShareLM collection, users can help improve language models, making them more effective and responsive to real-world needs.

Abstract

Human-model conversations provide a window into users' real-world scenarios, behavior, and needs, and thus are a valuable resource for model development and research. While for-profit companies collect user data through the APIs of their models, using it internally to improve their own models, the open-source and research community lags behind. We introduce the ShareLM collection, a unified set of human conversations with large language models, and its accompanying plugin, a Web extension for voluntarily contributing user-model conversations. Where few platforms share their chats, the ShareLM plugin adds this functionality, thus allowing users to share conversations from most platforms. The plugin allows the user to rate their conversations, both at the conversation and the response levels, and delete conversations they prefer to keep private before they ever leave the user's local storage. We release the plugin conversations as part of the ShareLM collection, and call for more community effort in the field of open human-model data. The code, plugin, and data are available.
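The workflow described in the abstract (hold conversations locally, let the user rate them at two levels, and let the user delete any of them before anything is shared) can be illustrated with a small sketch. This is not the plugin's code: the actual plugin is a Web extension, and every name below (`Response`, `Conversation`, `LocalStore`, `release`, the integer rating scale) is a hypothetical choice made only for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Response:
    text: str
    rating: Optional[int] = None  # response-level rating (hypothetical scale, e.g. +1 / -1)


@dataclass
class Conversation:
    conv_id: str
    user_turns: list[str]
    responses: list[Response]
    rating: Optional[int] = None  # conversation-level rating


class LocalStore:
    """Hypothetical local buffer: nothing leaves it until release() is called."""

    def __init__(self) -> None:
        self._pending: dict[str, Conversation] = {}

    def add(self, conv: Conversation) -> None:
        self._pending[conv.conv_id] = conv

    def rate_conversation(self, conv_id: str, rating: int) -> None:
        self._pending[conv_id].rating = rating

    def rate_response(self, conv_id: str, index: int, rating: int) -> None:
        self._pending[conv_id].responses[index].rating = rating

    def delete(self, conv_id: str) -> None:
        # Deleted conversations are dropped locally and never appear in a released batch.
        self._pending.pop(conv_id, None)

    def release(self) -> list[dict]:
        # Return the remaining conversations as plain dicts and empty the buffer.
        batch = [asdict(conv) for conv in self._pending.values()]
        self._pending.clear()
        return batch


# Usage: two conversations are stored, one is rated, the other is deleted before sharing.
demo = LocalStore()
demo.add(Conversation("c1", ["How do I sort a list?"], [Response("Use sorted().")]))
demo.add(Conversation("c2", ["Something private"], [Response("...")]))
demo.rate_conversation("c1", 1)
demo.rate_response("c1", 0, 1)
demo.delete("c2")
shared = demo.release()  # contains only "c1", with both ratings attached
```

The point of the design is that deletion happens on the local buffer, so a removed conversation is never part of what gets shared. How the real plugin delays or schedules uploads is not specified in the text above.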
