OpenAI introduces a new collaboration program to aggregate datasets from external parties for use in training artificial intelligence models.
The “OpenAI Data Partnerships” initiative aims to gather a wide range of private and public datasets that are not already easily accessible to the public online.
The company states that the data need not be text, as the program also accepts images, audio, and video.
The company is seeking data on any topic and in any language, as long as it reflects human intention, as in long-form articles or written conversations.
The human-focused data that OpenAI collects is expected to improve the company’s tools, including a tool for automatically transcribing spoken words.
The initiative also fits with the recent expansion of the ChatGPT chatbot to support voice queries, which lets it interact conversationally with users.
Giving the company’s AI models more data from which to learn how to conduct human-like conversations could improve this feature as well as other tools.
As part of the Data Partnerships program, OpenAI ran a test aimed at enhancing the capabilities of its consumer-facing GPT-4 Turbo model, which the company developed to give users complex, meaningful responses.
OpenAI says it has already begun collaborating with interested institutions, including government bodies such as the Icelandic government. That collaboration aims to improve GPT-4’s ability to understand queries in Icelandic by integrating curated datasets.
Public and private institutions can participate in the program by filling out a form on the company’s website and describing the type and volume of data they wish to share.
The company explains that contributed datasets may be made available to the public, as part of an open dataset for training language models.
Alternatively, an institution can contribute a curated dataset privately, for use in training the company’s own models, including foundation models, fine-tuned models, and custom models.
This pathway is recommended for companies or institutions that want to keep their data private. OpenAI points out that it is not seeking datasets containing sensitive or personal information.
ChatGPT’s user base has grown to record levels, with approximately 100 million people worldwide using the tool every week, which makes privacy a key concern for the tool.