OpenAI's ChatGPT, a popular AI chatbot, is expanding its capabilities to accept voice and image prompts in addition to text-based queries. The chatbot previously relied on text input alone; the new features let users engage with it in more intuitive ways.
-
Voice Commands
Users of ChatGPT's paid version can now speak their queries aloud. When a user taps the microphone icon, the system converts the spoken words into text, generates a response, and reads it back aloud. This functionality offers a conversational experience akin to interactions with virtual assistants like Siri or Google Assistant. Users can choose from five different voices for the bot's responses, and OpenAI says its improved language models produce more accurate answers.
-
Image-Based Queries
ChatGPT now supports image-based queries, functioning similarly to Google Lens. Users can upload images, and the chatbot will attempt to understand the user's intent and provide relevant responses. Users can add clarifications or follow-up questions by text or voice, which makes this feature more versatile than conventional image search tools.
OpenAI has placed certain limits on how ChatGPT processes images in order to protect user privacy. For example, to guard against potential misuse, the bot will not recognize or assess the identities of people in an image.
These enhancements are initially available to paying ChatGPT users and will roll out to free users in the near future.