Google has added audio file uploads to its Gemini app on Android, iOS, and the web. With the update, Gemini can transcribe audio, identify speakers, extract key points, and produce concise summaries. It can also pull action items and notable quotes out of a recording.
Enhanced Functionality Across Platforms
The app supports multiple audio formats, including MP3, M4A, and WAV, offering flexibility for users who operate across different devices. This feature is accessible via the plus menu for mobile app users and the "Upload files" option on the web version of Gemini.
Users can upload up to 10 audio files in a single prompt, provided their combined length stays within the audio allowance, which is 10 minutes for free users. The 10-file cap is part of Gemini's general per-prompt file limit, which also covers code folders, GitHub repositories, and ZIP archives, among others.
Limitations and User Options
Google sets separate allowances for free and paid users: free users are limited to 10 minutes of audio, while paid users receive a larger upload allotment. The tiering lets users pick the level that matches how much audio they need to process in Gemini.
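For readers who prepare recordings in bulk, the limits described above can be expressed as a simple pre-upload check. The following Python sketch is illustrative only: `AudioFile`, `check_batch`, and the constant names are hypothetical and not part of any Google API, the format list covers only the three formats named in this article (Gemini may accept others), and the paid-tier allowance is left as a parameter because the article does not state it.

```python
from dataclasses import dataclass
from pathlib import Path

# Values taken from the article; the names themselves are hypothetical.
SUPPORTED_EXTENSIONS = {".mp3", ".m4a", ".wav"}  # formats named in the article
MAX_FILES_PER_PROMPT = 10                        # per-prompt file cap
FREE_TIER_MAX_SECONDS = 10 * 60                  # 10 minutes of audio for free users


@dataclass
class AudioFile:
    name: str
    duration_seconds: float


def check_batch(files, max_total_seconds=FREE_TIER_MAX_SECONDS):
    """Return a list of problems; an empty list means the batch fits the limits."""
    problems = []

    if len(files) > MAX_FILES_PER_PROMPT:
        problems.append(
            f"too many files: {len(files)} (limit {MAX_FILES_PER_PROMPT})"
        )

    for audio in files:
        if Path(audio.name).suffix.lower() not in SUPPORTED_EXTENSIONS:
            problems.append(f"unsupported format: {audio.name}")

    total = sum(audio.duration_seconds for audio in files)
    if total > max_total_seconds:
        problems.append(
            f"total audio too long: {total:.0f}s (limit {max_total_seconds:.0f}s)"
        )

    return problems
```

Calling `check_batch` with the default argument applies the free-tier allowance; a paid user would pass their own allowance as `max_total_seconds`.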
Google describes audio upload as one of the most requested features among Gemini users. Audio processing makes Gemini more useful for both personal and professional work, letting users get more out of recorded discussions, interviews, and meetings.
Overall, the audio feature is a notable step toward making Gemini a broader tool for working with content and data, alongside its existing capabilities.