Gemini Now Lets You Generate Podcasts from PDFs & Documents

Google has been on a spree lately with its Gemini expansion. Following the introduction of new models, the company is now rolling out two new features: Audio Overview, which turns documents and PDFs into podcast-style discussions, and Canvas, a collaborative workspace for seamless AI-powered interactions.

Audio Overview is powered by Google’s NotebookLM, a specialized AI research assistant with more advanced document and web analysis capabilities than Gemini. The same technology also powered Spotify’s 2024 recap.

Audio Overview Expands to Deep Research

Google first introduced Audio Overview last year with Daily Recap, allowing users to generate AI-hosted podcast-style summaries from web sources and articles. Now, the feature is expanding to support documents, including PDFs and research papers, through Deep Research, which was recently integrated into Gemini.

Users can access Audio Overview from Deep Research directly within the Gemini app on mobile. After generating a study or research document, tapping on the file and selecting “Generate Audio Overview” in the menu initiates the feature. The same functionality is also available via Deep Research on the web.

At present, generating Audio Overviews from documents is limited to Deep Research on mobile and web. When we tested the feature in the Gemini app, a message indicated that the Audio Overview was being created, but the feature did not fully work. Google will likely expand support in the near future. The feature is already rolling out to Gemini and Gemini Advanced users, but it is currently available only in English.

Gemini Becomes Collaborative

Canvas, a major addition to Gemini, is a collaborative workspace designed for real-time document editing, interactive coding, and AI-powered previews.

For text-based projects, users can draft and edit documents while leveraging Gemini’s fine-tuning tools, which allow for tone adjustments, sentence shortening, and style modifications. Additionally, Gemini provides suggested edits to enhance writing quality. Canvas outputs can also be shared for collaboration via Google Docs.

Image: Canvas explaining the CSS code for a Tic-Tac-Toe game interface with a player-vs-computer option. Gemini gains Canvas, which enables collaborative and interactive coding. / © Google

Google is also enhancing Gemini’s coding capabilities with Canvas. The workspace lets users generate, preview, and test code directly, eliminating the need for separate simulators or coding apps. Supported outputs include web apps built with HTML or React, Python scripts, games, and other simulations.

Additionally, Canvas can be a valuable tool for learning to code, as it provides real-time insights and explanations of specific code strings and snippets.

Canvas is now available on Gemini Web for both basic and premium users. It supports all languages in which Gemini apps are currently offered.

While the new Gemini upgrades make it a more flexible AI, they also add complexity to the overall experience, making it feel less streamlined than a single chatbot like ChatGPT. Nevertheless, these features bring valuable functionality. What do you think? We’d love to hear your thoughts on these new additions!

Source

Guidantech