AI for Doctors: MedGemma Reads Scans, MedASR Writes It Down

Google’s latest open-source medical AI models bring multimodal intelligence to the clinic and a $100K hackathon to the community. AI for doctors is here!

Google is pushing medical AI into its next phase. The company has just released MedGemma 1.5, a major upgrade to its open multimodal model for medical image interpretation.


Alongside it, Google is introducing MedASR, a new speech-to-text model fine-tuned specifically for medical language.

[Figure: Summary of the MedGemma collection of models and their capabilities.]

Together, these models aim to give doctors and developers more adaptable tools for interpreting everything from CT scans to dictated clinical notes. Both are available now on Hugging Face and Vertex AI.


📊 What’s New in MedGemma 1.5

MedGemma was built for the complexity of healthcare — a field where data comes in many forms: images, text, reports, and conversations. The 1.5 release brings significant upgrades across the board:

[Figure: Example showing how MedGemma 1.5 4B can be used ...]

High-dimensional imaging: Supports full-volume CTs, MRIs, and histopathology slides

Improved accuracy:

+3% on disease-related CT findings
+14% on MRI findings
Major gains in anatomical localization and lab report parsing

Lighter weight: The 4B parameter model is optimized for edge use, but scalable in the cloud

[Figure: Flow chart describing the intended use of MedGemma as a developer tool.]

“We built MedGemma to reflect the real-world, multimodal nature of medicine — and this update brings us closer to that goal.”
— Daniel Golden, Engineering Manager, Google Research

🗣️ Meet MedASR: AI That Understands Medical Speech

While MedGemma tackles vision and text, MedASR focuses on voice. It’s a new automatic speech recognition (ASR) model designed specifically for the medical domain, and it’s showing impressive results:

[Figure: MedASR can be used either for transcribing medical dictation (top) or to dictate prompts for MedGemma (bottom).]

58% fewer errors than Whisper large-v3 on chest X-ray dictation, based on Google’s internal benchmarks
82% fewer errors on diverse internal medical dictation tasks across specialties, also based on internal testing
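
To make "fewer errors" concrete, here is a minimal sketch of how such a figure is typically computed. It assumes the percentages are relative reductions in word error rate (WER), the usual ASR metric; the article does not name the metric. The transcripts below are made up for illustration and are not taken from Google's benchmarks.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word dropped (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline's errors that the new model eliminates."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical transcripts, for illustration only.
reference = "no focal consolidation pleural effusion or pneumothorax"
baseline_output = "no vocal consolidation plural effusion or new thorax"
improved_output = "no focal consolidation plural effusion or pneumothorax"

baseline_wer = word_error_rate(reference, baseline_output)
improved_wer = word_error_rate(reference, improved_output)
reduction = relative_error_reduction(baseline_wer, improved_wer)
print(f"baseline WER {baseline_wer:.3f}, improved WER {improved_wer:.3f}, "
      f"{reduction:.0%} fewer errors")
```

In this toy case the baseline makes four word errors against a seven-word reference and the improved transcript makes one, which works out to 75% fewer errors. A relative figure like this says nothing about the absolute error rate, so it is worth checking both when comparing models.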

Doctors can use MedASR to dictate clinical notes, while developers can integrate it to power spoken prompts for MedGemma — enabling fully voice-driven AI reasoning in clinical workflows.
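
The voice-driven flow described above can be sketched as a two-step pipeline: speech goes to a transcriber, and the transcript becomes the prompt for a language model. This is a rough illustration under stated assumptions, not Google's integration code. The names `run_voice_prompt`, `Transcriber`, `Generator`, and the stand-in functions are all hypothetical; in a real application the two callables would wrap MedASR and MedGemma using whatever loading code their model cards document.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type aliases: any functions with these shapes will do.
Transcriber = Callable[[bytes], str]  # raw audio bytes -> transcript
Generator = Callable[[str], str]      # text prompt -> model response


@dataclass
class VoicePromptResult:
    transcript: str
    prompt: str
    response: str


def run_voice_prompt(
    audio: bytes,
    transcribe: Transcriber,
    generate: Generator,
    instruction: str = "Answer the clinician's dictated request.",
) -> VoicePromptResult:
    """Transcribe dictated audio, then pass the transcript to a text model."""
    transcript = transcribe(audio).strip()
    if not transcript:
        # Fail early instead of sending an empty prompt downstream.
        raise ValueError("empty transcript; nothing to send to the model")
    prompt = f"{instruction}\n\nDictation: {transcript}"
    return VoicePromptResult(transcript, prompt, generate(prompt))


# Stand-ins so the sketch runs without downloading any model (assumption:
# real code would call MedASR and MedGemma here instead).
def fake_transcribe(audio: bytes) -> str:
    return " summarize the findings in this chest x-ray report "


def fake_generate(prompt: str) -> str:
    return f"[model output for a {len(prompt)}-character prompt]"


result = run_voice_prompt(b"\x00\x01", fake_transcribe, fake_generate)
print(result.prompt)
print(result.response)
```

Passing the models in as plain callables keeps the glue code independent of any particular serving setup, so the same function works whether the models run locally or behind an endpoint. Anything handling real dictation would also need to meet the health data requirements that apply to it.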

💡 Real-World Impact

Since its original release, MedGemma has already made waves:

Qmed Asia built a conversational AI interface for Malaysia’s clinical practice guidelines

Taiwan’s National Health Insurance used MedGemma to extract key data from 30,000+ pathology reports, helping guide lung cancer treatment decisions

The new models remain free for both research and commercial use, with tutorials and support available via GitHub and Hugging Face.

Use is subject to each model’s license, and developers must ensure compliance with applicable health data regulations such as HIPAA and GDPR.

🏆 The MedGemma Impact Challenge: $100K Up for Grabs

To inspire more real-world applications, Google is launching the MedGemma Impact Challenge — a global hackathon hosted on Kaggle with $100,000 in prizes.

Developers, startups, and researchers are invited to build innovative healthcare tools using MedGemma and MedASR. The goal? Showcase how open AI models can meaningfully improve patient care.

🛠️ Get Started

MedGemma 1.5, MedASR, and the full Health AI Developer Foundations (HAI-DEF) suite are available now on Hugging Face and Vertex AI.

For documentation, benchmarks, and fine-tuning tutorials, visit the HAI-DEF GitHub.

