TL;DR: 2024 AI Innovations & 2025 Outlook
- Microsoft & OpenAI: Edge becomes “AI Browser,” GPT Store democratizes AI apps, GPT-4 Copilot ups productivity.
- Apple & Google: Apple invests in AI-powered Siri/Xcode; Google debuts Bard→Gemini, advanced image/video models.
- Anthropic & Meta: Claude 3 excels in coding/reasoning; Meta grows AI chatbots and open-source Llama 3.
- Nvidia & Robotics: Annual GPU/CPU releases, new AI models, and humanoid robotics (Figure, Boston Dynamics) gain traction.
- Healthcare & Biotech: China’s AI hospital, AlphaFold Nobel, AI-driven gene editing boost medical breakthroughs.
- Creative Tools: Cheaper coding (Qwen2.5), design/video AI (Canva, Runway, D-ID) reshape creative workflows.
- Global Governance: EU enacts first AI laws; US/UK partner with labs to audit advanced models.
- 2025: Anticipate bigger supercomputers, more OS-level AI, mass deployment of humanoid robots, and deeper medical AI.
Or you can watch our podcast for a detailed discussion similar to this!
January
Microsoft
Microsoft Edge Leads the New ‘AI Browser’ Era
Microsoft Edge’s mobile rebrand to “AI Browser” integrates Bing AI, GPT-4 Copilot, and DALL-E 3 image generation, redefining the web-browsing experience.
Copilot Pro & the AI Evolution
Launched at $20/month, Copilot Pro unifies GPT-4 Turbo across Microsoft 365 apps — speeding up creativity, boosting productivity, and introducing advanced image creation via the Designer tool.
OpenAI
OpenAI Opens One-Stop AI App Shop
With its new GPT Store, OpenAI empowers anyone to build custom AI models without coding and monetize them worldwide — ushering in a fresh era of AI entrepreneurship.
Introducing the GPT Store
ChatGPT Plus, Team, and Enterprise users can now explore a hub of GPT-based apps, ranging from writing aids to programming tools, complete with a community leaderboard.
OpenAI Tackles AI “Laziness”
New updates to GPT-4 Turbo (gpt-4-0125-preview) improve code generation and accuracy, while a more cost-effective GPT-3.5 Turbo model addresses efficiency and API affordability.
Figure
AI Robotics Unveiled: Figure’s Coffee Wizardry
Figure 01 brews coffee after just 10 hours of video training, demonstrating impressive agility and adaptive learning capabilities in a robotic humanoid.
AI Etiquette: Google’s AMIE Outclasses Human Doctors
AMIE, an AI chatbot, demonstrates higher diagnostic accuracy and surprising empathy — paving the way for AI-driven improvements in patient care.
Google’s Lumiere: A Novel AI Model for Video Synthesis
Lumiere generates five-second videos in a single pass from text or images, surpassing existing text-to-video models in realism, motion coherence, and text alignment.
February
Neuralink (Elon Musk)
Musk’s Consensual TELEPATHY
Neuralink successfully implants its first brain-computer interface into a human patient, showing promising neuron spike detection. This $5 billion endeavor could transform treatments for paralysis and other disabilities through AI-integrated neuroscience.
Bard’s Out, Gemini’s In: Blast Off with Ultra 1.0 and the Super App
Google rebrands its AI tool Bard as Gemini, adding a mobile app for Android and iOS plus Gemini Advanced with Ultra 1.0 for more complex tasks and context-aware conversations.
Gemini 1.5: The Next Frontier in AI
Google DeepMind unveils Gemini 1.5, leveraging a mixture-of-experts (MoE) architecture for better scalability and a boosted context window, enhancing AI performance across diverse applications.
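To make the mixture-of-experts idea concrete, here is a minimal, generic sketch of top-k expert routing. It is not Gemini's implementation: the four toy experts, the `moe_forward` name, and the hand-supplied router scores are all assumptions for illustration. In a real model, the router scores come from a learned layer and each expert is a neural network.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical "experts": in a real model each would be a feed-forward network.
EXPERTS = [
    lambda x: 2.0 * x,
    lambda x: x + 10.0,
    lambda x: -x,
    lambda x: x * x,
]


def moe_forward(x: float, router_scores: list[float], top_k: int = 2) -> float:
    """Run only the top_k highest-scoring experts and mix their outputs."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    weight_sum = sum(probs[i] for i in chosen)
    # Renormalise so the selected experts' weights sum to 1.
    return sum(probs[i] / weight_sum * EXPERTS[i](x) for i in chosen)


# Only two of the four experts run for this input.
print(moe_forward(3.0, router_scores=[0.0, 0.0, -100.0, -100.0]))
```

The point of the design is that only `top_k` experts execute per input, so total parameter count can grow without a matching growth in compute per token.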
EU AI Regulation
EU Unveils Groundbreaking AI Regulations
After extensive negotiations, EU member states approve the world’s first comprehensive AI laws, aiming to protect users and nurture AI innovation. With the European Parliament’s final vote approaching, this milestone sets a global benchmark in AI governance.
Apple
Apple Unveils AI-Powered MGIE Model
Co-developed with the University of California, Santa Barbara, MGIE (MLLM-Guided Image Editing) simplifies photo editing through text prompts, advancing generative AI for intuitive design and creativity.
Apple’s Secret AI Game-Changer: Ask Unveiled
Ahead of iOS 18, Apple quietly pilots ‘Ask,’ a ChatGPT-like feature for AppleCare advisors. It instantly retrieves solutions from internal databases, speeding up customer support and hinting at deeper AI integration.
OpenAI
AI Memory Lane: ChatGPT’s Next Evolution
New personalization features let ChatGPT remember key details, streamlining conversations and creating a more tailored user experience — with robust privacy controls to safeguard data.
Sora: OpenAI’s Video Magic
OpenAI’s text-to-video model, Sora, turns prompts into vivid, dynamic clips. Currently in limited testing, it demonstrates the next leap in AI-generated media for creators and filmmakers.
March
Nvidia & AI in Industry
NVIDIA GTC 2024 Highlights
- Blackwell Platform: Ushers in a new era for generative AI.
- 6G Research Platform: Accelerates next-gen wireless tech.
- Project GR00T: NVIDIA’s foundational model for humanoid robots, reshaping robotics innovation.
Nvidia’s $9/hr AI Nurse Solution
Partnering with Hippocratic AI, Nvidia debuts empathetic AI “nurses” that outperform human counterparts in patient interactions at a fraction of the cost — potentially redefining healthcare accessibility and staffing.
Figure AI
Figure AI’s Leap Toward Humanoid Helpers
Receiving $675M from investors like Jeff Bezos, Microsoft, and Nvidia, Figure AI is now valued at $2.6B. Its humanoid robot, Figure 01, aims to tackle labor shortages in manufacturing and logistics while steering clear of military applications.
Figure 01’s Leap into Conversational AI
Through a partnership with OpenAI, Figure 01 gains advanced vision-language capabilities and bimanual manipulation skills. It interprets visual data, plans tasks, and converses intelligently — all powered by neural network visuomotor transformers.
Apple
Apple’s AI Evolution with Siri’s New Leap
Apple plans a generative AI–enhanced Siri and additional AI tools in Xcode, with an eye on automating Apple Music playlists and refining enterprise solutions. Strategic publisher deals give Apple robust datasets for AI training.
Apple Gears Up for AI Revolution with DarwinAI Acquisition
Apple acquires Canadian startup DarwinAI for an undisclosed sum, boosting its 2024 AI ambitions. DarwinAI’s expertise in explainable AI and algorithm optimization positions Apple for major strides in generative AI.
Claude 3 (Anthropic)
AI’s Next Frontier: Introducing Claude 3
Anthropic launches Claude 3 across 159 countries in three distinct models — Haiku, Sonnet, and Opus — offering superior reasoning, coding, and multilingual communication. Opus outperforms competitors in complex tasks.
OpenAI
OpenAI’s Sora Enchants Hollywood
Sora, a text-to-video model, wows major studios like Universal and Paramount with its ability to craft cinematic clips from brief prompts. It promises both budget-friendly production and potential industry-wide disruption.
April
Microsoft
Microsoft Debuts Phi-3 Mini
A lean, 3.8B-parameter AI model available on Azure, Hugging Face, and Ollama. Phi-3 Mini excels at coding and reasoning — offering an efficient alternative to larger, more power-hungry models.
Gemini 1.5 Pro: Hear the Future of AI
Now available in 180+ countries, Gemini 1.5 Pro supports native audio analysis and JSON Mode. Developers get advanced tools and an enhanced text embedding model for building next-gen AI apps.
AI Storm Watch with Google SEEDS
SEEDS uses generative AI to produce faster, cheaper weather forecasts, including detecting rare extreme events — potentially revolutionizing climate preparedness.
Meta
Meta Tests AI Chatbot Across WhatsApp, Instagram, and Messenger
Meta pilots an LLM-powered chatbot in India and Africa. The goal: refine conversational commerce and user support features for these massive social platforms.
Meet Llama 3, the Open-Source AI Shaking Up the Game
Meta’s new open-source model comes in 8B/70B parameter sizes for scalability. Safety tools accompany Llama 3, reflecting Meta’s commitment to responsible AI.
Boston Dynamics & Mentee Robotics
The Future Is Now (and It Has Robots)
Boston Dynamics transitions Atlas to a fully electric model, with Hyundai leading initial factory tests — showcasing stronger, more agile humanoids for commercial use.
Your Robot Roommate Is Here: Mentee Learns As It Works
Mentee Robotics debuts MenteeBot, trained via sim-to-real learning, capable of household tasks and industrial chores. Large language models drive its decision-making and command execution.
Gene Editing & AI
AI Revolutionizes Gene Editing
Profluent launches OpenCRISPR-1, using AI insights akin to ChatGPT to design precise gene-editing tools. This promises faster, more effective breakthroughs in medicine and biotech.
May
Microsoft
Code, Creativity, and Connections with Microsoft Build 2024
Seattle hosts Build 2024, focusing on AI innovations for Windows 11, new Surface hardware, and deep-dive developer sessions, with Satya Nadella headlining the keynote.
Microsoft Announces New AI Chip–Based PCs
Microsoft partners with Qualcomm to build Arm laptops running offline AI tasks, rivaling Apple’s M-series MacBooks. Launching at $999, these devices promise AI-driven features without web connectivity.
Nvidia
Nvidia Unlocks: What’s in Your Files? ChatRTX Knows
This demo app creates a custom, local chatbot using retrieval-augmented generation — perfect for searching your personal files securely on an RTX PC.
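For readers unfamiliar with retrieval-augmented generation, the following is a toy sketch of the two steps involved: retrieve the most relevant local text, then prepend it to the question before it reaches a language model. It is not how ChatRTX works internally. The `DOCUMENTS` contents, the function names, and the simple word-overlap scoring are assumptions chosen to keep the example self-contained; real systems typically use embedding-based search.

```python
import re
from collections import Counter

# Hypothetical stand-ins for local files; a real tool would read documents from disk.
DOCUMENTS = {
    "trip_notes.txt": "Flight to Lisbon departs on Friday at 9am from gate 12.",
    "recipes.txt": "Pancake batter needs flour, eggs, milk, and a pinch of salt.",
    "budget.txt": "Monthly rent is due on the first; utilities are due on the tenth.",
}


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: dict[str, str], top_k: int = 1) -> list[str]:
    """Rank documents by how many query words they share, best first."""
    query_tokens = tokenize(query)
    scores = {
        name: sum((tokenize(body) & query_tokens).values())
        for name, body in documents.items()
    }
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, documents: dict[str, str]) -> str:
    """Prepend the retrieved text to the question, ready for a language model."""
    context = "\n".join(documents[name] for name in retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"


print(build_prompt("When does the flight to Lisbon depart?", DOCUMENTS))
```

Because retrieval and prompt assembly both happen on the local machine, the files themselves never need to leave it, which is the privacy angle the demo app emphasizes.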
Nvidia Announces New AI Chip Every Year
Fueled by $14B in quarterly profit, Nvidia moves to an annual release cycle for CPUs/GPUs. Meta and Tesla plan to incorporate these new AI chips at scale.
Healthcare & Accessibility
Med-Gemini and Digital Twin: Transforming Medical Care
Google Research and DeepMind unveil specialized AI models tailored to medical settings, promising breakthroughs in diagnosis and patient-specific digital replicas.
SignLLM: AI for Deaf People
A multilingual sign language model supporting eight sign languages and trained with reinforcement learning, SignLLM bridges communication gaps in education, entertainment, and more.
Canva
Canva Create 2024
A redesigned editor, enterprise upgrades, and more AI via Magic Studio headline Canva’s new product announcements — positioning the platform as a holistic content creation environment.
Google’s AI for Low Vision
On Global Accessibility Awareness Day, Google updates Android’s Lookout to detect seven categories of objects, plus text-free Look to Speak and Project Gameface for gesture-based controls.
June
Nvidia & Computex 2024
AI Insights @COMPUTEX 2024: Accelerate Everything
Nvidia’s Jensen Huang unveils cost-effective, eco-friendly semiconductor solutions, plus platforms like Rubin and Spectrum-X. Key highlights include the forthcoming Blackwell and Rubin architectures, high-performance GPU/CPU combos, and a $100 trillion IT market projection.
Apple & WWDC 2024
Siri Gets a Brain Boost & Your iPhone Can Text From Mars: WWDC 2024 Highlights
iOS 18 introduces a redesigned home screen, advanced AI privacy, and deeper ChatGPT integration. macOS Sequoia upgrades performance, while Apple Pay and Apple TV+ gain new features like tap-to-pay and InSight.
Apple’s AI Integration Strategy: A Game-Changer
Apple unveils “Apple Intelligence” — an AI layer that permeates Siri, Photos, Mail, Music, News, and more while safeguarding user privacy. Hardware-software synergy sets Apple apart from other tech giants.
Amazon
Amazon’s AI ‘Private Investigator’ Enhances Customer Experience and Sustainability
Project P.I. uses generative AI and computer vision to detect product defects in fulfillment centers, reducing returns, waste, and carbon footprint while boosting customer satisfaction.
Meta
Meta Upgrades WhatsApp for Business with AI Tools
At a São Paulo event, Meta showcases AI automation for customer chats, a “Meta Verified” badge, and direct calling for large businesses, streamlining commerce on WhatsApp.
Meta FAIR Unveils Four New AI Models and Research Artifacts
From Chameleon (mixed-modal LLMs) and JASCO (text-to-music) to AudioSeal (speech watermarking), Meta’s open-sourced models aim to spur innovation while promoting responsible AI practices.
China & Healthcare AI
China Opens First AI Hospital with Robot Doctors
“Agent Hospital” treats 3,000 patients daily using advanced AI for quicker diagnoses and better training. With a 93%+ accuracy rate on med exams, it signals a major leap in cost-effective healthcare access.
July
ElevenLabs
Iconic Hollywood Voices Brought to Life in ElevenLabs’ Reader App
Through partnerships with estates of film legends like Judy Garland and James Dean, ElevenLabs’ Reader app offers AI-narrated voices for PDFs, ePubs, and articles — merging nostalgia with cutting-edge audio tech.
OpenAI & Los Alamos
OpenAI & Los Alamos Team Up for Safe AI in Science
By using GPT-4o for physical lab tasks, the partnership aims to boost breakthroughs in national security, renewable energy, and medicine, while shaping best practices for safe AI research protocols.
Google DeepMind
AI-Powered Weather Forecasting Wins Prestigious Award
GraphCast wins the 2024 MacRobert Award, delivering forecasts in 45 seconds. It marks a leap forward in early warnings for severe weather events, potentially saving lives.
AI-Powered Robots Navigate Complex Environments with Ease
Gemini 1.5 Pro’s Mobility VLA processes video tours to build topological maps. With 1M-token context and multimodal instructions, it boasts a 90% success rate in complex real-world navigation tasks.
Record-Breaking AI
AI Crushes Rubik’s Cube Record in 0.3 Seconds
Mitsubishi Electric’s TOKUFASTbot uses advanced FA equipment to solve Rubik’s Cube puzzles in record time, highlighting the capabilities of high-speed precision robotics.
OpenAI
GPT-4o Mini: Revolutionizing Cost-Efficient AI
Priced at just $0.15/million input tokens, GPT-4o mini runs text and vision inputs with a 128K context window, promising advanced reasoning and coding at a fraction of the usual cost.
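To put that price in perspective, here is a small back-of-the-envelope calculation. It uses only the input price quoted above; output-token pricing is not covered here, the function name is illustrative, and "128K" is read as 128,000 tokens for simplicity.

```python
# Input price quoted in the text above; everything else here is illustrative.
INPUT_PRICE_PER_MILLION_USD = 0.15
CONTEXT_WINDOW_TOKENS = 128_000  # "128K" read as 128,000 for simplicity


def input_cost_usd(num_tokens: int) -> float:
    """Return the input-side cost in USD for a given number of tokens."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return num_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION_USD


full_window_cost = input_cost_usd(CONTEXT_WINDOW_TOKENS)
print(f"Filling the context window once: ${full_window_cost:.4f}")
```

Under these assumptions, sending a completely full context window costs roughly two cents on the input side.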
Kling AI
Kling AI: Global Video Generation Breakthrough
Kling lifts phone number restrictions and expands to worldwide users with AI video up to five seconds long. With a 2,000 character prompt limit, it challenges OpenAI’s Sora in the race for text-to-video innovation.
August
Meta
Meta’s AI Studio Empowers Creators with Personalized Chatbots
Now widely available in the U.S., AI Studio lets creators build custom chatbots for tasks like caption creation, fan Q&A, and meme generation. It supports cross-platform use on Instagram, Messenger, WhatsApp, and the web.
NVIDIA
NVIDIA’s Edify AI Transforms 3D World-Building with Real-Time AI Magic
At SIGGRAPH’s Real-Time Live, NVIDIA showed how Edify AI can generate complex 3D objects and environments in minutes. Created assets sync seamlessly with Omniverse USD Composer for further refinement.
NVIDIA Revolutionizes Digital Humans and Avatar Technology at Gamescom 2024
NVIDIA announces new tools for lifelike avatars, improved animations, and enhanced realism in gaming and VR. AI-driven technology promises smoother movement and unprecedented customization.
Mistral-NeMo-Minitron 8B: Small AI, Big Impact
NVIDIA’s compact 8B-parameter model balances high accuracy with low computational cost. It runs efficiently on workstations for real-time chatbots and other AI apps — keeping data local and secure.
Black Forest Labs
FLUX.1 by Black Forest Labs: Revolutionizing Text-to-Image Synthesis
FLUX.1, available in Pro and Schnell versions, delivers high-quality generative art rivaling DALL-E. It pushes the boundaries of multimodal AI, with faster rendering and precise prompt adherence.
Gemini AI on Earbuds
Google’s Pixel Buds Pro integrate Gemini for advanced speech-to-speech, music playback, and personalized recommendations. Gemini’s superior contextual understanding aims to outshine Alexa and Siri in wearables.
AI-Driven Traffic Optimization: Google’s Green Light Initiative
Deployed at over 70 intersections across 13 cities, Google’s system adjusts signal timing to reduce congestion by up to 30%. Boston’s implementation saw a 50% drop in stop-and-go traffic.
Imagen 3: The Next Level in AI Image Generation
Google opens access to Imagen 3, a text-to-image model emphasizing safety and representation. Available via ImageFX, it delivers improved text rendering and fewer visual artifacts than previous versions.
Figure
Figure 02: The Pinnacle of AI Hardware Innovation
A second-gen humanoid robot with a 50% larger battery, six onboard RGB cameras, and 4th-gen human-scale hands. Integrates custom AI models for speech-to-speech and advanced visuomotor reasoning.
ByteDance
Jimeng AI: A New Player in Generative AI
ByteDance’s latest app generates images and videos from text prompts, offering up to 80 images and 26 videos free. Currently China-only, Jimeng AI hints at ByteDance’s global ambitions in creative AI.
Sakana.ai
AI Scientist: The Future of Automated Research
In partnership with Oxford’s Foerster Lab, Sakana.ai unveils “The AI Scientist,” autonomously conducting research from idea generation to peer-reviewed paper creation — each manuscript costs about $15.
Luma Labs
Dream Machine 1.5: AI Video Revolution
Offering higher-quality 5-second video clips than previous versions, Dream Machine 1.5 refines natural language prompts and transitions. It edges closer to photorealistic AI-generated video content.
1X Robotics
NEO Beta: The AI-Powered Humanoid Revolutionizing Home Robotics
1X’s bipedal robot sports muscle-inspired actuators for safer human-robot interaction. With advanced AI, it navigates spaces and carries loads — poised to become a versatile home assistant.
September
Apple
Apple’s AI-Powered Public Beta Opens New Possibilities for Entrepreneurs
iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1 betas include AI text rewriting, automatic object removal from photos, and an enhanced Siri interface. Available on newer iPhones and M1-powered devices for real-time, privacy-focused AI.
Weave Robotics
Isaac: Weave Robotics Unveils AI-Powered Personal Robot for Household Tasks
Priced at $59,000 (or $1,385/month), Isaac autonomously handles chores like folding laundry and organizing. Slated for fall 2025 shipment, it signals stiff competition in the growing personal robotics market.
Meta
Meta Connect 2024: Innovations in AR, VR, and AI
Meta reveals Orion AR Glasses with neural-based controls, a budget Quest 3S VR headset, updated Ray-Ban Meta Smart Glasses with live video, and Llama 3.2 with improved vision-text support. A major leap in immersive, AI-enhanced experiences.
October
OpenAI
OpenAI’s DevDay 2024: A Showcase of Innovation for Developers
Key announcements include real-time speech-to-speech APIs, GPT-4o vision fine-tuning, and model distillation tools — aiming to reduce cost and expand AI’s accessibility across industries.
OpenAI’s New sCM Model Generates Images 50x Faster
A simplified “consistency model” approach powers ultra-quick image generation in just two inference steps, producing high-quality images in 0.11 seconds on an A100 GPU — ushering in a more efficient era of AI art.
Meta
Meta Unveils Movie Gen
A media foundation model featuring 30B parameter video generation and 13B parameter audio creation. Text-to-video and text-to-audio alignment enable advanced editing and personalized multimedia content across entertainment and marketing.
Microsoft
Microsoft Unveils Copilot Labs and Copilot Vision
Copilot Labs grants early access to advanced problem-solving features like “Think Deeper.” Copilot Vision, integrated into Edge, offers real-time assistance for research, customer service, and website navigation — boosting productivity with a privacy-first design.
Nobels in AI
AlphaFold Wins Nobel Prize for Protein Structure Prediction
Demis Hassabis and John Jumper share half the Nobel in Chemistry for AlphaFold, which solves protein folding at near-atomic accuracy. The breakthrough accelerates drug development and biotech innovation.
AI Pioneers John Hopfield and Geoffrey Hinton Win 2024 Nobel Prize in Physics
Hopfield and Hinton are celebrated for associative memory systems and the Boltzmann machine, respectively. Their physics-inspired contributions form the backbone of deep learning, enabling progress in speech, vision, and materials science.
Nvidia
ComfyGen: Nvidia’s AI Automates Text-to-Image Creation
A synergy of large language models and curated workflows auto-selects diffusion models, upscalers, and prompts — yielding high-quality results with minimal user input. The system outperforms monolithic models in user satisfaction tests.
Nvidia’s Nemotron 70B: Revolutionizing Large Language Models
Leveraging RLHF and advanced tuning, Nemotron 70B tops alignment benchmarks at 94.1% user preference. It ships enterprise-ready, fueling the competition among OpenAI, Google, and Nvidia in delivering more human-like AI.
Claude 3.5 (Anthropic)
Claude 3.5 Takes AI Coding & Computer Use to New Heights
Anthropic’s Sonnet variant excels at agentic coding tasks, while the new Haiku model delivers top-tier performance at faster speeds. An experimental “computer use” feature allows AI to navigate GUIs and automate workflows.
November
OpenAI
ChatGPT: Combining Voice and Vision for Smarter Interactions
GPT-4o’s newest update refines speech capabilities for more natural, emotive conversation, with a teased “live camera” feature for real-time object recognition. Users can now collaborate on a virtual canvas and upload files for deeper AI-driven insights.
D-ID
D-ID Unveils Real-Time, High-Quality Avatars for Dynamic Conversations
New Express and Premium+ avatars replicate natural head and torso movements, boosting video marketing and customer support. With AI influencers, multilingual translation, and Canva integration, D-ID’s platform promises up to 35% higher conversion rates.
Perplexity
Perplexity Launches AI-Powered Election Information Hub
An AI-driven voter guide providing real-time candidate updates, polling locations, and ballot tracking. While aiming for accuracy via non-partisan data, early errors reveal the challenges of reliable generative AI in civic tools.
Nvidia: Audio & Avatars
Nvidia Unveils Fugatto: AI Model Transforms Audio Creation
Capable of generating novel sounds, remixing voices, and altering musical tracks via text or audio prompts, Fugatto offers unprecedented audio manipulation. However, Nvidia weighs public release carefully, eyeing responsible deployment.
December
Luma AI
Luma AI Launches Photon Models: High-Quality Images for Less
Photon and Photon Flash deliver 1080p images at $0.015 and $0.002 per image, respectively. Their speed, creativity, and cost-effectiveness have gained immediate acclaim among fashion designers, filmmakers, and content creators.
DeepMind
Genie 2: The Next Frontier in AI-Generated 3D Worlds
This foundation world model transforms single image prompts into extensive interactive environments, complete with complex object interactions, NPCs, and emergent physics. Genie 2 opens new doors for gaming, AI training, and creative prototyping.
Amazon & Anthropic
Amazon and Anthropic Partner to Build AI Supercomputer
Project Rainier harnesses AWS Trainium 2 chips for a massive, more cost-effective AI infrastructure. With a fivefold increase in cluster size, Anthropic can accelerate Claude’s training at lower costs — challenging GPU-based systems.
Google Veo 2 & Imagen 3: The Dynamic Duo Redefining AI Media Creation
Veo 2 brings cinematic-quality video generation with text-based lens and style controls, while Imagen 3 refines image composition with advanced detail. The Whisk platform experiments with playful image remixing and AI captioning.
Apptronik & Google DeepMind
AI and Robotics Unite: Apptronik Joins Forces with Google DeepMind
The collaboration merges Apptronik’s humanoid robotics with DeepMind’s advanced AI, boosting versatile, real-world robot capabilities. This partnership could reshape logistics, manufacturing, and even home robotics.
In Summary
2024 laid the groundwork with massive leaps in LLMs, robotics, healthcare AI, and regulation. In 2025, we’ll see:
- AI woven into core operating systems and consumer devices,
- Mass-scale humanoid robotics tackling real-world tasks,
- Accelerated medical breakthroughs via gene editing and AI-diagnosis,
- Regulations shaping how and where advanced AI can be deployed, and
- Ever-more-powerful generative models fueling new apps and creative possibilities.
From personalized AI in your pocket to robotic coworkers on the factory floor, 2025 looks poised to be the year AI becomes truly ambient, everywhere, and transformative — with both immense opportunities and critical societal questions to address.