OpenAI just opened the door to AGI
Dipping into the digital future: Google has made a huge series of announcements, and OpenAI is launching multiple new models.

Hi Futurist,
I hope you had a lovely Easter with your family. With the days off, maybe you even found a moment to catch your breath. In the previous Dip, I wrote that I found it nearly impossible to keep up with all the news and interpret what it means for you, your work, your organization, and society. But the past two weeks… oh boy. I truly believe we’ve now arrived at the start of a steep upward curve. ‘The last supper’ is behind us. Now it really begins. So I’m sending you insights, inspiration, and innovation straight to your inbox. Let’s dive into the depths of the digital future together and discover the waves of change shaping our industry.
💡 In this post, we're dipping in:
📣 Byte-Sized Breakthroughs: OpenAI didn’t just release new models, they may have cracked the door open to AGI. Meanwhile, Google is building the backbone of agent collaboration with A2A, and Claude just got smarter at reading emails and doing deep web research. Canva dropped a new suite merging AI and creativity, and Copilot can now click and scroll like a human.
🤖 Digital Toolbox: From prompt-to-product in minutes to workflows you can build by just typing, this week’s tools are all about less friction, more flow. WillBot keeps your ads sharp and your team sharper. Firebase Studio rethinks app building with sketches and sentences. And Lleverage? It turns plain text into working automations. No dev team required.
🧐 In Case You Missed It: From talking browsers and video-editing sidekicks to agents that practically run your day for you. OpenAI dropped breadcrumbs all over the place, Google rolled out a full-on agent army, and Amazon's new voice is suspiciously smooth. If you blinked, you missed fifty tools. But lucky for you, we didn’t.
Do you have tips, feedback or ideas? Or just want to give your opinion? Feel free to share it at the bottom of this dip. That's what I'm looking for.

No time to read? Listen to this episode of Digital Dips on Spotify and stay updated while you’re on the move. The link to the podcast is only available to subscribers. If you haven’t subscribed already, I recommend you do so.

Quick highlights of the latest technological developments.
Headstory: OpenAI just opened the door to AGI
An internal memo from the CEO of Shopify reveals how large organizations are viewing AI today. Before hiring someone new, you should first consider whether an AI agent can do the job. That’s not an exaggeration; it’s literally what the leaked memo says. AI is no longer just a tool. It’s becoming a full-fledged teammate. Many organizations are in absolute awe of how AI augments human ability, fills talent gaps, and radically boosts productivity. And with OpenAI’s newest releases, o3 and o4-mini, I’m convinced we’re at the beginning of a sharp upward acceleration. In this article, I’ll explain why that matters and why it should matter to you.
Last week, OpenAI released o3 and o4-mini, two cutting-edge models in their o-series designed to think before they speak. These represent the smartest models released to date and a step change in ChatGPT's capabilities for everyone from curious users to advanced researchers. For the first time, these models can independently access and combine every tool ChatGPT offers: searching the web, analyzing uploaded files with Python, reasoning about visual inputs, and generating images.
These models are trained to reason about when and how to use tools, typically delivering detailed and thoughtful answers in minutes. They tackle multi-faceted questions more effectively, representing a step toward a more agentic ChatGPT that can independently execute tasks. The combined power of state-of-the-art reasoning with full tool access translates into significantly stronger performance across academic benchmarks and real-world tasks.
The release of o4-mini suggests something bigger is brewing internally. Full o4 likely already exists behind the scenes and is being used to train even more advanced systems. We expect GPT-5 at the end of the summer.
But this is more than a version update. I believe this is the beginning of Level 4 on OpenAI’s five-tier AI maturity scale. A stage they call Innovator AI. These systems function similarly to inventors who blend knowledge from multiple fields to arrive at novel solutions. They don't just follow instructions. They innovate. They create. They discover.
Scientists using these models report potential new discoveries. The models combine knowledge from multiple disciplines. They make creative, cross-domain decisions. They function more like inventors than assistants. They possess knowledge and intelligence presumably surpassing 99% of humans and experts.
This is why I believe that with o3 and o4 we are entering the AGI window (Artificial General Intelligence). Why? Because they reason with tools the way we do. We grab calculators, search engines, and spreadsheets to turn input into output; ChatGPT now does the same through independent thinking. It evaluates, decides, selects, and applies tools autonomously. As these models grow smarter, and they will, their ability to determine which tools they need improves correspondingly.
Tool use autonomy is the bridge between artificial narrow intelligence and AGI. When systems determine not just answers but approaches, selecting tools based on contextual understanding, they cross into territory previously exclusive to human intelligence. The ability to reason about process, not just content, marks a turning point in AI evolution. Combined with expanding context windows and improved reinforcement learning, we're witnessing the emergence of artificial general intelligence.
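For the developers among us, here is what that tool-use autonomy looks like in practice. Below is a minimal sketch using OpenAI’s function-calling API with the Python SDK. The get_exchange_rate tool is hypothetical; the point is that you only describe the tool, and the model decides for itself whether the question requires it.

```python
# A minimal sketch of tool-use autonomy with the OpenAI Python SDK.
# The tool itself ("get_exchange_rate") is hypothetical; what matters is
# that the model, not the developer, decides whether and how to call it.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_exchange_rate",  # hypothetical tool for illustration
        "description": "Look up the current exchange rate between two currencies.",
        "parameters": {
            "type": "object",
            "properties": {
                "base": {"type": "string", "description": "ISO currency code, e.g. EUR"},
                "quote": {"type": "string", "description": "ISO currency code, e.g. USD"},
            },
            "required": ["base", "quote"],
        },
    },
}]

response = client.chat.completions.create(
    model="o4-mini",
    messages=[{"role": "user", "content": "What would 250 euros be in dollars today?"}],
    tools=tools,
)

# The model reasons about the task first: a question about *today's* rate
# can't be answered from memory, so it should choose to call the tool.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```

Notice that nothing in the prompt tells the model to use the tool. It reasons about the process, not just the content, which is exactly the shift described above.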
If that sounds far off, consider this: many AI labs are already preparing for a post-AGI world. Google DeepMind, for example, is hiring researchers to explore what comes after AGI. At the same time, they’ve built reinforcement learning (RL) systems that can discover their own RL algorithms.
And it goes deeper. DeepMind’s “Streams” system doesn’t even learn from human datasets. It learns from the world itself. No labels. No predefined categories. Just raw observation, experimentation, and abstraction. This isn’t only about mimicking human knowledge. It’s mostly about generating new knowledge. Beyond human reach. And in that world, we’ll need to rethink governance, ethics, and even the role of humanity itself.
What's clear is that we’re navigating uncharted territory. These models represent something qualitatively different, whether we call it AGI or not. AI expert Ethan Mollick calls this ‘Jagged AGI’: superhuman in enough areas to transform how we work and live, but unreliable enough that human expertise remains essential. One day, an AI solves advanced mathematical proofs. The next, it stumbles over a basic riddle.
But within that jaggedness lies opportunity. Those who understand where AI shines, and where it fails, will gain a massive edge. The models themselves improve rapidly, but organizational adaptation moves slower. The gap between the two is your window of leverage. This is why it should matter to you. Companies that integrate these technologies strategically will outpace competitors still debating whether to experiment.
If you lead an organization, here’s what you should do now:
Create cross-functional AI steering committees with representation beyond IT.
Develop tiered implementation roadmaps prioritizing high-value, low-risk applications first.
Establish AI ethics frameworks and governance structures before deployment scales.
Invest in continuous learning programs keeping leadership current on AI capabilities.
Build strategic partnerships with AI providers offering industry-specific solutions.
For me, it’s clear what’s next. In the coming months, AI models will gain broader knowledge, near-infinite context windows, access to more tools, and agent-level capabilities. Combine that with what I’ve written in earlier editions: agents are already doubling in power every 3–4 months. And soon, they’ll operate in teams, collaborating across tasks. All of this is driven by reinforcement learning methods that even we don’t fully understand. AGI is not a distant future anymore. For me, for the first time, there’s no doubt: it’s here. As I wrote in my 2025 predictions, ‘one AI lab will announce they reached AGI’. I don’t think that is a prediction anymore; I think it’s reality.
The winners of this new era won’t be those who chase every shiny update. It will be those who understand how and where to deploy AI thoughtfully. That means building governance before chaos, upskilling before replacement, vision before fear. The future arrives unevenly. But it arrives for everyone. And you have the possibility to shape it yourself. It’s time to build.
Google brings agent cooperation to the enterprise
TL;DR
Google introduces Agent2Agent (A2A), an open protocol designed to let AI agents communicate, share tasks and coordinate actions across systems, even when built by different vendors. With backing from over 50 partners like Salesforce, ServiceNow and SAP, this move could form the backbone of AI collaboration in the enterprise world.
Sentiment
Online reactions show that many see A2A as a necessary step forward. The idea of a shared protocol for agents is gaining traction, especially with heavyweight support from across the tech industry. Some remain cautious about Google leading the charge, but with this broad alliance of companies onboard, it’s clear the sector is serious about building a standard for AI collaboration.
My thoughts
Agent-to-Agent communication is the future. I’ve said it before: parallel agents are the new workforce. But how do you get them to actually work together? Imagine agents not just automating, but negotiating. Specializing. Coordinating. Acting like teams. The A2A protocol is the common language they’ll speak. It complements Anthropic’s MCP. Where MCP gives context, A2A handles collaboration. Together they form the base layer of the modular, decentralized AI stack. No single tool has all the data or business knowledge. Most tasks need input from multiple systems. That’s why this protocol matters. A2A gives agents a shared way to talk, understand each other, and just get the job done. Salesforce knows CRM. Workday knows HR. Box knows docs. And soon, they’ll all talk fluently to each other.
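For the technically curious: under the hood, A2A is essentially a JSON exchange between a client agent and a remote agent. The sketch below is loosely modeled on the published draft spec (JSON-RPC over HTTP); the agent endpoint and the task content are made up for illustration, so consult the official spec before building against it.

```python
# Illustrative sketch of an A2A-style task request, loosely based on the
# draft Agent2Agent spec (JSON-RPC over HTTP). The agent URL and task are
# hypothetical; check the official spec for the authoritative schema.
import json
import uuid

import requests

task_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/send",  # method name as in the early A2A draft
    "params": {
        "id": str(uuid.uuid4()),  # unique task id, so agents can track long-running work
        "message": {
            "role": "user",
            "parts": [{"type": "text", "text": "Find three candidates for the sales role."}],
        },
    },
}

# A CRM agent (say, one exposed by Salesforce) would receive this, work on
# the task, and report status and artifacts back in the same JSON-RPC envelope.
response = requests.post(
    "https://crm.example.com/a2a",  # hypothetical agent endpoint
    json=task_request,
    timeout=30,
)
print(json.dumps(response.json(), indent=2))
```

The design choice worth noticing: because every agent speaks the same envelope, a Workday agent can hand work to a Box agent without either vendor knowing about the other in advance.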
OpenAI’s GPT-4.1 arrives quietly, but with big ambitions
TL;DR
OpenAI launches three new API-only models: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. These models improve across the board—better at coding, sharper at following instructions, and able to handle up to 1 million tokens of context. They’re also cheaper, faster, and more efficient than GPT-4o, but come with a catch: they’re not available in ChatGPT.
Sentiment
The excitement online is real. Developers are praising the speed, the improved coding skills, the massive context window, and the lower cost. But the buzz isn’t without debate. For the first time, OpenAI releases a model that lags behind Google’s Gemini 2.5 Pro in terms of raw knowledge. Even more curious: GPT-4.1 is also pricier than Gemini Flash 2.0, while only offering slightly better performance than 4o. So, while impressive, it’s not the clear king of the hill.
My thoughts
This release isn’t for you. Well, not directly. GPT-4.1 runs in the background. It’s not for chatting, but for powering tools. Think about your calendar app that now knows how to draft invites. Your photo editor that generates fitting captions. Or your travel app that parses your emails and builds your itinerary. These models are the invisible engines behind better software. Without you even realizing it. And now, they’re within reach for almost every company. That’s what makes GPT‑4.1 such a big deal. You no longer need deep AI expertise or huge budgets to build smart features. The mini and nano models are fast, cheap, and smart enough to power 90% of common tasks, from summarizing customer feedback to tagging products or analyzing documents. AI doesn’t have to be flashy. It just needs to work. Quietly. Reliably. And that’s exactly what 4.1 offers. With GPT-4.1 being smarter, faster, and cheaper than 4o, developers can now roll out better features, faster.
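To make the ‘invisible engine’ idea concrete, here’s a minimal sketch of the kind of background call an app might make, using OpenAI’s Python SDK with GPT-4.1 mini to summarize customer feedback. The feedback strings are made up for illustration.

```python
# Sketch of GPT-4.1 mini as an "invisible engine": an app summarizing
# customer feedback in the background. The feedback strings are made up.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

feedback = [
    "Checkout took forever on mobile, almost gave up.",
    "Love the new dashboard, but exports are still broken.",
    "Support answered within minutes. Impressed!",
]

response = client.chat.completions.create(
    model="gpt-4.1-mini",  # cheap and fast enough to run on every submission
    messages=[
        {"role": "system", "content": "Summarize the customer feedback in three bullet points, flagging any bugs."},
        {"role": "user", "content": "\n".join(feedback)},
    ],
)

print(response.choices[0].message.content)
```

A user of this hypothetical app would never see the model; they’d just see feedback neatly summarized. That’s the quiet reliability the paragraph above describes.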
Of course, OpenAI isn’t the only one in the game. Google launched a similar model (read more about that model below). But while it might compete on price, you need to turn off its reasoning to match OpenAI’s efficiency. OpenAI keeps moving through the side door. Most people won’t even notice GPT‑4.1 entering their lives. But it will. Through your apps. Through your tools. Enhancing how you search, write, plan, and communicate. And you benefit from that, even if you never open ChatGPT. That’s the paradox: AI becomes invisible to you, and in doing so, becomes everywhere.
More byte-sized breakthroughs:
Canva launches Visual Suite 2.0, where creativity meets productivity
Canva has unveiled Visual Suite 2.0, a comprehensive update that integrates design and productivity tools into a single platform. Key features include Canva Sheets, a visual spreadsheet tool with AI-powered insights, and Canva Code, which allows users to create interactive content without coding. The suite also introduces enhanced photo editing capabilities and a centralized AI chatbot to streamline creative workflows.
Copilot Studio can click, type and scroll like humans
Microsoft adds Computer Use to Copilot Studio, letting AI agents interact with websites and desktop apps just like humans do. They can now click buttons, select menus, and type into fields, even when no API is available.
From data entry to invoice processing, it automates the repetitive stuff with smart reasoning. Just describe what you want, and the agent takes it from there.
Claude now reads your email and researches like a pro
Anthropic has enhanced Claude with two major updates: a new Research feature and integration with Google Workspace. The Research tool enables Claude to perform iterative, multi-step web queries, providing comprehensive answers with citations. Meanwhile, the Google Workspace integration allows Claude to access Gmail, Calendar, and Docs, streamlining tasks like summarizing meeting notes or emails.

A must-see webinar, podcast, or article that’s too good to miss.
WillBot - Your AI performance marketing agent
WillBot is your always-on AI performance marketing agent that integrates seamlessly with your ad platforms. It analyzes creative elements and delivers actionable insights through natural language. With 24/7 monitoring and instant reporting, WillBot empowers your marketing team to make smarter decisions faster.
Firebase Studio - Build full-stack AI apps with just a prompt
Firebase Studio is your browser-based AI development environment. Prototype apps using natural language, images, or sketches, and let Gemini assist you from idea to deployment. With built-in previews, emulators, and one-click hosting, you can go from concept to production in minutes.
Lleverage - Automate in seconds. Just type it.
Lleverage lets you build powerful AI workflows just by typing what you want. With its visual builder, users can design, test, and deploy custom automations without coding. Just describe your task and watch it turn into a working automation. From document handling to full business ops.

A roundup of updates that are too cheesy to ignore.
Amazon launches its AI voice model Nova Sonic, which understands not just words but tone and inflection for lifelike conversations.
Amazon Nova Reel 1.1 lets you generate multi-shot videos up to 2 minutes in length, ensuring style consistency across each shot.
Cloudflare unveils its new Agents platform, guiding you to build AI Agents and MCP servers using their SDK and Workers.
Microsoft unveils Copilot Vision in Microsoft Edge; it can see what you see on screen to help you browse and organize effortlessly.
Visual Studio Code introduces Agent Mode, for multi-step coding tasks, from code analysis to terminal command execution.
Krea Stage lets you create entire 3D environments with AI from images or text, delivering consistent snapshots every time.
Cloudflare's AutoRAG enters open beta, offering a fully managed RAG pipeline without needing to write a single line of Python code.
WordPress introduces an AI website builder that creates your site based on simple prompts.
Make launches AI Agents, transforming your business workflows with real-time decision-making capabilities.
Gamma's new update revolutionizes creation, letting your ideas shine with instant docs, sites, socials, and presentations.
Leonardo AI introduces Motion 2.0, upgrading video creation with AI for seamless text-to-video and enhanced playback.
Viggle's new Mic 2.0 lets you transform images into videos with seamless voice and motion control via audio or text prompts.
Runway Gen-4 Turbo now generates 10-second clips in just 30 seconds. Perfect for fast-paced creativity, and now available across all plans.
Cassette AI's Video to SFX model & API now offers hyper-realistic sound effects for any video, no prompts needed.
Together AI announces DeepCoder-14B, a fully open-sourced coding reasoning model reaching o1 and o3-mini levels, developed with Agentica.
Deep Cogito is introducing their path to superintelligence with their Cogito v1 Preview, outperforming the strongest open source competitors.
OpenAI's Evals allows you to define tests programmatically, automate evaluations, and iterate on prompts seamlessly within your workflow.
OpenAI has upgraded ChatGPT with enhanced memory capabilities, allowing it to reference all your previous conversations.
OpenAI introduces a new library to centralize all your ChatGPT image creations in one convenient location.
OpenAI is reportedly building its own X-like social network, featuring a prototype with ChatGPT’s image generation and social feed capabilities.
OpenAI sets its sights on acquiring AI coding company Windsurf after failed acquisition talks with Cursor.
Gemini Live is rolling out; it lets you ask anything you see and share your screen or camera for brainstorming and troubleshooting.
Google Workspace introduces AI functions to Google Sheets, streamlining text generation, summarization, and sentiment analysis for your data.
Google rolls out Veo 2 to Gemini Advanced & Google Workspace users, enabling high-quality 8-second videos from text in any style.
Google introduces new AI tools for video, image, speech, and music on Vertex AI, including the private preview of Lyria, their text-to-music model.
Google introduces new Security agents: Alert triage agent for dynamic investigations and Malware analysis agent for code safety checks.
Google unveils its next-gen Customer Engagement Suite with human-like voices and emotional comprehension for improved conversations.
Google introduces specialized agents for your data team, enhancing BigQuery and Looker with data engineering and science capabilities.
Google introduces an expanded Agentspace with unified search capabilities, a new Agent Gallery, and a no-code Agent Designer.
Google introduces the Agent Development Kit (ADK), an open-source framework empowering agents with controlled behavior via MCP.
Google introduces Workspace Flows, enabling automation of repetitive tasks, process improvement, better decision-making, and app integration.
Google’s newest Gemini model delivers serious reasoning power at a fraction of the cost and lets you control how much it ‘thinks’.
Google AI Studio introduces an infinite canvas for rapid prototyping and building apps with Gemini API.
Grok launches Studio, featuring a canvas like Gemini & ChatGPT, code execution and Google Drive integration for collaborative content creation.
Grok remembers your conversations, offering personalized recommendations and advice tailored just for you.
Grok Workspaces helps you organize files and conversations in a single place with its own custom instructions.
Grok is now available in the API, offering 5x cost savings while excelling in real-world tasks across law, finance, and healthcare.
Notion introduces Notion Mail, an intelligent inbox that organizes itself, drafts emails, and arranges meetings according to your preferences.
Notion’s open-source MCP is now on GitHub, ready for seamless AI integration with your favorite client.
Pika Twists lets you manipulate any character or object in your footage seamlessly, keeping everything else perfectly intact.
Perplexity now integrates Box and Dropbox, joining Google Drive, OneDrive, and SharePoint for seamless access and secure Deep Research.
Bolt introduces native Stripe integration, allowing you to build and monetize your entire SaaS business without coding.
HiDream-I1-Dev is now open-source, climbing to #1 on the Artificial Analysis leaderboard, temporarily outpacing GPT-4o.
YouTube offers creators a free AI-powered tool to compose custom instrumental tracks for their videos.
LTX Image upscale is now available in the Storyboard workspace, enabling 4K resolution upgrades with just one click.
LTX Studio unveils Timeline, enabling seamless project management from start to finish with intuitive organization and editing tools.
Mistral Le Chat rolls out beta libraries, similar to ChatGPT and Perplexity Spaces, supporting file uploads in different formats.
HeyGen's new Video Score immediately highlights areas for improvement in your footage, enabling the creation of superior avatars with ease.
HeyGen’s new MCP server is now live, enhancing AI with smarter contextual video generation through external tool and data integration.
HeyGen has introduced a new feature that corrects eye contact, allowing you to maintain seamless connection with your audience.
MiniMax MCP is now live. Streamline your workflow with multimodal tools, featuring Voice Cloning, Text-to-Speech, and Image & Video Generation.
Freepik introduces Composition Reference, a new tool that lets you build any visual from a reference image or a sketch with notes.
Kling AI's massive 2.0 update unleashes tools like Kling 2.0 for video and Kolors 2.0 for images, boosting your creative storytelling power.
Kling AI introduces Multi-Elements, allowing you to swap, add, and delete elements in videos for a whole new narrative experience.
Cohere launches Embed 4, enabling businesses to achieve advanced multimodal search capabilities for secure AI applications.
Firecrawl unveils FIRE-1 Agent, the AI-powered web action agent that breaks through interaction barriers to reveal hidden data.
Flora's Image Blending is now live, enabling seamless fusion of style, structure, and subject with intuitive prompts.
Luma Labs introduces Camera Angle Concepts, enabling new cinematic perspectives like overhead, selfie, and aerial view.
Midjourney updates its image editor with a refreshed UI, layers, a smart selection tool, and enhanced moderation.
ElevenLabs introduces the official MCP server, granting Claude and Cursor full access to its AI audio platform with simple text prompts.

How was your digital dip in this edition? You're still here? Let me know your opinion about this dip!

This was it. Our thirty-fourth digital dip together. It might seem like a lot, but remember: this wasn't even everything that happened in the past few weeks. This was just a fraction.
I help business leaders navigate this rapidly evolving landscape. Through capability assessment, use-case development, and pilot program design, I accelerate your organization's AI adoption without disrupting core operations. My workshops help executive teams understand both theoretical implications and practical applications of reasoning AI, building a shared vision of your AI-enabled future. Together, we translate technological possibility into strategic advantage, identifying where AI creates the greatest value for your specific business context. I guide implementation roadmaps that balance quick wins with long-term transformation, ensuring measurable returns while building organizational capability. Send me a message if you need help shaping your future.
Looking forward to what tomorrow brings! ▽
-Wesley