• Tech Insights 2025 Week 51

    More JUICE! GPT-5.2 launched last week, and it is the first model built in the new datacenters using the latest NVIDIA Blackwell chipsets. GPT-5.2 beats over 70% of human experts in office tasks on the GDPval benchmark, it hallucinates 30% less than GPT-5.1, and it can use its entire 400,000-token context window for extreme precision in programming tasks and needle-in-a-haystack problems. This is a significant upgrade over GPT-5.1.

    So, how did OpenAI achieve this? There are many factors, but the most important one is that it uses more compute per request than previous models. A LOT more! To control how much time and depth an AI model spends thinking before responding, OpenAI uses a parameter called JUICE. Higher JUICE means deeper analysis and better results for complex decisions, but also slower responses and higher cost. You cannot set this parameter yourself; it is pre-set by OpenAI when you switch between the different variants (low, medium, high).

    The previous top models GPT-5.1 High and GPT-5.1-Codex-High had a JUICE value of 256. The latest model GPT-5.2 has a JUICE value of 768, three times higher! This is a crazy jump in the compute used for inference. This is what the new hardware infrastructure is capable of, and it is just the start. Contrary to what you might have heard in the past few weeks, the age of scaling is definitely not over yet.

    In the past few months you have heard me talk a lot about the possibilities of agentic software programming, primarily with recent SOTA models like Claude Opus 4.5 and GPT-5.1-Codex. Last week OpenAI announced that the #1 most downloaded Android app, OpenAI Sora for Android, was created entirely by prompting OpenAI Codex, in less than four weeks. It was developed by a small team of four people, is 99.9% crash free, and has an average rating of 4.8 based on 63k reviews. Consider that, together with the fact that GPT-5.2 is now better than most white-collar experts at most office tasks, and you begin to grasp why Jensen Huang said in an interview with TIME last Thursday: “There’s a belief that the world’s GDP is somehow limited at $100 trillion. AI is going to cause that $100 trillion to become $500 trillion”.

    So how does all this translate to Swedish companies? Not so much. In a study published last week by EY Global People Consulting, 15,000 employees across 29 countries were interviewed, and Sweden is at rock bottom when it comes to AI use in the workplace. Only 15% of the Swedish participants used AI on a daily basis, compared to the global average of 37%. This matches my own experience pretty well. Most Swedish companies have just finished rolling out M365 Copilot to a few select employees (not everyone, to save costs), but in practice very few people actually use it, since many prefer other tools like ChatGPT or Claude. And when it comes to agentic solutions, many companies would like to buy them as packaged products at a fixed price. The problem with that approach is that agentic AI solutions are not created that way; they are developed iteratively, with many interactions across the entire organization.

    The good news is that this makes it very easy for you to move far ahead of everyone else! Bring in external experts in agentic AI, put them in a squad with your own architects and product owners, and just start building things. You will be shocked by how many of your repetitive tasks can be automated by modern AI systems. We are doing this now with many Swedish companies, and the results so far are off the charts. Maybe not $500 trillion off the charts, but we are getting there…

    Thank you for being a Tech Insights subscriber!

    Listen to Tech Insights on Spotify: Tech Insights 2025 Week 51 on Spotify

    THIS WEEK’S NEWS:

    1. OpenAI Releases Enterprise AI Adoption Report: 6x Usage Gap Between Leaders and Median Users
    2. OpenAI Releases GPT-5.2
    3. Agentic AI Foundation launches under Linux Foundation with Block, Anthropic, and OpenAI
    4. Stripe Launches Agentic Commerce Suite
    5. Amazon and Microsoft Commit $52B to India AI Infrastructure
    6. Claude Code Slack Integration Launches in Beta
    7. Mistral Launches Devstral 2 and Mistral Vibe CLI
    8. Cursor Releases Visual Editor for Browser-Based Development
    9. Disney and OpenAI Strike $1 Billion Licensing Deal for Sora Video Generation
    10. Adobe Photoshop, Express and Acrobat Launch in ChatGPT
    11. Runway introduces GWM-1, a real-time general world model family
    12. Trump Signs Executive Order for Federal AI Regulatory Framework

    OpenAI Releases Enterprise AI Adoption Report: 6x Usage Gap Between Leaders and Median Users

    https://openai.com/index/the-state-of-enterprise-ai-2025-report

    The News:

    • OpenAI published its first State of Enterprise AI report on December 8, analyzing data from 1 million business customers and surveying 9,000 workers across 100 enterprises.​
    • Frontier workers at the 95th percentile send 6x more messages than median employees and save over 10 hours per week, while frontier firms generate 2x more messages per seat and 7x more messages to custom GPTs.​
    • ChatGPT enterprise message volume grew 8x since November 2024, and API customers now consume 320x more reasoning tokens than one year ago.​
    • Custom GPT usage increased 19x in 2025, now accounting for 20% of enterprise messages, with BBVA reportedly using over 4,000 custom GPTs.​
    • Three-quarters of surveyed workers report AI improved work speed and quality, with specific gains including 87% of IT workers solving issues faster, 85% of marketers executing campaigns faster, and 73% of engineers delivering code faster.​

    My take: A week earlier Anthropic published similar findings, reporting an 80% reduction in task completion time based on 100,000 user conversations. Frontier desktop workers now save over 10 hours per week with ChatGPT, and API customers consume 320 times more reasoning tokens than just a year ago. Companies are starting to see real financial effects from AI.

    The main takeaway for me from this study is the huge gap in usage between median users and “frontier” users. It clearly shows that it is not enough to just deploy a chat service like ChatGPT in your company; you also need to make sure that people know how to use it – and that they want to use it. This is why it is so important to give users the right tools: the benefits are enormous when people want to use them and learn to use them the right way.

    Read more:

    OpenAI Releases GPT-5.2

    https://openai.com/index/introducing-gpt-5-2

    The News:

    • OpenAI released GPT-5.2, positioning it as its most advanced model series for professional work and long-running agent tasks.​
    • The model ships in three variants: Instant for routine queries, Thinking for complex structured work like coding and document analysis, and Pro for maximum accuracy on difficult problems.​
    • GPT-5.2 scored 52.9% (Thinking) and 54.2% (Pro) on ARC-AGI-2, a benchmark testing abstract reasoning while resisting memorization. It achieved 100% on AIME 2025 math problems and 93.2% on GPQA Diamond science questions.​
    • API pricing is $1.75 per million input tokens and $14 per million output tokens, with 90% discount on cached input at $0.175 per million tokens.​
    • The model supports up to 400,000 token context window and is available immediately to ChatGPT paid users, API developers, Microsoft 365 Copilot, and GitHub Copilot subscribers.
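
    To put the published API rates in perspective, here is a quick cost sketch in Python. The prices come straight from the announcement above; the token counts in the examples are made-up illustrations, not benchmarks.

```python
# Cost sketch for GPT-5.2 API calls at the published rates:
# $1.75/M input tokens, $14/M output tokens, $0.175/M cached input.
# Example token counts below are invented for illustration.

PRICE_INPUT = 1.75 / 1_000_000    # dollars per fresh input token
PRICE_CACHED = 0.175 / 1_000_000  # dollars per cached input token (90% discount)
PRICE_OUTPUT = 14.00 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimate the dollar cost of one API request."""
    fresh = input_tokens - cached_tokens
    return (fresh * PRICE_INPUT
            + cached_tokens * PRICE_CACHED
            + output_tokens * PRICE_OUTPUT)

# A 400k-token context filled to capacity, with 5k tokens of output:
print(f"${request_cost(400_000, 5_000):.2f}")  # 0.70 + 0.07 = $0.77

# The same request with 90% of the input served from cache:
print(f"${request_cost(400_000, 5_000, cached_tokens=360_000):.2f}")  # about $0.20
```

    In other words, even maxing out the full context window costs well under a dollar per request, and prompt caching cuts that by more than half for repeated contexts.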

    My take: GPT-5.2 is the first model trained on the new infrastructure with the latest NVIDIA Blackwell GB200-NVL72 GPUs. And it shows. Two weeks ago OpenAI co-founder and former chief scientist Ilya Sutskever proclaimed that the “age of scaling” is over. OpenAI thinks otherwise, and GPT-5.2 is clear evidence of this.

    GPT-5.2 has three major innovations that will make a significant impact on businesses in 2026:

    1. GDPval. In this benchmark, where AI models are compared against human experts across 44 occupations, GPT-5.2 is the first model that wins more than it loses: it beats the human expert in over 70% of tasks. It is also 11 times faster and 99% cheaper than human workers.
    2. Context window. The internal “memory” of GPT-5.2 can now be used in full. You can upload texts of up to 400,000 tokens and GPT-5.2 will instantly find every single detail, needle-in-a-haystack style.
    3. Hallucinations. 30% fewer hallucinations than GPT-5.1, which already scored well. GPT-5.2 has an average response-level error rate below 6.2%.

    We have finally reached the point where AI models are good enough for most tasks in most companies. They can write all your code, write all your documents, and process all your requirements. Any further model improvements will be like comparing subpixel optimizations on large-screen TVs. If your company has been waiting for models to become good enough to do any task in an agentic setup, this is it. This is what you have been waiting for.

    Read more:

    Agentic AI Foundation launches under Linux Foundation with Block, Anthropic, and OpenAI

    https://www.linuxfoundation.org/press

    The News:

    • Block, Anthropic, and OpenAI created the Agentic AI Foundation under the Linux Foundation, to coordinate open standards and shared infrastructure for agentic AI systems.
    • The organizations contributed existing projects such as Model Context Protocol (MCP), AGENTS.md, and goose, consolidating them into a vendor neutral foundation.
    • MCP defines a protocol for tools, data sources, and models to interoperate, with existing implementations in products like Claude Desktop and early third party SDKs.
    • AGENTS.md provides a structured specification format for describing agent behaviors and capabilities in repositories, similar in style to CONTRIBUTING.md or README files.
    • Goose offers reference implementations and libraries for agent workflows, giving developers concrete examples of multi step tool use and orchestration patterns.
    • Governance sits under the Linux Foundation with a neutral technical steering committee, published charters, and membership open to multiple vendors and independent contributors.
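
    To give a feel for what standardizing on AGENTS.md means in practice: it is a free-form Markdown file at the repository root that coding agents read for instructions. The example below is invented for illustration (the project, commands and rules are hypothetical); the actual format prescribes no fixed sections.

```markdown
# AGENTS.md (hypothetical example project)

## Setup
- Install dependencies with `npm install`.
- Run the test suite with `npm test` before every commit.

## Conventions
- TypeScript only; avoid the `any` type.
- Follow the existing folder-per-feature structure under `src/`.

## Boundaries
- Never modify files under `migrations/`.
- Ask before adding new third-party dependencies.
```

    Because the file is plain Markdown with no required schema, any agent that can read a repository can consume it, which is exactly what makes it a plausible replacement for the current pile of vendor-specific files.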

    My take: Thank you, and finally! This is a huge step towards MCP being accepted as a true open standard, and AGENTS.md as THE agent configuration file usable by all models. No one will be happier than me to no longer have to keep AGENTS.md, CLAUDE.md, GEMINI.md, .cursorrules and copilot-instructions.md in sync with each other in every single software project I am working on. Hopefully, in the near future, all agentic tools will just use AGENTS.md.

    Read more:

    Stripe Launches Agentic Commerce Suite

    https://stripe.com/blog/agentic-commerce-suite

    The News:

    • Stripe launched the Agentic Commerce Suite on December 11, 2025, to connect businesses with AI shopping agents through a single integration.​
    • The suite addresses a six-month integration timeline that businesses face when connecting to each individual AI agent platform.​
    • Businesses upload their product catalog to Stripe once and select which AI agents to sell through in the Stripe Dashboard.​
    • Stripe handles product discovery, checkout, taxes, shipping calculations, and fraud detection while businesses continue using their existing commerce stack and order fulfillment systems.​
    • The suite processes Shared Payment Tokens (SPTs), payment credentials scoped to specific sellers with time and amount limits, and uses Stripe Radar to detect fraud patterns specific to AI agent transactions.​
    • Early adopters include URBN (Anthropologie, Free People, Urban Outfitters), Etsy, Ashley Furniture, Coach, Kate Spade, and ecommerce platforms Wix, WooCommerce, BigCommerce, Squarespace, and commercetools.​
    • “At Etsy, our responsibility is to ensure that our sellers’ work can be discovered wherever buyers choose to shop. Stripe’s Agentic Commerce Suite offers an integration solution that makes this easier than ever, enabling us to surface sellers’ unique items to buyers across platforms”, said Rafe Colburn, chief product and technology officer at Etsy.
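
    The scoping idea behind Shared Payment Tokens can be sketched in a few lines of Python. To be clear, the real SPT format and validation logic are defined by Stripe and not public in this level of detail; every field name and check below is invented purely to illustrate the concept of a credential limited to one seller, a maximum amount, and a time window.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Conceptual illustration of a seller-scoped payment credential.
# Field names and checks are invented; this is NOT Stripe's actual SPT format.

@dataclass
class ScopedToken:
    seller_id: str          # token is only valid for this seller
    max_amount_cents: int   # upper bound the agent may charge
    expires_at: datetime    # hard time limit

def charge_allowed(token: ScopedToken, seller_id: str,
                   amount_cents: int, now: datetime) -> bool:
    """Reject any charge outside the token's seller, amount, or time scope."""
    return (seller_id == token.seller_id
            and amount_cents <= token.max_amount_cents
            and now < token.expires_at)

token = ScopedToken("acct_example_seller", 15_000,
                    datetime(2025, 12, 24, tzinfo=timezone.utc))
now = datetime(2025, 12, 12, tzinfo=timezone.utc)
print(charge_allowed(token, "acct_example_seller", 9_900, now))  # True
print(charge_allowed(token, "acct_other_seller", 9_900, now))    # False
```

    The point of this scoping is that even if an AI agent misbehaves or a token leaks, it cannot be used at a different seller, above the agreed amount, or after the window closes.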

    My take: It’s interesting how Stripe is moving beyond being just a payment provider to also owning product catalogs and product discovery. The Agentic Commerce Suite builds on the Agentic Commerce Protocol (ACP), an open standard that Stripe co-developed with OpenAI and released in September 2025. OpenAI implemented ACP first with Instant Checkout in ChatGPT, and Google launched agentic checkout features in November 2025 within Google Search and AI Mode. If you work with e-commerce and have not yet discussed the possibilities of ACP within your company, you now have a new topic for your next executive management meeting.

    Read more:

    Amazon and Microsoft Commit $52B to India AI Infrastructure

    https://www.bbc.com/news/articles/c3w79pgn8peo

    The News:

    • Amazon announced $35 billion in investments across all its businesses in India through 2030, focusing on AI-driven digitization, export growth, and job creation.​
    • Microsoft pledged $17.5 billion over four years (2026-2029) to expand cloud and AI infrastructure in India, marking its largest Asian investment.​
    • Amazon’s investment aims to digitize 15 million small businesses with AI capabilities and enable $80 billion in cumulative exports by 2030.​
    • Microsoft plans to build the India South Central cloud region in Hyderabad by mid-2026, which will become its largest hyperscale region in the country with three availability zones.​
    • Amazon has already invested $40 billion in India since 2010, so this new commitment brings total planned spending to $75 billion.
    • Microsoft’s investment builds on a previous $3 billion commitment announced earlier in 2025, set for completion by end of 2026.​

    My take: What wasn’t mentioned in the BBC article is which energy sources will power these $52 billion worth of data centers. India gets over 70% of its electricity from fossil fuels, and while data centers represent only 0.5% of India’s total electricity consumption today, that figure is set to increase rapidly in the next few years. Combine that with heavy water usage for cooling in some of the world’s driest areas, and you can see where this could be headed. Still, this is one of the largest investments by US companies in India so far, and it shows just how much is being invested in AI over the next few years.

    “New data centers are directly causing utilities to build new gas plants and delay coal plants’ retirements, and that locks us into using dirty energy for decades at a minimum”, said Eliza Pan, spokeswoman for Amazon Employees for Climate Justice.

    Read more:

    Claude Code Slack Integration Launches in Beta

    https://claude.com/blog/claude-code-and-slack

    The News:

    • Anthropic launched Claude Code integration for Slack last week, allowing developers to delegate coding tasks directly from Slack threads without switching tools.​
    • Developers tag @Claude in Slack channels or threads with coding requests. Claude analyzes recent messages to determine the appropriate repository, automatically initiates a Claude Code session, and posts status updates back to the Slack thread.​
    • The system gathers context from channel and thread messages to inform debugging and implementation decisions. Claude auto-selects repositories based on those previously authenticated in Claude Code on the web.​
    • Upon completion, Claude provides a link to review the full coding session and a direct link to open a pull request.​
    • The feature requires the existing Claude app for Slack plus access to Claude Code on the web. No additional downloads are needed for current Slack app users.​
    • Anthropic reports that internal employees use Claude for 60% of tasks, achieving 50% productivity increases. 27% of Claude-assisted work consists of tasks that might not have been completed otherwise.

    My take: What happens when you ask Claude Code 200 times to improve your code base without any human intervention? You can read the results from that test here, and it’s not pretty. All Claude models like to write lots of code; they typically prioritize writing code over solving the actual underlying problem or creating a robust architecture. This is why you should never allow Claude to change your code without reviewing both the plan AND the generated code in detail before committing. OpenAI Codex does not have this problem; it is much more restrictive about adding new code and takes a much more nuanced approach to programming. I feel confident with Codex; working with Claude is challenging.

    This is why I would not recommend using this Slack integration for anything other than very simple text or CSS changes. Do not connect it to your code base hoping it will be like asking a senior developer to do things; that is not how Claude works. Ask it to change text strings, translate into different languages, or adjust colors and styles, and it excels. Autonomous code improvements, not so much.

    “Ultrathink. You’re a principal engineer. Do not ask me any questions. We need to improve the quality of this codebase. Implement improvements to codebase quality.”

    Read more:

    Mistral Launches Devstral 2 and Mistral Vibe CLI

    https://mistral.ai/news/devstral-2-vibe-cli

    The News:

    • Mistral released Devstral 2, a coding model family in two sizes that targets agentic software development. The 123-billion-parameter flagship Devstral 2 uses a modified MIT license while Devstral Small 2 (24B parameters) uses Apache 2.0.
    • Devstral 2 achieves 72.2% on SWE-Bench Verified, placing it among top open-weight coding models. The model requires minimum 4 H100-class GPUs for deployment while Devstral Small 2 runs on single GPU or CPU configurations.​
    • Both models support 256K context windows and handle multi-file edits, framework dependency tracking, and autonomous error correction. Devstral Small 2 scores 68.0% on SWE-Bench Verified despite being 5x smaller than comparable models.​
    • Mistral Vibe CLI provides a terminal-based coding assistant released under Apache 2.0 license. The tool scans file structures and Git status, executes shell commands, and integrates with IDEs through Agent Communication Protocol.​
    • Devstral 2 API access is currently free, with post-launch pricing set at $0.40/$2.00 per million tokens (input/output). Mistral reports 7x cost efficiency versus Claude Sonnet for real-world tasks.

    My take: Who will use Mistral Devstral 2, and for what purpose? The answer is easy: if you or your organization has decided not to use any cloud service or model from a US-owned or Chinese company (like some Danish city governments), then here is Mistral to the rescue. And it’s free if your company earns less than $20 million per month. To run the full 123B version of Devstral 2 you need four H100-class GPUs. If you have that, then congratulations: you can now vibe code for free with the new Mistral Vibe CLI tool. Just don’t expect GPT-5.2 performance from it.

    Cursor Releases Visual Editor for Browser-Based Development

    https://cursor.com/blog/browser-visual-editor

    The News:

    • Cursor launched a visual editor that combines web apps, codebases, and editing tools in one window. Developers can rearrange interface elements through drag-and-drop while AI agents update the code automatically.
    • The editor works with the DOM tree structure. Users can reorder buttons, rotate sections, and test grid configurations by dragging rendered elements. After arranging the visual design, developers instruct the agent to locate and update the relevant components.
    • React component props appear in the sidebar. Developers can adjust component states and variants without switching between files or windows.
    • The sidebar includes sliders, color pickers, and design system tokens. Users can modify styles for grids, flexbox layouts, and typography with live preview.
    • Point-and-click prompting accepts natural language instructions. Users can click elements and type commands like “make this bigger” or “turn this red”. Multiple agents process requests in parallel.

    My take: If you are a web developer, you can now open your React web page in the Cursor browser, then inspect and adjust UI properties and watch things update in real time. The AI agent will then update your React source code to match. I haven’t seen anything like it before; most other frameworks like Builder.io and Plasmic require you to register components, integrate SDKs, and structure your code beforehand to work with their systems. This solution by Cursor instead works with your regular React source. I wouldn’t go so far as to call it a game-changer for web development, but it is close. It also shows the value of Cursor as an agentic development tool compared to something like Codex or Claude Code. If you thought Cursor was part of an AI bubble, you should probably reconsider.

    Cursor also launched a new “Debug mode”, which automatically adds console logs to track a specific issue, asks you to trigger the issue, and then automatically collects the logs and tries to fix it. This is something that can be done manually, but instead of you having to run the app and copy-paste the logs to the AI, the editor handles everything automatically.

    Read more:

    Disney and OpenAI Strike $1 Billion Licensing Deal for Sora Video Generation

    https://thewaltdisneycompany.com/disney-openai-sora-agreement

    The News:

    • Disney has signed a three-year licensing agreement with OpenAI, becoming the first major content partner for Sora, OpenAI’s video generation platform. The deal includes a $1 billion equity investment in OpenAI and allows users to create short videos featuring more than 200 Disney, Marvel, Pixar, and Star Wars characters through text prompts.​
    • Characters available include Mickey Mouse, Minnie Mouse, Ariel, Belle, Cinderella, Baymax, Simba, Mufasa, plus characters from Encanto, Frozen, Inside Out, Moana, Monsters Inc., Toy Story, Up, and Zootopia. Marvel and Star Wars characters such as Black Panther, Captain America, Deadpool, Iron Man, Darth Vader, Han Solo, and Luke Skywalker are also included in animated or illustrated versions.​
    • The agreement excludes any talent likenesses or voices. Sora generates videos up to one minute long based on user prompts, while ChatGPT Images creates still images using the same character library.​
    • Disney+ will stream curated selections of user-generated Sora videos. Disney CEO Bob Iger stated the company plans to eventually let users create AI videos directly within Disney+ itself, targeting younger users to increase engagement.​
    • Disney will use OpenAI APIs to build products and tools for Disney+ and deploy ChatGPT for employee use. The transaction requires definitive agreements, board approvals, and standard closing conditions.

    My take: On the same day this partnership was announced, Disney sent a cease-and-desist letter to Google, accusing them of generating unauthorized Disney content at “massive scale”. If you have tried Veo 3.1 you know it has been trained on Disney content, and it will be interesting to see where that dispute ends up. It was not many months ago that Bob Iger promised to “focus on quality”, release fewer movies and reduce output. Now the Internet will soon be filled with Sora-generated Disney spam, and Disney+ will start streaming user-generated AI videos.

    Read more:

    Adobe Photoshop, Express and Acrobat Launch in ChatGPT

    https://news.adobe.com/news/2025/12/adobe-photoshop-express-acrobat-chatgpt

    The News:

    • Adobe launched Photoshop, Express and Acrobat as free apps inside ChatGPT, bringing Adobe editing tools to 800 million weekly ChatGPT users.​
    • Users access features through text prompts like “Adobe Photoshop, help me blur the background of this image,” with ChatGPT automatically surfacing the app and providing contextual guidance.​
    • Photoshop in ChatGPT adjusts specific image parts, modifies brightness and contrast, and applies effects like Glitch and Glow while preserving image quality. Users control effect intensity through sliders.​
    • Express provides access to professional design templates, text editing, image replacement, and animation directly in chat without switching apps.​
    • Acrobat enables PDF editing, text and table extraction, file merging and compression, format conversion, and sensitive detail redaction within ChatGPT.​
    • The integrations are available on ChatGPT desktop, web and iOS as of December 10, 2025. Adobe Express works on Android, with Photoshop and Acrobat support coming soon.​
    • Users can transfer work seamlessly from ChatGPT to Adobe’s native apps to access full feature sets not available in the ChatGPT integration.

    My take: User feedback so far has not been very positive, and it’s easy to understand why. It’s one thing to highlight an area of an image and ask a model like Nano Banana Pro inside Photoshop to change something; it’s another thing entirely to describe, in text alone, what to do in which specific area. If you have a minute, just check this short video of Photoshop in ChatGPT and you will understand what I mean. 🫠

    Read more:

    Runway introduces GWM-1, a real-time general world model family

    http://runwayml.com/research/introducing-runway-gwm-1

    The News:

    • Runway released GWM-1, a family of autoregressive “general world models” built on Gen-4.5, for frame-by-frame, real-time generation and interactive control of virtual worlds, avatars, and robots.
    • GWM-1 generates video frame by frame with conditioning on actions such as camera pose, robot control signals, and audio, so users can update scenes interactively instead of precomputing full clips.
    • The GWM Worlds variant targets explorable 3D-style environments where a user or agent can move a camera through a scene while the model updates frames in real time.
    • GWM Avatars focuses on conversational agents that sync speech, facial motion, and body movement with audio input, with responses generated in a continuous video stream rather than discrete clips.
    • GWM Robotics applies the same world model to robotic manipulation, taking robot commands or state as input and predicting scene evolution, which can sit in simulation loops or feed planning systems.

    My take: GWM-1 is still a research preview, and looking at the examples it’s clear this is not something you want to use in production, but I am still intrigued by the promise of real-time world generation. There are so many uses for it, primarily in robotics, where you can create millions of training scenarios for robots in record time, but of course also for human training. It could solve one of the main problems with using AR/VR for training: content generation and world interaction. Assuming we get this working somewhat reliably in the next few years, it might actually trigger a renewed wave of AR/VR interest in companies.

    Read more:

    Trump Signs Executive Order for Federal AI Regulatory Framework

    https://www.whitehouse.gov/presidential-actions/2025/12/eliminating-state-law-obstruction-of-national-artificial-intelligence-policy

    The News:

    • President Trump signed an executive order establishing a single federal AI regulatory framework that overrides state laws. The order aims to standardize AI regulation across the United States.
    • An AI Litigation Task Force will form within 30 days to challenge state AI regulations that conflict with federal policy. Colorado’s algorithmic discrimination ban faces particular scrutiny, with the administration arguing it may force AI models to produce false outputs.
    • The Commerce Department has 90 days to identify states with “onerous” AI laws. States on this list could lose access to portions of the $42.5 billion Broadband Equity Access and Deployment program funds, specifically non-deployment related funding.
    • The Federal Trade Commission receives direction to issue guidance on whether state laws requiring modifications to AI outputs violate federal deceptive practice prohibitions.
    • The order preserves state authority in three areas: child safety regulations, data center infrastructure decisions, and government procurement policies. The administration seeks congressional legislation to formalize this federal framework.

    My take: This executive order prevents states from enforcing their own AI regulations by establishing federal preemption. States like Colorado and California, which have passed laws requiring companies to assess AI systems for discriminatory impacts or mandating transparency measures, will lose the ability to enforce these requirements. This is a big win for tech companies like Google and OpenAI, who both lobbied for this outcome, as they feel state regulations are becoming excessively restrictive and problematic.

    Read more: