Tech Insights 2025 Week 46
by Johan Sanneblad
Last week was one of the lowest-activity AI weeks in a while, but I predict activity will ramp up significantly in the next couple of weeks. Google is planning to release Gemini 3 Pro very soon, which looks to be a real powerhouse with a 1 million token context window and unmatched agentic capabilities. Unmatched, that is, until OpenAI launches GPT-5.1, which is also due out any week now.
Last week McKinsey released “The state of AI in 2025: Agents, innovation, and transformation”. This is mandatory reading if you work as a CEO, CIO, CDO, or CTO and want to understand how leading companies actually use AI in production. There is a lot of good material in the report, but one main takeaway stands out:
The report differentiates between AI high performers and everyone else, where AI high performers are respondents who report that more than 5% of their organization’s EBIT, along with “significant value”, is attributable to the organization’s use of AI. So what is the main difference between these high performers and everyone else? They approach AI as a transformation journey, not as a rollout of chatbots and Copilot workflows.
Only 6% of all respondents were AI high performers, yet 62% of all respondents were experimenting with agentic AI systems. To me, this means just one thing: there are too many people playing around with this stuff who have no clue what they are doing or how to build AI agents that actually provide measurable value and ROI. If you have started an AI initiative at your company, ask yourself a simple question: have the people driving this initiative delivered tangible value from generative AI before, or is this their first experiment? Do they have the mandate to transform the organization, or is it driven as an isolated lab experiment?
Looking at these figures, where 49% of all large companies have either started scaling or have already fully scaled their agentic AI solutions, it’s time for everyone to stop experimenting and start transforming.
Thank you for being a Tech Insights subscriber!

Listen to Tech Insights on Spotify: Tech Insights 2025 Week 46 on Spotify
THIS WEEK’S NEWS:
- McKinsey State of AI 2025: Most Organizations Stuck in Pilot Phase
- Moonshot AI Releases Kimi K2 Thinking Model
- Lovable Announces Enterprise and Education Partnerships with Atlassian, imagi, and OpenAI
- TypeScript Becomes World’s Most Popular Language on GitHub, Driven by AI Development
- Amazon Releases Chronos-2 Time Series Forecasting Model
- Cursor Adds Semantic Search, Shows 12.5% Accuracy Improvement
- AWS and OpenAI Announce Multi-Year Strategic Partnership
- German Commons: Largest Openly Licensed German Text Dataset Released
McKinsey State of AI 2025: Most Organizations Stuck in Pilot Phase
https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai

The News:
- McKinsey surveyed 1,993 participants across 105 countries between June and July 2025, finding 88 percent of organizations now use AI in at least one business function, up from 78 percent in 2024.
- Two-thirds of respondents report their organizations remain in experimentation or pilot stages, with only one-third scaling AI across the enterprise.
- 62 percent of organizations are experimenting with AI agents, but only 23 percent are scaling agentic systems, primarily in IT and knowledge management functions.
- Just 39 percent of respondents attribute any EBIT impact to AI use, with most reporting less than 5 percent of enterprise EBIT stemming from AI.
- High performers (6 percent of respondents who report 5 percent or more EBIT impact) are three times more likely to redesign workflows fundamentally and set growth or innovation objectives beyond cost reduction.
- 32 percent of respondents expect workforce reductions of 3 percent or more in the next year due to AI, while 13 percent predict increases of that magnitude.
- Organizations now mitigate an average of four AI-related risks, up from two in 2022, with 51 percent experiencing at least one negative consequence, most commonly from AI inaccuracy.
My take: In my own company, TokenTek, we focus on just one thing: we help companies build and scale their own agentic AI solutions. In many cases these companies have struggled for months with their own AI solutions, experimenting with various agent frameworks and vector databases, trying out MCP servers and vibe-coding chatbots. But they are not getting anywhere: they build things very few people use, and they don’t save any money doing it.
For these companies, the easiest choice is then to pause these initiatives, roll out M365 Copilot and GitHub Copilot, and call it a day. The AI strategy is done, and AI tools are provided for both engineers and administrators. But they still don’t achieve any measurable ROI from these investments. AI tools are seen as a cost with few measurable effects on actual bottom-line results.
The McKinsey report is very clear that there is a measurable effect on ROI from scaling agentic AI systems, but like most other initiatives you need experts who have done this a few times before, so you don’t get stuck on the details. The report is also very clear that leadership matters: when the CEO and senior team own AI, adoption scales and budgets follow (many leaders spend more than 20% of their digital budgets on AI).
If you want to know how to get started with a profitable AI journey, you are very welcome to get in touch with me and I can show you real examples of how Swedish companies save money and improve revenue right now with agentic AI in production systems.
Moonshot AI Releases Kimi K2 Thinking Model
https://moonshotai.github.io/Kimi-K2/thinking.html

The News:
- Moonshot AI released Kimi K2 Thinking on November 6, 2025, a 1 trillion-parameter Mixture-of-Experts model that activates 32 billion parameters per inference. The model targets reasoning, coding, and agentic tool use tasks.
- The model scored 44.9% on Humanity’s Last Exam, 71.3% on SWE-Bench Verified, 83.1% on LiveCodeBench v6, and 60.2% on BrowseComp. These scores meet or exceed GPT-5 and Claude Sonnet 4.5 across multiple benchmarks.
- K2 Thinking executes up to 300 consecutive tool calls autonomously without human intervention. It supports a 256k token context window and uses INT4 quantization, which provides 2x inference speed compared to standard precision.
- The model operates under a Modified MIT License, which permits commercial use with one condition: products exceeding 10 million monthly active users or generating $20 million per month must display “Kimi K2” attribution in their interface.
- Users can access K2 Thinking through kimi.com, the Moonshot API at platform.moonshot.ai, and Hugging Face. The model weights and code are publicly available.
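If you want to try the model through the API, a minimal sketch could look like the one below. It assumes the Moonshot endpoint is OpenAI-compatible; the base URL and the model id "kimi-k2-thinking" are assumptions, so check platform.moonshot.ai for the exact values before running it.

```python
# Minimal sketch of calling Kimi K2 Thinking through Moonshot's API.
# Assumptions: the endpoint is OpenAI-compatible and the model id is
# "kimi-k2-thinking" -- verify both on platform.moonshot.ai.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",        # issued at platform.moonshot.ai
    base_url="https://api.moonshot.ai/v1",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="kimi-k2-thinking",               # assumed model id
    messages=[
        {"role": "system", "content": "You are a careful reasoning assistant."},
        {"role": "user", "content": "Plan the steps to migrate a REST API to GraphQL."},
    ],
)

print(response.choices[0].message.content)
```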
My take: Last week I reported on the open-source model MiniMax-M2, then the world’s best-performing open-source model, and this week we have an even stronger one in Kimi K2 Thinking. The model actually matches GPT-5 on mathematical reasoning tasks like AIME 2025 and MATH2025. User feedback so far has been phenomenal; this model seems to perform exceptionally well both for agentic tasks and for software programming! The main “problem” with this model is how to run it.
Technically, you can run an INT4-quantized version of it if you happen to have a pair of Mac M3 Ultras lying around, but that will only get you around 15 tokens per second. To run the full-precision model at FP16 you need a stack of 32 H100 or A100 80GB GPUs. Considering that most thinking requests easily consume tens of thousands of tokens in reasoning alone, you quickly start to see where this is heading. “Open source AI model” no longer means “run a great language model on your own hardware”; instead it is more like “run open-source language models in an expensive cloud, just like Claude, Gemini, and GPT-5”.
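To make those hardware numbers concrete, here is a rough back-of-the-envelope estimate of the weight memory alone (KV cache and activations come on top), assuming the standard sizes of roughly 0.5 bytes per parameter for INT4 and 2 bytes per parameter for FP16:

```python
# Rough weight-memory estimate for a 1-trillion-parameter model.
# Weights only; KV cache and activations add to this.
PARAMS = 1e12  # 1 trillion parameters

def weight_memory_gb(bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in gigabytes."""
    return PARAMS * bytes_per_param / 1e9

int4_gb = weight_memory_gb(0.5)   # INT4: ~0.5 bytes per parameter
fp16_gb = weight_memory_gb(2.0)   # FP16: 2 bytes per parameter

print(f"INT4 weights: ~{int4_gb:.0f} GB")   # ~500 GB
print(f"FP16 weights: ~{fp16_gb:.0f} GB")   # ~2,000 GB
```

Roughly 500 GB at INT4 is why a pair of 512 GB M3 Ultras can just about hold the weights, and roughly 2 TB at FP16 is why you end up at a stack of 32 GPUs with 80 GB each (2.56 TB in total) before any overhead.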
Read more:
- My Hands-On Review of Kimi K2 Thinking: The Open-Source AI That’s Changing the Game : r/LocalLLaMA
- Why Kimi K2 VRAM Requirements Are a Challenge for Everyone?
Lovable Announces Enterprise and Education Partnerships with Atlassian, imagi, and OpenAI

The News:
- Lovable partnered with Atlassian to integrate AI code generation into Jira tickets and Confluence documents, targeting product managers and development teams. Product managers can transform specifications into working prototypes and embed them back into Confluence pages.
- The integration connects Jira tickets and Confluence documentation directly to Lovable’s API. Teams convert planning documents into functional code prototypes without manual development work.
- Lovable, imagi, and OpenAI launched an education initiative providing $1 million in OpenAI credits. Schools receive free Lovable access through imagi during Computer Science Education Week and Hour of AI, running through December 31, 2025.
- The education program includes classroom-ready lesson plans, COPPA-compliant temporary accounts, and implementation resources. The initiative targets grades 9-12 students with no coding experience required.
- Atlassian added Lovable to its roster of AI development partnerships alongside Cursor, OpenAI, GitHub, and Cognition, according to the company’s Q1 FY26 shareholder letter.
My take: There are doubters of Lovable, but Lovable just keeps on delivering. Most Lovable skeptics will say that some day foundation models like Claude and ChatGPT will be good enough for all code generation, and then there will no longer be a need for Lovable. I am not so sure of this. I believe that as large language models become more capable, the skill required of users to guide them in the right direction will also increase. The better the model is at creating complex solutions, the better you need to be at guiding it. Lovable is like an autopilot that sends the LLM in the right direction, and I believe there is a place for Lovable even when LLMs become infinitely better. If you haven’t yet tried Lovable and you work as a product manager, you definitely should.
TypeScript Becomes World’s Most Popular Language on GitHub, Driven by AI Development

The News:
- TypeScript overtook Python and JavaScript in August 2025 to become the most-used language on GitHub by contributor count, marking the first language shift of this magnitude in over a decade.
- TypeScript gained over 1 million new contributors in 2025, a 66% year-over-year increase, bringing its total to 2.63 million active developers and surpassing Python by approximately 42,000 contributors.
- GitHub attributes the rise to AI-assisted development tools, which benefit from TypeScript’s static typing that catches errors during compilation rather than at runtime.
- A 2025 study found that 94% of compilation errors in LLM-generated code are type-related, which TypeScript’s type system automatically flags before code execution.
- Major frameworks including Next.js 15, SvelteKit 2, Qwik, and Astro 3 now scaffold projects in TypeScript by default, accelerating adoption among new developers.
- TypeScript saw 77.9% year-over-year growth in AI-tagged projects on GitHub, establishing it as the preferred language for building AI application interfaces, APIs, and integration layers.
My take: TypeScript with a well-configured, very strict linter, plus Prettier and Knip running every time the AI agent finishes a task, creates an amazing environment for AI-based development. If you haven’t yet tried agentic, 100% prompt-based development in TypeScript, you are in for a treat; this really shows you how the development of tomorrow will work.
Amazon Releases Chronos-2 Time Series Forecasting Model
https://www.amazon.science/blog/introducing-chronos-2-from-univariate-to-universal-forecasting

The News:
- Chronos-2 is a 120-million parameter time series foundation model that handles univariate, multivariate, and covariate-informed forecasting without additional training.
- The model uses in-context learning with a group attention mechanism that enables information exchange within arbitrary-sized groups of time series.
- Chronos-2 supports past-only covariates like historical traffic volume, known future covariates such as scheduled promotions or weather forecasts, and categorical covariates including specific holidays or promotion types.
- Amazon trained the model on synthetic data generated by imposing multivariate structure on time series sampled from base univariate generators, addressing the scarcity of real-world datasets with complex relationships.
- The model achieves state-of-the-art zero-shot accuracy on fev-bench, GIFT-Eval, and Chronos Benchmark II, outperforming existing time series foundation models across all categories on fev-bench.
- The model processes over 300 time series per second on a single NVIDIA A10G GPU and supports both GPU and CPU inference.
My take: Chronos-2 is the follow-up to the earlier Chronos and Chronos-Bolt models by Amazon, which collectively have over 600 million downloads on Hugging Face. Both Chronos and Chronos-Bolt only supported univariate forecasting, which means they could only predict the future value of a single variable using its own historical data. Chronos-2 extends this by introducing multivariate and covariate-informed forecasting, and it won over 90 percent of head-to-head comparisons against its predecessors.
So what can you do with Chronos-2? Quite a lot. Retailers can forecast product demand while accounting for planned promotional campaigns, seasonal events like Midsummer or Christmas, and weather patterns. Manufacturers can predict spare parts demand across multiple facilities and product lines. For cold-start scenarios, such as launching a new distribution center or service location, Chronos-2 can leverage patterns from existing facilities to generate accurate forecasts even with minimal operational history. The usage areas are enormous, and if you are already working with these kinds of tasks this will be a major upgrade to your workflows.
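To make the covariate idea concrete, here is a minimal sketch of how such a forecasting problem can be laid out: a historical target series, a past-only covariate, and a known future covariate. The data layout is illustrative only, with made-up column names and values; the exact Chronos-2 pipeline call should be taken from the model card on Hugging Face.

```python
# Illustrative layout of a covariate-informed demand forecast, in the spirit
# of the retail example above. This only prepares the inputs; the exact
# Chronos-2 pipeline API should be taken from the model card.
import numpy as np
import pandas as pd

history = pd.DataFrame({
    "date": pd.date_range("2025-01-01", periods=90, freq="D"),
    "units_sold": np.random.poisson(120, size=90).astype(float),  # target to forecast
    "web_traffic": np.random.normal(5000, 400, size=90),          # past-only covariate
    "promo_active": np.random.binomial(1, 0.2, size=90),          # known covariate (historical part)
})

future_covariates = pd.DataFrame({
    "date": pd.date_range("2025-04-01", periods=14, freq="D"),
    "promo_active": [0] * 7 + [1] * 7,   # a planned promotion in week two
})

# A Chronos-2 style model would take `history` plus `future_covariates`
# and return a probabilistic forecast of `units_sold` for the next 14 days.
print(history.tail(3))
print(future_covariates.head(3))
```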
Cursor Adds Semantic Search, Shows 12.5% Accuracy Improvement
https://cursor.com/blog/semsearch

The News:
- Cursor integrated semantic search into its coding agent toolset, adding natural language code retrieval alongside traditional grep-based search.
- The feature uses a custom-trained embedding model and indexing pipeline that processes agent session traces as training data.
- Offline evaluations on Cursor Context Bench showed 12.5% average accuracy improvement across all frontier coding models, with gains ranging from 6.5% to 23.5% depending on the model.
- Online A/B testing revealed 0.3% higher code retention overall, increasing to 2.6% on codebases with 1,000+ files.
- The training approach analyzes agent work traces to identify which code should have been retrieved earlier, then trains embeddings to align with LLM-generated rankings of helpful content.
My take: This is interesting. Neither Anthropic nor OpenAI uses semantic search in their tools Claude Code and Codex. Anthropic specifically states that “Semantic search is usually faster than agentic search, but less accurate, more difficult to maintain, and less transparent”. I personally think semantic search would be a great complement to agentic search, if done right with proper chunking. I guess the main problem is the actual chunking: complex flows span multiple segments in multiple files, so each logical chunk could be extremely complex. It seems, however, that Cursor has somehow solved this, and if it indeed works as well as they say, I can see semantic search becoming a standard feature in all software development agents in the future.
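To illustrate what semantic search over code means in principle, here is a minimal sketch that embeds a few code chunks and a natural-language query and ranks them by cosine similarity. The off-the-shelf embedding model and the toy chunks are placeholders; Cursor uses its own custom-trained embedding model and indexing pipeline, not the approach below.

```python
# Conceptual sketch of semantic code search: embed code chunks and a natural
# language query, then rank chunks by cosine similarity. Illustration only.
from sentence_transformers import SentenceTransformer, util

code_chunks = [
    "def retry_with_backoff(fn, attempts=5): ...",
    "class InvoiceRepository: ...  # persistence layer for invoices",
    "def parse_feature_flags(config_path): ...",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder general-purpose embedder

chunk_embeddings = model.encode(code_chunks, convert_to_tensor=True)
query_embedding = model.encode(
    "where do we handle transient network failures?", convert_to_tensor=True
)

scores = util.cos_sim(query_embedding, chunk_embeddings)[0]
best = int(scores.argmax())
print(f"Best match (score {scores[best].item():.2f}): {code_chunks[best]}")
```

Grep would never connect “transient network failures” to a function named retry_with_backoff; an embedding model can, which is exactly the gap Cursor is trying to close.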
AWS and OpenAI Announce Multi-Year Strategic Partnership
https://openai.com/index/aws-and-openai-partnership

The News:
- OpenAI signed a $38 billion, seven-year contract with Amazon Web Services to access hundreds of thousands of NVIDIA GPUs through AWS infrastructure, marking its first partnership with the cloud leader and reducing dependence on Microsoft Azure.
- The partnership deploys Amazon EC2 UltraServers featuring NVIDIA GB200 and GB300 chips clustered via low-latency networks, with capacity targeted for deployment by end of 2026 and expansion through 2027.
- OpenAI gains immediate access to existing AWS compute and the ability to scale to tens of millions of CPUs for agentic workloads, with AWS building separate capacity that includes clusters exceeding 500,000 chips.
- OpenAI’s models became available on Amazon Bedrock earlier in 2025, with thousands of customers using them for agentic workflows, coding, and scientific analysis.
- Sam Altman stated that “scaling frontier AI requires massive, reliable compute”, while AWS CEO Matt Garman noted the “breadth and immediate availability of optimized compute” as AWS’s differentiator.
My take: The hunt for GPUs goes on. This contract ensures OpenAI will have access to hundreds of thousands of additional GPUs in 2027, which will be required for continued model scaling. OpenAI’s biggest competitor in the coming two years will be Google, and OpenAI is going to need every bit of compute it can get hold of in order to keep up.
German Commons: Largest Openly Licensed German Text Dataset Released
https://www.arxiv.org/abs/2510.13996

The News:
- Researchers from the University of Kassel, University of Leipzig, and hessian.AI released the German Commons, a dataset containing 154.56 billion tokens across 35.78 million documents from 40 institutional sources with verified open licenses.
- The corpus spans seven domains: web, political, legal, news, economic, cultural, and scientific text.
- All texts carry licenses of at least CC-BY-SA 4.0 or equivalent, with document-level metadata tracking each source’s license status.
- The processing pipeline includes OCR-specific filtering for historical documents, deduplication, quality filtering, and removal of personal or toxic information.
- The dataset includes historical newspaper archives from the Deutsches Zeitungsportal (4 million newspaper editions from 1671 to 1994) and the Austrian ANNO corpus, parliamentary protocols from the German Bundestag and the historic Reichstag, and Wikipedia content.
- The corpus and processing code library “llmdata” are available on Hugging Face and GitHub.
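If you want to inspect the corpus, a minimal sketch with the Hugging Face datasets library could look like the following; the repository id is a placeholder, so look up the actual dataset name on Hugging Face first.

```python
# Minimal sketch of streaming the German Commons with the `datasets` library.
# The repository id below is a placeholder, not the verified dataset name.
from datasets import load_dataset

dataset = load_dataset(
    "org-name/german-commons",  # placeholder repo id -- check Hugging Face
    split="train",
    streaming=True,             # stream instead of downloading 154B tokens locally
)

for i, doc in enumerate(dataset):
    print(doc.keys())           # inspect available fields (text, license metadata, ...)
    if i >= 2:
        break
```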
My take: The previous largest German dataset was the German subset of the Common Corpus, which contained 112 billion German tokens. I think it will be critical for all countries with unique languages to provide resources like this moving forward, to make sure the world’s largest language models understand all the intricacies and nuances of each language.

