Mistral’s Magistral Makes a Good Addition to Reasoning AI
French AI company Mistral continues to make waves with the launch of Magistral, its advanced reasoning model available in two variants: Magistral Small (open-source, 24B parameters) and Magistral Medium (business-oriented).
Magistral shows its thought process explicitly, so users can follow the logical steps it takes to solve a problem, a valuable asset in sectors like finance, healthcare, and law. It supports multiple languages, including French, German, Chinese, and Arabic, and offers rapid response times, with lower latency than competitors.
Benchmark tests highlight Magistral Medium’s impressive performance: 90% on the AIME2024 math reasoning benchmark, placing it ahead of several competitors. Available via Hugging Face, Le Chat, and major cloud providers, Magistral positions itself as a key player in the evolving landscape of reasoning AI.
DeepSeek R1 Update: Enhanced Capabilities and Open-Source Appeal
DeepSeek's latest iteration, R1-0528, is a substantial upgrade: it adds function calling and JSON output support, and it brings lower hallucination rates and more consistent behavior in real-time interactions.
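For readers unfamiliar with function calling, here is a minimal sketch of the idea: the model emits a JSON object naming a tool and its arguments, and the application parses it and runs the matching function. Everything in it is an assumption made for illustration (the get_weather tool, the "name"/"arguments" field layout, and the simulated model output); it does not reproduce DeepSeek's actual API or response format.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; returns a canned string instead of calling a real service."""
    return f"Sunny in {city}"


# Registry mapping tool names to local Python functions (names are illustrative).
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(raw_model_output: str) -> str:
    """Parse a JSON tool call emitted by a model and run the matching local function."""
    call = json.loads(raw_model_output)  # raises ValueError on malformed JSON
    name = call["name"]
    arguments = call.get("arguments", {})
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**arguments)


# Simulated model output; a real response would come from the model's API.
example_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch_tool_call(example_output)
print(result)
```

Structured JSON output matters here because the application can parse and validate it before acting, rather than scraping an answer out of free-form text.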
DeepSeek has also deepened the model’s reasoning, nearly doubling its reasoning capacity and notably improving performance on complex tasks such as math and coding benchmarks. For instance, it now achieves 87.5% accuracy on the AIME2025 benchmark, approaching top-tier models like OpenAI’s o3.
Remaining committed to open-source principles, DeepSeek’s update signals a shift in the competitive landscape, positioning it as a reliable, accessible, and cost-effective alternative to proprietary models.
AI Insurance Emerges Amid Growing Concerns
With AI increasingly embedded in business processes, mishaps and legal issues like the Air Canada chatbot incident have highlighted significant financial risks. Recognizing these vulnerabilities, insurance providers like Lloyd’s of London have begun offering specialized "AI insurance" policies.
These policies specifically cover scenarios where AI tools fail, resulting in financial damages. While AI remains probabilistic and prone to occasional errors ("hallucinations"), AI insurance provides a much-needed safety net, potentially becoming a standard requirement similar to cybersecurity coverage.
This development underscores the growing maturity of AI, reflecting both its transformative potential and inherent risks.
AI and Job Loss: Separating Hype from Reality
We touched on this topic in the last newsletter, but AI’s impact on employment remains widely debated. While companies like Duolingo embrace AI-first strategies, phasing out contractors for tasks that AI can handle, broader economic studies show minimal actual impact on wages and employment numbers.
Some companies, like Klarna, have also at least partially reversed course after laying off a large number of employees.
Research by economists reveals that despite widespread adoption, generative AI has only marginally improved productivity, saving workers an average of just 2.8% of their work hours without significantly affecting economic outcomes.
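To put the 2.8% figure in perspective, here is a quick back-of-the-envelope calculation. The 40-hour work week is our assumption for illustration, not a number from the research, and the function name is hypothetical.

```python
def hours_saved(weekly_hours: float, share_saved: float = 0.028) -> float:
    """Hours saved per week, given the share of work time saved (2.8% by default)."""
    return weekly_hours * share_saved


# Assumed 40-hour work week (illustrative, not taken from the study).
weekly = hours_saved(40.0)
print(f"{weekly:.2f} hours saved per week")
print(f"{weekly * 60:.0f} minutes saved per week")
```

Under that assumption, the saving works out to a little over an hour a week.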
This nuanced picture highlights a disconnect between AI-driven operational changes at the corporate level and modest macroeconomic impacts, emphasizing the importance of adaptability and AI literacy to remain competitive.
Limits of AI Reasoning
Recent research from Apple, "The Illusion of Thinking," provides new insights into the limitations of Large Reasoning Models (LRMs). Although LRMs excel at generating detailed reasoning steps, experiments reveal significant performance breakdowns when they face complex, multi-step tasks.
Surprisingly, standard language models sometimes outperform LRMs on simpler tasks. Researchers also discovered a counterintuitive scaling limit: LRMs initially increase their reasoning effort as complexity grows, but performance eventually collapses, indicating fundamental constraints in current approaches.
This research underscores the need to clearly understand the true reasoning capabilities of AI, recognizing both its strengths and its limitations for responsible and effective deployment.