LongCoT is a new, scalable benchmark for measuring the long-horizon Chain-of-Thought (CoT) reasoning capabilities of frontier language models. It comprises 2,500 expert-designed problems spanning chemistry, mathematics, computer science, chess, and logic. Because complex autonomous tasks require models to reason accurately over many sequential steps, LongCoT is designed to isolate this capability and assess it directly.
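
The source does not specify how the benchmark is distributed or scored. As a rough illustration only, the sketch below assumes a hypothetical JSONL release with `problem`, `answer`, and `domain` fields and uses simple exact-match scoring to report per-domain accuracy; none of these details are confirmed by the source.

```python
import json

def evaluate(model_answer_fn, path="longcot.jsonl"):
    """Score a model on benchmark records, grouped by domain.

    Assumes (hypothetically) that each JSONL record carries a problem
    statement, a reference answer, and a domain tag; exact-match
    scoring is a simplification of whatever grading the benchmark uses.
    """
    totals, correct = {}, {}
    with open(path) as f:
        for line in f:
            record = json.loads(line)               # hypothetical schema
            domain = record["domain"]               # e.g. "chess", "logic"
            prediction = model_answer_fn(record["problem"])
            totals[domain] = totals.get(domain, 0) + 1
            if prediction.strip() == record["answer"].strip():
                correct[domain] = correct.get(domain, 0) + 1
    # Per-domain accuracy over the five subject areas
    return {d: correct.get(d, 0) / totals[d] for d in totals}
```

In practice, `model_answer_fn` would wrap a call to the model under test, and long-form CoT outputs would likely need answer extraction before comparison; the exact grading protocol would follow whatever the benchmark's release specifies.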