AI News Flash · Daily Brief
Platforms
Anthropic's Claude Managed Agents gains 'Dreaming' memory feature
Unveiled at Code with Claude in London (May 19–21), Dreaming lets Claude agents write structured notes to themselves during tasks so that subsequent agents working on the same codebase can read, consolidate, and learn from prior work. The feature targets long-running multi-agent coding pipelines where knowledge transfer between agents has been a persistent gap. It ships as part of Claude Managed Agents, Anthropic's cloud-based multi-agent orchestration platform.
technologyreview.com
xAI ships Grok Imagine API Quality Mode for enterprise image generation
xAI launched Quality Mode for the Grok Imagine API, bringing higher realism, stronger text rendering, and finer creative controls to enterprise developers and teams. The update targets brand and marketing workflows — product renders, ad variations, and scaled asset generation — where fidelity and consistency matter. xAI says the Grok Imagine API ranks among the top models on the LMArena text-to-image leaderboard as of early May.
releasebot.io
Capabilities
DeepSWE Benchmark Exposes 16-Point GPT-5.5 Lead Over Claude Opus on Hard Coding Tasks
Datacurve's DeepSWE, a 113-task suite spanning 91 open-source repositories and five programming languages with tasks written from scratch to avoid contamination, separates frontier models far more sharply than existing leaderboards. GPT-5.5 tops the leaderboard at 70% pass@1, sixteen points ahead of its nearest competitor, Claude Opus 4.7, at 54%. Claude Haiku 4.5 collapses to zero despite scoring 39% on SWE-Bench Pro. An audit by the DeepSWE team also found that SWE-Bench Pro verifiers carry a 24% false-negative rate, calling prior clustered rankings into question.
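For readers who want to sanity-check the numbers, here is a minimal Python sketch that converts the quoted pass@1 rates into approximate task counts on the 113-task suite. It also shows what a 24% false-negative rate could imply for a measured score, under the assumption (ours, not confirmed by the source) that the figure means 24% of correct solutions are wrongly marked as failures. The helper names `tasks_passed` and `implied_true_rate` are illustrative only.

```python
# Back-of-the-envelope check of the DeepSWE figures quoted in the brief.
# Assumption (not from the source): a 24% false-negative rate means the
# verifier wrongly rejects 24% of correct solutions.

N_TASKS = 113  # DeepSWE task count, as reported in the brief


def tasks_passed(pass_rate: float, n_tasks: int = N_TASKS) -> int:
    """Approximate number of tasks solved at a given pass@1 rate."""
    return round(pass_rate * n_tasks)


def implied_true_rate(measured_rate: float, false_negative_rate: float) -> float:
    """Pass rate implied if the verifier rejects a share of correct solutions."""
    return measured_rate / (1.0 - false_negative_rate)


if __name__ == "__main__":
    gpt, opus = 0.70, 0.54  # pass@1 rates quoted in the brief
    print("GPT-5.5 tasks solved (approx.):", tasks_passed(gpt))
    print("Claude Opus 4.7 tasks solved (approx.):", tasks_passed(opus))
    print("Lead in percentage points:", round((gpt - opus) * 100))
    # Haiku 4.5's 39% on SWE-Bench Pro, adjusted under the assumption above.
    print("Implied SWE-Bench Pro rate:", round(implied_true_rate(0.39, 0.24), 3))
```

Under these assumptions, the 16-point gap works out to roughly 18 tasks on DeepSWE, and a measured 39% on SWE-Bench Pro would correspond to a rate of about 51% if every wrongly rejected solution were counted. The source does not say how the audit defines its rate, so treat the second figure as illustrative.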
venturebeat.com
Claude Mythos Preview Finds 27-Year-Old OpenBSD Bug in Controlled Glasswing Rollout
Anthropic's Project Glasswing gave select partners—including AWS, Apple, Cisco, Google, JPMorgan Chase, and Microsoft—access to Claude Mythos Preview to find critical software vulnerabilities before public release. In early internal testing, Mythos Preview identified thousands of zero-day vulnerabilities across every major OS and browser, including a 27-year-old bug in OpenBSD. Anthropic says it has no plans for a general Mythos Preview release and will first ship new cybersecurity safeguards with an upcoming Claude Opus model.
anthropic.com
Technology & Research
MobileMoE runs frontier-class LLMs on smartphones at 3.8x faster prefill
MobileMoE applies a four-stage training recipe—pre-training, mid-training, instruction fine-tuning, and quantization-aware training—to derive MoE architectures purpose-built for on-device inference. Across 14 benchmarks it matches or exceeds leading on-device dense LLMs with 2–4x fewer inference FLOPs, and beats OLMoE-1B-7B with up to 60% fewer parameters. On commodity smartphones, MobileMoE-S delivers 1.8–3.8x faster prefill and 2.2–3.4x faster decode than the dense MobileLLM-Pro baseline at the same INT4 weight memory—the first end-to-end efficient MoE inference profiled on real phone hardware.
scirate.com
NVIDIA open-sources Cosmos world-foundation models and 1,700-hour physical AI datasets
NVIDIA released its Cosmos open world foundation models, aimed at giving robots and autonomous vehicles human-like reasoning for world generation and simulation. Alongside the models, NVIDIA published over 1,700 hours of driving data spanning rare real-world edge cases, 500,000 robotics trajectories, and 455,000 protein structures, plus the dataset and training code for the Llama Embed Nemotron 8B model. The release also includes new Nemotron Speech ASR models and an updated LLM Router, all under open licenses, targeting physical AI teams that lack the data infrastructure to train from scratch.
blogs.nvidia.com
Regulation & Policy
New York algorithmic-discrimination bill clears Senate committee with private right of action
New York's S 1169 was voted out of the Senate Internet and Technology Committee this week, advancing a bill that would prohibit algorithmic discrimination, require independent audits of high-risk AI systems, and give both the attorney general and private individuals the right to sue for violations. The bill moves through Albany as the legislature races toward its June closure date and follows the state's March 2026 amendments to the RAISE Act, which shifted New York's frontier-AI framework toward a transparency-and-reporting model aligned with California. If enacted, New York would join a small group of states combining audit mandates with a private right of action—a pairing the industry has most strenuously opposed.
troutmanprivacy.com
AI Stocks
(NVDA) Nvidia posts $82B Q1 revenue, guides Q2 above Street despite $8B China hole
Nvidia reported Q1 FY2027 revenue of $81.6 billion, up 85% year-over-year and ahead of the $78.8 billion consensus, with adjusted EPS of $1.87 beating the $1.77 estimate. Data center revenue hit $75 billion, up 92% year-over-year, driven by Blackwell GPU demand; CEO Jensen Huang said Blackwell sales are 'off the charts' with cloud inventory sold out. Q2 guidance came in above Street expectations even after absorbing an estimated $8 billion in lost China H20 revenue from export controls, and the results mark the third consecutive quarter of accelerating year-over-year revenue growth. Shares still slipped roughly 1% as investors faded a well-telegraphed beat.
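The size of the beat is easier to judge as a percentage. The short Python sketch below works only from the figures quoted in this item; the helper names `pct_change` and `implied_prior` are illustrative, and the prior-year values are back-calculated from the reported growth rates rather than taken from any filing.

```python
# Arithmetic on the Nvidia figures quoted in the brief (USD billions,
# except EPS in dollars). Prior-year values are derived, not reported.


def pct_change(actual: float, reference: float) -> float:
    """Percentage by which `actual` exceeds `reference`."""
    return (actual - reference) / reference * 100.0


def implied_prior(current: float, yoy_growth_pct: float) -> float:
    """Year-ago value implied by a current value and its YoY growth rate."""
    return current / (1.0 + yoy_growth_pct / 100.0)


if __name__ == "__main__":
    revenue, consensus = 81.6, 78.8
    eps, eps_estimate = 1.87, 1.77
    data_center = 75.0

    print(f"Revenue beat vs consensus: {pct_change(revenue, consensus):.2f}%")
    print(f"EPS beat vs estimate: {pct_change(eps, eps_estimate):.2f}%")
    print(f"Data center share of revenue: {data_center / revenue * 100:.1f}%")
    print(f"Implied year-ago revenue: {implied_prior(revenue, 85):.1f}B")
    print(f"Implied year-ago data center revenue: {implied_prior(data_center, 92):.1f}B")
```

On these inputs, revenue beat consensus by about 3.6% and EPS by about 5.6%, with data center accounting for roughly 92% of the total. That is consistent with the brief's description of a beat the market had largely priced in.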
nvidianews.nvidia.com
(ORCL) Oracle lands $300B compute deal with OpenAI in AI infrastructure milestone
Oracle secured a $300 billion agreement to supply computing power to OpenAI, one of the largest AI infrastructure contracts ever disclosed, signaling Oracle's cloud division is becoming a primary beneficiary of hyperscale AI capex alongside AWS and Azure. Oracle has guided capital expenditures to reach $50 billion for the fiscal year ending May 2026, almost entirely AI-data-center-driven. The deal cements Oracle's positioning as a third major cloud player in the AI infrastructure race and gives ORCL shares a durable long-cycle revenue catalyst beyond its traditional software base.
fool.com