Ward: Newsss

Judging from headlines and social media posts in recent years, one might reasonably assume that AI is going to fix the power grid, cure the world’s diseases, and finish my holiday shopping for me. But maybe there’s just a whole lot of hype floating around out there. This week, we published a new pac…

How social media encourages the worst of AI boosterism

Demis Hassabis, CEO of Google DeepMind, summed it up in three words: “This is embarrassing.” Hassabis was replying on X to an overexcited post by Sébastien Bubeck, a research scientist at the rival firm OpenAI, announcing that two mathematicians had used OpenAI’s latest large language model, GPT-5…

Robots that spare warehouse workers the heavy lifting

Founded by MIT alumni, the Pickle Robot Company has developed machines that can autonomously load and unload trucks inside warehouses and logistics centers.

Guided learning lets “untrainable” neural networks realize their potential

CSAIL researchers find that even “untrainable” neural nets can learn effectively when their guidance method steers them with another network’s built-in biases.

A smarter way for large language models to think about hard problems

This new technique enables LLMs to dynamically adjust the amount of computation they use for reasoning, based on the difficulty of the question.

How I learned to stop worrying and love AI slop

Lately, everywhere I scroll, I keep seeing the same fish-eyed CCTV view: a grainy wide shot from the corner of a living room, a driveway at night, an empty grocery store. Then something impossible happens. JD Vance shows up at the doorstep in a crazy outfit. A car folds into itself like paper and dr…

MatE: Material Extraction from Single-Image via Geometric Prior

arXiv:2512.18312v1 Announce Type: new
Abstract: The creation of high-fidelity, physically-based rendering (PBR) materials remains a bottleneck in many graphics pipelines, typically requiring specialized equipment and expert-driven post-processing. To democratize this process, we present MatE, a nov…

SurgiPose: Estimating Surgical Tool Kinematics from Monocular Video for Surgical Robot Learning

arXiv:2512.18068v1 Announce Type: new
Abstract: Imitation learning (IL) has shown immense promise in enabling autonomous dexterous manipulation, including learning surgical tasks. To fully unlock the potential of IL for surgery, access to clinical datasets is needed, which unfortunately lack the ki…

Confidence Calibration in Vision-Language-Action Models

arXiv:2507.17383v2 Announce Type: replace
Abstract: Trustworthy robot behavior requires not only high levels of task success but also that the robot can reliably quantify how likely it is to succeed. To this end, we present a first-of-its-kind study of confidence calibration in vision-language-acti…

Neologism Learning as a Parameter-Efficient Alternative to Fine-Tuning for Model Steering

arXiv:2512.18551v1 Announce Type: new
Abstract: In language modeling, neologisms are new tokens trained to represent a concept not already included in a given model's vocabulary. Neologisms can be used to encourage specific behavior in models, for example by appending prompts with "Give me a neolog…

DeltaMIL: Gated Memory Integration for Efficient and Discriminative Whole Slide Image Analysis

arXiv:2512.19331v1 Announce Type: new
Abstract: Whole Slide Images (WSIs) are typically analyzed using multiple instance learning (MIL) methods. However, the scale and heterogeneity of WSIs generate highly redundant and dispersed information, making it difficult to identify and integrate discrimina…

LIR$^3$AG: A Lightweight Rerank Reasoning Strategy Framework for Retrieval-Augmented Generation

arXiv:2512.18329v1 Announce Type: new
Abstract: Retrieval-Augmented Generation (RAG) effectively enhances Large Language Models (LLMs) by incorporating retrieved external knowledge into the generation process. Reasoning models improve LLM performance in multi-hop QA tasks, which require integrating…

SoK: Are Watermarks in LLMs Ready for Deployment?

arXiv:2506.05594v2 Announce Type: replace
Abstract: Large Language Models (LLMs) have transformed natural language processing, demonstrating impressive capabilities across diverse tasks. However, deploying these models introduces critical risks related to intellectual property violations and potent…