A fast, reusable HTML report for lightweight Radar projects: entity distribution, article velocity, and source mix, alongside a clean reading list.
Some sources or steps reported errors. The report still renders with partial data.
Entity Distribution
Top entities by frequency
Article Timeline
Daily volume inferred from article dates
Source Distribution
Share of articles by source
Data Freshness
Collection lag distribution
Entity Extraction Rate
Percentage with matched entities
Source Health
Article count by source (sorted)
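The chart captions above describe simple aggregations over the collected articles. A minimal sketch of how such numbers could be derived is shown here; the record fields (`source`, `published`, `collected`, `entities`), the sample rows, and all function names are assumptions made for illustration, not the report's actual schema or implementation.

```python
from collections import Counter
from datetime import datetime

# Hypothetical article records; field names and values are illustrative only.
ARTICLES = [
    {"source": "feed-a", "published": "2026-01-01T00:00:00",
     "collected": "2026-01-01T06:00:00", "entities": ["LLM"]},
    {"source": "feed-a", "published": "2026-01-01T12:00:00",
     "collected": "2026-01-02T00:00:00", "entities": []},
    {"source": "feed-b", "published": "2026-01-02T00:00:00",
     "collected": "2026-01-02T03:00:00", "entities": ["RAG", "LLM"]},
    {"source": "feed-a", "published": "2026-01-02T08:00:00",
     "collected": "2026-01-02T09:00:00", "entities": ["LLM"]},
]


def top_entities(articles, n=10):
    """Entity Distribution: the n most frequent entities across all articles."""
    counts = Counter(e for a in articles for e in a["entities"])
    return counts.most_common(n)


def daily_volume(articles):
    """Article Timeline: article count per publication day (YYYY-MM-DD)."""
    return Counter(a["published"][:10] for a in articles)


def source_share(articles):
    """Source Distribution: fraction of articles per source, largest first."""
    counts = Counter(a["source"] for a in articles)
    total = sum(counts.values())
    return {src: n / total for src, n in counts.most_common()}


def collection_lag_hours(articles):
    """Data Freshness: hours between publication and collection, per article."""
    lags = []
    for a in articles:
        published = datetime.fromisoformat(a["published"])
        collected = datetime.fromisoformat(a["collected"])
        lags.append((collected - published).total_seconds() / 3600)
    return lags


def entity_extraction_rate(articles):
    """Entity Extraction Rate: percentage of articles with at least one entity."""
    if not articles:
        return 0.0
    matched = sum(1 for a in articles if a["entities"])
    return 100.0 * matched / len(articles)


if __name__ == "__main__":
    print(top_entities(ARTICLES, 3))
    print(dict(daily_volume(ARTICLES)))
    print(source_share(ARTICLES))
    print(collection_lag_hours(ARTICLES))
    print(entity_extraction_rate(ARTICLES))
```

Because each function only reads the fields it needs, a report built this way could still render with partial data when some sources fail, as long as the surviving records carry those fields.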
Reading List
Click through to the original source
Pramana: Fine-Tuning Large Language Models for Epistemic Reasoning through Navya-Nyaya
arXiv:2604.04937v1 Announce Type: new Abstract: Large language models produce fluent text but struggle with systematic reasoning, often hallucinating confident but unfounded claims. When Apple researchers added irreleva...
Operational Noncommutativity in Sequential Metacognitive Judgments
arXiv:2604.04938v1 Announce Type: new Abstract: Metacognition, understood as the monitoring and regulation of one's own cognitive processes, is inherently sequential: an agent evaluates an internal state, updates it, an...
arXiv:2604.04939v1 Announce Type: new Abstract: The paper considers a new quantitative-qualitative proximity measure for the features of information objects, where data enters a common information resource from several ...
ReVEL: Multi-Turn Reflective LLM-Guided Heuristic Evolution via Structured Performance Feedback
arXiv:2604.04940v1 Announce Type: new Abstract: Designing effective heuristics for NP-hard combinatorial optimization problems remains a challenging and expertise-intensive task. Existing applications of large language ...
arXiv:2604.04941v1 Announce Type: new Abstract: Many combinatorial optimisation problems hide algebraic structures that, once exposed, shrink the search space and improve the chance of finding the global optimal solutio...
PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing
arXiv:2604.05018v1 Announce Type: new Abstract: Synthesizing unstructured research materials into manuscripts is an essential yet under-explored challenge in AI-driven scientific discovery. Existing autonomous writers a...
Part-Level 3D Gaussian Vehicle Generation with Joint and Hinge Axis Estimation
arXiv:2604.05070v1 Announce Type: new Abstract: Simulation is essential for autonomous driving, yet current frameworks often model vehicles as rigid assets and fail to capture part-level articulation. With perception al...
MMORF: A Multi-agent Framework for Designing Multi-objective Retrosynthesis Planning Systems
arXiv:2604.05075v1 Announce Type: new Abstract: Multi-objective retrosynthesis planning is a critical chemistry task requiring dynamic balancing of quality, safety, and cost objectives. Language model-based multi-agent ...
arXiv:2604.05081v1 Announce Type: new Abstract: We introduce MedGemma 1.5 4B, the latest model in the MedGemma collection. MedGemma 1.5 expands on MedGemma 1 by integrating additional capabilities: high-dimensional medi...
Uncertainty-Guided Latent Diagnostic Trajectory Learning for Sequential Clinical Diagnosis
arXiv:2604.05116v1 Announce Type: new Abstract: Clinical diagnosis requires sequential evidence acquisition under uncertainty. However, most Large Language Model (LLM) based diagnostic systems assume fully observed pati...
Non-monotonic causal discovery with Kolmogorov-Arnold Fuzzy Cognitive Maps
arXiv:2604.05136v1 Announce Type: new Abstract: Fuzzy Cognitive Maps constitute a neuro-symbolic paradigm for modeling complex dynamic systems, widely adopted for their inherent interpretability and recurrent inference ...
A mathematical theory of evolution for self-designing AIs
arXiv:2604.05142v1 Announce Type: new Abstract: As artificial intelligence systems (AIs) become increasingly produced by recursive self-improvement, a form of evolution may emerge, in which the traits of AI systems are ...
IntentScore: Intent-Conditioned Action Evaluation for Computer-Use Agents
arXiv:2604.05157v1 Announce Type: new Abstract: Computer-Use Agents (CUAs) leverage large language models to execute GUI operations on desktop environments, yet they generate actions without evaluating action quality, l...
Bypassing the CSI Bottleneck: MARL-Driven Spatial Control for Reflector Arrays
arXiv:2604.05162v1 Announce Type: new Abstract: Reconfigurable Intelligent Surfaces (RIS) are pivotal for next-generation smart radio environments, yet their practical deployment is severely bottlenecked by the intracta...
Learning to Focus: CSI-Free Hierarchical MARL for Reconfigurable Reflectors
arXiv:2604.05165v1 Announce Type: new Abstract: Reconfigurable Intelligent Surfaces (RIS) has a potential to engineer smart radio environments for next-generation millimeter-wave (mmWave) networks. However, the prohibit...
Instruction-Tuned LLMs for Parsing and Mining Unstructured Logs on Leadership HPC Systems
arXiv:2604.05168v1 Announce Type: new Abstract: Leadership-class HPC systems generate massive volumes of heterogeneous, largely unstructured system logs. Because these logs originate from diverse software, hardware, and...
ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces
arXiv:2604.05172v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly deployed to automate productivity tasks (e.g., email, scheduling, document management), but evaluating them on live serv...
Attribution Bias in Large Language Models
arXiv:2604.05224v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly used to support search and information retrieval, it is critical that they accurately attribute content to its original au...
arXiv:2604.05229v1 Announce Type: new Abstract: Agentic AI systems plan, use tools, maintain state, and produce multi-step trajectories with external effects. Those properties create a governance problem that differs ma...
EAGLE: Edge-Aware Graph Learning for Proactive Delivery Delay Prediction in Smart Logistics Networks
arXiv:2604.05254v1 Announce Type: new Abstract: Modern logistics networks generate rich operational data streams at every warehouse node and transportation lane -- from order timestamps and routing records to shipping m...
Simulating the Evolution of Alignment and Values in Machine Intelligence
arXiv:2604.05274v1 Announce Type: new Abstract: Model alignment is currently applied in a vacuum, evaluated primarily through standardised benchmark performance. The purpose of this study is to examine the effects of al...
Pressure, What Pressure? Sycophancy Disentanglement in Language Models via Reward Decomposition
arXiv:2604.05279v1 Announce Type: new Abstract: Large language models exhibit sycophancy, the tendency to shift their stated positions toward perceived user preferences or authority cues regardless of evidence. Standard...
arXiv:2604.05297v1 Announce Type: new Abstract: Value factorization, a popular paradigm in MARL, faces significant theoretical and algorithmic bottlenecks: its tendency to converge to suboptimal solutions remains poorly...
Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills
arXiv:2604.05333v1 Announce Type: new Abstract: Skill usage has become a core component of modern agent systems and can substantially improve agents' ability to complete complex tasks. In real-world settings, where agen...
TRACE: Capability-Targeted Agentic Training
arXiv:2604.05336v1 Announce Type: new Abstract: Large Language Models (LLMs) deployed in agentic environments must exercise multiple capabilities across different task instances, where a capability is performing one or ...
Dynamic Agentic AI Expert Profiler System Architecture for Multidomain Intelligence Modeling
arXiv:2604.05345v1 Announce Type: new Abstract: In today's artificial intelligence driven world, modern systems communicate with people from diverse backgrounds and skill levels. For human-machine interaction to be mean...
arXiv:2604.05348v1 Announce Type: new Abstract: Hallucinations in medical large language models (LLMs) remain a safety-critical issue, particularly when available evidence is insufficient or conflicting. We study this p...
ETR: Entropy Trend Reward for Efficient Chain-of-Thought Reasoning
arXiv:2604.05355v1 Announce Type: new Abstract: Chain-of-thought (CoT) reasoning improves large language model performance on complex tasks, but often produces excessively long and inefficient reasoning traces. Existing...
arXiv:2604.05358v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) mitigates hallucination but does not eliminate it: a deployed system must still decide, at inference time, whether its answer is actua...
TFRBench: A Reasoning Benchmark for Evaluating Forecasting Systems
arXiv:2604.05364v1 Announce Type: new Abstract: We introduce TFRBench, the first benchmark designed to evaluate the reasoning capabilities of forecasting systems. Traditionally, time-series forecasting has been evaluate...
TDA-RC: Task-Driven Alignment for Knowledge-Based Reasoning Chains in Large Language Models
arXiv:2604.04942v1 Announce Type: new Abstract: Enhancing the reasoning capability of large language models (LLMs) remains a core challenge in natural language processing. The Chain-of-Thought (CoT) paradigm dominates p...
The Illusion of Latent Generalization: Bi-directionality and the Reversal Curse
arXiv:2604.04943v1 Announce Type: new Abstract: The reversal curse describes a failure of autoregressive language models to retrieve a fact in reverse order (e.g., training on ``$A > B$'' but failing on ``$B < A$''). Re...
Inclusion-of-Thoughts: Mitigating Preference Instability via Purifying the Decision Space
arXiv:2604.04944v1 Announce Type: new Abstract: Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs). However, LLMs remain vulnerable to the presence of plausible distractors. This o...
Phase-Associative Memory: Sequence Modeling in Complex Hilbert Space
arXiv:2604.05030v1 Announce Type: new Abstract: We present Phase-Associative Memory (PAM), a recurrent sequence model in which all representations are complex-valued, associations accumulate in a matrix state $S_{t}$ $\...
This Treatment Works, Right? Evaluating LLM Sensitivity to Patient Question Framing in Medical QA
arXiv:2604.05051v1 Announce Type: new Abstract: Patients are increasingly turning to large language models (LLMs) with medical questions that are complex and difficult to articulate clearly. However, LLMs are sensitive ...
Memory Dial: A Training Framework for Controllable Memorization in Language Models
arXiv:2604.05074v1 Announce Type: new Abstract: Memorization in language models is widely studied but remains difficult to isolate and control. Understanding when and what models memorize is essential for explaining the...
Beyond LLM-as-a-Judge: Deterministic Metrics for Multilingual Generative Text Evaluation
arXiv:2604.05083v1 Announce Type: new Abstract: While Large Language Models (LLMs) are increasingly adopted as automated judges for evaluating generated text, their outputs are often costly, and highly sensitive to prom...
Document Optimization for Black-Box Retrieval via Reinforcement Learning
arXiv:2604.05087v1 Announce Type: new Abstract: Document expansion is a classical technique for improving retrieval quality, and is attractive since it shifts computation offline, avoiding additional query-time processi...
Multilingual Language Models Encode Script Over Linguistic Structure
arXiv:2604.05090v1 Announce Type: new Abstract: Multilingual language models (LMs) organize representations for typologically and orthographically diverse languages into a shared parameter space, yet the nature of this ...
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU
arXiv:2604.05091v1 Announce Type: new Abstract: We present MegaTrain, a memory-centric system that efficiently trains 100B+ parameter large language models at full precision on a single GPU. Unlike traditional GPU-centr...
arXiv:2604.05096v1 Announce Type: new Abstract: Large language models (LLMs) acquire most of their knowledge during pretraining, which ties them to a fixed snapshot of the world and makes adaptation to continuously evol...
arXiv:2604.05114v1 Announce Type: new Abstract: We study a pipeline that curates reasoning data from initial structured data for improving long-context reasoning in large language models (LLMs). Our approach, $\pi^2$, c...
SenseAI: A Human-in-the-Loop Dataset for RLHF-Aligned Financial Sentiment Reasoning
arXiv:2604.05135v1 Announce Type: new Abstract: We introduce SenseAI, a human-in-the-loop (HITL) validated financial sentiment dataset designed to capture not only model outputs but the full reasoning process behind the...
EvolveRouter: Co-Evolving Routing and Prompt for Multi-Agent Question Answering
arXiv:2604.05149v1 Announce Type: new Abstract: Large language model agents often exhibit complementary strengths, making routing a promising approach for multi-agent question answering. However, existing routing method...
Just Pass Twice: Efficient Token Classification with LLMs for Zero-Shot NER
arXiv:2604.05158v1 Announce Type: new Abstract: Large language models encode extensive world knowledge valuable for zero-shot named entity recognition. However, their causal attention mechanism, where tokens attend only...
What Makes a Good Response? An Empirical Analysis of Quality in Qualitative Interviews
arXiv:2604.05163v1 Announce Type: new Abstract: Qualitative interviews provide essential insights into human experiences when they elicit high-quality responses. While qualitative and NLP researchers have proposed vario...
Gradient-Controlled Decoding: A Safety Guardrail for LLMs with Dual-Anchor Steering
arXiv:2604.05179v1 Announce Type: new Abstract: Large language models (LLMs) remain susceptible to jailbreak and direct prompt-injection attacks, yet the strongest defensive filters frequently over-refuse benign queries...
Improving Clinical Trial Recruitment using Clinical Narratives and Large Language Models
arXiv:2604.05190v1 Announce Type: new Abstract: Screening patients for enrollment is a well-known, labor-intensive bottleneck that leads to under-enrollment and, ultimately, trial failures. Recent breakthroughs in large...
arXiv:2604.05192v1 Announce Type: new Abstract: Byte Pair Encoding (BPE) is a widely used tokenization algorithm, whose tokens cannot extend across pre-tokenization boundaries, functionally limiting it to representing a...
XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts
arXiv:2604.05242v1 Announce Type: new Abstract: Multi-bit watermarking has emerged as a promising solution for embedding imperceptible binary messages into Large Language Model (LLM)-generated text, enabling reliable at...
arXiv:2604.05243v1 Announce Type: new Abstract: Background: Children do not simply learn that balls are round and blocks are square. They learn that shape is the kind of feature that tends to define object categories --...
Do Domain-specific Experts exist in MoE-based LLMs?
arXiv:2604.05267v1 Announce Type: new Abstract: In the era of Large Language Models (LLMs), the Mixture of Experts (MoE) architecture has emerged as an effective approach for training extremely large models with improve...
Beneath the Surface: Investigating LLMs' Capabilities for Communicating with Subtext
arXiv:2604.05273v1 Announce Type: new Abstract: Human communication is fundamentally creative, and often makes use of subtext -- implied meaning that goes beyond the literal content of the text. Here, we systematically ...
Right at My Level: A Unified Multilingual Framework for Proficiency-Aware Text Simplification
arXiv:2604.05302v1 Announce Type: new Abstract: Text simplification supports second language (L2) learning by providing comprehensible input, consistent with the Input Hypothesis. However, constructing personalized para...
DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects
arXiv:2604.05318v1 Announce Type: new Abstract: Harmful content detectors-particularly disinformation classifiers-are predominantly developed and evaluated on Standard American English (SAE), leaving their robustness to...
arXiv:2604.05339v1 Announce Type: new Abstract: As LLMs become increasingly integrated into human society, evaluating their orientations on human values from social science has drawn growing attention. Nevertheless, it ...
DQA: Diagnostic Question Answering for IT Support
arXiv:2604.05350v1 Announce Type: new Abstract: Enterprise IT support interactions are fundamentally diagnostic: effective resolution requires iterative evidence gathering from ambiguous user reports to identify an unde...
ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving
arXiv:2604.05378v1 Announce Type: new Abstract: Recent progress in vision-language-action (VLA) models has enabled language-conditioned driving agents to execute natural-language navigation commands in closed-loop simul...
Confidence Should Be Calibrated More Than One Turn Deep
arXiv:2604.05397v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly applied in high-stakes domains such as finance, healthcare, and education, where reliable multi-turn interactions with users ...
Multi-Drafter Speculative Decoding with Alignment Feedback
arXiv:2604.05417v1 Announce Type: new Abstract: Speculative decoding (SD) accelerates large language model (LLM) inference by using a smaller model to draft future tokens, which are then verified by the target LLM. This...
arXiv:2604.03233v1 Announce Type: new Abstract: The conservation of cultural heritage increasingly relies on integrating technological innovation with domain expertise to ensure effective monitoring and predictive maint...
Scaling DPPs for RAG: Density Meets Diversity
arXiv:2604.03240v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding generation in external knowledge, yielding relevance responses that are aligned wit...
DRAFT: Task Decoupled Latent Reasoning for Agent Safety
arXiv:2604.03242v1 Announce Type: new Abstract: The advent of tool-using LLM agents shifts safety monitoring from output moderation to auditing long, noisy interaction trajectories, where risk-critical evidence is spars...
arXiv:2604.03321v1 Announce Type: new Abstract: Machine learning, especially physics-informed neural networks (PINNs) and their neural network variants, has been widely used to solve problems involving partial different...
Apparent Age Estimation: Challenges and Outcomes
arXiv:2604.03335v1 Announce Type: new Abstract: Apparent age estimation is a valuable tool for business personalization, yet current models frequently exhibit demographic biases. We review prior works on the DEX method ...
arXiv:2604.03336v1 Announce Type: new Abstract: BitNet b1.58 (Ma et al., 2024) demonstrates that large language models can operate entirely on ternary weights {-1, 0, +1}, yet no native binary wire format exists for suc...
arXiv:2604.03344v1 Announce Type: new Abstract: Electricity theft and non-technical losses (NTLs) remain critical challenges in modern smart grids, causing significant economic losses and compromising grid reliability. ...
Hardware-Oriented Inference Complexity of Kolmogorov-Arnold Networks
arXiv:2604.03345v1 Announce Type: new Abstract: Kolmogorov-Arnold Networks (KANs) have recently emerged as a powerful architecture for various machine learning applications. However, their unique structure raises signif...
arXiv:2604.03350v1 Announce Type: new Abstract: Systematic exploration of Agent-Based Models (ABMs) is challenged by the curse of dimensionality and their inherent stochasticity. We present a multi-stage pipeline integr...
The limits of bio-molecular modeling with large language models : a cross-scale evaluation
arXiv:2604.03361v1 Announce Type: new Abstract: The modeling of bio-molecular system across molecular scales remains a central challenge in scientific research. Large language models (LLMs) are increasingly applied to b...
Scalable Variational Bayesian Fine-Tuning of LLMs via Orthogonalized Low-Rank Adapters
arXiv:2604.03388v1 Announce Type: new Abstract: When deploying large language models (LLMs) to safety-critical applications, uncertainty quantification (UQ) is of utmost importance to self-assess the reliability of the ...
arXiv:2604.03417v1 Announce Type: new Abstract: Network visualization has traditionally relied on heuristic metrics, such as stress, under the assumption that optimizing them leads to aesthetic and informative layouts. ...
Adaptive Threshold-Driven Continuous Greedy Method for Scalable Submodular Optimization
arXiv:2604.03419v1 Announce Type: new Abstract: Submodular maximization under matroid constraints is a fundamental problem in combinatorial optimization with applications in sensing, data summarization, active learning,...
Adversarial Robustness of Deep State Space Models for Forecasting
arXiv:2604.03427v1 Announce Type: new Abstract: State-space model (SSM) for time-series forecasting have demonstrated strong empirical performance on benchmark datasets, yet their robustness under adversarial perturbati...
arXiv:2604.03436v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) are increasingly used for safety-relevant applications including alignment detection and model steering. These use cases require SAE latents to ...
Olmo Hybrid: From Theory to Practice and Back
arXiv:2604.03444v2 Announce Type: new Abstract: Recent work has demonstrated the potential of non-transformer language models, especially linear recurrent neural networks (RNNs) and hybrid models that mix recurrence and...
Neural Operators for Multi-Task Control and Adaptation
arXiv:2604.03449v1 Announce Type: new Abstract: Neural operator methods have emerged as powerful tools for learning mappings between infinite-dimensional function spaces, yet their potential in optimal control remains l...
Earth Embeddings Reveal Diverse Urban Signals from Space
arXiv:2604.03456v1 Announce Type: new Abstract: Conventional urban indicators derived from censuses, surveys, and administrative records are often costly, spatially inconsistent, and slow to update. Recent geospatial fo...
Super Agents and Confounders: Influence of surrounding agents on vehicle trajectory prediction
arXiv:2604.03463v1 Announce Type: new Abstract: In highly interactive driving scenes, trajectory prediction is conditioned on information from surrounding traffic participants such as cars and pedestrians. Our main cont...
Investigating Data Interventions for Subgroup Fairness: An ICU Case Study
arXiv:2604.03478v1 Announce Type: new Abstract: In high-stakes settings where machine learning models are used to automate decision-making about individuals, the presence of algorithmic bias can exacerbate systemic harm...
Improving Feasibility via Fast Autoencoder-Based Projections
arXiv:2604.03489v1 Announce Type: new Abstract: Enforcing complex (e.g., nonconvex) operational constraints is a critical challenge in real-world learning and control systems. However, existing methods struggle to effic...
Online learning of smooth functions on $\mathbb{R}$
arXiv:2604.03525v1 Announce Type: new Abstract: We study adversarial online learning of real-valued functions on $\mathbb{R}$. In each round the learner is queried at $x_t\in\mathbb{R}$, predicts $\hat y_t$, and then ob...
arXiv:2604.03541v2 Announce Type: new Abstract: This study surveys the historical development of regularization, tracing its evolution from stepwise regression in the 1960s to recent advancements in formal error control...
Simple yet Effective: Low-Rank Spatial Attention for Neural Operators
arXiv:2604.03582v1 Announce Type: new Abstract: Neural operators have emerged as data-driven surrogates for solving partial differential equations (PDEs), and their success hinges on efficiently modeling the long-range,...
Evaluation of Bagging Predictors with Kernel Density Estimation and Bagging Score
arXiv:2604.03599v1 Announce Type: new Abstract: For a larger set of predictions of several differently trained machine learning models, known as bagging predictors, the mean of all predictions is taken by default. Never...
BlazeFL: Fast and Deterministic Federated Learning Simulation
arXiv:2604.03606v1 Announce Type: new Abstract: Federated learning (FL) research increasingly relies on single-node simulations with hundreds or thousands of virtual clients, making both efficiency and reproducibility e...
Neural Global Optimization via Iterative Refinement from Noisy Samples
arXiv:2604.03614v1 Announce Type: new Abstract: Global optimization of black-box functions from noisy samples is a fundamental challenge in machine learning and scientific computing. Traditional methods such as Bayesian...
Algebraic Diversity: Group-Theoretic Spectral Estimation from Single Observations
arXiv:2604.03634v1 Announce Type: new Abstract: We prove that temporal averaging over multiple observations can be replaced by algebraic group action on a single observation for second-order statistical estimation. A Ge...
Delayed Homomorphic Reinforcement Learning for Environments with Delayed Feedback
arXiv:2604.03641v1 Announce Type: new Abstract: Reinforcement learning in real-world systems is often accompanied by delayed feedback, which breaks the Markov assumption and impedes both learning and control. Canonical ...
Automated Attention Pattern Discovery at Scale in Large Language Models
arXiv:2604.03764v1 Announce Type: new Abstract: Large language models have found success by scaling up capabilities to work in general settings. The same can unfortunately not be said for interpretability methods. The c...
Generative AI for Video Trailer Synthesis: From Extractive Heuristics to Autoregressive Creativity
arXiv:2604.04953v1 Announce Type: new Abstract: The domain of automatic video trailer generation is currently undergoing a profound paradigm shift, transitioning from heuristic-based extraction methods to deep generativ...
arXiv:2604.04972v1 Announce Type: new Abstract: Large Vision-Language Models (LVLMs) suffer from prohibitive inference costs due to the massive number of visual tokens processed by the language decoder. Existing pruning...
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
arXiv:2604.05015v1 Announce Type: new Abstract: With the rapid advancement of video understanding, existing benchmarks are becoming increasingly saturated, exposing a critical discrepancy between inflated leaderboard sc...
ID-Sim: An Identity-Focused Similarity Metric
arXiv:2604.05039v1 Announce Type: new Abstract: Humans have remarkable selective sensitivity to identities -- easily distinguishing between highly similar identities, even across significantly different contexts such as...
R3PM-Net: Real-time, Robust, Real-world Point Matching Network
arXiv:2604.05060v1 Announce Type: new Abstract: Accurate Point Cloud Registration (PCR) is an important task in 3D data processing, involving the estimation of a rigid transformation between two point clouds. While deep...
SVAgent: Storyline-Guided Long Video Understanding via Cross-Modal Multi-Agent Collaboration
arXiv:2604.05079v1 Announce Type: new Abstract: Video question answering (VideoQA) is a challenging task that requires integrating spatial, temporal, and semantic information to capture the complex dynamics of video seq...
Simultaneous Dual-View Mammogram Synthesis Using Denoising Diffusion Probabilistic Models
arXiv:2604.05110v1 Announce Type: new Abstract: Breast cancer screening relies heavily on mammography, where the craniocaudal (CC) and mediolateral oblique (MLO) views provide complementary information for diagnosis. Ho...
Watch Before You Answer: Learning from Visually Grounded Post-Training
arXiv:2604.05117v1 Announce Type: new Abstract: It is critical for vision-language models (VLMs) to comprehensively understand visual, temporal, and textual cues. However, despite rapid progress in multimodal modeling, ...
Lightweight True In-Pixel Encryption with FeFET Enabled Pixel Design for Secure Imaging
arXiv:2604.05147v1 Announce Type: new Abstract: Ensuring end-to-end security in image sensors has become essential as visual data can be exposed through multiple stages of the imaging pipeline. Advanced protection requi...
Modality-Aware and Anatomical Vector-Quantized Autoencoding for Multimodal Brain MRI
arXiv:2604.05171v1 Announce Type: new Abstract: Learning a robust Variational Autoencoder (VAE) is a fundamental step for many deep learning applications in medical image analysis, such as MRI synthesizes. Existing brai...
MIRAGE: Benchmarking and Aligning Multi-Instance Image Editing
arXiv:2604.05180v1 Announce Type: new Abstract: Instruction-guided image editing has seen remarkable progress with models like FLUX.2 and Qwen-Image-Edit, yet they still struggle with complex scenarios with multiple sim...
LSRM: High-Fidelity Object-Centric Reconstruction via Scaled Context Windows
arXiv:2604.05182v1 Announce Type: new Abstract: We introduce the Large Sparse Reconstruction Model to study how scaling transformer context windows impacts feed-forward 3D reconstruction. Although recent object-centric ...
OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters for Diffusion Models
arXiv:2604.05183v1 Announce Type: new Abstract: In a rapidly growing field of model training there is a constant practical interest in parameter-efficient fine-tuning and various techniques that use a small amount of tr...
Integration of Object Detection and Small VLMs for Construction Safety Hazard Identification
arXiv:2604.05210v1 Announce Type: new Abstract: Accurate and timely identification of construction hazards around workers is essential for preventing workplace accidents. While large vision-language models (VLMs) demons...
Boxer: Robust Lifting of Open-World 2D Bounding Boxes to 3D
arXiv:2604.05212v1 Announce Type: new Abstract: Detecting and localizing objects in space is a fundamental computer vision problem. While much progress has been made to solve 2D object detection, 3D object localization ...
arXiv:2604.05215v1 Announce Type: new Abstract: Representation learning on large-scale unstructured volumetric and surface meshes poses significant challenges in neuroimaging, especially when models must incorporate div...
Active Measurement of Two-Point Correlations
arXiv:2604.05227v1 Announce Type: new Abstract: Two-point correlation functions (2PCF) are widely used to characterize how points cluster in space. In this work, we study the problem of measuring the 2PCF over a large s...
Protecting and Preserving Protest Dynamics for Responsible Analysis
arXiv:2604.05256v1 Announce Type: new Abstract: Protest-related social media data are valuable for understanding collective action but inherently high-risk due to concerns surrounding surveillance, repression, and indiv...
Coverage Optimization for Camera View Selection
arXiv:2604.05259v1 Announce Type: new Abstract: What makes a good viewpoint? The quality of the data used to learn 3D reconstructions is crucial for enabling efficient and accurate scene modeling. We study the active vi...
Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking
arXiv:2604.05268v1 Announce Type: new Abstract: Multi-modal retrieval-augmented generation (MM-RAG) relies heavily on re-rankers to surface the most relevant evidence for image-question queries. However, standard re-ran...
Toward Unified Fine-Grained Vehicle Classification and Automatic License Plate Recognition
arXiv:2604.05271v1 Announce Type: new Abstract: Extracting vehicle information from surveillance images is essential for intelligent transportation systems, enabling applications such as traffic monitoring and criminal ...
arXiv:2604.05296v1 Announce Type: new Abstract: Frozen visual embeddings (e.g., CLIP, DINOv2/v3, SSCD) power retrieval and integrity systems, yet their use on face-containing data is constrained by unmeasured identity l...
SmokeGS-R: Physics-Guided Pseudo-Clean 3DGS for Real-World Multi-View Smoke Restoration
arXiv:2604.05301v1 Announce Type: new Abstract: Real-world smoke simultaneously attenuates scene radiance, adds airlight, and destabilizes multi-view appearance consistency, making robust 3D reconstruction particularly ...
Indoor Asset Detection in Large Scale 360° Drone-Captured Imagery via 3D Gaussian Splatting
arXiv:2604.05316v1 Announce Type: new Abstract: We present an approach for object-level detection and segmentation of target indoor assets in 3D Gaussian Splatting (3DGS) scenes, reconstructed from 360° drone-captu...
arXiv:2604.05323v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models integrate visual perception, language understanding, and action decision-making for cross-modal semantic alignment, exhibiting broad ap...
Unsupervised Multi-agent and Single-agent Perception from Cooperative Views
arXiv:2604.05354v1 Announce Type: new Abstract: The LiDAR-based multi-agent and single-agent perception has shown promising performance in environmental understanding for robots and automated vehicles. However, there is...
GESS: Multi-cue Guided Local Feature Learning via Geometric and Semantic Synergy
arXiv:2604.05359v1 Announce Type: new Abstract: Robust local feature detection and description are foundational tasks in computer vision. Existing methods primarily rely on single appearance cues for modeling, leading t...
arXiv:2604.05363v1 Announce Type: new Abstract: Infrared small target detection (IRSTD) aims to separate small targets from clutter backgrounds. Extensive research is dedicated to the pixel-level supervision-guided "enc...
3DTurboQuant: Training-Free Near-Optimal Quantization for 3D Reconstruction Models
arXiv:2604.05366v1 Announce Type: new Abstract: Every existing method for compressing 3D Gaussian Splatting, NeRF, or transformer-based 3D reconstructors requires learning a data-dependent codebook through per-scene fin...
UAVReason: A Unified, Large-Scale Benchmark for Multimodal Aerial Scene Reasoning and Generation
arXiv:2604.05377v1 Announce Type: new Abstract: Vision-Language models (VLMs) have demonstrated remarkable capability in ground-view visual understanding but often fracture when deployed on high-altitude Unmanned Aerial...
Constraint-Driven Warm-Freeze for Efficient Transfer Learning in Photovoltaic Systems
arXiv:2604.05807v1 Announce Type: new Abstract: Detecting cyberattacks in photovoltaic (PV) monitoring and MPPT control signals requires models that are robust to bias, drift, and transient spikes, yet lightweight enoug...
Activity-Dependent Plasticity in Morphogenetically-Grown Recurrent Networks
arXiv:2604.03386v1 Announce Type: cross Abstract: Developmental approaches to neural architecture search grow functional networks from compact genomes through self-organisation, but the resulting networks operate with f...
arXiv:2604.04981v1 Announce Type: cross Abstract: Next-generation sequencing (NGS) is a key technique for studying the DNA and RNA of organisms. However, identifying quality problems in NGS data across different experim...
Neural Network Pruning via QUBO Optimization
arXiv:2604.05856v1 Announce Type: cross Abstract: Neural network pruning can be formulated as a combinatorial optimization problem, yet most existing approaches rely on greedy heuristics that ignore complex interactions...
arXiv:2601.05098v3 Announce Type: replace Abstract: Designing scientific instrumentation often requires exploring large, highly constrained design spaces using computationally expensive physics simulations. These simula...
MOELIGA: a multi-objective evolutionary approach for feature selection with local improvement
arXiv:2603.20934v2 Announce Type: replace Abstract: Selecting the most relevant or informative features is a key issue in actual machine learning problems. Since an exhaustive search is not feasible even for a moderate ...
arXiv:2506.00377v4 Announce Type: replace-cross Abstract: The widespread adoption of the Internet of Things (IoT) has raised a new challenge for developers since it is prone to known and unknown cyberattacks due to its ...
Identification and Inference in Nonlinear Dynamic Network Models
arXiv:2604.04961v1 Announce Type: new Abstract: We study identification and inference in nonlinear dynamic systems defined on unknown interaction networks. The system evolves through an unobserved dependence matrix gove...
Learning Nonlinear Regime Transitions via Semi-Parametric State-Space Models
arXiv:2604.04963v1 Announce Type: new Abstract: We develop a semi-parametric state-space model for time-series data with latent regime transitions. Classical Markov-switching models use fixed parametric transition funct...
arXiv:2604.04973v1 Announce Type: new Abstract: This paper presents a Structured Source-Wise Adaptive Diffusion Framework for linear and nonlinear blind source separation. The framework interprets each latent dimension ...
arXiv:2604.04993v1 Announce Type: new Abstract: We introduce the Hiremath Early Detection (HED) Score, a principled, measure-theoretic evaluation criterion for quantifying the time-value of information in systems operat...
arXiv:2604.05008v1 Announce Type: new Abstract: This paper introduces a novel generative framework for synthesising forward-looking, càdlàg stochastic trajectories that are sequentially consistent with time-evolving...
Individual-heterogeneous sub-Gaussian Mixture Models
arXiv:2604.05337v1 Announce Type: new Abstract: The classical Gaussian mixture model assumes homogeneity within clusters, an assumption that often fails in real-world data where observations naturally exhibit varying sc...
MEC: Machine-Learning-Assisted Generalized Entropy Calibration for Semi-Supervised Mean Estimation
arXiv:2604.05446v1 Announce Type: new Abstract: Obtaining high-quality labels is costly, whereas unlabeled covariates are often abundant, motivating semi-supervised inference methods with reliable uncertainty quantifica...
Hierarchical Contrastive Learning for Multimodal Data
arXiv:2604.05462v1 Announce Type: new Abstract: Multimodal representation learning is commonly built on a shared-private decomposition, treating latent information as either common to all modalities or specific to one. ...
Efficient machine unlearning with minimax optimality
arXiv:2604.05669v1 Announce Type: new Abstract: There is a growing demand for efficient data removal to comply with regulations like the GDPR and to mitigate the influence of biased or corrupted data. This has motivated...
Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification
arXiv:2604.06032v1 Announce Type: new Abstract: Neural network classifiers trained with cross-entropy loss achieve strong predictive accuracy but lack the capability to provide inherent predictive uncertainty estimates,...
Cactus: Accelerating Auto-Regressive Decoding with Constrained Acceptance Speculative Sampling
arXiv:2604.04987v1 Announce Type: cross Abstract: Speculative sampling (SpS) has been successful in accelerating the decoding throughput of auto-regressive large language models by leveraging smaller draft models. SpS s...
arXiv:2604.05057v1 Announce Type: cross Abstract: Blind-spot mass is a Good-Turing framework for quantifying deployment coverage risk in machine learning. In modern ML systems, operational state distributions are often ...
fastml: Guarded Resampling Workflows for Safer Automated Machine Learning in R
arXiv:2604.05225v1 Announce Type: cross Abstract: Preprocessing leakage arises when scaling, imputation, or other data-dependent transformations are estimated before resampling, inflating apparent performance while rema...
arXiv:2604.05303v1 Announce Type: cross Abstract: Sampling physical systems with rough energy landscapes is hindered by rare events and metastable trapping. While Boltzmann generators already offer a solution, their rel...
Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models
arXiv:2604.05469v1 Announce Type: cross Abstract: We study language models as evolving model organisms and ask when autoregressive next-token learning selects for world-tracking representations. For any encoding of late...
Optimal Centered Active Excitation in Linear System Identification
arXiv:2604.05518v1 Announce Type: cross Abstract: We propose an active learning algorithm for linear system identification with optimal centered noise excitation. Notably, our algorithm, based on ordinary least squares ...
High-dimensional reliability-based design optimization using stochastic emulators
arXiv:2604.05759v1 Announce Type: cross Abstract: Reliability-based design optimization (RBDO) is traditionally formulated as a nested optimization and reliability problem. Although surrogate models are generally employ...
arXiv:2604.05778v1 Announce Type: cross Abstract: The ISOKANN (Invariant Subspaces of Koopman Operators Learned by Artificial Neural Networks) framework provides a data-driven route to extract collective variables (CVs)...
Bivariate Causal Discovery Using Rate-Distortion MDL: An Information Dimension Approach
arXiv:2604.05829v1 Announce Type: cross Abstract: Approaches to bivariate causal discovery based on the minimum description length (MDL) principle approximate the (uncomputable) Kolmogorov complexity of the models in ea...
Expectation Maximization (EM) Converges for General Agnostic Mixtures
arXiv:2604.05842v1 Announce Type: cross Abstract: Mixture of linear regression is well studied in statistics and machine learning, where the data points are generated probabilistically using $k$ linear models. Algorithm...
Data Distribution Valuation Using Generalized Bayesian Inference
arXiv:2604.05993v1 Announce Type: cross Abstract: We investigate the data distribution valuation problem, which aims to quantify the values of data distributions from their samples. This is a recently proposed problem t...
arXiv:2604.06065v1 Announce Type: cross Abstract: Under general assumptions on the target distribution $p^\star$, we establish a sharp Lipschitz regularity theory for flow-matching vector fields and diffusion-model scor...
Sequential Audit Sampling with Statistical Guarantees
arXiv:2604.06116v1 Announce Type: cross Abstract: Financial statement auditing is conducted under a risk-based evidence approach to obtain reasonable assurance. In practice, auditors often perform additional sampling or...
arXiv:2604.06169v1 Announce Type: cross Abstract: The static "train then deploy" paradigm fundamentally limits Large Language Models (LLMs) from dynamically adapting their weights in response to continuous streams of n...
arXiv:2305.02657v5 Announce Type: replace Abstract: In this paper, we provide a strategy to determine the eigenvalue decay rate (EDR) of a large class of kernel functions defined on a general domain rather than $\mathbb...
arXiv:2509.02617v2 Announce Type: replace Abstract: Parametric partial differential equations (PDEs) serve as fundamental mathematical tools for modeling complex physical phenomena, yet repeated high-fidelity numerical ...
Causal Effect Estimation with Learned Instrument Representations
arXiv:2602.10370v2 Announce Type: replace Abstract: Instrumental variable (IV) methods mitigate bias from unobserved confounding in observational causal inference but rely on the availability of a valid instrument, whic...
Transfer Learning for Meta-analysis Under Covariate Shift
arXiv:2604.02656v2 Announce Type: replace Abstract: Randomized controlled trials often do not represent the populations where decisions are made, and covariate shift across studies can invalidate standard IPD meta-analy...
Edgeworth Accountant: An Analytical Approach to Differential Privacy Composition
arXiv:2206.04236v3 Announce Type: replace-cross Abstract: In privacy-preserving data analysis, many procedures and algorithms are structured as compositions of multiple private building blocks. As such, an important que...
Understanding Uncertainty Sampling via Equivalent Loss
arXiv:2307.02719v4 Announce Type: replace-cross Abstract: Uncertainty sampling is a prevalent active learning algorithm that queries sequentially the annotations of data samples which the current prediction model is unc...
New ways to balance cost and reliability in the Gemini API
Gemini API Dials
Create, edit and share videos at no cost in Google Vids
Google Vids logo surrounded by various video editing UI
We’re creating a new satellite imagery map to help protect Brazil’s forests.
Google partnered with the Brazilian government on a satellite imagery map ...
The latest AI news we announced in March 2026
March 2026 AI Recap showing new updates
Build with Veo 3.1 Lite, our most cost-effective video generation model
Build with Veo 3.1 Lite
Watch James Manyika talk AI and creativity with LL COOL J.
In the latest episode of our Dialogues on Technology and Society series, LL COOL J si...
Transform your headphones into a live personal translator on iOS.
Google Translate’s Live translate with headphones is officially arriving on iOS! A...
Gemini 3.1 Flash Live: Making audio AI more natural and reliable
The Gemini emblem sits next to text reading 'Gemini 3.1 Flash Live'. The background has blue, multicolored dots making up a microphone icon
Search Live is expanding globally
A graphic with the words Search Live shown underneath a waveform icon. To the right, a phone shows the Google app with Search Live open. The camera is pointing at trees in a forest.
Build with Lyria 3, our newest music generation model
Google Lyria teaser
Lyria 3 Pro: Create longer tracks in more Google products
Sizzle video showing new capabilities from Lyria 3 Pro
Bringing the power of Personal Intelligence to more people
Bubble that says "Personal Intelligence" with Google G, Google Photos logo, and Gmail logo around it
Our latest investment in open source security for the AI era
A collage including security icons and photos of hands clasped, a man looking at a computer, and two people pointing at something off camera
How AI is helping improve heart health in rural Australia
A doctor is sitting across a desk from a patient. The doctor is holding a tablet and a pen. Medical charts renderings are in the background.
Gemini in Google Sheets just achieved state-of-the-art performance.
Today we announced new beta features for Gemini in Sheets to help you create, orga...
How our open-source AI model SpeciesNet is helping to promote wildlife conservation
Photos of animals being identified by the SpeciesNet AI model
Ask a Techspert: How does AI understand my visual searches?
Mobile phone with a search bar that says "Ask anything"
The latest AI news we announced in February
an MP4 of a carousel with images reading "Gemini 3.1 Pro" and "Nano Banana 2"
Use Canvas in AI Mode to get things done and bring your ideas to life, right in Search.
Canvas in AI Mode is now available for everyone in the U.S. Plus, it can now help you draft documen...
Create new worlds in Project Genie with these 4 tips
A screen capture of Project Genie, an experimental interface showing a grid of circular images, many of which appear to be 360-degree views, with a large, central black globe labeled Create your own
The next phase of enterprise AI
OpenAI outlines the next phase of enterprise AI, as adoption accelerates across industries with Frontier, ChatGPT Enterprise, Codex, and company-wide AI agents.
Introducing the Child Safety Blueprint
Discover OpenAI’s Child Safety Blueprint—a roadmap for building AI responsibly with safeguards, age-appropriate design, and collaboration to protect and empower young people online.
Announcing the OpenAI Safety Fellowship
A pilot program to support independent safety and alignment research and develop the next generation of talent
Industrial policy for the Intelligence Age
Explore our ambitious, people-first industrial policy ideas for the AI era—focused on expanding opportunity, sharing prosperity, and building resilient institutions as advanced intelligence evolves.
OpenAI acquires TBPN to accelerate global conversations around AI and support independent media, expanding dialogue with builders, businesses, and the broader tech community.
Codex now offers more flexible pricing for teams
Codex now includes pay-as-you-go pricing for ChatGPT Business and Enterprise, providing teams a more flexible option to start and scale adoption.
Gradient Labs gives every bank customer an AI account manager
Gradient Labs uses GPT-4.1 and GPT-5.4 mini and nano to power AI agents that automate banking support workflows with low latency and high reliability.
Accelerating the next phase of AI
OpenAI raises $122 billion in new funding to expand frontier AI globally, invest in next-generation compute, and meet growing demand for ChatGPT, Codex, and enterprise AI.
Helping disaster response teams turn AI into action across Asia
AI for Disaster Response in Asia: OpenAI Workshop with Gates Foundation
STADLER reshapes knowledge work at a 230-year-old company
Learn how STADLER uses ChatGPT to transform knowledge work, saving time and accelerating productivity across 650 employees.
Inside our approach to the Model Spec
Learn how OpenAI’s Model Spec serves as a public framework for model behavior, balancing safety, user freedom, and accountability as AI systems advance.
Introducing the OpenAI Safety Bug Bounty program
OpenAI launches a Safety Bug Bounty program to identify AI abuse and safety risks, including agentic vulnerabilities, prompt injection, and data exfiltration.
Helping developers build safer AI experiences for teens
OpenAI releases prompt-based teen safety policies for developers using gpt-oss-safeguard, helping moderate age-specific risks in AI systems.
Update on the OpenAI Foundation
The OpenAI Foundation announces plans to invest at least $1 billion in curing diseases, economic opportunity, AI resilience, and community programs.
Powering product discovery in ChatGPT
ChatGPT introduces richer, visually immersive shopping powered by the Agentic Commerce Protocol, enabling product discovery, side-by-side comparisons, and merchant integration.
To address the novel safety challenges posed by a state-of-the-art video model as well as a new social creation platform, we’ve built Sora 2 and the Sora app with safety at the foundation. Our approach is anchored in con...
How we monitor internal coding agents for misalignment
How OpenAI uses chain-of-thought monitoring to study misalignment in internal coding agents—analyzing real-world deployments to detect risks and strengthen AI safety safeguards.
Accelerates Codex growth to power the next generation of Python developer tools
OpenAI Japan announces Japan Teen Safety Blueprint to put teen safety first
OpenAI Japan announces the Japan Teen Safety Blueprint, introducing stronger age protections, parental controls, and well-being safeguards for teens using generative AI.
Introducing GPT-5.4 mini and nano
GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.
Equipping workers with insights about compensation
New research shows Americans send nearly 3 million daily messages to ChatGPT asking about compensation and earnings, helping close the wage information gap.
Why Codex Security Doesn’t Include a SAST Report
A deep dive into why Codex Security doesn’t rely on traditional SAST, instead using AI-driven constraint reasoning and validation to find real vulnerabilities with fewer false positives.
Designing AI agents to resist prompt injection
How ChatGPT defends against prompt injection and social engineering by constraining risky actions and protecting sensitive data in agent workflows.
From model to agent: Equipping the Responses API with a computer environment
How OpenAI built an agent runtime using the Responses API, shell tool, and hosted containers to run secure, scalable agents with files, tools, and state.
Wayfair boosts catalog accuracy and support speed with OpenAI
Wayfair uses OpenAI models to improve ecommerce support and product catalog accuracy, automating ticket triage and enhancing millions of product attributes at scale.
Improving instruction hierarchy in frontier LLMs
IH-Challenge trains models to prioritize trusted instructions, improving instruction hierarchy, safety steerability, and resistance to prompt injection attacks.
New ways to learn math and science in ChatGPT
ChatGPT introduces interactive visual explanations for math and science, helping students explore formulas, variables, and concepts in real time.
OpenAI is acquiring Promptfoo, an AI security platform that helps enterprises identify and remediate vulnerabilities in AI systems during development.
Codex Security: now in research preview
Codex Security is an AI application security agent that analyzes project context to detect, validate, and patch complex vulnerabilities with higher confidence and less noise.
Gemma 4: Byte for byte, the most capable open models
Gemma 4: Our most intelligent open models to date, purpose-built for advanced reasoning and agentic workflows.
Gemini 3.1 Flash Live: Making audio AI more natural and reliable
Our latest voice model has improved precision and lower latency to make voice interactions more fluid, natural and precise.
Protecting people from harmful manipulation
Google DeepMind researches AI's harmful manipulation risks across areas like finance and health, leading to new safety measures.
Lyria 3 Pro: Create longer tracks in more Google products
Introducing Lyria 3 Pro, which unlocks longer tracks with structural awareness. We’re also bringing Lyria to more Google products and surfaces.
Measuring progress toward AGI: A cognitive framework
We’re introducing a framework to measure progress toward AGI, and launching a Kaggle hackathon to build the relevant evaluations.
From games to biology and beyond: 10 years of AlphaGo’s impact
Ten years since AlphaGo, we explore how it is catalyzing scientific discovery and paving a path to AGI.
Gemini 3.1 Flash-Lite: Built for intelligence at scale
Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.
Nano Banana 2: Combining Pro capabilities with lightning-fast speed
Our latest image generation model offers advanced world knowledge, production ready specs, subject consistency and more, all at Flash speed.
Gemini 3.1 Pro: A smarter model for your most complex tasks
3.1 Pro is designed for tasks where a simple answer isn’t enough.
A new way to express yourself: Gemini can now create music
The Gemini app now features our most advanced music generation model Lyria 3, empowering anyone to make 30-second tracks using text or images.
Accelerating discovery in India through AI-powered science and education
Google DeepMind brings National Partnerships for AI initiative to India, scaling AI for science and education
Gemini 3 Deep Think: Advancing science, research and engineering
Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think
Research papers point to the growing impact of Deep Think across fields
Project Genie: Experimenting with infinite, interactive worlds
Google AI Ultra subscribers in the U.S. can try out Project Genie, an experimental research prototype that lets you create and explore worlds.
D4RT: Teaching AI to see the world in four dimensions
D4RT: Unified, efficient 4D reconstruction and tracking up to 300x faster than prior methods.
Veo 3.1 Ingredients to Video: More consistency, creativity and control
Our latest Veo update generates lively, dynamic clips that feel natural and engaging — and supports vertical video generation.
Google's year in review: 8 areas with research breakthroughs in 2025
Google 2025 recap: Research breakthroughs of the year
Gemini 3 Flash: frontier intelligence built for speed
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.
Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.
Deepening our partnership with the UK AI Security Institute
Google DeepMind and UK AI Security Institute (AISI) strengthen collaboration on critical AI safety and security research
Deepening our partnership with the UK government to support prosperity and security in the AI era
FACTS Benchmark Suite: Systematically evaluating the factuality of large language models
Systematically evaluating the factuality of large language models with the FACTS Benchmark Suite.
Engineering more resilient crops for a warming climate
Scientists are using AlphaFold to strengthen a photosynthesis enzyme for resilient, heat-tolerant crops.
AlphaFold: Five years of impact
Explore how AlphaFold has accelerated science and fueled a global wave of biological discovery.
Revealing a key protein behind heart disease
AlphaFold has revealed the structure of a key protein behind heart disease
Google DeepMind and the DOE partner on Genesis, a new effort to accelerate science with AI.
ADeLe: Predicting and explaining AI performance across tasks
AI benchmarks report how large language models (LLMs) perform on specific tasks but provide little insight into their underlying capabilities that drive their performance. They do not explain failures or reliably pred...
AsgardBench: A benchmark for visually grounded interactive planning
Imagine a robot tasked with cleaning a kitchen. It needs to observe its environment, decide what to do, and adjust when things don’t go as expected, for example, when the mug it was tasked to wash is already clean, or...
GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation
Vision-language models (VLMs) use images and text to plan robot actions, but they still struggle to decide what actions to take and where to take them. Most systems split these decisions into two steps: a VLM generate...
Will machines ever be intelligent?
Are machines truly intelligent? AI researchers Subutai Ahmad and Nicolò Fusi join Doug Burger to compare transformer-based AI with the human brain, exploring continual learning, efficiency, and whether today’s models ...
Systematic debugging for AI agents: Introducing the AgentRx framework
As AI agents transition from simple chatbots to autonomous systems capable of managing cloud incidents, navigating complex web interfaces, and executing multi-step API workflows, a new challenge has emerged: transpare...
PlugMem: Transforming raw agent interactions into reusable knowledge
It seems counterintuitive: giving AI agents more memory can make them less effective. As interaction logs accumulate, they grow large, fill with irrelevant content, and become increasingly difficult to use. More memor...
Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model
We are pleased to announce Phi-4-reasoning-vision-15B, a 15 billion parameter open‑weight multimodal reasoning model, available through Microsoft Foundry, HuggingFace and GitHub (...
Trailer: The Shape of Things to Come
Microsoft research lead Doug Burger introduces his new podcast series, "The Shape of Things to Come", an exploration into the fundamental truths about AI and how the technology will reshape the future. The po...
CORPGEN advances AI agents for real work
By mid-morning, a typical knowledge worker is already juggling a client report, a budget spreadsheet, a slide deck, and an email backlog, all interdependent and all demanding attention at once. For AI agents to be gen...
Media Authenticity Methods in Practice: Capabilities, Limitations, and Directions
As synthetic media grows, verifying what’s real, and the origin of content, matters more than ever. Our latest report explores media integrity and authentication methods, their limits, and practical paths toward trust...
Understanding Convolutions on Graphs
Understanding the building blocks and design choices of graph neural networks.
A Gentle Introduction to Graph Neural Networks
What components are needed for building learning algorithms that leverage the structure and properties of graphs?
Adversarial Reprogramming of Neural Cellular Automata
Reprogramming Neural CA to exhibit novel behaviour, using adversarial attacks.
Weights in the final layer of common visual models appear as horizontal bands. We investigate how and why.
When a neural network layer is divided into multiple branches, neurons self-organize into coherent groupings.
Multimodal Neurons in Artificial Neural Networks
We report the existence of multimodal neurons in artificial neural networks, similar to those found in the human brain.
Neural Cellular Automata learn to generate textures, exhibiting surprising properties.
We present techniques for visualizing, contextualizing, and understanding neural network weights.
Reverse engineering the curve detection algorithm from InceptionV1 and reimplementing it from scratch.
A family of early-vision neurons reacting to directional transitions from high to low spatial frequency.
Naturally Occurring Equivariance in Neural Networks
Neural networks naturally learn many transformed copies of the same feature, connected by symmetric weights.
With diverse environments, we can analyze, diagnose and edit deep reinforcement learning models using attribution.
Communicating with Interactive Articles
Examining the design of interactive articles by synthesizing theory from disciplines such as education, journalism, and visualization.
Thread: Differentiable Self-organizing Systems
A collection of articles and comments with the goal of understanding how to design robust and general purpose self-organizing systems.
Training an end-to-end differentiable, self-organising cellular automata for classifying MNIST digits.
Part one of a three part deep dive into the curve neuron family.
Exploring Bayesian Optimization
How to tune hyperparameters for your machine learning model using Bayesian optimization.
An Overview of Early Vision in InceptionV1
An overview of all the neurons in the first five layers of InceptionV1, organized into a taxonomy of 'neuron groups.'
Visualizing Neural Networks with the Grand Tour
By focusing on linear dimensionality reduction, we show how to visualize many dynamic phenomena in neural networks.
What can we learn if we invest heavily in reverse engineering a single neural network?
Zoom In: An Introduction to Circuits
By studying the connections between neurons, we can find meaningful algorithms in the weights of neural networks.
Growing Neural Cellular Automata
Training an end-to-end differentiable, self-organising cellular automata model of morphogenesis, able to both grow and regenerate specific patterns.
Visualizing the Impact of Feature Attribution Baselines
Exploring the baseline input hyperparameter, and how it impacts interpretations of neural network behavior.
Computing Receptive Fields of Convolutional Neural Networks
Detailed derivations and open-source code to analyze the receptive fields of convnets.
The Paths Perspective on Value Learning
A closer look at how Temporal Difference Learning merges paths of experience for greater statistical efficiency
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features'
Six comments from the community and responses from the original authors
The main hypothesis in Ilyas et al. (2019) happens to be a special case of a more general principle that is commonly accepted in the robustness to distributional shift literature
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Robust Feature Leakage
An example project using webpack and svelte-loader and ejs to inline SVGs
Why AI Is Training on Its Own Garbage (and How to Fix It)
Deep Web Data Is the Gold We Can't Touch, Yet
Detecting Translation Hallucinations with Attention Misalignment
A low-budget way to get token-level uncertainty estimation for neural machine translations
How to Use Claude Code to Build a Minimum Viable Product
Learn how to effectively present product ideas by building MVPs with coding agents
Grounding Your LLM: A Practical Guide to RAG for Enterprise Knowledge Bases
A clear mental model and a practical foundation you can build on
Democratizing Marketing Mix Models (MMM) with Open Source and Gen AI
A practical system design combining open-source Bayesian MMM and GenAI for transparent, vendor-independent marketing analytics insights.
From 4 Weeks to 45 Minutes: Designing a Document Extraction System for 4,700+ PDFs
How a hybrid PyMuPDF + GPT-4 Vision pipeline replaced £8,000 in manual engineering effort, and why the latest models weren’t the answer
Context Engineering for AI Agents: A Deep Dive
How to optimize context, a precious finite resource for AI agents
Why do grand productivity promises never actually deliver? Is every product just bad, or is there something else hiding in the numbers?
The Geometry Behind the Dot Product: Unit Vectors, Projections, and Intuition
The geometric foundations you need to understand the dot product
How to Run Claude Code Agents in Parallel
Learn how to apply coding agents in parallel to work more efficiently
Behavior is the New Credential
We are living through a paradigm shift in how we prove we are who we say we are online. Instead of asking What do you know? (password, PIN, mother’s maiden name) or What do you look like? (Face ID, fingerprint) the qu...
Proxy-Pointer RAG: Achieving Vectorless Accuracy at Vector RAG Scale and Cost
A new way to build vector RAG: structure-aware and reasoning-capable
A Data Scientist’s Take on the $599 MacBook Neo
Why it doesn’t fit my workflow but still makes sense for beginners
Building a Python Workflow That Catches Bugs Before Production
Using modern tooling to identify defects earlier in the software lifecycle.
Building Robust Credit Scoring Models with Python
A Practical Guide to Measuring Relationships between Variables for Feature Selection in Credit Scoring.
DenseNet Paper Walkthrough: All Connected
When we try to train a very deep neural network model, one issue that we might encounter is the vanishing gradient problem. This is essentially a problem where the weight update of a model during training slows down o...
I Replaced Vector DBs with Google’s Memory Agent Pattern for my notes in Obsidian
Persistent AI memory without embeddings, Pinecone, or a PhD in similarity search.
Linear Regression Is Actually a Projection Problem (Part 2: From Projections to Predictions)
The Vector View of Least Squares.
How to Handle Classical Data in Quantum Models
Workflows and encoding techniques in quantum machine learning
Quantum Simulations with Python
Run Quantum Experiments with Qiskit-Aer
AI’s software development success and central management needs
A survey carried out by OutSystems, The State of AI Development 2026 [email wall], argues that AI has moved into early production phase for many enterprises, primarily inside the IT function. The survey was based on t...
Microsoft open-source toolkit secures AI agents at runtime
A new open-source toolkit from Microsoft focuses on runtime security to force strict governance onto enterprise AI agents. The release tackles a growing anxiety: autonomous language models are now executing code and h...
Asylon and Thrive Logic bring physical AI to enterprise perimeter security
Exciting times are ahead in the world of enterprise perimeter security with a new partnership between Thrive Logic, an AI agent-driven security and operational intelligence platform, and Asylon, a security robotics co...
Boomi calls it “data activation” and says it’s the missing step in every AI deployment
The failure mode for enterprise AI in 2026 is not what most people expected. It is not that the models are wrong, or that agents cannot reason, or that the technology is overhyped. The failure mode is that the data fe...
Anthropic’s refusal to arm AI is exactly why the UK wants it
The Anthropic UK expansion story is less about diplomatic courtship and more about what happens when a government punishes a company for having principles. In late February, US Defence Secretary Pete Hegseth gave Anth...
As AI agents take on more tasks, governance becomes a priority
AI systems are starting to move beyond simple responses. In many organisations, AI agents are now being tested to plan tasks, make decisions, and carry out actions with limited human input. It is no longer just about ...
KiloClaw targets shadow AI with autonomous agent governance
With the launch of KiloClaw, enterprises now have a tool to enforce governance over autonomous agents and manage shadow AI. While businesses spent the last year securing large language models and formalising vendor ag...
5 best practices to secure AI systems
A decade ago, it would have been hard to believe that artificial intelligence could do what it can do now. However, it is this same power that introduces a new attack surface that traditional security frameworks were ...
China’s Five-Year Plan details the targets for AI deployment
China has approved its 15th Five-Year Plan [PDF] setting out the country’s economic, education, social, and industrial priorities through to 2030. As might be expected, there is a significant number of references to A...
Autonomous AI systems depend on data governance
Much of the current focus on AI safety has centred on models – how they are trained and monitored. But as systems become more autonomous, attention is changing toward the data those systems depend on. If the data feed...
Experian uncovers fraud paradox in financial services’ AI adoption
The same technology that financial institutions are deploying is being weaponised against them. That is the core tension running through Experian’s 2026 Future of Fraud Forecast, and it’s a tension the company is in a pos...
KPMG: Inside the AI agent playbook driving enterprise margin gains
Global AI investment is accelerating, yet KPMG data shows the gap between enterprise AI spend and measurable business value is widening fast. The headline figure from KPMG’s first quarterly Global AI Pulse survey is b...
After Orthogonality: Virtue-Ethical Agency and AI Alignment
Preface: This essay argues that rational people don’t have goals, and that rational AIs shouldn’t have goals. Human actions are rational not because we direct them at...
"In projecting language back as the model for thought, we lose sight of the tacit embodied understanding that undergirds our intelligence." –Terry Winograd. The recent successes of generative AI...
Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research
What is the Role of Mathematics in Modern Machine Learning? The past decade has witnessed a shift in how progress is made in machine learning. Re...
What's Missing From LLM Chatbots: A Sense of Purpose
LLM-based chatbots’ capabilities have been advancing every month. These improvements are mostly measured by benchmarks like MMLU, HumanEval, and MATH (e.g. sonnet 3.5, gpt-4o). However, as these measures get more and ...
We Need Positive Visions for AI Grounded in Wellbeing
Introduction: Imagine yourself a decade ago, jumping directly into the present shock of conversing naturally with an encyclopedic AI that crafts images, writes code, and debates philosophy. Wo...
Financial Market Applications of LLMs
The AI revolution drove frenzied investment in both private and public companies and captured the public’s imagination in 2023. Transformational consumer products like ChatGPT are powered by Large Language Models (LLM...
A Brief Overview of Gender Bias in AI
A brief overview and discussion on gender bias in AI
Is Attention all you need? Mamba, a novel AI model based on State Space Models (SSMs), emerges as a formidable alternative to the widely used Transformer models, addressing their inefficiency in processing long sequences...
Car-GPT: Could LLMs finally make self-driving cars happen?
Exploring the utility of large language models in autonomous driving: Can they be trusted for self-driving cars, and what are the key challenges?
Do text embeddings perfectly encode text?
'Vec2text' can serve as a solution for accurately reverting embeddings back into text, thus highlighting the urgent need for revisiting security protocols around embedded data.
Have you ever trained a model you thought was good, but then it failed miserably when applied to real world data? If so, you’re in good company.
Deep learning for single-cell sequencing: a microscope to see the diversity of cells
On the pivotal role that Deep Learning has played as a key enabler for advancing single-cell sequencing technologies.
On fish counting – a complex sociotechnical problem in a field that is going through the process of digital transformation.
In this article, we will talk about classical computation: the kind of computation typically found in an undergraduate Computer Science course on Algorithms and Data Structures [1]. Think shortest path-findin...
The Artificiality of Alignment
This essay first appeared in Reboot (https://joinreboot.org/p/alignment). Credulous, breathless coverage of “AI existential risk” (abbreviated “x-risk”) has reached the mainstream. Who coul...