AI/ML Research Papers

A fast, reusable HTML report for lightweight Radar projects: entity distribution, article velocity, and source mix, alongside a clean reading list.

articles 319 entities 7 errors 40 Generated at 2026-04-09 01:43 UTC

Visuals

Chart.js dark editorial responsive

Entity Distribution

Top entities by frequency

Article Timeline

Daily volume inferred from article dates

Source Distribution

Share of articles by source

Data Freshness

Collection lag distribution

Entity Extraction Rate

Percentage with matched entities

Source Health

Article count by source (sorted)

Entity Heatmap (Top 15 × 14 days)

Entities

clickable pills top 24 shown
Research General 3516 Tasks 209 Research Areas 185 Techniques 117 Institutions 41 Venues 35 Key Researchers 2

Articles

cards source + date fast scan

Reading List

Click through to the original source

Pramana: Fine-Tuning Large Language Models for Epistemic Reasoning through Navya-Nyaya

arXiv:2604.04937v1 Announce Type: new Abstract: Large language models produce fluent text but struggle with systematic reasoning, often hallucinating confident but unfounded claims. When Apple researchers added irreleva...

source arXiv CS.AI date 2026-04-07 entities 5

Operational Noncommutativity in Sequential Metacognitive Judgments

arXiv:2604.04938v1 Announce Type: new Abstract: Metacognition, understood as the monitoring and regulation of one's own cognitive processes, is inherently sequential: an agent evaluates an internal state, updates it, an...

source arXiv CS.AI date 2026-04-07 entities 2

Proximity Measure of Information Object Features for Solving the Problem of Their Identification in Information Systems

arXiv:2604.04939v1 Announce Type: new Abstract: The paper considers a new quantitative-qualitative proximity measure for the features of information objects, where data enters a common information resource from several ...

source arXiv CS.AI date 2026-04-07 entities 1

ReVEL: Multi-Turn Reflective LLM-Guided Heuristic Evolution via Structured Performance Feedback

arXiv:2604.04940v1 Announce Type: new Abstract: Designing effective heuristics for NP-hard combinatorial optimization problems remains a challenging and expertise-intensive task. Existing applications of large language ...

source arXiv CS.AI date 2026-04-07 entities 3

Algebraic Structure Discovery for Real World Combinatorial Optimisation Problems: A General Framework from Abstract Algebra to Quotient Space Learning

arXiv:2604.04941v1 Announce Type: new Abstract: Many combinatorial optimisation problems hide algebraic structures that, once exposed, shrink the search space and improve the chance of finding the global optimal solutio...

source arXiv CS.AI date 2026-04-07 entities 1

PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing

arXiv:2604.05018v1 Announce Type: new Abstract: Synthesizing unstructured research materials into manuscripts is an essential yet under-explored challenge in AI-driven scientific discovery. Existing autonomous writers a...

source arXiv CS.AI date 2026-04-07 entities 3

Part-Level 3D Gaussian Vehicle Generation with Joint and Hinge Axis Estimation

arXiv:2604.05070v1 Announce Type: new Abstract: Simulation is essential for autonomous driving, yet current frameworks often model vehicles as rigid assets and fail to capture part-level articulation. With perception al...

source arXiv CS.AI date 2026-04-07 entities 2

MMORF: A Multi-agent Framework for Designing Multi-objective Retrosynthesis Planning Systems

arXiv:2604.05075v1 Announce Type: new Abstract: Multi-objective retrosynthesis planning is a critical chemistry task requiring dynamic balancing of quality, safety, and cost objectives. Language model-based multi-agent ...

source arXiv CS.AI date 2026-04-07 entities 4

MedGemma 1.5 Technical Report

arXiv:2604.05081v1 Announce Type: new Abstract: We introduce MedGemma 1.5 4B, the latest model in the MedGemma collection. MedGemma 1.5 expands on MedGemma 1 by integrating additional capabilities: high-dimensional medi...

source arXiv CS.AI date 2026-04-07 entities 3

Uncertainty-Guided Latent Diagnostic Trajectory Learning for Sequential Clinical Diagnosis

arXiv:2604.05116v1 Announce Type: new Abstract: Clinical diagnosis requires sequential evidence acquisition under uncertainty. However, most Large Language Model (LLM) based diagnostic systems assume fully observed pati...

source arXiv CS.AI date 2026-04-07 entities 3

Non-monotonic causal discovery with Kolmogorov-Arnold Fuzzy Cognitive Maps

arXiv:2604.05136v1 Announce Type: new Abstract: Fuzzy Cognitive Maps constitute a neuro-symbolic paradigm for modeling complex dynamic systems, widely adopted for their inherent interpretability and recurrent inference ...

source arXiv CS.AI date 2026-04-07 entities 3

A mathematical theory of evolution for self-designing AIs

arXiv:2604.05142v1 Announce Type: new Abstract: As artificial intelligence systems (AIs) become increasingly produced by recursive self-improvement, a form of evolution may emerge, in which the traits of AI systems are ...

source arXiv CS.AI date 2026-04-07 entities 2

IntentScore: Intent-Conditioned Action Evaluation for Computer-Use Agents

arXiv:2604.05157v1 Announce Type: new Abstract: Computer-Use Agents (CUAs) leverage large language models to execute GUI operations on desktop environments, yet they generate actions without evaluating action quality, l...

source arXiv CS.AI date 2026-04-07 entities 3

Bypassing the CSI Bottleneck: MARL-Driven Spatial Control for Reflector Arrays

arXiv:2604.05162v1 Announce Type: new Abstract: Reconfigurable Intelligent Surfaces (RIS) are pivotal for next-generation smart radio environments, yet their practical deployment is severely bottlenecked by the intracta...

source arXiv CS.AI date 2026-04-07 entities 3

Learning to Focus: CSI-Free Hierarchical MARL for Reconfigurable Reflectors

arXiv:2604.05165v1 Announce Type: new Abstract: Reconfigurable Intelligent Surfaces (RIS) has a potential to engineer smart radio environments for next-generation millimeter-wave (mmWave) networks. However, the prohibit...

source arXiv CS.AI date 2026-04-07 entities 3

Instruction-Tuned LLMs for Parsing and Mining Unstructured Logs on Leadership HPC Systems

arXiv:2604.05168v1 Announce Type: new Abstract: Leadership-class HPC systems generate massive volumes of heterogeneous, largely unstructured system logs. Because these logs originate from diverse software, hardware, and...

source arXiv CS.AI date 2026-04-07 entities 5

ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces

arXiv:2604.05172v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly deployed to automate productivity tasks (e.g., email, scheduling, document management), but evaluating them on live serv...

source arXiv CS.AI date 2026-04-07 entities 3

Attribution Bias in Large Language Models

arXiv:2604.05224v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly used to support search and information retrieval, it is critical that they accurately attribute content to its original au...

source arXiv CS.AI date 2026-04-07 entities 2

From Governance Norms to Enforceable Controls: A Layered Translation Method for Runtime Guardrails in Agentic AI

arXiv:2604.05229v1 Announce Type: new Abstract: Agentic AI systems plan, use tools, maintain state, and produce multi-step trajectories with external effects. Those properties create a governance problem that differs ma...

source arXiv CS.AI date 2026-04-07 entities 2

EAGLE: Edge-Aware Graph Learning for Proactive Delivery Delay Prediction in Smart Logistics Networks

arXiv:2604.05254v1 Announce Type: new Abstract: Modern logistics networks generate rich operational data streams at every warehouse node and transportation lane -- from order timestamps and routing records to shipping m...

source arXiv CS.AI date 2026-04-07 entities 4

Simulating the Evolution of Alignment and Values in Machine Intelligence

arXiv:2604.05274v1 Announce Type: new Abstract: Model alignment is currently applied in a vacuum, evaluated primarily through standardised benchmark performance. The purpose of this study is to examine the effects of al...

source arXiv CS.AI date 2026-04-07 entities 3

Pressure, What Pressure? Sycophancy Disentanglement in Language Models via Reward Decomposition

arXiv:2604.05279v1 Announce Type: new Abstract: Large language models exhibit sycophancy, the tendency to shift their stated positions toward perceived user preferences or authority cues regardless of evidence. Standard...

source arXiv CS.AI date 2026-04-07 entities 2

Breakthrough the Suboptimal Stable Point in Value-Factorization-Based Multi-Agent Reinforcement Learning

arXiv:2604.05297v1 Announce Type: new Abstract: Value factorization, a popular paradigm in MARL, faces significant theoretical and algorithmic bottlenecks: its tendency to converge to suboptimal solutions remains poorly...

source arXiv CS.AI date 2026-04-07 entities 3

Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills

arXiv:2604.05333v1 Announce Type: new Abstract: Skill usage has become a core component of modern agent systems and can substantially improve agents' ability to complete complex tasks. In real-world settings, where agen...

source arXiv CS.AI date 2026-04-07 entities 3

TRACE: Capability-Targeted Agentic Training

arXiv:2604.05336v1 Announce Type: new Abstract: Large Language Models (LLMs) deployed in agentic environments must exercise multiple capabilities across different task instances, where a capability is performing one or ...

source arXiv CS.AI date 2026-04-07 entities 4

Dynamic Agentic AI Expert Profiler System Architecture for Multidomain Intelligence Modeling

arXiv:2604.05345v1 Announce Type: new Abstract: In today's artificial intelligence driven world, modern systems communicate with people from diverse backgrounds and skill levels. For human-machine interaction to be mean...

source arXiv CS.AI date 2026-04-07 entities 4

From Retinal Evidence to Safe Decisions: RETINA-SAFE and ECRT for Hallucination Risk Triage in Medical LLMs

arXiv:2604.05348v1 Announce Type: new Abstract: Hallucinations in medical large language models (LLMs) remain a safety-critical issue, particularly when available evidence is insufficient or conflicting. We study this p...

source arXiv CS.AI date 2026-04-07 entities 3

ETR: Entropy Trend Reward for Efficient Chain-of-Thought Reasoning

arXiv:2604.05355v1 Announce Type: new Abstract: Chain-of-thought (CoT) reasoning improves large language model performance on complex tasks, but often produces excessively long and inefficient reasoning traces. Existing...

source arXiv CS.AI date 2026-04-07 entities 3

LatentAudit: Real-Time White-Box Faithfulness Monitoring for Retrieval-Augmented Generation with Verifiable Deployment

arXiv:2604.05358v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) mitigates hallucination but does not eliminate it: a deployed system must still decide, at inference time, whether its answer is actua...

source arXiv CS.AI date 2026-04-07 entities 3

TFRBench: A Reasoning Benchmark for Evaluating Forecasting Systems

arXiv:2604.05364v1 Announce Type: new Abstract: We introduce TFRBench, the first benchmark designed to evaluate the reasoning capabilities of forecasting systems. Traditionally, time-series forecasting has been evaluate...

source arXiv CS.AI date 2026-04-07 entities 3

TDA-RC: Task-Driven Alignment for Knowledge-Based Reasoning Chains in Large Language Models

arXiv:2604.04942v1 Announce Type: new Abstract: Enhancing the reasoning capability of large language models (LLMs) remains a core challenge in natural language processing. The Chain-of-Thought (CoT) paradigm dominates p...

source arXiv CS.CL date 2026-04-07 entities 3

The Illusion of Latent Generalization: Bi-directionality and the Reversal Curse

arXiv:2604.04943v1 Announce Type: new Abstract: The reversal curse describes a failure of autoregressive language models to retrieve a fact in reverse order (e.g., training on "$A > B$" but failing on "$B < A$"). Re...

source arXiv CS.CL date 2026-04-07 entities 2

Inclusion-of-Thoughts: Mitigating Preference Instability via Purifying the Decision Space

arXiv:2604.04944v1 Announce Type: new Abstract: Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs). However, LLMs remain vulnerable to the presence of plausible distractors. This o...

source arXiv CS.CL date 2026-04-07 entities 3

Phase-Associative Memory: Sequence Modeling in Complex Hilbert Space

arXiv:2604.05030v1 Announce Type: new Abstract: We present Phase-Associative Memory (PAM), a recurrent sequence model in which all representations are complex-valued, associations accumulate in a matrix state $S_{t}$ $\...

source arXiv CS.CL date 2026-04-07 entities 2

This Treatment Works, Right? Evaluating LLM Sensitivity to Patient Question Framing in Medical QA

arXiv:2604.05051v1 Announce Type: new Abstract: Patients are increasingly turning to large language models (LLMs) with medical questions that are complex and difficult to articulate clearly. However, LLMs are sensitive ...

source arXiv CS.CL date 2026-04-07 entities 4

Memory Dial: A Training Framework for Controllable Memorization in Language Models

arXiv:2604.05074v1 Announce Type: new Abstract: Memorization in language models is widely studied but remains difficult to isolate and control. Understanding when and what models memorize is essential for explaining the...

source arXiv CS.CL date 2026-04-07 entities 1

Beyond LLM-as-a-Judge: Deterministic Metrics for Multilingual Generative Text Evaluation

arXiv:2604.05083v1 Announce Type: new Abstract: While Large Language Models (LLMs) are increasingly adopted as automated judges for evaluating generated text, their outputs are often costly, and highly sensitive to prom...

source arXiv CS.CL date 2026-04-07 entities 3

Document Optimization for Black-Box Retrieval via Reinforcement Learning

arXiv:2604.05087v1 Announce Type: new Abstract: Document expansion is a classical technique for improving retrieval quality, and is attractive since it shifts computation offline, avoiding additional query-time processi...

source arXiv CS.CL date 2026-04-07 entities 4

Multilingual Language Models Encode Script Over Linguistic Structure

arXiv:2604.05090v1 Announce Type: new Abstract: Multilingual language models (LMs) organize representations for typologically and orthographically diverse languages into a shared parameter space, yet the nature of this ...

source arXiv CS.CL date 2026-04-07 entities 5

MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

arXiv:2604.05091v1 Announce Type: new Abstract: We present MegaTrain, a memory-centric system that efficiently trains 100B+ parameter large language models at full precision on a single GPU. Unlike traditional GPU-centr...

source arXiv CS.CL date 2026-04-07 entities 1

RAG or Learning? Understanding the Limits of LLM Adaptation under Continuous Knowledge Drift in the Real World

arXiv:2604.05096v1 Announce Type: new Abstract: Large language models (LLMs) acquire most of their knowledge during pretraining, which ties them to a fixed snapshot of the world and makes adaptation to continuously evol...

source arXiv CS.CL date 2026-04-07 entities 4

$\pi^2$: Structure-Originated Reasoning Data Improves Long-Context Reasoning Ability of Large Language Models

arXiv:2604.05114v1 Announce Type: new Abstract: We study a pipeline that curates reasoning data from initial structured data for improving long-context reasoning in large language models (LLMs). Our approach, $\pi^2$, c...

source arXiv CS.CL date 2026-04-07 entities 3

SenseAI: A Human-in-the-Loop Dataset for RLHF-Aligned Financial Sentiment Reasoning

arXiv:2604.05135v1 Announce Type: new Abstract: We introduce SenseAI, a human-in-the-loop (HITL) validated financial sentiment dataset designed to capture not only model outputs but the full reasoning process behind the...

source arXiv CS.CL date 2026-04-07 entities 4

EvolveRouter: Co-Evolving Routing and Prompt for Multi-Agent Question Answering

arXiv:2604.05149v1 Announce Type: new Abstract: Large language model agents often exhibit complementary strengths, making routing a promising approach for multi-agent question answering. However, existing routing method...

source arXiv CS.CL date 2026-04-07 entities 3

Just Pass Twice: Efficient Token Classification with LLMs for Zero-Shot NER

arXiv:2604.05158v1 Announce Type: new Abstract: Large language models encode extensive world knowledge valuable for zero-shot named entity recognition. However, their causal attention mechanism, where tokens attend only...

source arXiv CS.CL date 2026-04-07 entities 5

What Makes a Good Response? An Empirical Analysis of Quality in Qualitative Interviews

arXiv:2604.05163v1 Announce Type: new Abstract: Qualitative interviews provide essential insights into human experiences when they elicit high-quality responses. While qualitative and NLP researchers have proposed vario...

source arXiv CS.CL date 2026-04-07 entities 3

Gradient-Controlled Decoding: A Safety Guardrail for LLMs with Dual-Anchor Steering

arXiv:2604.05179v1 Announce Type: new Abstract: Large language models (LLMs) remain susceptible to jailbreak and direct prompt-injection attacks, yet the strongest defensive filters frequently over-refuse benign queries...

source arXiv CS.CL date 2026-04-07 entities 3

Improving Clinical Trial Recruitment using Clinical Narratives and Large Language Models

arXiv:2604.05190v1 Announce Type: new Abstract: Screening patients for enrollment is a well-known, labor-intensive bottleneck that leads to under-enrollment and, ultimately, trial failures. Recent breakthroughs in large...

source arXiv CS.CL date 2026-04-07 entities 4

Faster Superword Tokenization

arXiv:2604.05192v1 Announce Type: new Abstract: Byte Pair Encoding (BPE) is a widely used tokenization algorithm, whose tokens cannot extend across pre-tokenization boundaries, functionally limiting it to representing a...

source arXiv CS.CL date 2026-04-07 entities 1

XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

arXiv:2604.05242v1 Announce Type: new Abstract: Multi-bit watermarking has emerged as a promising solution for embedding imperceptible binary messages into Large Language Model (LLM)-generated text, enabling reliable at...

source arXiv CS.CL date 2026-04-07 entities 3

Exemplar Retrieval Without Overhypothesis Induction: Limits of Distributional Sequence Learning in Early Word Learning

arXiv:2604.05243v1 Announce Type: new Abstract: Background: Children do not simply learn that balls are round and blocks are square. They learn that shape is the kind of feature that tends to define object categories --...

source arXiv CS.CL date 2026-04-07 entities 2

Do Domain-specific Experts exist in MoE-based LLMs?

arXiv:2604.05267v1 Announce Type: new Abstract: In the era of Large Language Models (LLMs), the Mixture of Experts (MoE) architecture has emerged as an effective approach for training extremely large models with improve...

source arXiv CS.CL date 2026-04-07 entities 3

Beneath the Surface: Investigating LLMs' Capabilities for Communicating with Subtext

arXiv:2604.05273v1 Announce Type: new Abstract: Human communication is fundamentally creative, and often makes use of subtext -- implied meaning that goes beyond the literal content of the text. Here, we systematically ...

source arXiv CS.CL date 2026-04-07 entities 3

Right at My Level: A Unified Multilingual Framework for Proficiency-Aware Text Simplification

arXiv:2604.05302v1 Announce Type: new Abstract: Text simplification supports second language (L2) learning by providing comprehensible input, consistent with the Input Hypothesis. However, constructing personalized para...

source arXiv CS.CL date 2026-04-07 entities 4

DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects

arXiv:2604.05318v1 Announce Type: new Abstract: Harmful content detectors -- particularly disinformation classifiers -- are predominantly developed and evaluated on Standard American English (SAE), leaving their robustness to...

source arXiv CS.CL date 2026-04-07 entities 2

Human Values Matter: Investigating How Misalignment Shapes Collective Behaviors in LLM Agent Communities

arXiv:2604.05339v1 Announce Type: new Abstract: As LLMs become increasingly integrated into human society, evaluating their orientations on human values from social science has drawn growing attention. Nevertheless, it ...

source arXiv CS.CL date 2026-04-07 entities 3

DQA: Diagnostic Question Answering for IT Support

arXiv:2604.05350v1 Announce Type: new Abstract: Enterprise IT support interactions are fundamentally diagnostic: effective resolution requires iterative evidence gathering from ambiguous user reports to identify an unde...

source arXiv CS.CL date 2026-04-07 entities 4

ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving

arXiv:2604.05378v1 Announce Type: new Abstract: Recent progress in vision-language-action (VLA) models has enabled language-conditioned driving agents to execute natural-language navigation commands in closed-loop simul...

source arXiv CS.CL date 2026-04-07 entities 2

Confidence Should Be Calibrated More Than One Turn Deep

arXiv:2604.05397v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly applied in high-stakes domains such as finance, healthcare, and education, where reliable multi-turn interactions with users ...

source arXiv CS.CL date 2026-04-07 entities 2

Multi-Drafter Speculative Decoding with Alignment Feedback

arXiv:2604.05417v1 Announce Type: new Abstract: Speculative decoding (SD) accelerates large language model (LLM) inference by using a smaller model to draft future tokens, which are then verified by the target LLM. This...

source arXiv CS.CL date 2026-04-07 entities 3

Integrating Artificial Intelligence, Physics, and Internet of Things: A Framework for Cultural Heritage Conservation

arXiv:2604.03233v1 Announce Type: new Abstract: The conservation of cultural heritage increasingly relies on integrating technological innovation with domain expertise to ensure effective monitoring and predictive maint...

source arXiv CS.LG date 2026-04-06 entities 2

Scaling DPPs for RAG: Density Meets Diversity

arXiv:2604.03240v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding generation in external knowledge, yielding relevance responses that are aligned wit...

source arXiv CS.LG date 2026-04-06 entities 3

DRAFT: Task Decoupled Latent Reasoning for Agent Safety

arXiv:2604.03242v1 Announce Type: new Abstract: The advent of tool-using LLM agents shifts safety monitoring from output moderation to auditing long, noisy interaction trajectories, where risk-critical evidence is spars...

source arXiv CS.LG date 2026-04-06 entities 4

General Explicit Network (GEN): A novel deep learning architecture for solving partial differential equations

arXiv:2604.03321v1 Announce Type: new Abstract: Machine learning, especially physics-informed neural networks (PINNs) and their neural network variants, has been widely used to solve problems involving partial different...

source arXiv CS.LG date 2026-04-06 entities 2

Apparent Age Estimation: Challenges and Outcomes

arXiv:2604.03335v1 Announce Type: new Abstract: Apparent age estimation is a valuable tool for business personalization, yet current models frequently exhibit demographic biases. We review prior works on the DEX method ...

source arXiv CS.LG date 2026-04-06 entities 3

NativeTernary: A Self-Delimiting Binary Encoding with Unary Run-Length Hierarchy Markers for Ternary Neural Network Weights, Structured Data, and General Computing Infrastructure

arXiv:2604.03336v1 Announce Type: new Abstract: BitNet b1.58 (Ma et al., 2024) demonstrates that large language models can operate entirely on ternary weights {-1, 0, +1}, yet no native binary wire format exists for suc...

source arXiv CS.LG date 2026-04-06 entities 1

Towards Intelligent Energy Security: A Unified Spatio-Temporal and Graph Learning Framework for Scalable Electricity Theft Detection in Smart Grids

arXiv:2604.03344v1 Announce Type: new Abstract: Electricity theft and non-technical losses (NTLs) remain critical challenges in modern smart grids, causing significant economic losses and compromising grid reliability. ...

source arXiv CS.LG date 2026-04-06 entities 4

Hardware-Oriented Inference Complexity of Kolmogorov-Arnold Networks

arXiv:2604.03345v1 Announce Type: new Abstract: Kolmogorov-Arnold Networks (KANs) have recently emerged as a powerful architecture for various machine learning applications. However, their unique structure raises signif...

source arXiv CS.LG date 2026-04-06 entities 3

From Model-Based Screening to Data-Driven Surrogates: A Multi-Stage Workflow for Exploring Stochastic Agent-Based Models

arXiv:2604.03350v1 Announce Type: new Abstract: Systematic exploration of Agent-Based Models (ABMs) is challenged by the curse of dimensionality and their inherent stochasticity. We present a multi-stage pipeline integr...

source arXiv CS.LG date 2026-04-06 entities 2

The limits of bio-molecular modeling with large language models : a cross-scale evaluation

arXiv:2604.03361v1 Announce Type: new Abstract: The modeling of bio-molecular system across molecular scales remains a central challenge in scientific research. Large language models (LLMs) are increasingly applied to b...

source arXiv CS.LG date 2026-04-06 entities 4

Scalable Variational Bayesian Fine-Tuning of LLMs via Orthogonalized Low-Rank Adapters

arXiv:2604.03388v1 Announce Type: new Abstract: When deploying large language models (LLMs) to safety-critical applications, uncertainty quantification (UQ) is of utmost importance to self-assess the reliability of the ...

source arXiv CS.LG date 2026-04-06 entities 4

Beauty in the Eye of AI: Aligning LLMs and Vision Models with Human Aesthetics in Network Visualization

arXiv:2604.03417v1 Announce Type: new Abstract: Network visualization has traditionally relied on heuristic metrics, such as stress, under the assumption that optimizing them leads to aesthetic and informative layouts. ...

source arXiv CS.LG date 2026-04-06 entities 3

Adaptive Threshold-Driven Continuous Greedy Method for Scalable Submodular Optimization

arXiv:2604.03419v1 Announce Type: new Abstract: Submodular maximization under matroid constraints is a fundamental problem in combinatorial optimization with applications in sensing, data summarization, active learning,...

source arXiv CS.LG date 2026-04-06 entities 2

Adversarial Robustness of Deep State Space Models for Forecasting

arXiv:2604.03427v1 Announce Type: new Abstract: State-space model (SSM) for time-series forecasting have demonstrated strong empirical performance on benchmark datasets, yet their robustness under adversarial perturbati...

source arXiv CS.LG date 2026-04-06 entities 3

MetaSAEs: Joint Training with a Decomposability Penalty Produces More Atomic Sparse Autoencoder Latents

arXiv:2604.03436v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) are increasingly used for safety-relevant applications including alignment detection and model steering. These use cases require SAE latents to ...

source arXiv CS.LG date 2026-04-06 entities 3

Olmo Hybrid: From Theory to Practice and Back

arXiv:2604.03444v2 Announce Type: new Abstract: Recent work has demonstrated the potential of non-transformer language models, especially linear recurrent neural networks (RNNs) and hybrid models that mix recurrence and...

source arXiv CS.LG date 2026-04-06 entities 3

Neural Operators for Multi-Task Control and Adaptation

arXiv:2604.03449v1 Announce Type: new Abstract: Neural operator methods have emerged as powerful tools for learning mappings between infinite-dimensional function spaces, yet their potential in optimal control remains l...

source arXiv CS.LG date 2026-04-06 entities 4

Earth Embeddings Reveal Diverse Urban Signals from Space

arXiv:2604.03456v1 Announce Type: new Abstract: Conventional urban indicators derived from censuses, surveys, and administrative records are often costly, spatially inconsistent, and slow to update. Recent geospatial fo...

source arXiv CS.LG date 2026-04-06 entities 2

Super Agents and Confounders: Influence of surrounding agents on vehicle trajectory prediction

arXiv:2604.03463v1 Announce Type: new Abstract: In highly interactive driving scenes, trajectory prediction is conditioned on information from surrounding traffic participants such as cars and pedestrians. Our main cont...

source arXiv CS.LG date 2026-04-06 entities 2

Investigating Data Interventions for Subgroup Fairness: An ICU Case Study

arXiv:2604.03478v1 Announce Type: new Abstract: In high-stakes settings where machine learning models are used to automate decision-making about individuals, the presence of algorithmic bias can exacerbate systemic harm...

source arXiv CS.LG date 2026-04-06 entities 2

Improving Feasibility via Fast Autoencoder-Based Projections

arXiv:2604.03489v1 Announce Type: new Abstract: Enforcing complex (e.g., nonconvex) operational constraints is a critical challenge in real-world learning and control systems. However, existing methods struggle to effic...

source arXiv CS.LG date 2026-04-06 entities 2

Online learning of smooth functions on $\mathbb{R}$

arXiv:2604.03525v1 Announce Type: new Abstract: We study adversarial online learning of real-valued functions on $\mathbb{R}$. In each round the learner is queried at $x_t\in\mathbb{R}$, predicts $\hat y_t$, and then ob...

source arXiv CS.LG date 2026-04-06 entities 1

Choosing the Right Regularizer for Applied ML: Simulation Benchmarks of Popular Scikit-learn Regularization Frameworks

arXiv:2604.03541v2 Announce Type: new Abstract: This study surveys the historical development of regularization, tracing its evolution from stepwise regression in the 1960s to recent advancements in formal error control...

source arXiv CS.LG date 2026-04-06 entities 4

Simple yet Effective: Low-Rank Spatial Attention for Neural Operators

arXiv:2604.03582v1 Announce Type: new Abstract: Neural operators have emerged as data-driven surrogates for solving partial differential equations (PDEs), and their success hinges on efficiently modeling the long-range,...

source arXiv CS.LG date 2026-04-06 entities 2

Evaluation of Bagging Predictors with Kernel Density Estimation and Bagging Score

arXiv:2604.03599v1 Announce Type: new Abstract: For a larger set of predictions of several differently trained machine learning models, known as bagging predictors, the mean of all predictions is taken by default. Never...

source arXiv CS.LG date 2026-04-06 entities 3

BlazeFL: Fast and Deterministic Federated Learning Simulation

arXiv:2604.03606v1 Announce Type: new Abstract: Federated learning (FL) research increasingly relies on single-node simulations with hundreds or thousands of virtual clients, making both efficiency and reproducibility e...

source arXiv CS.LG date 2026-04-06 entities 3

Neural Global Optimization via Iterative Refinement from Noisy Samples

arXiv:2604.03614v1 Announce Type: new Abstract: Global optimization of black-box functions from noisy samples is a fundamental challenge in machine learning and scientific computing. Traditional methods such as Bayesian...

source arXiv CS.LG date 2026-04-06 entities 2

Algebraic Diversity: Group-Theoretic Spectral Estimation from Single Observations

arXiv:2604.03634v1 Announce Type: new Abstract: We prove that temporal averaging over multiple observations can be replaced by algebraic group action on a single observation for second-order statistical estimation. A Ge...

source arXiv CS.LG date 2026-04-06 entities 3

Delayed Homomorphic Reinforcement Learning for Environments with Delayed Feedback

arXiv:2604.03641v1 Announce Type: new Abstract: Reinforcement learning in real-world systems is often accompanied by delayed feedback, which breaks the Markov assumption and impedes both learning and control. Canonical ...

source arXiv CS.LG date 2026-04-06 entities 3

Automated Attention Pattern Discovery at Scale in Large Language Models

arXiv:2604.03764v1 Announce Type: new Abstract: Large language models have found success by scaling up capabilities to work in general settings. The same can unfortunately not be said for interpretability methods. The c...

source arXiv CS.LG date 2026-04-06 entities 5

Generative AI for Video Trailer Synthesis: From Extractive Heuristics to Autoregressive Creativity

arXiv:2604.04953v1 Announce Type: new Abstract: The domain of automatic video trailer generation is currently undergoing a profound paradigm shift, transitioning from heuristic-based extraction methods to deep generativ...

source arXiv CS.CV date 2026-04-07 entities 5

RCP: Representation Consistency Pruner for Mitigating Distribution Shift in Large Vision-Language Models

arXiv:2604.04972v1 Announce Type: new Abstract: Large Vision-Language Models (LVLMs) suffer from prohibitive inference costs due to the massive number of visual tokens processed by the language decoder. Existing pruning...

source arXiv CS.CV date 2026-04-07 entities 4

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

arXiv:2604.05015v1 Announce Type: new Abstract: With the rapid advancement of video understanding, existing benchmarks are becoming increasingly saturated, exposing a critical discrepancy between inflated leaderboard sc...

source arXiv CS.CV date 2026-04-07 entities 4

ID-Sim: An Identity-Focused Similarity Metric

arXiv:2604.05039v1 Announce Type: new Abstract: Humans have remarkable selective sensitivity to identities -- easily distinguishing between highly similar identities, even across significantly different contexts such as...

source arXiv CS.CV date 2026-04-07 entities 3

R3PM-Net: Real-time, Robust, Real-world Point Matching Network

arXiv:2604.05060v1 Announce Type: new Abstract: Accurate Point Cloud Registration (PCR) is an important task in 3D data processing, involving the estimation of a rigid transformation between two point clouds. While deep...

source arXiv CS.CV date 2026-04-07 entities 2

SVAgent: Storyline-Guided Long Video Understanding via Cross-Modal Multi-Agent Collaboration

arXiv:2604.05079v1 Announce Type: new Abstract: Video question answering (VideoQA) is a challenging task that requires integrating spatial, temporal, and semantic information to capture the complex dynamics of video seq...

source arXiv CS.CV date 2026-04-07 entities 3

Simultaneous Dual-View Mammogram Synthesis Using Denoising Diffusion Probabilistic Models

arXiv:2604.05110v1 Announce Type: new Abstract: Breast cancer screening relies heavily on mammography, where the craniocaudal (CC) and mediolateral oblique (MLO) views provide complementary information for diagnosis. Ho...

source arXiv CS.CV date 2026-04-07 entities 5

Watch Before You Answer: Learning from Visually Grounded Post-Training

arXiv:2604.05117v1 Announce Type: new Abstract: It is critical for vision-language models (VLMs) to comprehensively understand visual, temporal, and textual cues. However, despite rapid progress in multimodal modeling, ...

source arXiv CS.CV date 2026-04-07 entities 3

Lightweight True In-Pixel Encryption with FeFET Enabled Pixel Design for Secure Imaging

arXiv:2604.05147v1 Announce Type: new Abstract: Ensuring end-to-end security in image sensors has become essential as visual data can be exposed through multiple stages of the imaging pipeline. Advanced protection requi...

source arXiv CS.CV date 2026-04-07 entities 2

Modality-Aware and Anatomical Vector-Quantized Autoencoding for Multimodal Brain MRI

arXiv:2604.05171v1 Announce Type: new Abstract: Learning a robust Variational Autoencoder (VAE) is a fundamental step for many deep learning applications in medical image analysis, such as MRI synthesis. Existing brai...

source arXiv CS.CV date 2026-04-07 entities 3

MIRAGE: Benchmarking and Aligning Multi-Instance Image Editing

arXiv:2604.05180v1 Announce Type: new Abstract: Instruction-guided image editing has seen remarkable progress with models like FLUX.2 and Qwen-Image-Edit, yet they still struggle with complex scenarios with multiple sim...

source arXiv CS.CV date 2026-04-07 entities 3

LSRM: High-Fidelity Object-Centric Reconstruction via Scaled Context Windows

arXiv:2604.05182v1 Announce Type: new Abstract: We introduce the Large Sparse Reconstruction Model to study how scaling transformer context windows impacts feed-forward 3D reconstruction. Although recent object-centric ...

source arXiv CS.CV date 2026-04-07 entities 3

OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters for Diffusion Models

arXiv:2604.05183v1 Announce Type: new Abstract: In a rapidly growing field of model training there is a constant practical interest in parameter-efficient fine-tuning and various techniques that use a small amount of tr...

source arXiv CS.CV date 2026-04-07 entities 3

Integration of Object Detection and Small VLMs for Construction Safety Hazard Identification

arXiv:2604.05210v1 Announce Type: new Abstract: Accurate and timely identification of construction hazards around workers is essential for preventing workplace accidents. While large vision-language models (VLMs) demons...

source arXiv CS.CV date 2026-04-07 entities 3

Boxer: Robust Lifting of Open-World 2D Bounding Boxes to 3D

arXiv:2604.05212v1 Announce Type: new Abstract: Detecting and localizing objects in space is a fundamental computer vision problem. While much progress has been made to solve 2D object detection, 3D object localization ...

source arXiv CS.CV date 2026-04-07 entities 4

Hierarchical Mesh Transformers with Topology-Guided Pretraining for Morphometric Analysis of Brain Structures

arXiv:2604.05215v1 Announce Type: new Abstract: Representation learning on large-scale unstructured volumetric and surface meshes poses significant challenges in neuroimaging, especially when models must incorporate div...

source arXiv CS.CV date 2026-04-07 entities 4

Active Measurement of Two-Point Correlations

arXiv:2604.05227v1 Announce Type: new Abstract: Two-point correlation functions (2PCF) are widely used to characterize how points cluster in space. In this work, we study the problem of measuring the 2PCF over a large s...

source arXiv CS.CV date 2026-04-07 entities 1

Protecting and Preserving Protest Dynamics for Responsible Analysis

arXiv:2604.05256v1 Announce Type: new Abstract: Protest-related social media data are valuable for understanding collective action but inherently high-risk due to concerns surrounding surveillance, repression, and indiv...

source arXiv CS.CV date 2026-04-07 entities 1

Coverage Optimization for Camera View Selection

arXiv:2604.05259v1 Announce Type: new Abstract: What makes a good viewpoint? The quality of the data used to learn 3D reconstructions is crucial for enabling efficient and accurate scene modeling. We study the active vi...

source arXiv CS.CV date 2026-04-07 entities 2

Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking

arXiv:2604.05268v1 Announce Type: new Abstract: Multi-modal retrieval-augmented generation (MM-RAG) relies heavily on re-rankers to surface the most relevant evidence for image-question queries. However, standard re-ran...

source arXiv CS.CV date 2026-04-07 entities 3

Toward Unified Fine-Grained Vehicle Classification and Automatic License Plate Recognition

arXiv:2604.05271v1 Announce Type: new Abstract: Extracting vehicle information from surveillance images is essential for intelligent transportation systems, enabling applications such as traffic monitoring and criminal ...

source arXiv CS.CV date 2026-04-07 entities 4

From Measurement to Mitigation: Quantifying and Reducing Identity Leakage in Image Representation Encoders with Linear Subspace Removal

arXiv:2604.05296v1 Announce Type: new Abstract: Frozen visual embeddings (e.g., CLIP, DINOv2/v3, SSCD) power retrieval and integrity systems, yet their use on face-containing data is constrained by unmeasured identity l...

source arXiv CS.CV date 2026-04-07 entities 3

SmokeGS-R: Physics-Guided Pseudo-Clean 3DGS for Real-World Multi-View Smoke Restoration

arXiv:2604.05301v1 Announce Type: new Abstract: Real-world smoke simultaneously attenuates scene radiance, adds airlight, and destabilizes multi-view appearance consistency, making robust 3D reconstruction particularly ...

source arXiv CS.CV date 2026-04-07 entities 2

Indoor Asset Detection in Large Scale 360° Drone-Captured Imagery via 3D Gaussian Splatting

arXiv:2604.05316v1 Announce Type: new Abstract: We present an approach for object-level detection and segmentation of target indoor assets in 3D Gaussian Splatting (3DGS) scenes, reconstructed from 360° drone-captu...

source arXiv CS.CV date 2026-04-07 entities 3

VLA-InfoEntropy: A Training-Free Vision-Attention Information Entropy Approach for Vision-Language-Action Models Inference Acceleration and Success

arXiv:2604.05323v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models integrate visual perception, language understanding, and action decision-making for cross-modal semantic alignment, exhibiting broad ap...

source arXiv CS.CV date 2026-04-07 entities 2

Unsupervised Multi-agent and Single-agent Perception from Cooperative Views

arXiv:2604.05354v1 Announce Type: new Abstract: The LiDAR-based multi-agent and single-agent perception has shown promising performance in environmental understanding for robots and automated vehicles. However, there is...

source arXiv CS.CV date 2026-04-07 entities 3

GESS: Multi-cue Guided Local Feature Learning via Geometric and Semantic Synergy

arXiv:2604.05359v1 Announce Type: new Abstract: Robust local feature detection and description are foundational tasks in computer vision. Existing methods primarily rely on single appearance cues for modeling, leading t...

source arXiv CS.CV date 2026-04-07 entities 2

Rethinking IRSTD: Single-Point Supervision Guided Encoder-only Framework is Enough for Infrared Small Target Detection

arXiv:2604.05363v1 Announce Type: new Abstract: Infrared small target detection (IRSTD) aims to separate small targets from clutter backgrounds. Extensive research is dedicated to the pixel-level supervision-guided "enc...

source arXiv CS.CV date 2026-04-07 entities 2

3DTurboQuant: Training-Free Near-Optimal Quantization for 3D Reconstruction Models

arXiv:2604.05366v1 Announce Type: new Abstract: Every existing method for compressing 3D Gaussian Splatting, NeRF, or transformer-based 3D reconstructors requires learning a data-dependent codebook through per-scene fin...

source arXiv CS.CV date 2026-04-07 entities 2

UAVReason: A Unified, Large-Scale Benchmark for Multimodal Aerial Scene Reasoning and Generation

arXiv:2604.05377v1 Announce Type: new Abstract: Vision-Language models (VLMs) have demonstrated remarkable capability in ground-view visual understanding but often fracture when deployed on high-altitude Unmanned Aerial...

source arXiv CS.CV date 2026-04-07 entities 4

Constraint-Driven Warm-Freeze for Efficient Transfer Learning in Photovoltaic Systems

arXiv:2604.05807v1 Announce Type: new Abstract: Detecting cyberattacks in photovoltaic (PV) monitoring and MPPT control signals requires models that are robust to bias, drift, and transient spikes, yet lightweight enoug...

source arXiv CS.NE date 2026-04-07 entities 3

Activity-Dependent Plasticity in Morphogenetically-Grown Recurrent Networks

arXiv:2604.03386v1 Announce Type: cross Abstract: Developmental approaches to neural architecture search grow functional networks from compact genomes through self-organisation, but the resulting networks operate with f...

source arXiv CS.NE date 2026-04-07 entities 2

An Imbalanced Dataset with Multiple Feature Representations for Studying Quality Control of Next-Generation Sequencing

arXiv:2604.04981v1 Announce Type: cross Abstract: Next-generation sequencing (NGS) is a key technique for studying the DNA and RNA of organisms. However, identifying quality problems in NGS data across different experim...

source arXiv CS.NE date 2026-04-07 entities 3

Neural Network Pruning via QUBO Optimization

arXiv:2604.05856v1 Announce Type: cross Abstract: Neural network pruning can be formulated as a combinatorial optimization problem, yet most existing approaches rely on greedy heuristics that ignore complex interactions...

source arXiv CS.NE date 2026-04-07 entities 3

ECLIPSE: An Evolutionary Computation Library for Instrumentation Prototyping in Scientific Engineering

arXiv:2601.05098v3 Announce Type: replace Abstract: Designing scientific instrumentation often requires exploring large, highly constrained design spaces using computationally expensive physics simulations. These simula...

source arXiv CS.NE date 2026-04-07 entities 2

MOELIGA: a multi-objective evolutionary approach for feature selection with local improvement

arXiv:2603.20934v2 Announce Type: replace Abstract: Selecting the most relevant or informative features is a key issue in actual machine learning problems. Since an exhaustive search is not feasible even for a moderate ...

source arXiv CS.NE date 2026-04-07 entities 3

A systematic review of metaheuristics-based and machine learning-driven intrusion detection systems in IoT

arXiv:2506.00377v4 Announce Type: replace-cross Abstract: The widespread adoption of the Internet of Things (IoT) has raised a new challenge for developers since it is prone to known and unknown cyberattacks due to its ...

source arXiv CS.NE date 2026-04-07 entities 3

Identification and Inference in Nonlinear Dynamic Network Models

arXiv:2604.04961v1 Announce Type: new Abstract: We study identification and inference in nonlinear dynamic systems defined on unknown interaction networks. The system evolves through an unobserved dependence matrix gove...

source arXiv Stat.ML date 2026-04-07 entities 1

Learning Nonlinear Regime Transitions via Semi-Parametric State-Space Models

arXiv:2604.04963v1 Announce Type: new Abstract: We develop a semi-parametric state-space model for time-series data with latent regime transitions. Classical Markov-switching models use fixed parametric transition funct...

source arXiv Stat.ML date 2026-04-07 entities 2

StrADiff: A Structured Source-Wise Adaptive Diffusion Framework for Linear and Nonlinear Blind Source Separation

arXiv:2604.04973v1 Announce Type: new Abstract: This paper presents a Structured Source-Wise Adaptive Diffusion Framework for linear and nonlinear blind source separation. The framework interprets each latent dimension ...

source arXiv Stat.ML date 2026-04-07 entities 2

The Hiremath Early Detection (HED) Score: A Measure-Theoretic Evaluation Standard for Temporal Intelligence

arXiv:2604.04993v1 Announce Type: new Abstract: We introduce the Hiremath Early Detection (HED) Score, a principled, measure-theoretic evaluation criterion for quantifying the time-value of information in systems operat...

source arXiv Stat.ML date 2026-04-07 entities 2

Generative Path-Law Jump-Diffusion: Sequential MMD-Gradient Flows and Generalisation Bounds in Marcus-Signature RKHS

arXiv:2604.05008v1 Announce Type: new Abstract: This paper introduces a novel generative framework for synthesising forward-looking, càdlàg stochastic trajectories that are sequentially consistent with time-evolving...

source arXiv Stat.ML date 2026-04-07 entities 2

Individual-heterogeneous sub-Gaussian Mixture Models

arXiv:2604.05337v1 Announce Type: new Abstract: The classical Gaussian mixture model assumes homogeneity within clusters, an assumption that often fails in real-world data where observations naturally exhibit varying sc...

source arXiv Stat.ML date 2026-04-07 entities 2

MEC: Machine-Learning-Assisted Generalized Entropy Calibration for Semi-Supervised Mean Estimation

arXiv:2604.05446v1 Announce Type: new Abstract: Obtaining high-quality labels is costly, whereas unlabeled covariates are often abundant, motivating semi-supervised inference methods with reliable uncertainty quantifica...

source arXiv Stat.ML date 2026-04-07 entities 1

Hierarchical Contrastive Learning for Multimodal Data

arXiv:2604.05462v1 Announce Type: new Abstract: Multimodal representation learning is commonly built on a shared-private decomposition, treating latent information as either common to all modalities or specific to one. ...

source arXiv Stat.ML date 2026-04-07 entities 3

Efficient machine unlearning with minimax optimality

arXiv:2604.05669v1 Announce Type: new Abstract: There is a growing demand for efficient data removal to comply with regulations like the GDPR and to mitigate the influence of biased or corrupted data. This has motivated...

source arXiv Stat.ML date 2026-04-07 entities 1

Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

arXiv:2604.06032v1 Announce Type: new Abstract: Neural network classifiers trained with cross-entropy loss achieve strong predictive accuracy but lack the capability to provide inherent predictive uncertainty estimates,...

source arXiv Stat.ML date 2026-04-07 entities 4

Cactus: Accelerating Auto-Regressive Decoding with Constrained Acceptance Speculative Sampling

arXiv:2604.04987v1 Announce Type: cross Abstract: Speculative sampling (SpS) has been successful in accelerating the decoding throughput of auto-regressive large language models by leveraging smaller draft models. SpS s...

source arXiv Stat.ML date 2026-04-07 entities 2

Blind-Spot Mass: A Good-Turing Framework for Quantifying Deployment Coverage Risk in Machine Learning Systems

arXiv:2604.05057v1 Announce Type: cross Abstract: Blind-spot mass is a Good-Turing framework for quantifying deployment coverage risk in machine learning. In modern ML systems, operational state distributions are often ...

source arXiv Stat.ML date 2026-04-07 entities 3

fastml: Guarded Resampling Workflows for Safer Automated Machine Learning in R

arXiv:2604.05225v1 Announce Type: cross Abstract: Preprocessing leakage arises when scaling, imputation, or other data-dependent transformations are estimated before resampling, inflating apparent performance while rema...

source arXiv Stat.ML date 2026-04-07 entities 2

Jeffreys Flow: Robust Boltzmann Generators for Rare Event Sampling via Parallel Tempering Distillation

arXiv:2604.05303v1 Announce Type: cross Abstract: Sampling physical systems with rough energy landscapes is hindered by rare events and metastable trapping. While Boltzmann generators already offer a solution, their rel...

source arXiv Stat.ML date 2026-04-07 entities 2

Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models

arXiv:2604.05469v1 Announce Type: cross Abstract: We study language models as evolving model organisms and ask when autoregressive next-token learning selects for world-tracking representations. For any encoding of late...

source arXiv Stat.ML date 2026-04-07 entities 2

Optimal Centered Active Excitation in Linear System Identification

arXiv:2604.05518v1 Announce Type: cross Abstract: We propose an active learning algorithm for linear system identification with optimal centered noise excitation. Notably, our algorithm, based on ordinary least squares ...

source arXiv Stat.ML date 2026-04-07 entities 1

High-dimensional reliability-based design optimization using stochastic emulators

arXiv:2604.05759v1 Announce Type: cross Abstract: Reliability-based design optimization (RBDO) is traditionally formulated as a nested optimization and reliability problem. Although surrogate models are generally employ...

source arXiv Stat.ML date 2026-04-07 entities 2

Effective Dynamics and Transition Pathways from Koopman-Inspired Neural Learning of Collective Variables

arXiv:2604.05778v1 Announce Type: cross Abstract: The ISOKANN (Invariant Subspaces of Koopman Operators Learned by Artificial Neural Networks) framework provides a data-driven route to extract collective variables (CVs)...

source arXiv Stat.ML date 2026-04-07 entities 3

Bivariate Causal Discovery Using Rate-Distortion MDL: An Information Dimension Approach

arXiv:2604.05829v1 Announce Type: cross Abstract: Approaches to bivariate causal discovery based on the minimum description length (MDL) principle approximate the (uncomputable) Kolmogorov complexity of the models in ea...

source arXiv Stat.ML date 2026-04-07 entities 3

Expectation Maximization (EM) Converges for General Agnostic Mixtures

arXiv:2604.05842v1 Announce Type: cross Abstract: Mixture of linear regression is well studied in statistics and machine learning, where the data points are generated probabilistically using $k$ linear models. Algorithm...

source arXiv Stat.ML date 2026-04-07 entities 3

Data Distribution Valuation Using Generalized Bayesian Inference

arXiv:2604.05993v1 Announce Type: cross Abstract: We investigate the data distribution valuation problem, which aims to quantify the values of data distributions from their samples. This is a recently proposed problem t...

source arXiv Stat.ML date 2026-04-07 entities 2

Lipschitz regularity in Flow Matching and Diffusion Models: sharp sampling rates and functional inequalities

arXiv:2604.06065v1 Announce Type: cross Abstract: Under general assumptions on the target distribution $p^\star$, we establish a sharp Lipschitz regularity theory for flow-matching vector fields and diffusion-model scor...

source arXiv Stat.ML date 2026-04-07 entities 2

Sequential Audit Sampling with Statistical Guarantees

arXiv:2604.06116v1 Announce Type: cross Abstract: Financial statement auditing is conducted under a risk-based evidence approach to obtain reasonable assurance. In practice, auditors often perform additional sampling or...

source arXiv Stat.ML date 2026-04-07 entities 1

In-Place Test-Time Training

arXiv:2604.06169v1 Announce Type: cross Abstract: The static "train then deploy" paradigm fundamentally limits Large Language Models (LLMs) from dynamically adapting their weights in response to continuous streams of n...

source arXiv Stat.ML date 2026-04-07 entities 3

On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains

arXiv:2305.02657v5 Announce Type: replace Abstract: In this paper, we provide a strategy to determine the eigenvalue decay rate (EDR) of a large class of kernel functions defined on a general domain rather than $\mathbb...

source arXiv Stat.ML date 2026-04-07 entities 3

Gaussian process surrogate with physical law-corrected prior for multi-coupled PDEs defined on irregular geometry

arXiv:2509.02617v2 Announce Type: replace Abstract: Parametric partial differential equations (PDEs) serve as fundamental mathematical tools for modeling complex physical phenomena, yet repeated high-fidelity numerical ...

source arXiv Stat.ML date 2026-04-07 entities 2

Causal Effect Estimation with Learned Instrument Representations

arXiv:2602.10370v2 Announce Type: replace Abstract: Instrumental variable (IV) methods mitigate bias from unobserved confounding in observational causal inference but rely on the availability of a valid instrument, whic...

source arXiv Stat.ML date 2026-04-07 entities 2

Transfer Learning for Meta-analysis Under Covariate Shift

arXiv:2604.02656v2 Announce Type: replace Abstract: Randomized controlled trials often do not represent the populations where decisions are made, and covariate shift across studies can invalidate standard IPD meta-analy...

source arXiv Stat.ML date 2026-04-07 entities 3

Edgeworth Accountant: An Analytical Approach to Differential Privacy Composition

arXiv:2206.04236v3 Announce Type: replace-cross Abstract: In privacy-preserving data analysis, many procedures and algorithms are structured as compositions of multiple private building blocks. As such, an important que...

source arXiv Stat.ML date 2026-04-07 entities 2

Understanding Uncertainty Sampling via Equivalent Loss

arXiv:2307.02719v4 Announce Type: replace-cross Abstract: Uncertainty sampling is a prevalent active learning algorithm that queries sequentially the annotations of data samples which the current prediction model is unc...

source arXiv Stat.ML date 2026-04-07 entities 2

New ways to balance cost and reliability in the Gemini API

Gemini API Dials

source Google AI Blog date 2026-04-02 entities 2

Create, edit and share videos at no cost in Google Vids

Google Vids logo surrounded by various video editing UI

source Google AI Blog date 2026-04-02 entities 1

We’re creating a new satellite imagery map to help protect Brazil’s forests.

Google partnered with the Brazilian government on a satellite imagery map ...

source Google AI Blog date 2026-04-01 entities 1

The latest AI news we announced in March 2026

March 2026 AI Recap showing new updates

source Google AI Blog date 2026-04-01 entities 1

Build with Veo 3.1 Lite, our most cost-effective video generation model

Build with Veo 3.1 Lite

source Google AI Blog date 2026-03-31 entities 2

Watch James Manyika talk AI and creativity with LL COOL J.

In the latest episode of our Dialogues on Technology and Society series, LL COOL J si...

source Google AI Blog date 2026-03-26 entities 1

Transform your headphones into a live personal translator on iOS.

Google Translate’s Live translate with headphones is officially arriving on iOS! A...

source Google AI Blog date 2026-03-26 entities 1

Gemini 3.1 Flash Live: Making audio AI more natural and reliable

The Gemini emblem sits next to text reading 'Gemini 3.1 Flash Live'. The background has blue, multicolored dots making up a microphone icon

source Google AI Blog date 2026-03-26 entities 2

Search Live is expanding globally

A graphic with the words Search Live shown underneath a waveform icon. To the right, a phone shows the Google app with Search Live open. The camera is pointing at trees in a forest.

source Google AI Blog date 2026-03-26 entities 1

Build with Lyria 3, our newest music generation model

Google Lyria teaser

source Google AI Blog date 2026-03-25 entities 2

Lyria 3 Pro: Create longer tracks in more Google products

Sizzle video showing new capabilities from Lyria 3 Pro

source Google AI Blog date 2026-03-25 entities 1

Bringing the power of Personal Intelligence to more people

Bubble that says "Personal Intelligence" with Google G, Google Photos logo, and Gmail logo around it

source Google AI Blog date 2026-03-17

Our latest investment in open source security for the AI era

A collage including security icons and photos of hands clasped, a man looking at a computer, and two people pointing at something off camera

source Google AI Blog date 2026-03-17 entities 1

How AI is helping improve heart health in rural Australia

A doctor is sitting across a desk from a patient. The doctor is holding a tablet and a pen. Medical charts renderings are in the background.

source Google AI Blog date 2026-03-12 entities 1

Gemini in Google Sheets just achieved state-of-the-art performance.

Today we announced new beta features for Gemini in Sheets to help you create, orga...

source Google AI Blog date 2026-03-10 entities 3

How our open-source AI model SpeciesNet is helping to promote wildlife conservation

Photos of animals being identified by the SpeciesNet AI model

source Google AI Blog date 2026-03-06 entities 1

Ask a Techspert: How does AI understand my visual searches?

Mobile phone with a search bar that says "Ask anything"

source Google AI Blog date 2026-03-05 entities 1

The latest AI news we announced in February

an MP4 of a carousel with images reading "Gemini 3.1 Pro" and "Nano Banana 2"

source Google AI Blog date 2026-03-05 entities 2

Use Canvas in AI Mode to get things done and bring your ideas to life, right in Search.

Canvas in AI Mode is now available for everyone in the U.S. Plus, it can now help you draft documen...

source Google AI Blog date 2026-03-04 entities 1

Create new worlds in Project Genie with these 4 tips

A screen capture of Project Genie, an experimental interface showing a grid of circular images, many of which appear to be 360-degree views, with a large, central black globe labeled Create your own

source Google AI Blog date 2026-03-03 entities 1

The next phase of enterprise AI

OpenAI outlines the next phase of enterprise AI, as adoption accelerates across industries with Frontier, ChatGPT Enterprise, Codex, and company-wide AI agents.

source OpenAI Blog date 2026-04-08 entities 2

Introducing the Child Safety Blueprint

Discover OpenAI’s Child Safety Blueprint—a roadmap for building AI responsibly with safeguards, age-appropriate design, and collaboration to protect and empower young people online.

source OpenAI Blog date 2026-04-07 entities 2

Announcing the OpenAI Safety Fellowship

A pilot program to support independent safety and alignment research and develop the next generation of talent

source OpenAI Blog date 2026-04-06 entities 4

Industrial policy for the Intelligence Age

Explore our ambitious, people-first industrial policy ideas for the AI era—focused on expanding opportunity, sharing prosperity, and building resilient institutions as advanced intelligence evolves.

source OpenAI Blog date 2026-04-05 entities 1

OpenAI acquires TBPN

OpenAI acquires TBPN to accelerate global conversations around AI and support independent media, expanding dialogue with builders, businesses, and the broader tech community.

source OpenAI Blog date 2026-04-02 entities 3

Codex now offers more flexible pricing for teams

Codex now includes pay-as-you-go pricing for ChatGPT Business and Enterprise, providing teams a more flexible option to start and scale adoption.

source OpenAI Blog date 2026-04-02 entities 1

Gradient Labs gives every bank customer an AI account manager

Gradient Labs uses GPT-4.1 and GPT-5.4 mini and nano to power AI agents that automate banking support workflows with low latency and high reliability.

source OpenAI Blog date 2026-03-31 entities 2

Accelerating the next phase of AI

OpenAI raises $122 billion in new funding to expand frontier AI globally, invest in next-generation compute, and meet growing demand for ChatGPT, Codex, and enterprise AI.

source OpenAI Blog date 2026-03-31 entities 3

Helping disaster response teams turn AI into action across Asia

AI for Disaster Response in Asia: OpenAI Workshop with Gates Foundation

source OpenAI Blog date 2026-03-29 entities 3

STADLER reshapes knowledge work at a 230-year-old company

Learn how STADLER uses ChatGPT to transform knowledge work, saving time and accelerating productivity across 650 employees.

source OpenAI Blog date 2026-03-27 entities 1

Inside our approach to the Model Spec

Learn how OpenAI’s Model Spec serves as a public framework for model behavior, balancing safety, user freedom, and accountability as AI systems advance.

source OpenAI Blog date 2026-03-25 entities 2

Introducing the OpenAI Safety Bug Bounty program

OpenAI launches a Safety Bug Bounty program to identify AI abuse and safety risks, including agentic vulnerabilities, prompt injection, and data exfiltration.

source OpenAI Blog date 2026-03-24 entities 2

Helping developers build safer AI experiences for teens

OpenAI releases prompt-based teen safety policies for developers using gpt-oss-safeguard, helping moderate age-specific risks in AI systems.

source OpenAI Blog date 2026-03-24 entities 3

Update on the OpenAI Foundation

The OpenAI Foundation announces plans to invest at least $1 billion in curing diseases, economic opportunity, AI resilience, and community programs.

source OpenAI Blog date 2026-03-24 entities 2

Powering product discovery in ChatGPT

ChatGPT introduces richer, visually immersive shopping powered by the Agentic Commerce Protocol, enabling product discovery, side-by-side comparisons, and merchant integration.

source OpenAI Blog date 2026-03-24 entities 1

Creating with Sora Safely

To address the novel safety challenges posed by a state-of-the-art video model as well as a new social creation platform, we’ve built Sora 2 and the Sora app with safety at the foundation. Our approach is anchored in con...

source OpenAI Blog date 2026-03-22 entities 3

How we monitor internal coding agents for misalignment

How OpenAI uses chain-of-thought monitoring to study misalignment in internal coding agents—analyzing real-world deployments to detect risks and strengthen AI safety safeguards.

source OpenAI Blog date 2026-03-19 entities 3

OpenAI to acquire Astral

Accelerates Codex growth to power the next generation of Python developer tools

source OpenAI Blog date 2026-03-18 entities 3

OpenAI Japan announces Japan Teen Safety Blueprint to put teen safety first

OpenAI Japan announces the Japan Teen Safety Blueprint, introducing stronger age protections, parental controls, and well-being safeguards for teens using generative AI.

source OpenAI Blog date 2026-03-17 entities 2

Introducing GPT-5.4 mini and nano

GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.

source OpenAI Blog date 2026-03-17 entities 4

Equipping workers with insights about compensation

New research shows Americans send nearly 3 million daily messages to ChatGPT asking about compensation and earnings, helping close the wage information gap.

source OpenAI Blog date 2026-03-16 entities 1

Why Codex Security Doesn’t Include a SAST Report

A deep dive into why Codex Security doesn’t rely on traditional SAST, instead using AI-driven constraint reasoning and validation to find real vulnerabilities with fewer false positives.

source OpenAI Blog date 2026-03-15 entities 2

Designing AI agents to resist prompt injection

How ChatGPT defends against prompt injection and social engineering by constraining risky actions and protecting sensitive data in agent workflows.

source OpenAI Blog date 2026-03-11 entities 1

From model to agent: Equipping the Responses API with a computer environment

How OpenAI built an agent runtime using the Responses API, shell tool, and hosted containers to run secure, scalable agents with files, tools, and state.

source OpenAI Blog date 2026-03-11 entities 2

Wayfair boosts catalog accuracy and support speed with OpenAI

Wayfair uses OpenAI models to improve ecommerce support and product catalog accuracy, automating ticket triage and enhancing millions of product attributes at scale.

source OpenAI Blog date 2026-03-10 entities 2

Improving instruction hierarchy in frontier LLMs

IH-Challenge trains models to prioritize trusted instructions, improving instruction hierarchy, safety steerability, and resistance to prompt injection attacks.

source OpenAI Blog date 2026-03-10 entities 1

New ways to learn math and science in ChatGPT

ChatGPT introduces interactive visual explanations for math and science, helping students explore formulas, variables, and concepts in real time.

source OpenAI Blog date 2026-03-10 entities 2

OpenAI to acquire Promptfoo

OpenAI is acquiring Promptfoo, an AI security platform that helps enterprises identify and remediate vulnerabilities in AI systems during development.

source OpenAI Blog date 2026-03-09 entities 2

Codex Security: now in research preview

Codex Security is an AI application security agent that analyzes project context to detect, validate, and patch complex vulnerabilities with higher confidence and less noise.

source OpenAI Blog date 2026-03-06 entities 1

Gemma 4: Byte for byte, the most capable open models

Gemma 4: Our most intelligent open models to date, purpose-built for advanced reasoning and agentic workflows.

source DeepMind Blog date 2026-04-02 entities 2

Gemini 3.1 Flash Live: Making audio AI more natural and reliable

Our latest voice model has improved precision and lower latency to make voice interactions more fluid, natural and precise.

source DeepMind Blog date 2026-03-26 entities 2

Protecting people from harmful manipulation

Google DeepMind researches AI's harmful manipulation risks across areas like finance and health, leading to new safety measures.

source DeepMind Blog date 2026-03-25 entities 2

Lyria 3 Pro: Create longer tracks in more

Introducing Lyria 3 Pro, which unlocks longer tracks with structural awareness. We’re also bringing Lyria to more Google products and surfaces.

source DeepMind Blog date 2026-03-25

Measuring progress toward AGI: A cognitive framework

We’re introducing a framework to measure progress toward AGI, and launching a Kaggle hackathon to build the relevant evaluations.

source DeepMind Blog date 2026-03-17 entities 1

From games to biology and beyond: 10 years of AlphaGo’s impact

Ten years since AlphaGo, we explore how it is catalyzing scientific discovery and paving a path to AGI.

source DeepMind Blog date 2026-03-09 entities 1

Gemini 3.1 Flash-Lite: Built for intelligence at scale

Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.

source DeepMind Blog date 2026-03-03 entities 2

Nano Banana 2: Combining Pro capabilities with lightning-fast speed

Our latest image generation model offers advanced world knowledge, production ready specs, subject consistency and more, all at Flash speed.

source DeepMind Blog date 2026-02-26 entities 3

Gemini 3.1 Pro: A smarter model for your most complex tasks

3.1 Pro is designed for tasks where a simple answer isn’t enough.

source DeepMind Blog date 2026-02-19 entities 2

A new way to express yourself: Gemini can now create music

The Gemini app now features our most advanced music generation model Lyria 3, empowering anyone to make 30-second tracks using text or images.

source DeepMind Blog date 2026-02-18 entities 3

Accelerating discovery in India through AI-powered science and education

Google DeepMind brings National Partnerships for AI initiative to India, scaling AI for science and education

source DeepMind Blog date 2026-02-17 entities 3

Gemini 3 Deep Think: Advancing science, research and engineering

Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.

source DeepMind Blog date 2026-02-12 entities 4

Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Research papers point to the growing impact of Deep Think across fields

source DeepMind Blog date 2026-02-09 entities 2

Project Genie: Experimenting with infinite, interactive worlds

Google AI Ultra subscribers in the U.S. can try out Project Genie, an experimental research prototype that lets you create and explore worlds.

source DeepMind Blog date 2026-01-29 entities 1

D4RT: Teaching AI to see the world in four dimensions

D4RT: Unified, efficient 4D reconstruction and tracking up to 300x faster than prior methods.

source DeepMind Blog date 2026-01-16 entities 1

Veo 3.1 Ingredients to Video: More consistency, creativity and control

Our latest Veo update generates lively, dynamic clips that feel natural and engaging — and supports vertical video generation.

source DeepMind Blog date 2026-01-13 entities 2

Google's year in review: 8 areas with research breakthroughs in 2025

Google 2025 recap: Research breakthroughs of the year

source DeepMind Blog date 2025-12-23 entities 1

Gemini 3 Flash: frontier intelligence built for speed

Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.

source DeepMind Blog date 2025-12-17 entities 2

Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior

Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.

source DeepMind Blog date 2025-12-16 entities 2

Deepening our partnership with the UK AI Security Institute

Google DeepMind and UK AI Security Institute (AISI) strengthen collaboration on critical AI safety and security research

source DeepMind Blog date 2025-12-10 entities 3

Strengthening our partnership with the UK government to support prosperity and security in the AI era

Deepening our partnership with the UK government to support prosperity and security in the AI era

source DeepMind Blog date 2025-12-10 entities 1

FACTS Benchmark Suite: Systematically evaluating the factuality of large language models

Systematically evaluating the factuality of large language models with the FACTS Benchmark Suite.

source DeepMind Blog date 2025-12-09 entities 2

Engineering more resilient crops for a warming climate

Scientists are using AlphaFold to strengthen a photosynthesis enzyme for resilient, heat-tolerant crops.

source DeepMind Blog date 2025-12-04 entities 1

AlphaFold: Five years of impact

Explore how AlphaFold has accelerated science and fueled a global wave of biological discovery.

source DeepMind Blog date 2025-11-25 entities 2

Revealing a key protein behind heart disease

AlphaFold has revealed the structure of a key protein behind heart disease

source DeepMind Blog date 2025-11-25

Google DeepMind supports U.S. Department of Energy on Genesis: a national mission to accelerate innovation and scientific discovery

Google DeepMind and the DOE partner on Genesis, a new effort to accelerate science with AI.

source DeepMind Blog date 2025-11-24 entities 3

ADeLe: Predicting and explaining AI performance across tasks

AI benchmarks report how large language models (LLMs) perform on specific tasks but provide little insight into their underlying capabilities that drive their performance. They do not explain failures or reliably pred...

source Microsoft Research date 2026-04-01 entities 2

AsgardBench: A benchmark for visually grounded interactive planning

Imagine a robot tasked with cleaning a kitchen. It needs to observe its environment, decide what to do, and adjust when things don’t go as expected, for example, when the mug it was tasked to wash is already clean, or...

source Microsoft Research date 2026-03-26 entities 3

GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation

Vision-language models (VLMs) use images and text to plan robot actions, but they still struggle to decide what actions to take and where to take them. Most systems split these decisions into two steps: a VLM generate...

source Microsoft Research date 2026-03-26 entities 2

Will machines ever be intelligent?

Are machines truly intelligent? AI researchers Subutai Ahmad and Nicolò Fusi join Doug Burger to compare transformer-based AI with the human brain, exploring continual learning, efficiency, and whether today’s models ...

source Microsoft Research date 2026-03-23 entities 4

Systematic debugging for AI agents: Introducing the AgentRx framework

As AI agents transition from simple chatbots to autonomous systems capable of managing cloud incidents, navigating complex web interfaces, and executing multi-step API workflows, a new challenge has emerged: transpare...

source Microsoft Research date 2026-03-12 entities 2

PlugMem: Transforming raw agent interactions into reusable knowledge

It seems counterintuitive: giving AI agents more memory can make them less effective. As interaction logs accumulate, they grow large, fill with irrelevant content, and become increasingly difficult to use. More memor...

source Microsoft Research date 2026-03-10 entities 2

Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model

We are pleased to announce Phi-4-reasoning-vision-15B, a 15 billion parameter open‑weight multimodal reasoning model, available through Microsoft Foundry, HuggingFace and GitHub (...

source Microsoft Research date 2026-03-04 entities 4

Trailer: The Shape of Things to Come

Microsoft research lead Doug Burger introduces his new podcast series, "The Shape of Things to Come", an exploration into the fundamental truths about AI and how the technology will reshape the future. The po...

source Microsoft Research date 2026-03-03 entities 2

CORPGEN advances AI agents for real work

By mid-morning, a typical knowledge worker is already juggling a client report, a budget spreadsheet, a slide deck, and an email backlog, all interdependent and all demanding attention at once. For AI agents to be gen...

source Microsoft Research date 2026-02-26 entities 2

Media Authenticity Methods in Practice: Capabilities, Limitations, and Directions

As synthetic media grows, verifying what’s real, and the origin of content, matters more than ever. Our latest report explores media integrity and authentication methods, their limits, and practical paths toward trust...

source Microsoft Research date 2026-02-19 entities 2

Understanding Convolutions on Graphs

Understanding the building blocks and design choices of graph neural networks.

source Distill.pub date 2021-09-02 entities 2

A Gentle Introduction to Graph Neural Networks

What components are needed for building learning algorithms that leverage the structure and properties of graphs?

source Distill.pub date 2021-09-02 entities 2

Distill Hiatus

After five years, Distill will be taking a break.

source Distill.pub date 2021-07-02

Adversarial Reprogramming of Neural Cellular Automata

Reprogramming Neural CA to exhibit novel behaviour, using adversarial attacks.

source Distill.pub date 2021-05-06 entities 1

Weight Banding

Weights in the final layer of common visual models appear as horizontal bands. We investigate how and why.

source Distill.pub date 2021-04-08 entities 1

Branch Specialization

When a neural network layer is divided into multiple branches, neurons self-organize into coherent groupings.

source Distill.pub date 2021-04-05 entities 1

Multimodal Neurons in Artificial Neural Networks

We report the existence of multimodal neurons in artificial neural networks, similar to those found in the human brain.

source Distill.pub date 2021-03-04 entities 2

Self-Organising Textures

Neural Cellular Automata learn to generate textures, exhibiting surprising properties.

source Distill.pub date 2021-02-11 entities 1

Visualizing Weights

We present techniques for visualizing, contextualizing, and understanding neural network weights.

source Distill.pub date 2021-02-04 entities 1

Curve Circuits

Reverse engineering the curve detection algorithm from InceptionV1 and reimplementing it from scratch.

source Distill.pub date 2021-01-30 entities 1

High-Low Frequency Detectors

A family of early-vision neurons reacting to directional transitions from high to low spatial frequency.

source Distill.pub date 2021-01-27 entities 1

Naturally Occurring Equivariance in Neural Networks

Neural networks naturally learn many transformed copies of the same feature, connected by symmetric weights.

source Distill.pub date 2020-12-08 entities 2

Understanding RL Vision

With diverse environments, we can analyze, diagnose and edit deep reinforcement learning models using attribution.

source Distill.pub date 2020-11-17 entities 2

Communicating with Interactive Articles

Examining the design of interactive articles by synthesizing theory from disciplines such as education, journalism, and visualization.

source Distill.pub date 2020-09-11 entities 1

Thread: Differentiable Self-organizing Systems

A collection of articles and comments with the goal of understanding how to design robust and general purpose self-organizing systems.

source Distill.pub date 2020-08-27 entities 1

Self-classifying MNIST Digits

Training an end-to-end differentiable, self-organising cellular automata for classifying MNIST digits.

source Distill.pub date 2020-08-27 entities 1

Curve Detectors

Part one of a three-part deep dive into the curve neuron family.

source Distill.pub date 2020-06-17

Exploring Bayesian Optimization

How to tune hyperparameters for your machine learning model using Bayesian optimization.

source Distill.pub date 2020-05-05 entities 2

An Overview of Early Vision in InceptionV1

An overview of all the neurons in the first five layers of InceptionV1, organized into a taxonomy of 'neuron groups.'

source Distill.pub date 2020-04-01 entities 1

Visualizing Neural Networks with the Grand Tour

By focusing on linear dimensionality reduction, we show how to visualize many dynamic phenomena in neural networks.

source Distill.pub date 2020-03-16 entities 2

Thread: Circuits

What can we learn if we invest heavily in reverse engineering a single neural network?

source Distill.pub date 2020-03-10 entities 1

Zoom In: An Introduction to Circuits

By studying the connections between neurons, we can find meaningful algorithms in the weights of neural networks.

source Distill.pub date 2020-03-10 entities 2

Growing Neural Cellular Automata

Training an end-to-end differentiable, self-organising cellular automata model of morphogenesis, able to both grow and regenerate specific patterns.

source Distill.pub date 2020-02-11 entities 1

Visualizing the Impact of Feature Attribution Baselines

Exploring the baseline input hyperparameter, and how it impacts interpretations of neural network behavior.

source Distill.pub date 2020-01-10 entities 2

Computing Receptive Fields of Convolutional Neural Networks

Detailed derivations and open-source code to analyze the receptive fields of convnets.

source Distill.pub date 2019-11-04 entities 2

The Paths Perspective on Value Learning

A closer look at how Temporal Difference Learning merges paths of experience for greater statistical efficiency

source Distill.pub date 2019-09-30 entities 1

A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features'

Six comments from the community and responses from the original authors

source Distill.pub date 2019-08-06 entities 1

A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarial Example Researchers Need to Expand What is Meant by 'Robustness'

The main hypothesis in Ilyas et al. (2019) happens to be a special case of a more general principle that is commonly accepted in the robustness to distributional shift literature

source Distill.pub date 2019-08-06 entities 1

A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Robust Feature Leakage

source Distill.pub date 2019-08-06 entities 1

A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Two Examples of Useful, Non-Robust Features

source Distill.pub date 2019-08-06 entities 1

Why AI Is Training on Its Own Garbage (and How to Fix It)

Deep Web Data Is the Gold We Can't Touch, Yet

source Towards Data Science date 2026-04-08 entities 2

Detecting Translation Hallucinations with Attention Misalignment

A low-budget way to get token-level uncertainty estimation for neural machine translations

source Towards Data Science date 2026-04-08 entities 3

How to Use Claude Code to Build a Minimum Viable Product

Learn how to effectively present product ideas by building MVPs with coding agents

source Towards Data Science date 2026-04-08 entities 3

Grounding Your LLM: A Practical Guide to RAG for Enterprise Knowledge Bases

A clear mental model and a practical foundation you can build on

source Towards Data Science date 2026-04-08 entities 4

Democratizing Marketing Mix Models (MMM) with Open Source and Gen AI

A practical system design combining open-source Bayesian MMM and GenAI for transparent, vendor independent marketing analytics insights.

source Towards Data Science date 2026-04-07 entities 2

From 4 Weeks to 45 Minutes: Designing a Document Extraction System for 4,700+ PDFs

How a hybrid PyMuPDF + GPT-4 Vision pipeline replaced £8,000 in manual engineering effort, and why the latest models weren’t the answer

source Towards Data Science date 2026-04-07 entities 3

Context Engineering for AI Agents: A Deep Dive

How to optimize context, a precious finite resource for AI agents

source Towards Data Science date 2026-04-07 entities 2

The Arithmetic of Productivity Boosts: Why Does a “40% Increase in Productivity” Never Actually Work?

Why do grand productivity promises never actually deliver? Is every product just bad, or is there something else hiding in the numbers?

source Towards Data Science date 2026-04-07 entities 2

The Geometry Behind the Dot Product: Unit Vectors, Projections, and Intuition

The geometric foundations you need to understand the dot product

source Towards Data Science date 2026-04-06 entities 2

How to Run Claude Code Agents in Parallel

Learn how to apply coding agents in parallel to work more efficiently

source Towards Data Science date 2026-04-06 entities 3

Behavior is the New Credential

We are living through a paradigm shift in how we prove we are who we say we are online. Instead of asking What do you know? (password, PIN, mother’s maiden name) or What do you look like? (Face ID, fingerprint) the qu...

source Towards Data Science date 2026-04-06 entities 2

Proxy-Pointer RAG: Achieving Vectorless Accuracy at Vector RAG Scale and Cost

A new way to build vector RAG: structure-aware and reasoning-capable

source Towards Data Science date 2026-04-05 entities 4

A Data Scientist’s Take on the $599 MacBook Neo

Why it doesn’t fit my workflow but still makes sense for beginners

source Towards Data Science date 2026-04-05 entities 2

Building a Python Workflow That Catches Bugs Before Production

Using modern tooling to identify defects earlier in the software lifecycle.

source Towards Data Science date 2026-04-04 entities 2

Building Robust Credit Scoring Models with Python

A Practical Guide to Measuring Relationships between Variables for Feature Selection in Credit Scoring.

source Towards Data Science date 2026-04-04 entities 2

DenseNet Paper Walkthrough: All Connected

When we try to train a very deep neural network model, one issue that we might encounter is the vanishing gradient problem. This is essentially a problem where the weight update of a model during training slows down o...

source Towards Data Science date 2026-04-03 entities 2

I Replaced Vector DBs with Google’s Memory Agent Pattern for my notes in Obsidian

Persistent AI memory without embeddings, Pinecone, or a PhD in similarity search.

source Towards Data Science date 2026-04-03 entities 2

Linear Regression Is Actually a Projection Problem (Part 2: From Projections to Predictions)

The Vector View of Least Squares.

source Towards Data Science date 2026-04-02 entities 3

How to Handle Classical Data in Quantum Models

Workflows and encoding techniques in quantum machine learning

source Towards Data Science date 2026-04-02 entities 3

Quantum Simulations with Python

Run Quantum Experiments with Qiskit-Aer

source Towards Data Science date 2026-04-02 entities 2

AI’s software development success and central management needs

A survey carried out by OutSystems, The State of AI Development 2026 [email wall], argues that AI has moved into early production phase for many enterprises, primarily inside the IT function. The survey was based on t...

source AI News date 2026-04-08 entities 1

Microsoft open-source toolkit secures AI agents at runtime

A new open-source toolkit from Microsoft focuses on runtime security to force strict governance onto enterprise AI agents. The release tackles a growing anxiety: autonomous language models are now executing code and h...

source AI News date 2026-04-08 entities 1

Asylon and Thrive Logic bring physical AI to enterprise perimeter security

Exciting times are ahead in the world of enterprise perimeter security with a new partnership between Thrive Logic, an AI agent-driven security and operational intelligence platform, and Asylon, a security robotics co...

source AI News date 2026-04-07 entities 2

Boomi calls it “data activation” and says it’s the missing step in every AI deployment

The failure mode for enterprise AI in 2026 is not what most people expected. It is not that the models are wrong, or that agents cannot reason, or that the technology is overhyped. The failure mode is that the data fe...

source AI News date 2026-04-07 entities 1

Anthropic’s refusal to arm AI is exactly why the UK wants it

The Anthropic UK expansion story is less about diplomatic courtship and more about what happens when a government punishes a company for having principles. In late February, US Defence Secretary Pete Hegseth gave Anth...

source AI News date 2026-04-07 entities 4

As AI agents take on more tasks, governance becomes a priority

AI systems are starting to move beyond simple responses. In many organisations, AI agents are now being tested to plan tasks, make decisions, and carry out actions with limited human input. It is no longer just about ...

source AI News date 2026-04-06 entities 1

KiloClaw targets shadow AI with autonomous agent governance

With the launch of KiloClaw, enterprises now have a tool to enforce governance over autonomous agents and manage shadow AI. While businesses spent the last year securing large language models and formalising vendor ag...

source AI News date 2026-04-02 entities 1

5 best practices to secure AI systems

A decade ago, it would have been hard to believe that artificial intelligence could do what it can do now. However, it is this same power that introduces a new attack surface that traditional security frameworks were ...

source AI News date 2026-04-02 entities 1

China’s Five-Year Plan details the targets for AI deployment

China has approved its 15th Five-Year Plan [PDF] setting out the country’s economic, education, social, and industrial priorities through to 2030. As might be expected, there is a significant number of references to A...

source AI News date 2026-04-02 entities 1

Autonomous AI systems depend on data governance

Much of the current focus on AI safety has centred on models – how they are trained and monitored. But as systems become more autonomous, attention is changing toward the data those systems depend on. If the data feed...

source AI News date 2026-04-02 entities 2

Experian uncovers fraud paradox in financial services’ AI adoption

The same technology that financial institutions are deploying is being weaponised against them. That is the core tension running through Experian’s 2026 Future of Fraud Forecast, and it’s a tension the company is in a pos...

source AI News date 2026-04-02 entities 1

KPMG: Inside the AI agent playbook driving enterprise margin gains

Global AI investment is accelerating, yet KPMG data shows the gap between enterprise AI spend and measurable business value is widening fast. The headline figure from KPMG’s first quarterly Global AI Pulse survey is b...

source AI News date 2026-04-01 entities 1

After Orthogonality: Virtue-Ethical Agency and AI Alignment

Preface: This essay argues that rational people don’t have goals, and that rational AIs shouldn’t have goals. Human actions are rational not because we direct them at...

source The Gradient date 2026-02-18 entities 3

AGI Is Not Multimodal

"In projecting language back as the model for thought, we lose sight of the tacit embodied understanding that undergirds our intelligence." –Terry Winograd. The recent successes of generative AI...

source The Gradient date 2025-06-04 entities 2

Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research

What is the Role of Mathematics in Modern Machine Learning? The past decade has witnessed a shift in how progress is made in machine learning. Re...

source The Gradient date 2024-11-16 entities 2

What's Missing From LLM Chatbots: A Sense of Purpose

LLM-based chatbots’ capabilities have been advancing every month. These improvements are mostly measured by benchmarks like MMLU, HumanEval, and MATH (e.g. sonnet 3.5, gpt-4o). However, as these measures get more and ...

source The Gradient date 2024-09-09 entities 2

We Need Positive Visions for AI Grounded in Wellbeing

Introduction: Imagine yourself a decade ago, jumping directly into the present shock of conversing naturally with an encyclopedic AI that crafts images, writes code, and debates philosophy. Wo...

source The Gradient date 2024-08-03 entities 1

Financial Market Applications of LLMs

The AI revolution drove frenzied investment in both private and public companies and captured the public’s imagination in 2023. Transformational consumer products like ChatGPT are powered by Large Language Models (LLM...

source The Gradient date 2024-04-20 entities 1

A Brief Overview of Gender Bias in AI

A brief overview and discussion on gender bias in AI

source The Gradient date 2024-04-08 entities 1

Mamba Explained

Is Attention all you need? Mamba, a novel AI model based on State Space Models (SSMs), emerges as a formidable alternative to the widely used Transformer models, addressing their inefficiency in processing long sequences...

source The Gradient date 2024-03-27 entities 2

Car-GPT: Could LLMs finally make self-driving cars happen?

Exploring the utility of large language models in autonomous driving: Can they be trusted for self-driving cars, and what are the key challenges?

source The Gradient date 2024-03-08 entities 2

Do text embeddings perfectly encode text?

'Vec2text' can serve as a solution for accurately reverting embeddings back into text, thus highlighting the urgent need for revisiting security protocols around embedded data.

source The Gradient date 2024-03-05 entities 1

Why Doesn’t My Model Work?

Have you ever trained a model you thought was good, but then it failed miserably when applied to real world data? If so, you’re in good company.

source The Gradient date 2024-02-24 entities 1

Deep learning for single-cell sequencing: a microscope to see the diversity of cells

On the pivotal role that Deep Learning has played as a key enabler for advancing single-cell sequencing technologies.

source The Gradient date 2024-01-13 entities 2

Salmon in the Loop

On fish counting – a complex sociotechnical problem in a field that is going through the process of digital transformation.

source The Gradient date 2023-12-16

Neural algorithmic reasoning

In this article, we will talk about classical computation: the kind of computation typically found in an undergraduate Computer Science course on Algorithms and Data Structures [1]. Think shortest path-findin...

source The Gradient date 2023-10-14 entities 3

The Artificiality of Alignment

This essay first appeared in Reboot (https://joinreboot.org/p/alignment). Credulous, breathless coverage of “AI existential risk” (abbreviated “x-risk”) has reached the mainstream. Who coul...

source The Gradient date 2023-10-07 entities 2