Get trending papers in your email inbox once a day!
Get trending papers in your email inbox!
SubscribeThe effects of AGN feedback on the structural and dynamical properties of Milky Way-mass galaxies in cosmological simulations
Feedback from active galactic nuclei (AGN) has become established as a fundamental process in the evolution of the most massive galaxies. Its impact on Milky Way (MW)-mass systems, however, remains comparatively unexplored. In this work, we use the Auriga simulations to probe the impact of AGN feedback on the dynamical and structural properties of galaxies, focussing on the bar, bulge, and disc. We analyse three galaxies -- two strongly and one unbarred/weakly barred -- using three setups: (i) the fiducial Auriga model, which includes both radio and quasar mode feedback, (ii) a setup with no radio mode, and (iii) one with neither the radio nor the quasar mode. When removing the radio mode, gas in the circumgalactic medium cools more efficiently and subsequently settles in an extended disc, with little effect on the inner disc. Contrary to previous studies, we find that although the removal of the quasar mode results in more massive central components, these are in the form of compact discs, rather than spheroidal bulges. Therefore, galaxies without quasar mode feedback are more baryon-dominated and thus prone to forming stronger and shorter bars, which reveals an anti-correlation between the ejective nature of AGN feedback and bar strength. Hence, we report that the effect of AGN feedback (i.e. ejective or preventive) can significantly alter the dynamical properties of MW-like galaxies. Therefore, the observed dynamical and structural properties of MW-mass galaxies can be used as additional constraints for calibrating the efficiency of AGN feedback models.
Metrological detection of multipartite entanglement through dynamical symmetries
Multipartite entanglement, characterized by the quantum Fisher information (QFI), plays a central role in quantum-enhanced metrology and understanding quantum many-body physics. With a dynamical generalization of the Mazur-Suzuki relations, we provide a rigorous lower bound on the QFI for the thermal Gibbs states in terms of dynamical symmetries, i.e., operators with periodic time dependence. We demonstrate that this bound can be saturated when considering a complete set of dynamical symmetries. Moreover, this lower bound with dynamical symmetries can be generalized to the QFI matrix and to the QFI for the thermal pure states, predicted by the eigenstate thermalization hypothesis. Our results reveal a new perspective to detect multipartite entanglement and other generalized variances in an equilibrium system, from its nonstationary dynamical properties, and is promising for studying emergent nonequilibrium many-body physics.
Demystifying the Token Dynamics of Deep Selective State Space Models
Selective state space models (SSM), such as Mamba, have gained prominence for their effectiveness in modeling sequential data. Despite their outstanding empirical performance, a comprehensive theoretical understanding of deep selective SSM remains elusive, hindering their further development and adoption for applications that need high fidelity. In this paper, we investigate the dynamical properties of tokens in a pre-trained Mamba model. In particular, we derive the dynamical system governing the continuous-time limit of the Mamba model and characterize the asymptotic behavior of its solutions. In the one-dimensional case, we prove that only one of the following two scenarios happens: either all tokens converge to zero, or all tokens diverge to infinity. We provide criteria based on model parameters to determine when each scenario occurs. For the convergent scenario, we empirically verify that this scenario negatively impacts the model's performance. For the divergent scenario, we prove that different tokens will diverge to infinity at different rates, thereby contributing unequally to the updates during model training. Based on these investigations, we propose two refinements for the model: excluding the convergent scenario and reordering tokens based on their importance scores, both aimed at improving practical performance. Our experimental results validate these refinements, offering insights into enhancing Mamba's effectiveness in real-world applications.
MSA-3D: Metallicity Gradients in Galaxies at $z\sim1$ with JWST/NIRSpec Slit-stepping Spectroscopy
The radial gradient of gas-phase metallicity is a powerful probe of the chemical and structural evolution of star-forming galaxies, closely tied to disk formation and gas kinematics in the early universe. We present spatially resolved chemical and dynamical properties for a sample of 25 galaxies at 0.5 lesssim z lesssim 1.7 from the \msasd survey. These innovative observations provide 3D spectroscopy of galaxies at a spatial resolution approaching JWST's diffraction limit and a high spectral resolution of Rsimeq2700. The metallicity gradients measured in our galaxy sample range from -0.03 to 0.02 dex~kpc^{-1}. Most galaxies exhibit negative or flat radial gradients, indicating lower metallicity in the outskirts or uniform metallicity throughout the entire galaxy. We confirm a tight relationship between stellar mass and metallicity gradient at zsim1 with small intrinsic scatter of 0.02 dex~kpc^{-1}. Our results indicate that metallicity gradients become increasingly negative as stellar mass increases, likely because the more massive galaxies tend to be more ``disky". This relationship is consistent with the predictions from cosmological hydrodynamic zoom-in simulations with strong stellar feedback. This work presents the effort to harness the multiplexing capability of JWST NIRSpec/MSA in slit-stepping mode to map the chemical and kinematic profiles of high-redshift galaxies in large samples and at high spatial and spectral resolution.
On the higher-order smallest ring star network of Chialvo neurons under diffusive couplings
We put forward the dynamical study of a novel higher-order small network of Chialvo neurons arranged in a ring-star topology, with the neurons interacting via linear diffusive couplings. This model is perceived to imitate the nonlinear dynamical properties exhibited by a realistic nervous system where the neurons transfer information through higher-order multi-body interactions. We first analyze our model using the tools from nonlinear dynamics literature: fixed point analysis, Jacobian matrix, and bifurcation patterns. We observe the coexistence of chaotic attractors, and also an intriguing route to chaos starting from a fixed point, to period-doubling, to cyclic quasiperiodic closed invariant curves, to ultimately chaos. We numerically observe the existence of codimension-1 bifurcation patterns: saddle-node, period-doubling, and Neimark Sacker. We also qualitatively study the typical phase portraits of the system and numerically quantify chaos and complexity using the 0-1 test and sample entropy measure respectively. Finally, we study the collective behavior of the neurons in terms of two synchronization measures: the cross-correlation coefficient, and the Kuramoto order parameter.
Impact of QCD sum rules coupling constants on neutron stars structure
We present a detailed investigation on the structure of neutron stars, incorporating the presence of hyperons within a relativistic model under the mean-field approximation. Employing coupling constants derived from QCD sum rules, we explore the particle fraction in beta equilibrium and establish the mass-radius relationship for neutron stars with hyperonic matter. Additionally, we compute the stellar Love number (K_{2}) and the tidal deformability parameter (varLambda), providing valuable insights into the dynamical properties of these celestial objects. Through comparison with theoretical predictions and observational data, our results exhibit good agreement, affirming the validity of our approach. These findings contribute significantly to refining the understanding of neutron star physics, particularly in environments containing hyperons, and offer essential constraints on the equation of state governing such extreme astrophysical conditions.
The $Hubble$ Missing Globular Cluster Survey. I. Survey overview and the first precise age estimate for ESO452-11 and 2MASS-GC01
We present the Hubble Missing Globular Cluster Survey (MGCS), a Hubble Space Telescope treasury programme dedicated to the observation of all the kinematically confirmed Milky Way globular clusters that missed previous Hubble imaging. After introducing the aims of the programme and describing its target clusters, we showcase the first results of the survey. These are related to two clusters, one located at the edge of the Milky Way Bulge and observed in optical bands, namely ESO452-11, and one located in the Galactic Disc observed in the near-IR, namely 2MASS-GC01. For both clusters, the deep colour-magnitude diagrams obtained from the MGCS observations reach several magnitudes below their main-sequence turn-off, and thus enable the first precise estimate of their age. By using the methods developed within the CARMA project, we find ESO452-11 to be an old, metal-intermediate globular cluster, with {rm [M/H]}simeq-0.80^{+0.08}_{-0.11} and an age of {rm t}=13.59^{+0.48}_{-0.69} Gyr. Its location on the age-metallicity relation makes it consistent with an in-situ origin, in agreement with its dynamical properties. On the other hand, the results for 2MASS-GC01 highlight it as a young, metal-intermediate cluster, with an age of {rm t}=7.22^{+0.93}_{-1.11} Gyr at {rm [M/H]}=-0.73^{+0.06}_{-0.06}. This is the first ever age estimate for this extremely extincted cluster, and indicates it either as the youngest globular known to date, or as a massive and compact open cluster, which is consistent with its almost circular, disc-like orbit.
Limits and Powers of Koopman Learning
Dynamical systems provide a comprehensive way to study complex and changing behaviors across various sciences. Many modern systems are too complicated to analyze directly or we do not have access to models, driving significant interest in learning methods. Koopman operators have emerged as a dominant approach because they allow the study of nonlinear dynamics using linear techniques by solving an infinite-dimensional spectral problem. However, current algorithms face challenges such as lack of convergence, hindering practical progress. This paper addresses a fundamental open question: When can we robustly learn the spectral properties of Koopman operators from trajectory data of dynamical systems, and when can we not? Understanding these boundaries is crucial for analysis, applications, and designing algorithms. We establish a foundational approach that combines computational analysis and ergodic theory, revealing the first fundamental barriers -- universal for any algorithm -- associated with system geometry and complexity, regardless of data quality and quantity. For instance, we demonstrate well-behaved smooth dynamical systems on tori where non-trivial eigenfunctions of the Koopman operator cannot be determined by any sequence of (even randomized) algorithms, even with unlimited training data. Additionally, we identify when learning is possible and introduce optimal algorithms with verification that overcome issues in standard methods. These results pave the way for a sharp classification theory of data-driven dynamical systems based on how many limits are needed to solve a problem. These limits characterize all previous methods, presenting a unified view. Our framework systematically determines when and how Koopman spectral properties can be learned.
Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model
Proteins are dynamic molecular machines whose biological functions, spanning enzymatic catalysis, signal transduction, and structural adaptation, are intrinsically linked to their motions. Designing proteins with targeted dynamic properties, however, remains a challenge due to the complex, degenerate relationships between sequence, structure, and molecular motion. Here, we introduce VibeGen, a generative AI framework that enables end-to-end de novo protein design conditioned on normal mode vibrations. VibeGen employs an agentic dual-model architecture, comprising a protein designer that generates sequence candidates based on specified vibrational modes and a protein predictor that evaluates their dynamic accuracy. This approach synergizes diversity, accuracy, and novelty during the design process. Via full-atom molecular simulations as direct validation, we demonstrate that the designed proteins accurately reproduce the prescribed normal mode amplitudes across the backbone while adopting various stable, functionally relevant structures. Notably, generated sequences are de novo, exhibiting no significant similarity to natural proteins, thereby expanding the accessible protein space beyond evolutionary constraints. Our work integrates protein dynamics into generative protein design, and establishes a direct, bidirectional link between sequence and vibrational behavior, unlocking new pathways for engineering biomolecules with tailored dynamical and functional properties. This framework holds broad implications for the rational design of flexible enzymes, dynamic scaffolds, and biomaterials, paving the way toward dynamics-informed AI-driven protein engineering.
Physical properties of circumnuclear ionising clusters. III. Kinematics of gas and stars in NGC 7742
In this third paper of a series, we study the kinematics of the ionised gas and stars, calculating the dynamical masses of the circumnuclear star-forming regions in the ring of of the face-on spiral NGC 7742. We have used high spectral resolution data from the MEGARA instrument attached to the Gran Telescopio Canarias (GTC) to measure the kinematical components of the nebular emission lines of selected HII regions and the stellar velocity dispersions from the CaT absorption lines that allow the derivation of the associated cluster virialized masses. The emission line profiles show two different kinematical components: a narrow one with velocity dispersion sim 10 km/s and a broad one with velocity dispersion similar to those found for the stellar absorption lines. The derived star cluster dynamical masses range from 2.5 times 10^6 to 10.0 times 10^7 M_odot. The comparison of gas and stellar velocity dispersions suggests a scenario where the clusters have formed simultaneously in a first star formation episode with a fraction of the stellar evolution feedback remaining trapped in the cluster, subject to the same gravitational potential as the cluster stars. Between 0.15 and 7.07 % of the total dynamical mass of the cluster would have cooled down and formed a new, younger, population of stars, responsible for the ionisation of the gas currently observed.
Ground State Preparation via Dynamical Cooling
Quantum algorithms for probing ground-state properties of quantum systems require good initial states. Projection-based methods such as eigenvalue filtering rely on inputs that have a significant overlap with the low-energy subspace, which can be challenging for large, strongly-correlated systems. This issue has motivated the study of physically-inspired dynamical approaches such as thermodynamic cooling. In this work, we introduce a ground-state preparation algorithm based on the simulation of quantum dynamics. Our main insight is to transform the Hamiltonian by a shifted sign function via quantum signal processing, effectively mapping eigenvalues into positive and negative subspaces separated by a large gap. This automatically ensures that all states within each subspace conserve energy with respect to the transformed Hamiltonian. Subsequent time-evolution with a perturbed Hamiltonian induces transitions to lower-energy states while preventing unwanted jumps to higher energy states. The approach does not rely on a priori knowledge of energy gaps and requires no additional qubits to model a bath. Furthermore, it makes mathcal{O}(d^{,3/2}/epsilon) queries to the time-evolution operator of the system and mathcal{O}(d^{,3/2}) queries to a block-encoding of the perturbation, for d cooling steps and an epsilon-accurate energy resolution. Our results provide a framework for combining quantum signal processing and Hamiltonian simulation to design heuristic quantum algorithms for ground-state preparation.
The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent
In this paper, we study the implicit regularization of stochastic gradient descent (SGD) through the lens of {\em dynamical stability} (Wu et al., 2018). We start by revising existing stability analyses of SGD, showing how the Frobenius norm and trace of Hessian relate to different notions of stability. Notably, if a global minimum is linearly stable for SGD, then the trace of Hessian must be less than or equal to 2/eta, where eta denotes the learning rate. By contrast, for gradient descent (GD), the stability imposes a similar constraint but only on the largest eigenvalue of Hessian. We then turn to analyze the generalization properties of these stable minima, focusing specifically on two-layer ReLU networks and diagonal linear networks. Notably, we establish the {\em equivalence} between these metrics of sharpness and certain parameter norms for the two models, which allows us to show that the stable minima of SGD provably generalize well. By contrast, the stability-induced regularization of GD is provably too weak to ensure satisfactory generalization. This discrepancy provides an explanation of why SGD often generalizes better than GD. Note that the learning rate (LR) plays a pivotal role in the strength of stability-induced regularization. As the LR increases, the regularization effect becomes more pronounced, elucidating why SGD with a larger LR consistently demonstrates superior generalization capabilities. Additionally, numerical experiments are provided to support our theoretical findings.
ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction
Deep neural networks (DNN) have shown great capacity of modeling a dynamical system; nevertheless, they usually do not obey physics constraints such as conservation laws. This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of the DNN based dynamics modeling to endow the invariant properties. ConCerNet consists of two steps: (i) a contrastive learning method to automatically capture the system invariants (i.e. conservation properties) along the trajectory observations; (ii) a neural projection layer to guarantee that the learned dynamics models preserve the learned invariants. We theoretically prove the functional relationship between the learned latent representation and the unknown system invariant function. Experiments show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics by a large margin. With neural network based parameterization and no dependence on prior knowledge, our method can be extended to complex and large-scale dynamics by leveraging an autoencoder.
The effect of dynamical states on galaxy clusters populations. I. Classification of dynamical states
While the influence of galaxy clusters on galaxy evolution is relatively well-understood, the impact of the dynamical states of these clusters is less clear. This paper series explores how the dynamical state of galaxy clusters affects their galaxy populations' physical and morphological properties. The primary aim of this first paper is to evaluate the dynamical state of 87 massive (M_{500} geq 1.5 times 10^{14} M_{odot}) galaxy clusters at low redshifts (0.10 leq z leq 0.35). This will allow us to have a well-characterized sample for analyzing physical and morphological properties in our next work. We employ six dynamical state proxies utilizing optical and X-ray imaging data. Principal Component Analysis (PCA) is applied to integrate these proxies effectively, allowing for robust classification of galaxy clusters into relaxed, intermediate, and disturbed states based on their dynamical characteristics. The methodology successfully segregates the galaxy clusters into the three dynamical states. Examination of the galaxy distributions in optical wavelengths and gas distributions in X-ray further confirms the consistency of these classifications. The clusters' dynamical states are statistically distinguishable, providing a clear categorization for further analysis.
Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in Dynamical Systems Reconstruction
Dynamical systems (DS) theory is fundamental for many areas of science and engineering. It can provide deep insights into the behavior of systems evolving in time, as typically described by differential or recursive equations. A common approach to facilitate mathematical tractability and interpretability of DS models involves decomposing nonlinear DS into multiple linear DS separated by switching manifolds, i.e. piecewise linear (PWL) systems. PWL models are popular in engineering and a frequent choice in mathematics for analyzing the topological properties of DS. However, hand-crafting such models is tedious and only possible for very low-dimensional scenarios, while inferring them from data usually gives rise to unnecessarily complex representations with very many linear subregions. Here we introduce Almost-Linear Recurrent Neural Networks (AL-RNNs) which automatically and robustly produce most parsimonious PWL representations of DS from time series data, using as few PWL nonlinearities as possible. AL-RNNs can be efficiently trained with any SOTA algorithm for dynamical systems reconstruction (DSR), and naturally give rise to a symbolic encoding of the underlying DS that provably preserves important topological properties. We show that for the Lorenz and R\"ossler systems, AL-RNNs discover, in a purely data-driven way, the known topologically minimal PWL representations of the corresponding chaotic attractors. We further illustrate on two challenging empirical datasets that interpretable symbolic encodings of the dynamics can be achieved, tremendously facilitating mathematical and computational analysis of the underlying systems.
Gradient Starvation: A Learning Proclivity in Neural Networks
We identify and formalize a fundamental gradient descent phenomenon resulting in a learning proclivity in over-parameterized neural networks. Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task, despite the presence of other predictive features that fail to be discovered. This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks. Using tools from Dynamical Systems theory, we identify simple properties of learning dynamics during gradient descent that lead to this imbalance, and prove that such a situation can be expected given certain statistical structure in training data. Based on our proposed formalism, we develop guarantees for a novel regularization method aimed at decoupling feature learning dynamics, improving accuracy and robustness in cases hindered by gradient starvation. We illustrate our findings with simple and real-world out-of-distribution (OOD) generalization experiments.
Chaos as an interpretable benchmark for forecasting and data-driven modelling
The striking fractal geometry of strange attractors underscores the generative nature of chaos: like probability distributions, chaotic systems can be repeatedly measured to produce arbitrarily-detailed information about the underlying attractor. Chaotic systems thus pose a unique challenge to modern statistical learning techniques, while retaining quantifiable mathematical properties that make them controllable and interpretable as benchmarks. Here, we present a growing database currently comprising 131 known chaotic dynamical systems spanning fields such as astrophysics, climatology, and biochemistry. Each system is paired with precomputed multivariate and univariate time series. Our dataset has comparable scale to existing static time series databases; however, our systems can be re-integrated to produce additional datasets of arbitrary length and granularity. Our dataset is annotated with known mathematical properties of each system, and we perform feature analysis to broadly categorize the diverse dynamics present across the collection. Chaotic systems inherently challenge forecasting models, and across extensive benchmarks we correlate forecasting performance with the degree of chaos present. We also exploit the unique generative properties of our dataset in several proof-of-concept experiments: surrogate transfer learning to improve time series classification, importance sampling to accelerate model training, and benchmarking symbolic regression algorithms.
E(n) Equivariant Graph Neural Networks
This paper introduces a new model to learn graph neural networks equivariant to rotations, translations, reflections and permutations called E(n)-Equivariant Graph Neural Networks (EGNNs). In contrast with existing methods, our work does not require computationally expensive higher-order representations in intermediate layers while it still achieves competitive or better performance. In addition, whereas existing methods are limited to equivariance on 3 dimensional spaces, our model is easily scaled to higher-dimensional spaces. We demonstrate the effectiveness of our method on dynamical systems modelling, representation learning in graph autoencoders and predicting molecular properties.
Force-Free Molecular Dynamics Through Autoregressive Equivariant Networks
Molecular dynamics (MD) simulations play a crucial role in scientific research. Yet their computational cost often limits the timescales and system sizes that can be explored. Most data-driven efforts have been focused on reducing the computational cost of accurate interatomic forces required for solving the equations of motion. Despite their success, however, these machine learning interatomic potentials (MLIPs) are still bound to small time-steps. In this work, we introduce TrajCast, a transferable and data-efficient framework based on autoregressive equivariant message passing networks that directly updates atomic positions and velocities lifting the constraints imposed by traditional numerical integration. We benchmark our framework across various systems, including a small molecule, crystalline material, and bulk liquid, demonstrating excellent agreement with reference MD simulations for structural, dynamical, and energetic properties. Depending on the system, TrajCast allows for forecast intervals up to 30times larger than traditional MD time-steps, generating over 15 ns of trajectory data per day for a solid with more than 4,000 atoms. By enabling efficient large-scale simulations over extended timescales, TrajCast can accelerate materials discovery and explore physical phenomena beyond the reach of traditional simulations and experiments. An open-source implementation of TrajCast is accessible under https://github.com/IBM/trajcast.
Liquid Time-constant Networks
We introduce a new class of time-continuous recurrent neural network models. Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems modulated via nonlinear interlinked gates. The resulting models represent dynamical systems with varying (i.e., liquid) time-constants coupled to their hidden state, with outputs being computed by numerical differential equation solvers. These neural networks exhibit stable and bounded behavior, yield superior expressivity within the family of neural ordinary differential equations, and give rise to improved performance on time-series prediction tasks. To demonstrate these properties, we first take a theoretical approach to find bounds over their dynamics and compute their expressive power by the trajectory length measure in latent trajectory space. We then conduct a series of time-series prediction experiments to manifest the approximation capability of Liquid Time-Constant Networks (LTCs) compared to classical and modern RNNs. Code and data are available at https://github.com/raminmh/liquid_time_constant_networks
Disentangled Generative Models for Robust Prediction of System Dynamics
Deep neural networks have become increasingly of interest in dynamical system prediction, but out-of-distribution generalization and long-term stability still remains challenging. In this work, we treat the domain parameters of dynamical systems as factors of variation of the data generating process. By leveraging ideas from supervised disentanglement and causal factorization, we aim to separate the domain parameters from the dynamics in the latent space of generative models. In our experiments we model dynamics both in phase space and in video sequences and conduct rigorous OOD evaluations. Results indicate that disentangled VAEs adapt better to domain parameters spaces that were not present in the training data. At the same time, disentanglement can improve the long-term and out-of-distribution predictions of state-of-the-art models in video sequences.
Graph Switching Dynamical Systems
Dynamical systems with complex behaviours, e.g. immune system cells interacting with a pathogen, are commonly modelled by splitting the behaviour into different regimes, or modes, each with simpler dynamics, and then learning the switching behaviour from one mode to another. Switching Dynamical Systems (SDS) are a powerful tool that automatically discovers these modes and mode-switching behaviour from time series data. While effective, these methods focus on independent objects, where the modes of one object are independent of the modes of the other objects. In this paper, we focus on the more general interacting object setting for switching dynamical systems, where the per-object dynamics also depends on an unknown and dynamically changing subset of other objects and their modes. To this end, we propose a novel graph-based approach for switching dynamical systems, GRAph Switching dynamical Systems (GRASS), in which we use a dynamic graph to characterize interactions between objects and learn both intra-object and inter-object mode-switching behaviour. We introduce two new datasets for this setting, a synthesized ODE-driven particles dataset and a real-world Salsa Couple Dancing dataset. Experiments show that GRASS can consistently outperforms previous state-of-the-art methods.
CausalDynamics: A large-scale benchmark for structural discovery of dynamical causal models
Causal discovery for dynamical systems poses a major challenge in fields where active interventions are infeasible. Most methods used to investigate these systems and their associated benchmarks are tailored to deterministic, low-dimensional and weakly nonlinear time-series data. To address these limitations, we present CausalDynamics, a large-scale benchmark and extensible data generation framework to advance the structural discovery of dynamical causal models. Our benchmark consists of true causal graphs derived from thousands of coupled ordinary and stochastic differential equations as well as two idealized climate models. We perform a comprehensive evaluation of state-of-the-art causal discovery algorithms for graph reconstruction on systems with noisy, confounded, and lagged dynamics. CausalDynamics consists of a plug-and-play, build-your-own coupling workflow that enables the construction of a hierarchy of physical systems. We anticipate that our framework will facilitate the development of robust causal discovery algorithms that are broadly applicable across domains while addressing their unique challenges. We provide a user-friendly implementation and documentation on https://kausable.github.io/CausalDynamics.
Generalized Teacher Forcing for Learning Chaotic Dynamics
Chaotic dynamical systems (DS) are ubiquitous in nature and society. Often we are interested in reconstructing such systems from observed time series for prediction or mechanistic insight, where by reconstruction we mean learning geometrical and invariant temporal properties of the system in question (like attractors). However, training reconstruction algorithms like recurrent neural networks (RNNs) on such systems by gradient-descent based techniques faces severe challenges. This is mainly due to exploding gradients caused by the exponential divergence of trajectories in chaotic systems. Moreover, for (scientific) interpretability we wish to have as low dimensional reconstructions as possible, preferably in a model which is mathematically tractable. Here we report that a surprisingly simple modification of teacher forcing leads to provably strictly all-time bounded gradients in training on chaotic systems, and, when paired with a simple architectural rearrangement of a tractable RNN design, piecewise-linear RNNs (PLRNNs), allows for faithful reconstruction in spaces of at most the dimensionality of the observed system. We show on several DS that with these amendments we can reconstruct DS better than current SOTA algorithms, in much lower dimensions. Performance differences were particularly compelling on real world data with which most other methods severely struggled. This work thus led to a simple yet powerful DS reconstruction algorithm which is highly interpretable at the same time.
FlashMD: long-stride, universal prediction of molecular dynamics
Molecular dynamics (MD) provides insights into atomic-scale processes by integrating over time the equations that describe the motion of atoms under the action of interatomic forces. Machine learning models have substantially accelerated MD by providing inexpensive predictions of the forces, but they remain constrained to minuscule time integration steps, which are required by the fast time scale of atomic motion. In this work, we propose FlashMD, a method to predict the evolution of positions and momenta over strides that are between one and two orders of magnitude longer than typical MD time steps. We incorporate considerations on the mathematical and physical properties of Hamiltonian dynamics in the architecture, generalize the approach to allow the simulation of any thermodynamic ensemble, and carefully assess the possible failure modes of such a long-stride MD approach. We validate FlashMD's accuracy in reproducing equilibrium and time-dependent properties, using both system-specific and general-purpose models, extending the ability of MD simulation to reach the long time scales needed to model microscopic processes of high scientific and technological relevance.
Stochastic interpolants with data-dependent couplings
Generative models inspired by dynamical transport of measure -- such as flows and diffusions -- construct a continuous-time map between two probability densities. Conventionally, one of these is the target density, only accessible through samples, while the other is taken as a simple base density that is data-agnostic. In this work, using the framework of stochastic interpolants, we formalize how to couple the base and the target densities. This enables us to incorporate information about class labels or continuous embeddings to construct dynamical transport maps that serve as conditional generative models. We show that these transport maps can be learned by solving a simple square loss regression problem analogous to the standard independent setting. We demonstrate the usefulness of constructing dependent couplings in practice through experiments in super-resolution and in-painting.
Action Matching: Learning Stochastic Dynamics from Samples
Learning the continuous dynamics of a system from snapshots of its temporal marginals is a problem which appears throughout natural sciences and machine learning, including in quantum systems, single-cell biological data, and generative modeling. In these settings, we assume access to cross-sectional samples that are uncorrelated over time, rather than full trajectories of samples. In order to better understand the systems under observation, we would like to learn a model of the underlying process that allows us to propagate samples in time and thereby simulate entire individual trajectories. In this work, we propose Action Matching, a method for learning a rich family of dynamics using only independent samples from its time evolution. We derive a tractable training objective, which does not rely on explicit assumptions about the underlying dynamics and does not require back-propagation through differential equations or optimal transport solvers. Inspired by connections with optimal transport, we derive extensions of Action Matching to learn stochastic differential equations and dynamics involving creation and destruction of probability mass. Finally, we showcase applications of Action Matching by achieving competitive performance in a diverse set of experiments from biology, physics, and generative modeling.
Holographic quantum criticality from multi-trace deformations
We explore the consequences of multi-trace deformations in applications of gauge-gravity duality to condensed matter physics. We find that they introduce a powerful new "knob" that can implement spontaneous symmetry breaking, and can be used to construct a new type of holographic superconductor. This knob can be tuned to drive the critical temperature to zero, leading to a new quantum critical point. We calculate nontrivial critical exponents, and show that fluctuations of the order parameter are `locally' quantum critical in the disordered phase. Most notably the dynamical critical exponent is determined by the dimension of an operator at the critical point. We argue that the results are robust against quantum corrections and discuss various generalizations.
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Complex, temporally evolving phenomena, from climate to brain activity, are governed by dynamical systems (DS). DS reconstruction (DSR) seeks to infer generative surrogate models of these from observed data, reproducing their long-term behavior. Existing DSR approaches require purpose-training for any new system observed, lacking the zero-shot and in-context inference capabilities known from LLMs. Here we introduce DynaMix, a novel multivariate ALRNN-based mixture-of-experts architecture pre-trained for DSR, the first DSR model able to generalize zero-shot to out-of-domain DS. Just from a provided context signal, without any re-training, DynaMix faithfully forecasts the long-term evolution of novel DS where existing time series (TS) foundation models, like Chronos, fail -- at a fraction of the number of parameters and orders of magnitude faster inference times. DynaMix outperforms TS foundation models in terms of long-term statistics, and often also short-term forecasts, even on real-world time series, like traffic or weather data, typically used for training and evaluating TS models, but not at all part of DynaMix' training corpus. We illustrate some of the failure modes of TS models for DSR problems, and conclude that models built on DS principles may bear a huge potential also for advancing the TS prediction field.
Random Quantum Circuits
Quantum circuits -- built from local unitary gates and local measurements -- are a new playground for quantum many-body physics and a tractable setting to explore universal collective phenomena far-from-equilibrium. These models have shed light on longstanding questions about thermalization and chaos, and on the underlying universal dynamics of quantum information and entanglement. In addition, such models generate new sets of questions and give rise to phenomena with no traditional analog, such as new dynamical phases in quantum systems that are monitored by an external observer. Quantum circuit dynamics is also topical in view of experimental progress in building digital quantum simulators that allow control of precisely these ingredients. Randomness in the circuit elements allows a high level of theoretical control, with a key theme being mappings between real-time quantum dynamics and effective classical lattice models or dynamical processes. Many of the universal phenomena that can be identified in this tractable setting apply to much wider classes of more structured many-body dynamics.
Learning the Dynamics of Sparsely Observed Interacting Systems
We address the problem of learning the dynamics of an unknown non-parametric system linking a target and a feature time series. The feature time series is measured on a sparse and irregular grid, while we have access to only a few points of the target time series. Once learned, we can use these dynamics to predict values of the target from the previous values of the feature time series. We frame this task as learning the solution map of a controlled differential equation (CDE). By leveraging the rich theory of signatures, we are able to cast this non-linear problem as a high-dimensional linear regression. We provide an oracle bound on the prediction error which exhibits explicit dependencies on the individual-specific sampling schemes. Our theoretical results are illustrated by simulations which show that our method outperforms existing algorithms for recovering the full time series while being computationally cheap. We conclude by demonstrating its potential on real-world epidemiological data.
A Multi-Branched Radial Basis Network Approach to Predicting Complex Chaotic Behaviours
In this study, we propose a multi branched network approach to predict the dynamics of a physics attractor characterized by intricate and chaotic behavior. We introduce a unique neural network architecture comprised of Radial Basis Function (RBF) layers combined with an attention mechanism designed to effectively capture nonlinear inter-dependencies inherent in the attractor's temporal evolution. Our results demonstrate successful prediction of the attractor's trajectory across 100 predictions made using a real-world dataset of 36,700 time-series observations encompassing approximately 28 minutes of activity. To further illustrate the performance of our proposed technique, we provide comprehensive visualizations depicting the attractor's original and predicted behaviors alongside quantitative measures comparing observed versus estimated outcomes. Overall, this work showcases the potential of advanced machine learning algorithms in elucidating hidden structures in complex physical systems while offering practical applications in various domains requiring accurate short-term forecasting capabilities.
Panda: A pretrained forecast model for universal representation of chaotic dynamics
Chaotic systems are intrinsically sensitive to small errors, challenging efforts to construct predictive data-driven models of real-world dynamical systems such as fluid flows or neuronal activity. Prior efforts comprise either specialized models trained separately on individual time series, or foundation models trained on vast time series databases with little underlying dynamical structure. Motivated by dynamical systems theory, we present Panda, Patched Attention for Nonlinear DynAmics. We train Panda on a novel synthetic, extensible dataset of 2 times 10^4 chaotic dynamical systems that we discover using an evolutionary algorithm. Trained purely on simulated data, Panda exhibits emergent properties: zero-shot forecasting of unseen real world chaotic systems, and nonlinear resonance patterns in cross-channel attention heads. Despite having been trained only on low-dimensional ordinary differential equations, Panda spontaneously develops the ability to predict partial differential equations without retraining. We demonstrate a neural scaling law for differential equations, underscoring the potential of pretrained models for probing abstract mathematical domains like nonlinear dynamics.
Learning invariant representations of time-homogeneous stochastic dynamical systems
We consider the general class of time-homogeneous stochastic dynamical systems, both discrete and continuous, and study the problem of learning a representation of the state that faithfully captures its dynamics. This is instrumental to learning the transfer operator or the generator of the system, which in turn can be used for numerous tasks, such as forecasting and interpreting the system dynamics. We show that the search for a good representation can be cast as an optimization problem over neural networks. Our approach is supported by recent results in statistical learning theory, highlighting the role of approximation error and metric distortion in the learning problem. The objective function we propose is associated with projection operators from the representation space to the data space, overcomes metric distortion, and can be empirically estimated from data. In the discrete-time setting, we further derive a relaxed objective function that is differentiable and numerically well-conditioned. We compare our method against state-of-the-art approaches on different datasets, showing better performance across the board.
Autoregressive Transformer Neural Network for Simulating Open Quantum Systems via a Probabilistic Formulation
The theory of open quantum systems lays the foundations for a substantial part of modern research in quantum science and engineering. Rooted in the dimensionality of their extended Hilbert spaces, the high computational complexity of simulating open quantum systems calls for the development of strategies to approximate their dynamics. In this paper, we present an approach for tackling open quantum system dynamics. Using an exact probabilistic formulation of quantum physics based on positive operator-valued measure (POVM), we compactly represent quantum states with autoregressive transformer neural networks; such networks bring significant algorithmic flexibility due to efficient exact sampling and tractable density. We further introduce the concept of String States to partially restore the symmetry of the autoregressive transformer neural network and improve the description of local correlations. Efficient algorithms have been developed to simulate the dynamics of the Liouvillian superoperator using a forward-backward trapezoid method and find the steady state via a variational formulation. Our approach is benchmarked on prototypical one and two-dimensional systems, finding results which closely track the exact solution and achieve higher accuracy than alternative approaches based on using Markov chain Monte Carlo to sample restricted Boltzmann machines. Our work provides general methods for understanding quantum dynamics in various contexts, as well as techniques for solving high-dimensional probabilistic differential equations in classical setups.
Generative Modeling with Phase Stochastic Bridges
Diffusion models (DMs) represent state-of-the-art generative models for continuous inputs. DMs work by constructing a Stochastic Differential Equation (SDE) in the input space (ie, position space), and using a neural network to reverse it. In this work, we introduce a novel generative modeling framework grounded in phase space dynamics, where a phase space is defined as {an augmented space encompassing both position and velocity.} Leveraging insights from Stochastic Optimal Control, we construct a path measure in the phase space that enables efficient sampling. {In contrast to DMs, our framework demonstrates the capability to generate realistic data points at an early stage of dynamics propagation.} This early prediction sets the stage for efficient data generation by leveraging additional velocity information along the trajectory. On standard image generation benchmarks, our model yields favorable performance over baselines in the regime of small Number of Function Evaluations (NFEs). Furthermore, our approach rivals the performance of diffusion models equipped with efficient sampling techniques, underscoring its potential as a new tool generative modeling.
Model scale versus domain knowledge in statistical forecasting of chaotic systems
Chaos and unpredictability are traditionally synonymous, yet large-scale machine learning methods recently have demonstrated a surprising ability to forecast chaotic systems well beyond typical predictability horizons. However, recent works disagree on whether specialized methods grounded in dynamical systems theory, such as reservoir computers or neural ordinary differential equations, outperform general-purpose large-scale learning methods such as transformers or recurrent neural networks. These prior studies perform comparisons on few individually-chosen chaotic systems, thereby precluding robust quantification of how statistical modeling choices and dynamical invariants of different chaotic systems jointly determine empirical predictability. Here, we perform the largest to-date comparative study of forecasting methods on the classical problem of forecasting chaos: we benchmark 24 state-of-the-art forecasting methods on a crowdsourced database of 135 low-dimensional systems with 17 forecast metrics. We find that large-scale, domain-agnostic forecasting methods consistently produce predictions that remain accurate up to two dozen Lyapunov times, thereby accessing a new long-horizon forecasting regime well beyond classical methods. We find that, in this regime, accuracy decorrelates with classical invariant measures of predictability like the Lyapunov exponent. However, in data-limited settings outside the long-horizon regime, we find that physics-based hybrid methods retain a comparative advantage due to their strong inductive biases.
Dynamical Cosmological Constant
The dynamical realisation of the equation of state p +rho =0 is studied. A non-pathological dynamics for the perturbations of such a system mimicking a dynamical cosmological constant (DCC) requires to go beyond the perfect fluid paradigm. It is shown that an anisotropic stress must be always present. The Hamiltonian of the system in isolation resembles the one of a Pais-Uhlenbeck oscillator and linear stability requires that it cannot be positive definite. The dynamics of linear cosmological perturbations in a DCC dominated Universe is studied in detail showing that when DCC is minimally coupled to gravity no dramatic instability is present. In contrast to what happens in a cosmological constant dominated Universe, the non-relativistic matter contrast is no longer constant and exhibits an oscillator behaviour at small scales while it grows weakly at large scales. In the gravitational waves sector, at small scales, the amplitude is still suppressed as the inverse power of the scale factor while it grows logarithmically at large scales. Also the vector modes propagate, though no growing mode is found.
Exploring Quantum Spacetime with Topological Data Analysis
In a novel application of the tools of topological data analysis (TDA) to nonperturbative quantum gravity, we introduce a new class of observables that allows us to assess whether quantum spacetime really resembles a ``quantum foam" near the Planck scale. The key idea is to investigate the Betti numbers of coarse-grained path integral histories, regularized in terms of dynamical triangulations, as a function of the coarse-graining scale. In two dimensions our analysis exhibits the well-known fractal structure of Euclidean quantum gravity.
WeightFlow: Learning Stochastic Dynamics via Evolving Weight of Neural Network
Modeling stochastic dynamics from discrete observations is a key interdisciplinary challenge. Existing methods often fail to estimate the continuous evolution of probability densities from trajectories or face the curse of dimensionality. To address these limitations, we presents a novel paradigm: modeling dynamics directly in the weight space of a neural network by projecting the evolving probability distribution. We first theoretically establish the connection between dynamic optimal transport in measure space and an equivalent energy functional in weight space. Subsequently, we design WeightFlow, which constructs the neural network weights into a graph and learns its evolution via a graph controlled differential equation. Experiments on interdisciplinary datasets demonstrate that WeightFlow improves performance by an average of 43.02\% over state-of-the-art methods, providing an effective and scalable solution for modeling high-dimensional stochastic dynamics.
Towards Physically Interpretable World Models: Meaningful Weakly Supervised Representations for Visual Trajectory Prediction
Deep learning models are increasingly employed for perception, prediction, and control in complex systems. Embedding physical knowledge into these models is crucial for achieving realistic and consistent outputs, a challenge often addressed by physics-informed machine learning. However, integrating physical knowledge with representation learning becomes difficult when dealing with high-dimensional observation data, such as images, particularly under conditions of incomplete or imprecise state information. To address this, we propose Physically Interpretable World Models, a novel architecture that aligns learned latent representations with real-world physical quantities. Our method combines a variational autoencoder with a dynamical model that incorporates unknown system parameters, enabling the discovery of physically meaningful representations. By employing weak supervision with interval-based constraints, our approach eliminates the reliance on ground-truth physical annotations. Experimental results demonstrate that our method improves the quality of learned representations while achieving accurate predictions of future states, advancing the field of representation learning in dynamic systems.
Multimarginal generative modeling with stochastic interpolants
Given a set of K probability densities, we consider the multimarginal generative modeling problem of learning a joint distribution that recovers these densities as marginals. The structure of this joint distribution should identify multi-way correspondences among the prescribed marginals. We formalize an approach to this task within a generalization of the stochastic interpolant framework, leading to efficient learning algorithms built upon dynamical transport of measure. Our generative models are defined by velocity and score fields that can be characterized as the minimizers of simple quadratic objectives, and they are defined on a simplex that generalizes the time variable in the usual dynamical transport framework. The resulting transport on the simplex is influenced by all marginals, and we show that multi-way correspondences can be extracted. The identification of such correspondences has applications to style transfer, algorithmic fairness, and data decorruption. In addition, the multimarginal perspective enables an efficient algorithm for reducing the dynamical transport cost in the ordinary two-marginal setting. We demonstrate these capacities with several numerical examples.
Leap into the future: shortcut to dynamics for quantum mixtures
The study of the long-time dynamics of quantum systems can be a real challenge, especially in systems like ultracold gases, where the required timescales may be longer than the lifetime of the system itself. In this work, we show that it is possible to access the long-time dynamics of a strongly repulsive atomic gas mixture in shorter times. The shortcut-to-dynamics protocol that we propose does not modify the fate of the observables, but effectively jumps ahead in time without changing the system's inherent evolution. Just like the next-chapter button in a movie player that allows to quickly reach the part of the movie one wants to watch, it is a leap into the future.
State-dependent diffusion: thermodynamic consistency and its path integral formulation
The friction coefficient of a particle can depend on its position as it does when the particle is near a wall. We formulate the dynamics of particles with such state-dependent friction coefficients in terms of a general Langevin equation with multiplicative noise, whose evaluation requires the introduction of specific rules. Two common conventions, the Ito and the Stratonovich, provide alternative rules for evaluation of the noise, but other conventions are possible. We show the requirement that a particle's distribution function approach the Boltzmann distribution at long times dictates that a drift term must be added to the Langevin equation. This drift term is proportional to the derivative of the diffusion coefficient times a factor that depends on the convention used to define the multiplicative noise. We explore the consequences of this result in a number examples with spatially varying diffusion coefficients. We also derive path integral representations for arbitrary interpretation of the noise, and use it in a perturbative study of correlations in a simple system.
Neural Contractive Dynamical Systems
Stability guarantees are crucial when ensuring a fully autonomous robot does not take undesirable or potentially harmful actions. Unfortunately, global stability guarantees are hard to provide in dynamical systems learned from data, especially when the learned dynamics are governed by neural networks. We propose a novel methodology to learn neural contractive dynamical systems, where our neural architecture ensures contraction, and hence, global stability. To efficiently scale the method to high-dimensional dynamical systems, we develop a variant of the variational autoencoder that learns dynamics in a low-dimensional latent representation space while retaining contractive stability after decoding. We further extend our approach to learning contractive systems on the Lie group of rotations to account for full-pose end-effector dynamic motions. The result is the first highly flexible learning architecture that provides contractive stability guarantees with capability to perform obstacle avoidance. Empirically, we demonstrate that our approach encodes the desired dynamics more accurately than the current state-of-the-art, which provides less strong stability guarantees.
Physics-informed Reduced Order Modeling of Time-dependent PDEs via Differentiable Solvers
Reduced-order modeling (ROM) of time-dependent and parameterized differential equations aims to accelerate the simulation of complex high-dimensional systems by learning a compact latent manifold representation that captures the characteristics of the solution fields and their time-dependent dynamics. Although high-fidelity numerical solvers generate the training datasets, they have thus far been excluded from the training process, causing the learned latent dynamics to drift away from the discretized governing physics. This mismatch often limits generalization and forecasting capabilities. In this work, we propose Physics-informed ROM (Φ-ROM) by incorporating differentiable PDE solvers into the training procedure. Specifically, the latent space dynamics and its dependence on PDE parameters are shaped directly by the governing physics encoded in the solver, ensuring a strong correspondence between the full and reduced systems. Our model outperforms state-of-the-art data-driven ROMs and other physics-informed strategies by accurately generalizing to new dynamics arising from unseen parameters, enabling long-term forecasting beyond the training horizon, maintaining continuity in both time and space, and reducing the data cost. Furthermore, Φ-ROM learns to recover and forecast the solution fields even when trained or evaluated with sparse and irregular observations of the fields, providing a flexible framework for field reconstruction and data assimilation. We demonstrate the framework's robustness across various PDE solvers and highlight its broad applicability by providing an open-source JAX implementation that is readily extensible to other PDE systems and differentiable solvers, available at https://phi-rom.github.io.
Generalizing the No-U-Turn Sampler to Riemannian Manifolds
Hamiltonian Monte Carlo provides efficient Markov transitions at the expense of introducing two free parameters: a step size and total integration time. Because the step size controls discretization error it can be readily tuned to achieve certain accuracy criteria, but the total integration time is left unconstrained. Recently Hoffman and Gelman proposed a criterion for tuning the integration time in certain systems with their No U-Turn Sampler, or NUTS. In this paper I investigate the dynamical basis for the success of NUTS and generalize it to Riemannian Manifold Hamiltonian Monte Carlo.
Quantum Thermalization via Travelling Waves
Isolated quantum many-body systems which thermalize under their own dynamics are expected to act as their own thermal baths, thereby bringing their local subsystems to thermal equilibrium. Here we show that the infinite-dimensional limit of a quantum lattice model, as described by Dynamical Mean-Field theory (DMFT), provides a natural framework to understand this self-consistent thermalization process. Using the Fermi-Hubbard model as working example, we demonstrate that the emergence of a self-consistent bath thermalising the system is characterized by a sharp thermalization front, moving balistically and separating the initial condition from the long-time thermal fixed point. We characterize the full DMFT dynamics through an effective temperature for which we derive a travelling-wave equation of the Fisher-Kolmogorov-Petrovsky-Piskunov (FKPP) type. This equation allows to predict the asymptotic shape of the front and its velocity, which match perfectly the full DMFT numerics. Our results provide a new angle to understand the onset of quantum thermalisation in closed isolated systems.
Multi-marginal temporal Schrödinger Bridge Matching for video generation from unpaired data
Many natural dynamic processes -- such as in vivo cellular differentiation or disease progression -- can only be observed through the lens of static sample snapshots. While challenging, reconstructing their temporal evolution to decipher underlying dynamic properties is of major interest to scientific research. Existing approaches enable data transport along a temporal axis but are poorly scalable in high dimension and require restrictive assumptions to be met. To address these issues, we propose \textbf{Multi-Marginal temporal Schr\"odinger Bridge Matching} (MMtSBM) for video generation from unpaired data, extending the theoretical guarantees and empirical efficiency of Diffusion Schr\"odinger Bridge Matching (arXiv:archive/2303.16852) by deriving the Iterative Markovian Fitting algorithm to multiple marginals in a novel factorized fashion. Experiments show that MMtSBM retains theoretical properties on toy examples, achieves state-of-the-art performance on real world datasets such as transcriptomic trajectory inference in 100 dimensions, and for the first time recovers couplings and dynamics in very high dimensional image settings. Our work establishes multi-marginal Schr\"odinger bridges as a practical and principled approach for recovering hidden dynamics from static data.
Out of equilibrium Phase Diagram of the Quantum Random Energy Model
In this paper we study the out-of-equilibrium phase diagram of the quantum version of Derrida's Random Energy Model, which is the simplest model of mean-field spin glasses. We interpret its corresponding quantum dynamics in Fock space as a one-particle problem in very high dimension to which we apply different theoretical methods tailored for high-dimensional lattices: the Forward-Scattering Approximation, a mapping to the Rosenzweig-Porter model, and the cavity method. Our results indicate the existence of two transition lines and three distinct dynamical phases: a completely many-body localized phase at low energy, a fully ergodic phase at high energy, and a multifractal "bad metal" phase at intermediate energy. In the latter, eigenfunctions occupy a diverging volume, yet an exponentially vanishing fraction of the total Hilbert space. We discuss the limitations of our approximations and the relationship with previous studies.
LETS Forecast: Learning Embedology for Time Series Forecasting
Real-world time series are often governed by complex nonlinear dynamics. Understanding these underlying dynamics is crucial for precise future prediction. While deep learning has achieved major success in time series forecasting, many existing approaches do not explicitly model the dynamics. To bridge this gap, we introduce DeepEDM, a framework that integrates nonlinear dynamical systems modeling with deep neural networks. Inspired by empirical dynamic modeling (EDM) and rooted in Takens' theorem, DeepEDM presents a novel deep model that learns a latent space from time-delayed embeddings, and employs kernel regression to approximate the underlying dynamics, while leveraging efficient implementation of softmax attention and allowing for accurate prediction of future time steps. To evaluate our method, we conduct comprehensive experiments on synthetic data of nonlinear dynamical systems as well as real-world time series across domains. Our results show that DeepEDM is robust to input noise, and outperforms state-of-the-art methods in forecasting accuracy. Our code is available at: https://abrarmajeedi.github.io/deep_edm.
On Neural Differential Equations
The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equation are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equations solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.
Cybloids - Creation and Control of Cybernetic Colloids
Colloids play an important role in fundamental science as well as in nature and technology. They have had a strong impact on the fundamental understanding of statistical physics. For example, colloids have helped to obtain a better understanding of collective phenomena, ranging from phase transitions and glass formation to the swarming of active Brownian particles. Yet the success of colloidal systems hinges crucially on the specific physical and chemical properties of the colloidal particles, i.e. particles with the appropriate characteristics must be available. Here we present an idea to create particles with freely selectable properties. The properties might depend, for example, on the presence of other particles (hence mimicking specific pair or many-body interactions), previous configurations (hence introducing some memory or feedback), or a directional bias (hence changing the dynamics). Without directly interfering with the sample, each particle is fully controlled and can receive external commands through a predefined algorithm that can take into account any input parameters. This is realized with computer-controlled colloids, which we term cybloids - short for cybernetic colloids. The potential of cybloids is illustrated by programming a time-delayed external potential acting on a single colloid and interaction potentials for many colloids. Both an attractive harmonic potential and an annular potential are implemented. For a single particle, this programming can cause subdiffusive behavior or lend activity. For many colloids, the programmed interaction potential allows to select a crystal structure at wish. Beyond these examples, we discuss further opportunities which cybloids offer.
Categorical Hopfield Networks
This paper discusses a simple and explicit toy-model example of the categorical Hopfield equations introduced in previous work of Manin and the author. These describe dynamical assignments of resources to networks, where resources are objects in unital symmetric monoidal categories and assignments are realized by summing functors. The special case discussed here is based on computational resources (computational models of neurons) as objects in a category of DNNs, with a simple choice of the endofunctors defining the Hopfield equations that reproduce the usual updating of the weights in DNNs by gradient descent.
Gravity Duals of Lifshitz-like Fixed Points
We find candidate macroscopic gravity duals for scale-invariant but non-Lorentz invariant fixed points, which do not have particle number as a conserved quantity. We compute two-point correlation functions which exhibit novel behavior relative to their AdS counterparts, and find holographic renormalization group flows to conformal field theories. Our theories are characterized by a dynamical critical exponent z, which governs the anisotropy between spatial and temporal scaling t to lambda^z t, x to lambda x; we focus on the case with z=2. Such theories describe multicritical points in certain magnetic materials and liquid crystals, and have been shown to arise at quantum critical points in toy models of the cuprate superconductors. This work can be considered a small step towards making useful dual descriptions of such critical points.
Mamba Integrated with Physics Principles Masters Long-term Chaotic System Forecasting
Long-term forecasting of chaotic systems from short-term observations remains a fundamental and underexplored challenge due to the intrinsic sensitivity to initial conditions and the complex geometry of strange attractors. Existing approaches often rely on long-term training data or focus on short-term sequence correlations, struggling to maintain predictive stability and dynamical coherence over extended horizons. We propose PhyxMamba, a novel framework that integrates a Mamba-based state-space model with physics-informed principles to capture the underlying dynamics of chaotic systems. By reconstructing the attractor manifold from brief observations using time-delay embeddings, PhyxMamba extracts global dynamical features essential for accurate forecasting. Our generative training scheme enables Mamba to replicate the physical process, augmented by multi-token prediction and attractor geometry regularization for physical constraints, enhancing prediction accuracy and preserving key statistical invariants. Extensive evaluations on diverse simulated and real-world chaotic systems demonstrate that PhyxMamba delivers superior long-term forecasting and faithfully captures essential dynamical invariants from short-term data. This framework opens new avenues for reliably predicting chaotic systems under observation-scarce conditions, with broad implications across climate science, neuroscience, epidemiology, and beyond. Our code is open-source at https://github.com/tsinghua-fib-lab/PhyxMamba.
PhononBench:A Large-Scale Phonon-Based Benchmark for Dynamical Stability in Crystal Generation
In this work, we introduce PhononBench, the first large-scale benchmark for dynamical stability in AI-generated crystals. Leveraging the recently developed MatterSim interatomic potential, which achieves DFT-level accuracy in phonon predictions across more than 10,000 materials, PhononBench enables efficient large-scale phonon calculations and dynamical-stability analysis for 108,843 crystal structures generated by six leading crystal generation models. PhononBench reveals a widespread limitation of current generative models in ensuring dynamical stability: the average dynamical-stability rate across all generated structures is only 25.83%, with the top-performing model, MatterGen, reaching just 41.0%. Further case studies show that in property-targeted generation-illustrated here by band-gap conditioning with MatterGen--the dynamical-stability rate remains as low as 23.5% even at the optimal band-gap condition of 0.5 eV. In space-group-controlled generation, higher-symmetry crystals exhibit better stability (e.g., cubic systems achieve rates up to 49.2%), yet the average stability across all controlled generations is still only 34.4%. An important additional outcome of this study is the identification of 28,119 crystal structures that are phonon-stable across the entire Brillouin zone, providing a substantial pool of reliable candidates for future materials exploration. By establishing the first large-scale dynamical-stability benchmark, this work systematically highlights the current limitations of crystal generation models and offers essential evaluation criteria and guidance for their future development toward the design and discovery of physically viable materials. All model-generated crystal structures, phonon calculation results, and the high-throughput evaluation workflows developed in PhononBench will be openly released at https://github.com/xqh19970407/PhononBench
Classical Glasses, Black Holes, and Strange Quantum Liquids
From the dynamics of a broad class of classical mean-field glass models one may obtain a quantum model with finite zero-temperature entropy, a quantum transition at zero temperature, and a time-reparametrization (quasi-)invariance in the dynamical equations for correlations. The low eigenvalue spectrum of the resulting quantum model is directly related to the structure and exploration of metastable states in the landscape of the original classical glass model. This mapping reveals deep connections between classical glasses and the properties of SYK-like models.
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit
The cost of hyperparameter tuning in deep learning has been rising with model sizes, prompting practitioners to find new tuning methods using a proxy of smaller networks. One such proposal uses muP parameterized networks, where the optimal hyperparameters for small width networks transfer to networks with arbitrarily large width. However, in this scheme, hyperparameters do not transfer across depths. As a remedy, we study residual networks with a residual branch scale of 1/text{depth} in combination with the muP parameterization. We provide experiments demonstrating that residual architectures including convolutional ResNets and Vision Transformers trained with this parameterization exhibit transfer of optimal hyperparameters across width and depth on CIFAR-10 and ImageNet. Furthermore, our empirical findings are supported and motivated by theory. Using recent developments in the dynamical mean field theory (DMFT) description of neural network learning dynamics, we show that this parameterization of ResNets admits a well-defined feature learning joint infinite-width and infinite-depth limit and show convergence of finite-size network dynamics towards this limit.
SEGNO: Generalizing Equivariant Graph Neural Networks with Physical Inductive Biases
Graph Neural Networks (GNNs) with equivariant properties have emerged as powerful tools for modeling complex dynamics of multi-object physical systems. However, their generalization ability is limited by the inadequate consideration of physical inductive biases: (1) Existing studies overlook the continuity of transitions among system states, opting to employ several discrete transformation layers to learn the direct mapping between two adjacent states; (2) Most models only account for first-order velocity information, despite the fact that many physical systems are governed by second-order motion laws. To incorporate these inductive biases, we propose the Second-order Equivariant Graph Neural Ordinary Differential Equation (SEGNO). Specifically, we show how the second-order continuity can be incorporated into GNNs while maintaining the equivariant property. Furthermore, we offer theoretical insights into SEGNO, highlighting that it can learn a unique trajectory between adjacent states, which is crucial for model generalization. Additionally, we prove that the discrepancy between this learned trajectory of SEGNO and the true trajectory is bounded. Extensive experiments on complex dynamical systems including molecular dynamics and motion capture demonstrate that our model yields a significant improvement over the state-of-the-art baselines.
Remarks on uniform recurrence properties for beta-transformation
We complement the recent paper of Zheng and Wu [Uniform recurrence properties for beta-transformation, Nonlinearity 33 (2020), 4590--4612], where the authors study, from the metrical point of view, the uniform recurrence properties of the orbit of a point under the β-transformation to the point itself.
Information Theory and Statistical Mechanics Revisited
The statistical mechanics of Gibbs is a juxtaposition of subjective, probabilistic ideas on the one hand and objective, mechanical ideas on the other. In this paper, we follow the path set out by Jaynes, including elements added subsequently to that original work, to explore the consequences of the purely statistical point of view. We show how standard methods in the equilibrium theory could have been derived simply from a description of the available problem information. In addition, our presentation leads to novel insights into questions associated with symmetry and non-equilibrium statistical mechanics. Two surprising consequences to be explored in further work are that (in)distinguishability factors are automatically predicted from the problem formulation and that a quantity related to the thermodynamic entropy production is found by considering information loss in non-equilibrium processes. Using the problem of ion channel thermodynamics as an example, we illustrate the idea of building up complexity by successively adding information to create progressively more complex descriptions of a physical system. Our result is that such statistical mechanical descriptions can be used to create transparent, computable, experimentally-relevant models that may be informed by more detailed atomistic simulations. We also derive a theory for the kinetic behavior of this system, identifying the nonequilibrium `process' free energy functional. The Gibbs relation for this functional is a fluctuation-dissipation theorem applicable arbitrarily far from equilibrium, that captures the effect of non-local and time-dependent behavior from transient driving forces. Based on this work, it is clear that statistical mechanics is a general tool for constructing the relationships between constraints on system information.
Two-timescale Extragradient for Finding Local Minimax Points
Minimax problems are notoriously challenging to optimize. However, we demonstrate that the two-timescale extragradient can be a viable solution. By utilizing dynamical systems theory, we show that it converges to points that satisfy the second-order necessary condition of local minimax points, under a mild condition. This work surpasses all previous results as we eliminate a crucial assumption that the Hessian, with respect to the maximization variable, is nondegenerate.
Mean-field underdamped Langevin dynamics and its spacetime discretization
We propose a new method called the N-particle underdamped Langevin algorithm for optimizing a special class of non-linear functionals defined over the space of probability measures. Examples of problems with this formulation include training mean-field neural networks, maximum mean discrepancy minimization and kernel Stein discrepancy minimization. Our algorithm is based on a novel spacetime discretization of the mean-field underdamped Langevin dynamics, for which we provide a new, fast mixing guarantee. In addition, we demonstrate that our algorithm converges globally in total variation distance, bridging the theoretical gap between the dynamics and its practical implementation.
GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs
Estimating physical properties for visual data is a crucial task in computer vision, graphics, and robotics, underpinning applications such as augmented reality, physical simulation, and robotic grasping. However, this area remains under-explored due to the inherent ambiguities in physical property estimation. To address these challenges, we introduce GaussianProperty, a training-free framework that assigns physical properties of materials to 3D Gaussians. Specifically, we integrate the segmentation capability of SAM with the recognition capability of GPT-4V(ision) to formulate a global-local physical property reasoning module for 2D images. Then we project the physical properties from multi-view 2D images to 3D Gaussians using a voting strategy. We demonstrate that 3D Gaussians with physical property annotations enable applications in physics-based dynamic simulation and robotic grasping. For physics-based dynamic simulation, we leverage the Material Point Method (MPM) for realistic dynamic simulation. For robot grasping, we develop a grasping force prediction strategy that estimates a safe force range required for object grasping based on the estimated physical properties. Extensive experiments on material segmentation, physics-based dynamic simulation, and robotic grasping validate the effectiveness of our proposed method, highlighting its crucial role in understanding physical properties from visual data. Online demo, code, more cases and annotated datasets are available on https://Gaussian-Property.github.io{this https URL}.
Information Shapes Koopman Representation
The Koopman operator provides a powerful framework for modeling dynamical systems and has attracted growing interest from the machine learning community. However, its infinite-dimensional nature makes identifying suitable finite-dimensional subspaces challenging, especially for deep architectures. We argue that these difficulties come from suboptimal representation learning, where latent variables fail to balance expressivity and simplicity. This tension is closely related to the information bottleneck (IB) dilemma: constructing compressed representations that are both compact and predictive. Rethinking Koopman learning through this lens, we demonstrate that latent mutual information promotes simplicity, yet an overemphasis on simplicity may cause latent space to collapse onto a few dominant modes. In contrast, expressiveness is sustained by the von Neumann entropy, which prevents such collapse and encourages mode diversity. This insight leads us to propose an information-theoretic Lagrangian formulation that explicitly balances this tradeoff. Furthermore, we propose a new algorithm based on the Lagrangian formulation that encourages both simplicity and expressiveness, leading to a stable and interpretable Koopman representation. Beyond quantitative evaluations, we further visualize the learned manifolds under our representations, observing empirical results consistent with our theoretical predictions. Finally, we validate our approach across a diverse range of dynamical systems, demonstrating improved performance over existing Koopman learning methods. The implementation is publicly available at https://github.com/Wenxuan52/InformationKoopman.
Deviation Dynamics in Cardinal Hedonic Games
Computing stable partitions in hedonic games is a challenging task because there exist games in which stable outcomes do not exist. Even more, these No-instances can often be leveraged to prove computational hardness results. We make this impression rigorous in a dynamic model of cardinal hedonic games by providing meta theorems. These imply hardness of deciding about the possible or necessary convergence of deviation dynamics based on the mere existence of No-instances. Our results hold for additively separable, fractional, and modified fractional hedonic games (ASHGs, FHGs, and MFHGs). Moreover, they encompass essentially all reasonable stability notions based on single-agent deviations. In addition, we propose dynamics as a method to find individually rational and contractually individual stable (CIS) partitions in ASHGs. In particular, we find that CIS dynamics from the singleton partition possibly converge after a linear number of deviations but may require an exponential number of deviations in the worst case.
Solving physics-based initial value problems with unsupervised machine learning
Initial value problems -- a system of ordinary differential equations and corresponding initial conditions -- can be used to describe many physical phenomena including those arise in classical mechanics. We have developed a novel approach to solve physics-based initial value problems using unsupervised machine learning. We propose a deep learning framework that models the dynamics of a variety of mechanical systems through neural networks. Our framework is flexible, allowing us to solve non-linear, coupled, and chaotic dynamical systems. We demonstrate the effectiveness of our approach on systems including a free particle, a particle in a gravitational field, a classical pendulum, and the H\'enon--Heiles system (a pair of coupled harmonic oscillators with a non-linear perturbation, used in celestial mechanics). Our results show that deep neural networks can successfully approximate solutions to these problems, producing trajectories which conserve physical properties such as energy and those with stationary action. We note that probabilistic activation functions, as defined in this paper, are required to learn any solutions of initial value problems in their strictest sense, and we introduce coupled neural networks to learn solutions of coupled systems.
Intriguing Properties of Quantization at Scale
Emergent properties have been widely adopted as a term to describe behavior not present in smaller models but observed in larger models. Recent work suggests that the trade-off incurred by quantization is also an emergent property, with sharp drops in performance in models over 6B parameters. In this work, we ask "are quantization cliffs in performance solely a factor of scale?" Against a backdrop of increased research focus on why certain emergent properties surface at scale, this work provides a useful counter-example. We posit that it is possible to optimize for a quantization friendly training recipe that suppresses large activation magnitude outliers. Here, we find that outlier dimensions are not an inherent product of scale, but rather sensitive to the optimization conditions present during pre-training. This both opens up directions for more efficient quantization, and poses the question of whether other emergent properties are inherent or can be altered and conditioned by optimization and architecture design choices. We successfully quantize models ranging in size from 410M to 52B with minimal degradation in performance.
From black holes to strange metals
Since the mid-eighties there has been an accumulation of metallic materials whose thermodynamic and transport properties differ significantly from those predicted by Fermi liquid theory. Examples of these so-called non-Fermi liquids include the strange metal phase of high transition temperature cuprates, and heavy fermion systems near a quantum phase transition. We report on a class of non-Fermi liquids discovered using gauge/gravity duality. The low energy behavior of these non-Fermi liquids is shown to be governed by a nontrivial infrared (IR) fixed point which exhibits nonanalytic scaling behavior only in the temporal direction. Within this class we find examples whose single-particle spectral function and transport behavior resemble those of strange metals. In particular, the contribution from the Fermi surface to the conductivity is inversely proportional to the temperature. In our treatment these properties can be understood as being controlled by the scaling dimension of the fermion operator in the emergent IR fixed point.
Learning Dynamical Demand Response Model in Real-Time Pricing Program
Price responsiveness is a major feature of end use customers (EUCs) that participate in demand response (DR) programs, and has been conventionally modeled with static demand functions, which take the electricity price as the input and the aggregate energy consumption as the output. This, however, neglects the inherent temporal correlation of the EUC behaviors, and may result in large errors when predicting the actual responses of EUCs in real-time pricing (RTP) programs. In this paper, we propose a dynamical DR model so as to capture the temporal behavior of the EUCs. The states in the proposed dynamical DR model can be explicitly chosen, in which case the model can be represented by a linear function or a multi-layer feedforward neural network, or implicitly chosen, in which case the model can be represented by a recurrent neural network or a long short-term memory unit network. In both cases, the dynamical DR model can be learned from historical price and energy consumption data. Numerical simulation illustrated how the states are chosen and also showed the proposed dynamical DR model significantly outperforms the static ones.
Thermodynamic Performance Limits for Score-Based Diffusion Models
We establish a fundamental connection between score-based diffusion models and non-equilibrium thermodynamics by deriving performance limits based on entropy rates. Our main theoretical contribution is a lower bound on the negative log-likelihood of the data that relates model performance to entropy rates of diffusion processes. We numerically validate this bound on a synthetic dataset and investigate its tightness. By building a bridge to entropy rates - system, intrinsic, and exchange entropy - we provide new insights into the thermodynamic operation of these models, drawing parallels to Maxwell's demon and implications for thermodynamic computing hardware. Our framework connects generative modeling performance to fundamental physical principles through stochastic thermodynamics.
Course Correcting Koopman Representations
Koopman representations aim to learn features of nonlinear dynamical systems (NLDS) which lead to linear dynamics in the latent space. Theoretically, such features can be used to simplify many problems in modeling and control of NLDS. In this work we study autoencoder formulations of this problem, and different ways they can be used to model dynamics, specifically for future state prediction over long horizons. We discover several limitations of predicting future states in the latent space and propose an inference-time mechanism, which we refer to as Periodic Reencoding, for faithfully capturing long term dynamics. We justify this method both analytically and empirically via experiments in low and high dimensional NLDS.
Detailed balance in large language model-driven agents
Large language model (LLM)-driven agents are emerging as a powerful new paradigm for solving complex problems. Despite the empirical success of these practices, a theoretical framework to understand and unify their macroscopic dynamics remains lacking. This Letter proposes a method based on the least action principle to estimate the underlying generative directionality of LLMs embedded within agents. By experimentally measuring the transition probabilities between LLM-generated states, we statistically discover a detailed balance in LLM-generated transitions, indicating that LLM generation may not be achieved by generally learning rule sets and strategies, but rather by implicitly learning a class of underlying potential functions that may transcend different LLM architectures and prompt templates. To our knowledge, this is the first discovery of a macroscopic physical law in LLM generative dynamics that does not depend on specific model details. This work is an attempt to establish a macroscopic dynamics theory of complex AI systems, aiming to elevate the study of AI agents from a collection of engineering practices to a science built on effective measurements that are predictable and quantifiable.
rd-spiral: An open-source Python library for learning 2D reaction-diffusion dynamics through pseudo-spectral method
We introduce rd-spiral, an open-source Python library for simulating 2D reaction-diffusion systems using pseudo-spectral methods. The framework combines FFT-based spatial discretization with adaptive Dormand-Prince time integration, achieving exponential convergence while maintaining pedagogical clarity. We analyze three dynamical regimes: stable spirals, spatiotemporal chaos, and pattern decay, revealing extreme non-Gaussian statistics (kurtosis >96) in stable states. Information-theoretic metrics show 10.7% reduction in activator-inhibitor coupling during turbulence versus 6.5% in stable regimes. The solver handles stiffness ratios >6:1 with features including automated equilibrium classification and checkpointing. Effect sizes (delta=0.37--0.78) distinguish regimes, with asymmetric field sensitivities to perturbations. By balancing computational rigor with educational transparency, rd-spiral bridges theoretical and practical nonlinear dynamics.
Understanding Self-Predictive Learning for Reinforcement Learning
We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite its recent empirical success, such algorithms have an apparent defect: trivial representations (such as constants) minimize the prediction error, yet it is obviously undesirable to converge to such solutions. Our central insight is that careful designs of the optimization dynamics are critical to learning meaningful representations. We identify that a faster paced optimization of the predictor and semi-gradient updates on the representation, are crucial to preventing the representation collapse. Then in an idealized setup, we show self-predictive learning dynamics carries out spectral decomposition on the state transition matrix, effectively capturing information of the transition dynamics. Building on the theoretical insights, we propose bidirectional self-predictive learning, a novel self-predictive algorithm that learns two representations simultaneously. We examine the robustness of our theoretical insights with a number of small-scale experiments and showcase the promise of the novel representation learning algorithm with large-scale experiments.
Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs
Generating realistic time series data is important for many engineering and scientific applications. Existing work tackles this problem using generative adversarial networks (GANs). However, GANs are often unstable during training, and they can suffer from mode collapse. While variational autoencoders (VAEs) are known to be more robust to these issues, they are (surprisingly) less often considered for time series generation. In this work, we introduce Koopman VAE (KVAE), a new generative framework that is based on a novel design for the model prior, and that can be optimized for either regular and irregular training data. Inspired by Koopman theory, we represent the latent conditional prior dynamics using a linear map. Our approach enhances generative modeling with two desired features: (i) incorporating domain knowledge can be achieved by leverageing spectral tools that prescribe constraints on the eigenvalues of the linear map; and (ii) studying the qualitative behavior and stablity of the system can be performed using tools from dynamical systems theory. Our results show that KVAE outperforms state-of-the-art GAN and VAE methods across several challenging synthetic and real-world time series generation benchmarks. Whether trained on regular or irregular data, KVAE generates time series that improve both discriminative and predictive metrics. We also present visual evidence suggesting that KVAE learns probability density functions that better approximate empirical ground truth distributions.
DyFraNet: Forecasting and Backcasting Dynamic Fracture Mechanics in Space and Time Using a 2D-to-3D Deep Neural Network
The dynamics of materials failure is one of the most critical phenomena in a range of scientific and engineering fields, from healthcare to structural materials to transportation. In this paper we propose a specially designed deep neural network, DyFraNet, which can predict dynamic fracture behaviors by identifying a complete history of fracture propagation - from cracking onset, as a crack grows through the material, modeled as a series of frames evolving over time and dependent on each other. Furthermore, this model can not only forecast future fracture processes but also backcast to elucidate the past fracture history. In this scenario, once provided with the outcome of a fracture event, the model will elucidate past events that led to this state and will predict the future evolution of the failure process. By comparing the predicted results with atomistic-level simulations and theory, we show that DyFraNet can capture dynamic fracture mechanics by accurately predicting how cracks develop over time, including measures such as the crack speed, as well as when cracks become unstable. We use GradCAM to interpret how DyFraNet perceives the relationship between geometric conditions and fracture dynamics and we find DyFraNet pays special attention to the areas around crack tips, which have a critical influence in the early stage of fracture propagation. In later stages, the model pays increased attention to the existing or newly formed damage distribution in the material. The proposed approach offers significant potential to accelerate the exploration of the dynamics in material design against fracture failures and can be beneficially adapted for all kinds of dynamical engineering problems.
A Low-complexity Structured Neural Network to Realize States of Dynamical Systems
Data-driven learning is rapidly evolving and places a new perspective on realizing state-space dynamical systems. However, dynamical systems derived from nonlinear ordinary differential equations (ODEs) suffer from limitations in computational efficiency. Thus, this paper stems from data-driven learning to advance states of dynamical systems utilizing a structured neural network (StNN). The proposed learning technique also seeks to identify an optimal, low-complexity operator to solve dynamical systems, the so-called Hankel operator, derived from time-delay measurements. Thus, we utilize the StNN based on the Hankel operator to solve dynamical systems as an alternative to existing data-driven techniques. We show that the proposed StNN reduces the number of parameters and computational complexity compared with the conventional neural networks and also with the classical data-driven techniques, such as Sparse Identification of Nonlinear Dynamics (SINDy) and Hankel Alternative view of Koopman (HAVOK), which is commonly known as delay-Dynamic Mode Decomposition(DMD) or Hankel-DMD. More specifically, we present numerical simulations to solve dynamical systems utilizing the StNN based on the Hankel operator beginning from the fundamental Lotka-Volterra model, where we compare the StNN with the LEarning Across Dynamical Systems (LEADS), and extend our analysis to highly nonlinear and chaotic Lorenz systems, comparing the StNN with conventional neural networks, SINDy, and HAVOK. Hence, we show that the proposed StNN paves the way for realizing state-space dynamical systems with a low-complexity learning algorithm, enabling prediction and understanding of future states.
Electronic properties, correlated topology and Green's function zeros
There is extensive current interest about electronic topology in correlated settings. In strongly correlated systems, contours of Green's function zeros may develop in frequency-momentum space, and their role in correlated topology has increasingly been recognized. However, whether and how the zeros contribute to electronic properties is a matter of uncertainty. Here we address the issue in an exactly solvable model for Mott insulator. We show that the Green's function zeros contribute to several physically measurable correlation functions, in a way that does not run into inconsistencies. In particular, the physical properties remain robust to chemical potential variations up to the Mott gap as it should be based on general considerations. Our work sets the stage for further understandings on the rich interplay among topology, symmetry and strong correlations.
Learning minimal representations of stochastic processes with variational autoencoders
Stochastic processes have found numerous applications in science, as they are broadly used to model a variety of natural phenomena. Due to their intrinsic randomness and uncertainty, they are however difficult to characterize. Here, we introduce an unsupervised machine learning approach to determine the minimal set of parameters required to effectively describe the dynamics of a stochastic process. Our method builds upon an extended beta-variational autoencoder architecture. By means of simulated datasets corresponding to paradigmatic diffusion models, we showcase its effectiveness in extracting the minimal relevant parameters that accurately describe these dynamics. Furthermore, the method enables the generation of new trajectories that faithfully replicate the expected stochastic behavior. Overall, our approach enables for the autonomous discovery of unknown parameters describing stochastic processes, hence enhancing our comprehension of complex phenomena across various fields.
Orb-v3: atomistic simulation at scale
We introduce Orb-v3, the next generation of the Orb family of universal interatomic potentials. Models in this family expand the performance-speed-memory Pareto frontier, offering near SoTA performance across a range of evaluations with a >10x reduction in latency and > 8x reduction in memory. Our experiments systematically traverse this frontier, charting the trade-off induced by roto-equivariance, conservatism and graph sparsity. Contrary to recent literature, we find that non-equivariant, non-conservative architectures can accurately model physical properties, including those which require higher-order derivatives of the potential energy surface. This model release is guided by the principle that the most valuable foundation models for atomic simulation will excel on all fronts: accuracy, latency and system size scalability. The reward for doing so is a new era of computational chemistry driven by high-throughput and mesoscale all-atom simulations.
Phase diagram and eigenvalue dynamics of stochastic gradient descent in multilayer neural networks
Hyperparameter tuning is one of the essential steps to guarantee the convergence of machine learning models. We argue that intuition about the optimal choice of hyperparameters for stochastic gradient descent can be obtained by studying a neural network's phase diagram, in which each phase is characterised by distinctive dynamics of the singular values of weight matrices. Taking inspiration from disordered systems, we start from the observation that the loss landscape of a multilayer neural network with mean squared error can be interpreted as a disordered system in feature space, where the learnt features are mapped to soft spin degrees of freedom, the initial variance of the weight matrices is interpreted as the strength of the disorder, and temperature is given by the ratio of the learning rate and the batch size. As the model is trained, three phases can be identified, in which the dynamics of weight matrices is qualitatively different. Employing a Langevin equation for stochastic gradient descent, previously derived using Dyson Brownian motion, we demonstrate that the three dynamical regimes can be classified effectively, providing practical guidance for the choice of hyperparameters of the optimiser.
Denoising Hamiltonian Network for Physical Reasoning
Machine learning frameworks for physical problems must capture and enforce physical constraints that preserve the structure of dynamical systems. Many existing approaches achieve this by integrating physical operators into neural networks. While these methods offer theoretical guarantees, they face two key limitations: (i) they primarily model local relations between adjacent time steps, overlooking longer-range or higher-level physical interactions, and (ii) they focus on forward simulation while neglecting broader physical reasoning tasks. We propose the Denoising Hamiltonian Network (DHN), a novel framework that generalizes Hamiltonian mechanics operators into more flexible neural operators. DHN captures non-local temporal relationships and mitigates numerical integration errors through a denoising mechanism. DHN also supports multi-system modeling with a global conditioning mechanism. We demonstrate its effectiveness and flexibility across three diverse physical reasoning tasks with distinct inputs and outputs.
Spacetime Neural Network for High Dimensional Quantum Dynamics
We develop a spacetime neural network method with second order optimization for solving quantum dynamics from the high dimensional Schr\"{o}dinger equation. In contrast to the standard iterative first order optimization and the time-dependent variational principle, our approach utilizes the implicit mid-point method and generates the solution for all spatial and temporal values simultaneously after optimization. We demonstrate the method in the Schr\"{o}dinger equation with a self-normalized autoregressive spacetime neural network construction. Future explorations for solving different high dimensional differential equations are discussed.
Piecewise DMD for oscillatory and Turing spatio-temporal dynamics
Dynamic Mode Decomposition (DMD) is an equation-free method that aims at reconstructing the best linear fit from temporal datasets. In this paper, we show that DMD does not provide accurate approximation for datasets describing oscillatory dynamics, like spiral waves and relaxation oscillations, or spatio-temporal Turing instability. Inspired from the classical "divide and conquer" approach, we propose a piecewise version of DMD (pDMD) to overcome this problem. The main idea is to split the original dataset in N submatrices and then apply the exact (randomized) DMD method in each subset of the obtained partition. We describe the pDMD algorithm in detail and we introduce some error indicators to evaluate its performance when N is increased. Numerical experiments show that very accurate reconstructions are obtained by pDMD for datasets arising from time snapshots of some reaction-diffusion PDE systems, like the FitzHugh-Nagumo model, the lambda-omega system and the DIB morpho-chemical system for battery modeling.
Automatic-differentiated Physics-Informed Echo State Network (API-ESN)
We propose the Automatic-differentiated Physics-Informed Echo State Network (API-ESN). The network is constrained by the physical equations through the reservoir's exact time-derivative, which is computed by automatic differentiation. As compared to the original Physics-Informed Echo State Network, the accuracy of the time-derivative is increased by up to seven orders of magnitude. This increased accuracy is key in chaotic dynamical systems, where errors grows exponentially in time. The network is showcased in the reconstruction of unmeasured (hidden) states of a chaotic system. The API-ESN eliminates a source of error, which is present in existing physics-informed echo state networks, in the computation of the time-derivative. This opens up new possibilities for an accurate reconstruction of chaotic dynamical states.
PropMolFlow: Property-guided Molecule Generation with Geometry-Complete Flow Matching
Molecule generation is advancing rapidly in chemical discovery and drug design. Flow matching methods have recently set the state of the art (SOTA) in unconditional molecule generation, surpassing score-based diffusion models. However, diffusion models still lead in property-guided generation. In this work, we introduce PropMolFlow, a novel approach for property-guided molecule generation based on geometry-complete SE(3)-equivariant flow matching. Integrating five different property embedding methods with a Gaussian expansion of scalar properties, PropMolFlow outperforms previous SOTA diffusion models in conditional molecule generation across various properties while preserving the stability and validity of the generated molecules, consistent with its unconditional counterpart. Additionally, it enables faster inference with significantly fewer time steps compared to baseline models. We highlight the importance of validating the properties of generated molecules through DFT calculations performed at the same level of theory as the training data. Specifically, our analysis identifies properties that require DFT validation and others where a pretrained SE(3) geometric vector perceptron regressors provide sufficiently accurate predictions on generated molecules. Furthermore, we introduce a new property metric designed to assess the model's ability to propose molecules with underrepresented property values, assessing its capacity for out-of-distribution generalization. Our findings reveal shortcomings in existing structural metrics, which mistakenly validate open-shell molecules or molecules with invalid valence-charge configurations, underscoring the need for improved evaluation frameworks. Overall, this work paves the way for developing targeted property-guided generation methods, enhancing the design of molecular generative models for diverse applications.
Generative Latent Space Dynamics of Electron Density
Modeling the time-dependent evolution of electron density is essential for understanding quantum mechanical behaviors of condensed matter and enabling predictive simulations in spectroscopy, photochemistry, and ultrafast science. Yet, while machine learning methods have advanced static density prediction, modeling its spatiotemporal dynamics remains largely unexplored. In this work, we introduce a generative framework that combines a 3D convolutional autoencoder with a latent diffusion model (LDM) to learn electron density trajectories from ab-initio molecular dynamics (AIMD) simulations. Our method encodes electron densities into a compact latent space and predicts their future states by sampling from the learned conditional distribution, enabling stable long-horizon rollouts without drift or collapse. To preserve statistical fidelity, we incorporate a scaled Jensen-Shannon divergence regularization that aligns generated and reference density distributions. On AIMD trajectories of liquid lithium at 800 K, our model accurately captures both the spatial correlations and the log-normal-like statistical structure of the density. The proposed framework has the potential to accelerate the simulation of quantum dynamics and overcome key challenges faced by current spatiotemporal machine learning methods as surrogates of quantum mechanical simulators.
Latent Field Discovery In Interacting Dynamical Systems With Neural Fields
Systems of interacting objects often evolve under the influence of field effects that govern their dynamics, yet previous works have abstracted away from such effects, and assume that systems evolve in a vacuum. In this work, we focus on discovering these fields, and infer them from the observed dynamics alone, without directly observing them. We theorize the presence of latent force fields, and propose neural fields to learn them. Since the observed dynamics constitute the net effect of local object interactions and global field effects, recently popularized equivariant networks are inapplicable, as they fail to capture global information. To address this, we propose to disentangle local object interactions -- which are SE(n) equivariant and depend on relative states -- from external global field effects -- which depend on absolute states. We model interactions with equivariant graph networks, and combine them with neural fields in a novel graph network that integrates field forces. Our experiments show that we can accurately discover the underlying fields in charged particles settings, traffic scenes, and gravitational n-body problems, and effectively use them to learn the system and forecast future trajectories.
Local Convergence of Gradient Descent-Ascent for Training Generative Adversarial Networks
Generative Adversarial Networks (GANs) are a popular formulation to train generative models for complex high dimensional data. The standard method for training GANs involves a gradient descent-ascent (GDA) procedure on a minimax optimization problem. This procedure is hard to analyze in general due to the nonlinear nature of the dynamics. We study the local dynamics of GDA for training a GAN with a kernel-based discriminator. This convergence analysis is based on a linearization of a non-linear dynamical system that describes the GDA iterations, under an isolated points model assumption from [Becker et al. 2022]. Our analysis brings out the effect of the learning rates, regularization, and the bandwidth of the kernel discriminator, on the local convergence rate of GDA. Importantly, we show phase transitions that indicate when the system converges, oscillates, or diverges. We also provide numerical simulations that verify our claims.
Zero Sound from Holography
Quantum liquids are characterized by the distinctive properties such as the low temperature behavior of heat capacity and the spectrum of low-energy quasiparticle excitations. In particular, at low temperature, Fermi liquids exhibit the zero sound, predicted by L. D. Landau in 1957 and subsequently observed in liquid He-3. In this paper, we ask a question whether such a characteristic behavior is present in theories with holographically dual description. We consider a class of gauge theories with fundamental matter fields whose holographic dual in the appropriate limit is given in terms of the Dirac-Born-Infeld action in AdS_{p+1} space. An example of such a system is the N=4 SU(N_c) supersymmetric Yang-Mills theory with N_f massless N=2 hypermultiplets at strong coupling, finite baryon number density, and low temperature. We find that these systems exhibit a zero sound mode despite having a non-Fermi liquid type behavior of the specific heat. These properties suggest that holography identifies a new type of quantum liquids.
First principles simulations of dense hydrogen
Accurate knowledge of the properties of hydrogen at high compression is crucial for astrophysics (e.g. planetary and stellar interiors, brown dwarfs, atmosphere of compact stars) and laboratory experiments, including inertial confinement fusion. There exists experimental data for the equation of state, conductivity, and Thomson scattering spectra. However, the analysis of the measurements at extreme pressures and temperatures typically involves additional model assumptions, which makes it difficult to assess the accuracy of the experimental data. rigorously. On the other hand, theory and modeling have produced extensive collections of data. They originate from a very large variety of models and simulations including path integral Monte Carlo (PIMC) simulations, density functional theory (DFT), chemical models, machine-learned models, and combinations thereof. At the same time, each of these methods has fundamental limitations (fermion sign problem in PIMC, approximate exchange-correlation functionals of DFT, inconsistent interaction energy contributions in chemical models, etc.), so for some parameter ranges accurate predictions are difficult. Recently, a number of breakthroughs in first principle PIMC and DFT simulations were achieved which are discussed in this review. Here we use these results to benchmark different simulation methods. We present an update of the hydrogen phase diagram at high pressures, the expected phase transitions, and thermodynamic properties including the equation of state and momentum distribution. Furthermore, we discuss available dynamic results for warm dense hydrogen, including the conductivity, dynamic structure factor, plasmon dispersion, imaginary-time structure, and density response functions. We conclude by outlining strategies to combine different simulations to achieve accurate theoretical predictions.
Symphony: Symmetry-Equivariant Point-Centered Spherical Harmonics for Molecule Generation
We present Symphony, an E(3)-equivariant autoregressive generative model for 3D molecular geometries that iteratively builds a molecule from molecular fragments. Existing autoregressive models such as G-SchNet and G-SphereNet for molecules utilize rotationally invariant features to respect the 3D symmetries of molecules. In contrast, Symphony uses message-passing with higher-degree E(3)-equivariant features. This allows a novel representation of probability distributions via spherical harmonic signals to efficiently model the 3D geometry of molecules. We show that Symphony is able to accurately generate small molecules from the QM9 dataset, outperforming existing autoregressive models and approaching the performance of diffusion models.
Persistent homology of the cosmic web. I: Hierarchical topology in ΛCDM cosmologies
Using a set of LambdaCDM simulations of cosmic structure formation, we study the evolving connectivity and changing topological structure of the cosmic web using state-of-the-art tools of multiscale topological data analysis (TDA). We follow the development of the cosmic web topology in terms of the evolution of Betti number curves and feature persistence diagrams of the three (topological) classes of structural features: matter concentrations, filaments and tunnels, and voids. The Betti curves specify the prominence of features as a function of density level, and their evolution with cosmic epoch reflects the changing network connections between these structural features. The persistence diagrams quantify the longevity and stability of topological features. In this study we establish, for the first time, the link between persistence diagrams, the features they show, and the gravitationally driven cosmic structure formation process. By following the diagrams' development over cosmic time, the link between the multiscale topology of the cosmic web and the hierarchical buildup of cosmic structure is established. The sharp apexes in the diagrams are intimately related to key transitions in the structure formation process. The apex in the matter concentration diagrams coincides with the density level at which, typically, they detach from the Hubble expansion and begin to collapse. At that level many individual islands merge to form the network of the cosmic web and a large number of filaments and tunnels emerge to establish its connecting bridges. The location trends of the apex possess a self-similar character that can be related to the cosmic web's hierarchical buildup. We find that persistence diagrams provide a significantly higher and more profound level of information on the structure formation process than more global summary statistics like Euler characteristic or Betti numbers.
A Framework for Fast and Stable Representations of Multiparameter Persistent Homology Decompositions
Topological data analysis (TDA) is an area of data science that focuses on using invariants from algebraic topology to provide multiscale shape descriptors for geometric data sets such as point clouds. One of the most important such descriptors is {\em persistent homology}, which encodes the change in shape as a filtration parameter changes; a typical parameter is the feature scale. For many data sets, it is useful to simultaneously vary multiple filtration parameters, for example feature scale and density. While the theoretical properties of single parameter persistent homology are well understood, less is known about the multiparameter case. In particular, a central question is the problem of representing multiparameter persistent homology by elements of a vector space for integration with standard machine learning algorithms. Existing approaches to this problem either ignore most of the multiparameter information to reduce to the one-parameter case or are heuristic and potentially unstable in the face of noise. In this article, we introduce a new general representation framework that leverages recent results on {\em decompositions} of multiparameter persistent homology. This framework is rich in information, fast to compute, and encompasses previous approaches. Moreover, we establish theoretical stability guarantees under this framework as well as efficient algorithms for practical computation, making this framework an applicable and versatile tool for analyzing geometric and point cloud data. We validate our stability results and algorithms with numerical experiments that demonstrate statistical convergence, prediction accuracy, and fast running times on several real data sets.
Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling
The dynamic nature of proteins is crucial for determining their biological functions and properties, for which Monte Carlo (MC) and molecular dynamics (MD) simulations stand as predominant tools to study such phenomena. By utilizing empirically derived force fields, MC or MD simulations explore the conformational space through numerically evolving the system via Markov chain or Newtonian mechanics. However, the high-energy barrier of the force fields can hamper the exploration of both methods by the rare event, resulting in inadequately sampled ensemble without exhaustive running. Existing learning-based approaches perform direct sampling yet heavily rely on target-specific simulation data for training, which suffers from high data acquisition cost and poor generalizability. Inspired by simulated annealing, we propose Str2Str, a novel structure-to-structure translation framework capable of zero-shot conformation sampling with roto-translation equivariant property. Our method leverages an amortized denoising score matching objective trained on general crystal structures and has no reliance on simulation data during both training and inference. Experimental results across several benchmarking protein systems demonstrate that Str2Str outperforms previous state-of-the-art generative structure prediction models and can be orders of magnitude faster compared to long MD simulations. Our open-source implementation is available at https://github.com/lujiarui/Str2Str
Efficient Dynamics Modeling in Interactive Environments with Koopman Theory
The accurate modeling of dynamics in interactive environments is critical for successful long-range prediction. Such a capability could advance Reinforcement Learning (RL) and Planning algorithms, but achieving it is challenging. Inaccuracies in model estimates can compound, resulting in increased errors over long horizons. We approach this problem from the lens of Koopman theory, where the nonlinear dynamics of the environment can be linearized in a high-dimensional latent space. This allows us to efficiently parallelize the sequential problem of long-range prediction using convolution while accounting for the agent's action at every time step. Our approach also enables stability analysis and better control over gradients through time. Taken together, these advantages result in significant improvement over the existing approaches, both in the efficiency and the accuracy of modeling dynamics over extended horizons. We also show that this model can be easily incorporated into dynamics modeling for model-based planning and model-free RL and report promising experimental results.
On the Dynamics of Acceleration in First order Gradient Methods
Ever since the original algorithm by Nesterov (1983), the true nature of the acceleration phenomenon has remained elusive, with various interpretations of why the method is actually faster. The diagnosis of the algorithm through the lens of Ordinary Differential Equations (ODEs) and the corresponding dynamical system formulation to explain the underlying dynamics has a rich history. In the literature, the ODEs that explain algorithms are typically derived by considering the limiting case of the algorithm maps themselves, that is, an ODE formulation follows the development of an algorithm. This obfuscates the underlying higher order principles and thus provides little evidence of the working of the algorithm. Such has been the case with Nesterov algorithm and the various analogies used to describe the acceleration phenomena, viz, momentum associated with the rolling of a Heavy-Ball down a slope, Hessian damping etc. The main focus of our work is to ideate the genesis of the Nesterov algorithm from the viewpoint of dynamical systems leading to demystifying the mathematical rigour behind the algorithm. Instead of reverse engineering ODEs from discrete algorithms, this work explores tools from the recently developed control paradigm titled Passivity and Immersion approach and the Geometric Singular Perturbation theory which are applied to arrive at the formulation of a dynamical system that explains and models the acceleration phenomena. This perspective helps to gain insights into the various terms present and the sequence of steps used in Nesterovs accelerated algorithm for the smooth strongly convex and the convex case. The framework can also be extended to derive the acceleration achieved using the triple momentum method and provides justifications for the non-convergence to the optimal solution in the Heavy-Ball method.
Feature Programming for Multivariate Time Series Prediction
We introduce the concept of programmable feature engineering for time series modeling and propose a feature programming framework. This framework generates large amounts of predictive features for noisy multivariate time series while allowing users to incorporate their inductive bias with minimal effort. The key motivation of our framework is to view any multivariate time series as a cumulative sum of fine-grained trajectory increments, with each increment governed by a novel spin-gas dynamical Ising model. This fine-grained perspective motivates the development of a parsimonious set of operators that summarize multivariate time series in an abstract fashion, serving as the foundation for large-scale automated feature engineering. Numerically, we validate the efficacy of our method on several synthetic and real-world noisy time series datasets.
From non-ergodic eigenvectors to local resolvent statistics and back: a random matrix perspective
We study the statistics of the local resolvent and non-ergodic properties of eigenvectors for a generalised Rosenzweig-Porter Ntimes N random matrix model, undergoing two transitions separated by a delocalised non-ergodic phase. Interpreting the model as the combination of on-site random energies {a_i} and a structurally disordered hopping, we found that each eigenstate is delocalised over N^{2-gamma} sites close in energy |a_j-a_i|leq N^{1-gamma} in agreement with Kravtsov et al, arXiv:1508.01714. Our other main result, obtained combining a recurrence relation for the resolvent matrix with insights from Dyson's Brownian motion, is to show that the properties of the non-ergodic delocalised phase can be probed studying the statistics of the local resolvent in a non-standard scaling limit.
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
In recent years, there has been rapid development in 3D generation models, opening up new possibilities for applications such as simulating the dynamic movements of 3D objects and customizing their behaviors. However, current 3D generative models tend to focus only on surface features such as color and shape, neglecting the inherent physical properties that govern the behavior of objects in the real world. To accurately simulate physics-aligned dynamics, it is essential to predict the physical properties of materials and incorporate them into the behavior prediction process. Nonetheless, predicting the diverse materials of real-world objects is still challenging due to the complex nature of their physical attributes. In this paper, we propose Physics3D, a novel method for learning various physical properties of 3D objects through a video diffusion model. Our approach involves designing a highly generalizable physical simulation system based on a viscoelastic material model, which enables us to simulate a wide range of materials with high-fidelity capabilities. Moreover, we distill the physical priors from a video diffusion model that contains more understanding of realistic object materials. Extensive experiments demonstrate the effectiveness of our method with both elastic and plastic materials. Physics3D shows great potential for bridging the gap between the physical world and virtual neural space, providing a better integration and application of realistic physical principles in virtual environments. Project page: https://liuff19.github.io/Physics3D.
User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems
Diffusion models are a class of probabilistic generative models that have been widely used as a prior for image processing tasks like text conditional generation and inpainting. We demonstrate that these models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems. In these applications, diffusion models can implicitly represent knowledge about outliers and extreme events; however, querying that knowledge through conditional sampling or measuring probabilities is surprisingly difficult. Existing methods for conditional sampling at inference time seek mainly to enforce the constraints, which is insufficient to match the statistics of the distribution or compute the probability of the chosen events. To achieve these ends, optimally one would use the conditional score function, but its computation is typically intractable. In this work, we develop a probabilistic approximation scheme for the conditional score function which provably converges to the true distribution as the noise level decreases. With this scheme we are able to sample conditionally on nonlinear userdefined events at inference time, and matches data statistics even when sampling from the tails of the distribution.
Probing Off-diagonal Eigenstate Thermalization with Tensor Networks
Energy filter methods in combination with quantum simulation can efficiently access the properties of quantum many-body systems at finite energy densities [Lu et al. PRX Quantum 2, 020321 (2021)]. Classically simulating this algorithm with tensor networks can be used to investigate the microcanonical properties of large spin chains, as recently shown in [Yang et al. Phys. Rev. B 106, 024307 (2022)]. Here we extend this strategy to explore the properties of off-diagonal matrix elements of observables in the energy eigenbasis, fundamentally connected to the thermalization behavior and the eigenstate thermalization hypothesis. We test the method on integrable and non-integrable spin chains of up to 60 sites, much larger than accessible with exact diagonalization. Our results allow us to explore the scaling of the off-diagonal functions with the size and energy difference, and to establish quantitative differences between integrable and non-integrable cases.
HMC with Normalizing Flows
We propose using Normalizing Flows as a trainable kernel within the molecular dynamics update of Hamiltonian Monte Carlo (HMC). By learning (invertible) transformations that simplify our dynamics, we can outperform traditional methods at generating independent configurations. We show that, using a carefully constructed network architecture, our approach can be easily scaled to large lattice volumes with minimal retraining effort. The source code for our implementation is publicly available online at https://github.com/nftqcd/fthmc.
Dynamical Linear Bandits
In many real-world sequential decision-making problems, an action does not immediately reflect on the feedback and spreads its effects over a long time frame. For instance, in online advertising, investing in a platform produces an instantaneous increase of awareness, but the actual reward, i.e., a conversion, might occur far in the future. Furthermore, whether a conversion takes place depends on: how fast the awareness grows, its vanishing effects, and the synergy or interference with other advertising platforms. Previous work has investigated the Multi-Armed Bandit framework with the possibility of delayed and aggregated feedback, without a particular structure on how an action propagates in the future, disregarding possible dynamical effects. In this paper, we introduce a novel setting, the Dynamical Linear Bandits (DLB), an extension of the linear bandits characterized by a hidden state. When an action is performed, the learner observes a noisy reward whose mean is a linear function of the hidden state and of the action. Then, the hidden state evolves according to linear dynamics, affected by the performed action too. We start by introducing the setting, discussing the notion of optimal policy, and deriving an expected regret lower bound. Then, we provide an optimistic regret minimization algorithm, Dynamical Linear Upper Confidence Bound (DynLin-UCB), that suffers an expected regret of order mathcal{O} Big( d sqrt{T}{(1-rho)^{3/2}} Big), where rho is a measure of the stability of the system, and d is the dimension of the action vector. Finally, we conduct a numerical validation on a synthetic environment and on real-world data to show the effectiveness of DynLin-UCB in comparison with several baselines.
Generative Modeling of Molecular Dynamics Trajectories
Molecular dynamics (MD) is a powerful technique for studying microscopic phenomena, but its computational cost has driven significant interest in the development of deep learning-based surrogate models. We introduce generative modeling of molecular trajectories as a paradigm for learning flexible multi-task surrogate models of MD from data. By conditioning on appropriately chosen frames of the trajectory, we show such generative models can be adapted to diverse tasks such as forward simulation, transition path sampling, and trajectory upsampling. By alternatively conditioning on part of the molecular system and inpainting the rest, we also demonstrate the first steps towards dynamics-conditioned molecular design. We validate the full set of these capabilities on tetrapeptide simulations and show that our model can produce reasonable ensembles of protein monomers. Altogether, our work illustrates how generative modeling can unlock value from MD data towards diverse downstream tasks that are not straightforward to address with existing methods or even MD itself. Code is available at https://github.com/bjing2016/mdgen.
Recovering a Molecule's 3D Dynamics from Liquid-phase Electron Microscopy Movies
The dynamics of biomolecules are crucial for our understanding of their functioning in living systems. However, current 3D imaging techniques, such as cryogenic electron microscopy (cryo-EM), require freezing the sample, which limits the observation of their conformational changes in real time. The innovative liquid-phase electron microscopy (liquid-phase EM) technique allows molecules to be placed in the native liquid environment, providing a unique opportunity to observe their dynamics. In this paper, we propose TEMPOR, a Temporal Electron MicroscoPy Object Reconstruction algorithm for liquid-phase EM that leverages an implicit neural representation (INR) and a dynamical variational auto-encoder (DVAE) to recover time series of molecular structures. We demonstrate its advantages in recovering different motion dynamics from two simulated datasets, 7bcq and Cas9. To our knowledge, our work is the first attempt to directly recover 3D structures of a temporally-varying particle from liquid-phase EM movies. It provides a promising new approach for studying molecules' 3D dynamics in structural biology.
Cross-Domain Policy Adaptation via Value-Guided Data Filtering
Generalizing policies across different domains with dynamics mismatch poses a significant challenge in reinforcement learning. For example, a robot learns the policy in a simulator, but when it is deployed in the real world, the dynamics of the environment may be different. Given the source and target domain with dynamics mismatch, we consider the online dynamics adaptation problem, in which case the agent can access sufficient source domain data while online interactions with the target domain are limited. Existing research has attempted to solve the problem from the dynamics discrepancy perspective. In this work, we reveal the limitations of these methods and explore the problem from the value difference perspective via a novel insight on the value consistency across domains. Specifically, we present the Value-Guided Data Filtering (VGDF) algorithm, which selectively shares transitions from the source domain based on the proximity of paired value targets across the two domains. Empirical results on various environments with kinematic and morphology shifts demonstrate that our method achieves superior performance compared to prior approaches.
MatterGen: a generative model for inorganic materials design
The design of functional materials with desired properties is essential in driving technological advances in areas like energy storage, catalysis, and carbon capture. Generative models provide a new paradigm for materials design by directly generating entirely novel materials given desired property constraints. Despite recent progress, current generative models have low success rate in proposing stable crystals, or can only satisfy a very limited set of property constraints. Here, we present MatterGen, a model that generates stable, diverse inorganic materials across the periodic table and can further be fine-tuned to steer the generation towards a broad range of property constraints. To enable this, we introduce a new diffusion-based generative process that produces crystalline structures by gradually refining atom types, coordinates, and the periodic lattice. We further introduce adapter modules to enable fine-tuning towards any given property constraints with a labeled dataset. Compared to prior generative models, structures produced by MatterGen are more than twice as likely to be novel and stable, and more than 15 times closer to the local energy minimum. After fine-tuning, MatterGen successfully generates stable, novel materials with desired chemistry, symmetry, as well as mechanical, electronic and magnetic properties. Finally, we demonstrate multi-property materials design capabilities by proposing structures that have both high magnetic density and a chemical composition with low supply-chain risk. We believe that the quality of generated materials and the breadth of MatterGen's capabilities represent a major advancement towards creating a universal generative model for materials design.
VisionLaw: Inferring Interpretable Intrinsic Dynamics from Visual Observations via Bilevel Optimization
The intrinsic dynamics of an object governs its physical behavior in the real world, playing a critical role in enabling physically plausible interactive simulation with 3D assets. Existing methods have attempted to infer the intrinsic dynamics of objects from visual observations, but generally face two major challenges: one line of work relies on manually defined constitutive priors, making it difficult to generalize to complex scenarios; the other models intrinsic dynamics using neural networks, resulting in limited interpretability and poor generalization. To address these challenges, we propose VisionLaw, a bilevel optimization framework that infers interpretable expressions of intrinsic dynamics from visual observations. At the upper level, we introduce an LLMs-driven decoupled constitutive evolution strategy, where LLMs are prompted as a knowledgeable physics expert to generate and revise constitutive laws, with a built-in decoupling mechanism that substantially reduces the search complexity of LLMs. At the lower level, we introduce a vision-guided constitutive evaluation mechanism, which utilizes visual simulation to evaluate the consistency between the generated constitutive law and the underlying intrinsic dynamics, thereby guiding the upper-level evolution. Experiments on both synthetic and real-world datasets demonstrate that VisionLaw can effectively infer interpretable intrinsic dynamics from visual observations. It significantly outperforms existing state-of-the-art methods and exhibits strong generalization for interactive simulation in novel scenarios.
Controlgym: Large-Scale Safety-Critical Control Environments for Benchmarking Reinforcement Learning Algorithms
We introduce controlgym, a library of thirty-six safety-critical industrial control settings, and ten infinite-dimensional partial differential equation (PDE)-based control problems. Integrated within the OpenAI Gym/Gymnasium (Gym) framework, controlgym allows direct applications of standard reinforcement learning (RL) algorithms like stable-baselines3. Our control environments complement those in Gym with continuous, unbounded action and observation spaces, motivated by real-world control applications. Moreover, the PDE control environments uniquely allow the users to extend the state dimensionality of the system to infinity while preserving the intrinsic dynamics. This feature is crucial for evaluating the scalability of RL algorithms for control. This project serves the learning for dynamics & control (L4DC) community, aiming to explore key questions: the convergence of RL algorithms in learning control policies; the stability and robustness issues of learning-based controllers; and the scalability of RL algorithms to high- and potentially infinite-dimensional systems. We open-source the controlgym project at https://github.com/xiangyuan-zhang/controlgym.
Constructor Theory of Thermodynamics
All current formulations of thermodynamics invoke some form of coarse-graining or ensembles as the supposed link between their own laws and the microscopic laws of motion. They deal only with ensemble-averages, expectation values, macroscopic limits, infinite heat baths, etc., not with the details of physical variables of individual microscopic systems. They are consistent with the laws of motion for finite systems only in certain approximations, which improve with increasing scale, given various assumptions about initial conditions which are neither specified precisely nor even thought to hold exactly in nature. Here I propose a new formulation of the zeroth, first and second laws, improving upon the axiomatic approach to thermodynamics (Carath\'eodory, 1909; Lieb & Yngvason, 1999), via the principles of the recently proposed constructor theory. Specifically, I provide a non-approximative, scale-independent formulation of 'adiabatic accessibility'; this in turn provides a non-approximative, scale-independent distinction between work and heat and reveals an unexpected connection between information theory and the first law of thermodynamics (not just the second). It also achieves the long-sought unification of the axiomatic approach with Kelvin's.
Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton
In contrast to entropy, which increases monotonically, the "complexity" or "interestingness" of closed systems seems intuitively to increase at first and then decrease as equilibrium is approached. For example, our universe lacked complex structures at the Big Bang and will also lack them after black holes evaporate and particles are dispersed. This paper makes an initial attempt to quantify this pattern. As a model system, we use a simple, two-dimensional cellular automaton that simulates the mixing of two liquids ("coffee" and "cream"). A plausible complexity measure is then the Kolmogorov complexity of a coarse-grained approximation of the automaton's state, which we dub the "apparent complexity." We study this complexity measure, and show analytically that it never becomes large when the liquid particles are non-interacting. By contrast, when the particles do interact, we give numerical evidence that the complexity reaches a maximum comparable to the "coffee cup's" horizontal dimension. We raise the problem of proving this behavior analytically.
Analytical Lyapunov Function Discovery: An RL-based Generative Approach
Despite advances in learning-based methods, finding valid Lyapunov functions for nonlinear dynamical systems remains challenging. Current neural network approaches face two main issues: challenges in scalable verification and limited interpretability. To address these, we propose an end-to-end framework using transformers to construct analytical Lyapunov functions (local), which simplifies formal verification, enhances interpretability, and provides valuable insights for control engineers. Our framework consists of a transformer-based trainer that generates candidate Lyapunov functions and a falsifier that verifies candidate expressions and refines the model via risk-seeking policy gradient. Unlike Alfarano et al. (2024), which utilizes pre-training and seeks global Lyapunov functions for low-dimensional systems, our model is trained from scratch via reinforcement learning (RL) and succeeds in finding local Lyapunov functions for high-dimensional and non-polynomial systems. Given the analytical nature of the candidates, we employ efficient optimization methods for falsification during training and formal verification tools for the final verification. We demonstrate the efficiency of our approach on a range of nonlinear dynamical systems with up to ten dimensions and show that it can discover Lyapunov functions not previously identified in the control literature.
Artificial Intelligence for EEG Prediction: Applied Chaos Theory
In the present research, we delve into the intricate realm of electroencephalogram (EEG) data analysis, focusing on sequence-to-sequence prediction of data across 32 EEG channels. The study harmoniously fuses the principles of applied chaos theory and dynamical systems theory to engender a novel feature set, enriching the representational capacity of our deep learning model. The endeavour's cornerstone is a transformer-based sequence-to-sequence architecture, calibrated meticulously to capture the non-linear and high-dimensional temporal dependencies inherent in EEG sequences. Through judicious architecture design, parameter initialisation strategies, and optimisation techniques, we have navigated the intricate balance between computational expediency and predictive performance. Our model stands as a vanguard in EEG data sequence prediction, demonstrating remarkable generalisability and robustness. The findings not only extend our understanding of EEG data dynamics but also unveil a potent analytical framework that can be adapted to diverse temporal sequence prediction tasks in neuroscience and beyond.
Collective Dynamics from Stochastic Thermodynamics
From a viewpoint of stochastic thermodynamics, we derive equations that describe the collective dynamics near the order-disorder transition in the globally coupled XY model and near the synchronization-desynchronization transition in the Kuramoto model. A new way of thinking is to interpret the deterministic time evolution of a macroscopic variable as an external operation to a thermodynamic system. We then find that the irreversible work determines the equation for the collective dynamics. When analyzing the Kuramoto model, we employ a generalized concept of irreversible work which originates from a non-equilibrium identity associated with steady state thermodynamics.
Scaling Properties of Avalanche Activity in the Two-Dimensional Abelian Sandpile Model
We study the scaling properties of avalanche activity in the two-dimensional Abelian sandpile model. Instead of the conventional avalanche size distribution, we analyze the site activity distribution, which measures how often a site participates in avalanches when grains are added across the lattice. Using numerical simulations for system sizes up to \(L = 160\), averaged over \(10^4\) configurations, we determine the probability distribution \(P(A, L)\) of site activities. The results show that \(P(A, L)\) follows a finite-size scaling form \[ P(A, L) \sim L^{-2} F\Big(A{L^2}\Big). \] For small values \(A \ll L^2\) the scaling function behaves as \[ F(u) \sim u^{-1/2}, \quad corresponding to \quad P(A) \sim 1{L}, \] while for large activities \(A \sim O(L^2)\) the distribution decays as \[ F(u) \sim \exp\big(-c_3 u - c_4 u^2\big). \] The crossover between these two regimes occurs at \[ A^* \sim 0.1 \, L^2, \] marking the threshold between typical and highly excitable sites. This characterization of local avalanche activity provides complementary information to the usual avalanche size statistics, highlighting how local regions serve as frequent conduits for critical dynamics. These results may help connect sandpile models to real-world self-organized critical systems where only partial local activity can be observed.
Designing High-Tc Superconductors with BCS-inspired Screening, Density Functional Theory and Deep-learning
We develop a multi-step workflow for the discovery of conventional superconductors, starting with a Bardeen Cooper Schrieffer inspired pre-screening of 1736 materials with high Debye temperature and electronic density of states. Next, we perform electron-phonon coupling calculations for 1058 of them to establish a large and systematic database of BCS superconducting properties. Using the McMillan-Allen-Dynes formula, we identify 105 dynamically stable materials with transition temperatures, Tc>5 K. Additionally, we analyze trends in our dataset and individual materials including MoN, VC, VTe, KB6, Ru3NbC, V3Pt, ScN, LaN2, RuO2, and TaC. We demonstrate that deep-learning(DL) models can predict superconductor properties faster than direct first principles computations. Notably, we find that by predicting the Eliashberg function as an intermediate quantity, we can improve model performance versus a direct DL prediction of Tc. We apply the trained models on the crystallographic open database and pre-screen candidates for further DFT calculations.
