#CSNA

arXiv.org
AMG with Filtering: An Efficient Preconditioner for Interior Point Methods in Large-Scale Contact Mechanics Optimization
Large-scale contact mechanics simulations are crucial in many engineering fields such as structural design and manufacturing. In the frictionless case, contact can be modeled by minimizing an energy functional; however, these problems are often nonlinear, non-convex, and increasingly difficult to solve as mesh resolution increases. In this work, we employ a Newton-based interior-point (IP) filter line-search method, an effective approach for large-scale constrained optimization. While this method converges rapidly, each iteration requires solving a large saddle-point linear system that becomes ill-conditioned as the optimization process converges, largely due to IP treatment of the contact constraints. Such ill-conditioning can hinder solver scalability and increase iteration counts with mesh refinement. To address this, we introduce a novel preconditioner, AMG with Filtering (AMGF), tailored to the Schur complement of the saddle-point system. Building on the classical algebraic multigrid (AMG) solver, commonly used for elasticity, we augment it with a specialized subspace correction that filters near-null-space components introduced by contact interface constraints. Through theoretical analysis and numerical experiments on a range of linear and nonlinear contact problems, we demonstrate that the proposed solver achieves mesh-independent convergence and maintains robustness against the ill-conditioning that notoriously plagues IP methods. These results indicate that AMGF makes contact mechanics simulations more tractable and broadens the applicability of Newton-based IP methods in challenging engineering scenarios. More broadly, AMGF is well suited for problems, optimization or otherwise, where solver performance is limited by a problematic low-dimensional subspace. This makes the method widely applicable beyond contact mechanics and constrained optimization.
arXiv.org
The Polar Express: Optimal Matrix Sign Methods and Their Application to the Muon Algorithm
Computing the polar decomposition and the related matrix sign function has been a well-studied problem in numerical analysis for decades. More recently, it has emerged as an important subroutine in deep learning, particularly within the Muon optimization framework. However, the requirements in this setting differ significantly from those of traditional numerical analysis. In deep learning, methods must be highly efficient and GPU-compatible, but high accuracy is often unnecessary. As a result, classical algorithms like Newton-Schulz (which suffers from slow initial convergence) and methods based on rational functions (which rely on QR decompositions or matrix inverses) are poorly suited to this context. In this work, we introduce Polar Express, a GPU-friendly algorithm for computing the polar decomposition. Like classical polynomial methods such as Newton-Schulz, our approach uses only matrix-matrix multiplications, making it GPU-compatible. Motivated by earlier work of Chen & Chow and Nakatsukasa & Freund, Polar Express adapts the polynomial update rule at each iteration by solving a minimax optimization problem, and we prove that it enjoys a strong worst-case optimality guarantee. This property ensures both rapid early convergence and fast asymptotic convergence. We also address finite-precision issues, making it stable in bfloat16 in practice. We apply Polar Express within the Muon optimization framework and show consistent improvements in validation loss on large-scale models such as GPT-2, outperforming recent alternatives across a range of learning rates.
#cslg #csai #cscl
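For readers unfamiliar with the classical baseline named in the abstract, here is a minimal sketch of the textbook cubic Newton-Schulz iteration for the orthogonal polar factor. It is not the Polar Express update rule, whose per-iteration polynomial coefficients are not given in the post. The function names (`newton_schulz_polar`, `matmul`, `transpose`, `frobenius_norm`) and the default step count are illustrative assumptions, and plain Python lists are used only to keep the example dependency-free.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_norm(a):
    """Frobenius norm, an upper bound on the largest singular value."""
    return sum(v * v for row in a for v in row) ** 0.5


def newton_schulz_polar(a, steps=30):
    """Approximate the orthogonal polar factor of a full-rank matrix.

    Classical iteration: X <- 1.5 * X - 0.5 * X @ X.T @ X.
    Only matrix-matrix products are used (hypothetical helper, fixed step count).
    """
    # Scale so every singular value lies in (0, 1], inside the convergence region.
    scale = frobenius_norm(a)
    x = [[v / scale for v in row] for row in a]
    for _ in range(steps):
        # Each singular value s is mapped to 1.5*s - 0.5*s**3, which tends to 1.
        # Small s grows only by about 1.5x per step: the slow initial phase.
        x_xtx = matmul(x, matmul(transpose(x), x))
        x = [[1.5 * x[i][j] - 0.5 * x_xtx[i][j] for j in range(len(x[0]))]
             for i in range(len(x))]
    return x


# Usage: [[0, -2], [1, 0]] is a 90-degree rotation times diag(1, 2),
# so the result is approximately the rotation [[0, -1], [1, 0]].
q = newton_schulz_polar([[0.0, -2.0], [1.0, 0.0]])
```

The initial scaling by the Frobenius norm is one common, simple choice; it guarantees convergence but can make the smallest singular values start far from 1, which is the slow early phase the abstract refers to.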
arXiv.org
Accelerating Fast Ewald Summation with Prolates for Molecular Dynamics Simulations
Fast Ewald summation is the most widely used approach for computing long-range Coulomb interactions in molecular dynamics (MD) simulations. While the asymptotic scaling is nearly optimal, its performance on parallel architectures is dominated by the global communication required for the underlying fast Fourier transform (FFT). Here, we develop a novel method, ESP (Ewald summation with prolate spheroidal wave functions, PSWFs), that, for a fixed precision, sharply reduces the size of this transform by performing the Ewald split via a PSWF. In addition, PSWFs minimize the cost of spreading and interpolation steps that move information between the particles and the underlying uniform grid. We have integrated the ESP method into two widely used open-source MD packages: LAMMPS and GROMACS. Detailed benchmarks show that this reduces the cost of computing far-field electrostatic interactions by an order of magnitude, leading to better strong scaling with respect to number of cores. The total execution time is reduced by a factor of 2 to 3 when using more than one thousand cores, even after optimally tuning the existing internal parameters in the native codes. We validate the accelerated codes in realistic long-time biological simulations.
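To make the "Ewald split" mentioned above concrete, the sketch below shows the classical Gaussian-screened version, 1/r = erfc(αr)/r + erf(αr)/r, in which the first term decays quickly and is summed directly while the smooth second term is handled on a grid with the FFT. The ESP method replaces the Gaussian with a PSWF-based split, which is not reproduced here. The function name `ewald_split` and the parameter name `alpha` are illustrative assumptions.

```python
import math


def ewald_split(r, alpha):
    """Split the Coulomb kernel 1/r into short- and long-range parts.

    Classical Gaussian split (not the PSWF split used by ESP):
      short-range: erfc(alpha * r) / r, decays rapidly, summed in real space
      long-range:  erf(alpha * r) / r,  smooth, handled in Fourier space
    """
    short_range = math.erfc(alpha * r) / r
    long_range = math.erf(alpha * r) / r
    return short_range, long_range


# Usage: the two parts always add back up to 1/r, whatever alpha is chosen.
short, long_ = ewald_split(r=2.0, alpha=0.8)
total = short + long_  # equals 1 / 2.0 up to rounding
```

A larger `alpha` shifts work from the real-space sum to the Fourier-space sum; tuning that balance is the kind of internal parameter choice the abstract says was optimized in the native codes.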
arXiv.org
A monolithic first-order BSSNOK formulation of the Einstein-Euler equations and its solution with path-conservative finite difference CWENO schemes
We present a new, monolithic first-order (both in time and space) BSSNOK formulation of the coupled Einstein-Euler equations. The entire system of hyperbolic PDEs is solved in a completely unified manner via one single numerical scheme applied to both the conservative sector of the matter part and to the first-order strictly non-conservative sector of the spacetime evolution. The coupling between matter and space-time is achieved via algebraic source terms. The numerical scheme used for the solution of the new monolithic first-order formulation is a path-conservative central WENO (CWENO) finite difference scheme, with suitable insertions to account for the presence of the non-conservative terms. By solving several crucial tests of numerical general relativity, including a stable neutron star, Riemann problems in relativistic matter with shock waves and the stable long-time evolution of single and binary puncture black holes up to and beyond the binary merger, we show that our new CWENO scheme, introduced two decades ago for the compressible Euler equations of gas dynamics, can be successfully applied also to numerical general relativity, solving all equations at the same time with one single numerical method. In the future the new monolithic approach proposed in this paper may become an attractive alternative to traditional methods that couple central finite difference schemes with Kreiss-Oliger dissipation for the space-time part with totally different TVD schemes for the matter evolution and which are currently the state of the art in the field.
arXiv.org
Avoided-crossings, degeneracies and Berry phases in the spectrum of quantum noise through analytic Bloch-Messiah decomposition
The Bloch-Messiah decomposition (BMD) is a fundamental tool in quantum optics, enabling the analysis and tailoring of multimode Gaussian states by decomposing linear optical transformations into passive interferometers and single-mode squeezers. Its extension to frequency-dependent matrix-valued functions, recently introduced as the "analytic Bloch-Messiah decomposition" (ABMD), provides the most general approach for characterizing the driven-dissipative dynamics of quantum optical systems governed by quadratic Hamiltonians. In this work, we present a detailed study of the ABMD, focusing on the typical behavior of parameter-dependent singular values and of their corresponding singular vectors. In particular, we analyze the hitherto unexplored occurrence of avoided and genuine crossings in the spectrum of quantum noise, the latter being manifested by nontrivial topological Berry phases of the singular vectors. We demonstrate that avoided crossings arise naturally when a single parameter is varied, leading to hypersensitivity of the singular vectors and suggesting the presence of genuine crossings in nearby systems. We highlight the possibility of programming the spectral response of photonic systems through the deliberate design of avoided crossings. As a notable example, we show that such control can be exploited to generate broad, flat-band squeezing spectra, a desirable feature for enhancing degaussification protocols. This study provides new insights into the structure of multimode quantum correlations and offers a theoretical framework for experimental exploitation of complex quantum optical systems.
arXiv.org
MAGNET: an open-source library for mesh agglomeration by Graph Neural Networks
We introduce MAGNET, an open-source Python library designed for mesh agglomeration in both two and three dimensions, based on employing Graph Neural Networks (GNN). MAGNET serves as a comprehensive solution for training a variety of GNN models, integrating deep learning and other advanced algorithms such as METIS and k-means to facilitate mesh agglomeration and quality metric computation. The library's introduction is outlined through its code structure and primary features. The GNN framework adopts a graph bisection methodology that capitalizes on connectivity and geometric mesh information via SAGE convolutional layers, in line with the methodology proposed by Antonietti et al. (2024). Additionally, the proposed MAGNET library incorporates reinforcement learning to enhance the accuracy and robustness of the model for predicting coarse partitions within a multilevel framework. A detailed tutorial is provided to guide the user through the process of mesh agglomeration and the training of a GNN bisection model. We present several examples of mesh agglomeration conducted by MAGNET, demonstrating the library's applicability across various scenarios. Furthermore, the performance of the newly introduced models is contrasted with that of METIS and k-means, illustrating that the proposed GNN models are competitive regarding partition quality and computational efficiency. Finally, we exhibit the versatility of MAGNET's interface through its integration with Lymph, an open-source library implementing discontinuous Galerkin methods on polytopal grids for the numerical discretization of multiphysics differential problems.
arXiv.org
AneuPy: An open source Python tool for creating simulation-ready geometries of abdominal aortic aneurysms
Abdominal aortic aneurysms (AAAs) are localized dilations of the abdominal aorta that can lead to life-threatening rupture if left untreated. AAAs predominantly affect older individuals, with a high mortality rate upon rupture, making early diagnosis and risk assessment critical. The geometric characteristics of an AAA, such as its maximum diameter, asymmetry, and wall thickness, play a crucial role in biomechanical models used to assess rupture risk. Despite the growing use of computational modeling to study AAAs, there is a lack of open source software that facilitates the generation of simulation-ready geometries tailored for biomechanical and hemodynamic analyses. To address this need, we introduce AneuPy, an open-source Python-based tool designed to generate idealized and patient-specific AAA geometrical models. AneuPy provides an efficient and automated approach to aneurysm geometry generation, requiring minimal input data while allowing for flexible parameterization. By streamlining the creation of simulation-ready geometries for finite element analysis (FEA), computational fluid dynamics (CFD), or fluid-structure interaction (FSI) models, AneuPy aims to facilitate research in AAAs and enhance patient-specific risk assessment.
arXiv.org
An Unsupervised Network Architecture Search Method for Solving Partial Differential Equations
Solving partial differential equations (PDEs) has been indispensable in scientific and engineering applications. Recently, deep learning methods have been widely used to solve high-dimensional problems, one of which is the physics-informed neural network (PINN). Typically, a deep learning method has three main components: a neural network, a loss function, and an optimizer. While the construction of the loss function is rooted in the definition of solution space, how to choose an optimal neural network is somewhat ad hoc, leaving much room for improvement. In the framework of PINN, we propose an unsupervised network architecture search method for solving PDEs, termed PINN-DARTS, which applies the differentiable architecture search (DARTS) to find the optimal network architecture structure in a given set of neural networks. In this set, the number of layers and the number of neurons in each layer can change. In the searching phase, both network and architecture parameters are updated simultaneously, so the running time is close to that of PINN with a pre-determined network structure. Unlike available works, our approach is unsupervised and purely based on the PDE residual without any prior usage of solutions. PINN-DARTS outputs the optimal network structure as well as the associated numerical solution. The performance of PINN-DARTS is verified on several benchmark PDEs, including elliptic, parabolic, wave, and Burgers' equations. Compared to traditional architecture search methods, PINN-DARTS achieves significantly higher architectural accuracy. Another interesting observation is that both the solution complexity and the PDE type have a prominent impact on the optimal network architecture. Our study suggests that architectures with uneven widths from layer to layer may have superior performance across different solution complexities and different PDE types.
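The core idea behind DARTS, which the abstract builds on, is to replace a discrete architectural choice with a softmax-weighted mixture so that the choice becomes differentiable. The sketch below illustrates only that generic relaxation; it is not the PINN-DARTS search space, which the post describes as varying layer counts and widths. The candidate operations, the names `mixed_op` and `discretize`, and the parameter values are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, candidate_ops, arch_params):
    """Continuous relaxation: softmax-weighted sum of all candidate outputs.

    During search, arch_params would be trained by gradient descent alongside
    the network weights (here they are just fixed numbers for illustration).
    """
    weights = softmax(arch_params)
    return sum(w * op(x) for w, op in zip(weights, candidate_ops))


def discretize(candidate_ops, arch_params):
    """After search, keep only the candidate with the largest parameter."""
    best = max(range(len(arch_params)), key=lambda i: arch_params[i])
    return candidate_ops[best]


# Hypothetical candidates: tanh, identity (skip), and a "zero" operation.
ops = [math.tanh, lambda x: x, lambda x: 0.0]
blended = mixed_op(1.0, ops, arch_params=[0.0, 0.0, 0.0])  # equal weights
chosen = discretize(ops, arch_params=[0.1, 2.0, -1.0])     # picks identity
```

In a PINN setting the loss driving both sets of parameters would be the PDE residual, which is what makes the search unsupervised in the sense the abstract describes.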