#CSNA

arXiv.org
An Unsupervised Network Architecture Search Method for Solving Partial Differential Equations
Solving partial differential equations (PDEs) has been indispensable in scientific and engineering applications. Recently, deep learning methods have been widely used to solve high-dimensional problems, one of which is the physics-informed neural network (PINN). Typically, a deep learning method has three main components: a neural network, a loss function, and an optimizer. While the construction of the loss function is rooted in the definition of solution space, how to choose an optimal neural network is somewhat ad hoc, leaving much room for improvement. In the framework of PINN, we propose an unsupervised network architecture search method for solving PDEs, termed PINN-DARTS, which applies the differentiable architecture search (DARTS) to find the optimal network architecture structure in a given set of neural networks. In this set, the number of layers and the number of neurons in each layer can change. In the searching phase, both network and architecture parameters are updated simultaneously, so the running time is close to that of PINN with a pre-determined network structure. Unlike available works, our approach is unsupervised and purely based on the PDE residual without any prior usage of solutions. PINN-DARTS outputs the optimal network structure as well as the associated numerical solution. The performance of PINN-DARTS is verified on several benchmark PDEs, including elliptic, parabolic, wave, and Burgers' equations. Compared to traditional architecture search methods, PINN-DARTS achieves significantly higher architectural accuracy. Another interesting observation is that both the solution complexity and the PDE type have a prominent impact on the optimal network architecture. Our study suggests that architectures with uneven widths from layer to layer may have superior performance across different solution complexities and different PDE types.
arXiv.org
Numerical Fuzz: A Type System for Rounding Error Analysis
Algorithms operating on real numbers are implemented as floating-point computations in practice, but floating-point operations introduce roundoff errors that can degrade the accuracy of the result. We propose $Λ_{num}$, a functional programming language with a type system that can express quantitative bounds on roundoff error. Our type system combines a sensitivity analysis, enforced through a linear typing discipline, with a novel graded monad to track the accumulation of roundoff errors. We prove that our type system is sound by relating the denotational semantics of our language to the exact and floating-point operational semantics. To demonstrate our system, we instantiate $Λ_{num}$ with error metrics proposed in the numerical analysis literature and we show how to incorporate rounding operations that faithfully model aspects of the IEEE 754 floating-point standard. To show that $Λ_{num}$ can be a useful tool for automated error analysis, we develop a prototype implementation for $Λ_{num}$ that infers error bounds that are competitive with existing tools, while often running significantly faster. Finally, we consider semantic extensions of our graded monad to bound error under more complex rounding behaviors, such as non-deterministic and randomized rounding.
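
To make the idea of a quantitative roundoff bound concrete, here is a minimal Python sketch. It is not the paper's type system or its prototype; it only measures the rounding error of a naive floating-point sum against exact rational arithmetic and compares it with the textbook a priori bound for recursive summation, gamma_{n-1} * sum|x_i| with gamma_k = k*u / (1 - k*u). The function name `summation_error` and the choice of unit roundoff u = 2^-53 (IEEE 754 binary64, round to nearest) are assumptions made for illustration.

```python
from fractions import Fraction

def summation_error(xs):
    """Return (actual absolute error, a priori bound) for a naive float sum.

    Hypothetical helper for illustration only; assumes IEEE 754 binary64
    arithmetic with round-to-nearest, so the unit roundoff is u = 2**-53.
    """
    # Exact sum of the stored float values, computed with rationals.
    exact = sum(Fraction(x) for x in xs)

    # Naive left-to-right floating-point summation.
    computed = 0.0
    for x in xs:
        computed += x

    abs_err = abs(Fraction(computed) - exact)

    # Classical bound for recursive summation: n - 1 rounded additions.
    u = Fraction(1, 2**53)
    k = len(xs) - 1
    gamma = k * u / (1 - k * u)
    bound = gamma * sum(abs(Fraction(x)) for x in xs)
    return abs_err, bound

err, bound = summation_error([0.1] * 10)
print(float(err), float(bound))
```

The bound is a worst-case guarantee, so the measured error is typically well below it; a static analysis of the kind described in the abstract aims to derive such bounds without running the program.
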
arXiv.org
FEABench: Evaluating Language Models on Multiphysics Reasoning Ability
Building precise simulations of the real world and invoking numerical solvers to answer quantitative problems is an essential requirement in engineering and science. We present FEABench, a benchmark to evaluate the ability of large language models (LLMs) and LLM agents to simulate and solve physics, mathematics and engineering problems using finite element analysis (FEA). We introduce a comprehensive evaluation scheme to investigate the ability of LLMs to solve these problems end-to-end by reasoning over natural language problem descriptions and operating COMSOL Multiphysics®, an FEA software, to compute the answers. We additionally design a language model agent equipped with the ability to interact with the software through its Application Programming Interface (API), examine its outputs and use tools to improve its solutions over multiple iterations. Our best performing strategy generates executable API calls 88% of the time. LLMs that can successfully interact with and operate FEA software to solve problems such as those in our benchmark would push the frontiers of automation in engineering. Acquiring this capability would augment LLMs' reasoning skills with the precision of numerical solvers and advance the development of autonomous systems that can tackle complex problems in the real world. The code is available at https://github.com/google/feabench
#csai #cscl #csna
arXiv.org
The Average and Essential Best Rate of Convergence of the Exact Line Search Gradient Descent Method
It is very well known that when the exact line search gradient descent method is applied to a convex quadratic objective, the worst-case rate of convergence (ROC), among all seed vectors, deteriorates as the condition number of the Hessian of the objective grows. By an elegant analysis due to H. Akaike, it is generally believed -- but not proved -- that in the ill-conditioned regime the ROC for almost all initial vectors, and hence also the average ROC, is close to the worst case ROC. We complete Akaike's analysis using the theorem of center and stable manifolds. Our analysis also makes apparent the effect of an intermediate eigenvalue in the Hessian by establishing the following somewhat amusing result: In the absence of an intermediate eigenvalue, the average ROC gets arbitrarily fast -- not slow -- as the Hessian gets increasingly ill-conditioned. We discuss in passing some contemporary applications of exact line search GD to polynomial optimization problems arising from imaging and data sciences. In particular, we observe that a tailored exact line search GD algorithm for a POP arising from the phase retrieval problem is only 50% more expensive per iteration than its constant step size counterpart, while promising a ROC only matched by the optimally tuned (constant) step size which can almost never be achieved in practice.
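
The worst-case behaviour mentioned in the abstract above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper: it minimizes f(x) = 0.5 x^T A x - b^T x for a symmetric positive definite A, using the closed-form exact line search step alpha = (g^T g) / (g^T A g) with g = A x - b. The helper names, the 2x2 diagonal test matrix, and the seed vector (kappa, 1) are assumptions chosen so that each step attains the classical worst-case ratio ((kappa - 1) / (kappa + 1))^2.

```python
def matvec(A, v):
    # Dense matrix-vector product for a list-of-rows matrix.
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def objective(A, b, x):
    # f(x) = 0.5 * x^T A x - b^T x
    return 0.5 * dot(x, matvec(A, x)) - dot(b, x)

def exact_line_search_gd(A, b, x0, iters=100):
    """Gradient descent with exact line search on a convex quadratic.

    Returns the final iterate and the objective value at every iterate.
    """
    x = list(x0)
    values = [objective(A, b, x)]
    for _ in range(iters):
        g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient A x - b
        gAg = dot(g, matvec(A, g))
        if gAg == 0.0:  # gradient vanished: already at the minimizer
            break
        alpha = dot(g, g) / gAg  # exact minimizer of f along -g
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
        values.append(objective(A, b, x))
    return x, values

# Illustrative ill-conditioned problem (hypothetical data): Hessian diag(1, kappa).
kappa = 10.0
A = [[1.0, 0.0], [0.0, kappa]]
b = [0.0, 0.0]  # minimizer is the origin, minimum value is 0
x, values = exact_line_search_gd(A, b, [kappa, 1.0])

worst_case = ((kappa - 1.0) / (kappa + 1.0)) ** 2
ratios = [v1 / v0 for v0, v1 in zip(values, values[1:]) if v0 > 0.0]
print(worst_case, ratios[:3])
```

Because b = 0 here, the minimum value is 0 and the ratio of successive objective values is the per-step contraction factor; increasing `kappa` pushes the worst-case factor toward 1.
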
arXiv.org
Homotopy Methods for Convex Optimization
Convex optimization encompasses a wide range of optimization problems that contain many efficiently solvable subclasses. Interior point methods are currently the state-of-the-art approach for solving such problems, particularly effective for classes like semidefinite programming, quadratic programming, and geometric programming. However, their success hinges on the construction of self-concordant barrier functions for feasible sets. In this work, we investigate and develop a homotopy-based approach to solve convex optimization problems. While homotopy methods have been considered in optimization before, their potential for general convex programs remains underexplored. This approach gradually transforms the feasible set of a trivial optimization problem into the target one while tracking solutions by solving a differential equation, in contrast to traditional central path methods. We establish a criterion that ensures that the homotopy method correctly solves the optimization problem and prove the existence of such homotopies for several important classes, including semidefinite and hyperbolic programs. Furthermore, we demonstrate that our approach numerically outperforms state-of-the-art methods in hyperbolic programming, highlighting its practical advantages.
#mathoc #csna #mathag
arXiv.org
OpenGERT: Open Source Automated Geometry Extraction with Geometric and Electromagnetic Sensitivity Analyses for Ray-Tracing Propagation Models
Accurate RF propagation modeling in urban environments is critical for developing digital spectrum twins and optimizing wireless communication systems. We introduce OpenGERT, an open-source automated Geometry Extraction tool for Ray Tracing, which collects and processes terrain and building data from OpenStreetMap, Microsoft Global ML Building Footprints, and USGS elevation data. Using the Blender Python API, it creates detailed urban models for high-fidelity simulations with NVIDIA Sionna RT. We perform sensitivity analyses to examine how variations in building height, position, and electromagnetic material properties affect ray-tracing accuracy. Specifically, we present pairwise dispersion plots of channel statistics (path gain, mean excess delay, delay spread, link outage, and Rician K-factor) and investigate how their sensitivities change with distance from transmitters. We also visualize the variance of these statistics for selected transmitter locations to gain deeper insights. Our study covers Munich and Etoile scenes, each with 10 transmitter locations. For each location, we apply five types of perturbations: material, position, height, height-position, and all combined, with 50 perturbations each. Results show that small changes in permittivity and conductivity minimally affect channel statistics, whereas variations in building height and position significantly alter all statistics, even with noise standard deviations of 1 meter in height and 0.4 meters in position. These findings highlight the importance of precise environmental modeling for accurate propagation predictions, essential for digital spectrum twins and advanced communication networks. The code for geometry extraction and sensitivity analyses is available at github.com/serhatadik/OpenGERT/.
arXiv.org
A Neural Multigrid Solver for Helmholtz Equations with High Wavenumber and Heterogeneous Media
In this paper, we propose a deep learning-enhanced multigrid solver for high-frequency and heterogeneous Helmholtz equations. By applying spectral analysis, we categorize the iteration error into characteristic and non-characteristic components. We eliminate the non-characteristic components by a multigrid wave cycle, which employs carefully selected smoothers on each grid. We diminish the characteristic components by a learned phase function and the approximate solution of an advection-diffusion-reaction (ADR) equation, which is solved using another multigrid V-cycle on a coarser scale, referred to as the ADR cycle. The resulting solver, termed Wave-ADR-NS, enables the handling of error components with varying frequencies and overcomes constraints on the number of grid points per wavelength on coarse grids. Furthermore, we provide an efficient implementation using differentiable programming, making Wave-ADR-NS an end-to-end Helmholtz solver that incorporates parameters learned through a semi-supervised training. Wave-ADR-NS demonstrates robust generalization capabilities for both in-distribution and out-of-distribution velocity fields of varying difficulty. Comparative experiments with other multigrid methods validate its superior performance in solving heterogeneous 2D Helmholtz equations with wavenumbers exceeding 2000.