November 27, 2023 to December 1, 2023
Dual node
Europe/Paris timezone

Vision Transformers for Cosmological Inference from Weak Lensing

Nov 29, 2023, 3:22 PM

IAP (Paris) & CCA/Flatiron (New York)
IAP: 98bis Boulevard Arago, 75014 Paris, France
CCA/Flatiron: 5th Avenue, New York (NY), USA
Flash talk New York Contributed talks


Shubh Agrawal (University of Pennsylvania)


Weak gravitational lensing is a powerful probe of the growth of structure in our universe, as the distortion of galaxy ellipticities measures the spatial fluctuations in the matter density field along a line of sight. Traditional two-point statistical analyses of weak lensing fully capture only the Gaussian features of the observable field, and hence lose information from smaller scales, where non-linear gravitational interactions yield non-Gaussian features in the matter distribution. Higher-order statistics such as peak counts, Minkowski functionals, and three-point correlation functions, as well as convolutional neural networks, have been introduced to capture this additional non-Gaussian information and improve constraints on key cosmological parameters such as $\Omega_m$ and $\sigma_8$.
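As an illustration of one of the higher-order statistics named above, the following is a minimal sketch of peak counts on a convergence map, assuming a peak is a pixel strictly larger than its eight neighbours. The function name `count_peaks`, the bin edges, and the Gaussian toy map are hypothetical choices for this example and are not taken from the analysis described here.

```python
import numpy as np

def count_peaks(kappa, bin_edges):
    """Histogram the heights of local maxima in a 2D map.

    A pixel counts as a peak if it is strictly larger than all eight
    neighbours; pixels on the map border are ignored.
    """
    core = kappa[1:-1, 1:-1]
    is_peak = np.ones(core.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            # Shifted view of the map, aligned with the interior pixels.
            neighbour = kappa[1 + dy : kappa.shape[0] - 1 + dy,
                              1 + dx : kappa.shape[1] - 1 + dx]
            is_peak &= core > neighbour
    counts, _ = np.histogram(core[is_peak], bins=bin_edges)
    return counts

# Toy input: a Gaussian random map standing in for a simulated kappa map.
rng = np.random.default_rng(0)
toy_map = rng.normal(0.0, 0.02, size=(64, 64))
peak_counts = count_peaks(toy_map, np.linspace(-0.1, 0.1, 11))
```

The resulting histogram is a summary statistic that depends on more than the two-point function of the map, which is why such counts can carry non-Gaussian information.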

We demonstrate the potential of applying a self-attention-based deep learning method, specifically a Vision Transformer, to predict cosmological parameters from weak lensing observables, particularly convergence $\kappa$ maps. Transformers, which were first developed for natural language processing and are now at the core of generative large language models, can be used for computer vision tasks with patches from an input image serving as sequential tokens analogous to words in a sentence. In the context of weak lensing, Vision Transformers are worth exploring for their different approach to capturing long-range and inter-channel information, improved parallelization, and weaker inductive biases, in particular the absence of the built-in locality of convolutional operations.
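The patch-as-token idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the architecture used in this work: it assumes non-overlapping square patches, a single attention head, random untrained weights, and no position embeddings or class token. The names `patchify`, `self_attention`, `w_embed`, and the map and patch sizes are hypothetical.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
kappa = rng.normal(size=(32, 32, 1))   # toy single-channel map
tokens = patchify(kappa, patch=8)      # 16 tokens of length 64
d_model = 16
w_embed = rng.normal(scale=0.1, size=(tokens.shape[1], d_model))
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(d_model, d_model))
                 for _ in range(3))
embedded = tokens @ w_embed            # linear patch embedding
out, attn = self_attention(embedded, w_q, w_k, w_v)
```

Each row of `attn` weights every patch in the map, however distant, which is the sense in which attention is not restricted to a local neighbourhood the way a convolution kernel is.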

Using transfer learning, we compare the performance of Vision Transformers to that of benchmark residual convolutional networks (ResNets) on simulated $w$CDM theory predictions for $\kappa$, with noise properties and sky coverage similar to DES Y3, LSST Y1, and LSST Y10. We further use neural density estimators to investigate the differences in the posteriors on cosmological parameters recovered by each deep learning method. These results showcase a potential astronomical application derived from the advent of powerful large language models, as well as machine learning tools relevant to the next generation of large-scale surveys.
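The abstract does not specify how survey-like noise is added to the simulated maps. One common approximation, sketched here under that assumption, is uncorrelated Gaussian shape noise per pixel with standard deviation $\sigma_e / \sqrt{n_{\rm gal} A_{\rm pix}}$. The function name `add_shape_noise` and all numerical values below are illustrative placeholders, not survey specifications.

```python
import numpy as np

def add_shape_noise(kappa, sigma_e, n_gal_per_arcmin2, pixel_arcmin, rng):
    """Add uncorrelated Gaussian shape noise to a convergence map.

    Assumed noise model: sigma_pix = sigma_e / sqrt(n_gal * A_pix),
    with A_pix the pixel area in square arcminutes.
    """
    pixel_area = pixel_arcmin ** 2
    sigma_pix = sigma_e / np.sqrt(n_gal_per_arcmin2 * pixel_area)
    noise = rng.normal(0.0, sigma_pix, size=kappa.shape)
    return kappa + noise, sigma_pix

rng = np.random.default_rng(0)
clean = np.zeros((128, 128))  # stand-in for a noiseless simulated map
# Placeholder values chosen only for illustration.
noisy, sigma_pix = add_shape_noise(
    clean, sigma_e=0.26, n_gal_per_arcmin2=10.0, pixel_arcmin=1.0, rng=rng
)
```

Under this model a deeper survey with a higher galaxy density yields a smaller `sigma_pix`, so the same network can be evaluated at several noise levels by changing only `n_gal_per_arcmin2`.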

Primary author

Shubh Agrawal (University of Pennsylvania)


Co-authors

Marco Gatti (University of Pennsylvania)
Prof. Bhuvnesh Jain (University of Pennsylvania)
