Debating the potential of machine learning in astronomical surveys

Studying Morphology & Quenching of Galaxies in the All Sky-Era using Interpretable Bayesian Convolutional Neural Networks
Aritra Ghosh 1@, C. M. Urry 1
1 : Yale University

Traditional methods of obtaining morphological classifications of galaxies do not scale to the large data volumes expected from future surveys like Euclid, NGRST, and LSST. To overcome this, we have developed a publicly available Bayesian Convolutional Neural Network (CNN) called Galaxy Morphology Network (GaMorNet) [http://www.astro.yale.edu/aghosh/gamornet.html] that can be used to extract Bayesian posteriors for morphological parameters of galaxies at a variety of redshifts from different surveys.
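As a rough illustration of what "extracting posteriors" from a neural network can look like, the sketch below uses Monte Carlo dropout: dropout is left switched on at prediction time, and many stochastic forward passes are treated as approximate posterior samples. This is only one common technique, and the abstract does not say that GaMorNet uses it. The toy weights, the input vector, and the function names (`forward_with_dropout`, `posterior_samples`) are all hypothetical and are not part of the GaMorNet package.

```python
import math
import random
import statistics

# Hypothetical fixed weights of a tiny one-hidden-layer regressor (not GaMorNet's).
HIDDEN_WEIGHTS = [[0.8, -0.3], [0.1, 0.9], [-0.5, 0.4], [0.7, 0.2]]
OUTPUT_WEIGHTS = [0.6, -0.4, 0.3, 0.5]


def forward_with_dropout(x, rate, rng):
    """One stochastic forward pass: dropout stays active at prediction time."""
    keep = 1.0 - rate
    out = 0.0
    for w_in, w_out in zip(HIDDEN_WEIGHTS, OUTPUT_WEIGHTS):
        h = max(0.0, sum(w * xi for w, xi in zip(w_in, x)))  # ReLU hidden unit
        if rng.random() < keep:
            out += w_out * h / keep  # inverted-dropout rescaling of kept units
    # Squash to (0, 1), standing in for a quantity such as a bulge-to-total ratio.
    return 1.0 / (1.0 + math.exp(-out))


def posterior_samples(x, n_samples=500, rate=0.2, seed=0):
    """Repeat the stochastic pass; the spread of outputs approximates a posterior."""
    rng = random.Random(seed)
    return [forward_with_dropout(x, rate, rng) for _ in range(n_samples)]


samples = sorted(posterior_samples([1.2, 0.4]))  # made-up input features
mean = statistics.fmean(samples)
# Approximate central 68% interval read off the sorted samples.
lo = samples[int(0.16 * len(samples))]
hi = samples[int(0.84 * len(samples)) - 1]
print(f"toy bulge-to-total ~ {mean:.2f} (68% interval {lo:.2f} to {hi:.2f})")
```

The point of the sketch is the output format: each galaxy gets a distribution (here summarized by a mean and an interval) rather than a single number.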

One of GaMorNet's most important features is that it does not require a large training set of real galaxies. This matters because, if CNNs are to become the method of choice for analyzing unclassified data from future surveys, the algorithm cannot depend on a large pre-classified training set of galaxies from the same survey. To train GaMorNet, we first use a large suite of simulated galaxies and then perform transfer learning / domain adaptation using a small amount of real data.
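The two-stage training described above can be sketched with a deliberately tiny model. The example is a minimal illustration, not GaMorNet's actual training code: it replaces the CNN with a two-feature logistic classifier, invents the toy "simulated" and "real" data (including the `shift` that mimics a domain gap), and uses hypothetical helper names (`make_galaxies`, `train`, `accuracy`). It shows only the pattern: train all parameters on plentiful simulations, then fine-tune a subset on a small real sample at a lower learning rate.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_galaxies(n, shift, rng):
    """Toy data: two features per galaxy; label 1 stands for bulge-dominated."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label else -1.0
        features = [rng.gauss(centre + shift, 1.0), rng.gauss(centre, 1.0)]
        data.append((features, label))
    return data


def train(weights, bias, data, lr, epochs, trainable):
    """SGD on the logistic loss. Only weight indices in `trainable` are
    updated (the rest stay frozen); the bias is always updated."""
    weights = list(weights)  # copy so the caller's weights are untouched
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            for i in trainable:
                weights[i] -= lr * err * x[i]
            bias -= lr * err
    return weights, bias


def accuracy(weights, bias, data):
    hits = 0
    for x, y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        hits += (p > 0.5) == bool(y)
    return hits / len(data)


rng = random.Random(0)
simulated = make_galaxies(5000, shift=0.0, rng=rng)  # large and cheap to label
real_small = make_galaxies(100, shift=1.5, rng=rng)  # small, domain-shifted
real_test = make_galaxies(1000, shift=1.5, rng=rng)

# Stage 1: train every parameter on simulations.
w_sim, b_sim = train([0.0, 0.0], 0.0, simulated, lr=0.05, epochs=3, trainable=[0, 1])
# Stage 2: fine-tune on the small real sample, freezing weight 1.
w_ft, b_ft = train(w_sim, b_sim, real_small, lr=0.01, epochs=20, trainable=[0])

print("simulation-only accuracy on real data:", accuracy(w_sim, b_sim, real_test))
print("fine-tuned accuracy on real data:     ", accuracy(w_ft, b_ft, real_test))
```

Freezing some parameters during the second stage is one common way to keep what was learned from simulations while adapting to the real survey. Which layers to freeze and how much real data to use are choices that would need to be tuned for each survey.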

We have demonstrated that a preliminary classification version of GaMorNet can be successfully applied to data from different surveys with misclassification rates below 5%. We have also used GaMorNet to study the morphology and quenching of ~100,000 (z~0) SDSS and ~20,000 (z~1) CANDELS galaxies using morphology-separated color-mass diagrams. Using the GaMorNet classifications, we find that bulge- and disk-dominated galaxies have distinct color-mass diagrams, pointing to separate evolutionary pathways. For both datasets, disk-dominated galaxies peak in the blue cloud across a broad range of masses, consistent with the slow exhaustion of star-forming gas. In contrast, bulge-dominated galaxies are mostly red, with far fewer extending toward the blue cloud, suggesting rapid quenching and fast evolution across the green valley.

We have now also applied GaMorNet to Hyper Suprime-Cam (HSC) data to obtain robust posterior distributions for morphological parameters of ~3 million galaxies in the HSC-Wide survey. This is the first large morphological catalog of HSC galaxies and will be made publicly available soon.

I will also outline in this talk why GaMorNet is not a black box and how the representations learned by the network are highly amenable to visual interpretation. We have used a combination of different CNN visualization techniques to investigate and shed light on GaMorNet's decision-making process, making our results interpretable, reproducible, and robust.

 

Slide/poster: in PDF

Video: https://youtu.be/3BR_M8wWCIs

