Debating the potential of machine learning in astronomical surveys

A new catalog of 343,000 quasars with their photometric redshifts derived with machine learning from the Kilo Degree Survey
Szymon Nakoneczny 1@, Maciej Bilicki 2, Agnieszka Pollo 1, Marika Asgari 3, Andrej Dvornik 4, Thomas Erben 5, Benjamin Giblin 6, Catherine Heymans 3, Hendrik Hildebrandt 7, Arun Kannawadi 8, Koen Kuijken 9, Nicola Napolitano 10, Edwin Valentijn 11
1 : National Centre for Nuclear Research [Otwock]
2 : Center for Theoretical Physics, Polish Academy of Sciences
3 : Institute for Astronomy, University of Edinburgh
4 : Ruhr University Bochum, Faculty of Physics and Astronomy, Astronomical Institute (AIRUB), German Centre for Cosmological Lensing
5 : Argelander-Institut für Astronomie
6 : Institute for Astronomy, University of Edinburgh, Royal Observatory
7 : Argelander Institute for Astronomy, Bonn University  (AIfA)
8 : Department of Astrophysical Sciences, Princeton University
9 : Leiden Observatory, Leiden University
10 : School of Physics and Astronomy, Sun Yat-sen University
11 : Kapteyn Institute, University of Groningen - Netherlands

In the present era of ever larger photometric sky surveys, designing efficient automated methods for the unbiased selection of specific categories of objects, and for predicting their properties from photometric data, is becoming increasingly urgent. Quasars (QSOs) are a good example of such objects, both because of interest in their physics and because of their possible cosmological applications.

In my talk I present a catalog of 343,000 QSOs derived from one of the largest photometric surveys to date, the Kilo-Degree Survey (KiDS) Data Release 4. The catalog is characterised by robust photometric redshifts with uncertainties, and its depth in magnitude was achieved with artificial neural networks (ANN) applied in a controlled way. The catalog is based on the optical ugri and near-infrared ZYJHKs bands. It is limited to r < 23.5 and has a purity of 97%, a completeness of 94%, and a redshift error (mean and scatter) of 0.009 +/- 0.12. The robustness of the catalog was demonstrated, for example, by comparing the number counts of our QSO candidates to models from eBOSS, and by cross-matching the QSOs with Gaia to confirm that their parallaxes are consistent with zero. In the talk, I will explain how machine learning (ML) methods can be successfully applied to observations fainter than the spectroscopic training data. This includes building inference subsets using t-SNE visualisations of the high-dimensional feature space, and properly adapting the bias vs variance tradeoff of the ML models with tests against the problem of extrapolation in the feature space. The success of this extrapolation challenges the way that models are optimised and applied at the faint end of the data. The resulting catalog is ready for cosmological and active galactic nucleus (AGN) studies, and the developed methodology is ready to be used on future, even larger photometric datasets.
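
As an illustration of the quality metrics quoted above, the following minimal Python sketch shows one common way to compute purity, completeness, and the mean and scatter of the redshift error on a spectroscopic test set. It is not taken from the authors' pipeline: the function names are hypothetical, and the error definition (z_phot - z_spec) / (1 + z_spec) is an assumption, since the abstract does not state which definition was used.

```python
import numpy as np


def purity_completeness(true_is_qso, pred_is_qso):
    """Purity and completeness of a QSO selection (hypothetical helper).

    purity       = true QSOs among selected objects / all selected objects
    completeness = true QSOs among selected objects / all true QSOs
    """
    true_is_qso = np.asarray(true_is_qso, dtype=bool)
    pred_is_qso = np.asarray(pred_is_qso, dtype=bool)
    true_positives = np.sum(true_is_qso & pred_is_qso)
    n_selected = np.sum(pred_is_qso)
    n_true = np.sum(true_is_qso)
    # Return NaN instead of dividing by zero when a class is empty.
    purity = true_positives / n_selected if n_selected > 0 else float("nan")
    completeness = true_positives / n_true if n_true > 0 else float("nan")
    return float(purity), float(completeness)


def redshift_error_stats(z_spec, z_phot):
    """Mean and scatter (standard deviation) of the redshift error.

    Assumes the normalised error dz = (z_phot - z_spec) / (1 + z_spec);
    this definition is an assumption, not stated in the abstract.
    """
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    return float(dz.mean()), float(dz.std())


if __name__ == "__main__":
    # Toy labels: 1 = QSO, 0 = other (star or galaxy).
    purity, completeness = purity_completeness([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
    print(f"purity={purity:.3f}, completeness={completeness:.3f}")

    mean_dz, scatter_dz = redshift_error_stats([1.0, 3.0], [1.2, 2.6])
    print(f"redshift error: {mean_dz:.3f} +/- {scatter_dz:.3f}")
```

On a real catalog these metrics would be evaluated on spectroscopically confirmed objects held out from training, which is also why they cannot be measured directly at magnitudes fainter than the spectroscopic sample.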

Slides: in PDF

Video: https://youtu.be/Kwv0CH4KKRA

