Understanding False Positives

Sprint Demo

Eric Bellm, Meredith Rawls, Gabor Kovacs, and Yusra AlSayyad

Introduction

The AP team has been using DECam PI data from the High Cadence Transient Survey (HiTS) to test difference imaging. Difference imaging fidelity is measured in terms of false positive rates. Along with Stripe 82 (https://dmtn-069.lsst.io/), this HiTS dataset has been used a few times in the past to measure difference imaging false positive rates (e.g., https://dmtn-006.lsst.io/; http://dmtn-021.lsst.io/), sometimes on the “instcal” images processed by the DECam community pipeline (CP) and sometimes using other visits as templates instead of coadds.

Some preliminary analysis from Meredith indicated that differencing images processed with our own stack’s processCcd (using CP master calibs) against best seeing direct CompareWarp coadd templates yields higher false positive rates than reported in the other contexts.

The goal of this sprint was to explore various hypotheses as to the source of the false positives, and to learn something about how to report a higher fraction of real sources in difference imaging.
Results are being recorded in the existing repo: https://github.com/lsst-dm/ap_pipe-notebooks

One hypothesis is that the increase in false positives is due to our naive instrument signature removal (ISR). In this sprint we hoped to test that hypothesis by characterizing the false positives and by also using a survey where we understand ISR very well: HSC.

Definitions

  • DIASource: A detected source in a difference image.
  • DIAObject: A collection of one or more DIASources that have been spatially associated. If the DIASources represent astrophysical variability over time, the DIAObject represents the astrophysical object.
  • ap_pipe: LSST Science Pipelines package that does image processing, difference imaging, and spatial association. Input is ingested science images, calibration products, and templates. Output is a database of associated DIAObjects and their constituent DIASources, as well as all the usual difference imaging output (difference source catalogs and difference image exposures).
  • ap_verify: LSST Science Pipelines package that does ingestion, runs ap_pipe, and reports various metrics.
  • HiTS: High Cadence Transient Survey, an imaging survey which revisited the same regions of the sky using DECam in 2013, 2014, and 2015.
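
The relationship between DIASources and DIAObjects defined above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class and field names, the 1 arcsecond match radius, and the naive first-match association loop are hypothetical and are not the ap_pipe schema or its association algorithm.

```python
import math
from dataclasses import dataclass, field


@dataclass
class DiaSource:
    """Hypothetical stand-in for one detection in a difference image."""
    ra: float    # degrees
    dec: float   # degrees
    visit: int


@dataclass
class DiaObject:
    """Hypothetical stand-in for a set of spatially associated DiaSources."""
    ra: float    # degrees, position of the first associated source
    dec: float   # degrees
    sources: list = field(default_factory=list)


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcseconds (ignores RA wrap-around at 0/360)."""
    dra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    ddec = dec1 - dec2
    return math.hypot(dra, ddec) * 3600.0


def associate(sources, radius_arcsec=1.0):
    """Naive association: attach each source to the first object within the radius."""
    objects = []
    for src in sources:
        for obj in objects:
            if separation_arcsec(src.ra, src.dec, obj.ra, obj.dec) <= radius_arcsec:
                obj.sources.append(src)
                break
        else:
            # No existing object is close enough, so start a new one.
            objects.append(DiaObject(src.ra, src.dec, [src]))
    return objects


# Two detections of the same sky position in different visits, plus one elsewhere.
example_sources = [
    DiaSource(150.0, 2.0, visit=1),
    DiaSource(150.0001, 2.0, visit=2),
    DiaSource(150.1, 2.0, visit=1),
]
example_objects = associate(example_sources)
```

In this toy example the first two detections end up in one DiaObject and the third forms its own, which is the sense in which a DIAObject "collects" DIASources over time.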

Sprint Goals 

  • Create:
    • An ap_verify test dataset for HSC PDR1.
    • One or more metrics we can use to measure false positive rates.
    • Plots and visualizations to better understand the metrics.
  • Investigate:
    • How are flags distributed on DIASources? (Can they be used to filter DIASources?)
    • What do the DIASources look like?
    • How do DIASource counts depend on _?
      • CCD, visit, seeing
      • ISR: instcals vs. stack ISR’ed calexps
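
As a rough sketch of how the "Investigate" questions could be turned into numbers, the snippet below counts DIASources per (visit, CCD) pair, computes the fraction of sources carrying each flag, and filters on a set of flags. The record layout and the flag names ("edge", "saturated") are hypothetical placeholders chosen for illustration; they are not the actual ap_pipe output schema.

```python
from collections import Counter

# Hypothetical DIASource records; field and flag names are illustrative only.
dia_sources = [
    {"visit": 1, "ccd": 5, "flags": {"edge"}},
    {"visit": 1, "ccd": 5, "flags": set()},
    {"visit": 1, "ccd": 6, "flags": {"saturated", "edge"}},
    {"visit": 2, "ccd": 5, "flags": set()},
]


def count_per_ccd_visit(sources):
    """Number of DIASources for each (visit, ccd) pair."""
    return Counter((s["visit"], s["ccd"]) for s in sources)


def flag_fractions(sources):
    """Fraction of DIASources that have each flag set."""
    n = len(sources)
    if n == 0:
        return {}
    counts = Counter(flag for s in sources for flag in s["flags"])
    return {flag: c / n for flag, c in counts.items()}


def filter_unflagged(sources, bad_flags):
    """Keep only DIASources that have none of the given flags set."""
    bad = set(bad_flags)
    return [s for s in sources if not (s["flags"] & bad)]


counts = count_per_ccd_visit(dia_sources)
fractions = flag_fractions(dia_sources)
clean = filter_unflagged(dia_sources, {"edge"})
```

Comparing the per-(visit, CCD) counts before and after filtering is one simple way to see whether a given flag removes a disproportionate share of detections on particular CCDs or visits.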

Datasets

  • /project/mrawls/hits2015/rerun/cw_processed5: Three fields (pointings) from the HiTS 2015 survey, each with 28 visits on all 60 functional DECam CCDs in g band. Two of the fields overlap slightly, so some DIAObjects may be composed of up to 56 DIASources (2 fields × 28 visits). Templates are constructed from HiTS 2014 data, whose fields overlap the HiTS 2015 fields nearly exactly. This ap_pipe rerun used the latest weekly build from the end of May 2019.
  • ap_verify_hits2015: The same as the above, plus additional r- and i-band visits, nicely packaged up on git-lfs as an ap_verify-format dataset. Includes HiTS 2014 best seeing direct CompareWarp coadd templates.
  • ap_verify_ci_hits2015: A subset of ap_verify_hits2015 with just three CCDs, each with two visits, with some spatial overlaps. ap_verify regularly runs on this dataset in CI. This region of sky is sometimes used as a starting place to cut down the full dataset described above into something more manageable for a mini-census.
  • HSC stuff: see below

 

Garbage Collection

Or, a census of what kinds of DIASources we have.

First up is the cw_processed5 rerun.

HiTS 2015

Census of DIASources as a function of CCD, visit, and seeing