constructing validation test for DC2 dN/dmag #21
Question for @janewman-pitt-edu @damonge and anybody else who is interested:

There is an interest in making sure that DC2 has a galaxy population that is complete beyond the flux limit of LSST, to ensure that realistic levels of blending are included. We would like to run this validation test on proto-DC2, comparing dN/dmag against some realistic reference using DESCQA. The question is: what reference sample should we use?

Currently, if we were to compare the DC2 population with, e.g., the COSMOS catalog, we could only test to i~25.2. The purpose of this issue is to solicit suggestions for deeper catalogs we could use for the validation test. The extreme case would be the Hubble UDF, but that is so small that it should have significant cosmic variance. There are some HST surveys of intermediate depth and area (deeper than COSMOS but smaller in area; shallower than UDF but larger in area). We could explore those?

The CatSim page has a comparison against data going to 27th magnitude (https://www.lsst.org/scientists/simulations/catsim), but the plot caption does not say where the validation data comes from.

Comments
CANDELS offers public catalogs with significantly increased area vs. HUDF. I haven't plotted HUDF number counts, but since we're effectively integrating along the entire line of sight and (I think) just looking for a bias in DC2 catalog number counts versus data in broad bins of object brightness, we should be relatively robust to cosmic variance.
CANDELS volume is still tiny, and sample/cosmic variance is substantial. It's enough to affect number counts (and even more so redshift distributions) probably at the ~10% level. Maybe we're fine with that for this test. How deep does HSC go?
Actually, I think you're probably better off validating against the existing much larger-area but somewhat shallower surveys (e.g. HSC + DES deep fields) rather than the tiny-area HST data. You'd just truncate the comparison where the existing data are fully complete (e.g. around the 10-sigma limit of the photometry). It's probably OK to assume that number counts actually continue to rise geometrically a mag or so fainter than they are observed to do so in those datasets.
We should do whatever is easiest that offers the needed precision. I will guess, however, that if we were to require our simulated number counts beyond LSST depth (e.g., r~28) to match reality at anything close to the 10% level, it might take a decade for the DC2 catalogs to pass QA. The goal is "realistic levels of blending", so I would interpret that as wanting to be within a factor of 2 at such number counts and hoping to be off by only 50% (in either direction). Can anyone quantify the requirement more precisely?
Well, for starters, the total blend rate should scale like the square of the overall number density, shouldn't it? (At least in the limit that clustering is not a big contributor to the blend rate, which is true for LSST according to @dkirkby.) So if the overall number density is wrong by, say, 10%, the blend rate would be wrong by ~20%.

But that's not precisely what we care about, in the sense that we care about the rate of blends as a function of flux ratio, so you could imagine thinking of the blend rate in a 2d plane defined by brighter vs. fainter galaxy mag. In a given bin in that plane, the blend rate depends linearly on the number densities on each axis. It seems like this becomes a requirement on the number density as a function of mag, which is why I proposed looking at dN/dmag in the first place. But it's complicated: for example, we might care more about similar-flux-ratio blends within the limit of the gold sample, so getting dN/dmag right for something 1.5 mag fainter than that is less important (that becomes noise on the outskirts of the gold-sample galaxies, which we do want to include in the sims, but it probably has a smaller impact than blends with something at the same mag).

I would imagine we want dN/dmag to 10% down to r~26, but beyond that our requirements are quite a bit less stringent. 30%? I'm guessing at this point. I believe it may have been one of @mdschneider @joezuntz @rmjarvis @esheldon who articulated the need for a realistic blending rate in the DC2 sims for WL, so perhaps one of them can weigh in either on how realistic is good enough, or on a good validation dataset. (Michael, Joe, Mike, Erin - please read the first post in this thread for an explanation of what this is about.)
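(A quick numerical illustration of the scaling argument above, as a hedged Python sketch; the n^2 scaling is the approximation under discussion, not an adopted model:)

```python
# Sketch of the scaling arithmetic above: if the blend rate R scales as the
# square of the number density n (the approximation under discussion), a
# fractional error eps in n propagates to roughly 2*eps in R.
eps = 0.10                       # assumed 10% error in the number density
rate_err = (1 + eps) ** 2 - 1    # exact fractional error for R ~ n^2
print(f"{rate_err:.0%}")         # -> 21%, i.e. about twice the density error
```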
I don't know that we need to match it perfectly. The goal of DC2 is to test the ability of our code to handle the kinds of blends that are in real data, not to precisely match the level of blending that we will see in some year of LSST data. In other words, it is a validation sim, not a calibration sim. So I'd say a factor of two or better is good enough for our purposes. If it were really 10% squared, i.e. ~20%, I'd be ecstatic.
We showed in Dawson et al. (2016) that the number density of ambiguously blended galaxies grows roughly as the ~3rd power of the galaxy number density (not squared; see Figure 1), and Section 5.1 explains why the scaling is steeper than quadratic. Historically, in earlier versions of the paper and in presentations to DESC, we noted the impact of ambiguous blending on LSS number counts as an important systematic.
@wadawson - yep, I was using an approximation that may not be correct for LSST, probably for multiple reasons. Absolutely - that isn't really in question here; I think that message has been pretty well internalized by the group. The only issue is how accurately we need to represent the blend rate for DC2 given what the goal of DC2 is (which is not the same thing as asking how well we want to be able to represent it, e.g., for dedicated blending studies).
I think people are overcomplicating things here... in all the datasets I've seen, number counts rise geometrically with only a very slow change in power-law slope as a function of magnitude, until incompleteness sets in (see, e.g., Fig. 15 of https://academic.oup.com/mnras/article/423/3/1992/2459951/Galaxy-properties-from-the-ultraviolet-to-the-far). This is true to at least i~29 in HST ACS data. This means that one can analytically extend the magnitude distribution from a shallower (but wider, so low sample/cosmic variance) survey by a magnitude or two with much less than 10% uncertainty. I would suggest measuring the logarithmic derivative of dN/dm, and its second derivative as well, to see if the latter is significant.
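(A minimal sketch of that suggestion, with synthetic stand-in counts rather than survey data: fit log10(dN/dm) with a quadratic to get both the logarithmic slope and its derivative before trusting a faintward extrapolation.)

```python
import numpy as np

# Sketch of the suggested check: fit log10(dN/dm) with a quadratic to measure
# the logarithmic slope and test whether the second derivative (curvature) is
# significant before extrapolating faintward. Counts here are synthetic.
mag = np.arange(20.0, 25.5, 0.5)                 # assumed-complete bin centers
log_counts = 0.36 * (mag - 25.0) + 1.0           # stand-in for log10(dN/dmag)
curv2, slope0, _ = np.polyfit(mag, log_counts, deg=2)
print("d(log N)/dm at m=24:", 2 * curv2 * 24.0 + slope0)
print("d^2(log N)/dm^2:", 2 * curv2)             # ~0 here -> safe to extrapolate
```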
@janewman-pitt-edu - I agree with you about analytically extending something from a shallower survey, but I think you may have misunderstood the 10% (or whatever %) discussion. That wasn't a discussion of the error we'd allow on the validation data, which I agree we can easily know well. It was a discussion of the error we'd allow in DC2, i.e., if the number counts and therefore blend rates don't quite match those expected in reality, how far off are we OK with them being?
Ah, thanks for clearing that up! (Oh, and I definitely agree with @dkirkby that the majority of blends at LSST depths are from very different redshifts, though I'm not certain what the fraction is exactly.)
Update on this issue. As part of our SRD work, I used the publicly-released HSC Deep survey layer to measure the dN/dmag (including absolute normalization and slope) in i-band. I also did a power-law fit to dlogN/dmag, and the cumulative fits as a function of limiting mag, so we can extrapolate those, given that these power laws are generally pretty accurate across a wide range of magnitudes, as Jeff said. What I got from HSC Deep was a best-fitting number density (in units of arcmin^-2) of

N(<ilim) = 43 * 10^(0.36 (ilim - 25))

where ilim is the i-band magnitude limit of the sample. So, to be clear, these are cumulative number densities, not differential; if you want differential counts, it's just a multiplicative factor to convert. Just to throw out numbers, it would be nice to agree with this at the 20% level at ilim=25.3 and 40% at ilim=27. Would this work for a validation test?
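(For concreteness, a minimal sketch of evaluating this fit; the cumulative-to-differential conversion factor of 0.36*ln(10) follows analytically from the power-law form quoted above:)

```python
import numpy as np

# Minimal sketch of the fit quoted above. For a pure power law
# N(<m) = 43 * 10^(0.36 (m - 25)), the differential counts follow
# analytically: dN/dm = 0.36 * ln(10) * N(<m) -- the "multiplicative
# factor" mentioned in the comment.
def n_cum(ilim):
    """Cumulative counts (arcmin^-2) brighter than ilim, from the HSC Deep fit."""
    return 43.0 * 10.0 ** (0.36 * (ilim - 25.0))

def dn_dmag(m):
    return 0.36 * np.log(10.0) * n_cum(m)

print(n_cum(25.3), dn_dmag(25.3))   # ~55 arcmin^-2 and ~46 arcmin^-2 mag^-1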
OK, so in the interest of not having too many open issues that duplicate each other, do you want to close this issue (now that we've decided what the test should be) and, in #30, link to the final comment with the power-law fit on this page?
@rmandelb sure. I don't have permission to close issues in this repo, though.
I would actually leave it open. #30 is the epic about validation, mostly introduced so that Anders can track it. I think having issue #21 linked to the overall validation issue is fine.
Yes, it would make a good validation test. It is on my list to implement it. Thanks for doing the fits.
@rmandelb would you mind sharing the code you used to calculate N(<i_lim) with the HSC data, or pointing me to a usable catalog so I can calculate it myself?
What I would like is some self-contained code that goes from downloading the survey data to producing the fits. In principle, that is something that someone can provide me, but I would feel better if I could follow it through the steps, make sure it is reproducible, and use code that is as similar as possible to create more lookup tables from different surveys.
OK, I didn't realize you wanted to download the data yourself. I'm going to separately e-mail you an SQL script for downloading the HSC data, and a python script that does the process I described above for HSC Deep i-band data. The latter is already in the Requirements repo, which I believe you don't have access to. I'm happy for comments on any of it, or for the code to be put into this repo as needed.
My first-pass implementation idea is to simply use the cumulative number density at a single magnitude limit as the 'test', rather than the full apparent magnitude function in bins. I think this gets at the basic quantity of interest. It is a simple extension to test a few different magnitude limits if the shape of the app_mag_func is important.
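(Something like the following sketch; the inputs are placeholders rather than the real DESCQA catalog interface:)

```python
import numpy as np

# Sketch of the single-magnitude-limit test described above. The inputs (an
# array of i-band magnitudes and the catalog's area) are placeholders; the
# real test would pull these through the DESCQA catalog interface.
def cumulative_density(mag_i, area_arcmin2, mag_lim):
    """N(< mag_lim) in arcmin^-2 for a simulated catalog."""
    return np.sum(mag_i < mag_lim) / area_arcmin2

def passes(mag_i, area_arcmin2, mag_lim=25.3, frac_tol=0.2):
    n_sim = cumulative_density(mag_i, area_arcmin2, mag_lim)
    n_ref = 43.0 * 10.0 ** (0.36 * (mag_lim - 25.0))   # HSC Deep fit above
    return abs(n_sim / n_ref - 1.0) < frac_tol
```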
@duncandc How about using the fit for a start? Ideally, there would be papers which do the fitting and to which you could refer, but I understand that these may not exist yet for LSST-like depths. But initially, if we just want to get an idea of how the catalogs look, using Rachel's fit would be quickest. It is not obvious to me that the WGs will need fancier cuts or options, so it might be a good idea to get feedback from them before putting a lot of work into a more general structure.
That's the path I am taking. The test itself is exceedingly simple. I am also interested in the data, though.
We definitely care about getting the shape of the number counts roughly right. I just sent you some stuff by e-mail so you can download the data with the same SQL query and process it with my script (or modify the script as needed). I should emphasize that I only did the i-band cumulative counts, so if we want to test >1 band, we need to at least rerun the script for r-band (which I can do myself if you want, because I did download the r-band magnitudes) and possibly others (which would involve re-downloading the data with more information).
@duncandc are you thinking of generating the validation data on the fly every time the test is run? We need to consider computational resources.
I am not considering that. I am putting on the creating-validation-data hat.
I agree we want to get the shape of the number counts right. However, we also want to be sure to do that test only where the counts are highly complete (i.e. at least 0.5-1 mag brighter than where the counts turn over, and probably avoiding the bright end to minimize the impact of stars).
@duncandc @janewman-pitt-edu @rmandelb @yymao Duncan has worked on a test, and example plots can be seen here. There are some normalization issues, but even so, we can now start to think about specifying validation criteria, such as defining the validation region to be used and the required level of agreement with the data.
@evevkovacs @duncandc - thanks for sharing these plots. Apologies if I am missing where to find this information, but what is each curve? (i.e., which is the validation dataset vs. the simulation we are validating, and why does one plot have 3 curves while the other plots have 2 curves?)

Is there anything I can help with to try to track down the normalization issues? For example, Duncan, if this comes from your own recalculation based on downloading HSC data, do you want to compare with the normalizations that I have calculated to see if we agree about the data?

Just to throw out an idea for validation criteria: I think we'd like to sample a few i-band magnitude values within the range where HSC is clearly complete, let's say i=24, 24.5, 25. We'd like N(<i) in the data and simulations to agree to within ~20% at those values of i-band magnitude. It would also be nice to take the power-law extrapolation of the HSC density for 22<i<25, and require N(<i=27) in the sims and the extrapolated power law from the data to agree to within ~40%.
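(Encoded as a sketch: the thresholds are the numbers suggested above, not adopted requirements, and `n_sim` is a hypothetical callable returning the simulation's cumulative density N(<i) in arcmin^-2.)

```python
# Sketch of the proposed criteria; thresholds are the suggested numbers above,
# not adopted requirements. n_sim is a hypothetical callable returning the
# simulation's cumulative density N(<i) in arcmin^-2.
def check_criteria(n_sim):
    n_hsc = lambda i: 43.0 * 10.0 ** (0.36 * (i - 25.0))   # fit from above
    ok = all(abs(n_sim(i) / n_hsc(i) - 1.0) < 0.20 for i in (24.0, 24.5, 25.0))
    # At i=27, compare against the power law extrapolated beyond HSC's depth
    # (standing in for a dedicated 22<i<25 refit):
    return ok and abs(n_sim(27.0) / n_hsc(27.0) - 1.0) < 0.40
```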
@rmandelb the one with 2 curves + data points is the summary plot (i.e., overplotting all catalogs), and the other plots are for individual catalogs (i.e., not overplotting the other catalogs) --- the contents are the same, though. Maybe you can help by reviewing this notebook @duncandc put together for getting the HSC data? You can also find the corresponding PR at LSSTDESC/descqa#46
As for the normalization, I will check it against @rmandelb's previous calculation. Is there a better way to estimate "error bars" on N(<m) than back-of-the-envelope estimates? Asking for O(10%) agreement seems arbitrary to me.
I'm happy to review the notebook and PR, and I have some cosmetic suggestions that would be helpful for people who look at this test. I will try to review both in the next day or so.
@rmandelb thanks for the suggestions. This validation test, as you can tell, is a sprint product and is still under review on a different PR, LSSTDESC/descqa#47, so your suggestions are very useful and I believe @duncandc will implement them :)
@duncandc - thanks for checking against my previous calculation; that will be helpful. The statistical uncertainty on N(<m) due to number counts alone is very small, given the areas of the HSC survey and the sims. The systematic uncertainty, due to e.g. whether we've defined the samples in totally consistent ways in the data and the sims, is larger and hard to quantify, but roughly speaking it can give order-10% uncertainties (this is pretty much an empirical statement based on comparing sims and data many times and seeing how big the changes arising from that kind of thing can be). Maybe somebody else will have a better idea than I do...
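(A back-of-the-envelope check of the "statistical uncertainty is very small" statement; the ~25 deg^2 area is an assumed round number for the HSC Deep layer, not a quoted figure from the thread:)

```python
import numpy as np

# Back-of-the-envelope Poisson estimate supporting the statement above; the
# ~25 deg^2 area is an assumed round number for the HSC Deep layer.
area_arcmin2 = 25.0 * 3600.0               # deg^2 -> arcmin^2
n_density = 43.0 * 10.0 ** (0.36 * 0.3)    # N(<25.3) in arcmin^-2 from the fit
n_total = n_density * area_arcmin2         # ~5 million galaxies
print(f"fractional Poisson error: {1.0 / np.sqrt(n_total):.1e}")  # ~4e-4
```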
@duncandc @rmandelb @yymao So, to summarize, we now have a first-pass attempt at validation criteria, plus some plot enhancements in addition to those mentioned by Rachel. Does anyone have any other suggestions or improvements?
Not exactly. I wasn't proposing to do (1) in addition to the others. Rather, I was saying that (2) and (3) are meant to incorporate that 10% uncertainty.
@rmandelb Thanks very much. I adjusted the comment above and struck through the incorrect steps.
@rmandelb, sure thing. I will ping you when I have made those changes.
@rmandelb yes --- and JFYI, we (the DESCQA team) encourage early PRs so people can see the progress. Thus, it is totally OK to defer review until the PR submitter says it's ready for review! In the meantime, leaving comments and suggestions in the PR (i.e., not here) would be very helpful :)
Closing this issue now since it is taken care of in the descqa repo: LSSTDESC/descqa#7