Skip to content

Commit c29f379

Browse files
authored
Merge pull request #700 from ochase10/seedfix
What I am updating here is the code for generating log normal mock catalogs. How those work is by first using a random number to create a density field which follows some power spectrum. You need the random number because there are many possible random fields for a given power spectrum. Once you have a density field, you then look in each grid cell and do Poisson sampling to determine whether there is a source there. This is, of course, random. Therefore, there are 2 separate stages of randomness involved in making a mock catalog. First, one needs the initial conditions to determine the exact density field, and two, the density field needs to be randomly sampled to generate sources. The way the code currently works is to use the same seed for both of these random processes (making the density field and sampling it). Now suppose I make a mock catalog and realize it was too small. If I use a new seed for the density field (with the same power spectrum), the two mocks will be incompatible at the field level because the specific locations of the density peaks would be uncorrelated between them. Their power spectra would match, but if you combined them and computed a power spectrum you would not get the right answer. However, in the current implementation, the only way to use the same initial conditions (same density field) again for a new mock is to also use the same Poisson sampling. In other words, there is only 1 random sample of sources possible from each random density field. So, if I ever want to increase the size of my mock, I have to recreate all the sources I already have. This is an issue not only due to wasted compute, it limits the size of mocks I can possibly make to the density I can fit in the RAM. If I want anything bigger (more dense), I simply cannot do it because I will always get the same source catalog from a given set of initial conditions. What my update does is separate the seeds for these two random processes to allow for each set of initial conditions to result in myriad possible source instantiations. If only the standard 'seed' is provided or no seed is provided at all, the behavior is identical to before and should preserve the functionality of all legacy code (I think). There is simply a new argument which allows me to fix the initial conditions (using the random seed for the density field) while leaving the sampling seed free to vary. Or, at least, that is its intention.
2 parents 7bd79d6 + 35b915c commit c29f379

1 file changed

Lines changed: 13 additions & 3 deletions

File tree

nbodykit/source/catalog/lognormal.py

Lines changed: 13 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,9 @@ class LogNormalCatalog(CatalogSource):
2929
seed : int, optional
3030
the global random seed; if set to ``None``, the seed will be set
3131
randomly
32+
cosmo_seed : int, optional
33+
the random seed used for constructing the density box; if set to ``None``,
34+
this will be set to 'seed'
3235
cosmo : :class:`nbodykit.cosmology.core.Cosmology`, optional
3336
this must be supplied if ``Plin`` does not carry ``cosmo`` attribute
3437
redshift : float, optional
@@ -43,13 +46,13 @@ class LogNormalCatalog(CatalogSource):
4346
`Agrawal et al. 2017 <https://arxiv.org/abs/1706.09195>`_
4447
"""
4548
def __repr__(self):
46-
return "LogNormalCatalog(seed=%(seed)d, bias=%(bias)g)" %self.attrs
49+
return "LogNormalCatalog(seed=%(seed)d, cosmo_seed=%(cosmo_seed)d, bias=%(bias)g)" %self.attrs
4750

4851
logger = logging.getLogger("LogNormalCatalog")
4952

5053
@CurrentMPIComm.enable
5154
def __init__(self, Plin, nbar, BoxSize, Nmesh, bias=2., seed=None,
52-
cosmo=None, redshift=None,
55+
cosmo_seed=None, cosmo=None, redshift=None,
5356
unitary_amplitude=False, inverted_phase=False, comm=None):
5457

5558
self.comm = comm
@@ -83,7 +86,14 @@ def __init__(self, Plin, nbar, BoxSize, Nmesh, bias=2., seed=None,
8386
if seed is None:
8487
if self.comm.rank == 0:
8588
seed = numpy.random.randint(0, 4294967295)
89+
if cosmo_seed is None:
90+
cosmo_seed = seed
91+
cosmo_seed = self.comm.bcast(cosmo_seed)
8692
seed = self.comm.bcast(seed)
93+
elif cosmo_seed is None:
94+
cosmo_seed = seed
95+
96+
self.attrs['cosmo_seed'] = cosmo_seed
8797
self.attrs['seed'] = seed
8898

8999
# make the actual source
@@ -141,7 +151,7 @@ def _makesource(self, BoxSize, Nmesh):
141151
self.logger.info("Growth Rate is %g" % f)
142152

143153
# compute the linear overdensity and displacement fields
144-
delta, disp = mockmaker.gaussian_real_fields(pm, self.Plin, self.attrs['seed'],
154+
delta, disp = mockmaker.gaussian_real_fields(pm, self.Plin, self.attrs['cosmo_seed'],
145155
unitary_amplitude=self.attrs['unitary_amplitude'],
146156
inverted_phase=self.attrs['inverted_phase'],
147157
compute_displacement=True,

0 commit comments

Comments
 (0)