Extra modules
dm_sheet
This is a module for ARES/HADES/BORG. It adds the algorithms dm_sheet to compute cosmological fields from the dark matter phase-space sheet (in particular, density and velocity fields from tetrahedra formalism).
borg_forward
supports the use of dm_sheet when it is available.
Setup
To use this module, clone the repository in $ARES_ROOT/extra/ (where $ARES_ROOT represents the root source directory of ARES on your computer).
For example, you can do:
cd $ARES_SOURCE/extra
git clone git@bitbucket.org:/bayesian_lss_team/dm_sheet.git dm_sheet
and rebuild.
Use
To use dm_sheet in borg_forward
, use the flag --dmsheet
. New
fields are then added to the output files.
Contributors
The main authors of this module are:
Florent Leclercq
Guilhem Lavaux
To add more features, please contact these people, or submit pull requests.
Additional contributions from:
James Prideaux-Ghee
References
Abel, O. Hahn, R. Kaehler (2012), Tracing the Dark Matter Sheet in Phase Space, arXiv:1111.3944
Hahn, R. Angulo, T. Abel (2015), The Properties of Cosmic Velocity Fields, arXiv:1404.2280
Leclercq, J. Jasche, G. Lavaux, B. Wandelt, W. Percival (2017), The phase-space structure of nearby dark matter as constrained by the SDSS, arXiv:1601.00093
hmclet
Guilhem has developped a much smaller variant of the Hamiltonian Markov Chain algorithm to jointly sample a limited set of parameters (like < 100).
This is HMCLET: a small extra HMC framework for ARES to allow sampling a bunch of model parameters together. It provides a self calibration step to estimate the masses for the HMC.
Setup
The code is available in “hmclet” module . To use it, clone this repository into extra/hmclet in ARES source tree. You can for example do:
cd $ARES_SOURCE/extra
git clone https://bitbucket.org/bayesian_lss_team/hmclet.git hmclet
Once it is checked out you can move to the build directory and run
cmake .
, then make
and you will have the new module compiled.
You can run libLSS/tests/test_hmclet
to check that no error is
triggered and verify the content of “test_sample.h5”. It must contain a
chain with 2 parameters for which the first one oscillates around 1 with
a variance of 10, and the other oscillates around 4 with a variance of
2.
Use
The Little HMC (HMClet, like Applet) framework consists in two classes
in the namespace LibLSS::HMCLet
:
JointPosterior, which is the one acting like a parent to your class describing the log-posterior,
SimpleSampler, which is using an instance of JointPosterior to generate samples using the HMC algorithm.
There is a demonstration (and test case) available in libLSS/tests/test_hmclet.cpp, please have a look at it.
To use SingleSampler you have to make a new class derivative of JointPosterior and implement three functions:
getNumberOfParameters()
which returns an integer corresponding to the number of parameters supported by your posteriorevaluate(parameters)
which returns the opposite of the log-posterior (i.e. like chi2/2)adjointGradient(parameters, adjoint_gradient)
which fills the adjoint gradient vector corresponding to the given parameters.
An example is as follow:
class MyPosterior: virtual public JointPosterior {
public:
/* Bla bla for constructor and destructor */
virtual size_t getNumberOfParameters() const {
return 1;
}
virtual double evaluate(VectorType const& params) {
return 0.5 * square(params[0]-1)/10.;
}
virtual void adjointGradient(VectorType const& params, VectorType& params_gradient) {
params_gradient[0] = (params[0]-1)/10.;
}
};
The above posterior will represent a Gaussian distribution centered on one, with a variance of 10. It depends on a single parameter.
The sampling would occur like this:
auto posterior = std::make_shared<MyPosterior>();
SimpleSampler sampler(posterior);
/* Calibrate the mass matrix.
* comm: MPI communication
* rgen: Random number generator
* steps: number of steps to attempt for calibration
* init_params: initial parameters to start calibration
* init_step: typical step size to start with
*/
sampler.calibrate(comm, rgen, steps, init_params, init_step);
/* Generate a sample with HMC
* comm: MPI communication
* rgen: Random number generator
* params: current parameter state
*/
sampler.newSample(comm, rgen, init_params);
virbius
To be written…
Python
This pages presents the features of the ARES/BORG Python module
Installation
bash get-aquila-modules.sh --clone
automatically retrieves the
module.
Use the --python
flag in build.sh
(see building). The
python package installation is automatic if you run make install
. At the end
of the make phase, a python module will be installed in the user site-package
directory and made available to python VM. If you also require to run with
python defined likelihood (see how to write a likelihood in python) with hades then you also need to append
--hades-python
while executing build.sh
. This requirement will probably
go away later.
Note
If you compile with MPI support the Python binding interface will look for the MPI4PY package. If it is not found, it will just proceed as usual. However, if it is found, the MPI4PY must have been compiled with the same MPI framework as ARES/BORG. Not doing so will very likely result in a segmentation fault when importing borg. A succesfull import will look like the following:
>>> import borg
Initializing console.
[INFO ] libLSS version v2.0.0alpha-47-g7d560cc built-in modules ares_fg;borg;dm_sheet;hades;hmclet;python
[INFO S ] Registered forward models:
[INFO S ] - 2LPT_CIC
[INFO S ] - 2LPT_CIC_OPENMP
[INFO S ] - 2LPT_DOUBLE
[INFO S ] - 2LPT_NGP
[INFO S ] - Downgrade
[INFO S ] - EnforceMass
[INFO S ] - HADES_LOG
[INFO S ] - HADES_PT
[INFO S ] - Haar
[INFO S ] - LPT_CIC
[INFO S ] - LPT_CIC_OPENMP
[INFO S ] - LPT_DOUBLE
[INFO S ] - LPT_NGP
[INFO S ] - PATCH_MODEL
[INFO S ] - PM_CIC
[INFO S ] - PM_CIC_OPENMP
[INFO S ] - PRIMORDIAL
[INFO S ] - PRIMORDIAL_FNL
[INFO S ] - Softplus
[INFO S ] - TRANSFER_EHU
[INFO S ] - Transfer
[INFO S ] - Upgrade
[INFO S ] - bias::BrokenPowerLaw
[INFO S ] - bias::DoubleBrokenPowerLaw
[INFO S ] - bias::EFT
[INFO S ] - bias::EFT_Thresh
[INFO S ] - bias::Linear
[INFO S ] - bias::ManyPower_1^1
[INFO S ] - bias::ManyPower_1^2
[INFO S ] - bias::ManyPower_1^4
[INFO S ] - bias::ManyPower_2^2
[INFO S ] - bias::Noop
[INFO S ] - bias::PowerLaw
[INFO ] Found MPI4PY.
[INFO ] CPU features: MMX [!AVX] [!AVX2] SSE SSE2 [!SSE3] [!SSE4.1] [!SSE4.2]
>>>
As you can see there is a line “Found MPI4PY”.
Usage
First step:
import borg
# This retrieve the console management object
console = borg.console()
# This prints at the STD level
console.print_std("Hello!")
# Reduce verbosity
console.setVerboseLevel(2)
Building your first chain
The BORG python pipeline closely follow the BORGForwardModel v2 API. This means that the input is assumed to be Gaussian random number with unit variance in Fourier space. Fortunately the generation of such numbers is easy:
import numpy as np
# Define a physical box (that is optional for this step, but it will be useful later
box = borg.forward.BoxModel()
box.L = (200,200,200)
box.N = (64,64,64)
# Generate gaussian random numbers, Fourier transform them, and rescale to ensure unit-variance
ic = np.fft.rfftn(np.random.randn(*box.N))/box.N[0]**(1.5)
In the above code snippet we have also defined a BORG box, which is at
the moment limited to 3d. box.L
is the physical size (in Mpc/h) of
the box in each direction, while box.N
is the grid size. In the
above you see that the Fourier transformed density has been rescaled by
\(1/\sqrt{N^3}\). This comes because of simple linear algebraic
properties, and the requirement of unit variance in the Fourier
representation.
Now we need to create a new chain object:
chain = borg.forward.ChainForwardModel(box)
chain.addModel(borg.forward.models.HermiticEnforcer(box))
We have immediately added an element that enforces that the elements of
the input Fourier density field to be self-complex conjugated. This is
not strictly required here as ic
was generated by np.fft.rfftn
.
Our first real element of the chain is the injection of primordial gravity fluctuation:
chain.addModel(borg.forward.models.Primordial(box, 0.1))
This multiplies in Fourier space the input density with a function:
\(A(k) \propto -k^{n_S/2-2}\) The exact constant of proportionality
depends on \(\sigma_8\) (or \(A_S\)), the volume and the Hubble
constant. Note the 0.1
which indicates the scale factor at which the
potential is seeded in the chain. The next elements depend on that
number.
The next element is to add a physical transfer function to produce density fluctuations out of this gravitational potential:
chain.addModel(borg.forward.models.EisensteinHu(box))
This is a simple Einsenstein & Hu power spectrum, which does not change the scale factor of the universe.
Now we need to add a real gravity solver. One simple solver is provided by “BorgLpt” (BORG 1-Lagrangian Perturbation Theory, also known as Zel’dovich approximation).
lpt = borg.forward.models.BorgLpt(box=box, box_out=box, ai=0.1, af=1.0, supersampling=4)
chain.addModel(lpt)
(Question from Andrija: What does the supersampling param control? The ai and af look intuitive enough, for initial scale factor and final one essentially controlling the time, but supersampling I don’t understand. Also doing help(borg.forward.models.BorgLpt) didn’t help me much in understanding)
In the above case we keep the object lpt
in the current scope to be
able to access more internal state later.
We can now setup the cosmology:
cosmo_par = borg.cosmo.CosmologicalParameters()
cosmo_par.default()
print(repr(cosmo_par))
chain.setCosmoParams(cosmo_par)
We have used some sane defaults for the cosmology in the above. The values of the parameters are printed using the print statement. All the elements of the chain are being updated with the last statement. They try to do this “lazily”, i.e. if the cosmology has not changed nothing will happen (as updating the internal cached state may be very costly).
The model is run with chain.forwardModel_v2(ic)
, which goes through
the entire chain. The final density field is not yet produced. To do
this we need to request it explicitly:
rho = np.empty(chain.getOutputBoxModel().N)
chain.getDensityFinal(rho)
rho
holds now density contrast of the simulation. In IPython, one
can show check a slice using:
from matplotlib import pyplot as plt
plt.imshow(rho[:,:,chain.getOutputBoxModel().N[2]//2])
plt.show()
Computing the adjoint gradient
The evaluation of the adjoint gradient follows the same pattern as for
the forward evaluation. Instead of the pair forwardModel_v2
and
getDensityFinal
, one must use adjointModel_v2
and
getAdjointModel
. However keep in mind the shapes of the arrays are
reversed: adjointModel_v2
requires an array according to the output
of the forward model. Thus we have:
dlogL_drho = np.empty(chain.getOutputBoxModel().N)
# Here fill up dlogL_drho from the gradient of the likelihood
chain.adjointModel_v2(dlogL_drho)
ic = np.empty(chain.getBoxModel().N)
chain.getAdjointModel(ic)
Note also that we have requested the initial conditions in real representation (and not Fourier). A Fourier representation may be requested by providing an adequate sized complex array.
Computing the velocity field
BORG comes pre-bundled with velocity field estimator (along with their adjoint gradient of course). A very simple estimator is provided by the CIC density estimator. It requires a particle based simulator to estimate the velocity field from. Such particle based simulators are for example BorgLpt, Borg2Lpt or BorgPM. If the types are not compatible, an exception will be thrown.
The usage is simple, here is an example:
vmodel = borg.forward.velocity.CICModel(box, lpt)
out_v = np.empty((3,)+box.N)
vmodel.getVelocityField(out_v)
The first statement creates the velocity field estimator, with the
requested box to be produced and the particle based forward model
lpt
(same variable as in the section “Building your first chain”). The second statement
allocates the required memory. The last statement triggers the
computation. The above statements shall be run after executing
forwardModel_v2
on the chain
object.
One can then show a slice (here of the x-component), and the check the compatibility with the density field:
plt.imshow(out_v[0,:,:,chain.getOutputBoxModel().N[2]//2])
plt.show()
Computing some bias models directly
PyBORG has a submodule called “bias” which provides a direct route to some of the bundled bias models (in C++ those are the generic bias models). Not all models are linked in though. The usage is relatively straightforward. We will use EFTBiasDefault as an example:
import numpy as np
import borg
boxm = borg.forward.BoxModel()
model = borg.forward.models.HadesLinear(boxm, 0.1, 1.0)
bias_model = borg.bias.EFTBiasDefault(0.1)
density = np.random.normal(size=boxm.N)
biased_density = np.zeros(boxm.N)
params = np.ones(7)
bias_model.compute(model, 1.0, params, density, biased_density)
The example starts by loading the borg
module. Then we just
construct a forward model element for the example using HadesLinear
.
In your code that should be a reasonable element that you used to
produce the matter density field. The bias model may try to discuss
directly with that element so it is a good practice to really provide
meaningful elements. Then we construct a bias model object
EFTBiasDefault
. This one has a mandatory argument to specify the
Lambda
parameter in that specific model, which we set to
\(0.1h \mathrm{Mpc}^{-1}\) here. The next steps are just
initialization of the field used for bias_model.compute
. As can be
directly inferred from the call the following arguments are required:
a borg forward model (
model
)the value of nmean (though it could be ignored depending on the specific bias model)
a 1d numpy array of float64 for the parameters of the model
the 3d density contrast (
density
)the output 3d biased density (
biased_density
)
Running with MPI
Using MPI requires some care that is not completely handled automatically.
One may initialize the python with MPI like this:
import numpy as np
import borg
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
In rank and size you will now have the rank of the current process in
the MPI communicator, and size will hold the total size. Then a typical
initialization chain of forward models may be constructed as indicated
there. Assuming that chain is such an
object one may query the expected slabs with getMPISlice()
(for the input) and getOutputMPISlice()
(for the output):
startN0,localN0,in_N1,in_N2 = chain.getMPISlice()
out_startN0,out_localN0,out_N1,out_N2 = chain.getOutputMPISlice()
These may be used like this:
x = np.zeros((localN0,in_N1,in_N2))
if localN0 > 0:
x[:,:,:] = ref_data[startN0:(startN0+localN0),:,:]
with ref_data
some array that covers the entire box. As you can see,
the x
array requires only the part between startN0 and
startN0+localN0 of that array. In practice that array (ref_data
) may
not have to exist in memory.
Then x
may be directly provided to chain.forwardModel_v2
as a
first argument. The output density field follow the same rule as the
input density field.
Writing a new forward model
The interface of the forward model in python closely follow the one in C++. The basic skeleton is given by the following lines of code:
import jax
class MyModel(borg.forward.BaseForwardModel):
# Constructor
def __init__(self, box):
super().__init__(box, box)
# IO "preferences"
def getPreferredInput(self):
return borg.forward.PREFERRED_REAL
def getPreferredOutput(self):
return borg.forward.PREFERRED_REAL
# Forward part
def forwardModel_v2(self, input_array):
self.save = jax.numpy.array(input_array)
def getDensityFinal(self, output_array):
output_array[:] = self.save**2
# Adjoint part
def adjointModel_v2(self, input_ag):
self.ag = input_ag
def getAdjointModel(self, output_ag):
output_ag[:] = 2 * self.ag * self.save
There are four main group in the function that needs be implemented:
the constructor. It is crucial that the constructor of the parent is explicitly called. Otherwise the interface will not work. The parent constructor takes two argument: the input box (of type
borg.forward.BoxModel
) and the output box (same type).the function providing the “preferred IO” for the forward and adjoint functions. In practice the preferrence is enforced for python. This means that the value indicated here will change the kind of arrays that are provided to the forward and adjoint part. At the moment two type of IO are possible:
PREFERRED_REAL: the model wants a 3d real space representation as an argument
PREFERRED_FOURIER: the model wants a 3d fourier space representation as an argument
then the forward evaluation part itself has to be implemented in two pieces:
forwardModel_v2
andgetDensityFinal
(it is optional depending on what is put after that model). It is expected thatforwardModel_v2
executes the main part of the computation but it is not fully required.finally the computation of the adjoint gradient follows the same pattern as the forward computation. The difference is that the types and shapes of arrays are reversed.
input_ag
has a shape/type corresponding to the output andoutput_ag
to the input.
Finally, as shown above, the input/output array are using a numpy interface. They can thus be used in JAX/Tensorflow/whatever. In the example code above the input array is saved in a jax array and evaluated later. This is legit, though bear in mind that means there will be memory that will not be freed while you retain that reference.
Build a python likelihood script
Ini file
[python]
likelihood_path=test_likelihood.py
bias_sampler_type=slice
The hades_python initializers
A typical python likelihood requires three initialization function. They must be registered using the helper decorators borg.registerGravityBuilder (for the forward model), borg.registerLikelihoodBuilder (for the bias+likelihood part), borg.registerSamplerBuilder (for extra sampling strategies).
An example of their use is the following piece of code:
import borg
@borg.registerGravityBuilder
def build_gravity_model(state, box):
global model
chain = borg.forward.ChainForwardModel(box)
chain.addModel(borg.forward.models.HermiticEnforcer(box))
chain.addModel(borg.forward.models.Primordial(box, 1.0))
chain.addModel(borg.forward.models.EisensteinHu(box))
model = chain
return chain
@borg.registerLikelihoodBuilder
def build_likelihood(state, info):
boxm = model.getBoxModel()
return MyLikelihood(model, boxm.N, boxm.L)
@borg.registerSamplerBuilder
def build_sampler(state, info):
return []
The build_gravity_model function returns a BORGForwardModel object, and take a MarkovState and a BoxModel as parameters. The build_likelihood function must return a Likelihood3d object (check help(borg.likelihood.Likelihood3d)). Finally build_sampler must return a list of sampler object.
The forward model elements can be either the C++ or Python object and both work transparently. Likelihoods may also be written in pure python though MPI is still untested at this time (August 2020).
Writing a likelihood
In the previous section we have seen how to build the objects required
by hades_python to analyze data. We have not approached how to write a
likelihood in python. A lot of likelihood and bias are already available
from the C++ side, for example borg.likelihood.GaussianPassthrough
,
borg.likelihood.GaussianLinear
or
borg.likelihood.PoissonPowerLaw
. To create new ones easily in
python, one has to write a class inheriting from
borg.likelihood.BaseLikelihood
and implement a number of functions.
An example of a simple gaussian likelihood is shown herebelow:
import borg
cons = borg.console()
myprint = lambda x: cons.print_std(x) if type(x) == str else cons.print_std(
repr(x))
class MyLikelihood(borg.likelihood.BaseLikelihood):
def __init__(self, fwd, N, L):
myprint(f" Init {N}, {L}")
super().__init__(fwd, N, L)
def initializeLikelihood(self, state):
myprint("Init likelihood")
self.data = state['galaxy_data_0']
state.newArray3d("my_density_field", True, self.data.shape[0],
self.data.shape[1], self.data.shape[2])
def updateMetaParameters(self, state):
cpar = state['cosmology']
myprint(f"Cosmology is {cpar}")
self.getForwardModel().setCosmoParams(cpar)
def generateMockData(self, s_hat, state):
fwd = self.getForwardModel()
output = np.zeros(fwd.getOutputBoxModel().N)
fwd.forwardModel_v2(s_hat)
fwd.getDensityFinal(output)
state['galaxy_data_0'][:] = output + np.random.normal(
size=output.shape) * sigma_noise
state['my_density_field'][:] = output
like = ((state['galaxy_data_0'][:] - output)**2).sum() / sigma_noise**2
myprint(
f"Initial log_likelihood: {like}, var(s_hat) = {np.var(s_hat)}")
def logLikelihoodComplex(self, s_hat, gradientIsNext):
fwd = self.getForwardModel()
output = np.zeros(fwd.getBoxModel().N)
fwd.forwardModel_v2(s_hat)
fwd.getDensityFinal(output)
L = 0.5 * ((output - self.data)**2).sum() / sigma_noise**2
myprint(f"var(s_hat): {np.var(s_hat)}, Call to logLike: {L}")
return L
def gradientLikelihoodComplex(self, s_hat):
fwd = self.getForwardModel()
output = np.zeros(fwd.getOutputBoxModel().N)
fwd.forwardModel_v2(s_hat)
fwd.getDensityFinal(output)
mygradient = (output - self.data) / sigma_noise**2
fwd.adjointModel_v2(mygradient)
mygrad_hat = np.zeros(s_hat.shape, dtype=np.complex128)
fwd.getAdjointModel(mygrad_hat)
return mygrad_hat
The function myprint
is an helper to create nice output that streams
correctly with the rest of the C++ code. It is not mandatory but
strongly recommended to use the borg.console() object as it will
seemlessly integrate with other BORG tools.
We will now look at each function one after the other:
__init__
is the constructor. It is crucial that the base constructor is called in the constructor of the new class: it will not be done implicitly by the python virtual machine. The base constructor takes aBORGForwardModel
object, and the grid specificationsN
andL
as tuples.initializeLikelihood
is called at the initialization of the chain and before restoration. If you want to store additional fields in the mcmc, you should allocate them at that moment in the state object. In the above example, a new 3d array is allocated to store the density field after the forward model evaluation.
Note that the forward model has to be evaluated in the log likelihood and its gradient. Though it is in principle required to implement logLikelihood and gradientLikelihood (the real counterpart of the complex functions hereabove), in practice they are not used for the run.