Code tutorials

Types used in the ARES code

A lot of the useful type ‘aliases’ are actually defined in libLSS/samplers/core/types_samplers.hpp. We can discuss a few of those types here.

LibLSS::multi_array

template<typename T, size_t N>
using multi_array = boost::multi_array<T, N, LibLSS::track_allocator<T>>;

This is a type alias for boost::multi_array that uses the allocator provided by LibLSS to track allocations. It is advised to use it so that memory consumption can be investigated automatically in the future. It is perfectly legal not to use it; however, you will then lose those features in your report.
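A minimal usage sketch (the grid sizes N0, N1, N2 are placeholders); the alias behaves exactly like boost::multi_array:

#include "libLSS/samplers/core/types_samplers.hpp"
#include <boost/multi_array.hpp>

// Tracked 3d array; indexing and extents work as for boost::multi_array.
LibLSS::multi_array<double, 3> rho(boost::extents[N0][N1][N2]);
rho[0][0][0] = 1.0;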

LibLSS::ArrayType

This is a type to hold, and store in the MCMC file, a 3d array targeted to be used in FFT transforms. The definition is

typedef ArrayStateElement<double, 3, FFTW_Allocator<double>, true > ArrayType;

ArrayType is somewhat misnamed, as it is only a shell for the type; in the future we can expect it to be renamed to something like ArrayTypeElement. We can see that it is a double array with 3 dimensions. It requires an FFTW_Allocator, and it is a spliced array that must be reconstructed for mcmc files (the last ‘true’).

Allocating the element requires the array to be allocated at the same time. An example is as follows:

s_field = new ArrayType(extents[range(startN0,startN0+localN0)][N1][N2], allocator_real);
s_field->setRealDims(ArrayDimension(N0, N1, N2));

To access the underlying multi_array one needs to go through the member variable array. In the case of the above s_field, it would be:

auto& my_array = *s_field->array;
// Now we can access the array
std::cout << my_array[startN0][0][0] << std::endl;

Warning

Do not store a raw pointer or reference to the above my_array beyond the lifetime of s_field. The array member variable is a shared pointer, which can be safely stored with the type std::shared_ptr<LibLSS::ArrayType::ArrayType>.
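For example, a minimal sketch of keeping the array alive by holding that shared pointer (following the type given above):

// Keep a shared handle; the data stays valid even if s_field is deleted later.
std::shared_ptr<LibLSS::ArrayType::ArrayType> held = s_field->array;
auto& arr = *held;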

LibLSS::CArrayType

This is a type to hold, and store in the MCMC file, a 3d complex array targeted to be used in FFT transforms. The definition is

typedef ArrayStateElement<std::complex<double>, 3, FFTW_Allocator<std::complex<double> >, true > CArrayType;

CArrayType is likewise misnamed, as it is only a shell for the type; in the future we can expect it to be renamed to something like CArrayTypeElement. We can see that it is a complex double array with 3 dimensions. It requires an FFTW_Allocator, and it is a spliced array that must be reconstructed for mcmc files (the last ‘true’).

Allocating the element requires the array to be allocated at the same time. An example is as follows:

s_hat_field = new CArrayType(base_mgr->extents_complex(), allocator_complex);
s_hat_field->setRealDims(ArrayDimension(N0, N1, N2_HC));

LibLSS::Uninit_FFTW_Complex_Array

The types above are for arrays intended to be saved in the MCMC file. To allocate *temporary* arrays that still need to be run through FFTW, the adequate type is:

typedef UninitializedArray<FFTW_Complex_Array, FFTW_Allocator<std::complex<double> > > Uninit_FFTW_Complex_Array;

This is a helper type because boost::multi_array insists on doing a slow pre-initialization of the large arrays that we use. To circumvent that initialization, the trick is to create a boost::multi_array_ref on memory allocated by a helper class. UninitializedArray is built for that; however, it comes at the cost of adding one step before using the array:

Uninit_FFTW_Complex_Array gradient_psi_p(extents[range(startN0,startN0+localN0)][N1][N2_HC],
                                           allocator_complex);
Uninit_FFTW_Complex_Array::array_type& gradient_psi = gradient_psi_p.get_array();

Here ‘gradient_psi_p’ is the holder of the array (i.e. if it gets destroyed, the array itself is destroyed). But if you want to use the array you need to first get it with ‘get_array’.

FFTW manager

Using FFTW, particularly with MPI, can be delicate and requires a lot of intermediate steps. A specific class was created to handle a good fraction of these often-used code patterns. The class is named LibLSS::FFTW_Manager_3d and is defined in libLSS/tools/mpi_fftw_helper.hpp. That class is limited to the management of 3d transforms. A generalization to \(N\) dimensions is also available: LibLSS::FFTW_Manager<T,Nd>. We will only talk about that last generalization here.

Initializing the manager

The constructor is fairly straightforward to use. It has \(N+1\) parameters: the first \(N\) parameters specify the grid dimensions and the last one the MPI communicator.

Allocating arrays

The manager provides a very quick way to allocate arrays that are padded correctly and incorporate the appropriate limits for MPI. The two functions are allocate_array() and allocate_complex_array(). The first one allocates the array with the real representation and the second with the complex representation. The returned values are of type UninitializedArray. A typical usage is the following:

FFTW_Manager<double, 3> mgr(N0, N1, N2, comm);
{
  auto array = mgr.allocate_array();
  auto& real_array = array.get_array();

  real_array[i][j][k] = something;
  // The array is totally destroyed when exiting here.
  //
}

Arrays allocated that way are designed to be temporary.

Reading in meta-parameters and arrays

If one wishes to access the content of ARES MCMC files in C++, functions are available in CosmoTool and LibLSS. For example:

#include <iostream>
#include <boost/multi_array.hpp> //produce arrays
#include "CosmoTool/hdf5_array.hpp" //read h5 atributes as said arrays
#include "libLSS/tools/hdf5_scalar.hpp" //read h5 attributes as scalars
#include <H5Cpp.h> //access h5 files

using namespace std;
using namespace LibLSS;

int main()
{
    typedef  boost::multi_array<double, 3> array3_type;

    //access mcmc and restart files
    H5::H5File meta("restart.h5_0", H5F_ACC_RDONLY);
    H5::H5File f("mcmc_0.h5", H5F_ACC_RDONLY);

    //read the number of pixels of the cube as integer values (x,y,z)
    int N0 = LibLSS::hdf5_load_scalar<int>(meta, "scalars/N0");
    int N1 = LibLSS::hdf5_load_scalar<int>(meta, "scalars/N1");
    int N2 = LibLSS::hdf5_load_scalar<int>(meta, "scalars/N2");

    array3_type density(boost::extents[N0][N1][N2]);

    //read the density field as a 3d array
    CosmoTool::hdf5_read_array(f, "scalars/s_field", density);
}

Obtaining timing statistics

By default the statistics are not gathered. It is possible (and advised during development and testing) to activate them through a build.sh option --perf. In that case, each “ConsoleContext” block is timed separately. In the C++ code, a console context behaves like this:

/* blabla */
{
  LibLSS::ConsoleContext<LOG_DEBUG> ctx("costly computation");

  /* Computations */
  ctx.print("Something I want to say");
} /* Exiting context */
/* something else */

Another variant that automatically notes down the function name and the filename is

/* blabla */
{
   LIBLSS_AUTO_CONTEXT(LOG_DEBUG, ctx);
  /* Computations */
  ctx.print("Something I want to say");
} /* Exiting context */
/* something else */

A timer is started at the moment the ConsoleContext object is created. The timer is stopped at the “Exiting context” stage, and the result is recorded in a separate hash table. Be aware that in production mode you should turn off the performance measurements, as they take time for functions that are called very often. You can decide on a log level different from LOG_DEBUG (it can be LOG_VERBOSE, LOG_INFO, …); it is the default level for any print call used with the context.

The string given to the console context is used as an identifier, so please use something sensible. At the moment the code gathering performance statistics is not aware of how things are called recursively, so you will only get one line per context. Once you have run an executable based on libLSS it will produce a file called “timing_stats.txt” in the current working directory. It is formatted like this:

Cumulative timing spent in different context
--------------------------------------------
Context,   Total time (seconds)

                          BORG LPT MODEL        2       0.053816
                   BORG LPT MODEL SIMPLE        2       0.048709
                      BORG forward model        2       0.047993
                  Classic CIC projection        2       0.003018
(...)

It consists of three columns, separated by tabs. The first column is the name of the context, the second column is the number of times this context has been called, and the third column is the cumulative time taken by this context, in seconds. At the moment the output is not sorted, but it may be in the future. You want the total time to be as small as possible. This time may be large for two reasons: you call the context an insane number of times, or you call it a few times but each call is very costly. The optimization to achieve is then up to you.

Multi-dimensional array management

Allocating arrays

There are several ways of allocating multidimensional arrays, depending on the effect that is to be achieved.

For use with FFTW/MPI

It is strongly recommended to use the class FFTW_Manager<T,N> (see the FFTW manager documentation above; most of BORG is used assuming T=double, N=3 for 3D) to allocate arrays, as MPI and FFTW need some specific padding and over-allocation of memory which are difficult to get right at first. Assuming mgr is such an object, you can allocate an array like this:

auto array_p = mgr.allocate_array();
auto& a = array_p.get_array();

// a is now a boost::multi_array_ref
for (int i = a0; i < a1; i++)
  for (int j = b0; j < b1; j++)
    for (int k = c0; k < c1; k++)
      std::cout << "a has some value " << a[i][j][k] << std::endl;

With the above statement, keep in mind that the array will be destroyed at the exit of the context. It is possible to have more permanent arrays with the following statement:

auto array_p = mgr.allocate_ptr_array();
auto& a = array_p->get_array();

// array_p is a shared_ptr that can be saved elsewhere
// a is now a boost::multi_array_ref

Uninitialized array

Generally it is advised to allocate arrays with the type LibLSS::U_Array<T,N>. It creates an array that is much faster to initialize, and statistics on memory allocation are gathered.

The typical usage is the following:

using namespace LibLSS;

U_Array<double, 2> x_p(boost::extents[N][M]);
auto& x = x_p.get_array();

The line with U_Array allocates the array (gathering statistics at the same time); the second line provides you with a boost::multi_array_ref object through which all elements can be accessed directly as usual (see the previous section).

Dumping an array of scalars

A significant amount of abstraction has been coded in to make dumping arrays into an HDF5 file as painless as possible. Typically, to dump an array you would write the following code.

#include <H5Cpp.h>
#include <CosmoTool/hdf5_array.hpp>
#include <boost/multi_array.hpp>

void myfunction() {
   boost::multi_array<double, 2> a(boost::extents[10][4]);

   // Do some initialization of a

   {
     // Open and truncate myfile.h5 (i.e. removes everything in it)
     H5::H5File f("myfile.h5", H5F_ACC_TRUNC);
     // Save 'a' into the dataset "myarray" in the file f.
     CosmoTool::hdf5_write_array(f, "myarray", a);
   }
}

But your array needs to be either a multi_array or mapped to one through multi_array_ref. Usual types (float, double, int, …) are supported, as well as complex types thereof. There is also a mechanism to allow for the
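As an illustration of the multi_array_ref mapping, here is a minimal sketch (the raw buffer raw_data and the output file name are placeholders) that wraps an existing contiguous buffer and dumps it with the same call:

// Wrap a raw contiguous buffer as a 10x4 array, without copying it.
double raw_data[40];
boost::multi_array_ref<double, 2> a_ref(raw_data, boost::extents[10][4]);

H5::H5File f2("myfile2.h5", H5F_ACC_TRUNC);
CosmoTool::hdf5_write_array(f2, "myarray_ref", a_ref);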

FUSE array mechanism

The FUSE subsystem is made available through the includes libLSS/tools/fused_array.hpp and libLSS/tools/fuse_wrapper.hpp. They define wrappers and operators to make the writing of expressions on arrays relatively trivial, parallelized, and possibly vectorized if the arrays permit it. To illustrate this there are two examples in the library of test cases: test_fused_array.cpp and test_fuse_wrapper.cpp.

We will start from a most basic example:

boost::multi_array<double, 1> a(boost::extents[N]);
auto w_a = LibLSS::fwrap(a);

w_a = 1;

These few lines create a one-dimensional array of length N. Then this array is wrapped in the seamless FUSE expression system. It is strongly advised to use auto here, as the types can be complex and difficult to guess for newcomers. Finally, the last line fills the array with the value 1. This is a trivial example, but we can do better:

w_a = std::pow(std::cos(w_a*2*M_PI), 2);

This transforms the content of a by evaluating \(\cos(2\pi x)^2\) for each element \(x\) of the array wrapped in w_a. This is done without copies, using the lazy expression mechanism. It is possible to save the expression for later:

auto b = std::pow(std::cos(w_a*2*M_PI), 2);

Note that nothing is evaluated yet; this only occurs at the assignment phase. The wrap also behaves mostly like a virtual array:

(*b)[i]

computes the i-th value of the expression and nothing else.

Some other helpers in libLSS natively support the FUSE mechanism. That is the case for RandomNumber::poisson, for example:

auto c = fwrap(...);
c = rgen.poisson(b);

This piece of code computes a Poisson realization for a mean value given by each element of the b expression (which must be a wrapped array or an expression built from one) and stores the result into c.

The sum reduce (parallel reduction) operation is supported by the wrapper:

double s = c.sum();

Some arrays can be entirely virtual, i.e. derived from C++ expressions. This needs to invoke a lower layer of the FUSE mechanism. Creating a pure virtual array looks like this:

auto d = LibLSS::fwrap(LibLSS::b_fused_idx<double, 2>(
   [](size_t i, size_t j)->double {
     return sqrt(i*i + j*j);
   }
));

This operation creates a virtual array and wraps it immediately. The virtual array is a two-dimensional double array (the two template parameters), and it is infinite. Its elements are computed using the provided lambda function, which must take 2 parameters. It is possible to make finite virtual arrays by adding an extent parameter:

auto d = LibLSS::fwrap(LibLSS::b_fused_idx<double, 2>(
   [](size_t i, size_t j)->double {
     return sqrt(i*i + j*j);
   },
   boost::extents[N][N]
));

Only in that case is it possible to query the dimensions of the array.

Finally, note that the FUSE mechanism does not yet support automatic dimensional broadcasting!

MPI tools

Automatic particle exchange between MPI tasks

It is often useful for code doing N-body simulations to exchange the ownership of particles and all their attributes. The BORG submodule has a generic framework to handle these cases. It is composed of the following parts:

  • a BalanceInfo structure (in libLSS/physics/forwards/particle_balancer/particle_distribute.hpp), which holds temporary information required to do the balancing and, if needed, undo it for adjoint gradients. It has an empty constructor and a special function allocate, which takes an MPI communicator and the number of particles to be considered (including extra buffering).

  • generic distribute / undistribute functions called respectively particle_redistribute and particle_undistribute.

  • a generic attribute management system to remove buffer copies.

We can start from an example taken from test_part_swapper.cpp:

BalanceInfo info;
NaiveSelector selector;
boost::multi_vector<double, 2> in_positions;
size_t numRealPositions, Nparticles;

/* Fill in_positions... */

info.allocate(comm, Nparticles);

info.localNumParticlesBefore = numRealPositions;
particle_redistribute(info, in_positions, selector);
/* info.localNumParticlesAfter is filled */

In the code above all the initializations are skipped. The load balancer is initialized with allocate. The actual number of particles present in the input buffer is then indicated by filling localNumParticlesBefore, and particle_redistribute is invoked. The particles may be completely reshuffled in that operation. The real number of viable particles afterwards is indicated in localNumParticlesAfter. Finally, but importantly, the balancing decision is taken by selector, which at the moment must be a functor and bases its decision on the position alone. In the future it may become possible to use an attribute instead.
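When the operation has to be undone, for instance in an adjoint gradient pass, the same BalanceInfo can be reused. A minimal sketch, assuming particle_undistribute mirrors the call pattern of particle_redistribute:

/* Reverse the redistribution using the decisions recorded in 'info'
   (exact signature to be checked against particle_distribute.hpp). */
particle_undistribute(info, in_positions);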

Now it is possible to pass an arbitrary number of attributes, living in separate array-like objects. The example is similar to the previous one:

BalanceInfo info;
NaiveSelector selector;
boost::multi_vector<double, 2> in_positions;
boost::multi_vector<double, 2> velocities;
size_t numRealPositions, Nparticles;

/* Fill in_positions... */

info.allocate(comm, Nparticles);

info.localNumParticlesBefore = numRealPositions;
particle_redistribute(info, in_positions, selector,
      make_attribute_helper(Particles::vector(velocities))
);
/* info.localNumParticlesAfter is filled */

The code will automatically allocate a small amount of temporary memory to accommodate the I/O operations. Two kinds of attributes are supported by default (see the sketch after this list), though this is extendable by creating new adequate classes:

  • scalar: a simple 1d array of single elements (float, double, whatever is supported by the automatic MPI translation layer and does not rely on dynamic allocations).

  • vector: a simple 2d array of the shape Nx3 of whatever elements supported by the automatic MPI translation layer.
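For instance, a hedged sketch combining both kinds of attributes (Particles::scalar is assumed by analogy with Particles::vector used above; masses is a hypothetical 1d array):

boost::multi_array<double, 1> masses(boost::extents[Nparticles]);

/* 'Particles::scalar' is an assumption mirroring 'Particles::vector'. */
particle_redistribute(info, in_positions, selector,
      make_attribute_helper(Particles::scalar(masses),
                            Particles::vector(velocities))
);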

Ghost planes

The BORG module has special capabilities to handle ghost planes, i.e. (N-1)d planes of an Nd cube that is split for MPI work. This happens typically when using FFTW, for which only a slab of planes is available locally while the code needs some information from the other planes to do local computations. An example of this case is the computation of a gradient: one needs one extra plane at each edge of the slab to be able to compute it. The ghost plane mechanism tries to automate the boring part of gathering this information and, when needed, redistributing the adjoint gradient of that same operation. The header is libLSS/tools/mpi/ghost_planes.hpp and it exports one templated structure:

template<typename T, size_t Nd>
struct GhostPlanes: GhostPlaneTypes<T, Nd> {
  template<typename PlaneList,typename PlaneSet, typename DimList>
  void setup(
      MPI_Communication* comm_,
      PlaneList&& planes, PlaneSet&& owned_planes,
      DimList&& dims,
      size_t maxPlaneId_);

   void clear_ghosts();

   template<typename T0, size_t N>
   void synchronize(boost::multi_array_ref<T0,N> const& planes);

   template<typename T0, size_t N>
   void synchronize_ag(boost::multi_array_ref<T0,N>& ag_planes);

   ArrayType& ag_getPlane(size_t i);
   ArrayType& getPlane(size_t i);
};

Many comments are written in the code. Note that Nd above designates the number of dimensions of a plane. So if you manipulate 3d boxes, you want to indicate Nd=2. The typical workflow for using ghost planes is the following:

  • GhostPlanes object creation

  • call setup method to indicate what are the provided data and requirements

  • do stuff

  • call synchronize before needing the ghost planes

  • use the ghost planes with getPlane()

  • Repeat synchronize if needed

There is an adjoint gradient variant of the synchronization step which does sum reduction of the adjoint gradient arrays corresponding to the ghost planes.

An example C++ code is

std::vector<size_t> here_planes{/* list of the planes that are on the current MPI node */};
std::vector<size_t> required_planes{/* list of the  planes that you need to do computation on this node */};
ghosts.setup(comm, required_planes, here_planes, std::array<int,2>{128,128} /* That's the dimension of the plane, here 2d */, 64 /* That's the total number of planes over all nodes */);

/* A is a slab with range in [startN0,startN0+localN0]. This function will synchronize the data over all nodes. */
ghosts.synchronize(A);

/* ghosts.getPlane(plane_id) will return a 2d array containing the data of the ghost plane 'plane_id'. Note that the data of A are not accessible through that function. */

The synchronize and synchronize_ag functions accept an optional argument to indicate what kind of synchronization the user wants. At the moment two synchronization modes are supported: GHOST_COPY and GHOST_ACCUMULATE. GHOST_COPY is the classic mode, which indicates that the missing planes have to be copied from a remote task to the local memory. GHOST_ACCUMULATE specifies that the adjoint gradient will accumulate information from the different tasks. Note that the array A is a slab. This means that if you do not use the FFTW helper mechanism you should allocate it using the following pattern for 3d arrays:

// Some alias for convenience
using boost::extents;
typedef boost::multi_array_types::extent_range e_range;

/* To use a classical multi_array allocation, may be slow */
boost::multi_array<double, 3> A(extents[e_range(startN0, startN0+localN0)][N1][N2]);

/* To allocate using the uninitialized array mechanism */
U_Array<double, 3> A_p(extents[e_range(startN0, startN0+localN0)][N1][N2]);
auto& A = A_p.get_array();
// Note that A_p is destroyed at the end of the current context if you
// use that.

/* To allocate using the uninitialized array mechanism, and shared_ptr */
std::shared_ptr<U_Array<double, 3>> A_p = std::make_shared<U_Array<double, 3>>(extents[e_range(startN0, startN0+localN0)][N1][N2]);
auto& A = A_p->get_array();

// If A_p is transferred somewhere else, then it will not be deallocated.

For 2d arrays, just remove one dimension in all the above code.

The use of the adjoint gradient part is very similar:

ghosts.clear_ghosts();

/* declare gradient, fill up with the local information on the slab */
/* if there is information to deposit on 'plane' use the special array as follow*/
ghosts.ag_getPlane(plane)[j][k] = some_value;

/* finish the computation with synchronize_ag; the gradient will then be complete */
ghosts.synchronize_ag(gradient);

/* now 'gradient' holds the complete gradient that must reside on the local slab and the computation may continue */

You can check extra/borg/libLSS/samplers/julia/julia_likelihood.cpp for a more detailed usage in the Julia binding. This tool is also used by the ManyPower bias model, though in a much more complicated fashion (extra/borg/libLSS/physics/bias/many_power.hpp).

Julia and TensorFlow

The julia language can be used within HADES. It is automatically installed if julia (at least v0.7.0) is available on the machine and if the hmclet is pulled into extra/. Note that julia is a relatively new language and develops quickly - it is also 1 indexed!

hmclet

At the moment, the julia core is available as part of hmclet - a small HMC which can be used to sample external parameters, such as bias parameters.

.jl files

The julia code is contained in .jl files which must contain several things to be used by the hmclet. An example of a linear bias test likelihood can be found in extra/hmclet/example/test_like.jl.

Initialisation file

The .ini file needs a few lines added to describe the julia file to use, the name of the module defined in that julia file, and whether to use a slice sampler or the hmclet. They are added to the .ini file as

[julia]
likelihood_path=test_like.jl
likelihood_module=julia_test
bias_sampler_type=hmclet

Module name and importing from libLSS

Each julia file must contain a module (whose name is entered in the .ini file)

module julia_test

To be able to import from libLSS (including the state and the print functions) the julia module needs to contain the following using statement, including the dots.

using ..libLSS

import ..libLSS.State
import ..libLSS.GhostPlanes, ..libLSS.get_ghost_plane
import ..libLSS.print, ..libLSS.LOG_INFO, ..libLSS.LOG_VERBOSE, ..libLSS.LOG_DEBUG

The dots are necessary: the second dot accesses the current module and the first dot accesses the higher-level directory.

Importing modules

Any other julia module can be included in this julia code by using

using MyModule

where MyModule can be self-defined or installed, before calling it in HADES, using

using Pkg
Pkg.add("MyModule")

in a julia terminal.

Necessary functions

A number of functions must be defined in the julia code for it to be used by the hmclet. These are:

function initialize(state)
    print(LOG_INFO, "Likelihood initialization in Julia")
    # This is where hmclet parameters can be initialised in the state
    NCAT = libLSS.get(state, "NCAT", Int64, synchronous=true) # Number of catalogs
    number_of_parameters = 2 # Number of parameters
    for i=1:NCAT
        hmclet_parameters = libLSS.resize_array(state, "galaxy_bias_"*repr(i - 1), number_of_parameters, Float64)
        hmclet_parameters[:] = 1
    end
end

function get_required_planes(state::State)
    print(LOG_INFO, "Check required planes")
    # This is where the planes are gathered when they live on different mpi nodes
    return Array{UInt64,1}([])
end

function likelihood(state::State, ghosts::GhostPlanes, array::AbstractArray{Float64,3})
   print(LOG_INFO, "Likelihood evaluation in Julia")
   # Here is where the likelihood is calculated and returned.
   # This can be a call to likelihood_bias() which is also a necessary function
   NCAT = libLSS.get(state, "NCAT", Int64, synchronous=true)
   L = Float64(0.)
   for i=1:NCAT
       hmclet_parameters = libLSS.get_array_1d(state, "galaxy_bias_"*repr(i - 1), Float64)
       L += likelihood_bias(state, ghosts, array, i, hmclet_parameters)
   end
   return L
end

function generate_mock_data(state::State, ghosts::GhostPlanes, array::AbstractArray{Float64,3})
    print(LOG_INFO, "Generate mock")
    # Mock data needs to be generated also
    NCAT = libLSS.get(state, "NCAT", Int64, synchronous=true)
    for i=1:NCAT
        data = libLSS.get_array_3d(state, "galaxy_data_"*repr(i - 1), Float64)
        generated_data = function_to_generate_data() # We can use other functions which are defined within the julia module
        for i=1:size(data)[1],j=1:size(data)[2],k=1:size(data)[3]
            data[i, j, k] = generated_data[i, j, k] + libLSS.gaussian(state) # We can use functions defined in libLSS
        end
    end
end

function adjoint_gradient(state::State, array::AbstractArray{Float64,3}, ghosts::GhostPlanes, ag::AbstractArray{Float64,3})
    print(LOG_VERBOSE, "Adjoint gradient in Julia")
    # The gradient of the likelihood with respect to the input array
    NCAT = libLSS.get(state, "NCAT", Int64, synchronous=true)
    ag[:,:,:] .= 0 # Watch out - this . before the = is necessary... extremely necessary!
    for i=1:NCAT
       # Calculate the adjoint gradient here and update ag
       # Make sure not to update any gradients which are not in the selection
       selection = libLSS.get_array_3d(state, "galaxy_sel_window_"*repr(i - 1), Float64)
       mask = selection .> 0
       adjoint_gradient = function_to_calculate_adjoint_gradient()
       ag[mask] += adjoint_gradient[mask]
    end
end

function likelihood_bias(state::State, ghosts::GhostPlanes, array, catalog_id, catalog_bias_tilde)
    # The likelihood after biasing the input array
    L = function_to_calculate_likelihood()
    return L
end

function get_step_hint(state, catalog_id, bias_id)
    # Guess for the initialisation of the hmclet mass matrix or the slice sample step size
    return 0.1
end

function log_prior_bias(state, catalog_id, bias_tilde)
    # Prior for the bias parameters
    return 0.
end

function adjoint_bias(state::State, ghosts::GhostPlanes, array, catalog_id, catalog_bias_tilde, adjoint_gradient_bias)
    # Calculate the gradient of the likelihood with respect to the parameters in the hmclet
    adjoint_gradient_bias[:] .= function_to_calculate_gradient_with_respect_to_bias()
end

TensorFlow in julia

One amazing advantage of having julia built into HADES is that we can now use TensorFlow. TensorFlow is a very powerful tensor-based computational language which has the exact same syntax for running on GPUs and CPUs. The TensorFlow.jl package is not officially supported, but it is relatively well maintained, although it is based on v1.4 whilst the current version is well beyond that. One can use a newer version of TensorFlow by installing it from source and placing it in the julia TensorFlow directory; however, doing this does not give you access to all the commands available in TensorFlow. For example, TensorFlow.subtract() and TensorFlow.divide() do not exist. Fortunately, a lot of julia functions work on TensorFlow tensors (such as -, .-, / and ./).

There is a TensorFlow implementation of test_like.jl (discussed above) in extra/hmclet/example/test_like_TF.jl.

The essence of TensorFlow is to build a graph of tensors connected by computations. Once the graph is built then results are accessed by passing values through the graph. An example graph could be:

using TensorFlow
using Distributions # To be used for initialising variable values

a = TensorFlow.placeholder(Float64, shape = [100, 1], name = "a")     # This is a tensor which contains no value and has a shape
                                                                      # of [100, 1]
b = TensorFlow.placeholder(Float64, shape = (), name = "b")           # This is a tensor which contains no value or shape

c = TensorFlow.placeholder(Float64, shape = [1, 10], name = "c")      # This is a tensor which has no value and has a shape of [1, 10]

variable_scope("RandomVariable"; initializer=Normal(0., 0.1)) do
    global d = TensorFlow.get_variable("d", Int64[10], Float64)       # This is a variable tensor which can be initialised to a value
end                                                                   # and has a shape of [10]. It must be global so that it persists
                                                                      # outside of the scope
e = TensorFlow.constant(1.:10., dtype = Float64, name = "e")          # This is a tensor of constant value with shape [10]

f = TensorFlow.matmul(a, c, name = "f")                               # Matrix multiplication of a and c with output shape [100, 10]

#g = TensorFlow.matmul(b, c, name = "g")                              # Matrix multiplication of b and c
                                                                      # !THIS WILL FAIL SINCE b HAS NO SHAPE! Instead one can use
g = TensorFlow.identity(b .* c, name = "g")                           # Here we make use of the overloaded broadcast multiplication
                                                                      # in julia; the tensor will say it has shape [1, 10]
                                                                      # but this might not be true. We use identity() to give the
                                                                      # tensor a name.

h = TensorFlow.add(f, e, name = "h")                                  # Addition of f and e

i = TensorFlow.identity(f - e, name = "i")                            # Subtraction of f and e

j = TensorFlow.identity(f / e, name = "j")                            # Matrix division of f and e

k = TensorFlow.identity(j ./ i, name = "k")                           # Elementwise division of j by i

We now have lots of tensors defined, but notice that these are tensors and are not available as valued quantities until they are run. For example running these tensors gives

a
    > <Tensor a:1 shape=(100, 1) dtype=Float64>
b
    > <Tensor b:1 shape=() dtype=Float64> # Note this is not the real shape of this tensor
c
    > <Tensor c:1 shape=(1, 10) dtype=Float64>
d
    > <Tensor d:1 shape=(10) dtype=Float64>
e
    > <Tensor e:1 shape=(10) dtype=Float64>
f
    > <Tensor f:1 shape=(100, 10) dtype=Float64>
g
    > <Tensor g:1 shape=(1, 10) dtype=Float64> # Note this is not the real shape of this tensor either
h
    > <Tensor h:1 shape=(100, 10) dtype=Float64>
i
    > <Tensor i:1 shape=(100, 10) dtype=Float64>
j
    > <Tensor j:1 shape=(100, 10) dtype=Float64>
k
    > <Tensor k:1 shape=(100, 10) dtype=Float64>

To actually run any computations a session is needed

sess = Session(allow_growth = true)

The allow_growth option prevents TensorFlow from taking up the entire memory of a GPU.

Any constant value tensors can now be accessed by running the tensor in the session

run(sess, TensorFlow.get_tensor_by_name("e"))
    > 10-element Array{Float64,1}:
    >   1.0
    >   2.0
    >   3.0
    >   4.0
    >   5.0
    >   6.0
    >   7.0
    >   8.0
    >   9.0
    >   10.0
run(sess, e)
    > 10-element Array{Float64,1}:
    >   1.0
    >   2.0
    >   3.0
    >   4.0
    >   5.0
    >   6.0
    >   7.0
    >   8.0
    >   9.0
    >   10.0

Notice how we can call the tensor by its name in the graph (which is the proper way to do things) or by its variable name. If we want to evaluate the output of a computation we need to supply all the necessary input tensors

distribution = Normal()
onehundredbyone = reshape(rand(distribution, 100), (100, 1))
onebyten = reshape(rand(distribution, 10), (1, 10))

run(sess, TensorFlow.get_tensor_by_name("f"), Dict(TensorFlow.get_tensor_by_name("a")=>onehundredbyone, TensorFlow.get_tensor_by_name("c")=>onebyten))
    > 100×10 Array{Float64,2}:
    >   ... ...
run(sess, f, Dict(a=>onehundredbyone, c=>onebyten))
    > 100×10 Array{Float64,2}:
    >   ... ...
run(sess, TensorFlow.get_tensor_by_name("k"), Dict(TensorFlow.get_tensor_by_name("a")=>onehundredbyone, TensorFlow.get_tensor_by_name("c")=>onebyten))
    > 100×10 Array{Float64,2}:
    >   ... ...
run(sess, k, Dict(a=>onehundredbyone, c=>onebyten))
    > 100×10 Array{Float64,2}:
    >   ... ...

Any unknown-shape tensor needs to be fed in with the correct shape, but that shape can in principle be anything. If there are any uninitialised values in the graph they need initialising, otherwise the code will output an error

run(sess, TensorFlow.get_tensor_by_name("RandomVariable/d"))
    > Tensorflow error: Status: Attempting to use uninitialized value RandomVariable/d

Notice that the variable built within variable_scope has the scope name prepended to the tensor name. The initialisation of the tensor can be done with TensorFlow.global_variables_initializer():

run(sess, TensorFlow.global_variables_initializer())

Once this has been run, the tensor d will have a value. This value can only be accessed by running the tensor in the session

run(sess, TensorFlow.get_tensor_by_name("RandomVariable/d"))
    > 1×10 Array{Float64,2}:
    >  0.0432947  -0.208361  0.0554441    -0.017653  -0.0239981  -0.0339648
run(sess, d)
    > 1×10 Array{Float64,2}:
    >  0.0432947  -0.208361  0.0554441    -0.017653  -0.0239981  -0.0339648

This is a brief overview of how to use TensorFlow. The HADES hmclet likelihood code sets up all of the graph in the initialisation phase

function setup(N0, N1, N2)
    global adgrad, wgrad
    p = [TensorFlow.placeholder(Float64, shape = (), name = "bias"), TensorFlow.placeholder(Float64, shape = (), name = "noise")]
    δ = TensorFlow.placeholder(Float64, shape = Int64[N0, N1, N2], name = "density")
    g = TensorFlow.placeholder(Float64, shape = Int64[N0, N1, N2], name = "galaxy")
    s = TensorFlow.placeholder(Float64, shape = Int64[N0, N1, N2], name = "selection")
    gaussian = TensorFlow.placeholder(Float64, shape = Int64[N0, N1, N2], name = "gaussian_field")
    mask = TensorFlow.placeholder(Bool, shape = Int64[N0, N1, N2], name = "mask")
    mask_ = TensorFlow.reshape(mask, N0 * N1 * N2, name = "flat_mask")
    g_ = TensorFlow.identity(TensorFlow.boolean_mask(TensorFlow.reshape(g, N0 * N1 * N2), mask_), name = "flat_masked_galaxy")
    s_ = TensorFlow.identity(TensorFlow.boolean_mask(TensorFlow.reshape(s, N0 * N1 * N2), mask_), name = "flat_masked_selection")
    output = TensorFlow.add(1., TensorFlow.multiply(p[1], δ), name = "biased_density")
    mock = TensorFlow.multiply(s, output, name = "selected_biased_density")
    mock_ = TensorFlow.identity(TensorFlow.boolean_mask(TensorFlow.reshape(mock, N0 * N1 * N2), mask_), name = "flat_masked_selected_biased_density")
    mock_galaxy = TensorFlow.add(mock, TensorFlow.multiply(TensorFlow.multiply(TensorFlow.sqrt(TensorFlow.exp(p[2])), s), gaussian), name = "mock_galaxy")
    ms = TensorFlow.reduce_sum(TensorFlow.cast(mask, Float64), name = "number_of_voxels")
    loss = TensorFlow.identity(TensorFlow.add(TensorFlow.multiply(0.5, TensorFlow.reduce_sum(TensorFlow.square(g_ - mock_) / TensorFlow.multiply(TensorFlow.exp(p[2]), s_))), TensorFlow.multiply(0.5, TensorFlow.multiply(ms, p[2]))) - TensorFlow.exp(p[1]) - TensorFlow.exp(p[2]), name = "loss")
    adgrad = TensorFlow.gradients(loss, δ)
    wgrad = [TensorFlow.gradients(loss, p[i]) for i in range(1, length = size(p)[1])]
end

Notice here that in TensorFlow the gradients are *super* easy to calculate, since it amounts to a call to TensorFlow.gradients(a, b), which is equivalent to da/db (it's actually sum(da/db), so sometimes you have to do a bit more legwork).

Now, whenever the likelihood needs to be calculated whilst running HADES, the syntax is as simple as

function likelihood(state::State, ghosts::GhostPlanes, array::AbstractArray{Float64,3})
    print(LOG_INFO, "Likelihood evaluation in Julia")
    L = Float64(0.)
    for catalog=1:libLSS.get(state, "NCAT", Int64, synchronous=true)
        L += run(sess, TensorFlow.get_tensor_by_name("loss"),
                Dict(TensorFlow.get_tensor_by_name("bias")=>libLSS.get_array_1d(state, "galaxy_bias_"*repr(catalog - 1), Float64)[1],
                     TensorFlow.get_tensor_by_name("noise")=>libLSS.get_array_1d(state, "galaxy_bias_"*repr(catalog - 1), Float64)[2],
                     TensorFlow.get_tensor_by_name("density")=>array,
                     TensorFlow.get_tensor_by_name("galaxy")=>libLSS.get_array_3d(state, "galaxy_data_"*repr(catalog - 1), Float64),
                     TensorFlow.get_tensor_by_name("selection")=>libLSS.get_array_3d(state, "galaxy_sel_window_"*repr(catalog - 1), Float64),
                     TensorFlow.get_tensor_by_name("mask")=>libLSS.get_array_3d(state, "galaxy_sel_window_"*repr(catalog - 1), Float64).>0.))
    end
    print(LOG_VERBOSE, "Likelihood is " * repr(L))
    return L
end

If TensorFlow is installed to use the GPU, then this code will automatically distribute to the GPU.

Writing a new ARES core program

What is a core program ?

A core program is in charge of initializing the sampling machine, loading the data into its structures and running the main sampling loop. There are two default core programs at the moment: ARES3 (in src/ares3.cpp) and HADES3 (extra/hades/src/hades3.cpp). ARES3 implements the classical ARES sampling framework, which includes linear modeling, bias, foreground and powerspectrum sampling. HADES3 implements the non-linear density inference machine: classical HADES likelihood, BORG LPT, BORG 2LPT, BORG PM, and different variants of bias functions.

Why write a new one ?

Because you are thinking of a radically different way of presenting the data, or because your model is based on different assumptions, you may have to redesign the way data are loaded and initialized. Also, if you are thinking of a different way of sampling the different parameters (or more of them than usual), then you may have to implement a new bundle.

Prepare yourself

A core program is composed of different elements that can be taken from different existing parts. We can look at ares3.cpp for an example. The main part (except the splash screen) is:

#define SAMPLER_DATA_INIT "../ares_init.hpp"
#define SAMPLER_BUNDLE "../ares_bundle.hpp"
#define SAMPLER_BUNDLE_INIT "../ares_bundle_init.hpp"
#define SAMPLER_NAME "ARES3"
#define SAMPLER_MOCK_GENERATOR "../ares_mock_gen.hpp"
#include "common/sampler_base.cpp"

As you can see, a number of defines are set up before including the common part, called “common/sampler_base.cpp”. These defines do the following:

  • SAMPLER_DATA_INIT specifies the include file that holds the definition of the data initializer. This corresponds to two functions:

    • template void sampler_init_data(MPI_Communication *mpi_world, MarkovState& state, PTree& params),
      

      which is in charge of allocating the adequate arrays for storing input data in the state dictionary. The actual names of these fields are sampler dependent. In ares and hades, they are typically called “galaxy_catalog_%d” and “galaxy_data_%d” (with %d being replaced by an integer). This function is always called, even when the code is being resumed from a former run.

    • template void sampler_load_data(MPI_Communication *mpi_world, MarkovState& state, PTree& params, MainLoop& loop),
      

      which is in charge of loading the data into the structures. This function is only called during the first initialization of the chain.

  • SAMPLER_BUNDLE defines the sampler bundle which is going to be used. Only the structure definition of SamplerBundle should be given here.

  • SAMPLER_BUNDLE_INIT defines two functions working on initializing the bundle:

    • template void sampler_bundle_init(MPI_Communication *mpi_world, ptree& params, SamplerBundle& bundle, MainLoop& loop),
      

      which does the real detailed initialization, including the sampling loop program.

    • void sampler_setup_ic(SamplerBundle& bundle, MainLoop& loop),
      

      which allows for more details on the initial conditions to be set up.

  • SAMPLER_NAME must be a static C string giving the name of this core program.

  • SAMPLER_MOCK_GENERATOR specifies a filename where

template void prepareMockData(PTree& ptree, MPI_Communication *comm, MarkovState& state, CosmologicalParameters& cosmo_params, SamplerBundle& bundle)

is defined. “ares_mock_gen.hpp” is a single Gaussian random field generator with the selection effect applied to the data.

Creating a new one

Create the skeleton

Create the sampler bundle

Initializing data structures

Filling data structures

Attach the core program to cmake

Build

Adding a new likelihood in C++

Steps to wire a C++ likelihood in hades3.

Preamble

Forward models can self-register now. Unfortunately, likelihoods cannot, so more work is required. There are three ways of implementing a new likelihood; which one you choose depends on the complexity and the level of code reuse that is sought (from more abstract/more code reuse to less abstract/more flexible):

  1. rely on the generic framework (see extra/borg/libLSS/physics/likelihoods/gaussian.hpp for example)

  2. use the base class of HADES extra/hades/libLSS/samplers/hades/base_likelihood.hpp

  3. implement a full likelihood from scratch

Use generic framework

The generic framework provides more turnkey models at the price of more programming abstraction.

Warning! The following was written by Fabian. To be checked by Guilhem.

This works best by copying some existing classes using the generic framework. The generic framework separates the posterior into “bias model” and “likelihood”, which then form a “bundle”. Two basic working examples can be checked to give a better impression:

  • bias: e.g., extra/borg/libLSS/physics/bias/power_law.hpp (the Power law bias model)

  • likelihood: e.g., extra/borg/libLSS/physics/likelihoods/gaussian.hpp (the per voxel Gaussian likelihood)

Note that you do not need to recreate both likelihood and bias, if one is sufficient for your needs (e.g., you can bundle a new bias model to an existing likelihood). Of course, your classes can be defined with additional template parameters, although we shall assume there are none here.

We will now see the three steps involved in the creation and linking of a generic bias model.

Writing a bias model

We will consider the noop (for no operation) bias model, which does nothing to the input density contrast, to demonstrate the steps involved in the modification and development of a bias model. The full code is available in extra/borg/libLSS/physics/bias/noop.hpp. The model requires ample use of templates. The reason is that a number of the arrays exchanged in the process have very complicated types: they are not necessarily simple boost::multi_array_ref, they can also be expressions. The advantage of using expressions is the global reduction of the number of mathematical operations when the data is masked, and the strong reduction of input/output memory operations, which are generally a bottleneck on modern computers. The disadvantage is that compilation becomes longer and compilation errors may become obscure.

Here is a simplified version of the Noop bias class (defined as a struct here, which gives default public visibility to all members):

struct Noop {

   static constexpr const bool NmeanIsBias = true;
   static const int numParams = 1;

   selection::SimpleAdaptor selection_adaptor;

   double nmean;

   // Default constructor
   Noop(LikelihoodInfo const& = LikelihoodInfo()) {}

   // Setup the default bias parameters
   template <typename B>
   static inline void setup_default(B &params) {}

   // Prepare the bias model for computations
   template <
       class ForwardModel, typename FinalDensityArray,
       typename BiasParameters, typename MetaSelect = NoSelector>
   inline void prepare(
       ForwardModel &fwd_model, const FinalDensityArray &final_density,
       double const _nmean, const BiasParameters &params,
       bool density_updated, MetaSelect _select = MetaSelect()) {
     nmean = params[0];
   }

   // Cleanup the bias model
   void cleanup() {}

   // This function is a relic required by the API. You can return 1 and it
   // will be fine.
   inline double get_linear_bias() const { return 1; }

   // Check whether the given array like object passes the constraints of the bias model.
   template <typename Array>
   static inline bool check_bias_constraints(Array &&a) {
     return true;
   }

   // Compute a tuple of biased densities. The computation may be lazy or not.
   template <typename FinalDensityArray>
   inline auto compute_density(const FinalDensityArray &array) {
     return std::make_tuple(b_va_fused<double>(
           [nmean = this->nmean](double delta) { return nmean*(1 + delta); }, array));
   }

   // Compute a tuple of adjoint gradient on the biased densities.
   template <
       typename FinalDensityArray, typename TupleGradientLikelihoodArray>
   inline auto apply_adjoint_gradient(
       const FinalDensityArray &array,
       TupleGradientLikelihoodArray grad_array) {
     return std::make_tuple(b_va_fused<double>(
         [](double g) { return g; },
         std::move(std::get<0>(grad_array))));
   }
};

The bias model can be decomposed into:

  1. a setup phase, with the constructor, the setup_default, get_linear_bias

  2. a sanity check phase with check_bias_constraints

  3. a pre-computation, cleanup phase with prepare and cleanup

  4. the actual computation in compute_density and apply_adjoint_gradient.

The life cycle of a computation roughly follows the above steps (see the sketch after this list):

  1. construct

  2. setup

  3. prepare computation

  4. compute density

  5. (optionally) compute adjoint gradient

  6. cleanup
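A hedged sketch of that life cycle, driving the Noop model by hand (the forward model fwd, the density delta, the grid and the nmean value are placeholders; inside BORG these calls are made by the generic likelihood machinery):

// 1. construct
Noop bias;
// 2. setup the default bias parameters
boost::multi_array<double, 1> params(boost::extents[Noop::numParams]);
Noop::setup_default(params);
// 3. prepare the computation (density_updated = true here)
bias.prepare(fwd, delta, nmean, params, true);
// 4. compute the biased density (lazy; first element of the returned tuple)
auto biased = std::get<0>(bias.compute_density(delta));
// 5. (optionally) bias.apply_adjoint_gradient(delta, grad_tuple);
// 6. cleanup
bias.cleanup();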

As you can see above, most functions are templatized, for the reasons given before the code. As a reminder, each template parameter introduced with the keyword typename X indicates that we need a potentially different type, which is discovered at the point of use of the specific function or class.

Let us focus on compute_density:

// Compute a tuple of biased densities. The computation may be lazy or not.
template <typename FinalDensityArray>
inline auto compute_density(const FinalDensityArray &array) {
  return std::make_tuple(b_va_fused<double>(
        [nmean = this->nmean](double delta) { return nmean*(1 + delta); }, array));
}

Conventionally, it accepts an object which must behave, syntactically, like a boost::multi_array. In case a concrete, memory-backed array is needed, one has to allocate it and copy the content of array into the newly allocated array. The member function must return a tuple (type std::tuple<T1, T2, ...>) of array-like objects. As this type is complicated, we leverage a C++14 feature which allows the compiler to deduce the return type of the function by inspecting the value provided to return. Here, this is the value returned by make_tuple, which is built out of a single “fused” array. The fused array is built out of a function that is evaluated for each element of the array provided as the second argument to b_va_fused. In practice, if we call that array a, its element at i, j, k, i.e. a[i][j][k], would be strictly equal to nmean*(1+delta[i][j][k]).
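When a concrete, memory-backed copy of the returned virtual array is really needed, a plain element-by-element copy is enough; a minimal sketch (bias is a prepared Noop instance, delta the input density, and N0, N1, N2 the grid sizes, all placeholders):

auto biased = std::get<0>(bias.compute_density(delta));
boost::multi_array<double, 3> out(boost::extents[N0][N1][N2]);
for (size_t i = 0; i < N0; i++)
  for (size_t j = 0; j < N1; j++)
    for (size_t k = 0; k < N2; k++)
      out[i][j][k] = biased[i][j][k];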

Writing a likelihood model

Linking your bias/likelihood bundle to BORG

Suppose then you have mybias.hpp, mylike.hpp, which define classes MyBias, MyLikelihood. If you have encapsulated the classes in their own namespace, make sure they are visible in the bias:: namespace (in case of MyBias) and the root namespace (in case of MyLikelihood). The rationale behind that is to avoid polluting namespaces and avoid name collisions while combining different headers/C++ modules.
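For example, a hedged sketch of exposing the classes with using-declarations (the nested namespace mynamespace is purely illustrative):

// mybias.hpp (illustrative)
namespace LibLSS { namespace bias {
  namespace mynamespace { struct MyBias { /* ... */ }; }
  using mynamespace::MyBias;       // now visible as LibLSS::bias::MyBias
}}

// mylike.hpp (illustrative)
namespace LibLSS {
  namespace mylikes { struct MyLikelihood { /* ... */ }; }
  using mylikes::MyLikelihood;     // now visible as LibLSS::MyLikelihood
}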

  1. each bias class has to declare the following two parameters in extra/borg/physics/bias/biases.cpp (which are defined in mybias.hpp; make sure to also #include "mybias.hpp"):

const int LibLSS::bias::mynamespace::MyBias::numParams;
const bool LibLSS::bias::mynamespace::MyBias::NmeanIsBias;
  2. Then, you have to register your bundle: in extra/hades/src/hades_bundle_init.hpp, under

std::map<
        std::string,
        std::function<std::shared_ptr<VirtualGenericBundle>(
            ptree &, std::shared_ptr<GridDensityLikelihoodBase<3>> &,
            markov_ptr &, markov_ptr &, markov_ptr &,
            std::function<MarkovSampler *(int, int)> &, LikelihoodInfo &)>>
        generic_map{ // ...

add your bundle:

{"MY_BIAS_LIKE", create_generic_bundle<bias::MyBias, MyLikelihood,ptree &>}

In addition, in extra/borg/libLSS/samplers/generic/impl_gaussian.cpp, add

#include "mybias.hpp"
#include "mylike.hpp"

as well as

FORCE_INSTANCE(bias::MyBias, MyLikelihood, number_of_parameters);

where number_of_parameters stands for the number of free parameters this bundle expects (i.e. bias as well as likelihood parameters). (FS: always impl_gaussian?)

(FS: I am interpolating here…) If on the other hand you want to bundle your bias model with an existing likelihood, register it in extra/borg/src/bias_generator.cpp under LibLSS::setup_biased_density_generator; e.g. for the Gaussian likelihood:

{"GAUSSIAN_MYBIAS",
  mt(generate_biased_density<AdaptBias_Gauss<bias::MyBias>>, nullMapper)},

Todo

A global registry (like ForwardRegistry) would be needed for this mechanism as well. That would save compilation time and avoid modifying the different bundles that rely on the generic framework.

Make an automatic test case

In order to enable the gradient test for your bias/likelihood combination, add a section to extra/borg/libLSS/tests/borg_gradients.py_config:

'mybundle': {
    'includes':
    inc + [
        "libLSS/samplers/generic/generic_hmc_likelihood.hpp",
        "libLSS/physics/bias/mybias.hpp",
        # FS: not sure how generic this is
        "libLSS/physics/adapt_classic_to_gauss.hpp",
        "libLSS/physics/likelihoods/mylike.hpp"
    ],
    'likelihood':
    'LibLSS::GenericHMCLikelihood<LibLSS::bias::MyBias, LibLSS::MyLikelihood>',
    'model':
    default_model,
    'model_args': 'comm, box, 1e-5'
},

Define new configuration options

If you want to read custom fields from the ini file, you should edit extra/hades/src/likelihood_info.cpp. Also, set default values in extra/hades/libLSS/tests/generic_gradient_test.cpp; extra/hades/libLSS/tests/setup_hades_test_run.cpp.

Bonus point: map the bundle to a forward model

Since 2.1, all the generic bias models can be mapped to a standard BORGForwardModel. The advantage is that they can be recombined in different ways, and notably that bias can be applied before specific transforms such as redshift-space distortions.

This can be done easily by adding a new line in extra/borg/libLSS/physics/forwards/adapt_generic_bias.cpp in the function bias_registrator(). Here is for example the case of the linear bias model:

ForwardRegistry::instance().registerFactory("bias::Linear", create_bias<bias::LinearBias>);

This call creates a new forward model element called bias::Linear which can be created dynamically. The bias parameters can be set through BORGForwardModel::setModelParams with the dictionary entry biasParameters, which must point to a 1d boost::multi_array of the adequate size. By default the adopted bias parameters are provided by the underlying generic bias model class through setup_default().

Of course the amount of information that can be transferred this way is more limited. For example, the bias model cannot at the moment produce more than one field; all the others will be ignored. Supporting that would mean transforming the forward model into an object with \(N\) output pins (\(N\geq 2\)).

As a final note, the forward model created that way becomes immediately available in Python through the mechanism provided by :meth:aquila_borg.forward.models.newModel. In C++ it can be accessed through the ForwardRegistry (defined in extra/hades/libLSS/physics/forwards/registry.hpp).

Use HADES base class

This framework assumes that the model is composed of a set of bias coefficients in galaxy_bias_XXX (XXX being the catalog index) and that the likelihood only depends on the final matter state. An example of a likelihood implemented on top of it is extra/hades/libLSS/samplers/hades/hades_linear_likelihood.cpp, which is a basic Gaussian likelihood.

The mechanism for applying selection effects, however, has to be provided by the new implementation.

With this framework one has to override a number of virtual functions. I will discuss that in the specific case of MyNewLikelihood, which will implement a very rudimentary Gaussian likelihood:

class MyNewLikelihood : public HadesBaseDensityLikelihood {
public:
    // Type alias for the supertype of this class
    typedef HadesBaseDensityLikelihood super_t;
    // Type alias for the supertype of the base class
    typedef HadesBaseDensityLikelihood::super_t grid_t;

public:
    // One has to define a constructor which takes a LikelihoodInfo.
    MyNewLikelihood(LikelihoodInfo &info);
    virtual ~MyNewLikelihood();

    // This is called to setup the default bias parameters of a galaxy catalog
    void setupDefaultParameters(MarkovState &state, int catalog) override;

    // This is called when a mock catalog is required. The function
    // receives the matter density from the forward model and the state
    // that needs to be filled with mock data.
    void
    generateMockSpecific(ArrayRef const &matter_density, MarkovState &state) override;

    // This evaluates the likelihood based solely on the matter field
    // that is provided (as well as the eventual bias parameters). One
    // cannot interrogate the forward model for more fields.
    // This function must return the *negative* of the log
    // likelihood.
    double logLikelihoodSpecific(ArrayRef const &matter_field) override;

    // This computes the gradient of the function implemented in
    // logLikelihoodSpecific
    void gradientLikelihoodSpecific(
        ArrayRef const &matter_field, ArrayRef &gradient_matter) override;

    // This is called before having resumed or initialized the chain.
    // One should create and allocate all auxiliary fields that are
    // required to run the chain at that moment, and mark the fields
    // of interest to be stored in the mcmc_XXXX.h5 files.
    void initializeLikelihood(MarkovState &state) override;
};

The above declaration must go in a .hpp file, such as my_new_likelihood.hpp, which would customarily be placed in libLSS/samplers/fancy_likelihood. The source code itself will be placed in my_new_likelihood.cpp in the same directory.

Constructor

The first function to implement is the constructor of the class.

MyNewLikelihood::MyNewLikelihood(LikelihoodInfo &info)
    : super_t(info, 1 /* number of bias parameter */) {}

The constructor has to provide the info to the base class and indicate the number of bias parameters that will be needed.

Setup default parameter

The second function allows the developer to fill up the default values for bias parameters and other auxiliary parameters. They are auxiliary with respect to the density field inference. In the Bayesian framework, they are just regular parameters.

void MyNewLikelihood::setupDefaultParameters(MarkovState& state, int catalog) {
    // Retrieve the bias array from the state dictionary
    // This return an "ArrayStateElement *" object
    // Note that "formatGet" applies string formatting. No need to
    // call boost::format.
    auto bias = state.formatGet<ArrayType1d>("galaxy_bias_%d", catalog);
    // This extracts the actually boost::multi_array from the state element.
    // We take a reference here.
    auto &bias_c = *bias->array;
    // Similarly, if needed, we can retrieve the nmean
    auto &nmean_c = state.formatGetScalar<double>("galaxy_nmean_%d", catalog);

    // Now we can fill up the array and value.
    bias_c[0] = 1.0;
    nmean_c = 1;
}

Note in the above that we asked for auto& reference types for bias_c and nmean_c. The auto asks the compiler to figure out the type by itself; however, it will not deduce a reference by default. This is achieved by adding the & symbol. That way any value written into the variable is reflected in the original container, which would not be the case without the reference. Also note that galaxy_bias_%d is already allocated to hold the number of parameters indicated to the constructor of the base class.
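As a small illustration of the difference (a sketch only, not part of the actual likelihood code), compare the two deductions below; only the reference propagates writes back to the state:

// "bias" is the ArrayType1d state element retrieved above.
auto  bias_copy = *bias->array;  // deduces a plain multi_array: a full copy
auto &bias_c    = *bias->array;  // deduces a reference: no copy is made

bias_copy[0] = 2.0;  // modifies only the local copy, which is then discarded
bias_c[0]    = 1.0;  // modifies the array stored in the MarkovState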

Initialize the likelihood

The initialization done by the base class already takes care of allocating galaxy_bias_%d and BORG_final_density, and of checking the size of galaxy_data_%d. One could then do the minimum amount of work, i.e. not even override that function, or put in a single statement like this:

void MyNewLikelihood::initializeLikelihood(MarkovState &state) {
    super_t::initializeLikelihood(state);
}

If more fields need to be allocated and saved/dumped, this would otherwise be the perfect place for it. However, keep in mind that the content of the fields in MarkovState may not yet be initialized at this point. You may rely on the information provided to the constructor through LikelihoodInfo for such cases.
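For example, a minimal sketch of allocating one extra, hypothetical scalar parameter and requesting that it be saved in the mcmc files could look like the following; the name my_noise_level is purely illustrative and the exact signature of newScalar should be checked against the MarkovState API:

void MyNewLikelihood::initializeLikelihood(MarkovState &state) {
    super_t::initializeLikelihood(state);

    // Hypothetical auxiliary parameter of this likelihood. The last
    // argument requests that the value be stored in the mcmc_XXXX.h5 files.
    state.newScalar<double>("my_noise_level", 1.0, true);
}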

Evaluate the log likelihood

Now we arrive at the last piece. The class HadesBaseDensityLikelihood offers a great simplification compared to recoding everything, including the management of the forward model, for the evaluation of the log likelihood and its adjoint gradient.

Warning

The function is called logLikelihoodSpecific but it is actually the negative of the log likelihood.

\[\mathrm{logLikelihoodSpecific}(\delta_\mathrm{m}) = -\log \mathcal{L}(\delta_\mathrm{m})\]

This sign is for historical reasons, as the Hamiltonian Markov Chain algorithm requires the gradient of that function to proceed.

[FS: actually when using the generic framework, it seems log_probability actually returns log( P )…]

As an example we will consider here the case of the Gaussian likelihood. The noise in each voxel is i.i.d., thus we can factorize the likelihood into smaller pieces, one for each voxel:

\[\mathcal{L}(\{N_{i,g}\}|\{\delta_{i,\text{m}}\}) = \prod \mathcal{L}(N_{i,g}|\delta_{i,\text{m}})\]

The likelihood for each voxel, with R the survey response (selection) in that voxel, is:

\[\mathcal{L}(N_g|\delta_\text{m},b,\bar{N}) \propto \frac{1}{\sqrt{R\bar{N}}} \exp\left(-\frac{1}{2 R\bar{N}} \left(N_g - R \bar{N}(1+b\delta_\text{m})\right)^2 \right)\]
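
Taking the negative logarithm, the quantity that logLikelihoodSpecific has to accumulate for each observed voxel is, up to an additive constant:

\[-\log \mathcal{L}(N_g|\delta_\text{m},b,\bar{N}) = \frac{1}{2 R\bar{N}} \left(N_g - R \bar{N}(1+b\delta_\text{m})\right)^2 + \frac{1}{2}\log\left(R\bar{N}\right)\]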

We will implement that computation. The first function that we will consider is the evaluation of the log likelihood itself.

double MyNewLikelihood::logLikelihoodSpecific(ArrayRef const &delta) {
  // First create a variable to accumulate the log-likelihood.
  double logLikelihood = 0;
  // Gather the information on the final output sizes of the gridded
  // density.
  // "model" is provided by the base class, which is of type
  // std::shared_ptr<BORGForwardModel>, more details in the text
  size_t const startN0 = model->out_mgr->startN0;
  size_t const endN0 = startN0 + model->out_mgr->localN0;
  size_t const N1 = model->out_mgr->N1;
  size_t const N2 = model->out_mgr->N2;

  // Now we may loop on all catalogs, "Ncat" is also provided
  // by the base class, as well as "sel_field", "nmean", "bias" and
  // "data"
  for (int c = 0; c < Ncat; c++) {
    // This extracts the 3d selection array of the catalog "c".
    // The arrays follow the same scheme as in "setupDefaultParameters".
    auto &sel_array = *(sel_field[c]);
    // Here we do not request a read/write access to nmean. We can copy
    // the value, which is more efficient.
    double nmean_c = nmean[c];
    double bias_c = (*(bias[c]))[0];
    auto &data_c = *(data[c]);

    // Once a catalog is selected we may start doing work on voxels.
    // The OpenMP statement collapses the three loops into one.
#pragma omp parallel for collapse(3) reduction(+:logLikelihood)
    for (size_t n0 = startN0; n0 < endN0; n0++) {
      for (size_t n1 = 0; n1 < N1; n1++) {
        for (size_t n2 = 0; n2 < N2; n2++) {
          // Grab the selection value in voxel (n0, n1, n2)
          double selection = sel_array[n0][n1][n2];

          // If the voxel is observed (non-zero selection), it must be counted
          if (selection > 0) {
            double Nobs = data_c[n0][n1][n2];
            // Bias the matter field
            double d_galaxy = bias_c * delta[n0][n1][n2];

            // Argument of the exponential, plus the normalization term
            logLikelihood +=
                0.5 * square(selection * nmean_c * (1 + d_galaxy) - Nobs) /
                    (selection * nmean_c) +
                0.5 * std::log(selection * nmean_c);
          }
        }
      }
    }
  }

  return logLikelihood;
}

This completes the likelihood. As one can see there is not much going on. It is basically a sum of squared differences in a triple loop.

The adjoint gradient, defined as

\[\mathrm{adjoint\_gradient}(\delta_\mathrm{m}) = -\nabla \log \mathcal{L}(\delta_\mathrm{m})\]

follows the same logic, except that instead of a scalar, the function returns a vector in the shape of a mesh. Note that ArrayRef is actually a boost::multi_array_ref with the adequate type.
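
For the Gaussian likelihood of this tutorial, differentiating the per-voxel expression above with respect to the matter density gives the quantity that must be accumulated in each voxel of grad_array:

\[\frac{\partial \left(-\log \mathcal{L}\right)}{\partial \delta_\text{m}} = b \left(R \bar{N} (1+b\delta_\text{m}) - N_g\right)\]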

void MyNewLikelihood::gradientLikelihoodSpecific(
    ArrayRef const &delta, ArrayRef &grad_array) {
  // Grab the mesh description as for the likelihood
  size_t const startN0 = model->out_mgr->startN0;
  size_t const endN0 = startN0 + model->out_mgr->localN0;
  size_t const N1 = model->out_mgr->N1;
  size_t const N2 = model->out_mgr->N2;

  // A shortcut to put zero in all entries of the array.
  // "fwrap(array)" becomes a vectorized expression.
  fwrap(grad_array) = 0;

  for (int c = 0; c < Ncat; c++) {
    auto &sel_array = *(sel_field[c]);
    auto &data_c = *(data[c]);
    double bias_c = (*bias[c])[0];
    double nmean_c = nmean[c];

#pragma omp parallel for collapse(3)
    for (size_t n0 = startN0; n0 < endN0; n0++) {
      for (size_t n1 = 0; n1 < N1; n1++) {
        for (size_t n2 = 0; n2 < N2; n2++) {
          double deltaElement = delta[n0][n1][n2];
          double d_galaxy = bias_c * deltaElement;
          double d_galaxy_prime = bias_c;
          double response = sel_array[n0][n1][n2];
          double Nobs = data_c[n0][n1][n2];

          // If selection/mask is zero, we can safely skip that
          // particular voxel. It will not produce any gradient value.
          if (response == 0)
            continue;

          // Otherwise, we accumulate the gradient
          grad_array[n0][n1][n2] +=
              (nmean_c * response * (1 + d_galaxy) - Nobs) * d_galaxy_prime;
        }
      }
    }
  }
}

Adding the code to the build infrastructure

If you are in the borg module, you must open the file named libLSS/borg.cmake. It contains the instructions to compile the borg module into libLSS. It is sufficient to add the new source files to the EXTRA_LIBLSS cmake variable. As one can see from the cmake file, there is a variable that indicates the directory of libLSS in borg: it is called BASE_BORG_LIBLSS. One can then add the new source file like this:

SET(EXTRA_LIBLSS ${EXTRA_LIBLSS}
    ${BASE_BORG_LIBLSS}/samplers/fancy_likelihood/my_new_likelihood.cpp
    # The rest is left out only for the purpose of this documentation
)

Then the new file will be built into libLSS.

Linking the new likelihood to hades

For this it is unfortunately necessary to hack into extra/hades/src/hades_bundle_init.hpp, which holds the initialization logic for the hades3-specific set of likelihoods, biases, and forward models. The relevant lines in the source code are the following:

if (lh_type == "LINEAR") {
    bundle.hades_bundle = std::make_unique<LinearBundle>(like_info);
    likelihood = bundle.hades_bundle->likelihood;
    }
#ifdef HADES_SUPPORT_BORG
    else if (lh_type == "BORG_POISSON") {

In the above lh_type is a std::string containing the value of the field likelihood in the ini file. Here we check whether it is "LINEAR" or "BORG_POISSON".

To add a new likelihood "NEW_LIKELIHOOD" we shall add the following lines:

if (lh_type == "LINEAR") {
    bundle.hades_bundle = std::make_unique<LinearBundle>(like_info);
    likelihood = bundle.hades_bundle->likelihood;
    }
#ifdef HADES_SUPPORT_BORG
    else if (lh_type == "NEW_LIKELIHOOD") {
    typedef HadesBundle<MyNewLikelihood> NewBundle;
    bundle.hades_bundle = std::make_unique<NewBundle>(like_info);
    likelihood = bundle.hades_bundle->likelihood;
    }
    else if (lh_type == "BORG_POISSON") {

while also adding

#include "libLSS/samplers/fancy_likelihood/my_new_likelihood.hpp"

towards the top of the file.

The above piece of code defines a new bundle using the template class HadesBundle<T>, where T can be any class that derives from HadesBaseDensityLikelihood. This bundle is then constructed, providing the likelihood info object like_info. Finally, the built likelihood object is copied into likelihood for further processing by the rest of the code.

Note

If you need to query more parameters from the ini file (for example from the [likelihood] section), you need to look for them using params. For example, params.template get<float>("likelihood.k_max") will retrieve a float value from the field k_max in the [likelihood] section. You can then store it in like_info (which is a std::map in practice):

like_info["good_k_max"] = params.template get<float>("likelihood.k_max");

In your constructor you can then retrieve the value from the new entry as:

boost::any_cast<float>(like_info["good_k_max"])
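Putting the two pieces together, a minimal sketch of such a constructor could be the following; the member variable k_max is hypothetical, and it assumes that "good_k_max" was stored in like_info as shown above:

MyNewLikelihood::MyNewLikelihood(LikelihoodInfo &info)
    : super_t(info, 1 /* number of bias parameters */) {
  // "good_k_max" is the illustrative entry filled in hades_bundle_init.hpp;
  // boost::any_cast throws boost::bad_any_cast if the stored type differs.
  k_max = boost::any_cast<float>(info["good_k_max"]);
}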

And now you are done! You can now set likelihood=NEW_LIKELIHOOD in the ini file and your new code will be used by hades.

Implement from scratch

To be written even later.

Adding a new likelihood/bias combination in BORG

To be written…