Inputs

ARES_Configuration_file_v1

The configuration file for ARES uses the INI file syntax. It is separated into sections among which three are main sections.

Main sections

Section [system]

  • console_output: Holds the prefix filename for all log output files.

  • VERBOSE_LEVEL: Set the verbosity level for the console. Files get all outputs.

  • N0: Number of grid elements along the X axis.

  • N1: Same for Y axis.

  • N2: Same for Z axis.

  • L0: Comoving length of the X axis

  • L1: Same for Y axis

  • L2: Same for Z axis

  • corner0: Center of the voxel at the corner of the box in -X direction, this should be the smallest X value.

  • corner1: Same for Y

  • corner2: Same for Z

  • NUM_MODES: number of bins to represent the power spectrum

  • N_MC: Maximum number of Markov chain samples to produce in a single run (Note: Used only for v1)

  • borg_supersampling: Supersampling level of the grid for intermediate calculations. The number of particles is N0*N1*N2*borg_supersampling**3

  • hades_likelihood: Likelihood to use in HADES run. Can be one of the following values:

    • BORG_POISSON: Use Poisson likelihood

    • BORG_POISSON_POWER:

    • BORG_VOODOO:

    • BORG_VOODOO_MAGIC:

    • BORG_LINEAR: ARES likelihood model. Noise is Gaussian with Variance equal to \(S \bar{N}\). Use power law bias.

    • BORG_SH:

    • BORG_NB: Negative binomial. Broken power law bias.

    • Generic framework:

      • GAUSSIAN_BROKEN_POWERLAW_BIAS

      • GAUSSIAN_MO_WHITE_BIAS: Gaussian noise model, variance is fitted. Double power law bias

      • GAUSSIAN_POWERLAW_BIAS

      • GAUSSIAN_2ND_ORDER_BIAS

      • GENERIC_POISSON_BROKEN_POWERLAW_BIAS

      • GENERIC_GAUSSIAN_LINEAR_BIAS

      • GENERIC_GAUSSIAN_MANY_POWER_1^1

      • GENERIC_GAUSSIAN_MANY_POWER_1^2

      • GENERIC_GAUSSIAN_MANY_POWER_1^4

      • GENERIC_POISSON_MANY_POWER_1^1

      • GENERIC_POISSON_MANY_POWER_1^2

      • GENERIC_POISSON_MANY_POWER_1^4

  • hades_forward_model: Forward model to use

    • LPT: Lagrangian perturbation theory, ModifiedNGP/Quad final projection

    • 2LPT: Second order Lagrangian perturbation theory, ModifiedNGP/Quad final projection

    • PM: Particle mesh, ModifiedNGP/Quad final projection

    • LPT_CIC: Same as LPT, but use CIC for final projection

    • 2LPT_CIC: Same as 2LPT, but use CIC for final projection

    • PM_CIC: Same as PM, but use CIC for final projection

    • HADES_LOG: Use Exponential transform (HADES model) for the forward model. Preserved mean density is enforced.

  • borg_do_rsd: Do redshift space distortion if set to “true”.

  • projection_model: Specifies which projection to use for data. No constraints are enforced on the likelihood, but of course they should be matched to the value adopted here. The value is inspected in src/common/projection.hpp. There are two available at the moment: number_ngp and luminosity_cic. The number_ngp is just Nearest-Grid-Point number counting. The luminosity_cic uses the value in Mgal to weight the object before doing CIC projection.

    • number_ngp: it just counts the number of galaxies/objects within a voxel

    • luminosity_cic: it weights galaxies by their luminosity and does a CIC projection.

  • test_mode: Runs ARES/BORG/HADES in test mode. Data is not used, mock data is generated on the fly.

  • seed_cpower: Set to true to seed the power spectrum with the correct one according to the cosmology section. Otherwise it is set to a small fraction of it.

  • hades_max_epsilon: Stepsize for the HMC. It is unitless. Good starting point is around 0.01.

  • hades_max_timesteps: Maximum number of timesteps for a single HMC sample.

  • hades_mixing: Number of samples to compute before writing to disk.

  • savePeriodicity: This reduces the number of times the restart files are dumped to the hard drives. This is useful for reducing I/Os, as restart files are heavy. You can set this to a number that is a multiple of the number of mcmc steps. For example, 20 tells ares to dump restart files every 20 mcmc steps.

  • mask_precision: Precision to which you want to compute the mask. By default it is “0.01”, which is not related to the actual precision (unfortunately not yet). It allows scaling the internal number of evaluations of the selection function, so 0.001 will call it 100 times more. The advice is not to decrease below 0.01.

  • furious_seeding: if set to true the core sampler will reseed itself from a system entropy source at each step of the MCMC. That means the MCMC becomes unpredictable and the seed number is discarded.

  • simulation: if set to true switches to N-body simulation analysis. Additional cuts are possible depending on masses, spins, etc, of halos.

Likelihoods that use the generic bias framework (currently GAUSSIAN_MO_WHITE_BIAS) also support the following tags:

  • bias_XX_sampler_generic_blocked: if set to true, the XX parameter of the bias is not sampled. XX varies depending on the likelihood.

  • block_sigma8_sampler: true by default; set this to false to sample sigma8 in the initial conditions.
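As an illustration of the grid keys of the [system] section, the following minimal sketch reads them with Python's standard configparser module and derives the voxel size and the particle count N0*N1*N2*borg_supersampling**3 quoted above. The numerical values and the helper name read_grid are assumptions made for this example only; they are not part of ARES.

```python
import configparser

# Hypothetical values, chosen only for illustration.
INI_TEXT = """
[system]
N0 = 64
N1 = 64
N2 = 64
L0 = 500.0
L1 = 500.0
L2 = 500.0
corner0 = -250.0
corner1 = -250.0
corner2 = -250.0
borg_supersampling = 2
"""


def read_grid(ini_text):
    """Return grid sizes, voxel sizes and particle count from a [system] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the key case exactly as written in the file
    parser.read_string(ini_text)
    system = parser["system"]
    sizes = [system.getint(f"N{i}") for i in range(3)]
    lengths = [system.getfloat(f"L{i}") for i in range(3)]
    supersampling = system.getint("borg_supersampling")
    # Voxel size along each axis: comoving length divided by number of elements.
    voxel = [length / size for length, size in zip(lengths, sizes)]
    # Particle count as stated in the key description.
    particles = sizes[0] * sizes[1] * sizes[2] * supersampling**3
    return sizes, voxel, particles


grid, voxel_size, num_particles = read_grid(INI_TEXT)
print(grid, voxel_size, num_particles)
```

The sketch only inspects the file; it does not reproduce how ARES itself parses or validates these keys.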

Section [run]

  • NCAT: Number of catalogs. This affects the number of “catalog” sections.

  • SIMULATION: Specify if the input is from simulation. Default is false.

Section [cosmology]

  • omega_r: Radiation density

  • omega_k: Curvature

  • omega_m: Total matter density

  • omega_b: Baryonic matter density

  • omega_q: Quintessence density

  • w: Quintessence equation of state

  • wprime: Derivative of the equation of state

  • n_s: Slope of the power spectrum of scalar fluctuations

  • sigma8: Normalisation of the power spectrum at 8 Mpc/h

  • h100: Hubble constant in unit of 100 km/s/Mpc

Section [julia]

  • likelihood_path: Path to the julia file describing the likelihood (i.e. the main entry point for BORG in the likelihood)

  • likelihood_module: Name of the julia module holding the likelihood

  • bias_sampler_type: slice or hmclet, which sampling strategy to use to sample the “bias” parameters

  • ic_in_julia: true or false, whether the initial condition of the Markov Chain is set in julia

  • hmclet_diagonalMass: whether to use a diagonal or a dense mass matrix estimated on the fly

  • hmclet_burnin: number of steps allowed in “BURN IN” mode. This depends on the complexity of the likelihood. A few hundred seems reasonable.

  • hmclet_burnin_memory: size of the memory in “BURN IN” mode. Something like 50 is advocated to be sure it is fairly local but not too noisy.

  • hmclet_maxEpsilon: maximum epsilon for the HMC integrator (take order 0.01)

  • hmclet_maxNtime: maximum number of timesteps for the HMC integrator (take a few tens, e.g. 20-50)

Catalog sections

Basic fields

  • datafile: Text filename holding the data

  • maskdata: Healpix FITS file with the mask

  • radial_selection: Type of selection function, can be either “schechter”, “file” or “piecewise”.

  • refbias: true if this catalog is a reference for bias. Bias will not be sampled for it

  • bias: Default bias value, also used for mock generation

  • nmean: Initial mean galaxy density value, also used for mock generation

Halo selection

  • halo_selection: Specifies how to select the halos from the halo catalog. Can be mass, radius, spin or mixed. The mixed option represents combined cuts and can be applied by specifying, e.g., “halo_selection = mass radius”

  • halo_low_mass_cut: this is log10 of mass in the same unit as the masses of the input text file

  • halo_high_mass_cut: same as for halo_low_mass_cut, this is log10 of mass

  • halo_small_radius_cut

  • halo_large_radius_cut

  • halo_small_spin_cut

  • halo_high_spin_cut
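Since the mass cuts are given as log10 of the mass, a selection equivalent to halo_low_mass_cut and halo_high_mass_cut can be sketched as follows. The function name, the sample masses and the choice of inclusive bounds are assumptions for illustration; the actual comparison used by ARES may differ.

```python
import math


def select_by_mass(masses, low_log10_cut, high_log10_cut):
    """Keep halos whose log10(mass) lies within [low_log10_cut, high_log10_cut]."""
    return [m for m in masses
            if low_log10_cut <= math.log10(m) <= high_log10_cut]


# Hypothetical halo masses, in the same unit as the input text file.
halo_masses = [5.0e11, 2.0e12, 8.0e13, 3.0e15]
selected = select_by_mass(halo_masses, 12.0, 15.0)
print(selected)
```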

Schechter selection function

  • schechter_mstar: Mstar for Schechter function

  • schechter_alpha: Power law slope of Schechter function

  • schechter_sampling_rate: How many distance points to precompute from Schechter (e.g. 1000)

  • schechter_dmax: Maximum distance to precompute Schechter selection function

  • galaxy_bright_apparent_magnitude_cut: Apparent magnitude where data and selection must be truncated, bright end.

  • galaxy_faint_apparent_magnitude_cut: Same for faint end.

  • galaxy_bright_absolute_magnitude_cut: Absolute magnitude cut in data and selection function, bright end, useful to select different galaxy populations

  • galaxy_faint_absolute_magnitude_cut: Similar but faint end

  • zmin: Minimum redshift for galaxy sample, galaxies will be truncated

  • zmax: Maximum redshift for galaxy sample, galaxies will be truncated

‘File’ selection function

  • radial_file: Text file to load the selection from

The file has the following format. Each line starting with a ‘#’ is a comment line, and discarded. The first line is a set of three numbers: ‘rmin dr N’. Each line that follows must be a number between 0 and 1 giving the selection function at a distance r = rmin + dr * i, where ‘i’ is the line number (zero based). Finally ‘N’ is the number of points in the text file.
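The format described above can be read with a few lines of Python. This is a minimal sketch, not the ARES loader: the function name parse_radial_selection and the sample content are hypothetical, and it assumes that blank lines are ignored and that ‘i’ counts the value lines starting from zero.

```python
def parse_radial_selection(text):
    """Parse the 'file' radial selection format into (distances, values)."""
    # Drop comment lines (starting with '#') and blank lines.
    rows = [line.strip() for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")]
    # First remaining line: 'rmin dr N'.
    rmin_s, dr_s, n_s = rows[0].split()
    rmin, dr, n = float(rmin_s), float(dr_s), int(n_s)
    values = [float(v) for v in rows[1:]]
    if len(values) != n:
        raise ValueError(f"header announces {n} points, found {len(values)}")
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("selection values must lie between 0 and 1")
    # Distance attached to the i-th value: r = rmin + dr * i.
    distances = [rmin + dr * i for i in range(n)]
    return distances, values


SAMPLE = """# hypothetical selection file
0.0 10.0 4
1.0
0.8
0.5
0.1
"""

distances, values = parse_radial_selection(SAMPLE)
print(list(zip(distances, values)))
```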

Two possibilities are offered for adjusting the catalog and the selection together:

  • either you choose not to do anything, and take the whole sample and provided selection. Then you need to specify:

    • file_dmin: Minimal distance for selection function and data

    • file_dmax: same but maximal distance

    • no_cut_catalog: set to false, if you do not set this you will get an error message.

  • or you want ares to preprocess the catalog and then you need:

    • zmin

    • zmax

    • galaxy_faint_apparent_magnitude_cut: Apparent magnitude where data and selection must be truncated, faint end.

    • galaxy_bright_absolute_magnitude_cut: Absolute magnitude cut in data and selection function, bright end, useful to select different galaxy populations

    • galaxy_faint_absolute_magnitude_cut: Similar but faint end

    • no_cut_catalog: (not necessary, as it defaults to true)

ARES_Configuration_file_v2

The configuration file for ARES uses the INI file syntax. It is separated into sections among which three are main sections.

Main sections

Section [system]

  • console_output: Holds the prefix filename for all log output files.

  • VERBOSE_LEVEL: Set the verbosity level for the console. Files get all outputs. Check inside libLSS/tools/log_traits.hpp for details.

    • Values:
      • VERBOSE_LEVEL=1 : up to STD level

      • VERBOSE_LEVEL=2 : INFO level

      • VERBOSE_LEVEL=3 : VERBOSE level

      • VERBOSE_LEVEL=4 : DEBUG level

  • N0: Number of grid elements along the X axis.

  • N1: Same for Y axis.

  • N2: Same for Z axis.

  • Optionally:

    • Ndata0, Ndata1, Ndata2 specify the same thing as N0, N1, N2 but for the projection grid of the galaxy positions. This grid must be different in the case the degrader bias pass is used (see bias model section)

  • L0: Comoving length of the X axis

  • L1: Same for Y axis

  • L2: Same for Z axis

  • corner0: Center of the voxel at the corner of the box in -X direction, this should be the smallest X value.

  • corner1: Same for Y

  • corner2: Same for Z

  • NUM_MODES: number of bins to represent the power spectrum

  • projection_model: Specifies which projection to use for data. No constraints are enforced on the likelihood, but of course they should be matched to the value adopted here. The value is inspected in src/common/projection.hpp. There are two available at the moment: number_ngp and luminosity_cic. The number_ngp is just Nearest-Grid-Point number counting. The luminosity_cic uses the value in Mgal to weight the object before doing CIC projection.

    • number_ngp: it just counts the number of galaxies/objects within a voxel

    • luminosity_cic: it weights galaxies by their luminosity and does a CIC projection.

  • test_mode: Runs ARES/BORG/HADES in test mode. Data is not used, mock data is generated on the fly.

  • seed_cpower: Set to true to seed the power spectrum with the correct one according to the cosmology section. Otherwise it is set to a small fraction of it.

  • savePeriodicity: This reduces the number of times the restart files are dumped to the hard drives. This is useful for reducing I/Os, as restart files are heavy. You can set this to a number that is a multiple of the number of mcmc steps. For example, 20 tells ares to dump restart files every 20 mcmc steps.

  • mask_precision: Precision to which you want to compute the mask. By default it is “0.01”, which is not related to the actual precision (unfortunately not yet). It allows scaling the internal number of evaluations of the selection function, so 0.001 will call it 100 times more. The advice is not to decrease below 0.01.

  • furious_seeding: if set to true the core sampler will reseed itself from a system entropy source at each step of the MCMC. That means the MCMC becomes unpredictable and the seed number is discarded.

Section [block_loop]

  • hades_sampler_blocked: Prevents the density field from being sampled

Likelihoods that use the generic bias framework (currently GAUSSIAN_MO_WHITE_BIAS) also support the following tags:

  • bias_XX_sampler_generic_blocked: if set to true, the XX parameter of the bias is not sampled. XX varies depending on the likelihood. WARNING: the code has not yet been updated to look for these variables in [block_loop]; they should still be located in [system] at the moment.

    • Note: Whenever a bias model uses \(b_0\) to hold the normalization, inside its header you should set/see NmeanIsBias=True. Take a look inside libLSS/physics/bias/* (for example linear.hpp).

  • sigma8_sampler_blocked: true by default; set this to false to sample sigma8 in the initial conditions.

Section [mcmc]

  • number_to_generate: Maximum number of Markov chain samples to produce in a single run

  • init_random_scaling: This is more specific to HADES. It starts the MCMC run with a random initial condition, scaled with this number (default 0.1) compared to the reference initial power spectrum.

  • random_ic: true if ic must be reshuffled before starting the MCMC sampling, false to keep them at their value generated by the mock data generator

Section [gravity]

  • model: Forward model to use

    • LPT: Lagrangian perturbation theory, ModifiedNGP/Quad final projection

    • 2LPT: Second order Lagrangian perturbation theory, ModifiedNGP/Quad final projection

    • PM: Particle mesh, ModifiedNGP/Quad final projection

    • LPT_CIC: Same as LPT, but use CIC for final projection

    • 2LPT_CIC: Same as 2LPT, but use CIC for final projection

    • PM_CIC: Same as PM, but use CIC for final projection

    • tCOLA: Same as PM_CIC but uses a TCOLA gravity machine. To enable, specify model=PM_CIC, as above, AND set tCOLA=true.

    • HADES_LOG: Use Exponential transform (HADES model) for the forward model. Preserved mean density is enforced.

  • supersampling: Controls the number of particles (supersampling level of the particle grid with respect to the grid). The number of particles is \(N_0 \cdot N_1 \cdot N_2 \cdot \mathrm{supersampling}^3\)

  • forcesampling: This is the oversampling for computing the gravitational field (and thus the force in the PM). A current rule of thumb is to have forcesampling at least twice supersampling, and supersampling at least two. For tCOLA, the requirements are less stringent.

    • To be checked: Setup with forcesampling=supersampling.

  • a_initial: The scale factor is a measure of time. This parameter sets the initial scale factor \(a_i\), which should satisfy \(10^{-3} \leq a_i \leq 1.0\), with \(a_i=10^{-3}\) corresponding to the time of the CMB

  • a_final: Same as the a_initial parameter, but for the final scale factor \(a_f\), with \(a_f > a_i\)

  • pm_start_z: This is relevant only for the PM forward model and represents the starting redshift for the PM simulation.

  • pm_nsteps: Relevant only for PM model, see extra/borg/libLSS/physics/forwards/borg_multi_pm.cpp. There are two scalings in the code, controlled with LOG_SCALE_STEP. If LOG_SCALE_STEP is set to False then steps are split linearly in \(a\). It seems the linear scaling gives better results in tests of \(P(k)\).

  • part_factor: An option relevant for MPI run. This is the overallocation of particles on each node to allow for moving them in and out of the node. It is required because the density projection needs to have only the relevant particles on the node. If one of them is outside the slab it will cause a failure.

    • Note: part_factor is independent of forcesampling and supersampling. It will likely be larger for smaller boxes (in physical length) and for smaller boxes in terms of mesh / grid size: in the first case because particles travel larger distances with respect to the size of the box, and in the second because there is more shot noise.

  • lightcone: See equation 2 from the SDSS3-BOSS inference paper. This option is more relevant for larger boxes.

  • do_rsd: Do redshift space distortion if set to True.

    • Note: The DM particles are shifted directly. This will never be the case in observations, for which it is the ensemble of gas particles around a galaxy that is shifted.
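The constraints and rules of thumb stated for the [gravity] keys can be collected into a small pre-flight check. The sketch below is hypothetical: the function check_gravity does not exist in ARES, and it only encodes the bounds quoted above (range of a_initial, a_final larger than a_initial, supersampling at least two, forcesampling at least twice supersampling). As noted above, tCOLA runs may tolerate less strict settings.

```python
def check_gravity(supersampling, forcesampling, a_initial, a_final):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if not 1e-3 <= a_initial <= 1.0:
        problems.append("a_initial should satisfy 1e-3 <= a_initial <= 1")
    if not a_final > a_initial:
        problems.append("a_final should be larger than a_initial")
    if supersampling < 2:
        problems.append("rule of thumb: supersampling >= 2")
    if forcesampling < 2 * supersampling:
        problems.append("rule of thumb: forcesampling >= 2 * supersampling")
    return problems


# Hypothetical settings: the first set passes, the second violates three rules.
ok_report = check_gravity(supersampling=2, forcesampling=4,
                          a_initial=0.001, a_final=1.0)
bad_report = check_gravity(supersampling=1, forcesampling=1,
                           a_initial=0.5, a_final=0.2)
print(ok_report)
print(bad_report)
```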

Forward model elements can as well be chained and have different grid sizes. “model” can now be CHAIN, which then needs a specific list of models in “models”.

Here is an example:

[gravity]
model=CHAIN
models=PRIMORDIAL,TRANSFER_EHU,LPT_CIC
[gravity_chain_0]
a_final=0.001
[gravity_chain_1]
[gravity_chain_2]
supersampling=2
lightcone=false
do_rsd=false
a_initial=0.001
a_final=1.
part_factor=2.0
mul_out=1

Each element of the chain gets its own configuration section, with the same options as when it was a global descriptor (see above). Note that if you use the chain mechanism, you have to be explicit about the production of the initial conditions power spectrum. As you can see above, we indicate “PRIMORDIAL,TRANSFER_EHU” to start with a primordial scale-free gravitational potential, onto which we apply an Eisenstein-Hu transfer function to form density fluctuations, which are then passed down to LPT_CIC. Also keep in mind that the scale factors must be compatible; no checks are run by the code at the moment. mul_out specifies how much the output grid has to be supersampled for the CIC (i.e. the CIC grid is produced at mul_out times the initial grid size).
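To make the correspondence between the entries of “models” and the gravity_chain_N sections explicit, the sketch below parses the example above with Python's configparser and pairs each model with its options. The helper read_chain is hypothetical, and the assumption that the N-th entry maps to section gravity_chain_N is read off the example, not taken from the ARES source.

```python
import configparser

CHAIN_INI = """
[gravity]
model=CHAIN
models=PRIMORDIAL,TRANSFER_EHU,LPT_CIC
[gravity_chain_0]
a_final=0.001
[gravity_chain_1]
[gravity_chain_2]
supersampling=2
lightcone=false
do_rsd=false
a_initial=0.001
a_final=1.
part_factor=2.0
mul_out=1
"""


def read_chain(ini_text):
    """Return a list of (model_name, options) pairs, in chain order."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the key case exactly as written
    parser.read_string(ini_text)
    gravity = parser["gravity"]
    if gravity["model"] != "CHAIN":
        raise ValueError("this sketch only handles model=CHAIN")
    names = [name.strip() for name in gravity["models"].split(",")]
    chain = []
    for index, name in enumerate(names):
        section = f"gravity_chain_{index}"
        # A missing section is treated as "no options" in this sketch.
        options = dict(parser[section]) if parser.has_section(section) else {}
        chain.append((name, options))
    return chain


chain = read_chain(CHAIN_INI)
for name, options in chain:
    print(name, options)
```

All option values are returned as strings; converting them (for example a_final to a float) is left to the caller.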

Model ‘Primordial’

Apply a primordial scale free power spectrum on the input. The output is scaled linearly to a_final.

Model ‘Transfer’
  • CIC correction: use_invert_cic=true: the transfer function is the inverse CIC; smoother=0.99 (in units of the grid)

  • Sharp K filter: use_sharpk=true: the transfer function is a sharp k filter; k_max=0.1 (in h/Mpc)

Model ‘Softplus’

Apply a softplus transform. hardness=1.0: parameter making the transition more or less hard.
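For readers unfamiliar with the softplus transform, here is a minimal sketch using one common parametrisation, log(1 + exp(hardness * x)) / hardness. Whether ARES uses exactly this form, and how hardness enters it, is an assumption made for illustration only.

```python
import math


def softplus(x, hardness=1.0):
    """Numerically stable log(1 + exp(hardness * x)) / hardness."""
    z = hardness * x
    # max(z, 0) + log1p(exp(-|z|)) equals log(1 + exp(z)) without overflow.
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / hardness


# A larger hardness brings the curve closer to max(x, 0).
for h in (1.0, 10.0, 100.0):
    print(h, softplus(-1.0, h), softplus(0.0, h), softplus(1.0, h))
```

In this parametrisation the output is always positive, and increasing hardness sharpens the transition around zero.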

Model ‘Downgrade’

(No option)

Section [hades]

  • max_epsilon: Stepsize for the HMC. It is unitless. Good starting point is around 0.01.

  • max_timesteps: Maximum number of timesteps for a single HMC sample.

  • mixing: Number of samples to compute before writing to disk.

  • algorithm:

    • HMC: classical HMC algorithm

    • QN-HMC: Quasi-Newton HMC algorithm

    • FROZEN-PHASE: Fixed phases. They are not sampled at all, but this option provides some pipelines to allow the other samplers to work.

  • phases: if algorithm is FROZEN-PHASE, you can specify an HDF5 filename here. This file must contain a “phase” array which is conforming to the setup of the ini.

  • noPhasesProvided: if phases is omitted, this one has to be set to true, otherwise an error is thrown.

  • phasesDataKey: this indicates which field to use in the phases HDF5 file.

  • likelihood: Likelihood to use in HADES run. Can be one of the following values:

    • LINEAR: Gaussian likelihood

    • BORG_POISSON: Use Poisson likelihood

    • Generic framework:

      • GAUSSIAN_BROKEN_POWERLAW_BIAS

      • GAUSSIAN_MO_WHITE_BIAS: Gaussian noise model, variance is fitted. Double power law bias

      • GAUSSIAN_POWERLAW_BIAS: Power law bias model with a Gaussian noise model, variance is fitted.

      • GAUSSIAN_2ND_ORDER_BIAS

      • GENERIC_POISSON_BROKEN_POWERLAW_BIAS: Broken power law bias model (also called Neyrinck’s model), with Poisson noise model

      • GENERIC_GAUSSIAN_LINEAR_BIAS: Linear bias model, Gaussian noise model

      • GENERIC_GAUSSIAN_MANY_POWER_1^1

      • GENERIC_GAUSSIAN_MANY_POWER_1^2

      • GENERIC_GAUSSIAN_MANY_POWER_1^4

      • GENERIC_POISSON_MANY_POWER_1^1

      • GENERIC_POISSON_MANY_POWER_1^2

      • GENERIC_POISSON_MANY_POWER_1^4

      • GENERIC_POISSON_POWERLAW_BIAS: simple power law bias model with Poisson noise model

      • GENERIC_POISSON_POWERLAW_BIAS_DEGRADE4: power law bias models preceded by a degrade pass (N -> N/4 in each direction)

      • GENERIC_POISSON_BROKEN_POWERLAW_BIAS_DEGRADE4: broken power law bias model preceded by a degrade pass (N -> N/4 in each direction)

  • scheme: SI_2A, SI_2B, SI_2C, SI_3A, SI_4B, SI_4C, SI_4D, SI_6A

Section [run]

  • NCAT: Number of catalogs. This affects the number of “catalog” sections.

    • Note: If NCAT>1, the catalogues are assumed to be independent (no double counting of galaxies etc.), hence their log-likelihoods are simply summed together.

  • SIMULATION: Specify if the input is from simulation. Default is false.

Section [cosmology]

  • omega_r: Radiation density

  • omega_k: Curvature

  • omega_m: Total matter density

  • omega_b: Baryonic matter density

  • omega_q: Quintessence density

  • w: Quintessence equation of state

  • wprime: Derivative of the equation of state

  • n_s: Slope of the power spectrum of scalar fluctuations

  • sigma8: Normalisation of the power spectrum at 8 Mpc/h

  • h100: Hubble constant in unit of 100 km/s/Mpc

  • fnl: primordial non-Gaussianity

Section [likelihood]

  • Options related to robust likelihood. Each patch of a robust likelihood can be sliced in the redshift direction. There are two options controlling the slicing: the maximum distance “rmax” and the number of slices “slices”

    • rmax: Maximum distance accessible during the inference. In practice it is at least the farthest distance of a voxel in the box. Unit is the one of the box, most generally \(h^{-1}\) Mpc.

    • slices: Number of slices to build in the redshift direction. Each patch will have a depth ~rmax/slices.
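The relation between rmax, slices and the depth of each patch can be made concrete with a short sketch. It assumes equal-depth slices starting at distance zero, which is only one reading of “depth ~rmax/slices”; the function name slice_edges is hypothetical.

```python
def slice_edges(rmax, slices):
    """Return the slices + 1 radial edges of equal-depth slices in [0, rmax]."""
    depth = rmax / slices
    return [i * depth for i in range(slices + 1)]


# Hypothetical setup: a 600 Mpc/h maximum distance split into 4 slices.
edges = slice_edges(600.0, 4)
print(edges)
```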

Section [julia]

  • likelihood_path: path of the julia code

  • likelihood_module: julia module where the likelihood is implemented

  • bias_sampler_type: type of sampler for the bias parameters (hmclet, slice)

  • ic_in_julia: whether the initial conditions of the MCMC are coded in julia or drawn as random numbers

  • hmclet_diagonalMass: whether to use a diagonal mass matrix or a full dense one

  • mass_burnin: number of MCMC steps in burnin mode

  • mass_burnin_memory: number of MCMC steps to store when in burnin mode

  • hmclet_maxEpsilon: maximum epsilon for the leapfrog integrator (~0.002-0.01 depending on likelihood complexity)

  • hmclet_maxNtime: maximum number of steps for the leapfrog integrator (~50-100)

  • hmclet_massScale: amount of momentum reshuffling (0.0 = full, 1.0 = none, which is bad for MCMC)

  • hmclet_correlationLimiter: reduce the correlations in the covariance matrix by some number. Typically, the smaller the number, the smaller the reduction, with \(\simeq 1\) reducing the correlation by a factor of 2.

Catalog sections

Basic fields

  • datafile: Text filename holding the data

  • maskdata: Healpix FITS file with the mask

  • radial_selection: Type of selection function, can be either “schechter”, “file” or “piecewise”.

  • refbias: true if this catalog is a reference for bias. Bias will not be sampled for it

  • bias: Default bias value, also used for mock generation

  • nmean: Initial mean galaxy density value, also used for mock generation

Halo selection

  • halo_selection: Specifies how to select the halos from the halo catalog. Can be mass, radius, spin or mixed. The mixed option represents combined cuts and can be applied by specifying, e.g., “halo_selection = mass radius”

  • halo_low_mass_cut: this is log10 of mass in the same unit as the masses of the input text file

  • halo_high_mass_cut: same as for halo_low_mass_cut, this is log10 of mass

  • halo_small_radius_cut

  • halo_large_radius_cut

  • halo_small_spin_cut

  • halo_high_spin_cut

Schechter selection function

  • schechter_mstar: Mstar for Schechter function

  • schechter_alpha: Power law slope of Schechter function

  • schechter_sampling_rate: How many distance points to precompute from Schechter (e.g. 1000)

  • schechter_dmax: Maximum distance to precompute Schechter selection function

  • galaxy_bright_apparent_magnitude_cut: Apparent magnitude where data and selection must be truncated, bright end.

  • galaxy_faint_apparent_magnitude_cut: Same for faint end.

  • galaxy_bright_absolute_magnitude_cut: Absolute magnitude cut in data and selection function, bright end, useful to select different galaxy populations

  • galaxy_faint_absolute_magnitude_cut: Similar but faint end

  • zmin: Minimum redshift for galaxy sample, galaxies will be truncated

  • zmax: Maximum redshift for galaxy sample, galaxies will be truncated

‘File’ selection function

  • radial_file: Text file to load the selection from

The file has the following format. Each line starting with a ‘#’ is a comment line, and discarded. The first line is a set of three numbers: ‘rmin dr N’. Each line that follows must be a number between 0 and 1 giving the selection function at a distance r = rmin + dr * i, where ‘i’ is the line number (zero based). Finally ‘N’ is the number of points in the text file.

Two possibilities are offered for adjusting the catalog and the selection together:

  • either you choose not to do anything, and take the whole sample and provided selection. Then you need to specify:

    • file_dmin: Minimal distance for selection function and data

    • file_dmax: same but maximal distance

    • no_cut_catalog: set to false, if you do not set this you will get an error message.

  • or you want ares to preprocess the catalog and then you need:

    • zmin

    • zmax

    • galaxy_faint_apparent_magnitude_cut: Apparent magnitude where data and selection must be truncated, faint end.

    • galaxy_bright_absolute_magnitude_cut: Absolute magnitude cut in data and selection function, bright end, useful to select different galaxy populations

    • galaxy_faint_absolute_magnitude_cut: Similar but faint end

    • no_cut_catalog: (not necessary, as it defaults to true)

ARES_Configuration_file_v2.1

The configuration file for ARES uses the INI file syntax. It is separated into sections among which three are main sections.

Main sections

Section [system]

  • console_output: Holds the prefix filename for all log output files.

  • VERBOSE_LEVEL: Set the verbosity level for the console. Files get all outputs.

  • N0: Number of grid elements along the X axis.

  • N1: Same for Y axis.

  • N2: Same for Z axis.

  • Optionally:

    • Ndata0, Ndata1, Ndata2 specify the same thing as N0, N1, N2 but for the projection grid of the galaxy positions. This grid must be different in the case the degrader bias pass is used (see bias model section)

  • L0: Comoving length of the X axis

  • L1: Same for Y axis

  • L2: Same for Z axis

  • corner0: Center of the voxel at the corner of the box in -X direction, this should be the smallest X value.

  • corner1: Same for Y

  • corner2: Same for Z

  • NUM_MODES: number of bins to represent the power spectrum

  • projection_model: Specifies which projection to use for data. No constraints are enforced on the likelihood, but of course they should be matched to the value adopted here. The value is inspected in src/common/projection.hpp. There are two available at the moment: number_ngp and luminosity_cic. The number_ngp is just Nearest-Grid-Point number counting. The luminosity_cic uses the value in Mgal to weight the object before doing CIC projection.

    • number_ngp: it just counts the number of galaxies/objects within a voxel

    • luminosity_cic: it weights galaxies by their luminosity and does a CIC projection.

  • test_mode: Runs ARES/BORG/HADES in test mode. Data is not used, mock data is generated on the fly.

  • seed_cpower: Set to true to seed the power spectrum with the correct one according to the cosmology section. Otherwise it is set to a small fraction of it.

  • savePeriodicity: This reduces the number of times the restart files are dumped to the hard drives. This is useful for reducing I/Os, as restart files are heavy. You can set this to a number that is a multiple of the number of mcmc steps. For example, 20 tells ares to dump restart files every 20 mcmc steps.

  • mask_precision: Precision to which you want to compute the mask. By default it is “0.01”, which is not related to the actual precision (unfortunately not yet). It allows scaling the internal number of evaluations of the selection function, so 0.001 will call it 100 times more. The advice is not to decrease below 0.01.

  • furious_seeding: if set to true the core sampler will reseed itself from a system entropy source at each step of the MCMC. That means the MCMC becomes unpredictable and the seed number is discarded.

Section [block_loop]

  • hades_sampler_blocked: Prevents the density field from being sampled

Likelihoods that use the generic bias framework (currently GAUSSIAN_MO_WHITE_BIAS) also support the following tags:

  • bias_XX_sampler_generic_blocked: if set to true, the XX parameter of the bias is not sampled. XX varies depending on the likelihood. WARNING: the code has not yet been updated to look for these variables in [block_loop]; they should still be located in [system] at the moment.

  • sigma8_sampler_blocked: true by default; set this to false to sample sigma8 in the initial conditions.

Section [mcmc]

  • number_to_generate: Maximum number of Markov chain samples to produce in a single run

  • init_random_scaling: This is more specific to HADES. It starts the MCMC run with a random initial condition, scaled with this number (default 0.1) compared to the reference initial power spectrum.

  • random_ic: true if ic must be reshuffled before starting the MCMC sampling, false to keep them at their value generated by the mock data generator

  • scramble_bias: true (default), reset the bias values to some other values before starting the chain, after generating the mock.

Section [gravity]

  • model: Forward model to use

    • LPT: Lagrangian perturbation theory, ModifiedNGP/Quad final projection

    • 2LPT: Second order Lagrangian perturbation theory, ModifiedNGP/Quad final projection

    • PM: Particle mesh, ModifiedNGP/Quad final projection

    • LPT_CIC: Same as LPT, but use CIC for final projection

    • 2LPT_CIC: Same as 2LPT, but use CIC for final projection

    • PM_CIC: Same as PM, but use CIC for final projection

    • tCOLA: Same as PM_CIC but uses a TCOLA gravity machine. To enable, specify model=PM_CIC, as above, AND set tCOLA=true.

    • HADES_LOG: Use Exponential transform (HADES model) for the forward model. Preserved mean density is enforced.

  • supersampling: Controls the number of particles (supersampling level of the particle grid with respect to the grid). The number of particles is N0*N1*N2*supersampling**3

  • forcesampling

  • a_initial

  • a_final

  • pm_start_z:

  • pm_nsteps:

  • part_factor:

  • lightcone:

  • do_rsd: Do redshift space distortion if set to “true”.

Forward model elements can as well be chained and have different grid sizes. “model” can now be CHAIN, which then needs a specific list of model layers in “models”.

Here is an example:

[gravity]
model=CHAIN
models=PRIMORDIAL,TRANSFER_EHU,LPT_CIC
[gravity_chain_0]
a_final=0.001
[gravity_chain_1]
[gravity_chain_2]
supersampling=2
lightcone=false
do_rsd=false
a_initial=0.001
a_final=1.
part_factor=2.0
mul_out=1

Each element of the chain gets its own configuration section, with the same options as when it was a global descriptor (see above). Note that if you use the chain mechanism, you have to be explicit about the production of the initial conditions power spectrum. As you can see above, we indicate “PRIMORDIAL,TRANSFER_EHU” to start with a primordial scale-free gravitational potential, onto which we apply an Eisenstein-Hu transfer function to form density fluctuations, which are then passed down to LPT_CIC. Also keep in mind that the scale factors must be compatible; no checks are run by the code at the moment. mul_out specifies how much the output grid has to be supersampled for the CIC (i.e. the CIC grid is produced at mul_out times the initial grid size).

Model ‘Primordial’

Apply a primordial scale free power spectrum on the input. The output is scaled linearly to a_final.

Model ‘Transfer’
  • CIC correction: use_invert_cic=true. The transfer function is the inverse of the CIC kernel. Option: smoother=0.99 (in units of the grid)

  • Sharp K filter: use_sharpk=true. The transfer function is a sharp k filter. Option: k_max=0.1 (in h/Mpc)

Model ‘Softplus’

Apply a softplus transform. Option: hardness=1.0, a parameter that makes the transition more or less hard.
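
The exact functional form used by the code is not given here. As an illustration only, a common parameterisation is softplus(x) = log(1 + exp(hardness * x)) / hardness, which approaches max(x, 0) as hardness grows. The sketch below assumes that form; it may differ from what the model actually implements.

```python
import math


def softplus(x: float, hardness: float = 1.0) -> float:
    """Smooth approximation of max(x, 0) (assumed form, see text above)."""
    t = hardness * x
    # Numerically stable evaluation of log(1 + exp(t)) for large |t|.
    return (max(t, 0.0) + math.log1p(math.exp(-abs(t)))) / hardness


# A larger hardness gives a sharper transition around x = 0.
for h in (1.0, 10.0):
    print(h, [round(softplus(x, h), 4) for x in (-1.0, 0.0, 1.0)])
```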

Model ‘Downgrade’

(No option)

Section [hades]
  • max_epsilon: Stepsize for the HMC. It is unitless. Good starting point is around 0.01.

  • max_timesteps: Maximum number of timesteps for a single HMC sample.

  • mixing: Number of samples to compute before writing to disk.

  • algorithm:

    • HMC: classical HMC algorithm

    • QN-HMC: Quasi-Newton HMC algorithm

    • FROZEN-PHASE: Fixed phases. They are not sampled at all, but the pipelines that the other samplers need are still provided.

  • phases: if algorithm is FROZEN-PHASE, you can specify an HDF5 filename here. This file must contain a “phase” array which is conforming to the setup of the ini.

  • noPhasesProvided: if phases is omitted, this one has to be set to true, otherwise an error is thrown.

  • phasesDataKey: this indicates which field to use in the phases HDF5 file.

  • likelihood: Likelihood to use in HADES run. Can be either one of those values:

    • LINEAR: Gaussian likelihood

    • BORG_POISSON: Use poisson likelihood

    • Generic framework:

      • GAUSSIAN_BROKEN_POWERLAW_BIAS

      • GAUSSIAN_MO_WHITE_BIAS: Gaussian noise model, variance is fitted. Double power law bias

      • GAUSSIAN_POWERLAW_BIAS: Power law bias model with a Gaussian noise model, variance is fitted.

      • GAUSSIAN_2ND_ORDER_BIAS

      • GENERIC_POISSON_BROKEN_POWERLAW_BIAS: Broken power law bias model (also called Neyrinck’s model), with Poisson noise model

      • GENERIC_GAUSSIAN_LINEAR_BIAS: Linear bias model, Gaussian noise model

      • GENERIC_GAUSSIAN_MANY_POWER_1^1

      • GENERIC_GAUSSIAN_MANY_POWER_1^2

      • GENERIC_GAUSSIAN_MANY_POWER_1^4

      • GENERIC_POISSON_MANY_POWER_1^1

      • GENERIC_POISSON_MANY_POWER_1^2

      • GENERIC_POISSON_MANY_POWER_1^4

      • GENERIC_POISSON_POWERLAW_BIAS: simple power law bias model with Poisson noise model

      • GENERIC_POISSON_POWERLAW_BIAS_DEGRADE4: power law bias model preceded by a degrade pass (N -> N/4 in each direction)

      • GENERIC_POISSON_BROKEN_POWERLAW_BIAS_DEGRADE4: broken power law bias model preceded by a degrade pass (N -> N/4 in each direction)

  • scheme: SI_2A, SI_2B, SI_2C, SI_3A, SI_4B, SI_4C, SI_4D, SI_6A
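
A [hades] section can be generated and sanity-checked with Python's standard configparser. The sketch below only uses the option names and allowed values listed above. The helper name hades_section and the defaults for max_timesteps, mixing, likelihood and scheme are illustrative assumptions, not recommended settings. It does not handle the extra options required by FROZEN-PHASE.

```python
import configparser
import io

ALGORITHMS = ("HMC", "QN-HMC", "FROZEN-PHASE")
SCHEMES = ("SI_2A", "SI_2B", "SI_2C", "SI_3A", "SI_4B", "SI_4C", "SI_4D", "SI_6A")


def hades_section(algorithm="HMC", scheme="SI_2A", max_epsilon=0.01,
                  max_timesteps=50, mixing=1,
                  likelihood="GAUSSIAN_POWERLAW_BIAS"):
    """Render a [hades] section as INI text, rejecting unknown choices."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm}")
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme}")
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # preserve the case of option names
    cfg["hades"] = {
        "algorithm": algorithm,
        "max_epsilon": str(max_epsilon),
        "max_timesteps": str(max_timesteps),
        "mixing": str(mixing),
        "likelihood": likelihood,
        "scheme": scheme,
    }
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


print(hades_section())
```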

Section [run]

  • NCAT: Number of catalogs. This affects the number of “catalog” sections.

  • SIMULATION: Specify if the input is from simulation. Default is false.

Section [likelihood]

  • MainPower_prior_width: Variance of the manypower parameters (except mean which is always uniform positive)

  • EFT_Lambda: Lambda truncation parameter of the EFT bias model

  • Options related to robust likelihood. Each patch of a robust likelihood can be sliced in the redshift direction. There are two options controlling the slicing: the maximum distance “rmax” and the number of slices “slices”

    • rmax: Maximum distance accessible during the inference. In practice it is at least the farthest distance of a voxel in the box. The unit is that of the box, most commonly \(h^{-1}\) Mpc.

    • slices: Number of slices to build in the redshift direction. Each patch will have a depth ~rmax/slices.
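
To pick consistent values, one can estimate the farthest distance in the box from the corner0..2 and L0..2 values of the [system] section, and derive the slice depth from it. This is a rough sketch: it assumes the observer sits at the coordinate origin and that slices are equally deep, starting at zero distance. It also ignores the half-voxel offset implied by corner0..2 being voxel centres. The function names are hypothetical.

```python
import math


def farthest_distance(corner, length):
    """Largest distance from the origin to any point of the box.

    corner is (corner0, corner1, corner2) and length is (L0, L1, L2).
    Along each axis the farthest coordinate is one of the two box ends.
    """
    return math.sqrt(sum(max(abs(c), abs(c + l)) ** 2
                         for c, l in zip(corner, length)))


def slice_edges(rmax, slices):
    """Radial boundaries of equally deep slices between 0 and rmax."""
    depth = rmax / slices
    return [i * depth for i in range(slices + 1)]


# Box of side 600 (in box units) centred on the observer.
rmax = farthest_distance((-300.0, -300.0, -300.0), (600.0, 600.0, 600.0))
print(rmax)
print(slice_edges(rmax, 4))
```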

Section [cosmology]

  • omega_r: Radiation density

  • omega_k: Curvature

  • omega_m: Total matter density

  • omega_b: Baryonic matter density

  • omega_q: Quintessence density

  • w: Quintessence equation of state

  • wprime: Derivative of the equation of state

  • n_s: Slope of the power spectrum of scalar fluctuations

  • sigma8: Normalisation of the power spectrum at 8 Mpc/h

  • h100: Hubble constant in units of 100 km/s/Mpc

  • fnl: primordial non-Gaussianity

Section [julia]

  • likelihood_path: path of the julia code

  • likelihood_module: julia module where the likelihood is implemented

  • bias_sampler_type: type of sampler for the bias parameters (hmclet, slice)

  • ic_in_julia: whether the initial conditions of the MCMC are coded in julia or drawn as random numbers

  • hmclet_diagonalMass: whether to use a diagonal mass matrix or a full dense one

  • mass_burnin: number of MCMC steps in burnin mode

  • mass_burnin_memory: number of MCMC steps to store when in burnin mode

  • hmclet_maxEpsilon: maximum epsilon for the leapfrog integrator (~0.002-0.01 depending on likelihood complexity)

  • hmclet_maxNtime: maximum number of steps for the leapfrog integrator (~50-100)

  • hmclet_massScale: amount of momentum reshuffling (0.0 = full, 1.0 = none, which is bad for MCMC)

  • hmclet_correlationLimiter: reduce the correlations in the covariance matrix by some number. Typically, the smaller the number, the smaller the reduction; a value of \(\simeq 1\) reduces the correlation by a factor of 2.

Catalog sections

Basic fields

  • datafile: Text filename holding the data

  • maskdata: Healpix FITS file with the mask

  • radial_selection: Type of selection function, can be either “schechter”, “file” or “piecewise”.

  • refbias: true if this catalog is a reference for bias. Bias will not be sampled for it

  • bias: Default bias value, also used for mock generation

  • nmean: Initial mean galaxy density value, also used for mock generation

Halo selection

  • halo_selection: Specifies how to select the halos from the halo catalog. Can be mass, radius, spin or mixed. Mixed stands for combined cuts and is applied by listing them, e.g. “halo_selection = mass radius”

  • halo_low_mass_cut: this is log10 of mass in the same unit as the masses of the input text file

  • halo_high_mass_cut: same as for halo_low_mass_cut, this is log10 of mass

  • halo_small_radius_cut

  • halo_large_radius_cut

  • halo_small_spin_cut

  • halo_high_spin_cut
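
The mass cuts are expressed in log10 of the mass, which is easy to get wrong. The following minimal sketch illustrates the intended meaning on an in-memory list of halos. It assumes inclusive bounds, which may not match the code exactly, and the function name select_halos is hypothetical.

```python
import math


def select_halos(halos, low_mass_cut=None, high_mass_cut=None):
    """Keep halos whose log10(mass) lies within the given cuts.

    The cuts are given in log10 of the mass, in the same unit as the
    catalog masses. A cut set to None is not applied.
    """
    kept = []
    for halo in halos:
        log_mass = math.log10(halo["mass"])
        if low_mass_cut is not None and log_mass < low_mass_cut:
            continue
        if high_mass_cut is not None and log_mass > high_mass_cut:
            continue
        kept.append(halo)
    return kept


halos = [{"id": 0, "mass": 1e12}, {"id": 1, "mass": 1e13}, {"id": 2, "mass": 1e14}]
# Cuts of 12.5 and 13.5 keep masses between 10**12.5 and 10**13.5.
print(select_halos(halos, low_mass_cut=12.5, high_mass_cut=13.5))
```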

Schechter selection function

  • schechter_mstar: Mstar for Schechter function

  • schechter_alpha: Power law slope of Schechter function

  • schechter_sampling_rate: How many distance points to precompute from Schechter (e.g. 1000)

  • schechter_dmax: Maximum distance to precompute Schechter selection function

  • galaxy_bright_apparent_magnitude_cut: Apparent magnitude where data and selection must be truncated, bright end.

  • galaxy_faint_apparent_magnitude_cut: Same for faint end.

  • galaxy_bright_absolute_magnitude_cut: Absolute magnitude cut in data and selection function, bright end, useful to select different galaxy populations

  • galaxy_faint_absolute_magnitude_cut: Similar but faint end

  • zmin: Minimum redshift for galaxy sample, galaxies will be truncated

  • zmax: Maximum redshift for galaxy sample, galaxies will be truncated
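
For reference, the Schechter luminosity function takes the following standard form in absolute magnitudes, where \(M^{*}\) corresponds to schechter_mstar and \(\alpha\) to schechter_alpha. The normalisation \(\phi^{*}\) is not among the options listed above. How the selection function is derived from this form and the magnitude cuts is not detailed here.

```latex
\Phi(M)\,\mathrm{d}M = 0.4 \ln(10)\, \phi^{*}\,
  10^{0.4(\alpha+1)(M^{*}-M)}
  \exp\!\left[-10^{0.4(M^{*}-M)}\right] \mathrm{d}M
```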

‘File’ selection function

  • radial_file: Text file to load the selection from

The file has the following format. Each line starting with a ‘#’ is a comment line, and discarded. The first line is a set of three numbers: ‘rmin dr N’. Each line that follows must be a number between 0 and 1 giving the selection function at a distance r = rmin + dr * i, where ‘i’ is the line number (zero based). Finally ‘N’ is the number of points in the text file.

Two possibilities are offered for adjusting the catalog and the selection together:

  • either you choose not to do anything, and take the whole sample and the provided selection. Then you need to specify:

    • file_dmin: Minimal distance for selection function and data

    • file_dmax: same but maximal distance

    • no_cut_catalog: set to false, if you do not set this you will get an error message.

  • or you want ares to preprocess the catalog and then you need:

    • zmin

    • zmax

    • galaxy_faint_apparent_magnitude_cut: Apparent magnitude where data and selection must be truncated, faint end.

    • galaxy_bright_absolute_magnitude_cut: Absolute magnitude cut in data and selection function, bright end, useful to select different galaxy populations

    • galaxy_faint_absolute_magnitude_cut: Similar but faint end

    • no_cut_catalog: (not necessary, as it defaults to true)

How to create a config file from python

This page is about running the gen_subcat_conf.py script under scripts/ini_generator in ares. For an explanation of the config-file itself, see here.

Config-file for 2M++ and SDSS(MGS)

The folder containing the scripts and the ini files below is located in $SOURCE/scripts/ini_generator. Steps to generate the config-file are the following:

  • Manipulate header.ini for your needs

  • (If needed) alter template files (template_sdss_main.py, template_2mpp_main.py and template_2mpp_second.py) for the cutting and adjusting of data

  • To create the ini file, run this command:

python gen_subcat_conf.py  --output NAME_OF_OUTPUT_FILE.ini --configs template_sdss_main.py:template_2mpp_main.py:template_2mpp_second.py  --header header.ini

Text catalog format

It is determined by the function loadGalaxySurveyFromText in libLSS/data/survey_load_txt.hpp (ARES git tree)

[Galaxy Survey]

For a galaxy survey, the standard catalog format includes 7-8 columns. The meaning of each column, from left to right, is listed below.

  • galaxy id

  • phi: longitude, \(2\pi \geq \phi \geq 0\) [rad].

  • theta: latitude, \(\pi/2 \geq \theta \geq -\pi/2\) [rad].

  • zo: total observed redshift, to be used with photo-z.

  • m: apparent magnitude.

  • M_abs: absolute magnitude, not really used as it is derived from other quantities.

  • z: redshift, used to position the galaxies, cosmology is used to transform this to comoving distance at the moment.

  • w: weight, used as a multiplier when creating the grid of galaxy distribution.
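
When preparing such a file by hand, the angle ranges and the column order are the usual sources of mistakes. The sketch below formats one catalog line with the eight columns in the order listed above. It assumes whitespace-separated columns. The function name format_galaxy_row and the numeric formats are illustrative choices, not something prescribed by the loader.

```python
import math


def format_galaxy_row(gal_id, phi, theta, zo, m, M_abs, z, w=1.0):
    """Return one whitespace-separated catalog line (8 columns).

    phi and theta are in radians and are checked against the ranges
    given in the column description.
    """
    if not 0.0 <= phi <= 2.0 * math.pi:
        raise ValueError("phi must lie in [0, 2*pi] radians")
    if not -math.pi / 2.0 <= theta <= math.pi / 2.0:
        raise ValueError("theta must lie in [-pi/2, pi/2] radians")
    return (f"{gal_id:d} {phi:.8f} {theta:.8f} {zo:.6f} "
            f"{m:.4f} {M_abs:.4f} {z:.6f} {w:.4f}")


# Angles given in degrees are converted to radians first.
print(format_galaxy_row(1, math.radians(180.0), math.radians(-30.0),
                        0.05, 11.5, -23.1, 0.05))
```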

[Dark Matter Simulation]

For a dark matter simulation, the standard catalog format includes 10 columns. The meaning of each column, from left to right, is listed below.

  • halo id

  • halo mass: given in units of solar masses

  • halo radius

  • halo spin

  • x, y, z: comoving coordinates

  • vx, vy, vz: velocities

HDF5 catalog format

By passing the following options in the catalog sections of the ini file:

  • dataformat=HDF5

  • datakey=KEY

one can load the data needed for a catalog from an HDF5 file. The data are taken from the entry “KEY” in the HDF5 file. This makes it possible to store several catalogs in the same file.

HDF5 galaxy catalog format

The catalog must have the following columns:

  • id (unsigned long int compatible)

  • phi (longitude in radians, double compatible)

  • theta (latitude in radians, double compatible)

  • zo (observed redshift, dimensionless, double compatible)

  • m (apparent magnitude, double compatible)

  • M_abs (absolute magnitude, optional, double compatible)

  • z (redshift, optional, double compatible)

  • w (weight, double compatible, should be 1)

HDF5 halo catalog format

  • id (unsigned long int compatible)

  • Mgal (mass, double compatible)

  • radius (double compatible)

  • spin (double compatible)

  • posx (x position Mpc, double compatible)

  • posy (y position Mpc, double compatible)

  • posz (z position Mpc, double compatible)

  • vx (velocity x, km/s, double compatible)

  • vy (velocity y, km/s, double compatible)

  • vz (velocity z, km/s, double compatible)

  • w (weight, double compatible, should be 1)

An example converter can be found hereafter:

import numpy as np
import h5py as h5

# Load text data file
data0 = np.loadtxt("./halo.txt", dtype=[("id",int),("Mgal", float),("radius",float),("spin",float),("posx",float),("posy",float),("posz",float),("vx",float),("vy",float),("vz",float)])
# Build a new one with a weight column
data = np.empty(data0.size, dtype=[("id",int),("Mgal", float),("radius",float),("spin",float),("posx",float),("posy",float),("posz",float),("vx",float),("vy",float),("vz",float),("w",float)])

for n in data0.dtype.names:
  data[n] = data0[n]

# Set the weight to one
data['w'] = 1

# Write the hdf5
print("Writing catalog")
with h5.File("halo.h5", mode="w") as f:
  f['data'] = data

Radial selection format

The file format for radial selection is the following:

  • First line is : rmin dr numPoints

    • rmin is the minimal distance of the completeness (the first point in the following)

    • dr is the space between two samples

    • numPoints is the number of points

  • Comment lines start with #

  • All following lines are completeness

For example, the following would create a completeness equal to one at five distances, from \(100 \, \mathrm{Mpc} \, h^{-1}\) to \(3300 \, \mathrm{Mpc} \, h^{-1}\) in steps of \(800 \, \mathrm{Mpc} \, h^{-1}\):

# some comment
100 800 5
1
1
1
1
1
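
Such a file can be written and checked with a few lines of Python. The sketch below follows the format described above (comment lines, then “rmin dr numPoints”, then one completeness value per line). The function names are hypothetical, and the reader also skips blank lines, which is an assumption of this sketch.

```python
import os
import tempfile


def write_radial_selection(path, rmin, dr, completeness):
    """Write a radial selection file: header line, then one value per line."""
    values = [float(c) for c in completeness]
    if any(not 0.0 <= c <= 1.0 for c in values):
        raise ValueError("completeness values must lie between 0 and 1")
    with open(path, "w") as f:
        f.write("# radial selection: rmin dr numPoints\n")
        f.write(f"{rmin} {dr} {len(values)}\n")
        for c in values:
            f.write(f"{c}\n")


def read_radial_selection(path):
    """Return a list of (distance, completeness) pairs from the file."""
    with open(path) as f:
        rows = [ln.strip() for ln in f
                if ln.strip() and not ln.lstrip().startswith("#")]
    rmin, dr, n = rows[0].split()
    rmin, dr, n = float(rmin), float(dr), int(n)
    values = [float(v) for v in rows[1:]]
    if len(values) != n:
        raise ValueError(f"expected {n} points, found {len(values)}")
    # Sample i sits at distance rmin + dr * i (zero based).
    return [(rmin + dr * i, c) for i, c in enumerate(values)]


path = os.path.join(tempfile.mkdtemp(), "radial_selection.txt")
write_radial_selection(path, 100.0, 800.0, [1, 1, 1, 1, 1])
for distance, value in read_radial_selection(path):
    print(distance, value)
```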