Controlling the Analysis with ilamb-run Options
While the basic operation of ilamb-run is covered in previous
tutorials, there are more options which can be used to control what
happens in a run. In this tutorial we describe these options and
explain when they are useful.
Limiting the analysis
The configure file specifies the confrontations which will be performed. However, for many reasons in the development and debugging process it may be advantageous to run ILAMB on only a subset of the configure file. You can control this by specifying strings which must be in the confrontation longname. For example, consider the following arguments:
ilamb-run --config ilamb.cfg --model_root ${ILAMB_ROOT}/MODELS --confrontation CERES
This line will run only the CERES confrontations found in
ilamb.cfg. The same can be achieved for models with the --models
option. In this case, you must specify the names of the models
which you wish to run against in a given run.
To see how this is useful, imagine you have 3 models in your
--model_root directory, but for one you needed to replace all the
model result files. So you need to rerun ILAMB, but not on all three
models. With this option, you can rerun ILAMB on just the needed
model, saving time. If this model name is modelC, then the command
would be:
ilamb-run --config ilamb.cfg --model_root ${ILAMB_ROOT}/MODELS --models modelC --clean
The --clean option here tells ILAMB to rerun the analysis even if
intermediate files are present, essentially cleaning out the
modelC ILAMB contents and recomputing them, while leaving the rest
of the models untouched.
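The substring selection described at the start of this section can be pictured with a short Python sketch. This is not ILAMB's implementation: the function name select_by_substring and the example longnames are hypothetical, and the sketch assumes a case-sensitive match in which a confrontation is kept if any of the given strings appears in its longname.

```python
def select_by_substring(longnames, patterns):
    """Keep the longnames that contain at least one of the given strings.

    Assumption: matching is case-sensitive and an empty pattern list
    means "keep everything". ILAMB's own rules may differ.
    """
    if not patterns:
        return list(longnames)
    return [name for name in longnames if any(p in name for p in patterns)]


# Hypothetical confrontation longnames, for illustration only.
longnames = [
    "Radiation and Energy Cycle/Albedo/CERES",
    "Radiation and Energy Cycle/Albedo/GEWEX.SRB",
    "Ecosystem and Carbon Cycle/Gross Primary Productivity/Fluxnet",
]

selected = select_by_substring(longnames, ["CERES"])
print(selected)
```

Only the first of the three hypothetical longnames contains CERES, so only that entry is kept.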
Defining models
There are two ways to define models for an analysis. The first was
covered in previous tutorials. The --model_root option is used to
specify a location whose subdirectories (not recursive) are
initialized as separate models. This is the recommended method as it
is simple and fast.
One issue that comes up is that model developers want to run ILAMB
during the development process, as a sanity check on model
performance. This means that model results may not be available in the
contemporary period. The --model_year option can be used to shift
the time of all models in an analysis by a fixed number of years. So
if a model run starts in 1850, but you wish to treat these results as
if they started in the year 2000 (for comparing to some dataset), then
the appropriate option is --model_year 1850 2000.
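The arithmetic behind such a shift is a constant offset: with the values 1850 and 2000, every model year moves forward by 2000 - 1850 = 150 years. A minimal Python sketch of that idea follows. The function shift_years is hypothetical and only illustrates the arithmetic; it says nothing about how ILAMB handles calendars or time bounds internally.

```python
def shift_years(years, shift_from, shift_to):
    """Shift every year by the constant offset (shift_to - shift_from)."""
    offset = shift_to - shift_from
    return [year + offset for year in years]


# A run starting in 1850 is treated as if it started in 2000.
shifted = shift_years([1850, 1851, 1900], 1850, 2000)
print(shifted)  # [2000, 2001, 2050]
```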
While helpful, we also acknowledge that globally applying a shift in
time across all models might not be desirable. It is likely that a
user has cached the results from previous versions of the model which
have been spun up and run over the contemporary period. To this end we
provide a more detailed model setup option, --model_setup file.txt.
The contents of file.txt could look something like the following:
# Model Name, Location of Files , Shift From, Shift To
CLM40 , ABSOLUTE/PATH/TO/CLM40
CLM45 , ABSOLUTE/PATH/TO/CLM45
CLM5X , ABSOLUTE/PATH/TO/CLM5X, 1850 , 2000
CLM5Y , ABSOLUTE/PATH/TO/CLM5Y, 1850 , 2000
The text file is in simple comma delimited form with either 2 or 4
columns. Lines which begin with # will be ignored. The first
column is the name which you wish to assign to the model and the
second is the absolute path of the results. The third and fourth
columns define the shift in years for each model. If there are only
two columns of data, we will not apply a shift.
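To make the file format concrete, here is a minimal Python sketch that reads text in the layout described above. It is an illustration under stated assumptions, not the parser ILAMB uses: the function parse_model_setup and the dictionary keys are hypothetical, and skipping blank lines is an assumption the text does not state.

```python
import io

# The sample contents shown above, embedded so the sketch is self-contained.
SAMPLE = """\
# Model Name, Location of Files , Shift From, Shift To
CLM40 , ABSOLUTE/PATH/TO/CLM40
CLM45 , ABSOLUTE/PATH/TO/CLM45
CLM5X , ABSOLUTE/PATH/TO/CLM5X, 1850 , 2000
CLM5Y , ABSOLUTE/PATH/TO/CLM5Y, 1850 , 2000
"""


def parse_model_setup(stream):
    """Return one dict per model: name, path, and an optional year shift."""
    models = []
    for line_no, raw in enumerate(stream, start=1):
        line = raw.strip()
        # Lines beginning with '#' are ignored; blank lines are skipped
        # as well (an assumption of this sketch).
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) == 2:
            name, path = fields
            shift = None  # two columns: no shift is applied
        elif len(fields) == 4:
            name, path = fields[:2]
            shift = (int(fields[2]), int(fields[3]))
        else:
            raise ValueError(
                f"line {line_no}: expected 2 or 4 columns, got {len(fields)}"
            )
        models.append({"name": name, "path": path, "shift": shift})
    return models


models = parse_model_setup(io.StringIO(SAMPLE))
for model in models:
    print(model["name"], model["path"], model["shift"])
```

Representing a missing shift as None keeps the two-column and four-column cases distinct, so archived models are never shifted by accident.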
To add some context, this option may be useful in the model development process. In our sample setup, we have two model versions, CLM4 and CLM4.5, whose results are archived and will not change, and thus do not need to be time shifted. We have set up two versions of CLM5, X and Y, which represent perhaps different parameterization choices, shifted because we have not spun these models up. The ILAMB results should be interpreted carefully, but comparing two parameterizations in this way might provide insight into key differences.
Regions
The ILAMB analysis can be performed on an arbitrary number of regions which may be defined in many ways. The ILAMB package comes with a predefined set of regions, the ones used in the Global Fire Emissions Database. They are:
bona, Boreal North America
tena, Temperate North America
ceam, Central America
nhsa, Northern Hemisphere South America
shsa, Southern Hemisphere South America
euro, Europe
mide, Middle East
nhaf, Northern Hemisphere Africa
shaf, Southern Hemisphere Africa
boas, Boreal Asia
ceas, Central Asia
seas, Southeast Asia
eqas, Equatorial Asia
aust, Australia
The first entry in the above list is a region label. To avoid
confusion these should not have spaces or special characters. The
second entry is the name itself which will appear in the pull down
menus on the webpage output. To run the ILAMB analysis over particular
regions, use the --regions option and include the region labels
delimited by spaces.
As we anticipate that users will desire to define their own regions, we have provided this capability in two forms. The first is region definition by latitude and longitude bounds which can be done in the form of a text file in the following comma delimited format:
#label,name ,lat_min,lat_max,lon_min,lon_max
usa,Continental US, 24, 50, -126, -66
Additional rows in the same format may be included to define more regions in the same file. The first column is the label to be used, followed by the region name. Then the minimum and maximum bounds on the latitude and longitude are specified. Note that longitude values are expected on the [-180,180] interval. In the current iteration, regions which span the international date line cannot be specified.
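The bounds format and its constraints can be illustrated with a minimal Python sketch. It is not ILAMB's reader: the functions parse_regions and contains are hypothetical, and treating lines beginning with # as comments is an assumption based on the header line of the sample. The sketch rejects rows whose longitudes fall outside [-180,180] or whose minimum is not below the maximum, which is one way to express the date line restriction.

```python
import io

# The sample region definition shown above.
SAMPLE_REGIONS = """\
#label,name          ,lat_min,lat_max,lon_min,lon_max
usa,Continental US,     24,     50,   -126,    -66
"""


def parse_regions(stream):
    """Parse 'label,name,lat_min,lat_max,lon_min,lon_max' rows into a dict."""
    regions = {}
    for line_no, raw in enumerate(stream, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):  # assumed comment convention
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 6:
            raise ValueError(f"line {line_no}: expected 6 columns, got {len(fields)}")
        label, name = fields[:2]
        lat_min, lat_max, lon_min, lon_max = (float(v) for v in fields[2:])
        if not -90 <= lat_min < lat_max <= 90:
            raise ValueError(f"line {line_no}: invalid latitude bounds")
        # Requiring lon_min < lon_max on [-180, 180] rules out regions
        # that would wrap across the date line.
        if not -180 <= lon_min < lon_max <= 180:
            raise ValueError(f"line {line_no}: invalid longitude bounds")
        regions[label] = {
            "name": name,
            "lat": (lat_min, lat_max),
            "lon": (lon_min, lon_max),
        }
    return regions


def contains(region, lat, lon):
    """Return True if the point (lat, lon) lies inside the region's bounds."""
    (lat_min, lat_max), (lon_min, lon_max) = region["lat"], region["lon"]
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


regions = parse_regions(io.StringIO(SAMPLE_REGIONS))
print(regions["usa"]["name"])
print(contains(regions["usa"], 40.0, -105.0))  # a point inside the box
print(contains(regions["usa"], 51.5, -0.1))    # a point outside the box
```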
The second form is by creating a mask using a netCDF4 file. We will go
into more detail about the format of the netCDF4 file for defining
masks in its own tutorial. So if the sample text file above is called
regions.txt and we have a netCDF4 file called amazon.nc with a region
label amazon, then the ILAMB analysis can be performed over additional
regions by specifying:
--define_regions regions.txt amazon.nc --regions global usa amazon
In its current form, ILAMB expects that the analysis will be performed
over at least the global region. All overall scores are based on
information in that region. This is a restriction we are working to
loosen. If you need to circumvent this, you can redefine the region
labeled global to meet your needs.
Other options
--filter, Sometimes a model has output from several runs or experiments included in the same location. This is frequently indicated by some string in the filename, such as r0i0p0. This option may be used to require that files contain a specific string to be considered in the list of variables models provide.
--skip_plots, The plotting phase of ILAMB is expensive. It takes a long time to generate all the thousands of graphics that get produced. It may be that you are running ILAMB for the summary graphic/information only. In this case you can run with this option to speed up the run.
--build_dir, The default location for generating the ILAMB output is a _build directory placed in the directory from which you ran ilamb-run. While fine for everyday use, you may wish to control the location of this directory.
--disable_logging, ILAMB uses an MPI logger to write exceptions and progress to a log file in a thread-lock fashion. This helps tremendously when tracking down user errors. However, we have found that on some systems (e.g. geyser at NCAR) this causes ilamb-run to lock for reasons we do not yet understand. Disabling the logging seems to circumvent the issue. If you find that ilamb-run makes no progress when running in parallel, you might try this option.
--quiet, By default, ILAMB prints progress information to the screen. If you wish to suppress this information, run with this option.
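The effect of --filter can be pictured as a substring test on file names. The Python sketch below illustrates that idea only: the function files_matching and the file names are hypothetical, and whether ILAMB tests the full path or just the file name is not stated here.

```python
def files_matching(filenames, required):
    """Keep only the file names that contain the required string."""
    return [name for name in filenames if required in name]


# Hypothetical model output files from two different runs.
candidates = [
    "gpp_modelC_r0i0p0.nc",
    "gpp_modelC_r1i1p1.nc",
    "tas_modelC_r0i0p0.nc",
]

kept = files_matching(candidates, "r0i0p0")
print(kept)
```

With the string r0i0p0, the file from the other run is dropped and only the two matching files remain candidates for providing variables.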