Defining Custom RegionsΒΆ
In the tutorial explaining the options of
ilamb-run
, we highlight that custom regions may be defined in two
ways. The first is region definition by latitude and longitude bounds
which can be done in the form of a text file in the following comma
delimited format:
#label,name ,lat_min,lat_max,lon_min,lon_max
usa ,Continental US, 24, 50, -126, -66
alaska,Alaska , 53, 72, -169, -129
The first column is the label to be used, followed by the region name. Then the minimum and maximum bounds on the latitude and longitude are specified. Note that longitude values are expected on the [-180,180] interval. In this current iteration regions cannot be specified which span the international dateline.
The second method is by creating a netCDF4 file which will be used internally to create a mask for each region. This we will demonstrate by encoding the above regions but in this format. First we create the spatial grid on which we will define the regions.
from netCDF4 import Dataset
import numpy as np
# Create the lat/lon dimensions
res = 0.5
latbnd = np.asarray([np.arange(- 90 , 90 ,res),
np.arange(- 90+res, 90+0.01,res)]).T
lonbnd = np.asarray([np.arange(-180 ,180 ,res),
np.arange(-180+res,180+0.01,res)]).T
lat = latbnd.mean(axis=1)
lon = lonbnd.mean(axis=1)
Next we create an array of integers which we will use to mark the regions we wish to encode. This is essentially painting by numbers. We initialize the array to a missing value which we will encode later.
# Create the number array, initialize to a missing value
miss = -999
ids = np.ones((lat.size,lon.size),dtype=int)*miss
Then we paint the regions we wish to encode using the latitude and longitude bounds which were in the sample text file above. This part will vary depending on how you wish to define regions. For example, our regions here will still appear to be defined by latitude and longitude bounds because that is how we are creating the mask. You may find other sources for your region definitions which will allow more precise representations. Note that this method of definition means that regions cannot overlap in a single file. If you need to define overlapping regions, put each region in a separate file.
# Paint the Continental US with a `0`
ids[np.where(np.outer((lat>= 24)*(lat<= 50),
(lon>=-126)*(lon<=- 66)))] = 0
# Paint Alaska with a `1`
ids[np.where(np.outer((lat>= 53)*(lat<= 72),
(lon>=-169)*(lon<=-129)))] = 1
Next we convert the numpy
integer array to a masked array where we
mask by the missing value we defined above. Then we create an array of
labels to use as indentifiers for the integer numbers we defined. A
0
in the ids
array will correspond to the USA
region and a
1
to the Alaska
region. These lower case version of these
names will be used as region labels.
# Convert the ids to a masked array
ids = np.ma.masked_values(ids,miss)
# Create the array of labels
lbl = np.asarray(["USA","Alaska"])
Finally we encode the netCDF4 dataset. There are a few important
details in this code. The first is to use the numpy
datatypes of
the arrays when creating netCDF4 variables. This is especially
important in encoding the labels
array as it will ensure the
string array is created properly. The other important detail is to
encode the labels
attribute of the I
variable. This is what
tells the ILAMB system where to find the labels for the integers
defined in the array.
# Create netCDF dimensions
dset = Dataset("regions.nc",mode="w")
dset.createDimension("lat" ,size=lat.size)
dset.createDimension("lon" ,size=lon.size)
dset.createDimension("nb" ,size=2 )
dset.createDimension("n" ,size=lbl.size)
# Create netCDF variables
X = dset.createVariable("lat" ,lat.dtype,("lat" ))
XB = dset.createVariable("lat_bounds" ,lat.dtype,("lat","nb" ))
Y = dset.createVariable("lon" ,lon.dtype,("lon" ))
YB = dset.createVariable("lon_bounds" ,lon.dtype,("lon","nb" ))
I = dset.createVariable("ids" ,ids.dtype,("lat","lon"))
L = dset.createVariable("labels" ,lbl.dtype,("n" ))
# Load data and encode attributes
X [...] = lat
X.units = "degrees_north"
XB[...] = latbnd
Y [...] = lon
Y.units = "degrees_east"
YB[...] = lonbnd
I[...] = ids
I.labels= "labels"
L[...] = lbl
dset.close()