Manual
Reading and writing
Reading and searching shapefile
Since shape files are always read instead of written in this application, only the reading function readshape
is provided.
Mikrubi.readshape
— Functionreadshape(path::AbstractString, index::Int=-1;
extset = [".shp", ".geojson", ".gpkg"]) :: AG.IFeatureLayer
Read a shape file located at path
. If path
refers to a file, the file is directly read; otherwise, if path
refers to a directory, a random shape file inside is read.
extset
describes possible extensions of shape files (see also readlayers
). By setting extset
to nothing
, the extension filtering is not processed, i.e., all files are regarded as shape files. extset
is indifferent when path
refers to a file.
The shape file should contain a dataset. When the dataset consists of multiple layers, index
indicates which data layer should be returned.
The function lookup
is useful when some attribute (e.g. name or code) of a county is known and the row number of the county in a shapefile is wanted (row number may act as identifiers in the list of occupied counties, see the syntax of fit
).
Mikrubi.lookup
— Functionlookup(shptable::AG.IFeatureLayer,
column::Union{AbstractString, Symbol}, entry)
lookup(shptable::AG.IFeatureLayer,
column::Union{AbstractString, Symbol}, entries::AbstractArray)
lookup(shptable::AG.IFeatureLayer)
Find row(s) in the shape table whose column
record(s) equal(s) to entry
or elements of entries
. When the third argument is an array, results are output as an array of the same shape by broadcasting.
Internal functions
Mikrubi.filterext
— Functionfilterext(dir::AbstractString, extset=nothing) :: Vector{String}
Find all file names in dir
with extensions in extset
. When extset
is set to nothing
(by default), all extensions are acceptable.
Mikrubi.goodcolumns
— Functiongoodcolumns(shptable::AG.IFeatureLayer) :: Dict{String, Vector}
Find all properties of features in shptable
where entries are all unique and either integers or strings (types whose isequal
is well-defined).
Reading and writing list file
List of occupied counties can be prepared explicitly in Julia as a vector or a set. Meanwhile, it is also possible to read from or write to disk such a list, especially when the list is generated outside Julia.
Mikrubi.readlist
— Functionreadlist(path::AbstractString) :: Vector
Read any list of vector from file at path
.
Mikrubi.writelist
— Functionwritelist(path::AbstractString, list::AbstractVector) :: Nothing
Write any list or vector to file at path
.
Reading and writing raster layers
Climatic factors are downloaded and stored as raster layers. Mikrubi reads such layers by readlayers
, performs principal component analysis on them and returns the results as layers also. When the output layers need to be kept for future use, they can be written to disk using writelayers
. Moreover, when the predicted distribution of species is organized in raster format, it can be saved likewise using writelayer
.
Mikrubi.readlayers
— Functionreadlayers(filenames::Vector{<:AbstractString}) :: RasterStack
readlayers(dir::AbstractString; extset=nothing) :: RasterStack
Read all raster layers from the directory dir
as a RasterStack
.
extset
describes possible extensions of raster files (e.g., Set(".tif")
, or [".tiff"]
; see also readshape
). By setting extset
to nothing
, the extension filtering is not processed, i.e., all files are regarded as raster files.
Mikrubi.writelayer
— Functionwritelayer(path::AbstractString, layer::Raster) :: Nothing
Write layer
to the disk at path
. Alias for GeoArrays.write!
.
Mikrubi.writelayers
— Functionwritelayers(paths::AbstractVector{<:AbstractString},
layers::RasterStack) :: Nothing
writelayers(pathformula::AbstractString, layers::RasterStack) :: Nothing
Write layers
to paths
respondingly, or a series of paths generated by the pathformula
where an asterisk is used for wildcard and replaced by numbers.
Internal functions
It is worth mention that when reading layers from a directory, files are sorted according to their names in a manner similar to the sorting order in Windows OS. Please pay extra attention when two parallel raster stacks are fed into makefield
.
Mikrubi.sortfilenames!
— Functionsortfilenames!(filenames::AbstractVector{<:AbstractString})
Sort filenames
in place according to the order of the distinctive parts among them. If all of the distinctive parts are decimal numerals, they are sorted as integers.
Examples
julia> sortfilenames!(["bio_9.tif", "bio_10.tif", "bio_1.tif"])
[ Info: 3 files "bio_*.tif" recognized in the directory, where * = 1, 9, 10.
3-element Array{String,1}:
"bio_1.tif"
"bio_9.tif"
"bio_10.tif"
julia> sortfilenames!(["bio_09.tif", "bio_10.tif", "bio_01.tif"])
[ Info: 3 files "bio_*.tif" recognized in the directory, where * = 01, 09, 10.
3-element Array{String,1}:
"bio_01.tif"
"bio_09.tif"
"bio_10.tif"
Mikrubi.allsame
— Functionallsame(a::AbstractVector) :: Bool
Return true
if all elements from a
are identical, or otherwise false
. An error is thrown if the vector a
is empty.
Examples
julia> allsame([1, 1, 2])
false
julia> allsame([1, 1, 1])
true
julia> allsame([1])
true
julia> allsame([])
ERROR: BoundsError: attempt to access 0-element Array{Any,1} at index [1]
Stacktrace:
[1] getindex at .\array.jl:787 [inlined]
[2] allsame(::Array{Any,1}) at .\REPL[9]:1
[3] top-level scope at REPL[20]:1
Reading and writing Mikrubi fields
MikrubiField
is a specially designed type where the environmental information of pixels and their county identifiers are nested. It may be necessary to save (by writefield
) and load (by readfield
) a Mikrubi field especially when it is used on multiple species.
Mikrubi.readfield
— Functionreadfield(path::AbstractString) :: MikrubiField
Read a Mikrubi field from file at path
.
Mikrubi.writefield
— Functionwritefield(path::AbstractString, field::MikrubiField) :: Nothing
Write a Mikrubi field to file at path
.
Reading and writing Mikrubi models
MikrubiModel
is a struct containing transformation parameters. It can be read from and written to disk using respectively readmodel
and writemodel
.
Mikrubi.readmodel
— Functionreadmodel(path::AbstractString) :: MikrubiModel
Read a Mikrubi model from file at path
.
Mikrubi.writemodel
— Functionwritemodel(path::AbstractString, model::MikrubiModel) :: Nothing
Write a Mikrubi model to file at path
.
Rasterizing a shapefile
Since v1.3.0, Mikrubi no longer provides its own rasterization routine; the implementation from Rasters is applied instead. The function rasterize
in Mikrubi integrates the rasterization of multiple geometries. The returned value is of an internal type Mikrubi.CtPixels
.
Mikrubi.rasterize
— Functionrasterize(geoms, layer::Raster) :: CtPixels
rasterize(shptable::AG.IFeatureLayer, layer::Raster) :: CtPixels
For a collection of (multi)polygons, rasterize each of them and write the results in a CtPixels.
Mikrubi.CtPixels
— TypeCtPixels
CtPixels(indices::Raster{Int})
Collector for county-specific rasterization results, whose list
contains county-pixel tuples. Can only be instantiated from an index raster (see indicate
).
Internal functions
Mikrubi.getpixels
— Functiongetpixels(ctpixels::CtPixels) :: Vector{Int}
Get pixel indices from ctpixels
.
Mikrubi.getcounties
— Functiongetcounties(ctpixels::CtPixels) :: Vector{Int}
Get county indices from ctpixels
.
Mikrubi.getpixel
— Functiongetpixel(ctpixels::CtPixels, i::Int) :: Int
Get the pixel index of the i
-th county-pixel tuple in ctpixels
.
Mikrubi.getcounty
— Functiongetcounty(ctpixels::CtPixels, i::Int) :: Int
Get the county index of the i
-th county-pixel tuple in ctpixels
.
Mikrubi.indicate
— Functionindicate(layer::Raster) :: Raster{Int}
Build an index raster indices
from layer
. The value of an array element in indices
is either (1) 0
for missing, if the corresponding element in layer
is missing; or (2) the integer index of the array element, otherwise.
Mikrubi.register!
— Functionregister!(ctpixels::CtPixels, ct::Int, pixel::Int) :: Int
Push a county-pixel tuple into ctpixels
, if pixel
is not zero. For convenience, the value of pixel
is returned.
register!(ctpixels::CtPixels, ct::Int) :: Function
Create a function that accepts a pixel
, pushes the county-pixel tuple into ctpixels
, and finally returns the value of pixel
.
Mikrubi.ispoly
— Functionispoly(geom) :: Bool
Check if geom
is a polygon or a multipolygon.
Processing the raster layers
In Mikrubi, climatic factors after being read in typically undergo some processing steps together with shapefile inside the function makefield
, which returns a Mikrubi field and a stack of extracted components in raster layers. The two outputs can be used for training and prediction.
Sometimes it is also required to apply a model to another circumstance (different time or different space), in which case another series of parallel climatic factor layers need to be processed in exactly the same way as those used to generate the Mikrubi field (so that their climatic meanings are the same). Such layers need to be put in the third place in the input argument list for makefield
, and those parallelly extracted components are returned in the third place in output as well.
Mikrubi.makefield
— Functionmakefield(layers::RasterStack, ctpixels::CtPixels;
rabsthres=0.8, nprincomp=3) :: Tuple{MikrubiField, RasterStack}
makefield(layers::RasterStack, ctpixels::CtPixels,
players::RasterStack; rabsthres=0.8, nprincomp=3)
:: Tuple{MikrubiField, RasterStack, RasterStack}
makefield(layers::RasterStack, shptable; rabsthres=0.8, nprincomp=3)
:: Tuple{MikrubiField, RasterStack}
makefield(layers::RasterStack, shptable, players::RasterStack;
rabsthres=0.8, nprincomp=3)
:: Tuple{MikrubiField, RasterStack, RasterStack}
Create a MikrubiField
as well as processed variable layers from layers
and shptable
or ctpixels
, by
- (rasterizing the
shptable
toctpixels
usingrasterize
,) - masking the
layers
withctpixels
(usingMikrubi.masklayers!
), - extracting non-missing pixels from
layers
(usingMikrubi.extractlayers
), - selecting less correlated variables (using
Mikrubi.selectvars
), and - doing the principal component analysis (using
Mikrubi.princompvars
).
Optional keyword arguments
rabsthres
: threshold of collinearity.
Absolute value of Pearson correlation efficient greater than this threshold is identified as collinearity and the two variables are thus incompatible.
nprincomp
: expected number of principal components of the variables.
Notes about players
When players
is present in the argument list, raster layers it contains experience the same process including subsetting, selecting, and taking principal components, and results are packed and returned in the third place. User must assure that players
has the same length as layers
, and their elements are corresponding in order. This would be useful when the prediction is in another geographic range or at another time.
Internal functions
Mikrubi.colmatrix
— Functioncolmatrix(vector::AbstractVector) :: AbstractMatrix
colmatrix(matrix::AbstractMatrix) :: AbstractMatrix
Return a one-column matrix if the argument is a vector, or the matrix itself if the argument is already a matrix.
Mikrubi.masklayers!
— Functionmasklayers!(layers::RasterStack, ctpixels::CtPixels) :: RasterStack
Mask the layers
in a way that only pixels present in ctpixels
are kept, while all other uncovered pixels are set to a missing value.
Mikrubi.extractlayers
— Functionextractlayers(layers::RasterStack) :: Tuple{Matrix, Vector{Int}}
Extract the non-missing pixels from layers
, and combine them into a matrix, whose rows representing pixels and columns representing variables.
extractlayers
is the inverse function of makelayers
.
Mikrubi.emptylayer!
— Functionemptylayer!(grid::Raster) :: Raster
Fill the grid
with missing values in place.
Mikrubi.emptylayer
— Functionemptylayer(grid::Raster) :: Raster
Create a new Raster
full of missing values from the shape of grid
.
Mikrubi.emptylayers
— Functionemptylayers(grid::Raster, m::Int) :: RasterStack
Create a RasterStack
with m
empty Raster
s (full of missing values) from the shape of grid
.
Mikrubi.makelayer
— Functionmakelayer(vector::AbstractVector, idx::AbstractVector, grid::Raster)
Make a Raster
from the grid
and values in vector
.
For making a RasterStack
from a matrix, see makelayers
.
Mikrubi.makelayers
— Functionmakelayers(matrix::AbstractMatrix, idx::AbstractVector, grid::Raster)
:: RasterStack
Make a RasterStack
from the grid
and values in columns of matrix
.
For making a Raster
from a column vector, see makelayer
.
makelayers
is the inverse function of extractlayers
.
Mikrubi.dftraverse!
— Functiondftraverse!(beststate, bestscore, state, score, depth, maxdepth,
incompat, scoremat) :: Nothing
Find the index combination that
- firstly containing as many indices as possible, and
- secondly with the lowest pairwise sum from submatrix of
scoremat
,
such that no indices i
and j
coexist as long as incompat[i][j] == true
.
The result is stored as the only element of beststate
, with its score decided by the two criteria above stored as the only element of bestscore
.
Example
julia> beststate = Vector(undef, 1);
julia> bestscore = [(0, 0.0)];
julia> dftraverse!(beststate, bestscore, Int[], (0, 0.0), 1, 3,
Bool[0 0 1; 0 0 0; 1 0 0],
[0.0 0.6 0.3; 0.6 0.0 0.9; 0.3 0.9 0.0]);
julia> beststate
1-element Array{Any,1}:
[1, 2]
julia> bestscore
1-element Array{Tuple{Int64,Float64},1}:
(2, -0.6)
Mikrubi.selectvars
— Functionselectvars(matrix::Matrix, rabsthres=0.8) :: Vector{Int}
Select as many variables as possible from matrix
such that no pairwise Pearson coefficient among them exceeds rabsthres
and their sum is minimal.
Example
julia> selectvars([1. 4. 7.; 2. 5. 8.; 3. 9. 27.], rabsthres=0.9)
2-element Array{Int64,1}:
1
3
Mikrubi.princompvars
— Functionprincompvars(smatrix::Matrix; nprincomp=3) :: Tuple{Vector, Matrix}
Perform principal component analysis on smatrix
whose columns represents variables, and combines the nprincomp
principal components into a matrix, and returns the result matrix as well as the affine transformation (colmean, projwstd)
, such that the result matrix == (smatrix .- colmean) * projwstd
.
Mikrubi.DimLower
— TypeDimLower
DimLower()
A container for transformation used in makefield
, working as a function. If it is new (new=true
), the parameters (colid
, colmean
, and projwstd
) are computed when it is applied on a RasterStack
.
Mikrubi.dimpoints
— Functiondimpoints(grid::Raster) :: DimPoints
Create a DimensionalData.DimPoints
from grid
after shifting all its dimension loci to Center()
. Similar to GeoArrays.coords
.
Mikrubi.centercoords
— Functioncentercoords(dp::DimPoints, ci::CartesianIndices, idx::Int)
:: Tuple{AbstractFloat, AbstractFloat}
Get center coordinates of a grid cell indexed by idx
in a Raster
.
Mikrubi.buildfield
— Functionbuildfield(ctpixels::CtPixels, idx::Vector,
projmat::Matrix, grid::Raster) :: MikrubiField
Construct a MikrubiField
from ctpixels
, idx
, projmat
, and grid
. Used in makefield
.
The Mikrubi core
Two specially designed structs are involved in the core of Mikrubi.
Mikrubi field
MikrubiField
is a struct containing mainly three types of information of pixels/points, that is, which counties they belong to (ctids
), their geographic coordinates (locs
), and their environmental coordinates (vars
), with also some derived assistant attributes, such as geographic dimensionality (usually 2
) and environmental dimensionality (for example, 3
).
MikrubiField
can be obtained in three ways: as first output argument of makefield
, read from disk, or constructed directly from the three required attributes (this may be useful for simulation analysis).
Mikrubi.MikrubiField
— TypeMikrubiField{T, U <: Real, V <: AbstractFloat}
MikrubiField(ctids, locs, vars)
Construct a Mikrubi field containing a number of pixels or points, using the arguments which should always have the same number of rows
ctids::Vector
: a vector containing the county identifierslocs::Array{<:Real}
: an array of geographic coordinatesvars::Matrix{<:AbstractFloat}
: an array of environmental coordinates
Mikrubi model
MikrubiModel
contains the environmental dimensionality and the model parameters to define a positive-definite quadratic mapping from environmental space to a real number axis.
Like MikrubiField
, MikrubiField
can be obtained in three ways: as output argument of fit
, read from disk, or constructed directly from attributes. An example of obtaining Mikrubi field and Mikrubi model from constructors are available in examples/onedimsim/sim.jl
.
Mikrubi.MikrubiModel
— TypeMikrubiModel{V <: AbstractFloat}
MikrubiModel(dvar::Int, params::Vector{<:AbstractFloat})
Construct a Mikrubi Model from a dimensionality dvar
and a parameter vector params
. The equation must hold for dvar2dparam(dvar) == length(params)
.
Mikrubi Models can be obtained from the function fit
, and can be used in the function predict
.
Fitting a Mikrubi model
When a Mikrubi field as well as occurrence data in county and/or in coordinates are ready, they can be used to train a Mikrubi model by function fit
(county data at counties
, required; coordinates at coords
, optional). Result is output as a MikrubiModel
.
Mikrubi.fit
— Functionfit(field::MikrubiField, counties, coords=zeros(0, 0);
optresult=[], iterations=3_000_000, kwargs...) :: MikrubiModel
Numerically find the Mikrubi model maximizing the likelihood that the occupied counties as well as the occupied coordinates are sampled in the given Mikrubi field. The optimization result is stored in the container optresult
for debugging.
Predicting from a Mikrubi model
A Mikrubi model can be applied by function predict
to a matrix with its columns corresponding to extracted variables, a stack of extracted layers, or a Mikrubi field.
- When input argument is a matrix, output argument is a column vector denoting the probability of presence in pixels/points related to rows in the matrix.
- When input argument is a stack of layers, output argument is a single layer denoting the probability of presence.
- When input argument is a Mikrubi field, output argument is a
Dict
which maps every county identifier to probability of presence at pixels inside the county, see alsopredictcounty
.
Mikrubi.predict
— Functionpredict(matrix::AbstractMatrix, model::MikrubiModel) :: Vector
predict(layers::RasterStack, model::MikrubiModel) :: Raster
predict(field::MikrubiField, model::MikrubiModel)
:: Dict{<:Any, <:Vector{<:Tuple{Vector{<:Real}, AbstractFloat}}}
Predict the probability of presence according to processed climatic factors (matrix
/ layers
) or on the Mikrubi field
.
When distribution probability within only one county is concerned, predictcounty
returns probability of presence at all pixels that constitute the county in descending order. Therefore, the first element represents the most likely occupied pixel of a county.
Mikrubi.predictcounty
— Functionpredictcounty(field::MikrubiField, model::MikrubiModel, county)
:: Vector{<:Tuple{Vector{<:Real}, AbstractFloat}}
Return the geographic coordinates of pixels in the county sorted by the likeliness of being occupied.
It is also possible to obtain the overall probability that every county is occupied by the function probcounties
.
Mikrubi.probcounties
— Functionprobcounties(field::MikrubiField, model::MikrubiModel)
:: Dict{<:Any, <:AbstractFloat}
probcounties(::Type{<:Logistic}, field::MikrubiField, model::MikrubiModel)
:: Dict{<:Any, <:Logistic}
Compute the probability for every county to be occupied in the field
.
Sampling counties in a Mikrubi field
For simulation analysis, sometimes it is required to sample a set of counties from a Mikrubi field and a Mikrubi model. samplecounties
does the trick.
Mikrubi.samplecounties
— Functionsamplecounties(field::MikrubiField, model::MikrubiModel) :: Vector
Sample counties according to their probability of being occupied.
Detecting overfitting
Overfitting can be detected with the Lipschitz constant, the (logarithmic) maximum gradient (in norm) of the probability of presence in environmental space.
Mikrubi.lipschitz
— Functionlipschitz(model::MikrubiModel, field::MikrubiField; wholespace=false)
:: AbstractFloat
Calculate the maximum gradient (in norm) of the probability of presence over the field
. When wholespace=false
(default), the maximum is taken among the points contained in field
; otherwise it is taken around the whole space.
Internal functions
Mikrubi.dvar2dparam
— Functiondvar2dparam(dvar::Int) :: Int
Convert dimensionality of an environmental space to the dimensionality of the induced parameter space, i.e., compute the degrees of freedom for positive-definite quadratic functions mapping a dvar
-dimensional linear space into real numbers.
Examples
julia> dvar2dparam(1)
3
julia> dvar2dparam(3)
10
Mikrubi.decomparams
— Functiondecomparams(p::AbstractVector, d::Int) :: Tuple{Matrix, Vector, Any}
decomparams(model::MikrubiModel) :: Tuple{Matrix, Vector, Any}
Return parameter decomposition At
, b
, c
, where
At
is a lower triangular matrix of size(d, d)
,b
is a column vector of sized
, andc
is a scalar.
WARNING: The vector p
must have length dvar2dparam(d)
.
Example
julia> decomparams(collect(1:10), 3)
([1 0 0; 2 3 0; 4 5 6], [7, 8, 9], 10)
Mikrubi.pabsence
— Functionpabsence(vars::AbstractMatrix, params::AbstractVector) :: Logistic
pabsence(field::MikrubiField, params::AbstractVector) :: Logistic
Compute the probability of absence at pixels given vars
/field
and params
.
Mikrubi.ppresence
— Functionppresence(vars::AbstractMatrix, params::AbstractVector) :: Logistic
ppresence(field::MikrubiField, params::AbstractVector) :: Logistic
Compute the probability of presence at pixels given vars
/field
and params
.
Mikrubi.mlogL
— FunctionmlogL(field::MikrubiField, counties, params::AbstractVector)
:: AbstractFloat
mlogL(vars::AbstractMatrix, params::AbstractVector) :: AbstractFloat
Compute the opposite log-likelihood that the occupied counties or occupied coordinates are sampled. The opposite is taken for optimization.
Mikrubi.probpixels
— Functionprobpixels(field::MikrubiField, model::MikrubiModel)
:: Vector{<:AbstractFloat}
Compute the probability for every pixel to be occupied in the field
.
Mikrubi.findnearest
— Functionfindnearest(loc::AbstractVecOrMat{<:Real}, field::MikrubiField) :: Int
Return the row index in field.locs
which is the nearest to the given coordinates.
Mikrubi.findnearests
— Functionfindnearests(loc::AbstractVector{<:AbstractVecOrMat}, field::MikrubiField)
:: Vector{Int}
findnearests(loc::AbstractMatrix{<:Real}, field::MikrubiField)
:: Vector{Int}
Return the row indices in field.locs
which are the nearest to each of the given coordinates. Duplicate results are reduced to one.
Mikrubi.loglipschitz
— Functionloglipschitz(model::MikrubiModel, field::MikrubiField; wholespace=false)
:: AbstractFloat
Calculate the (logarithmic) maximum gradient (in norm) of the probability of presence over the field
. When wholespace=false
(default), the maximum is taken among the points contained in field
; otherwise it is taken around the whole space.
Mikrubi.textwrap
— Functiontextwrap(str::AbstractString) :: String
Gobble all linefeeds ("\n
") inside str
and replaces them with spaces (""), so long strings can be wrapped to multiple lines in the codes, like the Python package "textwrap". See also
tw
.
Mikrubi.@tw_str
— Macro@tw_str :: String
Macro version of textwrap
, without interpolation and unescaping.