Manual
Reading and writing
Reading and searching shapefile
Since shape files are always read instead of written in this application, only the reading function readshape is provided.
Mikrubi.readshape — Function
readshape(path::AbstractString, index::Int=-1;
extset = [".shp", ".geojson", ".gpkg"]) :: AG.IFeatureLayer
Read a shape file located at path. If path refers to a file, the file is directly read; otherwise, if path refers to a directory, a random shape file inside is read.
extset describes possible extensions of shape files (see also readlayers). By setting extset to nothing, the extension filtering is not processed, i.e., all files are regarded as shape files. extset is ignored when path refers to a file.
The shape file should contain a dataset. When the dataset consists of multiple layers, index indicates which data layer should be returned.
The function lookup is useful when some attribute (e.g. name or code) of a county is known and the row number of the county in a shapefile is wanted (row numbers may act as identifiers in the list of occupied counties; see the syntax of fit).
Mikrubi.lookup — Function
lookup(shptable::AG.IFeatureLayer,
column::Union{AbstractString, Symbol}, entry)
lookup(shptable::AG.IFeatureLayer,
column::Union{AbstractString, Symbol}, entries::AbstractArray)
lookup(shptable::AG.IFeatureLayer)
Find row(s) in the shape table whose column record(s) equal entry or elements of entries. When the third argument is an array, results are output as an array of the same shape by broadcasting.
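The broadcasting behaviour described above can be illustrated without a shapefile. The following is only a sketch on a plain vector of named tuples; lookup_sketch and rows are hypothetical names introduced here and are not part of Mikrubi, whose lookup operates on an AG.IFeatureLayer.

```julia
# Hypothetical stand-in for a shape table: one NamedTuple per row.
rows = [(name="Alpha", code=101), (name="Beta", code=102), (name="Gamma", code=103)]

# Scalar case: row number of the first record whose `column` equals `entry`
# (returns `nothing` when no record matches).
lookup_sketch(rows, column::Symbol, entry) =
    findfirst(r -> isequal(getproperty(r, column), entry), rows)

# Array case: broadcast the scalar method, so the result has the shape of `entries`.
lookup_sketch(rows, column::Symbol, entries::AbstractArray) =
    lookup_sketch.(Ref(rows), column, entries)

lookup_sketch(rows, :code, 102)                  # a single row number
lookup_sketch(rows, :name, ["Gamma" "Alpha"])    # a 1×2 matrix of row numbers
```

The row numbers obtained this way play the role of the county identifiers that are later passed to fit.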
Internal functions
Mikrubi.filterext — Function
filterext(dir::AbstractString, extset=nothing) :: Vector{String}
Find all file names in dir with extensions in extset. When extset is set to nothing (by default), all extensions are acceptable.
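As a rough illustration of the extension filtering, the sketch below works on a list of names instead of a directory so that it runs anywhere. filterext_sketch is a hypothetical helper, and the exact matching rules (for instance case sensitivity) are an assumption rather than the behaviour of Mikrubi.filterext.

```julia
# Hypothetical helper, not part of Mikrubi: filter file names by extension.
function filterext_sketch(names::AbstractVector{<:AbstractString}, extset=nothing)
    # `nothing` disables filtering, so every name is accepted.
    extset === nothing && return collect(names)
    # `splitext` returns (stem, extension); the extension keeps its leading dot.
    return [name for name in names if last(splitext(name)) in extset]
end

names = ["bio_1.tif", "notes.txt", "counties.shp"]
filterext_sketch(names, [".tif"])   # keeps only the raster file
filterext_sketch(names)             # keeps everything
```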
Mikrubi.goodcolumns — Function
goodcolumns(shptable::AG.IFeatureLayer) :: Dict{String, Vector}
Find all properties of features in shptable where entries are all unique and either integers or strings (types whose isequal is well-defined).
Reading and writing list file
A list of occupied counties can be prepared explicitly in Julia as a vector or a set. It is also possible to read such a list from disk or write it to disk, especially when the list is generated outside Julia.
Mikrubi.readlist — Function
readlist(path::AbstractString) :: Vector
Read any list or vector from file at path.
Mikrubi.writelist — Function
writelist(path::AbstractString, list::AbstractVector) :: Nothing
Write any list or vector to file at path.
Reading and writing raster layers
Climatic factors are downloaded and stored as raster layers. Mikrubi reads such layers by readlayers, performs principal component analysis on them and returns the results as layers also. When the output layers need to be kept for future use, they can be written to disk using writelayers. Moreover, when the predicted distribution of species is organized in raster format, it can be saved likewise using writelayer.
Mikrubi.readlayers — Function
readlayers(filenames::Vector{<:AbstractString}) :: RasterStack
readlayers(dir::AbstractString; extset=nothing) :: RasterStack
Read all raster layers from the directory dir as a RasterStack.
extset describes possible extensions of raster files (e.g., Set([".tif"]), or [".tiff"]; see also readshape). By setting extset to nothing, the extension filtering is not processed, i.e., all files are regarded as raster files.
Mikrubi.writelayer — Function
writelayer(path::AbstractString, layer::Raster) :: Nothing
Write layer to the disk at path. Alias for GeoArrays.write!.
Mikrubi.writelayers — Function
writelayers(paths::AbstractVector{<:AbstractString},
layers::RasterStack) :: Nothing
writelayers(pathformula::AbstractString, layers::RasterStack) :: Nothing
Write layers to the corresponding paths, or to a series of paths generated from pathformula, in which an asterisk acts as a wildcard and is replaced by numbers.
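To make the path formula concrete, here is a minimal sketch of how an asterisk could be expanded into one path per layer. expandformula is a hypothetical helper, and the numbering (starting at 1, without zero padding) is an assumption, not a statement about how writelayers numbers its files.

```julia
# Hypothetical helper, not part of Mikrubi: expand "*" into 1, 2, ..., n.
# Assumption: numbers start at 1 and are not zero-padded.
expandformula(pathformula::AbstractString, n::Integer) =
    [replace(pathformula, "*" => string(i)) for i in 1:n]

paths = expandformula("pca/pc_*.tif", 3)
```

A vector such as paths has the form accepted by the first method of writelayers, which takes one path per layer.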
Internal functions
It is worth mentioning that when reading layers from a directory, files are sorted according to their names in a manner similar to the sorting order in Windows OS. Please pay extra attention when two parallel raster stacks are fed into makefield.
Mikrubi.sortfilenames! — Function
sortfilenames!(filenames::AbstractVector{<:AbstractString})
Sort filenames in place according to the order of the distinctive parts among them. If all of the distinctive parts are decimal numerals, they are sorted as integers.
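The idea of sorting by distinctive parts can be sketched as follows: strip the longest common prefix and suffix, then compare what remains, numerically when possible. sortbydistinct is a hypothetical, non-mutating helper written for illustration; it assumes ASCII file names and does not reproduce the logging or edge-case handling of sortfilenames!.

```julia
# Hypothetical sketch, not Mikrubi's implementation. Assumes ASCII file names.
function sortbydistinct(filenames::AbstractVector{<:AbstractString})
    n = minimum(length, filenames)
    ref = first(filenames)
    # Length of the longest common prefix.
    p = 0
    while p < n && all(f -> f[p+1] == ref[p+1], filenames)
        p += 1
    end
    # Length of the longest common suffix that does not overlap the prefix.
    s = 0
    while s < n - p && all(f -> f[end-s] == ref[end-s], filenames)
        s += 1
    end
    # Distinctive parts are what remains between prefix and suffix.
    parts = [f[p+1:end-s] for f in filenames]
    numeric = all(part -> !isempty(part) && all(isdigit, part), parts)
    # Sort as integers when every part is a decimal numeral, else as strings.
    perm = numeric ? sortperm(parse.(Int, parts)) : sortperm(parts)
    return filenames[perm]
end

sorted = sortbydistinct(["bio_9.tif", "bio_10.tif", "bio_1.tif"])
```

Sorting the parts as integers is what places bio_9.tif before bio_10.tif, whereas a plain string sort would put bio_10.tif first.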
Examples
julia> sortfilenames!(["bio_9.tif", "bio_10.tif", "bio_1.tif"])
[ Info: 3 files "bio_*.tif" recognized in the directory, where * = 1, 9, 10.
3-element Array{String,1}:
"bio_1.tif"
"bio_9.tif"
"bio_10.tif"
julia> sortfilenames!(["bio_09.tif", "bio_10.tif", "bio_01.tif"])
[ Info: 3 files "bio_*.tif" recognized in the directory, where * = 01, 09, 10.
3-element Array{String,1}:
"bio_01.tif"
"bio_09.tif"
"bio_10.tif"
Mikrubi.allsame — Function
allsame(a::AbstractVector) :: Bool
Return true if all elements from a are identical, or otherwise false. An error is thrown if the vector a is empty.
Examples
julia> allsame([1, 1, 2])
false
julia> allsame([1, 1, 1])
true
julia> allsame([1])
true
julia> allsame([])
ERROR: BoundsError: attempt to access 0-element Array{Any,1} at index [1]
Stacktrace:
[1] getindex at .\array.jl:787 [inlined]
[2] allsame(::Array{Any,1}) at .\REPL[9]:1
[3] top-level scope at REPL[20]:1
Reading and writing Mikrubi fields
MikrubiField is a specially designed type where the environmental information of pixels and their county identifiers are nested. It may be necessary to save (by writefield) and load (by readfield) a Mikrubi field especially when it is used on multiple species.
Mikrubi.readfield — Function
readfield(path::AbstractString) :: MikrubiField
Read a Mikrubi field from file at path.
Mikrubi.writefield — Function
writefield(path::AbstractString, field::MikrubiField) :: Nothing
Write a Mikrubi field to file at path.
Reading and writing Mikrubi models
MikrubiModel is a struct containing transformation parameters. It can be read from and written to disk using respectively readmodel and writemodel.
Mikrubi.readmodel — Function
readmodel(path::AbstractString) :: MikrubiModel
Read a Mikrubi model from file at path.
Mikrubi.writemodel — Function
writemodel(path::AbstractString, model::MikrubiModel) :: Nothing
Write a Mikrubi model to file at path.
Rasterizing a shapefile
Since v1.3.0, Mikrubi no longer provides its own rasterization routine; the implementation from Rasters is applied instead. The function rasterize in Mikrubi integrates the rasterization of multiple geometries. The returned value is of an internal type Mikrubi.CtPixels.
Mikrubi.rasterize — Function
rasterize(geoms, layer::Raster) :: CtPixels
rasterize(shptable::AG.IFeatureLayer, layer::Raster) :: CtPixels
For a collection of (multi)polygons, rasterize each of them and write the results in a CtPixels.
Mikrubi.CtPixels — Type
CtPixels
CtPixels(indices::Raster{Int})
Collector for county-specific rasterization results, whose list contains county-pixel tuples. Can only be instantiated from an index raster (see indicate).
Internal functions
Mikrubi.getpixels — Function
getpixels(ctpixels::CtPixels) :: Vector{Int}
Get pixel indices from ctpixels.
Mikrubi.getcounties — Function
getcounties(ctpixels::CtPixels) :: Vector{Int}
Get county indices from ctpixels.
Mikrubi.getpixel — Function
getpixel(ctpixels::CtPixels, i::Int) :: Int
Get the pixel index of the i-th county-pixel tuple in ctpixels.
Mikrubi.getcounty — Function
getcounty(ctpixels::CtPixels, i::Int) :: Int
Get the county index of the i-th county-pixel tuple in ctpixels.
Mikrubi.indicate — Function
indicate(layer::Raster) :: Raster{Int}
Build an index raster indices from layer. The value of an array element in indices is either (1) 0 for missing, if the corresponding element in layer is missing; or (2) the integer index of the array element, otherwise.
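The index raster can be pictured with a plain matrix in place of a Raster. In the sketch below, indicate_sketch is a hypothetical name, and the use of column-major linear indices is an assumption about what "the integer index of the array element" means.

```julia
# Hypothetical sketch on a plain array, not Mikrubi's implementation.
# Assumption: the "integer index" is the column-major linear index.
indicate_sketch(layer::AbstractArray) =
    map((i, v) -> ismissing(v) ? 0 : i, LinearIndices(layer), layer)

layer = [1.5 missing; missing 4.5]
indices = indicate_sketch(layer)   # zeros mark the missing cells
```

Because zero never occurs as a valid index, it can safely flag missing cells, which matches the rule that register! skips pixels equal to zero.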
Mikrubi.register! — Function
register!(ctpixels::CtPixels, ct::Int, pixel::Int) :: Int
Push a county-pixel tuple into ctpixels, if pixel is not zero. For convenience, the value of pixel is returned.
register!(ctpixels::CtPixels, ct::Int) :: Function
Create a function that accepts a pixel, pushes the county-pixel tuple into ctpixels, and finally returns the value of pixel.
Mikrubi.ispoly — Function
ispoly(geom) :: Bool
Check if geom is a polygon or a multipolygon.
Processing the raster layers
In Mikrubi, climatic factors after being read in typically undergo some processing steps together with shapefile inside the function makefield, which returns a Mikrubi field and a stack of extracted components in raster layers. The two outputs can be used for training and prediction.
Mikrubi.makefield — Function
makefield(layers::RasterStack, shptable;
config=DimLowerConfig(rabsthres=0.8, nprincomp=3))
:: Tuple{MikrubiField, RasterStack}
Create a MikrubiField as well as processed variable layers from layers and shptable, by
- masking the layers with ctpixels (using Mikrubi.masklayers!),
- extracting non-missing pixels from layers (using Mikrubi.extractlayers),
- selecting less correlated variables (using Mikrubi.selectvars), and
- doing the principal component analysis (using Mikrubi.princompvars).
Configuration including rabsthres and nprincomp can be passed in as a keyword argument; see DimLowerConfig.
A less detailed version of yieldfield.
Since v1.4.0, Mikrubi supports parallel dimensionality reduction. Sometimes a model needs to be applied to another circumstance (a different time or a different space), in which case another series of parallel climatic factor layers needs to be processed in exactly the same way as those used to generate the Mikrubi field (so that their climatic meanings are the same). For this purpose, call the more detailed version yieldfield for the current circumstance, then pass dimlower, its third output, to a second call of yieldfield for the other circumstance.
Mikrubi.yieldfield — Function
yieldfield(layers::RasterStack, ctpixels::CtPixels,
dimlowerorconfig=DimLowerConfig(rabsthres=0.8, nprincomp=3)) ::
Tuple{MikrubiField, RasterStack, DimLower}
yieldfield(layers::RasterStack, shptable,
dimlowerorconfig=DimLowerConfig(rabsthres=0.8, nprincomp=3)) ::
Tuple{MikrubiField, RasterStack, DimLower}
Return a MikrubiField, processed variable layers, and the DimLower.
A more detailed version of makefield.
To adjust how Mikrubi reduces the dimensionality of the variables, configuration should be provided in a DimLowerConfig instance.
Mikrubi.DimLowerConfig — Type
DimLowerConfig
DimLowerConfig(; rabsthres=0.8, nprincomp=3)
A type for configuration for dimensionality reduction.
Keyword arguments
rabsthres: threshold of collinearity. An absolute value of the Pearson correlation coefficient greater than this threshold is identified as collinearity, and the two variables are thus incompatible.
nprincomp: expected number of principal components of the variables.
Internal functions
Mikrubi.colmatrix — Function
colmatrix(vector::AbstractVector) :: AbstractMatrix
colmatrix(matrix::AbstractMatrix) :: AbstractMatrix
Return a one-column matrix if the argument is a vector, or the matrix itself if the argument is already a matrix.
Mikrubi.masklayers! — Function
masklayers!(layers::RasterStack, ctpixels::CtPixels) :: RasterStack
Mask the layers in a way that only pixels present in ctpixels are kept, while all other uncovered pixels are set to a missing value.
Mikrubi.extractlayers — Function
extractlayers(layers::RasterStack) :: Tuple{Matrix, Vector{Int}}
Extract the non-missing pixels from layers, and combine them into a matrix whose rows represent pixels and whose columns represent variables.
extractlayers is the inverse function of makelayers.
Mikrubi.emptylayer! — Function
emptylayer!(grid::Raster) :: Raster
Fill the grid with missing values in place.
Mikrubi.emptylayer — Function
emptylayer(grid::Raster) :: Raster
Create a new Raster full of missing values from the shape of grid.
Mikrubi.emptylayers — Function
emptylayers(grid::Raster, m::Int) :: RasterStack
Create a RasterStack with m empty Rasters (full of missing values) from the shape of grid.
Mikrubi.makelayer — Function
makelayer(vector::AbstractVector, idx::AbstractVector, grid::Raster)
:: Raster
Make a Raster from the grid and values in vector.
For making a RasterStack from a matrix, see makelayers.
Mikrubi.makelayers — Function
makelayers(matrix::AbstractMatrix, idx::AbstractVector, grid::Raster)
:: RasterStack
Make a RasterStack from the grid and values in columns of matrix.
For making a Raster from a column vector, see makelayer.
makelayers is the inverse function of extractlayers.
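The inverse relationship between extractlayers and makelayers can be sketched with plain matrices standing in for raster layers. extract_sketch and make_sketch are hypothetical names; the real functions work on RasterStack and Raster objects, and treating a pixel as non-missing only when every layer has a value there is an assumption of this sketch.

```julia
# Hypothetical sketches on plain arrays, not Mikrubi's implementation.

# Collect pixels that are non-missing in every layer into a pixels × variables matrix.
function extract_sketch(layers::Vector{<:AbstractMatrix})
    idx = findall(i -> all(l -> !ismissing(l[i]), layers), eachindex(first(layers)))
    matrix = [Float64(l[i]) for i in idx, l in layers]
    return matrix, idx
end

# Put each column of `matrix` back onto a grid of the same shape as `grid`.
function make_sketch(matrix::AbstractMatrix, idx::AbstractVector, grid::AbstractMatrix)
    map(1:size(matrix, 2)) do j
        layer = Array{Union{Missing, Float64}}(missing, size(grid))
        layer[idx] = matrix[:, j]
        layer
    end
end

a = [1.0 missing; 3.0 4.0]
b = [10.0 20.0; missing 40.0]
matrix, idx = extract_sketch([a, b])
rebuilt = make_sketch(matrix, idx, a)
```

In this sketch, cells that are missing in any layer come back as missing in all rebuilt layers, so the round trip is exact only for layers that share the same missing cells.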
Mikrubi.dftraverse! — Function
dftraverse!(beststate, bestscore, state, score, depth, maxdepth,
incompat, scoremat) :: Nothing
Find the index combination that
- firstly contains as many indices as possible, and
- secondly has the lowest pairwise sum from the submatrix of scoremat,
such that no indices i and j coexist as long as incompat[i][j] == true.
The result is stored as the only element of beststate, with its score decided by the two criteria above stored as the only element of bestscore.
Example
julia> beststate = Vector(undef, 1);
julia> bestscore = [(0, 0.0)];
julia> dftraverse!(beststate, bestscore, Int[], (0, 0.0), 1, 3,
Bool[0 0 1; 0 0 0; 1 0 0],
[0.0 0.6 0.3; 0.6 0.0 0.9; 0.3 0.9 0.0]);
julia> beststate
1-element Array{Any,1}:
[1, 2]
julia> bestscore
1-element Array{Tuple{Int64,Float64},1}:
(2, -0.6)
Mikrubi.selectvars — Function
selectvars(matrix::Matrix, rabsthres=0.8) :: Vector{Int}
Select as many variables as possible from matrix such that no pairwise Pearson coefficient among them exceeds rabsthres and their sum is minimal.
Example
julia> selectvars([1. 4. 7.; 2. 5. 8.; 3. 9. 27.], rabsthres=0.9)
2-element Array{Int64,1}:
1
3
Mikrubi.princompvars — Function
princompvars(smatrix::Matrix; nprincomp=3) :: Tuple{Vector, Matrix}
Perform principal component analysis on smatrix, whose columns represent variables, combine the nprincomp principal components into a matrix, and return the result matrix as well as the affine transformation (colmean, projwstd), such that the result matrix == (smatrix .- colmean) * projwstd.
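The affine form (smatrix .- colmean) * projwstd is what makes parallel dimensionality reduction possible: once colmean and projwstd are known, they can be reapplied to another matrix of the same variables. The sketch below shows one way to obtain such a pair with the standard libraries only. pca_affine is a hypothetical helper, and dividing by the column standard deviation is an assumption suggested by the name projwstd, not a confirmed detail of princompvars.

```julia
using LinearAlgebra, Statistics

# Hypothetical sketch, not Mikrubi's implementation.
# Assumption: variables are standardized before the eigen-decomposition.
function pca_affine(smatrix::AbstractMatrix; nprincomp::Int=2)
    colmean = mean(smatrix, dims=1)              # 1 × nvars row of column means
    colstd = std(smatrix, dims=1)                # 1 × nvars row of column deviations
    z = (smatrix .- colmean) ./ colstd           # standardized variables
    vals, vecs = eigen(Symmetric(cov(z)))        # eigenvalues in ascending order
    order = sortperm(vals, rev=true)[1:nprincomp]
    # Fold the scaling into the projection so a single matrix product suffices.
    projwstd = Diagonal(vec(1 ./ colstd)) * vecs[:, order]
    return colmean, projwstd
end

x = [1.0 2.0 0.5; 2.0 1.0 1.5; 3.0 4.0 2.0; 4.0 3.0 4.5; 5.0 6.0 4.0]
colmean, projwstd = pca_affine(x; nprincomp=2)
scores = (x .- colmean) * projwstd

# Reusing the same transformation on parallel data (e.g. another period).
y = x .+ 0.1
scores_y = (y .- colmean) * projwstd
```

Applying the stored pair to y, instead of running a fresh analysis on y, keeps the meaning of each component identical across the two data sets.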
Mikrubi.DimLower — Type
DimLower
DimLower{T<:AbstractFloat}(n::Int, colid::Vector{Int},
colmean::Matrix{T}, projwstd::Matrix{T})
DimLower(n::Integer, colid::AbstractVector{<:Integer},
colmean::AbstractMatrix{<:Real}, projwstd::AbstractMatrix{<:Real})
A container for transformation parameters used in makefield.
Mikrubi.dimpoints — Function
dimpoints(grid::Raster) :: DimPoints
Create a DimensionalData.DimPoints from grid after shifting all its dimension loci to Center(). Similar to GeoArrays.coords.
Mikrubi.centercoords — Function
centercoords(dp::DimPoints, ci::CartesianIndices, idx::Int)
:: Tuple{AbstractFloat, AbstractFloat}
Get center coordinates of a grid cell indexed by idx in a Raster.
Mikrubi.buildfield — Function
buildfield(ctpixels::CtPixels, idx::Vector,
projmat::Matrix, grid::Raster) :: MikrubiField
Construct a MikrubiField from ctpixels, idx, projmat, and grid. Used in makefield.
The Mikrubi core
Two specially designed structs are involved in the core of Mikrubi.
Mikrubi field
MikrubiField is a struct containing mainly three types of information of pixels/points, that is, which counties they belong to (ctids), their geographic coordinates (locs), and their environmental coordinates (vars), with also some derived assistant attributes, such as geographic dimensionality (usually 2) and environmental dimensionality (for example, 3).
MikrubiField can be obtained in three ways: as first output argument of makefield, read from disk, or constructed directly from the three required attributes (this may be useful for simulation analysis).
Mikrubi.MikrubiField — Type
MikrubiField{T, U <: Real, V <: AbstractFloat}
MikrubiField(ctids, locs, vars)
Construct a Mikrubi field containing a number of pixels or points, using the arguments which should always have the same number of rows
- ctids::Vector: a vector containing the county identifiers
- locs::Array{<:Real}: an array of geographic coordinates
- vars::Matrix{<:AbstractFloat}: an array of environmental coordinates
Mikrubi model
MikrubiModel contains the environmental dimensionality and the model parameters to define a positive-definite quadratic mapping from environmental space to a real number axis.
Like MikrubiField, MikrubiModel can be obtained in three ways: as the output argument of fit, read from disk, or constructed directly from attributes. An example of obtaining a Mikrubi field and a Mikrubi model from constructors is available in examples/onedimsim/sim.jl.
Mikrubi.MikrubiModel — Type
MikrubiModel{V <: AbstractFloat}
MikrubiModel(dvar::Int, params::Vector{<:AbstractFloat})
Construct a Mikrubi Model from a dimensionality dvar and a parameter vector params. The equation dvar2dparam(dvar) == length(params) must hold.
Mikrubi Models can be obtained from the function fit, and can be used in the function predict.
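The required length of params follows from the shape of a quadratic function on a dvar-dimensional space: a lower triangular matrix, a vector, and a scalar. The sketch below counts and splits the parameters in a way that agrees with the documented examples of dvar2dparam and decomparams. nparams and splitparams are hypothetical names, and the row-by-row filling order is inferred from those examples rather than taken from the source code.

```julia
# Hypothetical sketches, consistent with the documented examples only.

# d(d+1)/2 entries of a lower triangular matrix, d of a vector, 1 scalar.
nparams(d::Int) = (d + 1) * (d + 2) ÷ 2

function splitparams(p::AbstractVector, d::Int)
    length(p) == nparams(d) ||
        throw(DimensionMismatch("expected $(nparams(d)) parameters, got $(length(p))"))
    At = zeros(eltype(p), d, d)
    k = 0
    # Fill the lower triangle row by row.
    for i in 1:d, j in 1:i
        k += 1
        At[i, j] = p[k]
    end
    b = p[k+1:k+d]      # linear part
    c = p[k+d+1]        # constant part
    return At, b, c
end

At, b, c = splitparams(collect(1:10), 3)
```

For example, nparams(3) gives 10, so a model on three principal components carries ten parameters.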
Fitting a Mikrubi model
When a Mikrubi field as well as occurrence data in county and/or in coordinates are ready, they can be used to train a Mikrubi model by function fit (county data at counties, required; coordinates at coords, optional). Result is output as a MikrubiModel.
Mikrubi.fit — Function
fit(field::MikrubiField, counties, coords=zeros(0, 0);
optresult=[], iterations=3_000_000, kwargs...) :: MikrubiModel
Numerically find the Mikrubi model maximizing the likelihood that the occupied counties as well as the occupied coordinates are sampled in the given Mikrubi field. The optimization result is stored in the container optresult for debugging.
Predicting from a Mikrubi model
A Mikrubi model can be applied by function predict to a matrix with its columns corresponding to extracted variables, a stack of extracted layers, or a Mikrubi field.
- When input argument is a matrix, output argument is a column vector denoting the probability of presence in pixels/points related to rows in the matrix.
- When input argument is a stack of layers, output argument is a single layer denoting the probability of presence.
- When input argument is a Mikrubi field, output argument is a Dict which maps every county identifier to probability of presence at pixels inside the county, see also predictcounty.
Mikrubi.predict — Function
predict(matrix::AbstractMatrix, model::MikrubiModel) :: Vector
predict(layers::RasterStack, model::MikrubiModel) :: Raster
predict(field::MikrubiField, model::MikrubiModel)
:: Dict{<:Any, <:Vector{<:Tuple{Vector{<:Real}, AbstractFloat}}}
Predict the probability of presence according to processed climatic factors (matrix / layers) or on the Mikrubi field.
When distribution probability within only one county is concerned, predictcounty returns probability of presence at all pixels that constitute the county in descending order. Therefore, the first element represents the most likely occupied pixel of a county.
Mikrubi.predictcounty — Function
predictcounty(field::MikrubiField, model::MikrubiModel, county)
:: Vector{<:Tuple{Vector{<:Real}, AbstractFloat}}
Return the geographic coordinates of pixels in the county sorted by the likeliness of being occupied.
It is also possible to obtain the overall probability that every county is occupied by the function probcounties.
Mikrubi.probcounties — Function
probcounties(field::MikrubiField, model::MikrubiModel)
:: Dict{<:Any, <:AbstractFloat}
probcounties(::Type{<:Logistic}, field::MikrubiField, model::MikrubiModel)
:: Dict{<:Any, <:Logistic}
Compute the probability for every county to be occupied in the field.
Sampling counties in a Mikrubi field
For simulation analysis, sometimes it is required to sample a set of counties from a Mikrubi field and a Mikrubi model. samplecounties does the trick.
Mikrubi.samplecounties — Function
samplecounties(field::MikrubiField, model::MikrubiModel) :: Vector
Sample counties according to their probability of being occupied.
Detecting overfitting
Overfitting can be detected with the Lipschitz constant, the (logarithmic) maximum gradient (in norm) of the probability of presence in environmental space.
Mikrubi.lipschitz — Function
lipschitz(model::MikrubiModel, field::MikrubiField; wholespace=false)
:: AbstractFloat
Calculate the maximum gradient (in norm) of the probability of presence over the field. When wholespace=false (default), the maximum is taken among the points contained in field; otherwise it is taken over the whole space.
Internal functions
Mikrubi.dvar2dparam — Function
dvar2dparam(dvar::Int) :: Int
Convert dimensionality of an environmental space to the dimensionality of the induced parameter space, i.e., compute the degrees of freedom for positive-definite quadratic functions mapping a dvar-dimensional linear space into real numbers.
Examples
julia> dvar2dparam(1)
3
julia> dvar2dparam(3)
10
Mikrubi.decomparams — Function
decomparams(p::AbstractVector, d::Int) :: Tuple{Matrix, Vector, Any}
decomparams(model::MikrubiModel) :: Tuple{Matrix, Vector, Any}
Return parameter decomposition At, b, c, where
- At is a lower triangular matrix of size (d, d),
- b is a column vector of size d, and
- c is a scalar.
WARNING: The vector p must have length dvar2dparam(d).
Example
julia> decomparams(collect(1:10), 3)
([1 0 0; 2 3 0; 4 5 6], [7, 8, 9], 10)
Mikrubi.pabsence — Function
pabsence(vars::AbstractMatrix, params::AbstractVector) :: Logistic
pabsence(field::MikrubiField, params::AbstractVector) :: Logistic
Compute the probability of absence at pixels given vars/field and params.
Mikrubi.ppresence — Function
ppresence(vars::AbstractMatrix, params::AbstractVector) :: Logistic
ppresence(field::MikrubiField, params::AbstractVector) :: Logistic
Compute the probability of presence at pixels given vars/field and params.
Mikrubi.mlogL — Function
mlogL(field::MikrubiField, counties, params::AbstractVector)
:: AbstractFloat
mlogL(vars::AbstractMatrix, params::AbstractVector) :: AbstractFloat
Compute the opposite log-likelihood that the occupied counties or occupied coordinates are sampled. The opposite is taken for optimization.
Mikrubi.probpixels — Function
probpixels(field::MikrubiField, model::MikrubiModel)
:: Vector{<:AbstractFloat}
Compute the probability for every pixel to be occupied in the field.
Mikrubi.findnearest — Function
findnearest(loc::AbstractVecOrMat{<:Real}, field::MikrubiField) :: Int
Return the row index in field.locs which is the nearest to the given coordinates.
Mikrubi.findnearests — Function
findnearests(loc::AbstractVector{<:AbstractVecOrMat}, field::MikrubiField)
:: Vector{Int}
findnearests(loc::AbstractMatrix{<:Real}, field::MikrubiField)
:: Vector{Int}
Return the row indices in field.locs which are the nearest to each of the given coordinates. Duplicate results are reduced to one.
Mikrubi.loglipschitz — Function
loglipschitz(model::MikrubiModel, field::MikrubiField; wholespace=false)
:: AbstractFloat
Calculate the (logarithmic) maximum gradient (in norm) of the probability of presence over the field. When wholespace=false (default), the maximum is taken among the points contained in field; otherwise it is taken over the whole space.
Mikrubi.textwrap — Function
textwrap(str::AbstractString) :: String
Gobble all linefeeds ("\n") inside str and replace them with spaces (" "), so that long strings can be wrapped to multiple lines in the code, like the Python package "textwrap". See also tw.
Mikrubi.@tw_str — Macro
@tw_str :: String
Macro version of textwrap, without interpolation and unescaping.