| Title: | Analysis of sun- and shade flecks |
|---|---|
| Description: | Functions for the analysis of time series of natural light irradiance measured at high frequency. Denoising, and quantification of duration and amplitude of sun-, shade-, cloud- and wind flecks. |
| Authors: | Pedro J. Aphalo [aut, cre] (ORCID: <https://orcid.org/0000-0003-3385-972X>), Maxime Durand [aut] (ORCID: <https://orcid.org/0000-0002-8991-3601>) |
| Maintainer: | Pedro J. Aphalo <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.0.0.9000 |
| Built: | 2026-05-22 09:05:36 UTC |
| Source: | https://github.com/aphalo/photobiologyFlecks |
Durand M, Matule B, Burgess AJ, Robson TM. 2021. Sunfleck properties from time series of fluctuating light. Agricultural and Forest Meteorology 308-309, 108554. doi:10.1016/j.agrformet.2021.108554
Useful links:
Report bugs at https://github.com/aphalo/photobiologyFlecks/issues/
Combine time-series from higher resolution to lower resolution.
combine_integ(time, var, new.integ.ms = 20)
| Argument | Description |
|---|---|
| time | numeric Vector of times from the time-series (x-axis). |
| var | numeric Vector of observations from the time-series (y-axis). |
| new.integ.ms | numeric New integration time for the returned time-series. Use same units as main time-series. |
Time series of observations at different time steps, e.g., 10 ms and 30 ms, are combined into a new data frame. The higher resolution needs to be a multiple of the lower resolution.
A data frame
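The idea of moving from a higher to a lower time resolution can be illustrated in base R. This is only a sketch under the assumption that observations are averaged within bins of width new.integ.ms; the source does not state how combine_integ() combines values, and the data below are made up.

```r
# Made-up series sampled every 10 ms (same units as new.integ.ms).
time <- seq(0, 110, by = 10)
var  <- c(1, 3, 5, 7, 9, 11, 2, 4, 6, 8, 10, 12)
new.integ.ms <- 20

# Assumption: observations falling in the same 20 ms bin are averaged.
bin <- floor(time / new.integ.ms)
combined <- data.frame(
  time = as.vector(tapply(time, bin, min)),   # start time of each bin
  var  = as.vector(tapply(var, bin, mean))    # mean observation per bin
)
combined
```

Binning on floor(time / new.integ.ms) also works when the input mixes time steps, as long as the new integration time is a multiple of each of them.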
Replace running differences smaller than a threshold by zeros in selected columns of data frames.
denoise_chunks( data, time.name = "TIMESTAMP", qty.name = NULL, absolute.threshold = 0, relative.threshold = 0.05, range.baseline = 0, add.signs = FALSE, verbose = FALSE )
| Argument | Description |
|---|---|
| data | data.frame or list of |
| time.name | character vector of length one Name of the variable containing time stamps for the observations. |
| qty.name | character vector Name(s) of variable(s) in |
| absolute.threshold | numeric The largest difference values to ignore, i.e., to set to zero. Expressed as a change per second. |
| relative.threshold | numeric The multiplier to apply to the spread of |
| range.baseline | numeric An additional value included in the computation of the range of the observations. Set |
| add.signs | logical Flag indicating if values returned by |
| verbose | logical Report data columns found. Useful for debugging. |
When searching for changes in the sign of differences we may need to discard small values introduced by "measurement noise". These functions replace differences smaller than a threshold by zeros. This approach is an alternative to smoothing, which can be difficult to implement for irregular time series.
The argument passed to data can be either a bare data.frame
object or a list containing one or more data frames, such as that
returned by split_chunks().
The argument passed to absolute.threshold is directly expressed as the
smallest value of differences to be retained with any smaller differences
replaced by zero. In contrast, the argument passed to
relative.threshold is a multiplier applied to the spread of the
observations, where the spread is the difference between the largest and the
smallest observed value for a given variable in data plus
range.baseline. The values of the two thresholds are combined, so that
the larger of the two values is used. Setting either threshold to
zero forces the other one to always be used. The threshold used is
computed as
max(abs(diff(range(c(range.baseline, x), na.rm = TRUE))) * relative.threshold, absolute.threshold)
and differences in data smaller than the threshold are set to zero.
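The threshold formula above can be tried out on a small vector. The sketch below uses a hypothetical helper, threshold_sketch(), and made-up irradiance values; it is not the package's code, and it ignores the time step even though absolute.threshold is documented as a change per second.

```r
# Hypothetical helper wrapping the documented formula (not part of the package).
threshold_sketch <- function(x, relative.threshold = 0.05,
                             absolute.threshold = 0, range.baseline = 0) {
  spread <- abs(diff(range(c(range.baseline, x), na.rm = TRUE)))
  max(spread * relative.threshold, absolute.threshold)
}

x <- c(100, 101, 180, 400, 398, 120)  # made-up observations
d <- diff(x)                          # running differences
thr <- threshold_sketch(x)            # spread is 400 - 0, so thr is 400 * 0.05

# Differences smaller in absolute value than the threshold become zero.
d.denoised <- ifelse(abs(d) < thr, 0, d)
d.denoised
```

With range.baseline = 0 the spread is measured from zero, so the relative threshold scales with the largest observation rather than with the observed range alone.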
The intended use of absolute.threshold is to allow filtering out both
zero or dark noise and gain noise in the observed data, i.e., to be able to
apply a minimum denoising even in the complete absence of flecks, but
otherwise apply a denoising relative to the value of the largest observation
or relative to the spread of the observations.
denoise_chunks() returns a copy of data, either a
data.frame or a list. In each data frame, each column of
differences named in qty.name, if present, is replaced by the value
returned by function denoise_diffs() applied to it, and
optionally columns are added with the result of calling
sign() on the denoised differences.
See also: split_chunks() and check_colnames().
Detects and characterizes "flecks" in a time series of irradiances.
find_flecks( time, var, zero.lim = 5e-04, minTime = 0, minAmp = 0, minPdiff = 0, asymmetry = 1/4, trimCV = 0.05, asmMethod = c("mean", "max", "rm"), bounds = c(0, 1), timeSplit = 10, shadeflecks = FALSE, time.digits = 3, var.digits = 2, verbose = TRUE )
| Argument | Description |
|---|---|
| time | numeric Vector of times from the time-series (x-axis). |
| var | numeric Vector of observations from the time-series (y-axis). |
| zero.lim | numeric Limit for multiplication, only values higher than |
| minTime | numeric Flecks found with duration below |
| minAmp | numeric Flecks found with amplitude below |
| minPdiff | numeric Flecks found with a percent difference between peak and baseline that is below |
| asymmetry | numeric Threshold value to qualify a fleck as asymmetric. Asymmetry happens when the two baselines differ widely in value. |
| trimCV | Control trimming threshold. Fleck baselines are trimmed iteratively based on the coefficient of variation between two points at each baseline side. |
| asmMethod | character One of |
| bounds | numeric vector of length 2 For relative amplitude calculations, normalize between 0 and 1 by default. |
| timeSplit | integer Increase time-series data frequency by linear interpolation. A value of 10 usually guarantees accuracy. |
| shadeflecks | logical If true, run the function in shadefleck mode instead of in the default sunfleck mode, i.e., the function will find troughs instead of peaks in the data. |
| time.digits, var.digits | integer Argument passed to parameter |
| verbose | logical If |
A sunfleck is characterized by an increase followed by a decrease
in irradiance. As a first step, zero crossings of the derivative are located
using the same code as in find_zeros(). In a second step,
flecks are searched for and, once found, checked for asymmetry
between baselines. If a fleck is asymmetric, the function tries to extend
the baseline a bit. If it is still asymmetric, behaviour is as defined by
asmMethod. The fleck is then compared against the criteria given by
minTime, minAmp and minPdiff. If it passes, the baselines are trimmed.
The conditions are checked once more, and trimming is reversed if they are
no longer met. In the last step, overlapping flecks are removed and the time
interval between successive flecks is computed. Finally, the fleck properties
are returned in a data frame.
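To make the notions of baseline, duration and amplitude concrete, here is a deliberately simplified base-R sketch for a single fleck. It assumes the baseline is the mean of the two flanking minima, which is an illustrative choice and not necessarily what find_flecks() does; asymmetry checks, trimming and the minTime, minAmp and minPdiff filters are all left out, and the data are made up.

```r
# Made-up series containing one sunfleck (rise, then fall).
time <- c(0, 1, 2, 3, 4, 5, 6)
var  <- c(10, 12, 40, 80, 30, 14, 12)

peak  <- which.max(var)                                # index of the peak
left  <- which.min(var[seq_len(peak)])                 # minimum before the peak
right <- peak - 1 + which.min(var[peak:length(var)])   # minimum after the peak

# Assumption: baseline taken as the mean of the two flanking minima.
baseline  <- mean(var[c(left, right)])
duration  <- time[right] - time[left]
amplitude <- var[peak] - baseline

data.frame(start = time[left], end = time[right],
           duration = duration, amplitude = amplitude)
```

In shadefleck mode the same logic would apply to troughs instead of peaks, for example by running the sketch on -var.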
Durand M, Matule B, Burgess AJ, Robson TM. 2021. Sunfleck properties from time series of fluctuating light. Agricultural and Forest Meteorology 308-309, 108554. doi:10.1016/j.agrformet.2021.108554
First function to use on discrete time-series. Returns a vector giving the position of every zero crossing within the time-series.
find_zeros(time, var, zero.lim = 5e-04, timeSplit = 10, return_n1n2 = FALSE)
| Argument | Description |
|---|---|
| time | numeric Vector of times from the time-series (x-axis). |
| var | numeric Vector of observations from the time-series (y-axis). |
| zero.lim | numeric Limit for multiplication, only values higher than |
| timeSplit | numeric Increase time-series frequency (the value at t = 0 is copied to t = 1 through t = 9, etc.). A value of 10 usually guarantees accuracy. |
| return_n1n2 | logical If |
For every time point, the numerical derivative is calculated using
diff(), and each pair of consecutive values of the numerical
derivative is multiplied. When the product is negative, the derivative
changes sign, i.e., it crosses zero.
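The sign-change test described above can be sketched in a few lines of base R. This sketch ignores zero.lim and timeSplit, uses made-up data, and the index convention (reporting the position of the turning point in var) is an assumption rather than the documented return value of find_zeros().

```r
# Made-up series with one peak and one trough.
var <- c(1, 3, 6, 4, 2, 5, 7)

d <- diff(var)                      # numerical derivative for unit time steps
prods <- d[-length(d)] * d[-1]      # products of consecutive derivative values

# A negative product means the derivative changed sign between two steps.
# Adding 1 maps the product index to the turning point's position in var.
turning.points <- which(prods < 0) + 1
turning.points
var[turning.points]
```

A product of exactly zero (a flat segment) is not flagged by this test, which is one reason why small noisy differences may need separate handling.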
An integer vector or a data frame with two variables.
Durand M, Matule B, Burgess AJ, Robson TM. 2021. Sunfleck properties from time series of fluctuating light. Agricultural and Forest Meteorology 308-309, 108554. doi:10.1016/j.agrformet.2021.108554
Split a time series stored in a data frame at breaks (long time steps), returning a list of data frames or data chunks.
split_chunks( data, time.name = "TIMESTAMP", qty.name = NULL, time.step = NULL, chunk.min.time, chunk.min.rows = 2, add.diffs = TRUE, verbose = FALSE, na.rm = TRUE )
| Argument | Description |
|---|---|
| data | data.frame Containing at least one column with time stamps and one column with a measured quantity. |
| time.name | character vector of length one Name of the variable containing time stamps for the observations. |
| qty.name | character vector Name(s) of variable(s) in |
| time.step | numeric The duration in seconds of one time step within a chunk. If |
| chunk.min.time | numeric or duration Minimum length of the time step between data chunks. If numeric, expressed in seconds. |
| chunk.min.rows | integer The minimum number of rows that a chunk must have not to be discarded. |
| add.diffs | logical Flag indicating if values returned by |
| verbose | logical Report chunk names and lengths at each iteration. Useful for debugging. |
| na.rm | logical Omit rows of |
When time series of data are acquired in bursts or chunks separated by longer time intervals, it can be useful to extract the chunks into separate data frames before further analysis. This implementation does not assume the same duration for all chunks or gaps; it searches for time intervals longer than a threshold duration and splits the data at these points. If the data contain no gaps, the whole data set is returned as a single chunk.
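Splitting at long time steps can be sketched with base R alone. The example below is not the package's implementation: the data are made up, gap.threshold is a hypothetical stand-in for chunk.min.time, and naming chunks after their first time stamp only mimics the naming described for the returned list.

```r
# Made-up data: three bursts separated by gaps of about 900 s.
df <- data.frame(
  TIMESTAMP = c(0, 0.05, 0.10, 900, 900.05, 1800, 1800.05, 1800.10),
  Q_PAR     = c(10, 12, 11, 50, 52, 30, 31, 29)
)
gap.threshold <- 60   # seconds; hypothetical stand-in for chunk.min.time

# A new chunk starts wherever the time step exceeds the threshold.
chunk.id <- cumsum(c(TRUE, diff(df$TIMESTAMP) > gap.threshold))
chunks <- split(df, chunk.id)

# Name each chunk after its starting time.
names(chunks) <- vapply(chunks, function(z) as.character(z$TIMESTAMP[1]),
                        character(1))
vapply(chunks, nrow, integer(1))
```

Because the split relies only on the size of each time step, chunks and gaps of unequal duration are handled without extra logic.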
When a minimum length for the individual chunks is set with an argument to
chunk.min.rows, chunks with fewer rows are discarded silently,
unless verbose = TRUE.
With add.diffs = TRUE the running differences between values in the
current row and the one above are added to the returned data frames. The
value in the first row is NA for running differences, except for
the time, in which case it is the time difference to the preceding value
in data.
Method diff() must be available for the class of the variable
named by the argument to time.name. The class of this column is in
most cases numeric, date, or time. If add.diffs = TRUE this
requirement also applies to the variable(s) named by the argument passed to
qty.name.
The number of chunks in the returned list of data frames and their lengths
are reported in a message().
A list of data frames of varying length, depending on the number of
chunks found, possibly of length zero. The members of the list are named
based on the starting time of each chunk. The variables included in the
member data frames are those named by time.name and qty.name
and optionally, their running differences.
Compute statistical summaries for each numeric variable in a time series chunk or in a list of time series chunks.
summarize_chunks( l, FUN = summarize_chunk, parameter.names = FUN(NULL, return.names = TRUE), add.times = FALSE, tz = "UTC", time.shift = 0, add.solar.times = add.times && !is.null(geocode), geocode = NULL, verbose = FALSE )
summarize_chunk(x, return.names = FALSE)
| Argument | Description |
|---|---|
| l | a named list of data frames. |
| FUN | function The function used to compute the summary of a numeric vector. |
| parameter.names | The names used to identify the members of the vector returned by |
| add.times | logical If |
| tz | character The time zone used to decode times. |
| time.shift | numeric A time shift expressed in hours. Only needed if original times do not match those at the time zone passed as argument to |
| add.solar.times | logical If |
| geocode | A one row |
| verbose | logical Report chunk names while walking through |
| x | numeric vector. |
| return.names | logical Return the names of the parameters as a character vector instead of the computed numeric values. |
A fixed set of summaries is computed for each numeric variable in a chunk
and indexed by additional columns parameter and chunk.
If add.times == TRUE and the names in l are strings describing
instants in time, they are decoded using functions
anytime() and anydate() and
added as columns time and date. If the argument passed to
l was the list returned by
split_chunks() the times and
dates match those of the first time point in each chunk.
An argument passed to time.shift can be used to correct a consistent
error in times such as a badly set clock during acquisition or when data
acquisition times have been in UTC plus a constant time shift year round.
A tibble with 10 rows for each chunk in the input, with one summary
per row and one column for each numeric column in the chunks with their
original names plus columns chunk and parameter, and
if requested columns time, date and solar.time added.
A numeric vector of length ten, or a character vector of the same length.
Time series of photosynthetically active radiation (PAR) measured at 50 ms time intervals in 15 min-long bursts every 30 min.
three_chunks.tb
A "data.frame" object with 54700 rows.
The data.frame named three_chunks.tb contains
time stamps and PAR photon irradiances (PPFD).
The variables in the data frame are as follows:
time
Q_PAR
head(three_chunks.tb)