Title: | Origin Estimation for Propagation Processes on Complex Networks |
---|---|
Description: | Performs network-based source estimation. Different approaches are available: effective distance median, recursive backtracking, and centrality-based source estimation. Additionally, we provide public transportation network data as well as methods for data preparation, source estimation performance analysis and visualization. |
Authors: | Juliane Manitz [aut, cre], Jonas Harbering [ctb], Jun Li [ctb] |
Maintainer: | Juliane Manitz <[email protected]> |
License: | GPL-3 |
Version: | 1.1-6 |
Built: | 2025-03-19 02:45:50 UTC |
Source: | https://github.com/jmanitz/netorigin |
convert individual event information to aggregated information per network node
aggr_data(dat, from = NULL, cumsum = TRUE)
aggr_data(dat, from = NULL, cumsum = TRUE)
dat |
|
from |
character in |
cumsum |
logical indicating whether data is aggregated by cumulative sum, default is |
data.frame
of dimension (TxK)
, where T
is the number of observation times and K
the number of network nodes. Thus, each row represents a snapshot of the spreading process at a specific observation time with the event magnitude observed at the network nodes. Rownames are observation times, colnames are node names.
Other data_handling:
read_DB_data()
analyze public transportation network characteristics
analyze_ptn(g)
analyze_ptn(g)
g |
|
'data.frame':
1 obs. of 7 variables:
vcount
number of nodes,
ecount
number of edges,
density
network graph density,
av_deg
average degree,
av_cent
average unit betweenness,
diam
diameter, and
trans
transitivity.
Details to the computation and interpretation can be found in:
Kolaczyk, E. D. (2009). Statistical analysis of network data: methods and models. Springer series in statistics. Springer. <DOI: 10.1007/978-0-387-88146-1>
Manitz, J. (2014): Statistical Inference for Propagation Processes on Complex Networks. Ph.D. thesis, Georg-August-University Goettingen. Verlag Dr.~Hut, ISBN 978-3-8439-1668-4. Available online: https://ediss.uni-goettingen.de/handle/11858/00-1735-0000-0022-5F38-B.
Other network helper:
plot_ptn()
data(ptnAth) analyze_ptn(ptnAth) data(ptnGoe) analyze_ptn(ptnGoe)
data(ptnAth) analyze_ptn(ptnAth) data(ptnGoe) analyze_ptn(ptnGoe)
compute_mu_lambda
computes 'mu' and 'lambda' from training data and
selected observers, for Gaussian source estimation with prior information
compute_mu_lambda(train.data, obs.vec, candidate.thres)
compute_mu_lambda(train.data, obs.vec, candidate.thres)
train.data |
training data for 'mu' and 'lambda' list computation format-list, length-number of cities/nodes format of train.data[[i]]- number of simulated scenarios x number of cities/nodes, each entry is minimum arrival time |
obs.vec |
list of cities ids used as observers |
candidate.thres |
threshold to determine if a node/city could be a candidate for source e.g. if we set this number to be 0.2, if in [x] simulated scenarios, there are only 10 percent scenarios a node [a] is infected, we do not think [a] is a potential source |
a list, consisting of 3 variables: mu.mat, lambda.list, poss.candidate.vec mu.mat: matrix- number of cities/nodes x number of observers, each row represents- if this node is the source, the mean of arrival time vector; lambda.list: a length-number of cities/nodes list, each element is a number of observers x number of observers matrix- if a node is the source, the covariance matrix for arrival time vector; poss.candidate.vec: a boolean vector indicating if a node has the potential to be the source
Jun Li
Li, J., Manitz, J., Bertuzzo, E. and Kolaczyk, E.D. (2020). Sensor-based localization of epidemic sources on human mobility networks. arXiv preprint Available online: https://arxiv.org/abs/2011.00138.
# fake training data, indicating format nnodes <- 851 max.day <- 1312 nsimu <- 20 train.data.fake <- list() for (j in 1:nnodes) { train.data.fake[[j]] <- matrix(sample.int(max.day, size = nsimu*nnodes, replace = TRUE), nrow = nsimu, ncol = nnodes) } obs.vec <- (1:9) candidate.thres <- 0.3 mu.lambda.list <- compute_mu_lambda(train.data.fake, obs.vec, candidate.thres)
# fake training data, indicating format nnodes <- 851 max.day <- 1312 nsimu <- 20 train.data.fake <- list() for (j in 1:nnodes) { train.data.fake[[j]] <- matrix(sample.int(max.day, size = nsimu*nnodes, replace = TRUE), nrow = nsimu, ncol = nnodes) } obs.vec <- (1:9) candidate.thres <- 0.3 mu.lambda.list <- compute_mu_lambda(train.data.fake, obs.vec, candidate.thres)
Delay propagation data examples simulated by LinTim software
delayAth
Delay propagation data generated on the Athens metro network by LinTim software
delayGoe
Delay propagation data generated on the Goettingen bus system by LinTim software
delayAth
Delay data on the Athens metro network. Propagation simulation under consideration of secruity distances and fixed-waiting time delay management. 'data.frame' with 510 observations (10 sequential time pictures for delay spreading pattern from 51 stations) of 53 variables (k0
true source, time
, delays at 51 stations).
delayGoe
Delay data on the directed Goettingen bus system. Progation simulation under consideration of secruity distances and fixed-waiting time delay management. 'data.frame' with 2570 observations (10 sequential time pictures for delay spreading pattern from 257 stations) of 259 variables (k0
true source, time
, delays at 257 stations).
Jonas Harbering
Public transportation network datasets are generated by LinTim software (Integrated Optimization in Public Transportation; https://lintim.net/).
Manitz, J., J. Harbering, M. Schmidt, T. Kneib, and A. Schoebel (2017): Source Estimation for Propagation Processes on Complex Networks with an Application to Delays in Public Transportation Systems. Journal of Royal Statistical Society C (Applied Statistics), 66: 521-536.
## Not run: # compute effective distance data(ptnAth) athnet <- igraph::as_adjacency_matrix(ptnAth, sparse=FALSE) p <- athnet/rowSums(athnet) eff <- eff_dist(p) # apply source estimation data(delayAth) res <- plyr::alply(.data=delayAth[,-c(1:2)], .margins=1, .fun=origin_edm, distance=eff, silent=TRUE, .progress='text') perfAth <- plyr::ldply(Map(performance, x = res, start = as.list(delayAth$k0), list(graph = ptnAth))) ## End(Not run) ## Not run: # compute effective distance data(ptnGoe) goenet <- igraph::as_adjacency_matrix(ptnGoe, sparse=FALSE) p <- goenet/rowSums(goenet) eff <- eff_dist(p) # apply source estimation data(delayGoe) res <- plyr::alply(.data=delayGoe[,-c(1:2)], .margins=1, .fun=origin_edm, distance=eff, silent=TRUE, .progress='text') perfGoe <- plyr::ldply(Map(performance, x = res, start = as.list(delayGoe$k0), list(graph = ptnGoe))) ## End(Not run)
## Not run: # compute effective distance data(ptnAth) athnet <- igraph::as_adjacency_matrix(ptnAth, sparse=FALSE) p <- athnet/rowSums(athnet) eff <- eff_dist(p) # apply source estimation data(delayAth) res <- plyr::alply(.data=delayAth[,-c(1:2)], .margins=1, .fun=origin_edm, distance=eff, silent=TRUE, .progress='text') perfAth <- plyr::ldply(Map(performance, x = res, start = as.list(delayAth$k0), list(graph = ptnAth))) ## End(Not run) ## Not run: # compute effective distance data(ptnGoe) goenet <- igraph::as_adjacency_matrix(ptnGoe, sparse=FALSE) p <- goenet/rowSums(goenet) eff <- eff_dist(p) # apply source estimation data(delayGoe) res <- plyr::alply(.data=delayGoe[,-c(1:2)], .margins=1, .fun=origin_edm, distance=eff, silent=TRUE, .progress='text') perfGoe <- plyr::ldply(Map(performance, x = res, start = as.list(delayGoe$k0), list(graph = ptnGoe))) ## End(Not run)
eff_dist
computes the effective distance between all nodes in the network
eff_dist(p) eff_dijkstra(p, start) spd_dijkstra(p, start)
eff_dist(p) eff_dijkstra(p, start) spd_dijkstra(p, start)
p |
numeric matrix, representing the transition probability matrix for the network graph |
start |
start of path |
A numeric matrix, representing the effective distance between all nodes in the network graph.
Dijkstra, E. W. (1959): A note on two problems in connexion with graphs. Numerische Mathematik, 1, 269-271. <DOI: 10.1007/BF01386390>
Brockmann, D. and Helbing, D. (2013): The hidden geometry of complex, network-driven contagion phenomena. Science, 342, 1337-1342. <DOI: 10.1126/science.1245200>
Manitz, J. (2014): Statistical Inference for Propagation Processes on Complex Networks. Ph.D. thesis, Georg-August-University Goettingen. Verlag Dr. Hut, ISBN 978-3-8439-1668-4. Available online: https://ediss.uni-goettingen.de/handle/11858/00-1735-0000-0022-5F38-B.
# compute effective shortest path distance data(ptnAth) require(igraph) net <- igraph::as_adjacency_matrix(ptnAth, sparse=FALSE) p <- net/rowSums(net) eff <- eff_dist(p) # compute shortest path distance data(ptnAth) athnet <- as_adj(ptnAth, sparse=FALSE) spd <- spd_dijkstra(athnet, start=1) # compare calculations with the one from igraph spd_igraph <- igraph::distances(ptnAth, v=1, algorithm='dijkstra') all(spd[[1]] == spd_igraph)
# compute effective shortest path distance data(ptnAth) require(igraph) net <- igraph::as_adjacency_matrix(ptnAth, sparse=FALSE) p <- net/rowSums(net) eff <- eff_dist(p) # compute shortest path distance data(ptnAth) athnet <- as_adj(ptnAth, sparse=FALSE) spd <- spd_dijkstra(athnet, start=1) # compare calculations with the one from igraph spd_igraph <- igraph::distances(ptnAth, v=1, algorithm='dijkstra') all(spd[[1]] == spd_igraph)
initial_condition_sib_model
Compute Initial Condition for Function SIB_SS
initial_condition_sib_model( POP_node, sigma, mu_B, theta, node_in, in_prevalence = 0.001 )
initial_condition_sib_model( POP_node, sigma, mu_B, theta, node_in, in_prevalence = 0.001 )
POP_node |
vector, length represents number of cities/nodes; vector represents population at each node |
sigma |
symptomatic ratio, i.e., fraction of infected people that develop symptoms and are infective. (The remaining fraction enters directly the recovered compartment.) |
mu_B |
death rate of V.cholerae in the aquatic environment (day^-1) |
theta |
contamination rate |
node_in |
index/indices for initial infected node(s) |
in_prevalence |
initial prevalence of symptomatic infected in a node, default is 0.1% |
a 5 x number of nodes matrix, each row represents the following for all the nodes: Row 1: number of suspectible people, i.e., population excpect infected and recovered for each node; Row 2: number of infected people; Row 3: number of recovered people; Row 4: bacteria concentration in equilibrium with infected individuals; Row 2: number of infected people, but representing cumulative cases
Jun Li
set.seed(2020) popu <- rep(20000, 10) sigma <- 0.05 mu_B <- 0.2 theta_max <- 16 theta <- runif(10, 0.1, 0.9) * theta_max y0 <- initial_condition_sib_model(popu, sigma, mu_B, theta, c(3))
set.seed(2020) popu <- rep(20000, 10) sigma <- 0.05 mu_B <- 0.2 theta_max <- 16 theta <- runif(10, 0.1, 0.9) * theta_max y0 <- initial_condition_sib_model(popu, sigma, mu_B, theta, c(3))
Performs different approaches for network-based source estimation: effective distance median, recursive backtracking, and centrality-based source estimation. Additionally, we provide public transportation network data as well as methods for data preparation, source estimation performance analysis and visualization.
The main function for origin estimation of propagation processes on complex network is origin
. Different methods are available: effective distance median ('edm'
), recursive backtracking ('backtracking'
), and centrality-based source estimation ('centrality'
).
For more details on the methodological background, we refer to the corresponding publications.
Juliane Manitz with contributions by Jonas Harbering
Manitz, J., J. Harbering, M. Schmidt, T. Kneib, and A. Schoebel (2017): Source Estimation for Propagation Processes on Complex Networks with an Application to Delays in Public Transportation Systems. Journal of Royal Statistical Society C (Applied Statistics), 66: 521-536.
Manitz, J., Kneib, T., Schlather, M., Helbing, D. and Brockmann, D. (2014) Origin detection during food-borne disease outbreaks - a case study of the 2011 EHEC/HUS outbreak in Germany. PLoS Currents Outbreaks, 1. <DOI: 10.1371/currents.outbreaks.f3fdeb08c5b9de7c09ed9cbcef5f01f2>
Comin, C. H. and da Fontoura Costa, L. (2011) Identifying the starting point of a spreading process in complex networks. Physical Review E, 84. <DOI: 10.1103/PhysRevE.84.056105>
This is the main function for origin estimation for propagation processes on complex networks. Different methods are available: effective distance median ('edm'
), recursive backtracking ('backtracking'
), and centrality-based source estimation ('centrality'
).
For details on the methodological background, we refer to the corresponding publications.
origin_edm
for effective distance-median origin estimation (Manitz et al., 2016)
origin(events, type = c("edm", "backtracking", "centrality", "bayesian"), ...) origin_edm(events, distance, silent = TRUE) origin_backtracking(events, graph, start_with_event_node = TRUE, silent = TRUE) origin_centrality(events, graph, silent = TRUE) origin_bayesian( events, thres.vec, obs.vec, mu.mat, lambda.list, poss.candidate.vec, prior, use.prior = TRUE )
origin(events, type = c("edm", "backtracking", "centrality", "bayesian"), ...) origin_edm(events, distance, silent = TRUE) origin_backtracking(events, graph, start_with_event_node = TRUE, silent = TRUE) origin_centrality(events, graph, silent = TRUE) origin_bayesian( events, thres.vec, obs.vec, mu.mat, lambda.list, poss.candidate.vec, prior, use.prior = TRUE )
events |
numeric vector of event counts at a specific time point; if type is 'bayesian', 'events' is a matrix, number of nodes x time points; entries represent number of cases |
type |
character specifying the method, |
... |
parameters to be passed to origin methods |
distance |
numeric matrix specifying the distance matrix (for |
silent |
locigal, should the messages be suppressed? |
graph |
igraph object specifying the underlying network graph (for |
start_with_event_node |
logical specifying whether backtracking only starts from nodes that experienced events (for |
thres.vec |
vector, length represents number of cities/nodes, representing thresholds for cities/nodes that they are infected |
obs.vec |
list of cities ids used as observers |
mu.mat |
matrix- number of cities/nodes x number of observers, each row represents - if this node is the source, the mean of arrival time vector |
lambda.list |
a length-number of cities/nodes list, each element is a number of observers x number of observers matrix - if a node is the source, the covariance matrix for arrival time vector |
poss.candidate.vec |
a boolean vector indicating if a node has the potential to be the source |
prior |
vector, length - number of cities/nodes, prior for cities |
use.prior |
boolean, TRUE or FALSE, if use prior, default TRUE |
origin_edm
returns an object of class origin
, list with
est
origin estimate
aux
data.frame
with auxiliary variables
id
as node identifier,
events
for event magnitude,
wmean
for weighted mean,
wvar
for weighted variance, and
mdist
mean distance from a node to all other nodes.
type = 'edm'
effective distance median origin estimation
origin_backtracking
returns an object of class origin
, list with
est
origin estimate
aux
data.frame
with auxiliary variables
id
as node identifier,
events
for event magnitude, and
bcount
for backtracking counts, how often backtracking identifies this source node.
type = 'backtracking'
backtracking origin estimation
origin_centrality
returns an object of class origin
, list with
est
origin estimate
aux
data.frame
with auxiliary variables
id
as node identifier,
events
for event magnitude, and
cent
for node centrality (betweenness divided degree).
type = 'centrality'
centrality-based origin estimation
a dataframe with columns 'nodes' and 'probab', indicating nodes indices and their posteriors
Juliane Manitz with contributions by Jonas Harbering
Jun Li
Comin, C. H. and da Fontoura Costa, L. (2011). Identifying the starting point of a spreading process in complex networks. Physical Review E, 84. <doi: 10.1103/PhysRevE.84.056105>
Manitz, J., J. Harbering, M. Schmidt, T. Kneib, and A. Schoebel (2017): Source Estimation for Propagation Processes on Complex Networks with an Application to Delays in Public Transportation Systems. Journal of Royal Statistical Society C (Applied Statistics), 66: 521-536. <doi: 10.1111/rssc.12176>
Manitz, J. (2014). Statistical Inference for Propagation Processes on Complex Networks. Ph.D. thesis, Georg-August-University Goettingen. Verlag Dr.~Hut, ISBN 978-3-8439-1668-4. Available online: https://ediss.uni-goettingen.de/handle/11858/00-1735-0000-0022-5F38-B.
Manitz, J., Kneib, T., Schlather, M., Helbing, D. and Brockmann, D. (2014). Origin detection during food-borne disease outbreaks - a case study of the 2011 EHEC/HUS outbreak in Germany. PLoS Currents Outbreaks, 1. <doi: 10.1371/currents.outbreaks.f3fdeb08c5b9de7c09ed9cbcef5f01f2>
Li, J., Manitz, J., Bertuzzo, E. and Kolaczyk, E.D. (2020). Sensor-based localization of epidemic sources on human mobility networks. arXiv preprint Available online: https://arxiv.org/abs/2011.00138.
Other origin-est:
origin_multiple()
data(delayGoe) # compute effective distance data(ptnGoe) goenet <- igraph::as_adjacency_matrix(ptnGoe, sparse=FALSE) p <- goenet/rowSums(goenet) eff <- eff_dist(p) # apply effective distance median source estimation om <- origin(events=delayGoe[10,-c(1:2)], type='edm', distance=eff) summary(om) plot(om, 'mdist',start=1) plot(om, 'wvar',start=1) performance(om, start=1, graph=ptnGoe) # backtracking origin estimation (Manitz et al., 2016) ob <- origin(events=delayGoe[10,-c(1:2)], type='backtracking', graph=ptnGoe) summary(ob) plot(ob, start=1) performance(ob, start=1, graph=ptnGoe) # centrality-based origin estimation (Comin et al., 2011) oc <- origin(events=delayGoe[10,-c(1:2)], type='centrality', graph=ptnGoe) summary(oc) plot(oc, start=1) performance(oc, start=1, graph=ptnGoe) # fake training data, indicating format nnodes <- 851 max.day <- 1312 nsimu <- 20 max.case.per.day <- 10 train.data.fake <- list() for (j in 1:nnodes) { train.data.fake[[j]] <- matrix(sample.int(max.day, size = nsimu*nnodes, replace = TRUE), nrow = nsimu, ncol = nnodes) } obs.vec <- (1:9) candidate.thres <- 0.3 mu.lambda.list <- compute_mu_lambda(train.data.fake, obs.vec, candidate.thres) # matrix representing number of cases per node per day cases.node.day <- matrix(sample.int(max.case.per.day, size = nnodes*max.day, replace = TRUE), nrow = nnodes, ncol = max.day) nnodes <- dim(cases.node.day)[1] # number of nodes # fixed threshold for all nodes - 10 infected people thres.vec <- rep(10, nnodes) # flat/non-informative prior prior <- rep(1, nnodes) result2.df <- origin(events = cases.node.day, type = "bayesian", thres.vec = thres.vec, obs.vec = obs.vec, mu.mat=mu.lambda.list$mu.mat, lambda.list = mu.lambda.list$lambda.list, poss.candidate.vec=mu.lambda.list$poss.candidate.vec, prior=prior, use.prior=TRUE)
data(delayGoe) # compute effective distance data(ptnGoe) goenet <- igraph::as_adjacency_matrix(ptnGoe, sparse=FALSE) p <- goenet/rowSums(goenet) eff <- eff_dist(p) # apply effective distance median source estimation om <- origin(events=delayGoe[10,-c(1:2)], type='edm', distance=eff) summary(om) plot(om, 'mdist',start=1) plot(om, 'wvar',start=1) performance(om, start=1, graph=ptnGoe) # backtracking origin estimation (Manitz et al., 2016) ob <- origin(events=delayGoe[10,-c(1:2)], type='backtracking', graph=ptnGoe) summary(ob) plot(ob, start=1) performance(ob, start=1, graph=ptnGoe) # centrality-based origin estimation (Comin et al., 2011) oc <- origin(events=delayGoe[10,-c(1:2)], type='centrality', graph=ptnGoe) summary(oc) plot(oc, start=1) performance(oc, start=1, graph=ptnGoe) # fake training data, indicating format nnodes <- 851 max.day <- 1312 nsimu <- 20 max.case.per.day <- 10 train.data.fake <- list() for (j in 1:nnodes) { train.data.fake[[j]] <- matrix(sample.int(max.day, size = nsimu*nnodes, replace = TRUE), nrow = nsimu, ncol = nnodes) } obs.vec <- (1:9) candidate.thres <- 0.3 mu.lambda.list <- compute_mu_lambda(train.data.fake, obs.vec, candidate.thres) # matrix representing number of cases per node per day cases.node.day <- matrix(sample.int(max.case.per.day, size = nnodes*max.day, replace = TRUE), nrow = nnodes, ncol = max.day) nnodes <- dim(cases.node.day)[1] # number of nodes # fixed threshold for all nodes - 10 infected people thres.vec <- rep(10, nnodes) # flat/non-informative prior prior <- rep(1, nnodes) result2.df <- origin(events = cases.node.day, type = "bayesian", thres.vec = thres.vec, obs.vec = obs.vec, mu.mat=mu.lambda.list$mu.mat, lambda.list = mu.lambda.list$lambda.list, poss.candidate.vec=mu.lambda.list$poss.candidate.vec, prior=prior, use.prior=TRUE)
Multiple origin estimation using community partitioning
origin_multiple( events, type = c("edm", "backtracking", "centrality"), graph, no = 2, distance, fast = TRUE, ... )
origin_multiple( events, type = c("edm", "backtracking", "centrality"), graph, no = 2, distance, fast = TRUE, ... )
events |
numeric vector of event counts at specific time point |
type |
character specifying the method, |
graph |
igraph object specifying the underlying network graph |
no |
numeric specifying the number of supposed origins |
distance |
numeric matrix specifying the distance matrix |
fast |
logical specifying community partitioning algorithm, default is |
... |
parameters to be passed to origin methods |
origin_multiple
returns an list object with objects of class origin
of length no
Zang, W., Zhang, P., Zhou, C. and Guo, L. (2014) Discovering Multiple Diffusion Source Nodes in Social Networks. Procedia Computer Science, 29, 443-452. <DOI: 10.1016/j.procs.2014.05.040>
Other origin-est:
origin()
origin
print
produces an output for objects of class origin
.
## S3 method for class 'origin' print(x, ...) ## S3 method for class 'origin' summary(object, x = object, ...) ## S3 method for class 'origin' plot(x, y = "id", start, ...) ## S3 method for class 'origin' performance(x, start, graph = NULL, ...)
## S3 method for class 'origin' print(x, ...) ## S3 method for class 'origin' summary(object, x = object, ...) ## S3 method for class 'origin' plot(x, y = "id", start, ...) ## S3 method for class 'origin' performance(x, start, graph = NULL, ...)
x |
object of class |
... |
further arguments to be passed to default |
object |
object of class |
y |
character specifying the variable being plotted at the y-axis; options are |
start |
numeric, giving the node of the true origin |
graph |
|
performance.origin
returns a data.frame
with variables
origin = start
representing the true origin,
est
the estimated node of origin,
hitt
logical indicating whether origin estimation is correct or not,
rank
rank of correct detection,
spj
number of segments from estimated origin to true origin (requires an igraph
object),
dist
distance along the shortest path from estimated origin to true origin (igraph
edge attribute length
)
data(ptnGoe) data(delayGoe) res <- origin(events=delayGoe[10,-c(1:2)], type='centrality', graph=ptnGoe) res summary(res) plot(res, start=1) performance(res, start=1, graph=ptnGoe)
data(ptnGoe) data(delayGoe) res <- origin(events=delayGoe[10,-c(1:2)], type='centrality', graph=ptnGoe) res summary(res) plot(res, start=1) performance(res, start=1, graph=ptnGoe)
generic method for performance evaluation
performance(x, ...)
performance(x, ...)
x |
object |
... |
further arguments |
origin-methods
plot_performance
A plot method combining a time series of performance results.
plot_performance( x, var = "rank", add = FALSE, offset = NULL, log = FALSE, col = 1, ylim = NULL, text.padding = 0.9, ... )
plot_performance( x, var = "rank", add = FALSE, offset = NULL, log = FALSE, col = 1, ylim = NULL, text.padding = 0.9, ... )
x |
|
var |
character, variable to be plotted, |
add |
logical, should be added to another performance plot |
offset |
|
log |
logical, should y-axis be logarithmized? |
col |
numeric or character, color of lines |
ylim |
numeric vector, range of y axis |
text.padding |
a numeric value specifying the factor for the text position relative to the y values |
... |
further graphical parameters passed to default |
## Not run: ### delays on Goettingen bus network # compute effective distance data(ptnGoe) goenet <- igraph::as_adjacency_matrix(ptnGoe, sparse=FALSE) p <- goenet/rowSums(goenet) eff <- eff_dist(p) # apply source estimation data(delayGoe) if (requireNamespace("aplyr", quietly = TRUE)) { res <- alply(.data=delayGoe[11:20,-c(1:2)], .margins=1, .fun=origin_edm, distance=eff, silent=TRUE, .progress='text') perfGoe <- ldply(Map(performance, x = res, start = 2, list(graph = ptnGoe))) # performance plots plot_performance(perfGoe, var='rank', ylab='rank of correct detection', text.padding=0.5) plot_performance(perfGoe, var='dist', ylab='distance to correct detection') } ### delays on Athens metro network # compute effective distance data(ptnAth) athnet <- igraph::as_adjacency_matrix(ptnAth, sparse=FALSE) p <- athnet/rowSums(athnet) eff <- eff_dist(p) # apply source estimation data(delayAth) if (requireNamespace("aplyr", quietly = TRUE)) { res <- alply(.data=delayAth[11:20,-c(1:2)], .margins=1, .fun=origin_edm, distance=eff, silent=TRUE, .progress='text') perfAth <- ldply(Map(performance, x = res, start = as.list(delayAth$k0), list(graph = ptnAth))) # performance plots plot_performance(perfAth, var='rank', ylab='rank of correct detection',text.padding=0.5) plot_performance(perfAth, var='dist', ylab='distance to correct detection') } ## End(Not run)
## Not run: ### delays on Goettingen bus network # compute effective distance data(ptnGoe) goenet <- igraph::as_adjacency_matrix(ptnGoe, sparse=FALSE) p <- goenet/rowSums(goenet) eff <- eff_dist(p) # apply source estimation data(delayGoe) if (requireNamespace("aplyr", quietly = TRUE)) { res <- alply(.data=delayGoe[11:20,-c(1:2)], .margins=1, .fun=origin_edm, distance=eff, silent=TRUE, .progress='text') perfGoe <- ldply(Map(performance, x = res, start = 2, list(graph = ptnGoe))) # performance plots plot_performance(perfGoe, var='rank', ylab='rank of correct detection', text.padding=0.5) plot_performance(perfGoe, var='dist', ylab='distance to correct detection') } ### delays on Athens metro network # compute effective distance data(ptnAth) athnet <- igraph::as_adjacency_matrix(ptnAth, sparse=FALSE) p <- athnet/rowSums(athnet) eff <- eff_dist(p) # apply source estimation data(delayAth) if (requireNamespace("aplyr", quietly = TRUE)) { res <- alply(.data=delayAth[11:20,-c(1:2)], .margins=1, .fun=origin_edm, distance=eff, silent=TRUE, .progress='text') perfAth <- ldply(Map(performance, x = res, start = as.list(delayAth$k0), list(graph = ptnAth))) # performance plots plot_performance(perfAth, var='rank', ylab='rank of correct detection',text.padding=0.5) plot_performance(perfAth, var='dist', ylab='distance to correct detection') } ## End(Not run)
A plot method for public transportation networks (PTNs).
plot_ptn( g, color.coding = NULL, color.scheme = rev(sequential_hcl(5)), legend = FALSE, ... )
plot_ptn( g, color.coding = NULL, color.scheme = rev(sequential_hcl(5)), legend = FALSE, ... )
g |
|
color.coding |
numeric vector with length equal to the number of network nodes |
color.scheme |
character vector of length 5 indicating the |
legend |
logical indicating whether legend for color-coding should be added or not. |
... |
further arguments to be passed to |
No return value
Other network helper:
analyze_ptn()
data(ptnAth) plot_ptn(ptnAth) data(ptnGoe) plot_ptn(ptnGoe)
data(ptnAth) plot_ptn(ptnAth) data(ptnGoe) plot_ptn(ptnGoe)
Public transportation network datasets from LinTim software (Integrated Optimization in Public Transportation)
ptnAth
The data of the Athens Metro, consisting of 51 nodes and 52 edges.
Vertex attributes: station name, additonal station info.
Edge attributes: track length (in meter), minimal and maximal time required to pass the track (in minutes).
ptnGoe
The data of the Goettingen bus network, consisting of 257 nodes and 548 edges.
Vertex attributes: station name.
Edge attributes: track length (in meter), minimal and maximal time required to pass the track (in minutes).
Juliane Manitz and Jonas Harbering
Public transportation network datasets are extracted from LinTim software (Integrated Optimization in Public Transportation; https://lintim.net/). Special thanks to Anita Schoebel for making the data available.
The Athens Metro data was collected by Konstantinos Gkoumas.
The Goettingen bus network data was collected by Barbara Michalski.
# Athens metro system data(ptnAth) plot_ptn(ptnAth) # Goettingen bus system data(ptnGoe) plot_ptn(ptnGoe)
# Athens metro system data(ptnAth) plot_ptn(ptnAth) # Goettingen bus system data(ptnGoe) plot_ptn(ptnGoe)
Reads a data file as provided by 'Deutsche Bahn' (for internal use).
read_DB_data(file)
read_DB_data(file)
file |
character with path and file name containing the variables for 'stationID', 'date', 'hour', 'minutes', and 'delay' |
data.frame
with variables 'node'
, 'time'
, 'delay'
Other data_handling:
aggr_data()
run robustness analysis for a source estimate by subsampling individual events.
robustness( x, type = c("edm", "backtracking", "centrality"), prop, n = 100, ... )
robustness( x, type = c("edm", "backtracking", "centrality"), prop, n = 100, ... )
x |
|
type |
character, specifying the method, |
prop |
numeric, value between zero and one, proportion of events to be sampled |
n |
numeric, number of resamplings |
... |
parameters to be passed to origin methods |
We create subsamples of individual events and their magnitude using a sampling proportion p in [0, 1]. After aggregating the data, we apply the source estimation approach. Using this result, we deduce the relative frequency of how often the source estimate obtained with the complete data set can be recovered by source estimation based on the subsample. Thus, the estimate robustness is assessed by the proportion of estimate recovery.
data.frame
with columns
est
origin estimated when all data is evaluated
rob
estimate uncertainty, computed as the proportion of resamplings when origin estimate was recovered
# generate random delay data data(ptnAth) require(igraph) dat <- data.frame(node = sample(size = 500, make.names(V(ptnAth)$name), replace = TRUE), time = sample(size = 500, 1:10, replace = TRUE), delay = rexp(500, rate=10)) # compute effective distance net <- igraph::as_adjacency_matrix(ptnAth, sparse=FALSE) p <- net/rowSums(net) eff <- eff_dist(p) colnames(eff) <- paste('x.',colnames(eff),sep='') # run robustness analysis r5 <- robustness(x=dat, type='edm', prop=0.5, n=10, distance=eff) summary(r5) plot(r5) # compare results r9 <- robustness(x=dat, type='edm', prop=0.9, n=10, distance=eff) plot(r9, add=TRUE, col='gray')
# generate random delay data data(ptnAth) require(igraph) dat <- data.frame(node = sample(size = 500, make.names(V(ptnAth)$name), replace = TRUE), time = sample(size = 500, 1:10, replace = TRUE), delay = rexp(500, rate=10)) # compute effective distance net <- igraph::as_adjacency_matrix(ptnAth, sparse=FALSE) p <- net/rowSums(net) eff <- eff_dist(p) colnames(eff) <- paste('x.',colnames(eff),sep='') # run robustness analysis r5 <- robustness(x=dat, type='edm', prop=0.5, n=10, distance=eff) summary(r5) plot(r5) # compare results r9 <- robustness(x=dat, type='edm', prop=0.9, n=10, distance=eff) plot(r9, add=TRUE, col='gray')
robustness
print
produces an output for objects of class robustness
## S3 method for class 'robustness' print(x, ...) ## S3 method for class 'robustness' summary(object, x = object, ...) ## S3 method for class 'robustness' plot(x, y = NULL, add = FALSE, ...)
## S3 method for class 'robustness' print(x, ...) ## S3 method for class 'robustness' summary(object, x = object, ...) ## S3 method for class 'robustness' plot(x, y = NULL, add = FALSE, ...)
x |
|
... |
further arguments passed to the default |
object |
object of class |
y |
not used; default |
add |
logical specifying whether this should be added to another robustness plot |
stochastic_sib_model
Stochastic SIB model for infected cases simulation
stochastic_sib_model( mu, beta, rho, sigma, gamma, alpha, mu_B, m = 0.3, theta, nnodes, POP_node, fluxes, time_sim, y0 )
stochastic_sib_model( mu, beta, rho, sigma, gamma, alpha, mu_B, m = 0.3, theta, nnodes, POP_node, fluxes, time_sim, y0 )
mu |
population natality and mortality rate (day^-1) |
beta |
contact rate |
rho |
immunity loss rate (day^-1) |
sigma |
symptomatic ratio, i.e., fraction of infected people that develop symptoms and are infective. (The remaining fraction enters directly the recovered compartment.) |
gamma |
rate at which people recover from cholera (day^-1) |
alpha |
cholera induced mortality rate (day^-1) |
mu_B |
death rate of V.cholerae in the aquatic environment (day^-1) |
m |
parameter for infection force, default value is 0.3 |
theta |
contamination rate |
nnodes |
number of nodes/cities |
POP_node |
vector, length represents number of cities/nodes; vector represents population at each node |
fluxes |
matrix, number of nodes x number of nodes where each row contains the probabilities a person travels from the given city (by Row Index) to another city (by Column Index). |
time_sim |
time steps for simulation, e.g., seq(0, 100, 0.1) |
y0 |
initial condition for stochastic_sib_model, output of 'initial_condition_sib_model' |
a matrix, nnodes x number of time steps, representing number of new cases at each node, each time step
Jun Li
Li, J., Manitz, J., Bertuzzo, E. and Kolaczyk, E.D. (2020). Sensor-based localization of epidemic sources on human mobility networks. arXiv preprint Available online: https://arxiv.org/abs/2011.00138.
set.seed(2020) popu <- rep(20000, 10) sigma <- 0.05 mu_B <- 0.2 theta_max <- 16 theta <- runif(10, 0.1, 0.9) * theta_max y0 <- initial_condition_sib_model(popu, sigma, mu_B, theta, c(3)) time_sim <- seq(0, 1, by=0.1) mu <- 4e-05 beta_max <- 1 rho <- 0 beta <- runif(10, 0.1, 0.9) * beta_max gamma <- 0.2 alpha <- 0 humanmob.mass <- matrix(runif(100, 0.1, 0.9), 10, 10) diag(humanmob.mass) <- 0 for (j in 1:10) { humanmob.mass[j, ] <- humanmob.mass[j, ]/sum(humanmob.mass[j, ]) } simu.list = stochastic_sib_model(mu = mu, beta = beta, rho = rho, sigma = sigma, gamma = gamma, alpha = alpha, mu_B = mu_B, theta = theta, nnodes = 10, POP_node = popu, fluxes = humanmob.mass, time_sim = time_sim, y0 = y0)
set.seed(2020) popu <- rep(20000, 10) sigma <- 0.05 mu_B <- 0.2 theta_max <- 16 theta <- runif(10, 0.1, 0.9) * theta_max y0 <- initial_condition_sib_model(popu, sigma, mu_B, theta, c(3)) time_sim <- seq(0, 1, by=0.1) mu <- 4e-05 beta_max <- 1 rho <- 0 beta <- runif(10, 0.1, 0.9) * beta_max gamma <- 0.2 alpha <- 0 humanmob.mass <- matrix(runif(100, 0.1, 0.9), 10, 10) diag(humanmob.mass) <- 0 for (j in 1:10) { humanmob.mass[j, ] <- humanmob.mass[j, ]/sum(humanmob.mass[j, ]) } simu.list = stochastic_sib_model(mu = mu, beta = beta, rho = rho, sigma = sigma, gamma = gamma, alpha = alpha, mu_B = mu_B, theta = theta, nnodes = 10, POP_node = popu, fluxes = humanmob.mass, time_sim = time_sim, y0 = y0)
This is a helper method for weighted variance computation in origin_edm
, which is the closest to the bootstrap.
var_wtd_mean_cochran(x, w)
var_wtd_mean_cochran(x, w)
x |
numeric vector of values |
w |
numeric vector of weights |
numeric value of weighted variance
Gatz, D. F., and Smith, L. (1995). The standard error of a weighted mean concentration-I. Bootstrapping vs other methods. Atmospheric Environment, 29(11), 1185-1193. <DOI: 10.1016/1352-2310(94)00210-C>
Gatz, D. F., and Smith, L. (1995). The standard error of a weighted mean concentration-II. Estimating confidence intervals. Atmospheric Environment, 29(11), 1195-1200. <DOI: 10.1016/1352-2310(94)00209-4>