"Reusing Search Data in Ranking and Selection: What Could Possibly Go Wrong" by Eckman and Henderson rigorously defines the statistical guarantees for ranking-and-selection (R&S) procedures after random search, and points out that the simulation replications collected in the search phase are conditionally dependent given the sequence of returned systems. Therefore, reusing the search data for R&S may affect the statistical guarantees. The authors further design random search algorithms to demonstrate that the correct selection guarantees of some ranking-and-selection procedures will be compromised if reusing the simulation replications taken during the search. This replicated computation report focuses on the reproducibility of the experiment results in the aforementioned article.
In simulations, probabilistic algorithms, and statistical tests, we often generate random integers in an interval (e.g., $[0,s)$). For example, random integers in an interval are essential to the Fisher-Yates random shuffle. Consequently, popular languages like Java, Python, C++, Swift, and Go include ranged random integer generation functions as part of their runtime libraries. Pseudo-random values are usually generated in words of a fixed number of bits (e.g., 32~bits, 64~bits) using algorithms such as a linear congruential generator. We need functions that convert such random words to random integers in an interval ($[0,s)$) without introducing statistical biases. The standard functions in programming languages such as Java or Go involve integer divisions. Unfortunately, division instructions are relatively expensive. We review an unbiased function that generates ranged integers from a source of random words while avoiding integer divisions with high probability. To establish the practical usefulness of the approach, we show that this algorithm can multiply the speed of unbiased random shuffling on x64 processors. Our proposed approach has been adopted by the Go language for its implementation of the shuffle function.
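The core idea, multiplying the random word by $s$ and rejecting only a small bias region, can be sketched in a few lines. The Go sketch below is a minimal rendition of this nearly divisionless rejection approach, not the exact library code; the function name uniformBounded and the use of math/rand's Uint64 as the word source are illustrative assumptions.

```go
package main

import (
	"fmt"
	"math/bits"
	"math/rand"
)

// uniformBounded returns an unbiased random integer in [0, s) from a source
// of uniform 64-bit words, using a multiplication instead of a division in
// the common case and falling back to rejection only in a small bias region.
func uniformBounded(next func() uint64, s uint64) uint64 {
	x := next()
	hi, lo := bits.Mul64(x, s) // view x*s as a 128-bit product: hi = floor(x*s / 2^64)
	if lo < s {
		// Rare slow path: one modulo computes the rejection threshold
		// (2^64 - s) mod s; resample until the result is unbiased.
		threshold := -s % s
		for lo < threshold {
			x = next()
			hi, lo = bits.Mul64(x, s)
		}
	}
	return hi // uniform on [0, s)
}

func main() {
	// Example: an unbiased Fisher-Yates shuffle built on uniformBounded.
	a := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	for i := len(a) - 1; i > 0; i-- {
		j := int(uniformBounded(rand.Uint64, uint64(i+1)))
		a[i], a[j] = a[j], a[i]
	}
	fmt.Println(a)
}
```

In this sketch the division-free fast path is taken whenever lo >= s, which happens with high probability when s is much smaller than $2^{64}$.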
Virtual performance is a class of time-dependent performance measures conditional on a particular event occurring at time $\tau_0$ for a (possibly) nonstationary stochastic process; the virtual waiting time of a customer arriving at a queue at time $\tau_0$ is one example. Virtual statistics are estimators of virtual performance. In this paper, we go beyond the mean and propose estimators for the variance of virtual performance and for the derivative of its mean with respect to time, examining both their small-sample and asymptotic properties. We also provide a modified $K$-fold cross-validation method for tuning the parameter $k$ of the difference-based variance estimator, and we evaluate the performance of both the variance and derivative estimators via controlled studies and a realistic illustration. The variance and derivative provide useful information that is not apparent in the mean of virtual performance.
Software-defined networking (SDN) enables efficient network management. As the technology matures, utilities are looking to integrate its benefits into their operations technology (OT) networks. To help the community better understand and evaluate the effects of such integration, we develop DSSnet, a testing platform that combines a power distribution system simulator and an SDN-based network emulator for smart grid planning and evaluation. DSSnet relies on a container-based virtual time system to achieve efficient synchronization between the simulation and emulation systems. To enhance scalability and usability, we extend DSSnet to support a distributed controller environment. To enhance fidelity, we extend the virtual time system to support kernel-based switches. We also evaluate the performance of DSSnet and demonstrate its usability with a resilient demand response application case study.
This paper addresses the statistical output analysis of transient simulations in a parallel computing environment with a fixed computing time budget. Under such a budget, most commonly used estimators based on the output sequence lose their unbiasedness. To rectify this issue, this paper proposes an estimation procedure in the Bayesian framework. The proposed procedure is particularly useful when the computing time depends on the output value in each simulation replication. The effectiveness of our method is demonstrated through studies on a queuing simulation and a control chart simulation.
In this paper, we propose a role-dependent (RD) data-driven modeling approach to simulate pedestrian motion in high-density scenes. It is commonly observed that pedestrians behave quite differently when walking in dense crowds: some people explore routes towards their destinations, while some deliberately follow others, leading to lane formation. Based on these observations, two roles are included in the proposed model: leader and follower. The motion behaviors of leaders and followers are modeled separately. Leaders' behaviors are learned from real crowd motion data using state-action pairs, while followers' behaviors are calculated based on specific targets obtained dynamically during the simulation. The proposed RD model is trained and applied to different real-world datasets to evaluate its generality and effectiveness. The simulation results demonstrate that the RD model is capable of realistically simulating crowd behaviors in crowded scenes and reproducing collective crowd behaviors such as lane formation.
We consider the bias arising from time discretization when estimating the threshold crossing probability $w(b) := P(\sup_{t \in (0,1]} B_t > b)$, with $(B_t)_{t \in (0,1]}$ a standard Brownian motion. We prove that if the discretization is equidistant, then to reach a given target value of the bias, the number of grid points has to grow quadratically in $b$, as $b$ grows. When considering non-equidistant discretizations (i.e., with $b$-dependent grid points), we can substantially improve on this: we show that for such adaptive grids the required number of grid points is independent of $b$, and in addition we point out how they can be used to construct a strongly efficient algorithm for the estimation of $w(b)$.
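To make the source of the bias concrete, the following minimal Go sketch estimates $w(b)$ by simulating the Brownian motion on an equidistant grid with $n$ points and checking whether the discretized path exceeds $b$. Because the true supremum can cross the threshold between grid points, the grid estimate underestimates $w(b)$, and the gap shrinks as $n$ grows; the values of $b$, $n$, and the replication count below are illustrative, and the exact reference value comes from the reflection principle, $w(b) = 2P(B_1 > b)$.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// estimateCrossing approximates w(b) = P(sup_{t in (0,1]} B_t > b) by Monte
// Carlo on an equidistant grid with n points: it simulates Brownian increments
// of variance 1/n and records whether the discretized path ever exceeds b.
func estimateCrossing(b float64, n, replications int) float64 {
	step := math.Sqrt(1.0 / float64(n))
	hits := 0
	for r := 0; r < replications; r++ {
		bt := 0.0
		for i := 0; i < n; i++ {
			bt += step * rand.NormFloat64()
			if bt > b {
				hits++
				break
			}
		}
	}
	return float64(hits) / float64(replications)
}

func main() {
	b := 2.0
	// Reflection principle: w(b) = 2*P(B_1 > b) = erfc(b / sqrt(2)).
	exact := math.Erfc(b / math.Sqrt2)
	for _, n := range []int{10, 100, 1000} {
		fmt.Printf("n = %4d: estimate = %.4f (exact %.4f)\n", n, estimateCrossing(b, n, 200000), exact)
	}
}
```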
As supercomputers approach exascale performance, the increased number of processors translates to an increased demand on the underlying network interconnect. The slim fly network topology, a new low-diameter, low-latency, and low-cost interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this paper, we present a high-fidelity slim fly flit-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES) frameworks. We validate our slim fly model against published results at moderately sized network scales. We further scale the model up to an unprecedented 1 million compute nodes and, through visualization of network simulation metrics such as link bandwidth, packet latency, and port occupancy, provide insight into the network behavior. In addition to synthetic workloads, we evaluate the large-scale slim fly models with real communication workloads from applications in the Design Forward program. We also show linear strong scaling of the slim fly model on an Intel cluster, achieving a peak event rate of 36 million events per second using 128 MPI tasks to process 7 billion events.
Processes in computer simulations tend to be highly repetitive. In particular, parameter studies further exacerbate the situation, as the same model is repeatedly executed with only partially varying parameters. Consequently, computer simulations perform identical computations, with identical code, identical input, and hence identical output. These redundant computations waste significant time and energy resources. Memoization, dating back to 1968, allows the caching of these intermediate identical results, thereby significantly speeding up those computations. However, until now, automated approaches were limited to pure functions. At ACM SIGSIM-PADS 2016, we published, to the best of our knowledge, the first practical approach for automated memoization of impure code. In this work, we extend this approach and evaluate the performance characteristics of a number of extensions: (1) to reduce the memory footprint, we investigate the impact of several cache eviction strategies; (2) we allow the original and the memoized code to coexist via a runtime switch and analyze the crossover point; (3) by allowing the memoization cache to be persisted to disk, we expand the scope to exploratory parameter studies, where cached results can now be reused across multiple simulation runs. Altogether, automated memoization for impure code is a valuable technique, the versatility of which we explore further in this article. Additional internal improvements allow us to speed up a case study of an OFDM network simulation by a factor of 80 (compared to 75 in our previous work) with only a marginal increase in memory consumption.
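For readers unfamiliar with the basic technique, the Go sketch below shows memoization in its simplest form for a pure, single-argument function; the function name memoize is illustrative, and handling impure code, cache eviction, the runtime switch, and on-disk persistence described above all go beyond this minimal in-memory cache.

```go
package main

import "fmt"

// memoize wraps a function with a simple cache keyed by its argument. This
// sketches only the basic idea for a pure function: identical inputs yield a
// cache hit, so the redundant computation is skipped.
func memoize[K comparable, V any](f func(K) V) func(K) V {
	cache := make(map[K]V)
	return func(k K) V {
		if v, ok := cache[k]; ok {
			return v // cache hit
		}
		v := f(k)
		cache[k] = v
		return v
	}
}

func main() {
	calls := 0
	slow := func(n int) int { calls++; return n * n }
	fast := memoize(slow)
	fmt.Println(fast(12), fast(12), "underlying calls:", calls) // 144 144 underlying calls: 1
}
```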
Neuromorphic computing is a broad category of non-von Neumann architectures that mimic biological nervous systems in hardware. Current research shows that this class of computing can execute data classification algorithms using only a tiny fraction of the power conventional CPUs require. This raises the larger research question: how might neuromorphic computing be used to improve application performance, power consumption, and overall system reliability of future supercomputers? To address this question, an open-source neuromorphic processor architecture simulator called NeMo is being developed. This effort will enable the design space exploration of potential heterogeneous compute systems that combine traditional CPUs, GPUs, and neuromorphic hardware. This paper examines the design, implementation, and performance of NeMo. We demonstrate NeMo's efficient execution on 2,048 nodes of an IBM Blue Gene/Q system, modeling 8,388,608 neuromorphic processing cores. The peak performance of NeMo is just over ten billion events per second when operating at this scale.
We introduce Path-ZVA: an efficient simulation technique for estimating the probability of reaching a rare goal state before a renewal state in a Markov chain. Standard (Monte Carlo) simulation techniques do not work well for rare events, so we use importance sampling; i.e., we change the probability measure governing the Markov chain such that transitions `towards' the goal state become more likely. To do this we need an idea of distance to the goal state, so some level of knowledge of the Markov chain is required. In this paper, we use graph analysis to obtain this knowledge. In particular, we focus on knowledge of the shortest paths (in terms of `rare' transitions) to the goal state. We show that only a subset of the (possibly huge) state space of the Markov chain needs to be considered. We compare our results to well-known importance sampling methods from the literature and demonstrate the large potential gains of our method.
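As a point of reference for the change-of-measure idea described above, the following Go sketch applies plain static importance sampling to a toy birth-death chain, estimating the probability of reaching a rare goal level before returning to the renewal state 0 by swapping the up and down probabilities and reweighting each path with its likelihood ratio. This is only the generic importance-sampling estimator that Path-ZVA refines, not the Path-ZVA measure itself; the chain, the swapped measure, and the parameter values are illustrative.

```go
package main

import (
	"fmt"
	"math/rand"
)

// isEstimate estimates the probability of reaching level N before returning
// to the renewal state 0, starting from state 1, in a birth-death chain with
// a small "up" probability p. It samples under a measure that swaps the up
// and down probabilities and corrects with the likelihood ratio.
func isEstimate(p float64, N, replications int) float64 {
	q := 1 - p
	sum := 0.0
	for r := 0; r < replications; r++ {
		state := 1
		weight := 1.0 // running likelihood ratio (true measure / sampling measure)
		for state != 0 && state != N {
			if rand.Float64() < q { // "up" move sampled with the swapped probability q
				state++
				weight *= p / q
			} else { // "down" move sampled with probability p
				state--
				weight *= q / p
			}
		}
		if state == N {
			sum += weight // only paths that hit the goal contribute
		}
	}
	return sum / float64(replications)
}

func main() {
	fmt.Printf("estimated rare-event probability: %.3e\n", isEstimate(0.05, 20, 100000))
}
```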
In this paper, we develop a new scheme of exact simulation for a class of tempered stable (TS) and other related distributions with similar Laplace transforms. We discover some interesting integral representations for the underlying density functions that imply a unique simulation framework based on a backward recursive procedure. Therefore, the foundation of this simulation design is very different from existing schemes in the literature. It works pretty efficiently for some subclasses of TS distributions, where even the conventional acceptance-rejection mechanism can be avoided. It can also generate some other distributions beyond the TS family. For applications, this scheme could be easily adopted to generate a variety of TS-constructed random variables and TS-driven stochastic processes for modelling observational series in practice. Numerical experiments and tests are performed to demonstrate the accuracy and effectiveness of our scheme.