All News Posts
-
Release 8.1.0
Our new release is packed with new features and bug fixes!! :D
-
GVT Hook - Running custom code at GVT computation time
Sometimes, we want to switch our model as it is running, or we want to pause the simulation and check what it is doing, or we want to checkpoint the whole thing. Well, that is finally possible with the GVT hook.
-
Checking Reverse Handler - The debugging tool for PDES you didn't know you needed
We have implemented a new synchronization algorithm: sequential rollback check (--synch=6).
-
Args file: --args-file=options.txt
ROSS's new --args-file argument allows us to pass arguments in a file.
-
Release 8.0.0
This version includes all of the 2020-2022 work on Tiebreaking and various updates to develop since the last version.
-
Usage FAQ
A quick list of answered questions for people new (and not so new) to ROSS.
-
LP and PE Mappings
Once a model is developed for ROSS, the performance of simulations becomes hugely important. One factor that can play a role is how the LPs are mapped to the physical hardware underneath. LPs that communicate frequently may benefit from being placed within the same MPI process, while LPs that communicate infrequently may be placed on different MPI processes.
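
As a rough illustration of why the mapping matters, the sketch below contrasts two simple ways of assigning LP global IDs to PEs. The function names and parameters are hypothetical and are not part of the ROSS API. It assumes that LPs with neighboring IDs communicate frequently, in which case a block mapping keeps them on the same PE while a round-robin mapping spreads them out.

```c
#include <assert.h>

/* Hypothetical helper: block mapping places consecutive LP IDs on the same PE. */
static unsigned long block_map(unsigned long lp_gid, unsigned long lps_per_pe) {
    assert(lps_per_pe > 0);
    return lp_gid / lps_per_pe;
}

/* Hypothetical helper: round-robin mapping deals LP IDs out across PEs in turn. */
static unsigned long round_robin_map(unsigned long lp_gid, unsigned long num_pes) {
    assert(num_pes > 0);
    return lp_gid % num_pes;
}
```

With 4 LPs per PE, LPs 4 and 5 share a PE under the block mapping, whereas with 4 PEs the round-robin mapping puts them on different PEs, so every message between them would cross MPI processes.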
-
Release 7.2.0
This release has some new features:
-
Versioning and Releases
In addition to the git branching workflow development model, the ROSS project uses the Semantic Versioning model for version numbers. This helps with reproducibility of experiments detailed in publications and helps users ensure that they are using the correct version of ROSS, especially when using other software that depends on ROSS.
-
Default Clock: Get-Time-of-Day
With PR #170 there is a new default clock based on the system gettimeofday function. This means that ROSS can function on any architecture, even if a processor-specific system clock is not implemented. More details on the gettimeofday function can be found on the Linux man page.
-
Delta Encoding
Delta encoding provides a solution for models that contain events which are not well suited to reverse computation or which consume significant amounts of memory (making copy-state approaches infeasible). Delta encoding solves this issue by only computing state change deltas once an event has completed execution. These deltas are then compressed for reduced storage overheads. In addition, delta encoding is done on a per-event basis, allowing both reverse computation and delta encoding to be mixed within a single model. Overall, the delta encoding approach provides the benefits of incremental state-saving but without requiring the specific identification of which state elements change. This feature is further described in LaPre et al., 2015.
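
The general idea can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the ROSS implementation: the demo_state struct and both function names are hypothetical, a byte-wise XOR stands in for whatever differencing ROSS actually uses, and the compression step is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical LP state, used only for this illustration. */
typedef struct {
    long landings;
    long waiting_time;
} demo_state;

/* Compute a byte-wise XOR delta between the state before and after an event.
 * Bytes that did not change become zero, which makes the delta compressible. */
static void delta_compute(const void *before, const void *after,
                          unsigned char *delta, size_t n) {
    assert(before && after && delta);
    const unsigned char *b = before;
    const unsigned char *a = after;
    for (size_t i = 0; i < n; i++)
        delta[i] = b[i] ^ a[i];
}

/* Roll back: XOR-ing the delta into the post-event state restores the
 * pre-event state without any model-specific reverse handler. */
static void delta_apply(void *state, const unsigned char *delta, size_t n) {
    assert(state && delta);
    unsigned char *s = state;
    for (size_t i = 0; i < n; i++)
        s[i] ^= delta[i];
}
```

Note that the delta is computed without knowing which fields the event handler touched, which is the property the paragraph above highlights.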
-
LP Printing with tw_output
LP-level printing is available from the forward event handler. Models simply need to attach an output buffer to the event being processed, using the tw_output function. If this event is eventually committed, all attached output will be sent to standard-out.
-
Website Refresh
The ROSS-org website has just been refreshed. We hope to make it clear that this website is the starting point for ROSS documentation and a central place for any announcements relevant to the ROSS community. We welcome any contributions, announcements or documentation posts, from the community (check out our contributing guide for details).
-
TW_STIME API: User Defined Simulation Time
To allow for user-defined simulation time, a new TW_STIME API has been put in place. This API is implemented through various #define macros, which should allow users flexibility while maintaining the performance of ROSS when not using this feature. Any models which have already been developed using the original double type for tw_stime are unchanged.
-
Git Branching Workflow
We’re now following a new Git development model for ROSS, based on this post by Vincent Driessen. For anyone using ROSS and not doing development work, nothing changes. Stick to the master branch, which will be at the most recent ROSS release. For ROSS devs, continue reading.
-
Building a Model with ROSS
The first step for any new model is to get some bare-bones code building. Unfortunately, building a model on top of ROSS is somewhat complicated. This post will take you through several methods of getting your code running, each of which is demonstrated with the template model.
-
ROSS-org on GitHub
ROSS now has its own GitHub organization, ROSS-org!
-
ROSS Installation with Spack
ROSS can now be installed using the Spack package manager.
-
Optimistic Parameters
There are several different parameters that are available for tuning optimistic parallel mode in ROSS. Optimistic mode can be performed in two ways; the difference between them is how the frequency of GVT computation is determined. The traditional way, --synch=3, uses the number of events to determine how frequently to perform GVT. This means GVT will be performed after every batch x GVT-interval number of events (these parameters are explained in more detail below). The other method is similar, but GVT is performed after some amount of real time has passed (see Optimistic Realtime Scheduler for more details).
-
Instrumentation Overview
Several different modes of instrumentation have been added to ROSS that can be used to collect data on the simulation engine and/or the model being simulated:
-
Real Time Sampling
This collects data at real time intervals specified by the user. The runtime option --rt-interval=n sets the sampling interval, where n is the number of milliseconds between sampling points. The default is set to 1000 ms.
-
GVT-based Sampling
This collects data immediately after each GVT. By default, the data is collected on a PE basis, but some metrics can be changed to tracking on a KP or LP basis (depending on the metric).
-
Virtual Time Sampling
In order to support virtual time sampling in ROSS, specialized LPs called Analysis LPs were added to the ROSS core. These LPs are hidden from the model, so they will not affect any LP to PE/KP mappings.
-
Event Tracing
For event tracing, ROSS can directly access the source and destination LP IDs for each event, as well as the sent and received virtual timestamps. It will also record the real time that the event is computed at.
-
Data Sample Format
Since the instrumentation data is output in binary, this page describes the format of the data. You can also see the ROSS Binary Reader repo for examples of reading the data. Each sample collected (regardless of which instrumentation mode is used) is broken into two parts, the metadata and the sample data.
-
In Situ Analysis and Visualization with Damaris
Damaris is an I/O and data management software. Support for Damaris is currently being added to ROSS to enable in situ data analysis and visualization. The current focus is to use it with the various instrumentation modes to do performance analysis on simulations to better understand performance bottlenecks. Eventually it will support visualizing model data as well.
-
Streaming Data with Damaris
Note: This is still under heavy development.
-
Simulation Engine Metrics Descriptions
This page provides explanations on each of the simulation engine metrics collected in the ROSS instrumentation layer.
-
MPI Communicators in ROSS
By default, ROSS will use MPI_COMM_WORLD for all its communications. tw_comm_set allows the user to change the communicator used by ROSS, for instance to run ROSS on a subset of the ranks of a larger MPI application. tw_comm_set should be called before calling tw_init. If tw_comm_set is used, then the user is responsible for calling MPI_Finalize after calling tw_end.
-
Manual ROSS Installation
ROSS can now be installed using the Spack package manager. To see those instructions, please see Installing ROSS with Spack.
-
Building and Running ROSS on CCI Blue Gene/Q
These instructions are specifically for users of RPI’s AMOS System (the IBM Blue Gene/Q at the CCI). They could be followed for any system behind a firewall, where direct access to GitHub is not allowed.
-
Running the Simulator
Quick Help
-
RIO Version 2 Release
The latest version of the RIO API for ROSS checkpoints has just been released! The RIO project can be found here.
-
Lamport Clocks
Lamport clocks are a simple technique used for determining the order of events in a distributed system. First proposed by Leslie Lamport in a paper available here, a Lamport clock maintains order of operations by incrementing a counter contained in the events. By simply adding a counter value to events as they are received and incrementing this value based on the last seen value, Lamport clocks provide a simple way to determine order of events. Lamport clocks provide a partial ordering of events – specifically “happened-before” ordering.
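
A minimal sketch of the two clock rules follows. The type and function names are illustrative only and do not come from ROSS. Each process keeps one counter, advances it on every local event or send, and on receive jumps past the timestamp carried by the message.

```c
#include <assert.h>
#include <stdint.h>

/* A Lamport clock is a single monotonically increasing counter per process. */
typedef struct {
    uint64_t time;
} lamport_clock;

/* Local event or message send: advance the clock and return the new value,
 * which is the timestamp to attach to an outgoing message. */
static uint64_t lamport_tick(lamport_clock *c) {
    assert(c);
    return ++c->time;
}

/* Message receive: move past both the local clock and the timestamp carried
 * by the message, so the receive is always ordered after the send. */
static uint64_t lamport_receive(lamport_clock *c, uint64_t msg_time) {
    assert(c);
    if (msg_time > c->time)
        c->time = msg_time;
    return ++c->time;
}
```

Because a receive always gets a larger timestamp than the matching send, any event that happened before another carries a smaller timestamp. The converse does not hold, which is why the ordering is only partial.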
-
Step 2a: Modeling Airplane Functionality
Now that the Airplane and Airport LPs have been defined, their functionality can be implemented. During the model implementation process, there are several non-trivial design decisions that must be made. This post attempts to document the thought process and order of steps that take place during the implementation of the airplane functionality.
-
Debugging Tips & Tricks
Debugging a PDES model can be a very tricky venture. Fortunately, there seems to be a small subset of common bugs that are found in many different models. This post discusses the basics of debugging a parallel program and some of the bugs to watch out for. Learn from mistakes of the past!
-
LP Type Callbacks
The functionality of an LP is realized through a number of callback functions, and each LP type can have its own set of callbacks. As the model developer, you simply need to program a single LP’s task for each function. The ROSS simulation engine takes care of the rest.
-
Style Guide
ROSS doesn’t officially enforce any particular style guide. Over the years, the different contributors have each had their own conventions. This post will document the conventions currently in place and outline a style guide for future contributions.
-
Versioning and Releases
Is the history of your repository safe? If there are continuous integration tests, does that mean every merged commit is safe? It turns out that the way in which branches were merged into master made some false assumptions, which, in turn, hurt the end users. This post lays out the old and the new, documenting the way in which ROSS will be versioned.
-
Optimistic Realtime Scheduler
The realtime scheduler is a new optimistic scheduling option for ROSS, invoked through --synch=5. With this scheduler, GVT intervals are triggered based on the amount of time elapsed (rather than the number of events that have been processed). For models that see minimal speedup during regular optimistic simulation, or models where there is a known load imbalance, the optimistic realtime scheduler may lead to increased performance.
-
Schedulers
Schedulers are at the core of any parallel discrete-event simulation (PDES) system. They are responsible for the performance of the simulation and, in some cases, can indirectly influence correctness as well. Schedulers execute events, and they do so in timestamp, or virtual time, order. As each event is executed, time jumps to the timestamp of that event. In this blog post we will discuss all three major scheduler categories: sequential, conservative, and optimistic.
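
To make the "time jumps" idea concrete, here is a toy sequential scheduler loop. It is a sketch under simplifying assumptions and is not how ROSS is implemented: the demo_event type and function names are hypothetical, the pending set is a plain array scanned linearly rather than a priority queue, and events do not schedule further events.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event: a timestamp plus an identifier. */
typedef struct {
    double ts;
    int id;
} demo_event;

/* Remove and return the pending event with the smallest timestamp. */
static demo_event pop_earliest(demo_event *pending, size_t *n) {
    assert(*n > 0);
    size_t min = 0;
    for (size_t i = 1; i < *n; i++)
        if (pending[i].ts < pending[min].ts)
            min = i;
    demo_event e = pending[min];
    pending[min] = pending[--(*n)];  /* fill the hole with the last element */
    return e;
}

/* Execute every pending event in timestamp order. Event ids are written to
 * order_out in execution order; the final virtual time is returned. */
static double run_sequential(demo_event *pending, size_t n, int *order_out) {
    double now = 0.0;
    size_t k = 0;
    while (n > 0) {
        demo_event e = pop_earliest(pending, &n);
        now = e.ts;             /* virtual time jumps to the event's timestamp */
        order_out[k++] = e.id;  /* stand-in for running the event handler */
    }
    return now;
}
```

Conservative and optimistic schedulers differ in how they preserve this ordering across processes, but the per-event step of advancing virtual time to the event's timestamp is the same.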
-
Step 1: Designing Objects
The very first thing any ROSS user should do is develop their own model. This is the best way to gain experience with the ROSS API and the best way to expose oneself to the principles of discrete-event simulation. Here we start with a definition of discrete-event simulation and begin to create a basic model for airports.
-
Random Numbers
ROSS’s reversible random number generator is based on L’Ecuyer’s Combined Linear Congruential Generator (see the implementation paper or the wikipedia article). On top of this implementation, ROSS adds the ability to “rewind” the RNG, a functionality needed for reverse computation.
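
The "rewind" idea can be illustrated with a much simpler generator. The sketch below is not the ROSS RNG: it assumes a single 64-bit linear congruential generator (using the multiplier and increment from Knuth's MMIX generator) instead of L’Ecuyer’s combined generator, and all function names are hypothetical. Because the multiplier is odd, it has a multiplicative inverse modulo 2^64, so each step can be undone exactly.

```c
#include <assert.h>
#include <stdint.h>

#define LCG_A 6364136223846793005ULL  /* multiplier (odd, so invertible mod 2^64) */
#define LCG_C 1442695040888963407ULL  /* increment */

/* Multiplicative inverse of an odd number modulo 2^64, via Newton iteration. */
static uint64_t inverse_mod_2_64(uint64_t a) {
    assert(a & 1u);
    uint64_t inv = a;        /* a * a == 1 (mod 8), so 3 bits are already correct */
    for (int i = 0; i < 5; i++)
        inv *= 2 - a * inv;  /* each pass doubles the number of correct bits */
    return inv;
}

/* Forward step: state <- A * state + C (mod 2^64). Returns the new state. */
static uint64_t lcg_next(uint64_t *state) {
    *state = LCG_A * (*state) + LCG_C;
    return *state;
}

/* Reverse step: undo exactly one lcg_next call. */
static void lcg_reverse(uint64_t *state) {
    *state = inverse_mod_2_64(LCG_A) * (*state - LCG_C);
}
```

In a reverse event handler, a model would call the reverse step once for every random draw made in the forward handler, leaving the stream exactly where it was before the event ran.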
-
Development Setup
ROSS is an actively developed framework. As such, it is important to set up your development environment to allow for updating of the ROSS core library. The goal of this article is to outline the best practices for setting up a development environment.
-
Overview
E. Gonsiorowski. “Enabling Extreme-Scale Circuit Modeling Using Massively Parallel Discrete-Event Simulation,” Ph.D. dissertation, CS, RPI, Troy, NY, 2016.
-
API Description
The RIO API is designed to be familiar to anyone who is using ROSS. The API includes:
-
Checkpoint Description
A RIO checkpoint contains LP and event data (along with metadata) captured at the end of a ROSS simulation. These checkpoints can be used to restart and continue the simulation from where it left off.
-
Adding RIO to a Model
RIO is distributed with ROSS as a git submodule. The RIO package itself can be turned on or off through CMake. Thus, there should be no performance impact on those users who do not use RIO.