1. Introduction

The following diagram illustrates three high-level computing stages commonly used in HEP to obtain physics results from detector signals.

digraph {
  rankdir=LR
  fontsize=10
  margin="1,0.2"

  // Set rank separation to zero so that the begin and end labels are close to the block point nodes.
  ranksep=0

  // Nodes for processing stages
  node [shape="box",
        style="filled,rounded",
        fontname="Arial",
        fontcolor="white",
        fillcolor="gray40",
        height=0.8,
        margin="0.3,0.15"]
  signals [shape="point", width=0.1, fillcolor="black"]
  results [shape="point", width=0.1, fillcolor="black"]

  DAQ [label=<<b>DAQ</b>>]
  framework [label=<<b>Reconstruction and<br/>simulation framework</b>>, fillcolor="royalblue3"]
  Analysis [label=<<b>Analysis</b>>]

  // Artificial nodes for creating labels
  node [penwidth="0", style="nofill", fontcolor="black", margin="0,0"];
  startlabel [label="Detector\nsignals"]
  endlabel [label="Physics\nresults"]

  startlabel -> signals [style=invis]
  signals -> DAQ [arrowhead=none, minlen=3]
  DAQ -> framework -> Analysis [minlen=4]
  Analysis -> results [arrowhead=none, minlen=3]
  results -> endlabel [style=invis]
}

Wikipedia defines a software framework as [Wiki-framework]:

an abstraction in which software, providing generic functionality, can be selectively changed by additional user-written code, thus providing application-specific software.

The framework orchestrates data flow, resource management, and parallel execution. It enables a scientific collaboration to write standardized workflows where physicists can insert their own algorithms. In a HEP context, this insertion often occurs by the framework dynamically loading libraries called plugins. Although not required, a framework often provides a program’s main(…) function, which (directly or indirectly) invokes user code within the plugins as configured at appropriate points in the program’s execution.

Frameworks are typically used in a high-level trigger environment, for reconstructing physics objects from detector signals, or for simulating physics processes. Many analysis needs can also be met by a data-processing framework. However, the HEP community tends to perform final-stage analysis using standalone applications. Phlex therefore aims to satisfy the data-processing needs of physics reconstruction and simulation only.

1.1. Requirements Process

Phlex provides facilities and behaviors intended to support the physics goals of its stakeholders, notably the DUNE experiment [1]. DUNE has established a set of high-level requirements or stakeholder requirements, which constrain the design of the framework in support of DUNE’s needs. A dedicated tool [Jama-connect] is used to manage such stakeholder requirements, tracking them in a version-controlled manner, and creating logical dependencies among them. As the design matures, system requirements are then created to guide implementation in support of the stakeholder requirements.

1.1.1. Requirements Ownership

Each Phlex stakeholder owns its stakeholder requirements, which support the high-level needs of the experiment. System requirements, which are subservient to stakeholder requirements, are owned by the Phlex developers, who are free to adjust the implementation to satisfy all stakeholder requirements.

1.1.2. Requirements in This Document

The stakeholder requirements are listed in Appendix B for convenience. To more easily connect the design to the requirements, any design aspect influenced by specific requirements contains bracketed references to those requirements (e.g. [DUNE 1]).

Where possible, we limit references to stakeholder requirements to the conceptual design in Chapter 3. Some stakeholder requirements are referenced in the subsystem design (under preparation) if those requirements do not affect the conceptual framework model. No system requirements are currently referenced in this document.

1.2. Framework Philosophy

A framework is a tool that aids the scientific process of inferring accurate physics results from observed data. Maintaining data integrity is therefore paramount, as is retaining an accounting of how physics results were obtained from that data. Accordingly, the Phlex design:

  • treats all data presented to (or created by) Phlex as immutable for the remainder of a Phlex program’s execution,

  • requires recording the provenance of every created data product [DUNE 121], and

  • enables, and—to the extent possible—ensures the reproducible creation of data products.
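The three design points above can be illustrated with a minimal Python sketch. The class names `Provenance` and `DataProduct`, their fields, and the `calibrate` algorithm are hypothetical and chosen only for illustration; they are not the Phlex interface.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Provenance:
    """Hypothetical record of how a data product was created."""
    algorithm: str         # name of the creating algorithm
    parameters: tuple      # (name, value) pairs of its configuration
    parents: tuple = ()    # provenance of the input products


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical immutable data product with attached provenance."""
    value: tuple           # immutable payload
    provenance: Provenance


def calibrate(hits: DataProduct, scale: float) -> DataProduct:
    """Create a new product rather than modifying the input."""
    scaled = tuple(h * scale for h in hits.value)
    prov = Provenance("calibrate", (("scale", scale),), (hits.provenance,))
    return DataProduct(scaled, prov)


raw = DataProduct((1.0, 2.0, 3.0), Provenance("source", ()))
calibrated = calibrate(raw, 2.0)
```

In this sketch, assigning to `raw.value` raises `FrozenInstanceError`, and calling `calibrate(raw, 2.0)` a second time yields a product equal to the first, including its provenance.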

1.2.1. Flexibility

Physics results in HEP are obtained by processing sequences of data and making statistical statements from them. Each element of a sequence generally contains the data corresponding to one readout of the detector. The sequence elements are often termed “events”, which are treated as statistically independent observations of physics processes. It is common for experiments to define larger aggregations of data by grouping events into subruns (or, for LHC experiments, luminosity blocks), and by further grouping subruns into runs. These larger aggregations are typically defined according to when certain detector calibrations or accelerator beam parameters were applied.

Although frameworks supporting the run-subrun-event (RSE) hierarchy have proved effective and flexible enough for collider-based experiments, the RSE hierarchy is not always appropriate:

  • simulated data often do not need to be processed with an RSE hierarchy; a flat hierarchy (e.g. only the “event”) is usually sufficient,

  • the framework interface is often explicitly couched in RSE terminology, making it difficult to apply to non-collider contexts, where a different data-grouping may be more appropriate (e.g. time slices for extended readout windows, each of which corresponds to one “event”),

  • calibration data are often described independently from an RSE hierarchy, requiring other means of accounting for systematic corrections that must be applied to the data.

Phlex does not prescribe an RSE hierarchy—it only requires that the hierarchy be representable as a directed acyclic graph (DAG) at run-time, with each grouping of data represented as a node in the graph, and the relationships between data-groupings represented as edges. This expression of the hierarchy greatly relaxes the constraints placed on experiments while still supporting the collider-based RSE hierarchy (see Section 3.2.2).

The hierarchy graph and its nodes (i.e. the data-groupings) are definable at run-time, thus allowing the specification of data organizations that are appropriate for the workflow [DUNE 22].
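As a rough illustration of a hierarchy expressed as a DAG at run-time, the following Python sketch uses the standard-library `graphlib` module (Python 3.9 or later). The dictionary layout, the grouping names, and the `validate_hierarchy` helper are assumptions made for this example and do not reflect how Phlex represents its hierarchy.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical run-time description: each data-grouping (node) maps to the
# set of groupings that contain it (its parents in the graph).
rse_hierarchy = {
    "run": set(),
    "subrun": {"run"},
    "event": {"subrun"},
    "calibration_interval": {"run"},  # a grouping outside the RSE chain
}

# A flat hierarchy, as is often sufficient for simulated data.
flat_hierarchy = {"event": set()}


def validate_hierarchy(graph):
    """Return the groupings ordered parents-first.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


rse_order = validate_hierarchy(rse_hierarchy)
flat_order = validate_hierarchy(flat_hierarchy)
```

A description containing a cycle, such as `{"a": {"b"}, "b": {"a"}}`, is rejected with `CycleError`, which is the kind of check the DAG requirement makes possible.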

The flexibility in defining data-groupings and how they relate to each other necessitates further flexibility:

  1. user-defined algorithms are not bound to statically-typed classes representing data-groupings—e.g. there is no direct dependency on a C++ “event” class, and

  2. a framework program must be “driven” by a user-provided entity that expresses the hierarchy graph desired by the user, not a hierarchy that is prescribed by the framework.

These concepts are discussed more fully in Chapter 3.

1.2.2. Portability

Phlex is intended to be used on a variety of computing systems to take advantage of the disparate computing resources available to each stakeholder [DUNE 8]. This means the framework:

  • must support data-processing by algorithms that execute on GPUs [DUNE 11], in addition to those that execute on CPUs,

  • must not, in general, rely on hardware characteristics unique to a particular platform [DUNE 63], and

  • must favor standardized programming-language features.

1.2.3. Usability

Although usability is not a formal stakeholder requirement, physicists expect various behaviors and features that ease one’s interaction with a data-processing framework. Phlex strives to meet this expectation in various ways:

minimizing boilerplate code

Some data-processing frameworks in HEP adopt an object-oriented design, where stateful framework-dependent objects are required to register inherently framework-agnostic algorithms with a framework program. Phlex does not generally require physics algorithms to depend on any framework libraries [DUNE 43]. This design, therefore, substantially reduces the amount of code required for the interface between physics algorithms and the framework itself (see Section 1.4).

failing early

To avoid needless computation, Phlex will fail as early as possible in the presence of an error. This means that, for C++ usage, compile-time failures will be favored over run-time exceptions.

meaningful error messages

When failures within the scope of the framework occur [2], the reported error messages will be as descriptive as possible. Messages will typically include diagnostic information about the data being processed when the error occurred as well as the algorithms that were executed on that data.

graceful shutdown

For run-time errors, the default behavior of Phlex is to end the framework program gracefully [DUNE 134]. A graceful shutdown is one in which the framework program completes the processing of all in-flight data, safely closes all open input and output files, cleans up connections to external entities (such as databases), and performs any similar cleanup before the program ends. This ensures that no resources are left in ill-defined states and that all output files are readable and valid.
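The idea of a graceful shutdown can be sketched in a few lines of serial Python. This is only an illustration under simplifying assumptions: the `run` function and `RecordingSink` class are hypothetical, there is no concurrency (so no data are in flight when the error occurs), and the sinks stand in for output files and external connections.

```python
from contextlib import ExitStack


class RecordingSink:
    """Hypothetical stand-in for an output file or database connection."""

    def __init__(self):
        self.lines = []
        self.closed = False

    def write(self, line):
        self.lines.append(line)

    def close(self):
        self.closed = True


def run(items, process, sinks):
    """Process items in order; on the first error, stop accepting new work.

    Every sink is closed when the function returns, whether or not an error
    occurred. Returns the successfully processed items and the error, if any.
    """
    processed = []
    error = None
    with ExitStack() as stack:
        for sink in sinks:
            stack.callback(sink.close)  # invoked when the with-block exits
        for item in items:
            try:
                result = process(item)
            except Exception as exc:
                error = exc             # remember the error, stop the loop
                break
            for sink in sinks:
                sink.write(str(result))
            processed.append(item)
    return processed, error


sink = RecordingSink()
processed, error = run([1, 2, 0, 4], lambda x: 10 // x, [sink])
```

Here the third item triggers a `ZeroDivisionError`; the results of the first two items have already been written, the fourth item is never started, and the sink is closed before `run` returns.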

1.3. Programming Languages

The framework will support user algorithms written in multiple programming languages [DUNE 14]. Specifically, an algorithm may be written in either C++ [3] [DUNE 81] or Python [4] [DUNE 82]. If there is a need to support user algorithms written in another programming language, a corresponding stakeholder requirement should be created.

Note that the language is left unspecified for the implementation of the framework itself.

1.4. Framework Independence

We define an algorithm as framework-independent if it contains no explicit dependencies on framework libraries—i.e. it is possible to build and execute the algorithm independent of a framework context. For framework-independent C++ algorithms, this means there are no direct or transitive framework libraries that are either included as headers in the algorithm code or linked as run-time libraries. Similarly, framework-independent Python algorithms import no direct or transitive framework packages.

Phlex is required to support the registration of user-defined, framework-independent algorithms [DUNE 43]. This does not mean that all framework-independent algorithms are suitable for registration, nor does it mean that all algorithms registered with the framework must be framework-independent. In fact, depending on what they do, some algorithms might require explicit framework dependencies.
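To make the distinction concrete, the following Python sketch separates a framework-independent algorithm from the code that registers it. The `Registry` class and its `register` and `execute` methods are hypothetical placeholders invented for this example; they are not the Phlex registration interface.

```python
# --- Physics code: imports nothing from any framework -----------------------
def momentum_magnitude(px, py, pz):
    """Return the magnitude of a three-momentum."""
    return (px * px + py * py + pz * pz) ** 0.5


# --- Hypothetical registration layer (not the Phlex API) --------------------
class Registry:
    """Maps algorithm names to plain callables and their product labels."""

    def __init__(self):
        self._algorithms = {}

    def register(self, name, func, inputs, output):
        self._algorithms[name] = (func, tuple(inputs), output)

    def execute(self, name, products):
        func, inputs, output = self._algorithms[name]
        result = func(*(products[label] for label in inputs))
        # Return a new mapping so that the input products stay untouched.
        return {**products, output: result}


registry = Registry()
registry.register("p_mag", momentum_magnitude, inputs=("px", "py", "pz"), output="p")

inputs = {"px": 3.0, "py": 4.0, "pz": 0.0}
outputs = registry.execute("p_mag", inputs)
```

Because `momentum_magnitude` has no knowledge of the registry, it can be built, tested, and called on its own, for example `momentum_magnitude(3.0, 4.0, 0.0)`, without any framework context.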

Footnotes

References