3. Conceptual Design

Purpose

The conceptual design is not a reference manual; it is a high-level description of how the framework aims to satisfy the stakeholder requirements (see Appendix B). The audience for the conceptual design is the physicist, algorithm author, or framework program runner. More detailed design aspects in support of the conceptual model are given in the subsystem design (under preparation).

Phlex adopts the data-flow approach discussed in Section 2.6.1. Instead of expressing scientific workflows as monolithic functions to be executed, workflows are factorized into composable algorithms that operate on data products passed among them [DUNE 1], [DUNE 111], [DUNE 20]. These algorithms then serve as operators to higher-order functions that operate on data-product sequences.
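
As a rough illustration of this factorization, the sketch below uses plain Python rather than Phlex's actual interface. The helper names `transform` and `fold`, the user algorithms `clamp` and `add`, and the sample values are all hypothetical stand-ins chosen for this example. It only shows the idea that a user algorithm is an ordinary function handed to a higher-order function, which applies it across a sequence of data products.

```python
from functools import reduce

# Hypothetical user algorithms: ordinary functions with no framework dependence.
def clamp(value, low=0.0, high=10.0):
    """Restrict a single value to the closed interval [low, high]."""
    return max(low, min(high, value))

def add(accumulated, value):
    """Combine a running total with the next value."""
    return accumulated + value

# Hypothetical higher-order functions: they take a user algorithm as the
# operator and apply it to a sequence of data products.
def transform(algorithm, sequence):
    """Apply the algorithm to each element, yielding a same-length sequence."""
    return [algorithm(element) for element in sequence]

def fold(algorithm, sequence, initial):
    """Reduce the whole sequence to one result, starting from `initial`."""
    return reduce(algorithm, sequence, initial)

samples = [-2.0, 3.5, 12.0]          # a stand-in data-product sequence
clamped = transform(clamp, samples)  # one output per input
total = fold(add, clamped, 0.0)      # one output for the whole sequence
```

In this sketch neither `clamp` nor `add` knows how many elements exist or in what order they are visited; that responsibility sits with the higher-order function, which is the separation the data-flow approach relies on.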

To guide the discussion of Phlex’s conceptual model, we refer to the graph in Fig. 3.1, which illustrates various framework aspects:

  • the data-flow graph itself (see Section 3.1)

  • data products and data-product sets as passed along graph edges (see Section 3.2)

  • user-provided algorithms such as transforms, folds, etc. (see Section 3.3 and Section 3.5)

  • the framework driver (see Section 3.6)

  • data-product providers (see Section 3.7), which are plugins that provide data products from external entities to downstream user algorithms (e.g. input from ROOT files)

  • data-product writers (see Section 3.8), which are plugins that may write data products to an external entity (e.g. output files)

  • resources (see Section 3.9)

  • program configuration (see Section 3.10)

digraph {
  node [shape="box", style="rounded"]
  edge [fontcolor="red"];

  start [shape="point", width=0.1]
  unfold [label=<unfold(<font color="blue">into_apas</font>)>]
  transform [label=<transform(<font color="blue">clamp</font>)>]
  fold [label=<fold(<font color="blue">sum_waveforms</font>)>]
  filter [label=<filter(<font color="blue">high_energy</font>)>];
  observer [label=<observe(<font color="blue">histogram_waveforms</font>)>];
  out [label="ROOT output file(s)", shape="cylinder", style="filled", fillcolor="lightgray"]

  {
    rank=same;
    resource [label=<Histogram<br/> resource>,
              shape=hexagon,
              style=filled,
              fillcolor=thistle,
              margin=0];
    root [label="ROOT analysis file", style=filled, shape=cylinder];
  }

  start -> driver [label=" Configuration", fontcolor="forestgreen"];

  {
    rank=same;
    gdml [label="GDML file", shape="cylinder", style="filled", fillcolor="lightgray"]
    driver [label="driver(Spill)", style="rounded,filled",fillcolor="palegreen1"];
    input [label="ROOT input file(s)", shape="cylinder", style="filled", fillcolor="lightgray"];
  }

  driver -> input [style="dotted", arrowhead=none];

  // Providers
  {
    rank=same;
    geometry_provider [label="provide(Geometry)", style="filled,rounded", fillcolor="lightblue"];
    sim_hits_provider [label="provide(SimHits)", style="filled,rounded", fillcolor="lightblue"];
  }

  driver -> geometry_provider [label=" [J]", fontcolor="darkorange"];
  driver -> sim_hits_provider [label=< [Spill<sub><i>j</i></sub>]>, fontcolor="darkorange"];

  geometry_provider -> gdml [style="dotted", arrowhead=none];
  resource -> root [style="dotted", arrowhead=none];

  sim_hits_provider -> input [style="dotted", arrowhead=none];
  sim_hits_provider -> unfold [label=< [SimHits<sub><i>j</i></sub>]>];
  geometry_provider -> unfold [label=< [Geometry]>];

  unfold:s -> transform [xlabel=< [Waveforms<sub><i>j k</i></sub>]>];
  transform:s -> fold [taillabel=<[ClampedWaveforms<sub><i>j k</i></sub>] >,
                       labelangle=-80,
                       labeldistance=7
                      ];

  // Writers
  {
    rank=same;
    waveforms_writer [label="write(Waveforms)", style="filled,rounded", fillcolor="lightblue"];
    summed_waveforms_writer [label="write(SummedWaveforms)", style="filled,rounded", fillcolor="lightblue"];
    clamped_waveforms_writer [label="write(ClampedWaveforms)", style="filled,rounded", fillcolor="lightblue"];
  }

  unfold:s -> waveforms_writer [label=<[Waveforms<sub><i>j k</i></sub>]>];
  transform:s -> clamped_waveforms_writer;
  fold:s -> summed_waveforms_writer [label=< [SummedWaveforms<sub><i>j</i></sub>]>];

  {waveforms_writer, clamped_waveforms_writer, summed_waveforms_writer} -> out [style="dotted", arrowhead=none]

  unfold:s -> filter [label=< [Waveforms<sub><i>j k</i></sub>]>];
  filter:s -> observer [label=< [Waveforms<sub><i>j k</i> '</sub>]>];
  resource -> observer [style="dashed"];

}

Fig. 3.1 A sample workflow showing the different types of algorithm supported by Phlex (see Section 2.5 for a list of the supported algorithms). Solid arrows show the flow of data through the graph. Dotted lines indicate communication of data through the IO system. The driver algorithm (see Section 3.6) is configured to process all spills in the specified ROOT input files. One provide algorithm is configured to read SimHits associated with spills from the ROOT input files, and the other reads a single Geometry object from the GDML file. For each spill, an unfold algorithm is configured to create a sequence of Waveforms objects, one for each APA. A transform algorithm is run on each of the Waveforms objects to create a ClampedWaveforms object. A fold algorithm is run over the ClampedWaveforms objects in a spill to create one SummedWaveforms object for that spill. The write algorithms are configured to write the Waveforms, ClampedWaveforms, and SummedWaveforms objects to one or more ROOT output files. Each Waveforms, ClampedWaveforms, and SummedWaveforms object is associated with the appropriate spill or APA. This workflow also shows a filter algorithm selecting only “high energy” Waveforms, and an observe algorithm creating a histogram from them, which is written to a ROOT analysis file. Note that in this workflow the names spill and APA are not special to the Phlex framework; they are names (hypothetically) chosen by the experiment.
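
The workflow in Fig. 3.1 can be mimicked in a few lines of plain Python. This is only a sketch under stated assumptions and does not use Phlex's interface: the data layout (SimHits as `(apa_index, amplitude)` pairs, Geometry as a dictionary with an `n_apas` entry), the clamp limit, the energy threshold, and the `run` function are all hypothetical. The algorithm names are taken from the figure, but their bodies are invented for illustration.

```python
from collections import Counter
from functools import reduce

def into_apas(sim_hits, geometry):
    """Unfold operator: split one spill's SimHits into one waveform per APA."""
    waveforms = [[] for _ in range(geometry["n_apas"])]
    for apa, amplitude in sim_hits:
        waveforms[apa].append(amplitude)
    return waveforms

def clamp(waveform, limit=5.0):
    """Transform operator: restrict every sample to [-limit, limit]."""
    return [max(-limit, min(limit, sample)) for sample in waveform]

def sum_waveforms(total, waveform):
    """Fold operator: accumulate the samples of one waveform into the total."""
    return total + sum(waveform)

def high_energy(waveform, threshold=6.0):
    """Filter predicate: keep waveforms whose largest sample exceeds threshold."""
    return max(waveform, default=0.0) > threshold

def histogram_waveforms(histogram, waveform):
    """Observe operator: fill the shared histogram; produces no data product."""
    histogram.update(int(sample) for sample in waveform)

def run(spills, geometry):
    """Hypothetical stand-in for the driver plus the graph of Fig. 3.1."""
    histogram = Counter()  # stand-in for the histogram resource
    summed = {}            # one SummedWaveforms value per spill index j
    for j, sim_hits in enumerate(spills):              # driver(Spill)
        waveforms = into_apas(sim_hits, geometry)      # unfold
        clamped = [clamp(w) for w in waveforms]        # transform
        summed[j] = reduce(sum_waveforms, clamped, 0.0)  # fold
        for w in filter(high_energy, waveforms):       # filter
            histogram_waveforms(histogram, w)          # observe
    return summed, histogram

geometry = {"n_apas": 2}
spills = [
    [(0, 1.0), (0, 7.0), (1, 2.0)],
    [(1, 9.0), (0, -8.0)],
]
summed, histogram = run(spills, geometry)
```

Unlike this sequential loop, the framework is free to schedule the nodes of the graph itself. The sketch is meant only to show which algorithm consumes which data product, and that the filter and observe branch works on the unclamped Waveforms, as the figure indicates.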