B. Framework requirements
Requirements norm: Baseline 1 (created March 03, 2025)
B.1. Conceptual requirements
The framework shall allow the execution of multiple algorithms. |
See Section 3.1
The framework shall mediate communication between algorithms via data products. |
The framework shall separate the persistent representation of data products from their in-memory representations as seen by algorithms. |
See Section 3.2.1.2
The framework shall run on widely-used scientific computing systems in order to fully utilize DUNE computing resources. |
See Section 1.2.2
The framework shall provide an API that allows users to express hardware requirements of the algorithms. |
See Section 3.4
The framework shall support running algorithms that require a GPU. |
See Section 1.2.2
The framework shall support the invocation of algorithms written in multiple programming languages. |
See Section 1.3
The framework shall support the invocation of algorithms written in C++. |
See Section 1.3
The framework shall support the invocation of algorithms written in Python. |
See Section 1.3
The framework shall provide user-accessible persistence of user-defined metadata. |
The framework shall provide the ability to read a framework-produced file as input to a subsequent framework job so that the physics data are equivalent to the physics data obtained from a single execution of the combined job. |
The framework shall present data produced by an already executed algorithm to each subsequent, requesting algorithm. |
The framework shall support the creation of data sets composed of data products derived from data originating from disparate input sources. |
See Section 3.6
The framework shall support flexibly defined, context-aware processing units to address the varying granularity necessary for processing different kinds of data. |
See Section 1.2.1
The framework shall provide the ability for user-level code to define data products. |
The framework shall provide the ability for user-level code to create new data sets. |
See Section 3.2, Section 3.2.2
The framework shall provide the ability for user-level code to define data families. |
See Section 3.2, Section 3.2.2
The framework shall provide the ability for user-level code to define hierarchies of data families. |
See Section 3.2, Section 3.2.2
The framework shall support processing of collections that are too large to fit into memory at one time. |
See Section 3.2.2
The framework shall allow the unfolding of data products into a sequence of finer-grained data products. |
See Section 3.5.5
The framework shall support access to external data sources. |
The framework shall support algorithms that provide data from calibration databases. |
See Section 3.2, Section 3.6
The framework shall support the registration of algorithms that are independent of the framework interface. |
The framework shall safely execute user algorithms declared to be non-thread-safe along with those declared to be thread-safe. |
The framework shall enable the specification of resources required by the program. |
See the subsystem design (under preparation)
The framework shall enable the specification of user-defined resources required by the program. |
The framework shall enable the specification of resources required by each algorithm. |
The framework shall permit algorithm authors to specify that the algorithm requires serial access to a thread-unsafe resource. |
The framework shall enable the specification of user-defined resources required by the algorithm. |
The framework shall dynamically schedule algorithms to execute efficiently according to the availability of each algorithm’s required resources. |
The framework shall optimize the memory management of data products. |
See Section 3.2.3
The framework shall support composable workflows that use GPU algorithms along with CPU algorithms. |
The framework shall support the specification of data products required as input by an algorithm. |
See Section 3.4
The framework shall support the specification of data products created as output by an algorithm. |
See Section 3.4
The framework shall accept exactly one configuration per program execution. |
See Section 3.10.1
The framework shall provide the ability to configure the execution of a framework program at runtime using a human-readable language. |
See Section 3.10.1
The framework shall provide a public API that enables the implementation of a concrete IO backend for a specific persistent storage format. |
The framework IO subsystem shall support backward compatibility across versions, subject to policy decisions on deprecation provided by DUNE. |
The framework shall allow a single invocation of an algorithm with data products from multiple data sets. |
The framework shall support the invocation of an algorithm with data products belonging to adjacent data sets. |
The framework shall support user code that defines adjacency of data sets within a data family. |
The framework shall allow a single invocation of an algorithm with data products from multiple data families. |
The framework shall support user specification of the data family in which to place the data products created by an algorithm. |
The data objects exchanged among algorithms shall be separable from those algorithms. |
See Section 3.2.1.1
The framework shall enable users to discover the provenance of data products. |
The framework shall record metadata to output enabling the reproduction of the processing steps used to produce the data recorded in that output. |
The framework shall support the reproduction of data products from the provenance stored in the output. |
See Section 3.2.4
The framework shall provide a facility to produce random numbers enabling algorithms to create reproducible data in concurrent contexts. |
The framework shall facilitate the development of thread-safe algorithms. |
See Section 2.4, Section 3.2.3
The framework shall support executing programs configured by composing configurations of separate components. |
See Section 3.10.1
The framework shall attempt a graceful shutdown by default. |
See Section 1.2.3
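The requirement above for reproducible random numbers in concurrent contexts can be illustrated with a minimal sketch. It assumes one common approach, not necessarily the one the framework adopts: each algorithm invocation derives its seed only from stable identifiers (a master seed, the algorithm name, and the data-set identifier), so the values drawn do not depend on thread scheduling. The names `derive_seed`, `smear`, the master seed 12345, and the `(run, spill)` identifiers are hypothetical and are not part of any framework API.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def derive_seed(master_seed: int, algorithm: str, data_set_id: tuple) -> int:
    """Derive a 64-bit seed from stable identifiers only (hypothetical scheme)."""
    key = f"{master_seed}|{algorithm}|{data_set_id}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def smear(data_set_id: tuple) -> float:
    """Hypothetical algorithm: draw one Gaussian number for a data set."""
    # A fresh generator per invocation avoids shared mutable state between threads.
    rng = random.Random(derive_seed(12345, "smear", data_set_id))
    return rng.gauss(0.0, 1.0)


# Hypothetical (run, spill) identifiers standing in for data sets.
data_sets = [(1, spill) for spill in range(8)]

# Serial and concurrent executions visit the data sets under different
# scheduling, yet each data set receives the same random number.
serial = [smear(ds) for ds in data_sets]
with ThreadPoolExecutor(max_workers=4) as pool:
    concurrent = list(pool.map(smear, data_sets))
```

Because each generator is private to one invocation, this sketch needs no locking. The trade-off is that the seeding scheme itself becomes part of what must be recorded to reproduce the output.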
B.2. Supporting requirements
The framework shall shut down if the platform fails to meet each specified hardware requirement. |
The framework shall emit a diagnostic message for each hardware requirement the platform fails to meet. |
The framework documentation shall provide instructions for writing framework-executable algorithms in supported languages. |
The framework shall support reading from disk only the data products required by a given algorithm. |
The framework shall support the reading of collections too large to hold in memory. |
The framework shall support the writing of collections too large to hold in memory. |
The framework shall provide the ability to compare two configurations. |
See Section 3.10.1
The framework shall record the job’s execution environment. |
The framework shall provide the list of recordable components of the execution environment. |
The framework shall save each execution-environment description selected by the user from the framework-provided list. |
The framework shall record user-selected items from the shell environment. |
The framework shall record labelled execution environment information provided by the user. |
The framework shall gracefully shut down if the program attempts to exceed a configured memory limit. |
The framework shall emit a diagnostic message if the program attempts to exceed the configured maximum memory. |
The framework shall have an option to record build information, including the source code version, associated with each algorithm. |
The framework shall allow algorithms to use the same parallelism mechanisms the framework uses to schedule the execution of algorithms. |
See the subsystem design (under preparation)
The framework shall enable the specification of the maximum number of CPU threads permitted by the program. |
The framework shall enable the specification of the maximum CPU memory allowed by the program. |
The framework shall enable the specification of GPU resources required by the program. |
The framework shall enable the specification of the maximum number of CPU threads permitted by the algorithm. |
See Section 3.4
The framework shall enable the specification of an algorithm’s expected CPU memory usage. |
See Section 3.2.1.2
The framework shall enable the specification of GPU resources required by the algorithm. |
The framework shall support algorithms that perform calculations using a remote GPU. |
The framework shall support algorithms that perform calculations using a local GPU. |
The framework shall support logging the usage of a specified resource for each algorithm using the resource. |
The framework shall have an option to provide elapsed time information for each algorithm executed in a framework program. |
See the subsystem design (under preparation)
The framework shall efficiently execute a graph of algorithms where at least one algorithm requires access to a network resource. |
The framework shall optimize the availability of external resources. |
The framework shall efficiently execute a graph of algorithms where at least one algorithm specifies a required amount of CPU memory. |
The framework shall efficiently execute a graph of algorithms where at least one algorithm specifies a required amount of GPU memory. |
The framework shall have an option to emit a description of the data flow of a configured program without executing the workflow. |
The framework shall have an option to emit a message stating the resources required by each algorithm of a configured program without executing the workflow. |
The framework shall be able to report the global memory use of the framework program at user-specified points in time. |
See the subsystem design (under preparation)
The framework shall support a logging solution that is usable in an algorithm without that algorithm explicitly relying on the framework. |
See the subsystem design (under preparation)
The framework shall validate an algorithm’s configuration against specifications provided at registration time. |
See Section 3.10.1
The framework shall have an option to emit an algorithm’s configuration schema in human-readable form. |
See Section 3.10.3
The framework shall validate the configuration of each algorithm before that algorithm processes data. |
See Section 3.10.1, Section 3.10.3
The framework ecosystem shall support a ROOT IO backend. |
See Section 3.2.1.1
The framework ecosystem shall support an HDF5 IO backend. |
See Section 3.2.1.1
The framework’s IO subsystem shall support backward compatibility of data products. |
The framework’s IO subsystem shall support backward compatibility of metadata. |
The framework IO subsystem shall allow user-configuration of compression settings for each concrete IO implementation. |
The framework shall support user-configurable rollover of output files. |
The framework shall have an option to roll over output files according to a configurable limit on the number of data sets in a user-specified data family. |
The framework shall have an option to roll over output files according to a configurable limit on output-file size. |
The framework shall have an option to roll over output files according to a configurable limit on the aggregated value of a user-derived quantity. |
The framework shall have an option to roll over output files according to a configurable limit on the time the file has been open. |
The framework ecosystem shall support processing ProtoDUNE single-phase raw data. |
The framework ecosystem shall support processing ProtoDUNE dual-phase raw data. |
The framework ecosystem shall support processing ProtoDUNE II horizontal-drift raw data. |
The framework ecosystem shall support processing ProtoDUNE II vertical-drift raw data. |
The framework shall provide an option to persist the configuration of each framework execution to the output of that execution. |
See Section 3.10.1
The framework shall operate independently of unique characteristics of existing hardware. |
The framework shall provide a command-line interface that allows the setting of configuration parameters. |
See Section 3.10.1
The framework shall support the use of local configuration changes with respect to a separate complete configuration to modify the execution of a program. |
See Section 3.10.1
The framework configuration system shall have an option to provide diagnostic information for an evaluated configuration, including origins of final parameter values. |
See Section 3.10.1
The language used for configuring a framework program shall include features for maintaining hierarchical configurations from a single point of maintenance. |
See Section 3.10.1
The framework shall record metadata identifying data sets where the framework took special measures to process data collections of unconstrained size. |
The framework build system shall support options that enable debugging executed code. |
The framework shall allow the per-execution setting of the floating-point environment to control the handling of IEEE floating-point exceptions. |
The framework shall by default attempt a graceful shutdown upon receiving an uncaught exception from user algorithms. |
The framework shall by default attempt a graceful shutdown when receiving a signal. |
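The four output-file rollover criteria listed above (number of data sets, file size, aggregated user-derived quantity, and time open) could be combined in a single policy check. The following is a minimal sketch under stated assumptions: `RolloverPolicy`, `OutputFileState`, and `should_roll_over` are hypothetical names chosen for illustration, a limit of `None` is assumed to disable a criterion, and reaching any enabled limit is assumed to trigger rollover. None of this is taken from the framework design.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RolloverPolicy:
    """Hypothetical configurable limits; None disables a criterion."""
    max_data_sets: Optional[int] = None       # data sets in the chosen data family
    max_bytes: Optional[int] = None           # output-file size
    max_aggregate: Optional[float] = None     # sum of a user-derived quantity
    max_open_seconds: Optional[float] = None  # time the file has been open


@dataclass
class OutputFileState:
    """Running totals for the currently open output file."""
    opened_at: float = field(default_factory=time.monotonic)
    data_sets: int = 0
    bytes_written: int = 0
    aggregate: float = 0.0

    def record(self, n_bytes: int, quantity: float = 0.0) -> None:
        """Account for one data set written to the file."""
        self.data_sets += 1
        self.bytes_written += n_bytes
        self.aggregate += quantity


def should_roll_over(policy: RolloverPolicy, state: OutputFileState,
                     now: Optional[float] = None) -> bool:
    """Return True when any enabled limit has been reached."""
    now = time.monotonic() if now is None else now
    checks = [
        (policy.max_data_sets, state.data_sets),
        (policy.max_bytes, state.bytes_written),
        (policy.max_aggregate, state.aggregate),
        (policy.max_open_seconds, now - state.opened_at),
    ]
    return any(limit is not None and value >= limit for limit, value in checks)
```

A writer would call `record` after each data set and consult `should_roll_over` before writing the next one; on `True` it would close the current file and start a fresh `OutputFileState`. Passing `now` explicitly keeps the time-based criterion testable without waiting.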