the base class for all drivers running on batch systems
#include <BatchDriver.h>
Inherits EL::Driver.
Inherited by EL::CondorDriver, EL::ExecDriver, EL::GEDriver, EL::KubernetesDriver, EL::LLDriver, EL::LocalDriver, EL::LSFDriver, EL::SlurmDriver, EL::SoGEDriver, and EL::TorqueDriver.
static void resubmit (const std::string &location, const std::string &option)
    resubmit all failed sub-jobs for the job in the given location
static bool retrieve (const std::string &location)
    retrieve all the output for the job in the given location
static bool wait (const std::string &location, unsigned time=60)
    retrieve all the output for the job in the given location and wait until it is finished completely.
static void updateLocation (const std::string &location)
    update the internal location of files, after moving the submission directory
static void mergedOutputSave (Detail::ManagerData &data)
    create and save a sample handler assuming we created all the merged files at the requested locations
static void diskOutputSave (Detail::ManagerData &data)
    make the output sample handler for the given job or stream from the information stored in the histogram files.
std::string shellInit
    description: these shell commands are run verbatim on each worker node before execution
static bool abortRetrieve
    this flag is set to true when the wait() function is running and a SIGINT is caught, meaning that control should be returned to the user as soon as possible.
static bool mergeHists (Detail::ManagerData &data)
    effects: merge the fetched histograms; returns: whether all histograms have been fetched; guarantee: strong; failures: out of memory II, i/o errors
This class implements extra functionality for packaging EventLoop jobs into sub-jobs and then retrieving the results.
Definition at line 23 of file BatchDriver.h.
◆ BatchDriver()
EL::BatchDriver::BatchDriver ()
standard default constructor
- Guarantee
- strong
- Failures
- low level errors I
◆ ClassDef()
◆ defaultReleaseSetup()
the code for setting up the release
- Guarantee
- strong
- Failures
- out of memory II
failed to read environment variables
◆ diskOutputSave()
make the output sample handler for the given job or stream from the information stored in the histogram files.
This is optional, but it is convenient for drivers that use (conventional) writers.
- Guarantee
- basic
- Failures
- out of memory II
i/o errors
◆ doManagerStep()
◆ makeScript()
create the run script to be used
- Guarantee
- basic, may create a partial script
- Failures
- out of memory II
i/o errors
◆ mergedOutputSave()
create and save a sample handler assuming we created all the merged files at the requested locations
This is optional, but it is convenient for drivers that want to keep their outputs locally.
- Guarantee
- basic
- Failures
- out of memory II
i/o errors
◆ mergeHists()
merge the fetched histograms
- Returns
- whether all histograms have been fetched
- Guarantee
- strong
- Failures
- out of memory II
i/o errors
◆ options() [1/2]
the list of options to jobs with this driver
- Guarantee
- no-fail
- Postcondition
- result != 0
◆ options() [2/2]
◆ resubmit()
static void EL::Driver::resubmit (const std::string &location, const std::string &option)
static, inherited
resubmit all failed sub-jobs for the job in the given location
- Parameters
- option: driver-specific option string selecting which jobs to resubmit (and how)
- Guarantee
- basic, may partially resubmit
- Failures
- out of memory III
job resubmission errors
job can't be read
job was made with different driver
◆ retrieve()
static bool EL::Driver::retrieve (const std::string &location)
static, inherited
retrieve all the output for the job in the given location
While job failures will cause this method to fail, you can typically retry it multiple times if you can use partial results.
- Returns
- whether the job completed successfully
- Guarantee
- basic, may partially retrieve
- Failures
- out of memory III
job failures
job can't be read
job was made with different driver
◆ submit()
std::string EL::Driver::submit (const Job &job, const std::string &location) const
inherited
submit the given job with the given output location and wait for it to finish
This is mostly for small jobs and backward compatibility. For longer jobs use submitOnly instead.
- Returns
- The actual location of the submit directory, if the job was configured to generate a unique directory.
- Guarantee
- basic, may partially submit
- Failures
- out of memory II
can't create directory at location
submission errors
◆ submitOnly()
std::string EL::Driver::submitOnly (const Job &job, const std::string &location) const
inherited
submit the given job with the given output location and return immediately
This method allows you to submit jobs to your local batch system, log out and at a later point log back in again.
- Returns
- The actual location of the submit directory, if the job was configured to generate a unique directory.
- Guarantee
- basic, may partially submit
- Failures
- out of memory II
can't create directory at location
submission errors
- Warning
- not all drivers support this; some will do all their work in the submit function.
- Warning
- you normally need to call wait() or retrieve() before you can use the output.
◆ testInvariant()
void EL::BatchDriver::testInvariant () const
test the invariant of this object
- Guarantee
- no-fail
◆ updateLocation()
static void EL::Driver::updateLocation (const std::string &location)
static, inherited
update the internal location of files, after moving the submission directory
- Guarantee
- basic, may update partially
- Failures
- out of memory II
- Warning
- only move the submission directory after all your jobs are finished, or the results will be unpredictable
◆ wait()
static bool EL::Driver::wait (const std::string &location, unsigned time = 60)
static, inherited
retrieve all the output for the job in the given location and wait until it is finished completely.
The output is polled once every time seconds, where time is the second argument (60 by default).
While job failures will cause this method to fail, you can typically retry it multiple times if you can use partial results.
Typically sleeping for 60 seconds is an appropriate interval, but if it doesn't work for you, you can change it here.
- Guarantee
- basic, may partially retrieve
- Failures
- out of memory III
job failures
job can't be read
job was made with different driver
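The polling behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the EventLoop implementation: pollUntilDone, onSigint and abortRequested are hypothetical names, with abortRequested playing the role of abortRetrieve and the callback standing in for one retrieval pass.

```cpp
#include <chrono>
#include <csignal>
#include <functional>
#include <thread>

// Hypothetical stand-in for the abortRetrieve flag; set from the SIGINT handler.
volatile std::sig_atomic_t abortRequested = 0;

extern "C" void onSigint(int) { abortRequested = 1; }

// Hypothetical helper: calls retrieveOnce() repeatedly, sleeping for `interval`
// between passes, until it returns true or an abort has been requested.
// Returns true only if the job was seen to complete.
bool pollUntilDone(const std::function<bool()>& retrieveOnce,
                   std::chrono::milliseconds interval)
{
  // Install the handler for the duration of the loop, then restore the old one.
  auto previous = std::signal(SIGINT, onSigint);
  bool done = false;
  while (!abortRequested) {
    done = retrieveOnce();
    if (done) break;
    std::this_thread::sleep_for(interval);
  }
  std::signal(SIGINT, previous);
  return done;
}
```

With a 60 second interval this mirrors the default described for wait(); the flag is checked before every pass, so Ctrl-C returns control after at most one more sleep.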
◆ abortRetrieve
bool EL::Driver::abortRetrieve
static, protected, inherited
this flag is set to true when the wait() function is running and a SIGINT is caught, meaning that control should be returned to the user as soon as possible.
drivers can use it to abort long running operations in doRetrieve before completion
Definition at line 212 of file Driver.h.
◆ m_options
members directly corresponding to accessors
Definition at line 233 of file Driver.h.
◆ shellInit
std::string EL::BatchDriver::shellInit
description: these shell commands are run verbatim on each worker node before execution
Definition at line 45 of file BatchDriver.h.
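To make "run verbatim before execution" concrete, here is a rough sketch of how such commands could end up at the top of a worker script. buildRunScript and the bash shebang are hypothetical and chosen only for illustration; this is not what makeScript() actually does, only one way the described behaviour could be realised.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not the real makeScript()): builds a bash script in
// which the shellInit text is copied unchanged ahead of the job command.
std::string buildRunScript(const std::string& shellInit, const std::string& command)
{
  std::ostringstream script;
  script << "#!/bin/bash\n";
  if (!shellInit.empty())
    script << shellInit << "\n";  // copied verbatim, no quoting or escaping
  script << command << "\n";
  return script.str();
}
```

Because the text is copied without escaping, anything placed in shellInit has to be valid shell on the worker node, for example an export statement or a setup script being sourced.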