ATLAS Offline Software
a Driver to submit jobs via prun
#include <PrunDriver.h>
static void status (const std::string &location)
static void setState (const std::string &location, const std::string &task, const std::string &state)
static void resubmit (const std::string &location, const std::string &option)
  resubmit all failed sub-jobs for the job in the given location
static bool retrieve (const std::string &location)
  retrieve all the output for the job in the given location
static bool wait (const std::string &location, unsigned time=60)
  retrieve all the output for the job in the given location and wait until it is finished completely.
static void updateLocation (const std::string &location)
  update the internal location of files, after moving the submission directory
static void mergedOutputSave (Detail::ManagerData &data)
  create and save a sample handler assuming we created all the merged files at the requested locations
static void diskOutputSave (Detail::ManagerData &data)
  make the output sample handler for the given job or stream from the information stored in the histogram files.
static bool abortRetrieve
  this flag is set to true when the wait() function is running and a SIGINT is caught, meaning that control should be returned to the user as soon as possible.
a Driver to submit jobs via prun
Definition at line 23 of file PrunDriver.h.
◆ PrunDriver()
EL::PrunDriver::PrunDriver ()
◆ ClassDef()
◆ diskOutputSave()
make the output sample handler for the given job or stream from the information stored in the histogram files.
This is optional, but it is convenient for drivers that use (conventional) writers
- Guarantee: basic
- Failures: out of memory II; i/o errors
◆ doManagerStep()
Definition at line 496 of file PrunDriver.cxx.
using namespace msgEventLoop;
...
const std::string jobELGDir = data.submitDir + "/elg";
const std::string runShFile = jobELGDir + "/runjob.sh";
...
const std::string mergeShFile = jobELGDir + "/elg_merge";
...
const std::string jobDefFile = jobELGDir + "/jobdef.root";
gSystem->Exec(Form("mkdir -p %s", jobELGDir.c_str()));
gSystem->Exec(Form("cp %s %s", runShOrig.c_str(), runShFile.c_str()));
gSystem->Exec(Form("chmod +x %s", runShFile.c_str()));
gSystem->Exec(Form("cp %s %s", mergeShOrig.c_str(), mergeShFile.c_str()));
gSystem->Exec(Form("chmod +x %s", mergeShFile.c_str()));
...
if (listToShipToGrid.size()){
...
"Creating symbolic links for additional files or directories to be sent to grid.\n"
"For root or heavy files you should also add their name (not the full path) to EL::Job::optUserFiles.\n"
"Otherwise prun ignores those files."
...
std::vector<std::string> vect_filesOrDirToShip;
...
boost::split(vect_filesOrDirToShip,listToShipToGrid,boost::is_any_of(","));
...
for (const std::string & fileOrDirToShip: vect_filesOrDirToShip){
ANA_MSG_INFO (("Creating symbolic link for: " +fileOrDirToShip).c_str());
...
std::string outputSampleName = meta.castString("nc_outputSampleName");
if (outputSampleName.empty()) {
outputSampleName = "user.%nickname%.%in:name%";
...
meta.setString("nc_outDS", formatOutputName(meta, outputSampleName));
...
meta.setString("nc_writeInputToTxt", "IN:input.txt");
...
const std::string execstr = "runjob.sh " + (*s)->name();
...
meta.setString("nc_framework", "EventLoopGrid");
...
saveJobDef(jobDefFile, *data.job, sh);
...
shOut.save(data.submitDir + "/output-" + out->label());
...
shHist.save(data.submitDir + "/output-hist");
...
sh.save(data.submitDir + "/input");
data.submitted = true;
...
return ::StatusCode::SUCCESS;
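The listing above splits the comma-separated list of additional files with boost::split and boost::is_any_of(","). As a minimal sketch of the same step using only the standard library, the helper below does an equivalent split. The name splitCommaList is hypothetical and is not part of PrunDriver; like boost::split without token compression, it keeps empty entries.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of PrunDriver): split a comma-separated
// list into its entries. Empty entries are kept, so "a,,b" yields three
// entries and an empty input yields one empty entry.
std::vector<std::string> splitCommaList(const std::string& list)
{
  std::vector<std::string> entries;
  std::string::size_type begin = 0;
  while (true) {
    const std::string::size_type end = list.find(',', begin);
    if (end == std::string::npos) {
      // No further comma: the rest of the string is the last entry.
      entries.push_back(list.substr(begin));
      break;
    }
    entries.push_back(list.substr(begin, end - begin));
    begin = end + 1;
  }
  return entries;
}
```

A caller would then loop over the returned vector, as the range-based for loop in the listing does for vect_filesOrDirToShip.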
◆ doRetrieve()
Definition at line 594 of file PrunDriver.cxx.
TmpCd tmpDir(data.submitDir);
...
processAllInState(sh, JobState::DOWNLOAD, nDlThreads);
...
std::cout << std::endl;
...
JobState::Enum state = sampleState(*s);
...
std::cout << (*s)->name() << "\t";
...
case JobState::DOWNLOAD:
...
std::cout << "\033[1;32m" << JobState::name[state] << "\033[0m\t";
...
case JobState::FAILED:
std::cout << "\033[1;31m" << JobState::name[state] << "\033[0m\t";
...
std::cout << details << std::endl;
...
std::cout << std::endl;
...
data.retrieved = true;
data.completed = allDone;
return ::StatusCode::SUCCESS;
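The status output in doRetrieve wraps the state name in ANSI escape sequences: "\033[1;32m" (bold green) and "\033[1;31m" (bold red), each closed by the reset sequence "\033[0m". The sketch below isolates that pattern. The State enum, its labels, and both function names are hypothetical stand-ins, since the real JobState definition is not shown on this page, and leaving other states uncolored is an assumption.

```cpp
#include <string>

// Hypothetical stand-in for the job states; not the real JobState enum.
enum class State { Running, Download, Failed };

// Plain text label for a state (labels are assumptions for this sketch).
std::string stateName(State state)
{
  switch (state) {
    case State::Running:  return "RUNNING";
    case State::Download: return "DOWNLOAD";
    case State::Failed:   return "FAILED";
  }
  return "UNKNOWN";
}

// Wrap the label in ANSI color codes: bold green for Download,
// bold red for Failed, no color for anything else.
std::string coloredState(State state)
{
  const std::string reset = "\033[0m";
  switch (state) {
    case State::Download: return "\033[1;32m" + stateName(state) + reset;
    case State::Failed:   return "\033[1;31m" + stateName(state) + reset;
    default:              return stateName(state);
  }
}
```

Writing the result of coloredState to a terminal that understands ANSI escapes shows the label in color; other output sinks will show the raw escape characters.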
◆ mergedOutputSave()
create and save a sample handler assuming we created all the merged files at the requested locations
This is optional, but it is convenient for drivers that want to keep their outputs locally.
- Guarantee: basic
- Failures: out of memory II; i/o errors
◆ options() [1/2]
the list of options to jobs with this driver
- Guarantee: no-fail
- Postcondition: result != 0
◆ options() [2/2]
◆ resubmit()
static void EL::Driver::resubmit (const std::string & location, const std::string & option)
static, inherited
resubmit all failed sub-jobs for the job in the given location
- Parameter option: driver-specific option string selecting which jobs to resubmit (and how)
- Guarantee: basic, may partially resubmit
- Failures: out of memory III; job resubmission errors; job can't be read; job was made with different driver
◆ retrieve()
static bool EL::Driver::retrieve (const std::string & location)
static, inherited
retrieve all the output for the job in the given location
While job failures will cause this method to fail, you can typically retry it multiple times if you can use partial results.
- Returns: whether the job completed successfully
- Guarantee: basic, may partially retrieve
- Failures: out of memory III; job failures; job can't be read; job was made with different driver
◆ setState()
void EL::PrunDriver::setState (const std::string & location, const std::string & task, const std::string & state)
static
Definition at line 667 of file PrunDriver.cxx.
TmpCd tmpDir(location);
...
if (not sh.get(task)) {
std::cout << "Unknown task: " << task << std::endl;
std::cout << "Choose one of: " << std::endl;
...
sh.get(task)->meta()->setString("nc_ELG_state", state);
◆ status()
void EL::PrunDriver::status (const std::string & location)
static
Definition at line 649 of file PrunDriver.cxx.
TmpCd tmpDir(location);
...
JobState::Enum state = sampleState(*s);
...
<< "\t" << details << std::endl;
◆ submit()
std::string EL::Driver::submit (const Job & job, const std::string & location) const
inherited
submit the given job with the given output location and wait for it to finish
This is mostly for small jobs and backward compatibility. For longer jobs use submitOnly instead.
- Returns: the actual location of the submit directory, if the job was configured to generate a unique directory
- Guarantee: basic, may partially submit
- Failures: out of memory II; can't create directory at location; submission errors
◆ submitOnly()
std::string EL::Driver::submitOnly (const Job & job, const std::string & location) const
inherited
submit the given job with the given output location and return immediately
This method allows you to submit jobs to your local batch system, log out and at a later point log back in again.
- Returns: the actual location of the submit directory, if the job was configured to generate a unique directory
- Guarantee: basic, may partially submit
- Failures: out of memory II; can't create directory at location; submission errors
- Warning: not all drivers support this; some will do all their work in the submit function.
- Warning: you normally need to call wait() or retrieve() before you can use the output.
◆ testInvariant()
void EL::PrunDriver::testInvariant () const
◆ updateLocation()
static void EL::Driver::updateLocation (const std::string & location)
static, inherited
update the internal location of files, after moving the submission directory
- Guarantee: basic, may update partially
- Failures: out of memory II
- Warning: only move the submission directory after all your jobs are finished, or the results will be unpredictable.
◆ wait()
static bool EL::Driver::wait (const std::string & location, unsigned time = 60)
static, inherited
retrieve all the output for the job in the given location and wait until it is finished completely.
The output is polled every time seconds, where time is the second argument (default 60).
While job failures will cause this method to fail, you can typically retry it multiple times if you can use partial results.
Typically sleeping for 60 seconds is an appropriate interval, but if it doesn't work for you, you can change it here.
- Guarantee: basic, may partially retrieve
- Failures: out of memory III; job failures; job can't be read; job was made with different driver
◆ abortRetrieve
bool EL::Driver::abortRetrieve
static, protected, inherited
this flag is set to true when the wait() function is running and a SIGINT is caught, meaning that control should be returned to the user as soon as possible.
drivers can use it to abort long running operations in doRetrieve before completion
Definition at line 212 of file Driver.h.
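The interplay between wait() and abortRetrieve described above (poll at a fixed interval, return early once SIGINT has been caught) can be sketched in standard C++. This is an illustration of the pattern under stated assumptions, not the EventLoop implementation: the sketch namespace, abortRequested, onSigint and waitUntilDone are all hypothetical names, and the interval is given in milliseconds here rather than seconds.

```cpp
#include <chrono>
#include <csignal>
#include <functional>
#include <thread>

namespace sketch {

// Hypothetical stand-in for the abort flag: set by the signal handler,
// read by the polling loop.
volatile std::sig_atomic_t abortRequested = 0;

// Signal handler: only records that SIGINT was caught.
void onSigint(int) { abortRequested = 1; }

// Call poll() repeatedly, sleeping for interval between calls, until it
// reports completion or SIGINT has been caught. Returns whether the
// last poll reported completion.
bool waitUntilDone(const std::function<bool()>& poll,
                   std::chrono::milliseconds interval)
{
  abortRequested = 0;
  // Install the handler for the duration of the wait only.
  auto previous = std::signal(SIGINT, onSigint);
  bool done = false;
  while (!abortRequested) {
    done = poll();
    if (done) {
      break;
    }
    std::this_thread::sleep_for(interval);
  }
  // Restore whatever handler was active before.
  std::signal(SIGINT, previous);
  return done;
}

} // namespace sketch
```

In this sketch the flag is only checked between polls. A long-running poll function would have to check the flag itself to return promptly, which is the role the text above gives to doRetrieve.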
◆ m_options
members directly corresponding to accessors
Definition at line 233 of file Driver.h.
The documentation for this class was generated from the following files: