Running Athena Full Chain in Release 14
This tutorial is an extension of the Regular Computing Tutorial held at CERN:
https://twiki.cern.ch/twiki/bin/view/Atlas/RegularComputingTutorial
Its main purpose is to provide scripts for running the Athena Full Chain on LXBATCH and the GRID. If you have some experience with writing scripts for Athena, you can safely skip this TWiki and download the files directly; they are meant to be well commented and understandable.
The scripts have been successfully tested on the current release 14.2.10.
About Notation
Notation used throughout the TWiki:
| Symbol | Meaning |
| > something | type something into the shell |
| script.sh | font used for file names and code |
| Terminus Technicus | any technical term |
| <SOMETHING> | substitute your instance of something |
| IMPORTANT | anything important within the text |
Idea Structure
| ASSUMPTIONS | everything that needs to be done before going to the procedures |
| PROCEDURES | what to do to obtain the result |
| NOTES | what to know to avoid being killed by Athena |
File Name Structure
In order to keep the produced files organised, all scripts here generate unique file names that depend on the run parameters. In general the file names look as follows:
| STEP | FILE NAME | FILE TYPE |
| Generation | <JOB OPTION>.<EVENTS>.<ID>.pool.root | generated pool |
| Simulation | <JOB OPTION>.<SKIP>-<SKIP + EVENTS>of<GENERATION TOTAL>.<ID>.sim.pool.root | simulated pool (SIM) |
| Digitization | <JOB OPTION>.<SKIP>-<SKIP + EVENTS>of<GENERATION TOTAL>.<ID>.rdo.pool.root | Raw Data Object (RDO) |
| Reconstruction | <JOB OPTION>.<SKIP>-<SKIP + EVENTS>of<DIGITIZATION TOTAL>.<ID>.esd.pool.root | Event Summary Data (ESD) |
| Reconstruction | <JOB OPTION>.<SKIP>-<SKIP + EVENTS>of<DIGITIZATION TOTAL>.<ID>.aod.pool.root | Analysis Object Data (AOD) |
| Reconstruction | <JOB OPTION>.<SKIP>-<SKIP + EVENTS>of<DIGITIZATION TOTAL>.<ID>.tag.pool.root | Tagged Data (TAG) |
| Reconstruction | <JOB OPTION>.<SKIP>-<SKIP + EVENTS>of<DIGITIZATION TOTAL>.<ID>.ntuple.root | Combined ntuple (NTUPLE) |
| Reconstruction | <JOB OPTION>.<SKIP>-<SKIP + EVENTS>of<DIGITIZATION TOTAL>.<ID>.JiveXML.tar | JiveXML (XML) |
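For example, a generation output produced with the MC8.105144.PythiaZee.py Job Options, 5000 events and a run identifier of your own choosing (here the hypothetical Batch01) would be named MC8.PythiaZee.5000.Batch01.pool.root, and the RDO file from simulating the first 50 of those events would be named MC8.PythiaZee.0-50of5000.Batch01.rdo.pool.root.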
1 Configuration
The following is a procedure for setting up your lxplus account without having to read much. The official guide on how to set up your environment can be found here:
https://twiki.cern.ch/twiki/bin/view/Atlas/WorkBookSetAccount
1.1 Setting up the CMT Environment
1. Log in to lxplus.cern.ch using your credentials. (http://www-hep2.fzu.cz/twiki/bin/view/ATLAS/AthenaRelated#LXplus_Login)
2. Prepare the necessary directories. Create your $CMT_HOME directory for your configuration management (see http://www.cmtsite.org):
> cd $HOME
> mkdir cmt-fullchain
Create your $TestArea, where your packages will be installed:
> mkdir testarea
> mkdir testarea/FullChain
3. Create the requirements file for CMT. You can do it in the console like this:
> cd cmt-fullchain
> touch requirements
> mcedit requirements
or use pico or whichever editor you prefer (you can also use any of your preferred editors over sFTP, see: http://www-hep2.fzu.cz/twiki/bin/view/ATLAS/WindowsRelated) and copy this into the requirements file:
#---- CMT HOME REQUIREMENTS FILE ---------------------------------
set CMTSITE CERN
set SITEROOT /afs/cern.ch
macro ATLAS_DIST_AREA ${SITEROOT}/atlas/software/dist
macro ATLAS_TEST_AREA /afs/cern.ch/user/m/mzeman/testarea/FullChain
apply_tag oneTest # use ATLAS working directory
apply_tag setup # use working directory
apply_tag 32 # use 32-bit
use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)
#----------------------------------------------------------------
4. Set up the CMT environment in your $CMT_HOME. The v1r20p20070208 version (an iteration of the v1r20 release) is tested and works well:
> source /afs/cern.ch/sw/contrib/CMT/v1r20p20070208/mgr/setup.sh
5. Configure CMT for your AFS account (you do this only once for each new CMT_HOME you are using):
> cmt config
6. Logout.
7. Log in again. Enter your CMT_HOME directory and source the Athena setup with the release specification (14.2.10):
> cd cmt-fullchain
> source setup.sh -tag=14.2.10
Congratulations, your lxplus environment is now ready.
1.2 What to do every time you log on?
Go to your CMT_HOME directory:
> cd cmt-fullchain
Load the Athena environment:
> source setup.sh -tag=14.2.10
1.2.1 Employing Startup Script
You can simplify this process by employing a startup script with functions and environment variables, see http://www-hep2.fzu.cz/twiki/bin/view/ATLAS/AthenaRelated#LXplus_Login. Here is an example of a startup script with functions used for the Full Chain:
#---- LXPLUS STARTUP SCRIPT ---------------------------------
# Load Athena Function
function Load/Athena {
source ${CMT_HOME}/setup.sh -tag=$*
echo "Athena" $* "Loaded"
shift
}
# Full Chain Setup
function FullChain {
echo "Loading Full Chain Environment"
# Specify CMT home directory
export CMT_HOME=${HOME}/cmt-fullchain
echo "Your CMT home directory:" $CMT_HOME
# Use function Load/Athena
Load/Athena 14.2.10
# Load Environment Variables
# LOCAL
export FULL_CHAIN=${HOME}/testarea/FullChain/
export SCRATCH=${HOME}/scratch0/
# CASTOR
export CASTOR_GENERATION=${CASTOR_HOME}/fullchain/generation/
export CASTOR_SIMULATION=${CASTOR_HOME}/fullchain/simulation/
export CASTOR_DIGITIZATION=${CASTOR_HOME}/fullchain/digitization/
export CASTOR_RECONSTRUCTION=${CASTOR_HOME}/fullchain/reconstruction/
export CASTOR_TEMP=${CASTOR_HOME}/fullchain/temp/
export CASTOR_LOG=${CASTOR_HOME}/fullchain/log/
echo "Environment Ready"
}
#----------------------------------------------------------------
!!! Please note that $SCRATCH and all environment variables beginning with $CASTOR_ are necessary for the scripts to run correctly.
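Assuming the script above is placed in your login script (e.g. ~/.bashrc on lxplus), a single command then prepares the whole environment after you log in:
> FullChain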
1.3 Preparing CASTOR
If you have not already done so, create the directories on CASTOR (CERN Advanced STORage manager for large amounts of data: http://castor.web.cern.ch/castor/). It is necessary for handling large files, since your AFS space quota is tight, and it is used by the scripts here:
> rfmkdir ${CASTOR_HOME}/fullchain
> rfmkdir ${CASTOR_HOME}/fullchain/generation
> rfmkdir ${CASTOR_HOME}/fullchain/simulation
> rfmkdir ${CASTOR_HOME}/fullchain/digitization
> rfmkdir ${CASTOR_HOME}/fullchain/reconstruction
> rfmkdir ${CASTOR_HOME}/fullchain/temp
> rfmkdir ${CASTOR_HOME}/fullchain/log
1.4 Running Full Chain on LXPLUS
Follow the tutorial here:
https://twiki.cern.ch/twiki/bin/view/Atlas/RegularComputingTutorial#Setting_up_job_transformation
Disadvantages:
- lxplus will kill all your runs after 40 minutes, therefore you can simulate and reconstruct only a few events.
- Your session gets busy, so you lose one terminal.
- Logging off lxplus, losing the connection or turning off your computer causes the run to stop.
In conclusion, using lxplus on a larger scale is impossible. The solution is to use LXBATCH or the GRID.
1.5 Using LXBATCH
LXBATCH submission means that the job you want to run is processed on a different computer from the one you are logged in to. There are some things to know about LXBATCH:
- You are starting up clean, so there is:
  - NO startup script (no functions and environment variables of your own),
  - NO CMT configured,
  - NO Athena version loaded.
- Your home directory /afs/cern.ch/user/<LETTER>/<NAME>/ is accessible as if you were logged in (not just the public folder).
- All other ATLAS repositories are visible normally (/afs/cern.ch/atlas/software/... and other users).
- CASTOR is visible normally.
Submitting jobs to LXBATCH is done with the following command:
> bsub -q <QUEUE>[ -R "type==<OS>&&swp><# megabytes>&&pool><# megabytes>" ] <SCRIPT NAME> [ parameters ]
You can check out which queues are available using:
> bqueues
Listing the jobs currently running:
> bjobs [ -l ] [ -w ] [ jobID ]
Killing a job, assuming you know its ID (use bjobs to find out):
> bkill <jobID>
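As a concrete example (with a hypothetical script name MyJob.sh), submitting a job to the 8 hour queue 8nh on an SLC4 machine with the resource requirements used later in this tutorial would look like this:
> bsub -q 8nh -R "type==SLC4&&swp>4000&&pool>2000" MyJob.sh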
1.6 Using GANGA
First run:
> ganga -g
which creates the configuration file .gangarc. In it, set the gangadir location, for example:
gangadir = /afs/cern.ch/user/<LETTER>/<NAME>/<PLACE>/gangadir
2 Generation
More information can be found here:
https://twiki.cern.ch/twiki/bin/view/Atlas/WorkBookGeneration
2.1 Running Generation
Generation is run just like any Athena job, using the athena.py script on a Job Options file. It is also good to print the output on the screen and store it in a file at the same time using | tee:
> athena.py <JOB OPTIONS> | tee Generation.Output.txt
You will need to obtain these two files:
- PDGTABLE.MeV
- jobOptions.pythia.py
2.1.1 How to get Particle Data Group Table
The PDG Table is a database of particles and their masses, charges and code names used throughout Athena. You can get it like this:
> get_files -jo PDGTABLE.MeV
2.1.2 How to get Job Option files:
You can choose from a variety of files available on
http://reserve02.usatlas.bnl.gov/lxr/source/atlas/Generators/EvgenJobOptions/share/:
- MC8.105145.PythiaZmumu.py for the Z->mu+,mu- decay,
- MC8.105144.PythiaZee.py for the Z->e+,e- decay,
- and many others.
Use the following command to get the Job Options you want; we are going to use the Z->e+,e- decay:
> get_files -jo MC8.105144.PythiaZee.py
2.1.3 How to change minimum number of events:
The default value in MC8.105144.PythiaZee.py is 5000 events, therefore if you choose your <MAX EVENTS> below 5000, you will get into problems and the generation will crash. What you need to do is edit the JobOptions .py file (e.g. MC8.105144.PythiaZee.py) and add this line to the end:
evgenConfig.minevents = 100 # default is 5000
On LXBATCH we can of course leave the default 5000. However, since we use get_files -jo on LXBATCH to obtain the JobOptions file, the unmodified version from the central repositories is used, not the modified version from your AFS account.
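If you do want your modified Job Options on LXBATCH, one possibility (a sketch, assuming you keep the modified file under your AFS home, here a hypothetical ${HOME}/joboptions directory) is to replace the get_files -jo call in the batch script with a plain copy from AFS:
# copy the locally modified Job Options instead of fetching the central version
cp ${HOME}/joboptions/MC8.105144.PythiaZee.py .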
2.1.4 Running Generation using Job Transformation
You can run generation using the Pythia Job Transformation by issuing:
> csc_evgen08_trf.py <RUN no.> <FIRST EVENT> <MAX EVENTS> <RANDOM SEED> ./<JOB OPTIONS .py> <OUTPUT run.pool.root>
Why would we want to do this? Because the Job Transformation does not require any CMT setup or requirements file. You only need to source these global repository CMT setups and then you can run the job:
. /afs/cern.ch/atlas/software/releases/14.2.10/cmtsite/setup.sh -tag=14.2.10
export CMTPATH=/afs/cern.ch/atlas/software/releases/14.2.10/AtlasProduction/14.2.10
. /afs/cern.ch/atlas/software/releases/14.2.10/AtlasProduction/14.2.10/AtlasProductionRunTime/cmt/setup.sh
2.2 Generation on LXPLUS
In our case, we obtained and modified MC8.105144.PythiaZee.py, and since we are running locally we want to generate only about 110 events of the Z->e+,e- decay. Make sure you have changed evgenConfig.minevents to 100, otherwise the generation will crash with Too Few Events Requested.
> csc_evgen08_trf.py 105144 1 110 1324354657 ./MC8.105144.PythiaZee.py MC8.PythiaZee.110.Local.pool.root
2.3 Generation on LXBATCH
If we want to generate more events, for instance the default minimum of 5000 events, we need to run the generation on LXBATCH. To do this you need the following scripts: Generation.JobTransformation.sh and Batch.Generation.sh.
If you did everything just as in this tutorial (including all directory and file names), you can run them without modifying anything.
What you need to do ONLY ONCE is to make the scripts executable:
> chmod +x Generation.JobTransformation.sh
> chmod +x Batch.Generation.sh
Now all you need to do is run the Batch.Generation.sh script, which submits your job to LXBATCH. The script has three parameters you HAVE TO specify. To run the job, issue the following from the directory where you put BOTH of the scripts:
> ./Batch.Generation.sh <JOBOPTIONS> <EVENTS> <ID>
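For example (with a hypothetical run identifier Batch01), generating the default 5000 Z->e+,e- events would be submitted as:
> ./Batch.Generation.sh MC8.105144.PythiaZee.py 5000 Batch01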
A few notes:
- You can run the submitter Batch.Generation.sh from any folder (public, private - it does not matter).
- Make sure both scripts are executable before panicking.
- Make sure all directories and files specified in the environment variables exist!!! (If you followed this tutorial EXACTLY, everything should be working.)
- The script creates your environment on the LXBATCH machine and runs the generation setup of your choosing.
2.3.1 Making generation script for LXBATCH step by step:
The following describes how the code of the Generation.JobTransformation.sh script was written.
1. First we specify the environment variables, so the script works everywhere; if something needs to be changed, it needs to be changed in only one place, which can easily be found at the beginning of the file:
### ENVIRONMENT SETUP
## LOCAL (your AFS environment)
# export HOME=/afs/cern.ch/user/m/mzeman # uncomment and change if missing
# export CASTOR_HOME=/castor/cern.ch/user/m/mzeman # uncomment and change if missing
export CMT_HOME=${HOME}/cmt-fullchain
export FULL_CHAIN=${HOME}/testarea/FullChain/
# CASTOR (your CASTOR environment)
export CASTOR_GENERATION=${CASTOR_HOME}/fullchain/generation
export CASTOR_SIMULATION=${CASTOR_HOME}/fullchain/simulation
export CASTOR_DIGITIZATION=${CASTOR_HOME}/fullchain/digitization
export CASTOR_RECONSTRUCTION=${CASTOR_HOME}/fullchain/reconstruction
export CASTOR_TEMP=${CASTOR_HOME}/fullchain/temp
export CASTOR_LOG=${CASTOR_HOME}/fullchain/log
Make sure ALL these paths are in accord with your actual directories. If that is not the case, you will undoubtedly FAIL.
2. Secondly, we need to process the input parameters coming from the Batch.Generation.sh script:
### INPUT PARAMETERS
export JOBOPTIONS=$1 # which file to run
export EVENTS=$2 # number of events to process (int)
export ID=$3 # unique run identificator of your choice (string)
# Split the JobOptions file name at the dots
PARSE=(`echo ${JOBOPTIONS} | tr '.' ' '`)
OUTPUT=${PARSE[0]}.${PARSE[2]} # name used for the CASTOR output, for easy orientation
RUN=${PARSE[1]} # the generation Job Transformation requires the RUN number, parsed from the JobOptions file name
## Remove all the parameters from $1, $2 and $3, otherwise "source setup.sh ..." would pick them up and probably fail
while [ $# -gt 0 ] ; do shift ; done
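To illustrate what the parsing does, for the JobOptions file used in this tutorial the variables would come out as follows (a worked example, not part of the script):
# JOBOPTIONS = MC8.105144.PythiaZee.py
# PARSE      = (MC8 105144 PythiaZee py)   # file name split at the dots
# OUTPUT     = MC8.PythiaZee               # PARSE[0].PARSE[2]
# RUN        = 105144                      # PARSE[1]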
3. Now we need to set up the workspace and CMT completely anew, since we are running on a remote machine:
### ISSUE CODE
## CREATE TEMPORARY WORKSPACE
echo "###################################################################################################"
echo "CREATING WORKSPACE"
mkdir Generation.${OUTPUT}.${EVENTS}.${ID}
cd Generation.${OUTPUT}.${EVENTS}.${ID}
# Set up the Athena environment
# Experience with newer Athena releases shows that the first setup.sh
# has to be sourced from its own directory
echo "###################################################################################################"
echo "SETTING UP CMT HOME"
export CURRENTDIR=`pwd` # remember the current directory
cd ${CMT_HOME}
echo "Your CMT home directory:" ${CMT_HOME}
source ${CMT_HOME}/setup.sh -tag=14.2.10,32
. /afs/cern.ch/atlas/software/releases/14.2.10/cmtsite/setup.sh -tag=14.2.10
export CMTPATH=/afs/cern.ch/atlas/software/releases/14.2.10/AtlasProduction/14.2.10
. /afs/cern.ch/atlas/software/releases/14.2.10/AtlasProduction/14.2.10/AtlasProductionRunTime/cmt/setup.sh
4. Now we can finally leave the CMT be and go back to our working directory, where we can
RUN THE JOB:
# Go back to working directory
echo ${CURRENTDIR}
cd ${CURRENTDIR}
# Download the Job Options for the current run:
get_files -jo ${JOBOPTIONS}
# Run the Job
echo "###################################################################################################"
echo "RUNNING GENERATION"
csc_evgen08_trf.py ${RUN} 1 ${EVENTS} 1324354657 ./${JOBOPTIONS} ${OUTPUT}.${EVENTS}.${ID}.pool.root
5. Finally we need to copy our generation results from LXBATCH; the most convenient way is to put them on CASTOR using the rfcp command.
# Copy out the results if they exist
echo "###################################################################################################"
echo "COPYING GENERATION OUTPUT"
if [ -e ${OUTPUT}.${EVENTS}.${ID}.pool.root ] ; then
rfcp ${OUTPUT}.${EVENTS}.${ID}.pool.root ${CASTOR_GENERATION}/${OUTPUT}.${EVENTS}.${ID}.pool.root
fi
# List content of the working directory for debugging purposes
ls -lRt
# Clean workspace before exit
cd ..
rm -fR Generation.${OUTPUT}.${EVENTS}.${ID}
2.3.2 LXBATCH Generation Submitter
The script we have just made needs to be run on an LXBATCH machine, which is done with the bsub command. For this reason, we create a batch submitter:
1. First get the parameters into variables:
export JOBOPTIONS=$1 # which file to run (string)
export EVENTS=$2 # number of events to process (int)
export ID=$3 # unique run identificator of your choice (string)
2. We want to parse some of the variables, so that we can adhere to the file name notation (the symbol . is used as a separator):
PARSE=(`echo ${JOBOPTIONS} | tr '.' ' '`)
OUTPUT=${PARSE[0]}.${PARSE[2]} # name of the CASTOR output for easy orientation
3. To ensure that the parameters do not get picked up twice, we magically remove them:
while [ $# -gt 0 ] ; do shift ; done
4. And finally submit:
bsub -R "type==SLC4&&swp>4000&&pool>2000" -q 8nh -o ${SCRATCH}/Generation.${OUTPUT}.${EVENTS}.${ID}.Screen.txt Generation.JobTransformation.sh ${JOBOPTIONS} ${EVENTS} ${ID}
2.4 Generation on the GRID
3 Simulation
3.1 Running Simulation
Simulation is run just like any Athena job, using the athena.py script on a Job Options file:
> athena.py <JOB OPTIONS> | tee Simulation.Output.txt
Screen output is saved into the
Simulation.Output.txt
file. Make sure you have enough disk space.
3.1.1 Running the Simulation Job Transformation
You can run simulation together with digitization using Geant4 by running the csc_simul_trf.py script (accessible after sourcing Athena). Type the help command
> csc_simul_trf.py -h
to get information about the script parameters. The most important ones are: the generation input file, the type of geometry, and the SIM and RDO output file names.
3.2 Simulation on LXPLUS
Assumptions: Your generation output file (if you followed the tutorial, its name should be MC8.PythiaZee.110.Local.pool.root) is in the same directory in which you are trying to run the Job Transformation.
Procedures: Running simulation on LXPLUS with the Job Transformation is very easy, however it takes incredibly long:
> csc_simul_trf.py MC8.PythiaZee.110.Local.pool.root MC8.PythiaZee.0-1of110.Local.hits.pool.root MC8.PythiaZee.0-1of110.Local.rdo.pool.root 1 0 1324354656 ATLAS-CSC-02-01-00 100 1000
Notes:
Since simulating more than 10 events on LXPLUS is problematic, we need to use LXBATCH.
3.3 Simulation on LXBATCH
To run simulation on LXBATCH, you need the following scripts: Simulation.JobTransformation.sh and Batch.Simulation.sh. If you did everything just as in this tutorial (including all directory and file names), you can run them without modifying anything.
What you need to do ONLY ONCE is to make the scripts executable:
> chmod +x Simulation.JobTransformation.sh
> chmod +x Batch.Simulation.sh
Now all you need to do is run the Batch.Simulation.sh script, which submits your job to LXBATCH. The script has four parameters you HAVE TO specify. To run the job, issue the following from the directory where you put BOTH of the scripts (a concrete example follows the parameter list below):
> ./Batch.Simulation.sh <GENERATION POOL ROOT> <EVENTS> <SKIP> <ID>
- GENERATION POOL ROOT is the file we obtained from generation. It should be in your $CASTOR_HOME/fullchain/generation folder. All you need to do is COPY/PASTE its name; the script downloads and accesses it automatically (string).
- EVENTS is the number of events you want to simulate (int).
- SKIP is the number of events you want to skip (int); for example, to simulate events 2000 to 2200 of your generation file you would set SKIP to 2000 and EVENTS to 200.
- ID is an identifier of your choosing (string).
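For example, simulating the first 50 events of the generation file produced earlier (hypothetical file and identifier names) would be submitted as:
> ./Batch.Simulation.sh MC8.PythiaZee.5000.Batch01.pool.root 50 0 Batch01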
A few notes:
- You can run the submitter Batch.Simulation.sh from any folder (public, private - it does not matter).
- Make sure both scripts are executable before panicking.
- Make sure all directories and files specified in the environment variables exist!!! (If you followed this tutorial EXACTLY, everything should be working.)
- The script creates your environment on the LXBATCH machine and runs the simulation setup of your choosing.
- The Simulation Job Transformation runs together with digitization, therefore it takes very long. Try to simulate 50 events at most; if you want to simulate more, you need to modify the Batch.Simulation.sh script to run in longer queues (for instance change the 8 hour queue 8nh to the 2 day queue 2nd).
3.4 Making simulation script for LXBATCH step by step:
The following describes how the code of the Simulation.JobTransformation.sh script was written. The whole code is very similar to Generation.JobTransformation.sh; it only works with more parameters and more outputs, and it takes much longer to execute.
1. First we specify the environment variables, so the script works everywhere and anything that needs to be changed can be changed in one place at the beginning of the file. It is essentially the same as in the Generation.JobTransformation.sh script:
<copy all from Generation.JobTransformation.sh>
Make sure ALL these paths are in accord with your actual directories. If that is not the case, you will undoubtedly FAIL.
2. Secondly, we need to process the input parameters coming from the Batch.Simulation.sh script:
### INPUT PARAMETERS
export INPUT=$1 # input POOL ROOT file
export EVENTS=$2 # number of events to process
export SKIP=$3 # number of generated events to skip
export ID=$4 # unique run identificator of your choice
# Split the input file name at the dots
PARSE=(`echo ${INPUT} | tr '.' ' '`)
OUTPUT=${PARSE[0]}.${PARSE[1]} # name used for the CASTOR output, for easy orientation
TOTAL=${PARSE[2]} # total number of events generated in the input file
LAST=$[${EVENTS}+${SKIP}] # arithmetic evaluation
## Remove all the parameters from $1, $2, $3 and $4, otherwise "source setup.sh ..." would pick them up and probably fail
while [ $# -gt 0 ] ; do shift ; done
3. Now we need to set up the workspace:
### ISSUE CODE
## CREATE TEMPORARY WORKSPACE
echo "###################################################################################################"
echo "CREATING WORKSPACE"
mkdir Simulation.${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}
cd Simulation.${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}
# Copy the generation input file from CASTOR into the working directory
rfcp ${CASTOR_GENERATION}/${INPUT} .
# Set up the Athena environment
# Experience with newer Athena releases shows that the first setup.sh
# has to be sourced from its own directory
echo "###################################################################################################"
echo "SETTING UP CMT HOME"
export CURRENTDIR=`pwd` # remember the current directory
cd ${CMT_HOME}
echo "Your CMT home directory:" ${CMT_HOME}
source ${CMT_HOME}/setup.sh -tag=14.2.10,32
. /afs/cern.ch/atlas/software/releases/14.2.10/cmtsite/setup.sh -tag=14.2.10
export CMTPATH=/afs/cern.ch/atlas/software/releases/14.2.10/AtlasProduction/14.2.10
. /afs/cern.ch/atlas/software/releases/14.2.10/AtlasProduction/14.2.10/AtlasProductionRunTime/cmt/setup.sh
4. Now we run the job:
# Go back to working directory and run the job
echo ${CURRENTDIR}
cd ${CURRENTDIR}
# Run the Job
echo "###################################################################################################"
echo "RUNNING SIMULATION"
csc_simul_trf.py ${INPUT} hits.pool.root rdo.pool.root ${EVENTS} ${SKIP} 1324354656 ATLAS-CSC-02-01-00 100 1000
5. Finally we need to copy our simulation results from LXBATCH; the most convenient way is to put them on CASTOR using the rfcp command. Again, this is almost the same as in Generation.JobTransformation.sh, with the following exception:
.
.
.
if [ -e hits.pool.root ] ; then
rfcp hits.pool.root ${CASTOR_SIMULATION}/${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}.sim.pool.root
rfcp rdo.pool.root ${CASTOR_SIMULATION}/${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}.rdo.pool.root
fi
.
.
.
3.5 LXBATCH Simulation Submitter
Again, the script we have just made needs to be run on an LXBATCH machine, which is done with the bsub command. This submitter is very similar to the Batch.Generation.sh script; it just works with more parameters.
1. First get the parameters into variables:
export INPUT=$1 # input Generation POOL ROOT
export EVENTS=$2 # number of events to process
export SKIP=$3 # number of events to skip
export ID=$4 # unique run identifier of your choice
2. Now we have to do some more parsing, just as in the script before:
PARSE=(`echo ${INPUT} | tr '.' ' '`)
OUTPUT=${PARSE[0]}.${PARSE[1]} # name of the CASTOR output for easy orientation
TOTAL=${PARSE[2]} # total number of events generated in the input file
LAST=$[${EVENTS}+${SKIP}] # arithmetic evaluation
3. Again, here comes the magical line:
while [ $# -gt 0 ] ; do shift ; done
4. And submit:
bsub -R "type==SLC4&&swp>4000&&pool>2000" -q 8nh -o ${SCRATCH}/${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}.Screen.txt Simulation.JobTransformation.sh ${INPUT} ${EVENTS} ${SKIP} ${ID}
4 Digitization
Digitization is run together with simulation using the Job Transformation; that is why it takes so long. Simulation produces the hits.pool.root file and digitization produces the rdo.pool.root file. If for some reason you need to run digitization separately, use the following Job Transformation:
> csc_digi_trf.py <INPUT hits.pool.root> <OUTPUT rdo.pool.root> <MAX EVENTS> <SKIP EVENTS> <GEOMETRY VERSION> <SEEDS> ...
You can simply change the csc_simul_trf.py command in the simulation script to run the digitization instead.
5 Reconstruction
Reconstruction is the last step before you can view and analyse your data. Generally, it runs on the Reconstruction/RecExample/RecExCommon package. More information about how it works and how to write your JobOptions can be found here:
https://twiki.cern.ch/twiki/bin/view/Atlas/RunningReconstruction
Documentation: https://twiki.cern.ch/twiki/bin/view/Atlas/ReconstructionDocumentation
5.1 Running Reconstruction
Again, reconstruction is run just like any job:
> athena.py jobOptions.py | tee Reconstruction.Output.txt
However, this requires that you specify your input file. You can do so in your myTopOptions.py. Note that it has to be an RDO file, the digitization output. You also have to insert this file into the PoolFileCatalog.
Notes:
- Make sure you have over 50 MB of free space if you are running on LXPLUS.
- By default, the jobOptions.py script is used, which is linked to myTopOptions.py, which you can modify to suit your needs.
- Be careful about the geometry and trigger flags in your myTopOptions.py. Their order matters.
5.1.1 How to insert file to Pool File Catalog
The classical F.A.Q.; you need to do just this:
> pool_insertFileToCatalog <PATH>/<FILENAME>
> FCregisterLFN -p <PATH>/<FILENAME> -l <USERNAME>.<FILENAME>
If you are using a file from CASTOR, do not forget to add the rfio protocol like this:
rfio:/<PATH>/<FILENAME>
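For example, for an RDO file stored in the CASTOR digitization directory used in this tutorial (hypothetical file name), the two commands would look like this:
> pool_insertFileToCatalog rfio:/castor/cern.ch/user/<LETTER>/<NAME>/fullchain/digitization/MC8.PythiaZee.0-50of5000.Batch01.rdo.pool.root
> FCregisterLFN -p rfio:/castor/cern.ch/user/<LETTER>/<NAME>/fullchain/digitization/MC8.PythiaZee.0-50of5000.Batch01.rdo.pool.root -l <USERNAME>.MC8.PythiaZee.0-50of5000.Batch01.rdo.pool.root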
5.1.2 Customizing Job Options
5.1.3 Running Reconstruction using Job Transformation
You can run reconstruction using the Job Transformation as follows:
> csc_reco_trf.py <INPUT RDO.pool.root> esd.pool.root aod.pool.root ntuple.root <MAX EVENTS> <SKIP> <GEOMETRY> DEFAULT
5.2 Reconstruction on LXPLUS
You can use the Job Transformation to run reconstruction on the RDO file to obtain ESD, AOD and/or other outputs:
> csc_reco_trf.py MC8.PythiaZee.0-1of110.Local.rdo.pool.root esd.pool.root aod.pool.root ntuple.root 1 0 ATLAS-CSC-02-01-00 DEFAULT
5.3 Reconstruction on LXBATCH
To run reconstruction on LXBATCH, you need the following scripts: Reconstruction.JobTransformation.sh and Batch.Reconstruction.sh. If you did everything just as in this tutorial (including all directory and file names), you can run them without modifying anything.
What you need to do ONLY ONCE is to make the scripts executable:
> chmod +x Reconstruction.JobTransformation.sh
> chmod +x Batch.Reconstruction.sh
Now all you need to do is run the Batch.Reconstruction.sh script, which submits your job to LXBATCH. The script has four parameters you HAVE TO specify. To run the job, issue the following from the directory where you put BOTH of the scripts (a concrete example follows the parameter list below):
> ./Batch.Reconstruction.sh <DIGITIZATION POOL ROOT> <EVENTS> <SKIP> <ID>
- DIGITIZATION POOL ROOT is the file we obtained from digitization (in our case this step is identical to the simulation step). It should be in your $CASTOR_HOME/fullchain/digitization folder. All you need to do is COPY/PASTE its name; the script downloads and accesses it automatically (string).
- EVENTS is the number of events you want to reconstruct (int).
- SKIP is the number of events you want to skip (int).
- ID is an identifier of your choosing (string).
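For example, reconstructing the 50 digitised events from the simulation step above (hypothetical file and identifier names) would be submitted as:
> ./Batch.Reconstruction.sh MC8.PythiaZee.0-50of5000.Batch01.rdo.pool.root 50 0 Batch01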
NOTES: You can run the submitter Batch.Reconstruction.sh from any folder (public, private - it does not matter). Again, double-check that:
- Both scripts are executable.
- All directories and files specified in the environment variables exist.
5.3.1 Making customizable LXBATCH reconstruction script
At this point there is little sense in repeating how to write a Job Transformation LXBATCH script, because all you need to do is change a few lines in the code. The script has been provided as the attachment Reconstruction.JobTransformation.sh. What we should do now is write a more customizable reconstruction script that allows us to play with the Job Options and Athena packages (Reconstruction.Custom.sh):
1. First things first, double-check that the following environment variables suit your setup, as in all previous scripts:
### ENVIRONMENT SETUP
## LOCAL (your AFS environment)
# export HOME=/afs/cern.ch/user/m/mzeman # uncomment and change if missing
# export CASTOR_HOME=/castor/cern.ch/user/m/mzeman # uncomment and change if missing
export CMT_HOME=${HOME}/cmt-fullchain
export FULL_CHAIN=${HOME}/testarea/FullChain/
# CASTOR (your CASTOR environment)
export CASTOR_GENERATION=${CASTOR_HOME}/fullchain/generation
export CASTOR_SIMULATION=${CASTOR_HOME}/fullchain/simulation
export CASTOR_DIGITIZATION=${CASTOR_HOME}/fullchain/digitization
export CASTOR_RECONSTRUCTION=${CASTOR_HOME}/fullchain/reconstruction
export CASTOR_TEMP=${CASTOR_HOME}/fullchain/temp
export CASTOR_LOG=${CASTOR_HOME}/fullchain/log
2. Now let us again go through the input parameters. Apologies for the overly complicated parsing; it is needed to extract the number of digitised events from the RDO input file name.
### INPUT PARAMETERS
export INPUT=$1 # input RDO file
export EVENTS=$2 # number of events to process
export SKIP=$3 # number of generated events to skip
export ID=$4 # unique run identificator of your choice
# Split the input file name at the dots
PARSE=(`echo ${INPUT} | tr '.' ' '`)
OUTPUT=${PARSE[0]}.${PARSE[1]} # name used for the CASTOR output, for easy orientation
PARSE=(`echo ${PARSE[2]} | tr '-' ' '`) # further parsing to obtain the event range from the file name
PARSE=(`echo ${PARSE[1]} | tr 'of' ' '`) # tr replaces the characters 'o' and 'f', which is fine for purely numeric fields
TOTAL=${PARSE[0]} # number of digitised events in the input RDO file (taken from the file name)
LAST=$[${EVENTS}+${SKIP}] # arithmetic evaluation
## Remove all the parameters from $1, $2, $3 and $4, otherwise "source setup.sh ..." would pick them up and probably fail
while [ $# -gt 0 ] ; do shift ; done
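To make the parsing above easier to follow, here is how the variables come out for a hypothetical input file MC8.PythiaZee.0-50of5000.Batch01.rdo.pool.root reconstructed with EVENTS=50 and SKIP=0 (a worked example, not part of the script):
# INPUT  = MC8.PythiaZee.0-50of5000.Batch01.rdo.pool.root
# OUTPUT = MC8.PythiaZee    # PARSE[0].PARSE[1] after the first split at the dots
# PARSE  = (0 50of5000)     # "0-50of5000" split at the dash
# PARSE  = (50 5000)        # "50of5000" split at the characters 'o' and 'f'
# TOTAL  = 50               # number of digitised events in the RDO file
# LAST   = 50               # EVENTS + SKIP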
3. As for the code itself, it is quite different from the one used for the Job Transformations; however, the workspace setup is the same:
# Delete directory if exists
if [ -d Reconstruction.${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID} ] ; then
rm -fR Reconstruction.${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}
fi
# Create new directory
mkdir Reconstruction.${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}
cd Reconstruction.${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}
# Show the power of the processor
grep MHz /var/log/dmesg
4. Since normal Athena jobs require a CMT setup, we have to create a local cmt directory (this becomes the CMT home for the batch job):
export CURRENTDIR=`pwd` # remember the current directory
# Create CMT directory
mkdir cmt
cd cmt
5. Create the
requirements
file:
touch requirements
cat <<EOF >|requirements
#---- CMT HOME REQUIREMENTS FILE ---------------------------------
set CMTSITE CERN
set SITEROOT /afs/cern.ch
macro ATLAS_DIST_AREA \${SITEROOT}/atlas/software/dist
macro ATLAS_TEST_AREA /afs/cern.ch/user/m/mzeman/testarea/FullChain
apply_tag oneTest # use ATLAS working directory
apply_tag setup # use working directory
apply_tag 32 # use 32-bit
apply_tag ${RELEASE}
use AtlasLogin AtlasLogin-* \$(ATLAS_DIST_AREA)
#----------------------------------------------------------------
EOF
echo "YOUR REQUIREMENTS FILE:"
cat requirements
We use cat to make sure the contents of the requirements file appear in the screen output of bsub.
6. Source
CMT!
export CMT_ROOT=/afs/cern.ch/sw/contrib/CMT/${CMT_VERSION}
source ${CMT_ROOT}/mgr/setup.sh
which cmt
cmt config
source setup.sh -tag=${RELEASE},32
export CMTPATH=/afs/cern.ch/atlas/software/releases/$RELEASE/AtlasProduction/$PCACHE
source /afs/cern.ch/atlas/software/releases/${RELEASE}/AtlasOffline/${PCACHE}/AtlasOfflineRunTime/cmt/setup.sh
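Note that the snippet above assumes the variables CMT_VERSION, RELEASE and PCACHE have been set earlier in the script. For the setup used throughout this tutorial they would be something like the following (an assumption, adjust to your installation):
export CMT_VERSION=v1r20p20070208   # CMT version used in section 1.1
export RELEASE=14.2.10              # Athena release
export PCACHE=14.2.10               # production cache version used in the release paths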
7. Insert the
RDO input file into the
Pool File Catalog:
# Go back to working directory
echo ${CURRENTDIR}
cd ${CURRENTDIR}
if [ -f PoolFileCatalog.xml ] ; then
rm -f PoolFileCatalog.xml
fi
pool_insertFileToCatalog rfio:${CASTOR_DIGITIZATION}/${INPUT} # create PoolFileCatalog.XML
FCregisterLFN -p rfio:${CASTOR_DIGITIZATION}/${INPUT} -l ${CASTOR_DIGITIZATION}/`whoami`.${INPUT}
8. Now we need to specify our own Job Options file and save it as myTopOptions.py. The maximum number of events, EvtMax, is written first, according to the number of events specified when running the script. Please note that the order of the flags and includes in the code DOES MATTER:
touch myTopOptions.py
# FLAGS NEED TO COME FIRST
if [ ${EVENTS} -ne 0 ] ; then
echo "Setting EvtMax to $EVENTS"
echo "### NUMBER OF EVENTS" >> myTopOptions.py
echo "EvtMax=${EVENTS}" >> myTopOptions.py
fi
cat >> myTopOptions.py << EOF
### INPUT FILE (POOL FILE CATALOG needs to be defined)
PoolRDOInput = ["rfio:${CASTOR_DIGITIZATION}/${INPUT}"]
### GEOMETRY SELECTION
DetDescrVersion="ATLAS-CSC-02-01-00" # new geometry for Job Transformations
# DetDescrVersion="ATLAS-CSC-01-02-00" # default geometry
### GENERAL FLAGS
# doTrigger = False # for example do not run trigger simulation
# doTruth=False
### INCLUDE YOUR OWN ALGORITHMS(s)
# UserAlgs=[ "MyPackage/MyAlgorithm_jobOptions.py" ]
### ESD output CONFIGURATION
# doESD=False
# doWriteESD=False
### TRIGGER CONFIGURATION
#(see https://twiki.cern.ch/twiki/bin/view/Atlas/TriggerFlags)
include ("TriggerRelease/TriggerFlags.py") # Trigger Flags
TriggerFlags.doLVL1=True
TriggerFlags.doLVL2=True
TriggerFlags.doEF=True
# ANALYSIS OBJECT DATA output CONFIGURATION
# (see https://twiki.cern.ch/twiki/bin/view/Atlas/UserAnalysisTest#The_AOD_Production_Flags)
# doAOD=False
# doWriteAOD=False
# doWriteTAG=False
# from ParticleBuilderOptions.AODFlags import AODFlags
### DETECTOR FLAGS
# switch off Inner Detector, Calorimeters, or Muon Chambers
#include ("RecExCommon/RecExCommon_flags.py")
#DetFlags.Muon_setOff()
#DetFlags.ID_setOff()
#DetFlags.Calo_setOff()
### MAIN JOB OPTIONS
include ("RecExCommon/RecExCommon_topOptions.py")
### USER MODIFIER
## ATLANTIS
# if needed to create JiveXML for Atlantis
include("JiveXML/JiveXML_jobOptionBase.py")
include("JiveXML/DataTypes_All.py")
EOF
echo "YOUR JOB OPTIONS:"
cat myTopOptions.py
echo ""
Again we use cat to show the contents of myTopOptions.py in the screen output.
9. Et voilà! Run the job on myTopOptions.py:
echo "RUNNING: athena.py myTopOptions.py"
athena.py myTopOptions.py
10. Copy out the results, if they exist (ESD, AOD, TAG, the JiveXML files packed together in one tar archive, and NTUPLE):
if [ -e ESD.pool.root ] ; then
echo "ESD file found, copying ..."
rfcp ESD.pool.root ${CASTOR_RECONSTRUCTION}/${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}.esd.pool.root
else
echo "No ESD file found."
fi
if [ -e AOD.pool.root ] ; then
echo "AOD file found, copying ..."
rfcp AOD.pool.root ${CASTOR_RECONSTRUCTION}/${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}.aod.pool.root
else
echo "No AOD file found."
fi
if [ -e TAG.pool.root ] ; then
echo "TAG files found, copying ..."
rfcp TAG.pool.root ${CASTOR_RECONSTRUCTION}/${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}.tag.pool.root
# Pack all JiveXML outputs into one tar file
tar cf JiveXML.tar Jive*
rfcp JiveXML.tar ${CASTOR_RECONSTRUCTION}/${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}.JiveXML.tar
else
echo "No TAG files found."
fi
if [ -e ntuple.root ] ; then
echo "NTUPLE file found, copying ..."
rfcp ntuple.root ${CASTOR_RECONSTRUCTION}/${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}.ntuple.root
else
echo "No NTUPLE file found."
fi
11. In the end, we just list all files and directories in the workspace for debugging purposes:
ls -lRt
12. Finally, clean workspace and exit:
cd ..
rm -fR Reconstruction.${OUTPUT}.${SKIP}-${LAST}of${TOTAL}.${ID}
5.3.2 LXBATCH reconstruction submitter
This file is essentially the same as the LXBATCH generation and simulation submitters. It has been attached. Now let us go through writing a script that will enable us to submit more jobs in parallel and choose whether we want them customized or not.
.
.
.
still at work
.
.
.
5.4 Reconstruction on the GRID
6 Analysis
6.1 Analysis Packages
cmt co -r AnalysisExamples-00-20-14 PhysicsAnalysis/AnalysisCommon/AnalysisExamples
6.2 ROOT
Simple. Run your X server and type:
> root <MY RECONSTRUCTED FILE>
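For instance, to have a quick look at the combined ntuple from the reconstruction step (hypothetical file name), open it in ROOT and start a TBrowser:
> root MC8.PythiaZee.0-50of5000.Batch01.ntuple.root
root [1] new TBrowser()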
6.3 Atlantis
Atlantis is a Java-based event display tool that can be run on any computer and is therefore not dependent on Athena.
Documentation and download: http://cern.ch/atlantis
Online Atlantis (requires Java): http://www.hep.ucl.ac.uk/atlas/atlantis/webstart/atlantis.jnlp
6.3.1 How to create JiveXML?
Assumptions: your reconstruction package is installed and working (see http://www.hep.ucl.ac.uk/atlas/atlantis/?q=jivexml).
Solutions: a) Either insert the following includes into your myTopOptions.py:
include("JiveXML/JiveXML_jobOptionBase.py")
include("JiveXML/DataTypes_All.py")
OR b) change the flags directly in your package (e.g. RecExCommon_flags.py) as follows:
# --- write Atlantis xml file
JiveXML = True
OnlineJiveXML = False
# --- write Atlantis geometry xml file (JiveXML needs to be set True)
AtlantisGeometry = True
Notes:
- Each XML output file can range from a few kB to a few MB in size, so be careful about your quota.
- Many of the JiveXML outputs can be empty (no visible event reconstructed).
- (Of course) you need to run the reconstruction again after making these changes.
6.4 Virtual Point 1
Virtual Point 1 is a 3D ATLAS visualization tool within Athena. In order to run the program, you need an SSH connection with X11 forwarding enabled and your X server running. Windows users, look here:
http://www-hep2.fzu.cz/twiki/bin/view/ATLAS/WindowsRelated
You need to run Virtual Point 1 on an ESD file, like this:
> vp1 <YOUR ESD.pool.root FILE>
Documentation here:
http://cern.ch/atlas-vp1
-- MartinZeman - 28 Jul 2008
-- MichalMarcisovsky - 28 Jul 2008