Operating an HTCondor-CE
To verify that you have a working installation of HTCondor-CE, ensure that all the relevant services are started and enabled, then perform the validation steps below.
Managing HTCondor-CE services
In addition to the HTCondor-CE job gateway service itself, there are a number of supporting services in your installation. The specific services are:
|Your batch system||
|(Optional) APEL uploader||
Start and enable the services in the order listed and stop them in reverse order.
As a reminder, here are common service commands (all run as
|To...||On EL7, run the command...|
|Start a service||
|Stop a service||
|Enable a service to start on boot||
|Disable a service from starting on boot||
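The ordering rule above (start and enable in the listed order, stop in reverse) can be sketched as a small script. This is only an illustration: it assumes services are managed with systemd's `systemctl`, as is standard on EL7, and the unit names in `SERVICES` are placeholders to replace with the units actually installed on your host. The functions print the commands rather than running them, so you can review the order first.

```shell
#!/bin/sh
# Placeholder unit names, listed in start order; replace with the units on your host.
SERVICES="my-batch-system condor-ce"

# Print the systemctl commands that start and enable every service, in listed order.
start_commands() {
    for svc in $SERVICES; do
        echo "systemctl start $svc"
        echo "systemctl enable $svc"
    done
}

# Print the systemctl commands that stop every service, in reverse order.
stop_commands() {
    reversed=""
    for svc in $SERVICES; do
        reversed="$svc $reversed"   # prepend, so the last service ends up first
    done
    for svc in $reversed; do
        echo "systemctl stop $svc"
    done
}

start_commands
stop_commands
```

Once the printed order looks right, the output can be piped to `sh` by a user with permission to manage services.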
To validate an HTCondor-CE, perform the following steps:
Verify that local job submissions complete successfully from the CE host. For example, if you have a Slurm cluster, run
sbatch from the CE and verify that it runs and completes with
Verify that all the necessary daemons are running with condor_ce_status -any.
Verify the CE's network configuration using condor_ce_host_network_check.
Verify that jobs can complete successfully using condor_ce_trace.
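The three CE-side checks above can be chained into one script that stops at the first failure. The sketch below assumes each tool reports failure through a non-zero exit status and that `condor_ce_trace` takes the CE hostname as its argument; `ce.example.org` is a hypothetical hostname. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Preview mode by default: commands are printed, not executed.
# Invoke with RUN set to an empty string (RUN= ./script) to execute them for real.
RUN="${RUN-echo}"

# Run the validation commands in order and stop at the first non-zero exit status.
validate_ce() {
    ce_host="$1"   # hypothetical: the fully qualified hostname of your CE
    $RUN condor_ce_status -any || return 1
    $RUN condor_ce_host_network_check || return 1
    $RUN condor_ce_trace "$ce_host" || return 1
}

validate_ce "${1:-ce.example.org}"
```

The local batch submission check (for example `sbatch` on a Slurm cluster) is left out because it depends on your batch system.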
Draining an HTCondor-CE
To drain an HTCondor-CE of jobs, perform the following steps:
Set CONDORCE_MAX_JOBS = 0 in the HTCondor-CE configuration
Run condor_ce_reconfig to apply the configuration change
Use condor_ce_rm as needed to stop and remove any jobs that should stop running
Once draining is complete, remember to restore CONDORCE_MAX_JOBS to its previous value
before operating the HTCondor-CE again.
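One way to make the drain easy to undo is to keep the setting in its own drop-in file, so that deleting the file restores whatever value was configured before. The sketch below assumes the CE reads drop-in files from `/etc/condor-ce/config.d` and that a file named `99-drain.conf` is read after the file holding your normal value; both the directory default and the file name are illustrative assumptions, not something stated above.

```shell
#!/bin/sh
# Assumed drop-in directory and hypothetical file name; adjust both for your host.
CONFIG_DIR="${CONFIG_DIR:-/etc/condor-ce/config.d}"
DRAIN_FILE="99-drain.conf"

# Write the setting that stops the CE from accepting new jobs.
write_drain_config() {
    echo "CONDORCE_MAX_JOBS = 0" > "$1/$DRAIN_FILE"
}

# Delete the drop-in so the previously configured value applies again.
remove_drain_config() {
    rm -f "$1/$DRAIN_FILE"
}

# Typical sequence, run by a user allowed to edit the CE configuration:
#   write_drain_config "$CONFIG_DIR" && condor_ce_reconfig
#   condor_ce_rm 1.0        # hypothetical job ID; only for jobs that must stop
#   remove_drain_config "$CONFIG_DIR" && condor_ce_reconfig
```

If your site sets CONDORCE_MAX_JOBS in a file that sorts after the drop-in, edit that file directly instead.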
Checking User Authentication
There are two primary authentication methods for submitting jobs to
an HTCondor-CE: GSI (currently being phased out) and SciTokens.
To see which authentication method and identity were used to submit
a particular job (or modify existing jobs), you can look in
If GSI authentication was used, you'll see a set of lines like this:
10/15/21 17:52:32 (cid:14) (D_AUDIT) Command=QMGMT_WRITE_CMD, peer=<172.17.0.2:41045>
10/15/21 17:52:32 (cid:14) (D_AUDIT) AuthMethod=GSI, AuthId=/DC=org/DC=opensciencegrid/C=US/O=OSG Software/OU=People/CN=testuser, CondorId=firstname.lastname@example.org
10/15/21 17:52:32 (cid:14) (D_AUDIT) Submitting new job 1.0
If SciTokens authentication was used, you'll see a set of lines like this:
10/15/21 17:54:08 (cid:130) (D_AUDIT) Command=QMGMT_WRITE_CMD, peer=<172.17.0.2:37869>
10/15/21 17:54:08 (cid:130) (D_AUDIT) AuthMethod=SCITOKENS, AuthId=https://demo.scitokens.org,htcondor-ce-dev, CondorId=email@example.com
10/15/21 17:54:08 (cid:130) (D_AUDIT) Submitting new job 2.0
Lines pertaining to the same client request will have the same cid value.
Lines from different client requests may be interleaved.
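Because requests may be interleaved, it can help to filter the log down to a single request. The helper functions below are a minimal sketch that relies only on the `(cid:N) (D_AUDIT)` and `AuthMethod=` fields visible in the examples above; the function names are hypothetical, and the log file path is passed as an argument because it depends on your installation.

```shell
#!/bin/sh
# Print every audit line that belongs to one client request.
#   $1: path to the log file (pass the file used on your CE)
#   $2: connection ID, e.g. 14 for lines tagged "(cid:14)"
audit_lines_for_cid() {
    grep -F "(cid:$2) (D_AUDIT)" "$1"
}

# Print the authentication method recorded for one client request.
auth_method_for_cid() {
    audit_lines_for_cid "$1" "$2" | sed -n 's/.*AuthMethod=\([A-Za-z]*\),.*/\1/p'
}
```

For the sample lines shown earlier, calling `auth_method_for_cid` with connection ID 130 would pick out the `AuthMethod` field of the SciTokens request and skip the lines from the GSI request.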