6. Advanced topics

This section covers some more advanced, but very important, concepts and topics within Condor.

Knowledge of these topics can be very advantageous when trying to get the most out of the system and when troubleshooting or debugging your jobs.

6.1. Condor class ads

HTCondor stores a list of information about each job, machine and almost every entity in the pool. This information list is called a ClassAd. HTCondor juggles jobs and resources by matching one to the other based on the information contained in their ClassAds.

ClassAds are basically a list of attribute = expression entries.

ClassAds provide a lot of useful information about jobs and machines, which comes in handy both for debugging and for taking full advantage of the system.

More information about ClassAds can be found here: https://htcondor.readthedocs.io/en/latest/users-manual/matchmaking-with-classads.html

6.1.1. Job ClassAds

Submit file + HTCondor configuration = Job ClassAd

Job ClassAds are viewable with condor_q using the long option:

Example output from the condor_q -l command
abc123@condor:~ $ condor_q 269 -l
BufferBlockSize = 32768
BufferSize = 524288
BytesRecvd = 0.0
BytesSent = 0.0
ClusterId = 269
Cmd = "/usr/sbin/sshd"
CommittedSlotTime = 0
DockerImage = "kopolyzo/condor"
EnteredCurrentStatus = 1525683772
Environment = "publishports=2000 mount=/vol/vssp/reframe,/vol/feps/cookbookversions,/stornext"
Err = "269.errorlog.0"
ExecutableSize = 1000
ExecutableSize_RAW = 773
ExitBySignal = false
ExitStatus = 0
...
  • Useful Job attributes:

    UserLog:

    Location of the job log

    RemoteHost:

    Where the job is running

    AssignedGPUs:

    GPUs assigned to the job

    CPUs:

    Number of CPUs available in the slot
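
Individual attributes can be printed on their own with condor_q's -af (autoformat) option, which saves scrolling through the whole ClassAd. As an illustrative example, using the job ID 269 from the listing above:

abc123@condor:~ $ condor_q 269 -af RemoteHost UserLog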

6.1.2. Machine ClassAds

Machine (discovered attributes about machine) + HTCondor configuration = Machine ClassAd

What condor_q is for jobs, condor_status is for machines.

By default it gives an overview of the pool's machines and their status. Similar to condor_q's batch summary, condor_status can be used with the -compact option for a condensed view.

Example output from the condor_status -compact command
abc123@condor:~ $ condor_status -compact
Machine                          Platform   Slots Cpus Gpus  TotalGb FreCpu  FreeGb  CpuLoad ST Jobs/Min MaxSlotGb

bilbo.eps.surrey.ac.uk           x64/LINUX0     3   16        125.92     10   113.92    0.00 **     0.00      4.00
gnasher1.eps.surrey.ac.uk        x64/LINUX0     0   16        141.66     16   141.66    0.00 Ui     0.00 *
gnasher2.eps.surrey.ac.uk        x64/LINUX0     0   16        141.66     16   141.66    0.01 Ui     0.00 *
gnasher3.eps.surrey.ac.uk        x64/LINUX0     0   16        141.66     16   141.66    0.00 Ui     0.00 *
gnasher4.eps.surrey.ac.uk        x64/LINUX0     0   16    0   141.66     16   141.66    0.00 Ui     0.00 *
willow.eps.surrey.ac.uk          x64/LINUX0     0   24    3    31.38     19    25.10    0.00 Ui     0.00 *
xenial-lab-test.eps.surrey.ac.uk x64/LINUX0     1    8    1     7.75      4     2.20    0.00 **     0.00      4.00

                    Total Owner Claimed Unclaimed Matched Preempting Backfill  Drain

        x64/LINUX0    11     0       4         7       0          0        0      0

             Total    11     0       4         7       0          0        0      0

Machine ClassAds are viewable with condor_status -l:

Example output from the condor_status -l command
abc123@condor:~ $ condor_status -l willow
Activity = "Idle"
AddressV1 = "{[ p=\"primary\"; a=\"131.227.85.20\"; port=3050; n=\"Internet\"; ], [ p=\"IPv4\"; a=\"131.227.85.20\"; port=3050; n=\"Internet\"; ], [ p=\"IPv6\"; a=\"::1\"; port=3050; n=\"Internet\"; ]}"
Arch = "X86_64"
AssignedGPUs = "CUDA0,CUDA1,CUDA2"
AuthenticatedIdentity = "condor_pool@surrey.ac.uk"
AuthenticationMethod = "PASSWORD"
CanHibernate = false
CheckpointPlatform = "LINUX X86_64 4.4.0-112-generic normal N/A avx ssse3 sse4_1 sse4_2"
CUDACapability = 6.1
CUDAClockMhz = 1582.0
CUDAComputeUnits = 30
CUDACoresPerCU = 192
CUDADeviceName = "TITAN Xp COLLECTORS EDITION"
CUDAGlobalMemoryMb = 12190
...
  • Useful Machine attributes:

    CUDA*:

    Related to NVidia GPUs and their capabilities

    Has*:

    A variety of capabilities

For example:

  • HasWeka: the execute node is connected to the WEKA storage (high-performance scratch storage).

  • HasStornext: the execute node is connected to the Quantum StorNext storage system (Project Spaces).
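
These attributes can be used to find suitable machines. As an illustrative sketch (HasWeka is a site-specific attribute, as noted above), the following lists only the machines advertising a WEKA connection; the same expression can also be used in a submit file's requirements line:

abc123@condor:~ $ condor_status -compact -constraint 'HasWeka == True'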

6.2. Condor priorities & matchmaking

6.2.1. User priorities

(Effective) User Priority = Real User Priority * Priority Factor

Real priority is based on actual usage and, over time, approaches the actual number of resources used. There is a decay factor in this calculation so that the system gradually forgets past usage; by default, usage over the past day is counted.

Priority factor is administrator assigned, and by default is the same for all users.
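
As a quick illustrative calculation: a user with no recent usage sits at the minimum real priority of 0.5, so with a priority factor of 1000 their effective priority is 0.5 * 1000 = 500, which matches the 500.00 entries in the condor_userprio output below.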

condor_userprio shows the current user priorities.

using condor_userprio to check user priority
abc123@condor:~ $ condor_userprio  -allusers
Last Priority Update:  5/7  22:53
                    Effective   Priority   Res   Total Usage  Time Since
User Name              Priority    Factor   In Use (wghted-hrs) Last Usage
-------------------- ------------ --------- ------ ------------ ----------
user1@surrey.ac.uk       500.00   1000.00      0        74.00   11+13:39
someguy@surrey.ac.uk       500.00   1000.00      0       348.64   34+14:31
abc123@surrey.ac.uk       6687.15   1000.00      4      1832.44      <now>
-------------------- ------------ --------- ------ ------------ ----------
Number of users: 3                               4      2255.08

Note

With user priorities, the smaller the number, the higher the priority.

More information is available at: https://htcondor.readthedocs.io/en/latest/users-manual/priorities-and-preemption.html

6.2.2. Job ranking

Users can steer jobs towards certain machines by ranking them.

For example, to prefer willow over all other nodes, use Rank = ( machine == "willow.eps.surrey.ac.uk" ) in your submit file; to prefer machines with more available memory, use Rank = memory.
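
As an illustrative sketch, either expression simply goes into the submit file alongside your other settings (my_program is just a placeholder):

executable = my_program

# Prefer willow over all other machines...
rank = ( machine == "willow.eps.surrey.ac.uk" )
# ...or, alternatively, prefer machines with more available memory:
# rank = memory

queue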

More information is available at: https://htcondor.readthedocs.io/en/latest/users-manual/submitting-a-job.html#about-requirements-and-rank

6.2.3. Machine ranking

Machines can also rank the jobs they will choose to execute. This is controlled by administrators and/or machine owners.

Although users have no control over this, it is useful to know that machines which rank their owner's jobs higher than any others will in fact evict any other running job in their favour.

6.3. PScratch

Note

pscratch (persistent-scratch) is a locally developed Condor plugin. It is specific to using Condor here at Surrey and is currently exclusive to the AISURREY Condor pool.

The overhead of transferring files to local scratch at the start of each new job led us to investigate the idea of a more persistent scratch space. (Faster network storage solutions coming to Surrey in the future should reduce our need for this.)

Each condor execute machine has a limited amount of local disk. PScratch capability is enabled on machines with enough disk to support it, such as AI @ Surrey machines which have about 3TB of local disk. On machines with PScratch capability, the scratch space is shared between:

  • Job scratch area, which is allocatable through the scheduler and is attached to the lifespan of a job.

  • PScratch (Persistent Scratch) area, which is allocatable on a per user/area basis, through the scheduler, but is not attached to the lifespan of a job, and as such persists to be utilised by subsequent jobs.

Data transferred in to the pscratch area will persist over multiple Condor jobs run on that machine. This allows use of local, fast disk on the compute node without the overhead of data transfer prior to each job run.

PScratch space is well suited to moderately sized, read-only data that remains unchanged between multiple runs of the same or similar jobs.

6.3.1. How pscratch works

With PScratch, Condor copies your selected data across to the compute node ahead of the first job running on the node. The local copy of your data will be made available under /scratch/pscratch/<username>/<dir-name> where <username> is your username and <dir-name> is the name of the directory where you assembled your collection of data. The data is only copied once per machine; a later, second job will not re-transfer the data (so saving the usual overhead associated with the normal transfer_input_files functionality of Condor).

Warning

As subsequent jobs do not re-transfer or re-sync the data, PScratch is only for data that is unmodified by the processing jobs being executed.

Once copied onto the compute node, your data will remain there as long as you are actively using the area. When a PScratch area is no longer being requested by jobs it is automatically cleaned up after 3 days of inactivity. To inform Condor that your jobs are using the PScratch area, continue to specify the pscratch entry (see below) in your submit file (even when the data has already been synced across - Condor will know not to re-transfer it).

Warning

This is still scratch space. Outside of your original copy of the data, no backup of the data held in the PScratch area exists. The copy in the PScratch area will ultimately be deleted.

6.3.2. How to use pscratch

To use this facility, first prepare a directory containing all the data you would like to host on a compute node. Make a note of the disk size consumed by this area. Modify your submit file to inform Condor that you would like to store this amount of data in the PScratch area by adding a +WantPscratch=<size in KB> setting.

Modify your submit file to inform Condor where to take the data copy from: add a transfer_input_files statement to the submit file with the full path to your collected data directory prefixed by pscratch:// (the path's leading slash merges with the prefix, so the value contains a double slash - pscratch://vol/research/myproject/mypscratchmaster…).

For example:

  1. Set up the collection of data to be transferred, if such an area doesn't exist already, and note its size:

    $ mkdir /vol/research/my_project_area/pscratch_data
    $ cp -a /vol/research/my_project_area/data_gp1 /vol/research/my_project_area/pscratch_data
    $ cp -a /vol/research/my_project_area/data_gp4 /vol/research/my_project_area/pscratch_data
    $ ln /vol/research/my_project_area/big_file /vol/research/my_project_area/pscratch_data
    $ du -sh /vol/research/my_project_area/pscratch_data
    19M /vol/research/my_project_area/pscratch_data
    
  2. Modify submit file as necessary

    ...
    
    # Request 19MB of PScratch space
    +WantPscratch = 19000
    
    # Sync in data on first job to /scratch/pscratch/<user>/pscratch_data
    transfer_input_files = pscratch://vol/research/my_project_area/pscratch_data
    
    ...
    
  3. Make any necessary changes to your application/config files to pick up the data from the PScratch area rather than from your usual locations. For this example the PScratch area will be /scratch/pscratch/<user>/pscratch_data.

  4. Launch your job in the usual way:

    $ condor_submit submit-file
    

6.3.3. pscratch manual cleanup

Although Condor will automatically delete your PScratch area when you cease to use it, owing to the limited space available on the compute nodes we encourage people to delete any PScratch areas they no longer require.

You can quickly query the pscratch usage across the pool in general or with a specific user in focus with the following:

$ condor_status -compact -af:h machine pscratch_users | grep <your-username>

We provide a script that can delete any PScratch area that is no longer required. This script, pscratch-delete, is available on the Condor submit nodes. Run this script specifying the name of the PScratch area you wish to delete and the names of the machines where you wish to remove it from.

For example, to clean down a PScratch area named pscratch_data created for the user un9999, the user would:

$ condor_status -compact -af:h machine pscratch_users | grep un9999
aisurrey01.surrey.ac.uk un9999/pscratch_data
aisurrey09.surrey.ac.uk an01234/test un9999/pscratch_data

$ pscratch-delete pscratch_data aisurrey01 aisurrey09
Queuing Condor jobs to remove the pscratch_data area from the selected machine(s)
  Submitting job(s).
  1 job(s) submitted to cluster 1234.
  Submitted deletion job to aisurrey01.surrey.ac.uk
  Please check the log files:
            ./pscratch-delete-pscratch_data-aisurrey01.surrey.ac.uk.<job-number>.log
            ./pscratch-delete-pscratch_data-aisurrey01.surrey.ac.uk.<job-number>.out
            ./pscratch-delete-pscratch_data-aisurrey01.surrey.ac.uk.<job-number>.err
  after the Condor job has finished

Submitting job(s).
  1 job(s) submitted to cluster 1235.
  Submitted deletion job to aisurrey09.surrey.ac.uk
  Please check the log files:
            ./pscratch-delete-pscratch_data-aisurrey09.surrey.ac.uk.<job-number>.log
            ./pscratch-delete-pscratch_data-aisurrey09.surrey.ac.uk.<job-number>.out
            ./pscratch-delete-pscratch_data-aisurrey09.surrey.ac.uk.<job-number>.err
   after the Condor job has finished
$

The pscratch-delete script submits a Condor job on each machine to remove the unwanted PScratch area. Please check for successful completion of these jobs as you would with any other regular job.

Note

  • Copying of the data is performed on the compute node. Compute nodes are connected to the storage over fast links, so copying data into the local pscratch area is quick. This is faster than the default Condor transfer mechanism, which copies data from the submit node to the worker node over a regular connection.

  • Condor will attempt to direct a job that requires an area on pscratch to machines that already have the PScratch area installed on them.

  • Condor will automatically make the pscratch area available to your job, if you use the pscratch:// transfer plugin in your transfer_input_files.

6.4. Docker and Condor

It is possible to run a Condor job inside a containerized environment such as a Docker container.

If you are familiar with Docker you will already understand the benefits this offers. If you aren't familiar with Docker or containerization, some background reading on the subject is recommended first:

https://www.docker.com/resources/what-container

To summarise, running a job inside a container will allow you to better manage the environment your processes are running in and ensure consistency and easy replication of this environment.

Execute nodes with the ability to run jobs inside containers will advertise their ability to do so by including "HasDocker = True" in their ClassAd. See Condor class ads for more info.
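
To see which machines currently advertise this capability, one illustrative query is:

abc123@condor:~ $ condor_status -compact -constraint HasDocker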

Docker jobs still run under your user name and its associated permissions.

6.4.1. Running a docker job

To run a docker job include the following in your job submit file:

universe = docker:

Tells HTCondor this is a docker job

docker_image = nvidia/cuda:

The name of the docker image to use. The image must be available in an accessible repository, such as DockerHub.

should_transfer_files = YES:

Make sure you transfer needed files. Shared filesystems are not automatically available inside the container.

executable = /path/to/executable:

What to execute inside the container (command-line options go in a separate arguments = line). Can be omitted if the image specifies a command to execute on launch.

environment = "VAR=VAL":

Add variables to the job's environment. This is also used to control published network ports and storage mounts (see below).

Docker jobs are launched through a wrapper that adds a few things to each container:

  • User Home directory

    • Available at the same location, automatically.

  • Network ports via environment = "publishports = 5000,6006" (comma separated)

    • If the process running inside the container exposes certain features through network ports, you can request that those ports be published.

  • Storage Locations via environment = "mount = /vol/research/XYZ,/vol/research/ABC" (comma separated)

    • You can request storage locations available on the host to be mounted inside the container, subject to access privileges. A combined example is shown below.
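
Putting these pieces together, a minimal sketch of a Docker job submit file might look like the following (the script name, port, mount path and resource requests are placeholders rather than required values):

# Illustrative Docker job submit file
universe     = docker
docker_image = nvidia/cuda

# Script to run inside the container (transferred to the execute node)
executable   = my_script.sh
should_transfer_files = YES

# Publish a port and mount a project area inside the container
environment  = "publishports=6006 mount=/vol/research/myproject"

request_cpus   = 1
request_memory = 8G
request_gpus   = 1

output = docker.$(Cluster).out
error  = docker.$(Cluster).err
log    = docker.$(Cluster).log

queue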

6.4.2. Using custom Docker images

For information on how to create, publish and use custom Docker images, please see Creating and publishing custom container images.

6.4.2.1. Using your custom image in your Condor job

  • To pull the 'latest' version of your custom image from container-registry.surrey.ac.uk once published, edit your HTCondor submit file as in the example below:

    # Say you use Docker.
    universe = docker
    # Specify the location of the Docker Image.
    docker_image = container-registry.surrey.ac.uk/shared-containers/<container_name>:latest
    # Set environment variables: here, mount a directory and let OpenBLAS use the number of CPUs you requested.
    environment = "mount=/some/directory/ OMP_NUM_THREADS=$(request_CPUs)"
    

Your condor job should now run inside a container running your custom docker image.