HTCondor

Important

HTCondor is still in active use at Surrey, but it is being phased out in favour of the SLURM job scheduler across all our HPC clusters. Consolidating onto a single job scheduling technology is essential to providing a consistent, high-quality user experience going forward.

AISURREY is currently being migrated to the SLURM software stack.

What is HTCondor?

HTCondor is software that creates a High-Throughput Computing environment. Such an environment delivers a large amount of computing power over a long period of time. HTCondor can manage a dedicated cluster of computers, but its true power comes from its ability to effectively harness non-dedicated, pre-existing resources under distributed ownership.

Like other batch systems, HTCondor provides a job queuing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. Users submit their jobs to an HTCondor queue. HTCondor then chooses when and where to run the jobs, and monitors each job through to its completion.
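As a rough illustration of what submitting a job to the queue involves, the sketch below renders a minimal submit description as text. The helper `render_submit_description`, the script name `run_analysis.sh`, the file names and the resource sizes are all hypothetical placeholders; only the submit commands themselves (`executable`, `arguments`, `output`, `error`, `log`, `request_cpus`, `request_memory`, `queue`) are standard HTCondor keywords.

```python
def render_submit_description(settings, count=1):
    """Render key/value settings as HTCondor submit description text."""
    lines = [f"{key} = {value}" for key, value in settings.items()]
    # The trailing "queue" statement tells HTCondor how many copies to enqueue.
    lines.append(f"queue {count}")
    return "\n".join(lines) + "\n"


# Hypothetical job: the script, file names and resource sizes are placeholders.
job = {
    "executable": "run_analysis.sh",   # program HTCondor will run
    "arguments": "--input data.csv",   # command-line arguments for it
    "output": "job.out",               # where the job's stdout is written
    "error": "job.err",                # where the job's stderr is written
    "log": "job.log",                  # HTCondor's own event log for the job
    "request_cpus": 1,                 # resources the job asks for
    "request_memory": "2GB",
}

submit_text = render_submit_description(job)
print(submit_text)
```

Saved to a file such as `job.sub` (again, a placeholder name), text like this would typically be passed to `condor_submit job.sub`, and `condor_q` would then show the job's state in the queue.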

More information available at:

Why do we use HTCondor?

HTCondor helps us address the following needs.

  • Fair sharing of the resources

    • We want everyone to have fair and easy access to the resources available. HTCondor takes various factors into account to calculate a priority for each user.

  • Fewer idle resources

    • We want our resources to be utilized to their full extent, if possible. HTCondor can help us identify idle resources and keep a steady flow of jobs for them to run.

  • Flexibility in the running environment

    • We want to customize the running environment of our jobs to our own specifications, and to do so faster than a request to IT could be handled. HTCondor can run containerized jobs with Docker.

  • Abstraction layer

    • HTCondor lets you be less concerned with the underlying hardware and technology. You state the resources your job needs, and HTCondor finds a machine that can provide them.
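To give a feel for the fair-sharing point above, here is a toy model of a usage-based user priority. It assumes an exponentially smoothed "real" priority with a half-life, a floor on that value, and a per-user multiplier, loosely following the scheme described in the HTCondor manual. The function names, the one-day half-life, the floor of 0.5 and the example users are assumptions made for illustration, not a description of how our pool is configured. In this model a lower number means a better priority.

```python
# Assumed constants for this sketch (not taken from our pool configuration).
HALF_LIFE_SECONDS = 86400.0   # usage history decays by half every day
PRIORITY_FLOOR = 0.5          # best (lowest) real priority a user can have


def update_real_priority(previous, resources_in_use, dt_seconds,
                         half_life_seconds=HALF_LIFE_SECONDS):
    """Move the real priority towards the number of resources in use."""
    # beta is the weight kept from the old value after dt_seconds.
    beta = 0.5 ** (dt_seconds / half_life_seconds)
    smoothed = beta * previous + (1.0 - beta) * resources_in_use
    return max(PRIORITY_FLOOR, smoothed)


def effective_priority(real_priority, priority_factor=1.0):
    """Scale the real priority by a per-user factor (lower is better)."""
    return real_priority * priority_factor


# Two hypothetical users start equal; one then uses 10 slots for a full day.
heavy_user = update_real_priority(PRIORITY_FLOOR, resources_in_use=10,
                                  dt_seconds=HALF_LIFE_SECONDS)
idle_user = update_real_priority(PRIORITY_FLOOR, resources_in_use=0,
                                 dt_seconds=HALF_LIFE_SECONDS)

# The idle user ends up with the lower (better) value, so is served first.
print(effective_priority(heavy_user), effective_priority(idle_user))
```

The point of the half-life is that heavy recent usage pushes a user down the order only temporarily: once their jobs finish, their priority drifts back towards the floor.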