We are pleased to announce a major expansion of Juno, UTD’s flagship HPC cluster, which brings significant benefits to our user community.
Increased Resource Limits for Your Jobs
Thanks to this expansion, we are raising the limit from 4 nodes/job to 8 nodes/job for applications that can scale their performance efficiently. This means you can run larger, more demanding computations and potentially reduce your time-to-solution.
The higher resource limits are available by request. To take advantage of the increased capacity, please use the dev partition to benchmark your code and demonstrate efficient scaling from serial (single-core) execution through 1-, 2-, 4-, 6-, and 8-node parallel runs. The dev partition now allows all users to use 8 nodes/job for testing. Once we review your scaling results, we will enable your account to submit 8-node jobs in the normal queue for production runs.
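For reference, a scaling benchmark on the dev partition might look like the sketch below. The executable name `./my_app`, the input file, and the srun options are placeholders for illustration; treat this as a minimal template rather than a prescribed form.

```bash
#!/bin/bash
#SBATCH --job-name=scaling-test
#SBATCH --partition=dev          # dev allows all users up to 8 nodes/job for benchmarking
#SBATCH --nodes=8                # request the maximum needed for the largest test point
#SBATCH --ntasks-per-node=64     # one MPI rank per core (64 cores/node)
#SBATCH --time=02:00:00          # dev partition time limit is 2 hours

# Hypothetical application and input; replace with your own code and data.
APP=./my_app
INPUT=input.dat

# Serial baseline (single core), then 1, 2, 4, 6, and 8 full nodes.
srun --nodes=1 --ntasks=1 "$APP" "$INPUT" > serial.log
for N in 1 2 4 6 8; do
    srun --nodes=$N --ntasks=$((N * 64)) "$APP" "$INPUT" > nodes_${N}.log
done
```

Comparing the serial runtime against the 1- through 8-node runtimes gives the speedup and parallel efficiency figures we will look for when reviewing your request.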
Expanded System Capabilities
These upgrades represent a 25% expansion of Juno’s computing capacity. The cluster now features 104 nodes, 6,600 cores, 41 TB of memory, and 15 GPUs. The 25 new nodes add compute, login, and VDI resources, plus 3 additional H100 GPUs for machine learning and general computing workloads. The detailed Juno configuration is available on our website (link goes here).
More Scratch Space for Jobs
The Scratch space on Juno has also been upgraded, expanding its capacity from 440 TB to 680 TB. We have removed Scratch space quotas to allow you to run larger jobs. Please use Scratch as temporary space for your batch job files. Remember that Scratch space is never backed up, and we routinely purge files older than 45 days in accordance with our Scratch space management policy.
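As an illustration, a batch job might stage its working files through Scratch as in the sketch below; the Scratch path (/scratch/$USER), the project paths, and the executable are assumptions, so substitute the actual locations used on Juno and in your project.

```bash
#!/bin/bash
#SBATCH --job-name=scratch-example
#SBATCH --partition=normal
#SBATCH --nodes=1
#SBATCH --time=1-00:00:00

# Assumed Scratch location; adjust to the path configured on Juno.
SCRATCH_DIR=/scratch/$USER/$SLURM_JOB_ID
mkdir -p "$SCRATCH_DIR"

# Stage inputs into Scratch and run the job there.
cp "$HOME/project/input.dat" "$SCRATCH_DIR/"
cd "$SCRATCH_DIR"
"$HOME/project/my_app" input.dat > output.log   # hypothetical executable

# Scratch is never backed up and files older than 45 days are purged,
# so copy anything you need to keep back to your home or project space.
cp output.log "$HOME/project/results/"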
The new SLURM configuration is shown below.
| Partition name | Time limit | Nodes | Max nodes/job | Cores/node | Memory/node | GPUs/node | VRAM/GPU | Best used for |
|---|---|---|---|---|---|---|---|---|
| dev | 2 hours | 8* | 8 | 64 | 384 GB | – | – | Code development, short jobs, benchmarking |
| normal | 2 days | 90* | 4 (default) or 8# (by request, for high-performance codes) | 64 | 384 GB | – | – | Long jobs, big jobs, production (main) compute workloads |
| h100 | 2 days | 1 | 1 | 64 | 512 GB | 4 H100 physical | 80 GB | Large, long jobs requiring high GPU resources |
| | | 1 | 1 | 64 | 512 GB | 1 H100 physical | 94 GB | |
| h100-2.47gb | 2 days | 1 | 1 | 64 | 512 GB | 4 half-H100 virtual | 47 GB | |
| a30 | 2 days | 2 | 2 | 128 | 1,024 GB | 2 A30 physical | 24 GB | Large, long jobs requiring medium GPU resources |
| a30-2.12gb | 2 days | 1 | 1 | 128 | 1,024 GB | 4 half-A30 virtual | 12 GB | Large, long jobs requiring small GPU resources |
| a30-4.6gb | 2 days | 1 | 1 | 128 | 1,024 GB | 8 quarter-A30 virtual | 6 GB | |
| vdi | 8 hours | 2 | 1 | 64 | 384 GB | – | – | Used by Open OnDemand workloads |

\* The dev partition shares nodes with the normal partition.
\# 8 nodes/job are permitted for users whose code demonstrates efficient parallel performance scaling.
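For example, a job on one of the GPU partitions could be requested along the lines of the sketch below; the exact --gres specification and any GPU type names depend on Juno's SLURM configuration, so check the detailed configuration page before submitting.

```bash
#!/bin/bash
#SBATCH --job-name=gpu-job
#SBATCH --partition=h100          # or a30, h100-2.47gb, a30-2.12gb, a30-4.6gb
#SBATCH --nodes=1
#SBATCH --gres=gpu:1              # GPUs per node; the gres name may differ on Juno
#SBATCH --time=2-00:00:00         # GPU partitions have a 2-day time limit

nvidia-smi                        # confirm which GPU(s) the job received
./my_gpu_app                      # hypothetical GPU-enabled executable
```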