Juno Capacity Expands by 25% – More Resources Available to Users

We are pleased to announce a major expansion of Juno, UTD’s flagship HPC cluster, that brings significant benefits to our user community.

Increased Resource Limits for Your Jobs

Thanks to this expansion, we are raising the limit from the current 4 nodes/job to 8 nodes/job for applications that can scale efficiently across nodes. This means you can run larger, more demanding computations and potentially reduce your time-to-solution.

The higher limit is available by request. To take advantage of the increased capacity, use the dev partition to benchmark your code and demonstrate efficient scaling from serial (single-core) execution through 1-, 2-, 4-, 6-, and 8-node parallel runs; the dev partition now allows all users to run 8 nodes/job for testing. Once we have reviewed your scaling results, we will enable your account to submit 8-node jobs to the normal partition for production runs.
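
A simple table of speedup and parallel efficiency is usually enough to summarize your benchmark results. Below is a minimal Python sketch of that calculation; the timing values are placeholders (not real measurements), and the 64 cores/node figure comes from the partition table further down, so substitute your own numbers.

```python
# Minimal sketch: summarize a scaling study as speedup and parallel
# efficiency. The timings below are placeholders; replace them with
# wall-clock times measured on the dev partition.

serial_time = 3600.0  # seconds for the serial (single-core) baseline

# Wall-clock time (seconds) at each node count requested for review.
node_times = {1: 70.0, 2: 38.0, 4: 21.0, 6: 15.0, 8: 12.0}

CORES_PER_NODE = 64  # matches the dev/normal partition configuration

print(f"{'nodes':>5} {'cores':>6} {'speedup':>8} {'efficiency':>11}")
for nodes in sorted(node_times):
    cores = nodes * CORES_PER_NODE
    speedup = serial_time / node_times[nodes]  # relative to the serial run
    efficiency = speedup / cores               # 1.0 is ideal scaling
    print(f"{nodes:>5} {cores:>6} {speedup:>8.1f} {efficiency:>11.2f}")
```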

Expanded System Capabilities

These upgrades represent a 25% expansion of Juno’s computing capacity. The cluster now features 104 nodes, 6,600 cores, 41TB of memory, and 15 GPUs. The 25 new nodes add compute, login, and VDI resources, plus 3 additional H100 GPUs for machine learning and general computing workloads. The detailed Juno configuration is available on our website (link goes here).

More Scratch Space for Jobs

The Scratch space on Juno has also been upgraded, expanding its capacity from 440TB to 680TB. We have removed Scratch space quotas to allow you to run larger jobs. Please use Scratch as temporary space for your batch job files. Remember that Scratch space is never backed up, and we routinely purge files older than 45 days in accordance with our Scratch space management policy.
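
Because files older than 45 days are purged, it can be useful to check which of your Scratch files are approaching that age. Here is a minimal Python sketch; the /scratch/<username> path is an assumption (check the Scratch space management policy for the actual location), and the purge may key off access time rather than modification time.

```python
# Minimal sketch: list Scratch files older than 45 days, i.e. files
# that are candidates for the routine purge. The path below is an
# assumption; substitute the actual Scratch location for your account.
import os
import time
from pathlib import Path

scratch = Path("/scratch") / os.environ.get("USER", "")
cutoff = time.time() - 45 * 24 * 3600  # 45 days ago, in seconds

for path in scratch.rglob("*"):
    try:
        if path.is_file() and path.stat().st_mtime < cutoff:
            print(path)
    except OSError:
        pass  # file removed between listing and stat(); skip it
```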

The new SLURM partition configuration is shown below.

Partition   | Time limit | Nodes | Max nodes/job                  | Cores/node | Memory/node | GPUs/node               | VRAM/GPU | Best used for
dev         | 2 hours    | 8*    | 8                              | 64         | 384 GB      |                         |          | Code development, short jobs, benchmarking
normal      | 2 days     | 90*   | 4 (default) or 8# (by request) | 64         | 384 GB      |                         |          | Long jobs, big jobs, production (main) compute workloads
h100        | 2 days     | 1     | 1                              | 64         | 512 GB      | 4 H100 (physical)       | 80 GB    | Large, long jobs requiring high GPU resources
            |            | 1     | 1                              | 64         | 512 GB      | 1 H100 (physical)       | 94 GB    |
h100-2.47gb | 2 days     | 1     | 1                              | 64         | 512 GB      | 4 half-H100 (virtual)   | 47 GB    |
a30         | 2 days     | 2     | 2                              | 128        | 1,024 GB    | 2 A30 (physical)        | 24 GB    | Large, long jobs requiring medium GPU resources
a30-2.12gb  | 2 days     | 1     | 1                              | 128        | 1,024 GB    | 4 half-A30 (virtual)    | 12 GB    | Large, long jobs requiring small GPU resources
a30-4.6gb   | 2 days     | 1     | 1                              | 128        | 1,024 GB    | 8 quarter-A30 (virtual) | 6 GB     |
vdi         | 8 hours    | 2     | 1                              | 64         | 384 GB      |                         |          | Used by Open OnDemand workloads

* The dev partition shares nodes with the normal partition.
# 8 nodes/job are permitted to users whose code demonstrates efficient parallel performance scaling.
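
As an illustration of how the GPU partitions above are used, here is a minimal Python sketch that submits a single-GPU job to the h100-2.47gb partition with sbatch. The exact --gres specification can be site-specific (check our documentation), and run_model.sh is a placeholder for your own batch script.

```python
# Minimal sketch: submit a single-GPU job to one of the GPU partitions
# listed above. The exact --gres specification may be site-specific;
# "run_model.sh" is a placeholder for your own batch script.
import subprocess

partition = "h100-2.47gb"  # half-H100 virtual GPUs, 47 GB VRAM each

subprocess.run(
    [
        "sbatch",
        f"--partition={partition}",
        "--nodes=1",
        "--gres=gpu:1",       # request one GPU on the node (assumed gres name)
        "--time=2-00:00:00",  # partition time limit is 2 days
        "run_model.sh",       # placeholder batch script
    ],
    check=True,
)
```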