Ganymede2

Ganymede2 is a condo-style HPC system offering a variety of hardware tailored to our researchers’ needs. Condos are built and offered to PIs based on their budget, hardware requirements, and compute use cases. Ganymede2 provides advanced queueing support and supports both GPU and CPU-only nodes.

Compute System Specs

Because of the diversity of condos in Ganymede2, there isn’t a uniform definition of a compute node. Moreover, this is a growing cluster: whenever a buy-in happens, we recommend installing the latest hardware available at that time. With this progression, the cluster now comprises 90 CPU nodes and 22 GPU nodes with 106 GPUs.

Even though Ganymede2 assets are primarily owned by private researchers, the system offers what are called “preempt” queues, which accept job submissions from all Ganymede2 users. These preempt queues (named cpu-preempt and gpu-preempt) are heavily de-prioritized relative to the queue owner’s jobs, so any workload submitted to them should be treated as volatile and should make heavy use of checkpointing. When a preempt job is preempted, it is killed immediately and forcefully; unless data is being saved to an output file continuously, data loss should be expected.
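A checkpoint-friendly preempt job might look like the following sketch. It assumes a Slurm-style scheduler (the cpu-preempt queue name comes from this page; the job name, file names, and step count are illustrative). The key idea is that progress is written to disk after every unit of work, so a forced kill loses at most one step:

```shell
#!/bin/bash
# Hypothetical job script for the cpu-preempt queue (assumes Slurm;
# only the partition name comes from the docs, the rest is illustrative).
#SBATCH --job-name=preempt-demo
#SBATCH --partition=cpu-preempt
#SBATCH --time=1-00:00:00
#SBATCH --output=preempt-demo.%j.out

# Resume from the last checkpoint if one exists, otherwise start fresh.
STEP=0
if [ -f checkpoint.txt ]; then
    STEP=$(cat checkpoint.txt)
fi

while [ "$STEP" -lt 100 ]; do
    # ... one unit of real work would go here ...
    STEP=$((STEP + 1))
    # Persist progress after every step so a preemption loses little work.
    echo "$STEP" > checkpoint.txt
done
```

If the job is preempted mid-run, resubmitting the same script picks up from the last recorded step instead of starting over.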

Ganymede2, unlike its predecessor, allows multiple jobs per node. Nodes can be in a “mixed” state, which indicates that the node is currently processing multiple jobs at once. On GPU nodes with multiple GPUs, individual GPUs can be assigned to different jobs, and in some cases each GPU can run multiple jobs at the same time.
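Per-GPU scheduling on a shared node can be requested as in this sketch, assuming Slurm with GRES-based GPU allocation (the gpu-preempt queue name comes from this page; the CPU and memory values are illustrative). Requesting a single GPU leaves the node’s remaining GPUs free for other jobs, which is what produces the “mixed” node state described above:

```shell
#!/bin/bash
# Hypothetical script requesting a single GPU on a shared node (assumes
# Slurm with GRES GPU scheduling; resource values are illustrative).
#SBATCH --partition=gpu-preempt
#SBATCH --gres=gpu:1          # one GPU; the node's other GPUs remain schedulable
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G

# Under Slurm, the job sees only its assigned GPU via CUDA_VISIBLE_DEVICES.
GPUS_MSG="Visible GPUs: ${CUDA_VISIBLE_DEVICES:-none assigned}"
echo "$GPUS_MSG"
```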

The following partitions are available to all users:

| Queue Name | Number of Nodes | Cores/Threads (CPU Architecture) | Memory | Time Limit ([d-]hh:mm:ss) | GPUs? | Use Case |
| --- | --- | --- | --- | --- | --- | --- |
| dev | 2 | 64/128 (Ice Lake) | 256GB | 2:00:00 | No | Code debugging, job submission testing |
| normal | 4 | 64/128 (Ice Lake) | 256GB | 2-00:00:00 | No | Normal code runs, CPU only |
| cpu-preempt | 8 | VARIOUS | VARIOUS | 7-00:00:00 | No | Volatile CPU job submission |
| gpu-preempt | 6 | VARIOUS | VARIOUS | 7-00:00:00 | Yes, various types | Volatile GPU job submission |
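The dev queue’s stated use case is job submission testing, so a short smoke-test run there is a reasonable first step before targeting the normal queue. A minimal sketch, assuming Slurm (the dev partition name and its 2:00:00 limit come from the table above; everything else is illustrative):

```shell
#!/bin/bash
# Hypothetical smoke-test script for the dev queue (assumes Slurm; the
# partition name and time limit come from the table, the rest is illustrative).
#SBATCH --partition=dev
#SBATCH --time=0:10:00        # well under the 2:00:00 dev limit
#SBATCH --nodes=1

# Trivial payload: confirm the job environment works before submitting
# real runs to the normal queue.
hostname
RESULT="dev-smoke-test-ok"
echo "$RESULT"
```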