High Performance Cluster Checklist
Building a cluster, especially one for high performance, requires several elements to work in tandem to deliver that performance.
Several elements are necessary for this.
- Compute
- Hyper-Converged Infrastructure
- Storage (Non-NVMe)
- GPU
- NVMe Storage
- Software
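The checklist above can be sketched as a small coverage check: given a planned set of items, report which elements are still missing. This is only a minimal illustration; the names `REQUIRED_ELEMENTS`, `ClusterItem`, and `missing_elements` are hypothetical and not part of any product mentioned here.

```python
from dataclasses import dataclass

# The six elements named in the checklist, in the same order.
REQUIRED_ELEMENTS = (
    "Compute",
    "Hyper-Converged Infrastructure",
    "Storage (Non-NVMe)",
    "GPU",
    "NVMe Storage",
    "Software",
)


@dataclass(frozen=True)
class ClusterItem:
    """One planned item (hypothetical structure for illustration)."""
    element: str  # which checklist element this item covers
    name: str     # free-form description of the chosen product


def missing_elements(plan):
    """Return checklist elements not covered by the plan, in checklist order."""
    covered = {item.element for item in plan}
    return [e for e in REQUIRED_ELEMENTS if e not in covered]


# Example plan that only covers three of the six elements.
plan = [
    ClusterItem("Compute", "1U Supermicro Ultra"),
    ClusterItem("GPU", "4U 8 x H100 GPU Server"),
    ClusterItem("Software", "LMX Cloud"),
]

print(missing_elements(plan))
# → ['Hyper-Converged Infrastructure', 'Storage (Non-NVMe)', 'NVMe Storage']
```

An empty result means every element on the checklist has at least one item assigned to it.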
Compute – 1U Supermicro Ultra
For the compute element of the GPU cloud, we recommend Supermicro's Ultra SuperServers. We have extensive experience building clusters with this server as the compute node. It has 32 DIMM slots, which can hold up to 8TB of DRAM. For the CPU element, it supports dual Socket P+ (LGA-4189) 3rd Gen Intel® Xeon® Scalable processors.
Hyper-Converged Infrastructure – 2U 4 Node Supermicro Twin Pro
For the hyper-converged infrastructure, this server supports dual Socket P+ (LGA-4189) 3rd Gen Intel® Xeon® Scalable processors and 16 DIMMs, which can hold up to 4TB of RAM. With dual-port Intel® X710 10GBase-T LAN onboard, the networking element is well equipped to handle heavy workloads.
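As a quick sanity check on the memory figures quoted for the compute and hyper-converged servers, the implied module size per DIMM slot can be worked out. This sketch assumes every slot is populated with identical modules and that 1TB means 1024GB; the function name `gb_per_dimm` is hypothetical.

```python
def gb_per_dimm(max_memory_tb, dimm_slots):
    """Module size in GB needed per slot to reach the quoted maximum.

    Assumes all slots hold identical modules and 1 TB = 1024 GB.
    """
    return max_memory_tb * 1024 / dimm_slots


# Figures quoted in the text: 32 slots / 8TB and 16 DIMMs / 4TB.
print(gb_per_dimm(8, 32))  # → 256.0
print(gb_per_dimm(4, 16))  # → 256.0
```

Under these assumptions, both quoted maximums correspond to the same 256GB module size per slot.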
Storage – 4U 90 Bay Storage Server
For storage, there are typically many deployment scenarios. This server supports up to 90 bays with the following disk configuration: 90x 3.5″/2.5″ hot-swap SAS3/SATA3 drives, 2x fixed slim SATA SSDs, and 2x NVMe M.2 (form factors: 2280 and 22110).
GPU – 4U 8 x H100 GPU Server
This GPU system uses NVIDIA® NVLink™ with NVSwitch™ for GPU-to-GPU interconnect. The server supports 8 GPUs to handle the GPU workloads throughout the cloud.
NVMe – 2U 24 x NVMe Server
For the NVMe element, this server model supports 24x 2.5″ hot-swap NVMe/SATA/SAS drive bays (22x 2.5″ NVMe hybrid), and we’ve found it to be a strong server for these types of configurations.
You must then look at the software. Our LMX Cloud software is a comprehensive cloud HPC cluster management stack that supports a broad range of workloads and software environments, giving organizations an agile and scalable IT infrastructure. Its features include:
- Complete HPC user environment
- Control infrastructure via cloud APIs
- Comprehensive monitoring and alerting
- OpenLDAP authentication
- Support for virtual machines and bare metal
- Containerized application stack support via Singularity
- Web UI portal with support for file transfers, workload management, VNC, RStudio, and Jupyter
- On-demand Kubernetes provisioning and scaling