How are supercomputer clusters networked together?
Posted: Fri Jun 06, 2014 11:43 pm
This is just a thought experiment.
Say I wanted to build a GPGPU cluster: basically a sub-$10,000 GPU-based supercomputer distributed across 2-5 PCs, each PC having 4x SLI-linked video cards chosen solely for their CUDA-cores-per-dollar price point.
(I want as many GPU cores as possible, because it'll theoretically be used for faster-than-real-time processing of 3D point clouds from RGBD cameras. My back-of-the-envelope calculations suggest I could probably get 24,000 cores* for $6,000.)
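For what it's worth, here's roughly how I'd sanity-check that cores-per-dollar figure in Python. The card names, core counts and prices below are made-up placeholders for illustration, not real quotes:

# Rough cores-per-dollar sanity check.
# All card names, core counts and prices are hypothetical placeholders.
cards = {
    "card_A": {"cuda_cores": 2880, "price_usd": 700},
    "card_B": {"cuda_cores": 2304, "price_usd": 500},
    "card_C": {"cuda_cores": 1536, "price_usd": 250},
}
budget = 6000  # USD

for name, c in cards.items():
    cores_per_dollar = c["cuda_cores"] / c["price_usd"]
    n_cards = budget // c["price_usd"]
    total_cores = n_cards * c["cuda_cores"]
    print(f"{name}: {cores_per_dollar:.2f} cores/$ -> "
          f"{n_cards} cards, {total_cores} cores for ${n_cards * c['price_usd']}")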
OK, the question is: What would be the preferred way to connect the PCs together?
I'd initially think of something ordinary like 10-gigabit Ethernet, but I don't really need that sort of robustness, do I? The PCs are literally side by side and communicate only with each other, so I could probably reduce overheads by using something much simpler than TCP/IP.
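(To make the "simpler than TCP/IP" idea concrete: on Linux you can in principle push raw Ethernet frames between two directly connected boxes and skip the IP stack entirely. The sketch below is just an illustration of that idea; the interface name and peer MAC address are placeholders I've invented, it needs root privileges, and whether it's actually worth it over plain TCP is exactly what I'm asking.)

import socket

# Minimal raw-Ethernet send on Linux, bypassing TCP/IP.
# "eth0" and the peer MAC are placeholders; requires root (CAP_NET_RAW).
ETH_TYPE = 0x88B5  # IEEE "local experimental" ethertype
IFACE = "eth0"

s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_TYPE))
s.bind((IFACE, 0))

src_mac = s.getsockname()[4]             # this interface's MAC address
dst_mac = bytes.fromhex("aabbccddeeff")  # placeholder peer MAC

payload = b"point-cloud chunk goes here"
frame = dst_mac + src_mac + ETH_TYPE.to_bytes(2, "big") + payload
s.send(frame)
# The receiving PC would open the same kind of socket on the same
# ethertype and call s.recv() to pull frames straight off the wire.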
And why communicate serially? The parallel lines of the PCIe bus are right there staring me in the face; is there an existing (and maybe common/cheap) networking tech that can more directly make use of those lovely, lovely parallel lines?
Or maybe you could use the SATA bus? That seems fast.
Hmm, or even better: couldn't you just get the cards to output their data over the video cables themselves? Aren't video cables basically the highest-bandwidth cables around? Oh, but I probably won't have an easy way of RECEIVING that data from a video cable; scratch that idea (right?).
So yeah, networking supercomputer clusters. How's it done?
*Yes, this video was indeed the initial seed for this post.