Nvidia Tesla GPUs K40/K80/M40/P40/P100/V100 at home/desktop hacking, cooling, powering, cable solutions Tutorial AIO Solutions

Do you have access to some old Tesla GPUs and want to try them at home in your desktop or an old server?  Some people have wanted to try these units for gaming, but keep in mind they have no video output ports; they were built for compute workloads such as deep learning.

The easiest way by far is to choose an AI service that has everything ready to go, perhaps with a bunch of Docker or Kubernetes containers.  Cloud providers like Google, Amazon and many others offer this, but the costs can be extremely high, in some cases several times higher than using your own on-premise hardware, renting your own servers, or colocating your own servers.

If you want to learn, test, or build your own development environment or test neural network setup, running the cards yourself may be the way to go, provided you are experienced in building computers, modding, or working on servers.

Why use GPU instead of CPU?

In general, because even high-end CPUs cannot deliver the same performance per dollar on highly parallel workloads such as neural network training.

Check this comparison benchmark Ryzen CPU vs Nvidia GPU.

 

Issues with Tesla GPUs outside of their native habitat

Motherboard

Many older consumer-grade motherboards, workstations and even servers may not work, because the board must be able to map 64-bit PCI BARs above 4GB.
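On Linux, one way to see how the BARs were mapped is to read the "Memory at ..." lines that lspci -vv prints for the card. The sketch below only parses that text; the sample output is made up for illustration (addresses and sizes are not from a real machine), and the function names are hypothetical. It assumes lspci marks a BAR the firmware could not place as <unassigned> or <ignored>.

```python
import re

# BAR lines as printed by `lspci -v` / `lspci -vv`, for example:
#   Region 1: Memory at 3b000000000 (64-bit, prefetchable) [size=16G]
BAR_LINE = re.compile(
    r"Memory at (?P<addr>\S+) \((?P<width>32|64)-bit, "
    r"(?P<kind>non-prefetchable|prefetchable)\)(?P<rest>.*)"
)
SIZE = re.compile(r"\[size=(\d+)([KMGT]?)\]")
UNIT = {"": 1, "K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_bars(lspci_text):
    """Return one dict per memory BAR line found in lspci output."""
    bars = []
    for line in lspci_text.splitlines():
        m = BAR_LINE.search(line)
        if not m:
            continue
        size = SIZE.search(m.group("rest"))
        bars.append({
            # A hex address means the BAR was mapped; anything else
            # (e.g. "<unassigned>") means it was not.
            "assigned": re.fullmatch(r"[0-9a-f]+", m.group("addr")) is not None,
            "width": int(m.group("width")),
            "size": int(size.group(1)) * UNIT[size.group(2)] if size else None,
        })
    return bars


def unassigned_64bit(bars):
    """64-bit BARs that have no address, a hint that large BARs did not fit."""
    return [b for b in bars if b["width"] == 64 and not b["assigned"]]


# Made-up sample for illustration only.
SAMPLE = """\
04:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
\tRegion 0: Memory at f8000000 (32-bit, non-prefetchable) [size=16M]
\tRegion 1: Memory at <unassigned> (64-bit, prefetchable) [size=16G]
\tRegion 3: Memory at 3b400000000 (64-bit, prefetchable) [size=32M]
"""

bars = parse_bars(SAMPLE)
print(len(bars), "BARs found,", len(unassigned_64bit(bars)), "unassigned 64-bit")
```

To use it on a real system, feed the function the text from sudo lspci -vv for the GPU's slot. An unassigned 64-bit BAR is only a hint that the board cannot map large BARs; the kernel log is the better place to confirm.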

Cooling

"Native Habitat" generally means they were sold in custom-built servers from vendors like HP, Dell and Supermicro, where the active cooling in the chassis pushes air past the heatsinks on these models.  In plain English, these GPUs don't have their own cooling fans.

Some people have started selling custom fan adapters that hold something like a 40mm or 80mm fan at the back of the card and blow air into the heatsink (this basically replicates what a server would be doing).

A cruder way some have tried is to remove the cover of the GPU to expose the heatsink and then place 2 fans directly alongside it, which has also been shown to work.

Power Supply

You need a strong enough power supply, as most of these cards draw up to 250W.  Some Tesla cards are powered through an EPS-12V 8-pin connector (the CPU power type) instead of PCI-E power connectors.
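A rough way to sanity-check a power supply is to add up the loads and compare them with the PSU rating. The sketch below is only arithmetic: the 80% sustained-load margin and the 300W figure for the rest of the system are assumptions for illustration, not numbers from any vendor, and the function name is hypothetical.

```python
def psu_headroom_watts(psu_watts, gpu_watts, other_watts, max_load_percent=80):
    """Watts left over after the GPUs and the rest of the system.

    Keeps the total draw under `max_load_percent` of the PSU rating
    (an assumed safety margin). Integer arithmetic avoids float rounding.
    """
    usable = psu_watts * max_load_percent // 100
    return usable - sum(gpu_watts) - other_watts


# Assumed example: 750W PSU, 300W for CPU/board/drives.
print(psu_headroom_watts(750, [250], 300))       # one 250W card  -> 50
print(psu_headroom_watts(750, [250, 250], 300))  # two 250W cards -> -200
```

A negative result means the build is over budget under these assumptions. Check the card's official spec for its real board power before relying on the 250W figure.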

If you are trying to use these cards in another server, make sure the riser or power output can supply enough power.  Also be very careful not to short out the card or server: many risers have non-standard power pinouts, and many of the adapter cables you may find could be wired incorrectly for your combination.

If in doubt it may be easier to get a model that uses standard PCI-e power connections.

 

List of GPUs by their power connector type:

Click on model name link for full Nvidia official spec PDF.

Nvidia Tesla GPU Name   Power Connector     Memory (GB)   Base Clock   Memory Clock   Benchmark   Notes
K20                     PCI-E 8PIN + 6PIN   5             -            -              -           -
K20X                    PCI-E 8PIN + 6PIN   6             -            -              -           -
K40                     PCI-E 8PIN + 6PIN   12            -            -              -           -
K80                     EPS-12V 8PIN        24            -            -              -           This is really 2 x K40, so you have 2 x 12GB GPUs and not a single 24GB GPU to use.
M40                     EPS-12V 8PIN        12 or 24      -            -              10194       -
P40                     EPS-12V 8PIN        24            -            -              16864       -
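The K80 note matters in practice: the driver lists the card as two separate devices, so a model has to fit in one 12GB half. A minimal sketch of checking this by parsing the output of nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader,nounits follows. The sample text and its memory figures are illustrative, not captured from a real card, and the function names are hypothetical.

```python
import csv
import io


def parse_gpu_list(csv_text):
    """Parse `index, name, memory.total` rows (MiB, no header, no units)."""
    gpus = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        index, name, mem_mib = row
        gpus.append({"index": int(index), "name": name, "memory_mib": int(mem_mib)})
    return gpus


def largest_single_gpu_mib(gpus):
    """Memory of the biggest single device; this is what one model can use."""
    return max(g["memory_mib"] for g in gpus)


# Illustrative sample: one K80 card appearing as two devices.
SAMPLE = "0, Tesla K80, 11441\n1, Tesla K80, 11441\n"

gpus = parse_gpu_list(SAMPLE)
print(len(gpus), "devices, largest single device:",
      largest_single_gpu_mib(gpus), "MiB")
```

Software that can split work across several GPUs can still use both halves, but each half is addressed as its own device.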
             
             
             

Which Systems/Motherboards/Servers are Supported?

Remember again it's not only a power and physical space issue but your motherboard must support > 4GB BAR which MANY do not.

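Nvidia list (written for Resizable BAR on GeForce RTX 30 cards, but a useful indicator of platforms that handle large BARs): https://www.nvidia.com/en-us/geforce/news/geforce-rtx-30-series-resizable-bar-support/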

Desktop CPU and Chipset Support

 
AMD Chipsets

  • AMD 400 Series (on motherboards with AMD Zen 3 Ryzen 5xxx CPU support)
  • AMD 500 Series

AMD CPUs

  • AMD Zen 3 CPUs: Ryzen 3 5xxx, Ryzen 5 5xxx, Ryzen 7 5xxx, Ryzen 9 5xxx

Intel Chipsets

  • Intel 10th Gen: Z490, H470, B460, H410
  • Intel 11th Gen S: all 11th Gen chipsets available as of March 30th, 2021

Intel CPUs

  • Intel 10th Gen: i9-10xxx, i7-10xxx, i5-10xxx, i3-10xxx CPUs
  • Intel 11th Gen S-Series: i9-11xxx, i7-11xxx, i5-11xxx CPUs

Motherboard Support

NVIDIA is working with motherboard manufacturers around the world to bring Resizable BAR support to compatible products. As of March 30th, 2021, the following manufacturers are offering SBIOS updates for select motherboards to enable Resizable BAR with GeForce RTX 30 Series desktop graphics cards:

 
Motherboard manufacturers supporting Resizable BAR: ASUS, ASRock, COLORFUL, EVGA, GIGABYTE, MSI

 

This is a compilation of comments taken from the internet.  I have not personally tested all of these combinations, so use at your own risk.  Normally when they say supported, it means you'll still have to handle the cooling on your own.  Also keep in mind that many workstations and servers were sold with several power supply options.  One person may have said ABC System works but had the upgraded higher-wattage power supply instead of the standard one, so always double check these items.

Remember to also make sure you have the appropriate power connections/adapter cables.

According to some threads/forums:

HP Z620, Z440, Z640, Z820, Z840

Dell R720 Server

Supermicro CSE-118 / 2027GR-TRFT / 1027GR-TSF chassis with a motherboard like the Supermicro X10SRG-F or Supermicro X9DRG

Some say X99 chipset boards, including some MSI boards, often work with the Teslas.

Another indicator: if your BIOS has an option like "Above 4G Decoding", you should be OK once it is enabled.

https://www.supermicro.com/en/support/resources/gpu

Dell R720 Caveats:

1.) The 8-pin power port on the riser appears to be EPS-12V but it is NOT.  It is keyed like EPS-12V, but the pinout is more like a PCI-E 8-pin.  People have fried their hardware by not understanding this.

When looking for cables be careful: there are cables that plug into the riser and give you a PCI-E 8-pin output, but this will fry your hardware, since most of the Tesla cards use EPS-12V.  You would only need that sort of cable if you were plugging in a more normal GPU that uses 8-pin PCI-E power.

  • Make sure that whatever cable you use has the yellow wires on top on the end that connects to the Tesla GPU.
  • Make sure that the end connecting to the riser has the 3 yellow wires on the bottom.
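To see why mixing the two 8-pin standards is destructive, it helps to compare them pin by pin. The sketch below uses the commonly published pin numbering for the standard EPS-12V and PCI-E 8-pin connectors; treat both tables as assumptions to verify against your own hardware with a multimeter. It does not model the Dell riser itself, and the names in it are hypothetical.

```python
# Commonly published pin assignments (assumption: verify before wiring).
# Both connectors share the same 2x4 housing, so pin N sits in the same
# physical position in each.
EPS12V_8PIN = {1: "GND", 2: "GND", 3: "GND", 4: "GND",
               5: "+12V", 6: "+12V", 7: "+12V", 8: "+12V"}
PCIE_8PIN = {1: "+12V", 2: "+12V", 3: "+12V", 4: "SENSE",
             5: "GND", 6: "SENSE", 7: "GND", 8: "GND"}


def conflicts(source, sink):
    """Pins where one side carries +12V and the other side expects ground."""
    return [pin for pin in sorted(source)
            if {source[pin], sink[pin]} == {"+12V", "GND"}]


# A PCI-E style source wired straight into an EPS-12V card input:
print(conflicts(PCIE_8PIN, EPS12V_8PIN))    # -> [1, 2, 3, 5, 7, 8]
# Matching standards have no conflicts:
print(conflicts(EPS12V_8PIN, EPS12V_8PIN))  # -> []
```

Six of the eight pins end up with 12V facing ground, which is a dead short. This is why the two connector types must never be joined by a straight-through cable, even when the plug physically fits.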

 Confirm if Dell PowerEdge R720 Power port mixes pin layout ...

2.) Dell R720/730 Cable Recommendations:

For the Teslas with EPS-12V, perhaps the safest method is to combine the Dell riser-to-GPU cable with the Tesla 2 x PCI-E to EPS-12V adapter (you need both, not one or the other, or you will fry your system).  This does not require any hack-job wiring but does require two separate cables.

Option 1. Buy these 2 cables


Cable 1: Dell part number 09H6FV/N08NH.  On its own this is ONLY good for normal PCI-E powered GPUs (e.g. RTX/GTX).  It connects to the riser in the server and gives you 2 x PCI-E power connectors.

Cable 2: the EPS-12V to 2 x PCI-E 8-pin adapter.  It plugs into the EPS-12V socket on the Tesla card and mates with the 2 PCI-E power connectors from the riser cable.

Do not just buy the single Dell 09H6FV cable without the adapter, as you will fry your server!

Be aware again that anything sold as a "Dell riser to GPU cable" is usually going to give you PCI-E power, which is NOT what you want for the Teslas and will likely fry something!

Option 2. Possibly dangerous Hack Job using Corsair Type 4 CPU cable

Be careful with this one, especially that you have the correct orientation and that you snip the correct wire.  Long-term results are unknown, and it is unclear whether it is safe to run with that 1 wire snipped off.

 This is the part you want for the Tesla GPUs that use EPS-12V for the Dell 720/730 Riser Series.

If you want a hack job, some YouTube comments claim you can take a Corsair PSU Type 4 CPU cable and cut the bottom-right positive (12V) wire on the end that connects to the riser.  The reason is that on the Dell 720 riser the bottom-right pin is ground, so if you get this wrong you will short out the motherboard by sending 12V to ground.

This does seem to be all backed up by the following diagram of the Corsair Type 4 CPU cable.

PSU Pinout Voltage - Corsair Type 4


 

Which GPU should you choose based on performance?

Your choice will depend on your use case, workload (e.g. how much VRAM do the models you run require?), whether it is only for testing, budget, scale, and the cost and availability of power at your datacenter/business center.
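For the VRAM question, a rough first estimate is the parameter count multiplied by the bytes per parameter, plus some overhead. The sketch below covers loading a model for inference only; the 20% overhead is an assumed rule of thumb, not a measured figure, and the function names are hypothetical. Training needs considerably more memory.

```python
GIB = 2**30


def weights_bytes(n_params, bytes_per_param):
    """Memory taken by the weights alone (e.g. 2 bytes each for fp16)."""
    return n_params * bytes_per_param


def fits_in_vram(n_params, bytes_per_param, vram_gib, overhead_percent=20):
    """True if weights plus an assumed overhead fit on one GPU."""
    needed = weights_bytes(n_params, bytes_per_param) * (100 + overhead_percent) // 100
    return needed <= vram_gib * GIB


# Assumed example: a 7-billion-parameter model stored in fp16.
print(fits_in_vram(7_000_000_000, 2, 12))  # 12GB card (K40, half a K80) -> False
print(fits_in_vram(7_000_000_000, 2, 24))  # 24GB card (P40, M40 24GB)   -> True
```

The same arithmetic shows why the K80's 24GB does not help here: each of its two GPUs only offers 12GB to a single model.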

Tesla GPU Benchmark Comparison for Deep Learning

 

https://www.microway.com/hpc-tech-tips/deep-learning-benchmarks-nvidia-tesla-p100-16gb-pcie-tesla-k80-tesla-m40-gpus/

 

 

Ready Made Solutions

These are ready-built enterprise servers that accommodate either the PCI-e or the SXM versions of Nvidia Tesla boards.

These servers normally support either 4 PCI-e cards or 8 SXM boards.

  • Gigabyte G190-G30 Server
  • Dell PowerEdge C4140
  • Dell PowerEdge C4130
  • Nvidia DGX
  • Supermicro SYS-4028GR-TVRT
  • Supermicro SYS-1029GQ-TVRT
  • GIGABYTE G481S80
  • IBM S822LC 8335

 

References

https://forums.bit-tech.net/index.php?threads/mobos-that-work-with-tesla-k40m.368723/

https://cloud.google.com/compute/gpus-pricing

https://h30434.www3.hp.com/t5/Business-PCs-Workstations-and-Point-of-Sale-Systems/Nvidia-Tesla-k40-pci-slot-which-one/td-p/7796334

https://h30434.www3.hp.com/t5/Business-PCs-Workstations-and-Point-of-Sale-Systems/nvidia-TESLA-K40-not-working-in-Z820/td-p/7749456

https://h30434.www3.hp.com/t5/Desktop-Hardware-and-Upgrade-Questions/Z820-PSU-alert-with-NVidia-Tesla-K80/td-p/8501937

https://blog.thomasjungblut.com/random/running-tesla-k80/

https://www.reddit.com/r/homelab/comments/kn07w8/tesla_k80_in_dell_r730_which_power_cable/

https://kenmoini.com/post/2021/03/fun-with-servers-and-gpus/

https://www.reddit.com/r/homelab/comments/tpymyf/help_with_installing_m40_in_r720/

https://www.reddit.com/r/homelab/comments/z4pwza/tesla_k80_in_an_hp_z620_question_about_card/

https://www.reddit.com/r/pcmasterrace/comments/m6evvp/gaming_on_a_tesla_m40_gtx_titan_x_performance_for/

https://www.reddit.com/r/homelab/comments/pl2pga/tesla_m40_on_poweredge_r720/

https://www.dell.com/community/PowerEdge-Hardware-General/Dell-R720-6-pin-pcie-power/td-p/4218851

https://www.youtube.com/watch?v=MFCQOMCHOzM (discussion about K80 in Dell R720)

https://www.youtube.com/watch?v=qC7UdfQPMVI (discussion about M40 in Dell R720)

https://support.hpe.com/hpesc/public/docDisplay?docId=a00114890en_us&docLocale=en_US&page=NVIDIA_Tesla_K40_and_K80_GPUs.html

https://h30434.www3.hp.com/t5/Business-PCs-Workstations-and-Point-of-Sale-Systems/HP-Z620-dual-Xeon-Install-new-Graphic-card-Nvidia-power/td-p/6021214

https://h30434.www3.hp.com/t5/Desktops-Archive-Read-Only/HP-Z620-gt-2-questions-regarding-6pin-gt-8pin/td-p/5727922

https://electronics.stackexchange.com/questions/590781/confirm-if-dell-poweredge-r720-power-port-mixes-pin-layout-wiring-of-pcie-and-ke

https://www.reddit.com/r/homelab/comments/fbwi2r/pro_tip_adding_a_gpu_to_dell_poweredge_servers/ (wrong info: the Dell 720 riser is not EPS-12V; the connector is keyed like EPS-12V but the pinout is like an 8-pin PCI-e).

https://www.reddit.com/r/homelab/comments/bn3ube/power_cable_for_gpu_in_r720xd/

https://www.reddit.com/r/homelab/comments/w0kbo3/r720xd_with_tesla_m40_what_power_cable/

https://www.reddit.com/r/homelab/comments/zumlxz/nvidia_k80_in_dell_r720/

