GPU accelerated VDI Cubblies


{{info title="Contact info"}}
For any inquiries regarding VDI Cubblies please contact helpdesk@helsinki.fi.
{{/info}}

----

General remarks:

* Users' home directories are NFS mounts. The quota for each NFS home directory is 20 GB.
* There is a three-hour idle timer: the timer starts when the remote session is disconnected, and if you do not reconnect within the next three hours, you are forcibly logged off and the machine is returned to the pool of available machines.
* There is also a ten-hour total session timer: the remote session is automatically disconnected once ten hours have passed since the last connection. Let us know if these limits are a problem for you.
* The TSclient link on the desktop shows the local files that are shared with the virtual desktop. You can select the folders to share in the Horizon Client settings.
\\

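The two timers in the list above amount to simple arithmetic on timestamps. The following is a minimal sketch in Python, not the actual Horizon configuration: the function names and the example timestamp are hypothetical, and it assumes that an automatic disconnection starts the idle timer in the same way as a manual one. Only the three-hour and ten-hour limits come from this page.

```python
from datetime import datetime, timedelta

# Limits as stated on this page; everything else here is illustrative.
IDLE_LIMIT = timedelta(hours=3)      # counted from the moment of disconnection
SESSION_LIMIT = timedelta(hours=10)  # counted from the last connection


def forced_disconnect_time(last_connected):
    """Time at which the remote session is disconnected automatically."""
    return last_connected + SESSION_LIMIT


def forced_logoff_time(disconnected):
    """Time at which a disconnected user is logged off unless they reconnect."""
    return disconnected + IDLE_LIMIT


if __name__ == "__main__":
    connected = datetime(2018, 6, 10, 8, 0)         # hypothetical connection time
    disconnect = forced_disconnect_time(connected)  # ten hours after connecting
    logoff = forced_logoff_time(disconnect)         # three more hours without reconnecting
    print("session disconnected at", disconnect)
    print("user logged off at", logoff)
```

Under these assumptions, a session that is never reconnected is released back to the pool at most thirteen hours after the last connection.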
Check the specs with nvidia-smi. VMwareBlastServer is the remote connection agent; the screen output is encoded and sent as an H.264 stream to the client endpoint.

{{code language="bash" title="nvidia-smi"}}
psmaatta@vdi-cubbli-g02:~$ nvidia-smi
Sun Jun 10 10:35:36 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.57                 Driver Version: 390.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID P100-4Q        On   | 00000000:02:01.0  On |                  N/A |
| N/A   N/A    P0    N/A /  N/A |    780MiB /  4096MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1163      G   /usr/lib/xorg/Xorg                           136MiB |
|    0     11339    C+G   ...ent/VMwareBlastServer/VMwareBlastServer   345MiB |
+-----------------------------------------------------------------------------+
{{/code}}
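For scripted checks, nvidia-smi can also print selected fields as CSV instead of the table. The following is a minimal sketch, assuming nvidia-smi is on the PATH; the helper names parse_gpu_csv and query_gpus are hypothetical, and the sample line is built from the values in the table above so that the parser can be tried without a GPU.

```python
import subprocess

# Fields requested from nvidia-smi; all three are standard --query-gpu properties.
QUERY = "name,memory.used,memory.total"


def parse_gpu_csv(text):
    """Parse 'name, used, total' CSV lines (values in MiB, no header, no units)."""
    gpus = []
    for line in text.strip().splitlines():
        name, used, total = [field.strip() for field in line.split(",")]
        gpus.append({"name": name, "used_mib": int(used), "total_mib": int(total)})
    return gpus


def query_gpus():
    """Run nvidia-smi and return one dict per GPU (requires the NVIDIA driver)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=" + QUERY, "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_gpu_csv(out)


if __name__ == "__main__":
    # Sample line mirroring the table above; call query_gpus() on a real machine.
    sample = "GRID P100-4Q, 780, 4096\n"
    for gpu in parse_gpu_csv(sample):
        free = gpu["total_mib"] - gpu["used_mib"]
        print("%s: %d MiB free of %d MiB" % (gpu["name"], free, gpu["total_mib"]))
```

Keeping the parsing separate from the subprocess call makes it possible to test the logic on a machine without the NVIDIA driver.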

\\

These machines should run TensorFlow on the GPU without any known issues. Here is a sample session. CUDA 9.1 is currently supported; CUDA 9.2 is not yet supported.

{{code language="py" title="Test run"}}
Python 2.7.12 (default, Dec 4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> x1 = tf.constant([1,2,3,4])
>>> x2 = tf.constant([5,6,7,8])
>>> result = tf.multiply(x1, x2)
>>> sess = tf.Session()
2018-06-09 23:54:57.726930: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-09 23:54:57.854369: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-06-09 23:54:57.855128: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GRID P100-4Q major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:02:01.0
totalMemory: 4.00GiB freeMemory: 2.71GiB
2018-06-09 23:54:57.855153: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-09 23:55:00.512760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-09 23:55:00.512820: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-06-09 23:55:00.512832: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-06-09 23:55:00.513004: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2422 MB memory) -> physical GPU (device: 0, name: GRID P100-4Q, pci bus id: 0000:02:01.0, compute capability: 6.0)
>>> print(sess.run(result))
[ 5 12 21 32]
>>> quit()
{{/code}}
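The line to look for in the log above is the one starting with "Created TensorFlow device", which confirms that TensorFlow placed its device on the GPU. The following is a minimal sketch that extracts the key values from such a line using only the standard library. The regular expression and the function name find_gpu_device are assumptions fitted to the log format shown here and may need adjusting for other TensorFlow versions.

```python
import re

# Pattern fitted to the "Created TensorFlow device" line in the log above.
DEVICE_RE = re.compile(
    r"Created TensorFlow device \((?P<device>\S+) with (?P<mb>\d+) MB memory\)"
    r" -> physical GPU \(device: \d+, name: (?P<name>[^,]+),"
    r" pci bus id: (?P<bus>[^,]+), compute capability: (?P<cc>[\d.]+)\)"
)


def find_gpu_device(log_text):
    """Return details of the first GPU device TensorFlow created, or None."""
    match = DEVICE_RE.search(log_text)
    if match is None:
        return None
    info = match.groupdict()
    info["mb"] = int(info["mb"])  # memory available to TensorFlow, in MB
    return info


if __name__ == "__main__":
    # Log line copied from the test run above.
    line = (
        "2018-06-09 23:55:00.513004: I tensorflow/core/common_runtime/gpu/"
        "gpu_device.cc:1053] Created TensorFlow device "
        "(/job:localhost/replica:0/task:0/device:GPU:0 with 2422 MB memory) -> "
        "physical GPU (device: 0, name: GRID P100-4Q, pci bus id: 0000:02:01.0, "
        "compute capability: 6.0)"
    )
    print(find_gpu_device(line))
```

If the function returns None for a captured log, TensorFlow most likely fell back to the CPU.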

Playing with the CUDA samples: install them into your home directory, build one, and run it.

{{code title="Play with CUDA samples"}}
luser@vdi-cubbli-g01:~$ /usr/local/cuda-9.1/bin/cuda-install-samples-9.1.sh ~/CUDA-samples
luser@vdi-cubbli-g01:~$ cd CUDA-samples/NVIDIA_CUDA-9.1_Samples/2_Graphics/Mandelbrot/
luser@vdi-cubbli-g01:~/CUDA-samples/NVIDIA_CUDA-9.1_Samples/2_Graphics/Mandelbrot$ make
/usr/local/cuda-9.1/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o Mandelbrot.o -c Mandelbrot.cpp
..clipclip..
luser@vdi-cubbli-g01:~/CUDA-samples/NVIDIA_CUDA-9.1_Samples/2_Graphics/Mandelbrot$ ../../bin/x86_64/linux/release/Mandelbrot
{{/code}}

[[image:attach:Mandelbrot Set.png]]
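
The make output above compiles the sample for several GPU architectures at once. Of those targets, the one matching the GRID P100 (compute capability 6.0, as reported in the TensorFlow log) is sm_60. The following small sketch shows how a compute capability maps to the corresponding -gencode flag; the function name gencode_flag is hypothetical and is not part of the CUDA toolkit.

```python
def gencode_flag(major, minor):
    """Build the nvcc -gencode flag for a given compute capability."""
    arch = "%d%d" % (major, minor)
    return "-gencode arch=compute_%s,code=sm_%s" % (arch, arch)


if __name__ == "__main__":
    # Compute capability 6.0 is what the GRID P100-4Q reports on these machines.
    print(gencode_flag(6, 0))
```

Building only for the architecture actually present would shorten compile times, at the cost of a binary that does not run on other GPU generations.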