Photon OS

  • 1.  NVidia Triton on Photon OS

    Posted Aug 27, 2022 08:20 PM
      |   view attached

    Hi everyone,

     

    How it started
    I'm trying to get some traction on AI inference use cases on Photon OS.
An idea was to bake a super easy .ova that gives the capability to stream, e.g. from a file or from a live iPhone camera source, and to see the visualized inference video stream at a web URL. This could include vehicle and people detection and tracking.

     

    How it goes
    The deepstream tool from the inference server, Triton, an NVidia product assembled from open-source components, does not start successfully on Photon OS, and I'm not convinced that it is a Photon OS issue.

    Clarification from the community would be very welcome.

     

    Here is the issue description.

     

    Starting deepstream-app -c deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt ends with

    ** INFO: <bus_callback:194>: Pipeline ready
    Error String : Feature not supported on this GPU
    Error Code : 801
    ERROR from nvv4l2decoder0: Failed to process frame.
    Debug info: gstv4l2videodec.c(1747): gst_v4l2_video_dec_handle_frame (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstBin:src_sub_bin0/GstURIDecodeBin:src_elem/GstDecodeBin:decodebin0/nvv4l2decoder:nvv4l2decoder0:
    Maybe be due to not enough memory or failing driver
    ERROR from qtdemux0: Internal data stream error.
    Debug info: qtdemux.c(6605): gst_qtdemux_loop (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstBin:src_sub_bin0/GstURIDecodeBin:src_elem/GstDecodeBin:decodebin0/GstQTDemux:qtdemux0:
    streaming stopped, reason error (-5)
    Quitting
    [NvMultiObjectTracker] De-initialized
    App run failed

     

    Are the errors "Error String : Feature not supported on this GPU / Error Code : 801" and "ERROR from nvv4l2decoder0: Failed to process frame" Photon OS related (hardware + headless)?

     

    Here's the setup description.

     

    Step 1: Hardware / Software

    • Don't blame me, it is an Azure Standard_NV6 virtual machine (6 vCPUs, 56 GB RAM, with 1x NVidia Tesla M60 GPU), because I cannot yet afford a new on-premises vSphere 8 homelab. Photon OS runs best on vSphere, no doubt.
    • 64 GB disk space. The docker container consumes a lot of disk space.
    • VMware Photon OS 3 rev 2 as operating system. Photon OS runs without an X window system (headless); the system name is NVidia01.
    • Installed the NVidia Container Toolkit and pulled the deepstream:6.1-triton docker container (see Step 3 and onwards)

     

    Step 2: Base provisioning (one reboot)
    # repo url
    if ! grep -q "packages.vmware.com/photon" /etc/yum.repos.d/photon.repo; then
    cd /etc/yum.repos.d/
    sudo sed -i 's/dl.bintray.com\/vmware/packages.vmware.com\/photon\/$releasever/g' photon.repo photon-updates.repo photon-extras.repo photon-debuginfo.repo
    fi

    # update components with impact to nvidia components
    tdnf update -y docker

    # install kernel api headers and devel
    tdnf install -y build-essential wget tar
    # On vSphere comment the following line
    tdnf install -y linux-devel
    # On vSphere uncomment the following line
    # tdnf install -y linux-esx-devel


    reboot

    wget https://us.download.nvidia.com/tesla/470.141.03/NVIDIA-Linux-x86_64-470.141.03.run
    chmod a+x ./NVIDIA-Linux-x86_64-470.141.03.run
    ./NVIDIA-Linux-x86_64-470.141.03.run --kernel-source-path=/usr/lib/modules/`uname -r`/build --ui=none --no-questions --accept-license

    # Check GPU driver
    root@NVidia01 [ / ]# nvidia-smi
    Wed Aug 24 07:02:55 2022
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 470.141.03    Driver Version: 470.141.03    CUDA Version: 11.4   |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla M60           Off  | 00003130:00:00.0 Off |                  Off |
    | N/A   34C    P0    36W / 150W |      0MiB /  8129MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

     

    Step 3: install the NVidia Container Toolkit to run NVidia docker containers on Photon OS
    tdnf install -y gpg
    cd /etc/pki/rpm-gpg/
    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /etc/pki/rpm-gpg/nvidia-container-toolkit-keyring.gpg

    cat << EOF >>/etc/yum.repos.d/nvidia-container-toolkit.repo
    [libnvidia-container]
    name=libnvidia-container
    baseurl=https://nvidia.github.io/libnvidia-container/centos7/x86_64
    gpgcheck=0
    enabled=1
    EOF

    tdnf makecache
    tdnf install nvidia-container-toolkit

    systemctl restart docker

    rm /etc/yum.repos.d/nvidia-container-toolkit.repo

     

    Step 4: Open some ports
    iptables -A INPUT -i eth0 -p udp --dport 5400 -j ACCEPT
    iptables -A OUTPUT -p udp --dport 5400 -j ACCEPT
    iptables -A INPUT -i eth0 -p tcp --dport 8000 -j ACCEPT
    iptables -A OUTPUT -p tcp --dport 8000 -j ACCEPT
    iptables -A INPUT -i eth0 -p tcp --dport 8001 -j ACCEPT
    iptables -A OUTPUT -p tcp --dport 8001 -j ACCEPT
    iptables -A INPUT -i eth0 -p tcp --dport 8002 -j ACCEPT
    iptables -A OUTPUT -p tcp --dport 8002 -j ACCEPT
    iptables-save >/etc/systemd/scripts/ip4save
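The repeated rules above can also be generated in a small loop. The sketch below only prints the commands so they can be reviewed before being applied as root:

```shell
# Print (rather than apply) the same iptables rules as above, one pair per port.
emit_rules() {
  # TCP ports for Triton's HTTP/gRPC/metrics endpoints
  for p in 8000 8001 8002; do
    echo "iptables -A INPUT -i eth0 -p tcp --dport $p -j ACCEPT"
    echo "iptables -A OUTPUT -p tcp --dport $p -j ACCEPT"
  done
  # UDP port for the video stream
  echo "iptables -A INPUT -i eth0 -p udp --dport 5400 -j ACCEPT"
  echo "iptables -A OUTPUT -p udp --dport 5400 -j ACCEPT"
}
emit_rules
# To apply for real: emit_rules | sh, then iptables-save >/etc/systemd/scripts/ip4save
```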

     

    Step 5: Pull docker container and start it
    docker run --gpus all -it --rm --net=host nvcr.io/nvidia/deepstream:6.1-triton
    # Note: with --net=host the container shares the host's network stack, so -p port mappings are not needed; in the original command they were placed after the image name anyway, where docker would have passed them to the container as arguments.

     

    Step 6: Inside docker container: Download configuration files and models
    git clone https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps.git
    cd /opt/nvidia/deepstream/deepstream-6.1/deepstream_reference_apps/deepstream_app_tao_configs
    cp -a * /opt/nvidia/deepstream/deepstream-6.1/samples/configs/tao_pretrained_models/
    apt-get install -y wget zip
    cd /opt/nvidia/deepstream/deepstream-6.1/samples/configs/tao_pretrained_models/
    ./download_models.sh

     

    Step 7: Configure file deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt
    In deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt the group [sink0] has been deactivated (enable=0), and [sink2] has been activated (enable=1). Additionally, in [sink2], codec=2 (2=h265) has been set, because [source0] uses uri=file://../../streams/sample_1080p_h265.mp4.

     

    Step 8: Start the deepstream app
    deepstream-app -c deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt

     

    Attachment(s)

    txt
    Protocol.txt   65 KB 1 version


  • 2.  RE: NVidia Triton on Photon OS
    Best Answer

    Posted Aug 29, 2022 08:46 AM

    With help from the NVidia user forum it works now on Photon OS.

    https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new lists each GPU with its capabilities. In order to use the hardware decoding of the M60 GPU, a stream or a file in h264 format must be available. h265 is not supported.
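That constraint can be sketched as a small shell helper. The codec names are as ffprobe would report them, and the mapping below is an assumption based on the support matrix for the Maxwell-generation M60; verify against the matrix for your GPU:

```shell
# Hypothetical helper: does the Tesla M60's NVDEC hardware-decode this codec?
# Codec names as reported by e.g.:
#   ffprobe -v error -select_streams v:0 -show_entries stream=codec_name -of csv=p=0 file.mp4
m60_can_decode() {
  case "$1" in
    h264|mpeg2video|vc1) echo yes ;;      # listed as supported for the M60
    hevc|vp9|av1)        echo no  ;;      # not supported on this GPU generation
    *)                   echo unknown ;;
  esac
}
m60_can_decode h264   # -> yes
m60_can_decode hevc   # -> no, hence the "Feature not supported on this GPU" error
```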

     

    Hence, in Step 7 the sample file deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt needs modifications.

     

    [source0]
    enable=1
    #Type - 1=CameraV4L2 2=URI 3=MultiURI
    type=3
    num-sources=1
    uri=file://../../streams/sample_1080p_h264.mp4
    gpu-id=0

    [sink2] # renamed from sink0
    enable=0
    #Type - 1=FakeSink 2=EglSink 3=File
    type=2
    sync=1
    source-id=0
    gpu-id=0

    [sink0] # renamed from sink2
    enable=1
    #Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
    type=4
    #1=h264 2=h265
    codec=1
    #encoder type 0=Hardware 1=Software
    enc-type=0
    sync=0
    bitrate=4000000
    #H264 Profile - 0=Baseline 2=Main 4=High
    #H265 Profile - 0=Main 1=Main10
    profile=0
    # set below properties in case of RTSPStreaming
    rtsp-port=8554
    udp-port=5400

    [tests]
    file-loop=1
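With [sink0] set to RTSPStreaming, the result can be viewed from another machine. A small sketch for composing the playback URL; the /ds-test mount point is assumed to be the default used by the DeepStream sample apps, so verify it against your build:

```shell
# Hypothetical helper: compose the RTSP playback URL from host and rtsp-port.
rtsp_url() {
  echo "rtsp://$1:${2:-8554}/ds-test"   # /ds-test: assumed sample-app default mount point
}
rtsp_url 20.208.40.128        # -> rtsp://20.208.40.128:8554/ds-test
# Play it back, e.g.: ffplay "$(rtsp_url 20.208.40.128)"
```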

     

    As soon as objects or people are detected, they receive a tracking ID. This is what the deepstream app does. In addition, one may want to process such events further and, for example, connect them to a monitoring system. For this, deepstream supports several IoT functions. The so-called Kafka protocol adapter allows bidirectional messaging, and recording based on a detected anomaly is supported as well.

     

    Happy inferencing and tanzu'ifying.



  • 3.  RE: NVidia Triton on Photon OS

    Posted Sep 03, 2022 07:54 PM

    Besides Triton, the NVidia tao toolkit might be interesting as well. Here is some provisioning info.

     

    Step 1: packages installation

    Execute the following commands.

    tdnf install -y wget unzip python3-pip

    pip3 install virtualenvwrapper

    export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3

    source /usr/bin/virtualenvwrapper.sh

    mkvirtualenv -p /usr/bin/python3 launcher

    pip3 install jupyterlab

    pip3 install nvidia-tao

     

    Install the NVidia GPU Cloud cli as well.

    # NGC installation with md5 check output

    wget --content-disposition https://ngc.nvidia.com/downloads/ngccli_linux.zip && unzip ngccli_linux.zip && chmod u+x ngc-cli/ngc

    find ngc-cli/ -type f -exec md5sum {} + | LC_ALL=C sort | md5sum -c ngc-cli.md5

    echo "export PATH=\"\$PATH:$(pwd)/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile

     

    Step 2: download the tao samples

    Now we download the tao samples with pretrained ML models.

    wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tao/cv_samples/versions/v1.2.0/zip -O cv_samples_v1.2.0.zip

    unzip -u cv_samples_v1.2.0.zip  -d ./cv_samples_v1.2.0

    cd ./cv_samples_v1.2.0

    mkdir -p ./detectnet_v2/data

     

    For later use, we create two directories as well.

    mkdir -p /workspace/tao-experiments/detectnet_v2

    mkdir -p /workspace/tao-experiments/data/training
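The /workspace/tao-experiments paths matter because the tao launcher maps local directories into its containers via ~/.tao_mounts.json. A minimal sketch; the exact keys should be checked against the tao documentation for your version:

```json
{
    "Mounts": [
        {
            "source": "/workspace/tao-experiments",
            "destination": "/workspace/tao-experiments"
        }
    ]
}
```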

     

    The tao samples contain jupyter notebook files which need data object images and label data. There are two files which can be obtained from

     http://www.cvlibs.net/download.php?file=data_object_image_2.zip


    The zip file size is 11.7GB.

    http://www.cvlibs.net/download.php?file=data_object_label_2.zip
    The zip file size is 5.5MB.

    Simply copy the zip files, e.g. via WinSCP, to the Photon OS VM into the directory ./cv_samples_v1.2.0/detectnet_v2/data .
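After copying, a quick size check catches truncated transfers (expected sizes per the links above: roughly 11.7 GB and 5.5 MB). A minimal sketch:

```shell
# Report whether a file exists and has at least the expected number of bytes.
check_min_size() {   # usage: check_min_size <file> <min-bytes>
  if [ -f "$1" ] && [ "$(stat -c%s "$1")" -ge "$2" ]; then
    echo ok
  else
    echo missing-or-truncated
  fi
}
# e.g. inside ./cv_samples_v1.2.0/detectnet_v2/data :
# check_min_size data_object_image_2.zip  11000000000   # ~11.7 GB
# check_min_size data_object_label_2.zip  5000000       # ~5.5 MB
```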

     

    Step 3: docker login

    docker login nvcr.io


    If the docker daemon is not started, execute systemctl start docker.

    The nvcr.io login is used in a jupyter notebook.

    Step 4: jupyter notebook configuration

    The jupyter notebook will be published on port 8888. Hence we have to open that port.

    iptables -A INPUT -i eth0 -p tcp --dport 8888 -j ACCEPT

    iptables -A OUTPUT -p tcp --dport 8888 -j ACCEPT

    iptables-save >/etc/systemd/scripts/ip4save

     

    Start the jupyter notebook.

    cd ./cv_samples_v1.2.0/

    /root/.virtualenvs/launcher/bin/jupyter notebook --allow-root --ip 0.0.0.0 --no-browser

    On the screen you will get output similar to the one below.

    [W 2022-08-16 14:57:33.899 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.

    [W 2022-08-16 14:57:33.906 LabApp] 'allow_root' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.

    [W 2022-08-16 14:57:33.911 LabApp] 'allow_root' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.

    [I 2022-08-16 14:57:33.925 LabApp] JupyterLab extension loaded from /root/.virtualenvs/launcher/lib/python3.10/site-packages/jupyterlab

    [I 2022-08-16 14:57:33.929 LabApp] JupyterLab application directory is /root/.virtualenvs/launcher/share/jupyter/lab

    [I 14:57:33.937 NotebookApp] Serving notebooks from local directory: /

    [I 14:57:33.939 NotebookApp] Jupyter Notebook 6.4.12 is running at:

    [I 14:57:33.942 NotebookApp] http://ph01:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b

    [I 14:57:33.945 NotebookApp]  or http://127.0.0.1:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b

    [I 14:57:33.948 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

       

        To access the notebook, open this file in a browser:

            file:///root/.local/share/jupyter/runtime/nbserver-873-open.html

        Or copy and paste one of these URLs:

            http://ph01:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b

         or http://127.0.0.1:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b

     

    Open a web browser and insert ip and tokenID, example:

    http://20.208.40.128:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b

     

    Browse to the directory cv_samples_v1.2.0 > detectnet_v2 .


    Open the jupyter notebook file detectnet_v2.ipynb.

    Edit the section with the environment variables for the LOCAL_PROJECT_DIR path:


    Because the download URL for images.zip and labels.zip is missing, the jupyter notebook shows an error. However, since the .zip files were already copied beforehand, the error can be ignored. Unpacking the zip files takes a while.

    You can now step through the notebook section by section and adapt it for your own purposes. Using the pretrained data, here are the class assignments and the number of existing images:

    b'car': 4129
    b'dontcare': 1574
    b'truck': 145
    b'cyclist': 226
    b'misc': 118
    b'pedestrian': 638
    b'tram': 67
    b'van': 377
    b'person_sitting': 23

     

    The dev setup with root privileges is not intended to be used outside a lab environment. So far, all the NVidia Triton and tao material works flawlessly on Photon OS.