
Art 3D/VR challenge – week 2 – looking for performance


Introduction

This article is a continuation of the Art 3D/VR challenge.

Please note that I don’t describe every technology, model and approach you can find with a search engine.


THE FOLLOWING INFORMATION IS BASED ON PERSONAL EXPERIENCE I HAVE GAINED OVER THE LAST 10+ YEARS OF MY CAREER. BY APPLYING THE PRESENTED MODELS AND STACK I CAN GUARANTEE FAIR ENOUGH RESULTS.

Performance utilisation

As we are going to process a lot of data, we have to ensure high availability, reliability and scalability at each level of the hardware and software stack.

These are the most important characteristics we should keep in mind when selecting the right platform:

Basically, each characteristic affects hardware and software.

REQUIREMENTS

In the previous blog post, “Abstract overview”, I listed and compared some of the hardware and software technologies, techniques and standards we could consider in this challenge. In this post I will shorten the list to the most important hardware and software components we should include in the VR device.

Hardware

The device has to be a very efficient bridge, converter and gateway for media. To achieve this goal, we have to make sure the interfaces provide the highest possible throughput. Here is the list of basic interfaces we should consider:

Interfaces | Priority | Description
CSI-2 | HIGH | min. 2 for v1
USB 3.0 | HIGH | min. 2 for v1
HDMI output | HIGH | min. 1 for v1
HDMI input | LOW | optional in v1
AUDIO output | LOW | optional in v1
AUDIO input | HIGH | min. 1 for v1
Ethernet | HIGH | min. 1 for v1
PCIe | MEDIUM | optional in v1; min. 1 for v2
GPIO | HIGH | min. 8 pins for v1

After collecting RAW data in memory, we have to process it and push it out at the highest possible rate. As we are going to collect multiple FullHD RAW streams, then process, encode and push them out, we require extremely high bus/RAM throughput and hardware acceleration at each level of media processing.
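To get a feel for the numbers, here is a rough back-of-the-envelope estimate of the input data rate. It is only a sketch: the two 1080p streams and the 30 fps rate come from later in this post, while the 2 bytes per pixel (a packed 4:2:2 format such as YUY2) is my assumption and depends on the actual camera format.

```python
# Rough data-rate estimate for uncompressed (RAW) video streams.
# Assumption (not measured): 2 bytes per pixel, e.g. a packed 4:2:2 format.

def raw_stream_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Return the data rate of one uncompressed stream in bytes per second."""
    return width * height * bytes_per_pixel * fps

WIDTH, HEIGHT = 1920, 1080   # FullHD
BYTES_PER_PIXEL = 2          # assumed pixel format
FPS = 30                     # target frame rate
STREAMS = 2                  # one stream per eye

per_stream = raw_stream_bytes_per_second(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS)
total = per_stream * STREAMS

print(f"per stream: {per_stream / 1e6:.1f} MB/s")
print(f"{STREAMS} streams: {total / 1e6:.1f} MB/s ({total * 8 / 1e9:.2f} Gbit/s)")
```

This covers capture only. Every processing stage that reads and writes full frames adds memory traffic on top, which is why bus/RAM throughput matters so much here.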

Computing | Priority | Description
CPU multicore | HIGH | min. 4 cores for v1
GPU | HIGH | highly efficient architecture is required
Dedicated Video Chip | HIGH | can be integrated in CPU or GPU chip for v1
DDR4-RAM | MEDIUM | faster is better
eMMC | HIGH | high IO embedded storage

Ideally, the hardware setup should come with a full software stack. Several producers, such as Intel, Nvidia and AMD, provide such solutions.

Software

The software stack is the most important part. We should choose the platform that ships the most advanced, most up-to-date research as software.

Frameworks | Priority | Description
Graphics | HIGH | OpenGL preferred
Vision | HIGH | VisionWorks and OpenCV preferred
Parallel computing | HIGH | CUDA and OpenCL preferred
Multimedia | HIGH | GStreamer, OpenMAX preferred
Deep learning | LOW | optional for v1, preferred cuDNN

Standards | Priority | Description
VAAPI | HIGH | Ubuntu 14.04 or higher
VDPAU | HIGH | Ubuntu 14.04 or higher
GLX | HIGH | Ubuntu 14.04 or higher
TCP/UDP | HIGH | Ubuntu 14.04 or higher

As the operating system I would recommend the latest version of Ubuntu (15.04/Vivid). It comes with up-to-date libraries for all the technologies required by this challenge.

Physical

Smaller is better! The challenge is about delivering a portable, energy-efficient, plug&play device. We should focus at least on size, noise and power consumption.

Component | Priority | Description
Small size motherboard | HIGH | best is ITX or smaller
FAN and radiator | HIGH | best without FAN, small radiator

Support

Ideally, the hardware supplier provides full worldwide support for all segments.

Manufacture

Ideally, the component supplier provides a long-term product life cycle.

Budget

We should focus on low-budget solutions aimed at the consumer market.

 

MARKET OVERVIEW

Please have a look at this very short list of available mini computers. They differ in size, power consumption, number of cores, GPU model, hardware multimedia support, interface performance, storage and price!

I will give a short comment on the pros and cons of each one. Please follow the URL to learn more.

96Boards HiKey Board (LeMaker)

See more: http://www.96boards.org/products/ce/hikey/start/

Pros: Octa-Core 64bit CPU, 2GB DDR3, Bluetooth

Cons: missing GPU, only USB 2.0, no CSI-2, no Ethernet

Verdict: NO, it lacks graphics acceleration and efficient camera ports

 

SNAPDRAGON 805 DEVELOPMENT KIT

See more: http://shop.intrinsyc.com/collections/product-development-kits

Pros:  Adreno™ 420 GPU, Hexagon™ DSP, Krait® 450 CPU quad-core 2.5GHz, 2x MIPI-DSI 4-lane, MIPI-CSI 4-lane

Cons: no VP8 hardware encoding, single USB 3.0, high price

Verdict: YES, this platform provides 90% of the features we need for a 3D/VR device

 

Nvidia Development Kit – Jetson TK1

See more: http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html

Pros: NVIDIA 4-Plus-1™ Quad-Core ARM® Cortex™-A15 CPU, NVIDIA Kepler GPU with 192 CUDA Cores, hardware VP8 encoding, low price

Cons: single 2-lane CSI-2, single USB 3.0

Verdict: YES, this platform provides 75% of the features we need for a 3D/VR device

 

Nvidia Development Kit – Jetson TX1

See more: http://www.nvidia.com/object/jetson-tx1-module.html

Pros: 64-bit ARM® A57 CPUs, 1 TFLOP/s 256-core with NVIDIA Maxwell™ Architecture, 4 GB LPDDR4 | 25.6 GB/s, Up to 6 cameras | 1400 Mpix/s, VP8 hardware encoding

Cons: high price

Verdict: YES, this platform provides more than 100% of the features we need for a 3D/VR device
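To put the camera headroom of this board into rough numbers, the quoted 1400 Mpix/s can be compared with the pixel rate of two 1080p streams at 30 fps. This is a sketch only: it treats the quoted figure as a simple upper bound and ignores lane counts, sensor formats and any other limit of the real capture path.

```python
# Compare the required capture pixel rate with the camera throughput
# quoted for Jetson TX1 in the Pros line above (1400 Mpix/s).

QUOTED_CAMERA_THROUGHPUT = 1400e6   # pixels per second, vendor figure

def pixel_rate(width, height, fps, streams=1):
    """Return the total pixel rate in pixels per second."""
    return width * height * fps * streams

required = pixel_rate(1920, 1080, 30, streams=2)
share = required / QUOTED_CAMERA_THROUGHPUT

print(f"required: {required / 1e6:.1f} Mpix/s "
      f"({share:.1%} of the quoted throughput)")
```

Under these assumptions the stereo capture uses well under a tenth of the quoted camera throughput.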

 

Gigabyte GeForce Mini-ITX with ITX platform

See more: http://www.gigabyte.com/products/product-page.aspx?pid=5252

This is a very flexible and configurable ITX platform. We can install many types of CPU, RAM and GPU chipsets.

Pros: configurable, supports all features

Cons: high price, quite big

Verdict: YES, this platform provides more than 100% of the features we need for 3D/VR

 

Mali-T604 Low-cost Development Board

See more: http://malideveloper.arm.com/news/mali-t604-low-cost-development-board/

Pros: Quad-core Mali-T604 GPU

Cons: Dual-core Cortex-A15 CPU, only OpenGL ES 2.0, no USB 3.0, only CSI v1, no VP8 hardware encoder

Verdict: NO, missing OpenGL 3+ and efficient camera ports
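The verdicts in this overview can be partly cross-checked against the minimum interface counts from the Hardware table. Below is a minimal sketch of such a check. It only uses counts that are stated explicitly in the Pros/Cons lines of this post, so only two boards and two interfaces are covered. The dictionary keys and the function name are my own, for illustration only.

```python
# Minimum interface counts for v1, taken from the Hardware table.
V1_MINIMUMS = {"csi2": 2, "usb3": 2}

# Interface counts as stated in the Pros/Cons lines of this overview.
# Boards whose counts are not stated explicitly are left out.
BOARDS = {
    "96Boards HiKey": {"csi2": 0, "usb3": 0},
    "Jetson TK1": {"csi2": 1, "usb3": 1},
}

def shortfalls(board, minimums=V1_MINIMUMS):
    """Return {interface: missing_count} for every minimum the board misses."""
    return {
        name: needed - board.get(name, 0)
        for name, needed in minimums.items()
        if board.get(name, 0) < needed
    }

for board_name, spec in BOARDS.items():
    print(board_name, shortfalls(spec))
```

An empty result means the board meets every listed minimum. A non-empty one shows what would have to be added, for example through a hub or an expansion board.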

 

QUICK BENCHMARK

My research led me to try the Nvidia Development Kit – Jetson TK1. Compared with most development kits available on the market right now, the Jetson TK1 seems to be a good match for initial benchmarking. It is built with mobile performance in mind, supported by a huge community and under continuous development by Nvidia.

Jetson TK1 is built on the Tegra K1 chipset, which combines an “NVIDIA Kepler GPU with 192 CUDA Cores” with an “NVIDIA 4-Plus-1™ Quad-Core ARM® Cortex™-A15 CPU”. Together they give a lot of flexibility and performance in media processing, with decoding and encoding up to 2160p.

Nvidia’s Tegra K1 (codenamed “Logan”) features ARM Cortex-A15 cores in a 4+1 configuration similar to Tegra 4, or Nvidia’s 64-bit Project Denver dual-core processor as well as a Kepler graphics processing unit with support for Direct3D 12, OpenGL ES 3.1, CUDA 6.5 and OpenGL 4.4/OpenGL 4.5. Nvidia claims that it outperforms both the Xbox 360 and the PS3, whilst consuming significantly less power.

What does it mean for us?

In short, the goal is to capture 2 RAW 1080p video streams, stitch them, process them with video filters, add OpenGL features, and output the result as VP8/Opus media over the network and as digital media over HDMI.

The development kit comes with Ubuntu for Tegra R21.4 pre-installed, including the core of the GStreamer multimedia framework in version 1.2.4. Nvidia also delivers gst-omx plugins, which give access to the Tegra hardware encoding and decoding features and make it easy to work with the CSI-2/USB (camera) and HDMI interface endpoints.
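To illustrate the capture, stitch, encode and stream chain, here is a minimal sketch that builds a gst-launch-1.0 style pipeline description as a string. Several things in it are assumptions and not the final setup: two V4L2 cameras at /dev/video0 and /dev/video1, side-by-side placement with videomixer as a stand-in for real stitching, the software vp8enc encoder, RTP over UDP, and a placeholder receiver address. On Tegra, the hardware encoder from the gst-omx plugins would take the place of vp8enc. Audio (Opus) and HDMI output are left out.

```python
# Builds a gst-launch-1.0 style pipeline description as a plain string.
# All device paths, the host address and the port are placeholders.

def stereo_vp8_pipeline(left="/dev/video0", right="/dev/video1",
                        host="192.0.2.10", port=5004,
                        width=1920, height=1080, fps=30):
    """Return a pipeline that mixes two cameras side by side and streams VP8."""
    caps = f"video/x-raw,width={width},height={height},framerate={fps}/1"
    return " ".join([
        # The mixer places the second input to the right of the first one.
        f"videomixer name=mix sink_0::xpos=0 sink_1::xpos={width}",
        "! videoconvert",
        "! vp8enc deadline=1",   # software encoder, realtime deadline
        "! rtpvp8pay",
        f"! udpsink host={host} port={port}",
        # Camera branches; each one requests the next free mixer pad.
        f"v4l2src device={left} ! {caps} ! videoconvert ! mix.",
        f"v4l2src device={right} ! {caps} ! videoconvert ! mix.",
    ])

print(stereo_vp8_pipeline())
```

The printed string can be passed to gst-launch-1.0 for a quick experiment. Whether it sustains 30 fps depends on the cameras and the encoder, which is exactly what the benchmarking is meant to show.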

In my opinion, Nvidia’s solution is exactly what is needed for a portable 3D/VR device. I decided to take the next step and settle on my…

MY FINAL CHOICE

I have tested and benchmarked the Jetson TK1 intensively. It looks very promising on every level. The hardware is efficient enough to handle RAW media via its interfaces and to process, encode and stream video/audio in real time at up to 30 fps.

Taking all aspects and facts into account, I am going to bet on the next generation of Jetson, called TX1.

NVIDIA Jetson TX1 with GPU-accelerated parallel processing is the world’s leading embedded visual computing platform. It features high-performance, low-energy computing for deep learning and computer vision making the Jetson platform ideal for compute-intensive embedded projects like drones, autonomous robotic systems, Advanced Driver Assistance Systems (ADAS), mobile medical imaging, and Intelligent Video Analytics (IVA). OEMs, independent developers, makers and hobbyists can use the NVIDIA Jetson TX1 to explore the future of embedded computing.

Here is a brief look at the facts about the Jetson TX1 platform, which fits all the requirements of the VR device challenge:

SUPER COMPUTING PLATFORM…

AND VERY TINY

AND WITH INCREDIBLE SOFTWARE STACK

Jetson TX1 looks like a perfect match for a stereoscopic device. It will be released to the market in the middle of March 2016. It brings higher efficiency, the next generation of ARM CPU, the next generation of Nvidia GPU, the next generation of DDR RAM, the next generation of CSI, the next generation of USB, the next generation of Ubuntu for Tegra, the next generation of the software stack and many other improvements!

 

SUMMARY

I am very happy with the current state of the project. Taking all of my research into account, I am fairly confident I can deliver an advanced 3D/VR device in the next 12-16 weeks.

I am about to order a Jetson TX1, which will be released in Europe in March 2016. Hopefully it can be delivered to London (where I relocate in 3 days) on time!

For now, I will focus on more detailed benchmarking of the Jetson TK1.

Next step

During the next week I will look into detailed benchmarking and performance characteristics of the Jetson platforms…

Topic: Art 3D/VR challenge – week 3 – FullHD processing on portable GPU vs CPU

Contribution

Feel free to contact me if you are interested in meeting the team and contributing to this project in any programming language (Go, PHP, Ruby, JS, Node.js, Objective-C, Java…). This project is parked on GitHub.

See my contact page if required.

Resources

Multiple parts of this blog post have their source in external articles, blog posts and wikis to which I hold no rights. I am not able to link all external sources. I would like to thank everyone who shares knowledge publicly. If you think I have unlawfully used any of your thoughts, products or patents, please let me know and I will fix the issue as soon as possible.

© COPYRIGHT KRZYSZTOF STASIAK 2016. ALL RIGHTS RESERVED