SMV Player for HMDs

Description


This site presents a prototype of an HMD-based visualization system for Super MultiView (SMV) content. The system simulates the visualization of an autostereoscopic display in a 3D scene in which the user is immersed. On this site you can find:

- a detailed description of the implemented software

- a complete user manual and demonstration of the system usage

- a link to download the software and its source code.

In the field of 3D content visualization, several technological options coexist, with different degrees of commercial success and without a clear consensus on which technology is the most appropriate: stereoscopic displays with active/passive glasses, autostereoscopic displays, holographic displays and Head-Mounted Devices (HMDs). Currently, HMDs are experiencing considerable advances, due to a reduction of costs that enables their entry into the consumer market.

HMDs provide a complete sense of immersion for a single user, as well as free-view navigation, enabled by tracking the user’s head position and orientation. So far, HMDs have been mainly used for panoramic (360°) video visualization and immersive gaming. Nevertheless, MPEG has already started exploring the application of HMDs to Super MultiView (SMV), Free Navigation (FN) and 6-degree-of-freedom (6DoF) video.

Compared to other types of visualization systems, the advantages of HMDs for displaying SMV video include the absence of crosstalk between the left and right views (or other views of the set), the possibility of displaying high-resolution views and a better sense of immersion. Conversely, the disadvantages compared to SMV/lightfield displays include the single-user limitation and the need to wear a headset.

This contribution presents a prototype of an HMD-based visualization system for SMV content. The system simulates the visualization of an autostereoscopic display in a 3D scene in which the user is immersed. It has been developed using the Unity 3D framework and the Oculus Rift DK2; however, the system could easily be adapted to other HMDs. It takes advantage of the new VR features that some web browsers provide, such as the Mozilla WebVR Plus plugin for Firefox Nightly, which allows easy integration of VR content with an HMD. The developed system therefore has the advantage of being multi-platform.

The system is highly configurable, allowing different viewing configurations (e.g. stereo baseline, view density, distance to the screen...). It has been tested with MPEG SMV sequences to find the most comfortable parametrization for each of them. This first parametrization will need to be corroborated with more extensive QoE subjective tests. 

Citation


J. Cubelos, P. Carballeira, J. Gutiérrez, N. García, “QoE analysis of Dense Multiview Video with Head-Mounted Devices”, IEEE Trans. Multimedia, vol. 22, no. 1, pp. 69-81, Jan. 2020.  

System Functionality


The implemented system simulates the visualization of an autostereoscopic-like display in a 3D scene in which the user is immersed. For the simulation of the autostereoscopic display, a 3D scene is built in Unity 3D with two co-located screens (in different layers) that are textured with the left- and right-eye views corresponding to the user’s head position. The head position is provided by the HMD tracking system. Figure 1 shows the 3D scene with both screens (Left_eye and Right_eye) that correspond to each of the user’s eyes (CameraL and CameraR).


Figure 1 - 3D scene of the system

The system takes a set of multiview sequences as input and, depending on the user’s head position, varies the displayed viewpoint to match that position. Figure 2 depicts how the system reacts to the user’s movement, showing the central stereo pair for the central user position and the left-most stereo pair for the left-most user position (a 40 cm head shift in the example).
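The core head-to-view mapping can be sketched as follows. This is a minimal illustration in Python, not taken from the released source: the function name, the sign convention (negative = left) and the parameter values are assumptions.

```python
def view_index_for_head(head_x_m, view_switching_distance_m, n_views):
    """Map the horizontal head offset (meters; 0 = central position,
    negative = left) to the index of the displayed left-eye view,
    clamped to the limits of the camera rig."""
    center_view = n_views // 2
    # Every 'view_switching_distance' of lateral movement switches one view.
    offset = round(head_x_m / view_switching_distance_m)
    return max(0, min(n_views - 1, center_view + offset))
```

For instance, with a hypothetical 1 cm switching distance, a 40 cm head shift moves the viewpoint 40 views away from the central one, and any further movement is clamped at the ends of the rig.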


Figure 2 - (a) Central stereo pair displayed for a user located at a central position.
(b) Left-most stereo pair displayed for a user located at 40 cm to the left of the central position.

System Implementation


 

 
Figure 3 - Basics of the SMV player for HMD
 
 
Figure 4 - Display division for the Oculus Rift DK2
 
Figure 3 shows a general scheme of the system. In short, the user’s horizontal head position is obtained from the Oculus Rift positional tracking and, based on it, the pair of views corresponding to that head position is displayed in the HMD. Each half of the horizontal resolution of the HMD display is used to show one view of the stereo pair (cf. Figure 4).
 
 
 
Figure 5 - Example of views played in the background for a central user position
(interocular distance = 3 camera gaps, number of views played in the background = 40)

A key requirement for the system is fluidity, i.e., minimizing the delay between the user’s movement and the corresponding view switch. In addition, for a comfortable 3D experience it is essential that both views of the stereo pair are synchronized. To fulfill these requirements, the system constantly plays in the background (not textured on the screens), synchronously with the global time frame, a subset of views around the user’s current position (cf. Figure 5). Thus, if the user moves to a view within this subset, the new stereo pair is ready to be displayed on the screens without noticeable delay. This subset of views is updated whenever a new head position is detected.

The policy for defining this subset of views adapts to the current user position: the same number of views is set in each direction when the user is located around the central position, and this ratio varies when the user is close to one of the limits, playing more views in the direction opposite to that limit.
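A minimal Python sketch of this adaptive policy (hypothetical names, ignoring for simplicity the extra views covered by the interocular distance; the actual implementation may differ):

```python
def background_views(current_view, n_background, n_views):
    """Return the indices of the views to keep playing in the background.
    Near the center the subset is symmetric around the current view; near
    a rig limit the views that would fall outside are reassigned to the
    opposite side."""
    half = n_background // 2
    lo = current_view - half
    hi = current_view + half
    # Shift the window back inside the rig instead of truncating it,
    # so the same number of views stays ready near the limits.
    if lo < 0:
        hi += -lo
        lo = 0
    if hi > n_views - 1:
        lo -= hi - (n_views - 1)
        hi = n_views - 1
        lo = max(lo, 0)
    return list(range(lo, hi + 1))
```

Near a rig limit the window is shifted rather than truncated, so the same number of views stays warm on the side the user can still move towards.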


Figure 6 - Blocks diagram of the system's main loop

Figure 6 depicts a block diagram of the system’s main loop: in each frame period, the user’s horizontal head position is obtained, the set of background views is updated accordingly, and the foreground view pair is textured onto the screens.

System Parametrization


This section describes the parameters that are provided to configure the system and comfortable ranges of values for a set of typical MPEG SMV sequences. The configurable parameters of the system are the following:

- Interocular distance (stereo baseline measured in number of cameras): distance between left and right views (measured in number of camera gaps, cf. Figure 5).

- Distance to screen: measured relative to the screen height (vertical resolution of the sequence). This parameter is adjustable at execution time.

- Rotation (Screen simulation mode): the user can turn off the rotation tracking of the Oculus Rift so that the screens “follow” the head orientation, or keep it on so that the screens remain fixed in the 3D scene (screen simulation mode).

- Number of background views: number of views that are played simultaneously in the background, and prepared to be displayed on the screen. The value of this parameter does not include the number of views covered by the interocular distance that will be also played in the background (cf. Figure 5).

- View switching distance: horizontal distance between two consecutive views, i.e. the distance the head needs to move to switch to the contiguous view.

The number of background views and view switching distance are related, so it’s important to find a compromise between both of them that does not affect the system’s performance. The maximum head-movement distance between consecutive frames (to present a synchronized stereo pair) is given by:

Max. distance to the left/right = (n_views_background × view_switching_distance) / 2

Thus, the number of background views and the view switching distance need to be chosen so as to allow a comfortable head-movement speed. If these parameters are not correctly set and the user moves too fast, the user might reach a position whose corresponding stereo pair is not yet prepared to be textured, introducing a delay in the view switching and affecting the user experience.
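As a worked example of this constraint (illustrative numbers, not a tested configuration): with 40 background views, a 1 cm switching distance and 25 fps video, a synchronized pair remains available for up to 0.20 m of head movement per frame period, i.e. a head speed of 5 m/s.

```python
def max_head_speed(n_views_background, view_switching_distance_m, fps):
    """Maximum horizontal head speed (m/s) for which a synchronized
    stereo pair is always ready, following the formula above."""
    max_distance_per_frame = (n_views_background * view_switching_distance_m) / 2
    return max_distance_per_frame * fps

# Example: 40 background views, 1 cm switching distance, 25 fps
# -> 0.20 m per frame period, i.e. 5.0 m/s.
```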

- Different prediction modes for the “background” views (Initial play mode):

- Manual modes:

- Normal mode: the same number of views is set in both directions (taking the user’s current position as center).

- On my way / On the other way modes: the number of views to prepare is decided depending on the user’s movement direction.

- Adaptive Mode: as described in the ‘System Implementation’ section, it sets the same number of views in both directions (like the normal mode) but adapts this ratio at the limits of the camera rig, setting more views in the opposite direction. The adaptive mode is the most useful in almost all situations.

An example of the user interface is shown in Figure 7.


Figure 7 - User interface for parametrization of the system.

Table 1 shows adequate value ranges for the above parameters and MPEG SMV sequences. Tests have been performed using compressed sequences at SD resolutions (the specific resolutions can be found in Table 1). Future work will also include tests at HD resolutions. The system has been tested on a general-purpose PC with the following characteristics:

- Intel® Core™ i7-4790 CPU @ 3.60 GHz
- RAM 16GB
- NVIDIA GeForce GTX 970 (4095 MB)


Table 1 - Adequate parameter ranges for the MPEG SMV sequences

Conclusions and Future Work


This contribution presents a prototype system for the display of multiview video content in HMDs such as Oculus Rift, simulating the behavior of an autostereoscopic display. A virtual screen in the 3D scene displays a stereo pair from a viewpoint that varies depending on the user’s head position.

The system can be easily distributed as it is based on a web application, and it could be adapted to be hosted on a server that allows remote use. It provides the advantage of being multi-platform and could easily be adapted to other HMDs.

Its configurable interface makes it suitable for subjective quality-of-experience tests. It has been tested with MPEG SMV sequences at SD resolutions, and comfortable value ranges for the system parameters are provided for each of the sequences (cf. Table 1). This system could also be used in the near future to analyze users’ typical movements while using this kind of HMD.

User Manual

1. REQUIREMENTS

    a) Computer

This system has been tested with a computer with the following characteristics:

- Intel® Core™ i7-4790 CPU @ 3.60 GHz 

- RAM 16GB 

- 64-bit OS 

- NVIDIA GeForce GTX 970 (4095 MB) updated to 361.91

Using a computer with lower or different specifications may result in degraded performance. It has been verified that the system does not work properly with NVIDIA driver versions older than 358.70.

    b) Oculus

The computer used to run the system must have the Oculus runtime installed. The system has been developed using version 0.8, and it has been verified that it does not work properly with older versions. The runtime must be initialized before executing the system.

    c) Browser

For the correct execution of the system, it is important to use a browser that supports WebGL and integration with HMDs. We recommend Firefox Nightly version 45.0a1 with the “Mozilla WebVR Enabler” add-on installed (version 0.5.0 recommended). The system should also work with newer versions.

2. VIDEO FILE CONVERSION

The input videos of the multiview sequence have to be in the OGV (Ogg Theora Video) format. One way to convert them from a common format such as AVI is the ffmpeg2theora converter; for example, in a Windows batch file:

for %%A IN (*.avi) DO ffmpeg2theora.exe -v 10 "%%A"

If the videos are in YUV instead of AVI, we recommend converting them first with the ffmpeg converter, for example:

for %%A IN (*.yuv) DO ffmpeg.exe -s 1280x768 -r 25 -i "%%A" -pix_fmt yuv420p -vcodec rawvideo -r 25 -s 1280x768 "%%~nA.avi"

 

(Check the details of these converters in their websites)

3. VIDEO FILE INSERTION

Before executing the program, it is important to place all the videos in the ‘StreamingAssets’ folder. All the video files must follow the same name format and be numbered at the end of the name with 4 digits, starting from 0000.
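For example, the expected file names for a hypothetical 91-view OGV sequence with the prefix “BBB_Butterfly_cam” (the sequence used as an example in Section 4 below) can be generated with this Python sketch:

```python
def video_filenames(prefix, n_videos, ext=".ogv"):
    """Expected names in 'StreamingAssets': prefix + 4-digit index,
    starting from 0000."""
    return [f"{prefix}{i:04d}{ext}" for i in range(n_videos)]
```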

 

 

 

 

4. SYSTEM CONFIGURATION

After including the videos in the ‘StreamingAssets’ folder and opening the ‘index.html’ file with the correct browser (see Section 1c), it is time to configure the system:

Video name format: prefix of the video file names, common to all the files. For example, for a sequence of 91 videos named from “BBB_Butterfly_cam0000” to “BBB_Butterfly_cam0090”, the video name format is “BBB_Butterfly_cam”.

Number of cameras/videos: The total number of videos in the sequence
 
Interocular distance (stereo baseline): distance between left and right views (measured in number of camera gaps).
 
Number of background views: number of views that are played simultaneously in the background and prepared to be displayed on the screen. The value of this parameter does not include the views covered by the interocular distance, which will also be played in the background.

Different prediction modes for the “background” views (Initial play mode): the system has different ways of preparing the views next to the user’s position to improve the experience. Although we recommend the ‘Adaptive Mode’, the other modes can sometimes be useful:
    - Manual modes:
        - Normal mode: the same number of views is set in both directions (taking the user’s current position as center).
        - On my way / On the other way modes: the number of views to prepare is decided depending on the user’s movement direction.
    - Adaptive Mode: it sets the same number of views in both directions (like the normal mode) but adapts this ratio at the limits of the camera rig, setting more views in the opposite direction. The adaptive mode is the most useful in almost all situations.
 
Video resolution: Resolution of the input videos. We recommend the use of SD resolutions.

View switching distance: horizontal distance between two consecutive views, i.e. the distance the head needs to move to switch to the contiguous view.

Rotation (Screen simulation mode): the user can turn off the rotation tracking of the Oculus Rift so that the screens “follow” the head orientation, or keep it on so that the screens remain fixed in the 3D scene (screen simulation mode).

Distance to screen: measured relative to the screen height (vertical resolution of the sequence). This parameter is only adjustable at execution time.
 
We recommend trying a few different values of the ‘number of played cameras’ and the ‘switching distance between cameras’, because the optimal values of these parameters can vary considerably depending on the computer and the video sequence used.

5. SYSTEM EXECUTION

Once the system has been configured and the ‘continue’ button has been clicked, the system will start running. Now it is time to switch on the Oculus Rift:

- Click on the ‘Oculus’ logo to switch to Oculus Mode. In this mode the Oculus tracker is activated and the result can be observed on the computer screen, but not yet in the Oculus Rift.

- Click on the ‘Fullscreen’ logo to activate the Oculus display. Now put on the Oculus.

Sometimes it can be useful to reset the Oculus sensor after these steps by clicking the ‘Reset Sensor’ button.

Now the system should be fully operable. We recommend moving the head slowly (to the left and right) for the best possible experience. If you notice desynchronization between the two views, you may need to change the parameters (restarting the system) or the play mode (the ‘Adaptive Mode’ is recommended).

During the execution you can change some settings using the keyboard:

- Change the ‘play mode’:

    - ‘F1’ key: switch to ‘Adaptive mode’

    - ‘F2’ key: switch to ‘Normal mode’

    - ‘F3’ key: switch to ‘On My Way mode’

    - ‘F4’ key: switch to ‘On The Other Way mode’

- Move the screens:

    - ‘Up’ key: zoom out

    - ‘Down’ key: zoom in

- Modify the user position with the keyboard (if the Oculus doesn’t work correctly):

    - ‘A’ key: move the user to the left

    - ‘D’ key: move the user to the right

- ‘S’ key: force synchronization

- ‘TAB’ key: switch the console on/off

Downloads


1. User Manual in PDF format.

2. SMV Player, ready to go.

3. Source code (GitHub)

4. Try it online (only works in the Firefox Nightly browser; for better performance, download the ‘SMV Player’ package and run it locally).