Virtual Reality Technology
A: Two weeks from now. Doesn't matter when now is, it's still two weeks.
- taken from comp.sys.ibm.pc.hardware.misc -
- What is Virtual Reality?
- Interaction with the virtual world
- Stereo Viewing Devices
- Acoustic VR tools
- Touch & Force Feedback mechanisms
- Programming in VR
- VRML - Virtual Reality for everyone
- Human factors in VR
The idea of VR dates back to the 1960s. The first extensive research was done in the 1970s by the American military, which hoped to replace its analog flight simulators with digital ones, so a lot of research went into flight helmets with displays. NASA was another institution that needed simulation for training purposes. In 1981, it developed a head-mounted display based on small LC displays. This principle is still used today, although most simulations nowadays use large volume visualization.
Virtual Reality is a powerful human-computer interface. A simulation with computer graphics creates a synthetic world that appears more or less realistic. Haptic feedback, 3D sound and, most important, real-time interactivity improve the impression of immersion. The virtual world is modified immediately after the user's input, which requires a lot of processing power and high-end graphics cards.
Today, the market for VR is driven primarily by entertainment applications, e.g. VR-based video games with force feedback devices, 3D sound and other elements. But a lot of research for scientific purposes is also going on, such as physically based modeling, haptic rendering or the visualization of complex physical processes, e.g. fluid dynamics in an engine.
Many devices are available that let the user provide input and receive feedback from the computer. An object moving in space has six degrees of freedom, three translations and three rotations, which can be measured by 3D position sensors. One technique uses magnetic sensors, which employ alternating low-frequency fields. The field is sampled by a receiver attached to the user's head. The position and orientation data is transmitted to the host computer, which calculates a new viewing direction for the virtual scene. This sensor returns absolute position data with respect to the coordinate system fixed by the field transmitter array.
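As a rough illustration of how a host computer could turn tracker orientation into a viewing direction, here is a minimal sketch. The axis convention and the function name `view_direction` are assumptions made for this example; they are not taken from any particular tracker.

```python
import math

def view_direction(yaw_deg, pitch_deg):
    """Unit vector the user is looking along, from tracker yaw and pitch.

    Assumed convention (not from the article): right-handed axes, y up,
    yaw 0 / pitch 0 looks down the negative z axis, positive yaw turns
    left, positive pitch looks up. Roll does not change the direction.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    x = -math.sin(yaw) * math.cos(pitch)
    y = math.sin(pitch)
    z = -math.cos(yaw) * math.cos(pitch)
    return (x, y, z)

# Looking straight ahead, then after turning the head 90 degrees to the left.
ahead = view_direction(0.0, 0.0)
left = view_direction(90.0, 0.0)
```

The three translations reported by the sensor would set the position of the virtual camera; the direction computed here only sets where it points.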
Another way to control an object's position is to use relative sensors like trackballs or 3D probes. A probe consists of a small sensorized mechanical arm with six degrees of freedom. Virtual objects can be selected or released with a switch. As with all open kinematic chains, measurement errors accumulate from the base towards the tip.
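The accumulation of errors along an open chain can be seen in a small calculation. The sketch below is deliberately simplified to a planar arm with three links (a real probe has six degrees of freedom); the link lengths and the one-degree sensor error are made-up values for illustration.

```python
import math

def point_on_arm(link_lengths, joint_angles_deg, upto):
    """Planar forward kinematics: position of the end of link `upto` (1-based)."""
    x = y = 0.0
    total_angle = 0.0
    for length, angle in zip(link_lengths[:upto], joint_angles_deg[:upto]):
        total_angle += math.radians(angle)  # each joint rotates everything after it
        x += length * math.cos(total_angle)
        y += length * math.sin(total_angle)
    return (x, y)

def position_error(link_lengths, true_angles, measured_angles, upto):
    """Distance between the true and the measured position of link `upto`."""
    tx, ty = point_on_arm(link_lengths, true_angles, upto)
    mx, my = point_on_arm(link_lengths, measured_angles, upto)
    return math.hypot(tx - mx, ty - my)

links = [0.2, 0.2, 0.2]        # metres, hypothetical arm
true_angles = [0.0, 0.0, 0.0]  # arm stretched out straight
measured = [1.0, 1.0, 1.0]     # every joint sensor reads one degree too much
errors = [position_error(links, true_angles, measured, n) for n in (1, 2, 3)]
```

With these values the position error grows at every link, because an angle error at a joint near the base displaces all links that follow it.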
To make the interaction more intuitive, sensing gloves can be used, which preserve the hand's freedom of motion. Sensors measure the joint angles of the fingers. There are different sorts of sensors. Among the best are fiber-optic sensors (Dataglove), because they are compact and light, so the user feels comfortable wearing the glove. The Dextrous Hand Master (DHM) is a metallic skeleton structure worn on the back of the user's hand. Each angle is measured by Hall-effect sensors, but the weight of over 300 g tires the user. It also does not measure the wrist's position, and 3D magnetic trackers cannot be used for this purpose because of the large amount of metal in the master.
Human vision is the most powerful sensorial channel and has an extremely large processing bandwidth, equivalent to 10^6 bits per second, while the tactile sense is equivalent to only 100 bits per second. Therefore, many visual feedback tools are available for VR, the most common being head-mounted displays (HMDs).
Depth perception can occur through stereopsis, in which both eyes register an image and the brain uses the horizontal shift in image position to measure depth. This parallax needs to be replicated by the VR viewing hardware in order to help the brain interpret depth in the simulated world. Besides that, linear perspective, shadows, object detail and the occlusion of distant objects by closer ones round out the impression of volume.
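The horizontal shift that the hardware has to reproduce follows from similar triangles. The sketch below assumes a simple geometry that the article does not spell out: both eyes look at a flat screen at a fixed distance, and the point of interest lies straight ahead. The function name and the sample values are illustrative.

```python
def screen_parallax(eye_separation, screen_distance, object_distance):
    """Horizontal offset between the right-eye and left-eye image of a point.

    All values are in the same unit (e.g. metres). The result is zero for a
    point on the screen plane, positive for a point behind it and negative
    for a point in front of it.
    """
    if object_distance <= 0:
        raise ValueError("object_distance must be positive")
    return eye_separation * (object_distance - screen_distance) / object_distance

# Eyes 6.5 cm apart, screen 60 cm away, virtual object 1.2 m away.
shift = screen_parallax(0.065, 0.6, 1.2)
```

As the object moves far away, the parallax approaches the eye separation itself, so a display never needs to shift the two images by more than that amount for points behind the screen.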
The Cyberscope is a device that is attached mechanically to the computer screen. It is totally passive: the optics inside the Cyberscope rotate two images and display them to the user. Resolution and brightness are limited only by those of the monitor. But with the head kept on the scope, the user cannot move it as with more advanced HMDs. These use screens that are placed very close to the eye. Special optics are needed to allow the eyes to focus without tiring. The various HMD models on the market differ in resolution, field of view, weight, comfort and cost.
When a team of people needs to see the same stereo image, 'active' stereo glasses and special monitors make it possible to view stereo scenes. But as images are displayed the same regardless of the viewer's position, it is better to add head trackers to the system. However, in a multi-user environment only one person has control over the perspective. The monitor has to be capable of refreshing the screen at double the normal rate (about 140 images per second). The glasses open and close the view of each eye alternately. The brain registers a quick sequence of right-eye and left-eye images and fuses the two by stereopsis. This image does not tire the user even in long simulations, but the images appear less luminous than on a normal screen.
There is another possibility for projecting stereo images. Stereo projection screens display the images for the left and right eye at the same time, so a normal refresh rate is enough for that type of screen. Neighbouring pixels contain alternating image information for the two eyes. They also have alternating polarization (90° rotation). The user wears a pair of passive polarized glasses, so each lens blocks all pixels with the rotated polarization. These glasses are a lot cheaper than active glasses. The same technique is used by printed media containing 3D images.
Hearing is less important than vision, but adding sound effects increases immersion. When objects that emit mono sound move out of view, however, the user cannot tell where they went. Sound therefore has to be generated in 3D relative to the user.
Virtual sound should not be confused with stereo sound.
"Virtual Audio: A recorded sound experience that contains significant psychoacoustic information to alter human perception into believing that the recorded sound experience is actually occurring in reality."
[Currell, 1992]
A spatial impression is produced when sound arrives at the two ears at different times and with different intensity. Sounds in a real room also bounce off the walls, adding to the direct sound from the source. Several systems were invented to reproduce those effects (Convolvotron, Beachtron, Acoustetron).
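To give a feeling for the size of the arrival-time difference, the sketch below uses Woodworth's classic spherical-head approximation. This formula is a textbook model and not necessarily what the systems named above implement; the head radius is an assumed average value, and the intensity difference and room reflections are left out.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees Celsius
HEAD_RADIUS = 0.0875    # m, assumed average head radius

def interaural_time_difference(azimuth_deg, head_radius=HEAD_RADIUS):
    """Arrival-time difference between the ears for a distant source.

    azimuth_deg is the angle between straight ahead (0) and the source,
    up to 90 degrees (source directly to one side). Uses the spherical-head
    approximation ITD = r / c * (theta + sin(theta)).
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return head_radius / SPEED_OF_SOUND * (theta + math.sin(theta))

itd_side = interaural_time_difference(90.0)  # source directly to one side
```

For a source directly to one side this gives a delay of well under a millisecond, which shows how precisely a 3D sound system has to time the two ear signals.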
Touch and force feedback are important sensorial channels. The hand has the highest density of touch receptors in the body. Touch perception occurs when a hand pushes lightly against an object. If the hand pushes harder, the muscles in the hand and forearm start to contract and the motion is stopped (assuming large feedback forces).
Virtual touch and force feedback have to meet several requirements. First, they need to be replicated in real time. Second, the feedback forces have to be strong enough to stop the hand motion but must not be so large that they harm the user by accident, e.g. after a computer failure. Finally, the bandwidth beyond which the fingers cannot discriminate two force input signals is 320 Hz, so the force output of the device has to be refreshed at least at that rate. The bandwidth over which the human finger needs to sense vibration during skillful manipulative tasks is between 5 and 10 kHz.
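The two force requirements, strong enough to stop the hand but never dangerously large, can be illustrated with a clamped spring model of a virtual wall. The spring model is a common simplification chosen for this sketch, not a method described in the article, and the stiffness and force limit are made-up values.

```python
MIN_FORCE_UPDATE_HZ = 320.0                       # refresh rate quoted above
MAX_UPDATE_PERIOD_S = 1.0 / MIN_FORCE_UPDATE_HZ   # longest allowed time per update

def wall_force(penetration_m, stiffness_n_per_m=500.0, max_force_n=10.0):
    """Feedback force for a finger that has pushed into a virtual wall.

    The force grows linearly with the penetration depth (spring model) and
    is clamped to max_force_n so that a software error cannot command an
    arbitrarily large force. Both default values are hypothetical.
    """
    if penetration_m <= 0.0:
        return 0.0  # finger is not touching the wall
    return min(stiffness_n_per_m * penetration_m, max_force_n)
```

A device driver would have to call something like `wall_force` and update its actuators at least once every `MAX_UPDATE_PERIOD_S` seconds, i.e. roughly every 3 ms.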
Touch feedback can be provided with electric pulses to the skin or with neuro-muscular stimulation delivered directly to the user's primary cortex. Both techniques can be considered risky. Pneumatic touch feedback is an approach that uses micro air pockets placed in a glove. Air pressure is supplied by a compressor placed in the control interface. Another possibility is to use micro-pin arrays, which provide tactile feedback and can also generate the feeling of fingers moving over rough surfaces. These effects can be enhanced by adding temperature and thermal conductivity feedback.
Early research on force feedback used large mechanical arms with embedded position sensors and electrical feedback actuators. The control sampling time is about one millisecond, while the graphics refresh is only a fraction of that rate. The arm is gravity and inertia compensated, so that no forces are felt at the handle as long as there is no interaction with a virtual object. These arms are still used today, but because of their complexity and high cost other approaches were developed. Force feedback joysticks or gloves with a pneumatic structure in the palm are alternatives. But the combination of tactile and force feedback still has a long way to go.
Developing VR applications requires knowledge of real-time systems, OOP, networking, multi-tasking and other subjects. To help developers cope with all of this, industry has produced several VR toolkits. Well known is the 'World Toolkit' (WTK) from Sense 8, but there are many others, like 'VC Toolkit', 'Virtual Reality Toolkit' or 'Mercury'. These kits are extendable libraries of object-oriented functions designed especially for VR programs. They also support different platforms through high-level functions that are independent of the specific hardware they run on. Low-level translators identify the specific I/O tools, such as 3D mice, trackballs and gloves, at runtime.
Creating virtual scenes requires placing elements like viewpoints, lights and geometries at particular locations. In WTK, these elements are called nodes and they are part of a hierarchical structure known as the scene graph. This is an upside-down tree, with the root on top. WTK traverses the entire scene graph once per frame. The root node is the entry point where WTK starts to draw the scene. Each node is visited in top-to-bottom, left-to-right order. Evaluation and processing depend on the type of the current node. When WTK encounters a geometry node, it draws it at the current position and orientation with respect to the current lighting. The scene graph structures can be built dynamically at the beginning of the program, but they can also be read from files such as VRML (see below).
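The traversal described above can be imitated in a few lines. This is a toy model and not the WTK API: the `Node` class, the node kinds and the way state is carried from one sibling to the next are assumptions made for the example, and orientation is left out.

```python
class Node:
    """One scene-graph element; `kind` decides how it is processed."""
    def __init__(self, kind, name, value=None, children=()):
        self.kind = kind      # "group", "light", "transform" or "geometry"
        self.name = name
        self.value = value    # translation (dx, dy, dz) for "transform" nodes
        self.children = list(children)

def traverse(node, state, draw_calls):
    """Visit nodes top to bottom, left to right, and record what gets drawn."""
    if node.kind == "light":
        state["lights"].append(node.name)
    elif node.kind == "transform":
        state["position"] = tuple(
            p + d for p, d in zip(state["position"], node.value))
    elif node.kind == "geometry":
        # A geometry is drawn at the current position under the current lights.
        draw_calls.append((node.name, state["position"], tuple(state["lights"])))
    for child in node.children:
        traverse(child, state, draw_calls)

def render_frame(root):
    """One traversal of the whole graph, starting at the root."""
    state = {"position": (0.0, 0.0, 0.0), "lights": []}
    draw_calls = []
    traverse(root, state, draw_calls)
    return draw_calls

root = Node("group", "root", children=[
    Node("light", "sun"),
    Node("geometry", "floor"),
    Node("transform", "move_up", value=(0.0, 1.0, 0.0)),
    Node("geometry", "table"),
])
frame = render_frame(root)
```

Here the floor is drawn at the origin and the table one unit higher, both lit by the sun, because the light and the transform come before them in the left-to-right order.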
During a simulation, user input represents a series of events for the toolkit program. Scheduling all these events in real time at each simulation cycle is an important task that is handled by the toolkit functions. At the beginning of each loop, sensors are read. Each graphical object can perform tasks once per frame. The VR scene also has user-specified action functions, which are invoked before rendering is done for the current frame.
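One simulation cycle of this kind might be sketched as follows. The function `run_simulation` and its parameters are hypothetical and only mirror the steps named in the text; the relative order of object tasks and the action function is an assumption, since the text only says that both happen before rendering.

```python
def run_simulation(read_sensors, object_tasks, action, render, frames):
    """Run a fixed number of simulation cycles in the order described above."""
    for _ in range(frames):
        inputs = read_sensors()      # 1. sensors are read at the start of the loop
        for task in object_tasks:    # 2. each object performs its per-frame task
            task(inputs)
        action(inputs)               # 3. user-specified action function
        render()                     # 4. the frame is drawn last

# Tiny demonstration that records the order of the calls.
log = []
run_simulation(
    read_sensors=lambda: log.append("sensors") or {"head_yaw": 0.0},
    object_tasks=[lambda inputs: log.append("task")],
    action=lambda inputs: log.append("action"),
    render=lambda: log.append("render"),
    frames=2,
)
```

Passing the steps in as callbacks keeps the loop itself independent of any particular sensor or renderer, which is roughly the separation a toolkit provides.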
Virtual Reality Modeling Language (VRML) uses text-based commands to create rich 3D objects. Unlike Java or C, it does not need to be compiled. The code is interpreted directly by your internet browser or, more precisely, by the Cosmoplayer plugin, which is included in all new Netscape versions. If you do not have it, you should look at the Cosmo Software site.
VRML files contain Nodes (nuggets of scene information), Fields (node attributes you can change), attribute values, node names and more that describe a 3D world. With only a few lines, you can describe primitive shapes like boxes, cones, cylinders and spheres. The scene graph has a structure which is similar to the WTK tree described above.
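To show how little text a primitive shape needs, the sketch below builds a small world file as a string. It assumes VRML 2.0 (VRML97) syntax, which the article does not state explicitly, and the helper names `vrml_shape` and `vrml_world` are invented for this example.

```python
def vrml_shape(geometry, fields, translation=(0, 0, 0)):
    """Text of one Transform node that places a primitive shape."""
    x, y, z = translation
    return (
        f"Transform {{\n"
        f"  translation {x} {y} {z}\n"
        f"  children [\n"
        f"    Shape {{ geometry {geometry} {{ {fields} }} }}\n"
        f"  ]\n"
        f"}}"
    )

def vrml_world(shapes):
    """Complete file content: the VRML 2.0 header line followed by the shapes."""
    parts = ["#VRML V2.0 utf8"]
    parts += [vrml_shape(*shape) for shape in shapes]
    return "\n".join(parts) + "\n"

world = vrml_world([
    ("Box", "size 2 2 2", (0, 0, 0)),     # node type, its fields, position
    ("Sphere", "radius 1", (3, 0, 0)),
    ("Cone", "bottomRadius 1 height 2", (-3, 0, 0)),
])
```

Saving `world` to a file with the extension `.wrl` should give a scene with three primitives; each `Transform`, `Shape` and geometry entry is a node, and `translation`, `size` or `radius` are fields.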
You can see a simple world with some basic geometries here, which I programmed together with a friend for a laboratory course. The second WRL file contains a simulation of a star system with a sun and four planets, each with one moon. To be displayed correctly, both require a VRML plugin for your browser. For more information on VRML you can look in the VRML newsgroup 'comp.lang.vrml'. There are also vendor- and product-specific newsgroups.
The thought processes of the human brain still hold many secrets that have yet to be discovered. It is therefore difficult to analyze human-machine interaction, and the more human factors are involved, the harder it becomes. Determining the performance of a VR simulation is somewhat subjective. The natural excitation of each sensorial channel needs to be analyzed individually. But besides the technological need to evaluate the quality of an application, there is another need. The 'transport' of a user to a virtual world can be considered dangerous if it goes beyond certain limits. Long simulations may result in the user's isolation from normal daily life. It is therefore necessary to evaluate VR in terms of its psychological and sociological impact.
Aspects of visual feedback are very important. The first category is General Ergonomic Factors. The comfort of the display affects the user's willingness to spend long periods immersed in the virtual world. For example, HMDs must not be too heavy, too large or too tight-fitting. Manufacturers pay attention to these aspects, but the final validation is that of the user. The second category deals with Physiological Factors influencing vision. Factors such as refresh rate, depth perception and lighting level influence the performance of VR simulation systems. Above 25 images per second, the eye is totally fooled into perceiving continuous motion. But the graphics refresh rate depends on the scene complexity, expressed in the number of polygons and the shading mode. The third category deals with Psychological Factors such as scene realism, scene errors (scale errors, translation errors, etc.) and the integration of feedback and command, which means the modification of the scene as a function of task-specific information. Markers can be added to the virtual world to aid the user in performing several tasks. An example is the 'intelligent agent' that serves to orient brokers in a 3D field of virtual stocks.
Sound feedback has a dual role, first as the medium for transmitting information and second as a way to localize the source of the information. Ergonomic factors refer to the design of the hardware and its ease of use by humans.
Physiological conditions refer to the sound frequency range, which has to be within the range of audible sound (20 to 20000 Hz), and to sound intensity. If the intensity is too great (above 120 dB), it can produce discomfort and even pain. Another factor is the signal-to-noise ratio. But most complex are the psychological factors. Sound understanding allows the reconstruction of a world that is volumetric and whose parts have specific informational components. A piano, for example, should not generate a drum sound. Another example is a complex control panel, which already presents a large amount of visual feedback. An audio alarm can draw the user's attention to error conditions. But sound can nowadays also be used as a command tool for the user.
Physical contact with the environment is the last important feedback channel. Some virtual tasks can only be performed accurately by adding tactile feedback to the environment. But it is often difficult to find the optimum way to use haptic feedback in simulations. The aim for the future is to provide touch and force feedback to the whole body. Today, feedback stimulation is limited mainly to one hand. Fortunately, many real tasks can be done with one hand, so the use of only one hand does not degrade the simulation dramatically.
Prolonged immersion in VR environments may cause several effects. Simulation sickness, with symptoms such as dizziness, is caused by a sensorial conflict between visual feedback indicating motion and the inner ear indicating stillness. Conversely, if the user moves the head and the displayed image does not respond in real time, the sickness occurs as well. The phenomenon is aggravated by poor image resolution.
VR applications may have consequences for public, private and professional life. In almost any field the major activities of a profession consist of learning, design, analysis, realization and communication. VR may benefit all these activities and increase overall individual productivity. The networking abilities of VR will allow more people to perform more complex work while remaining at home or at least at one place in a building. VR can allow people to experience art or travel the world without leaving their homes. Entertainment needs are already an important factor. Video games and television can be enhanced by the new technology. But VR can also create a whole new class of depersonalized addicts to this new 'drug'. It is thus a question of dosage. Controlling its use is mainly left to the individual or the family.