Thursday, March 10, 2011

1. Human Interaction with the Natural Environment


People sense through five different organs: the eyes, ears, nose, mouth and body. These organs perceive images, sounds, smells, tastes and touch respectively. Our main target is to capture the interaction of three of them: the eyes, ears and body of the user. The exact point responsible for the interaction between an individual and the environment can be identified as the interface [9, 13]. In this project the user's eyes, ears and body act as the main interfaces, and the communication takes place through them.
The human eye can identify three wavelength ranges: short, medium and long. Its field of view is conical, which is a consequence of the eye's structure: the color-sensitive cones sit in high density at the center of the retina, so the eye concentrates mainly on the middle of its conical view [4]. While open, the eye receives a continuous stream of images over time, which the brain processes. Since a human has two identical eyes, he can judge the distance to an object; this makes him capable of perceiving three dimensions (3D), so he can determine the height, length and depth of any object he sees.
The human body also acts as an interface with the natural environment. It mainly receives inputs such as touch and heat from the environment and reacts according to them. A person's physical movements mostly occur as a result of a previous input: the body reacts according to the vision he gets or the sounds he hears. Since the body accounts for the majority of a person's volume, it is involved in most human actions.
The next main interface for human-environment interaction, as identified in this project, is based on sound. The human ear detects sounds in the range of 20-20,000 Hz; it takes sounds as input, and the brain processes them. Although a human can hear all of these sounds, he consciously recognizes only a few and filters out the rest as noise. Sound can be identified as a factor with a huge impact on human behavior and reactions.
The human brain is capable of recognizing any item it has seen before; this is called experience. The brain gathers information about items, sounds, tastes, smells and feelings. Old experience makes it easy to recognize things, while new items keep the brain in conflict for a while.
After an input passes through these interfaces it directly reaches the human brain, which decides how to act on it. The organs then react according to the output signal generated by the brain, and the interaction with that particular object, sense or feeling takes place.

2. How to Build the Virtual Environment?


In this project we need to capture the user's concentration on our graphical virtual environment, and to do so we need to address the facts described in section 1. The project includes a sound environment that gives the experience of the real environment. It aims to lead the user to believe in the virtual reality and to demonstrate a high level of interactivity with the virtual world. One of the main influences the natural environment exerts is that it talks back to the person.
To make the user feel immersed, the environment around him needs to talk back to him as the real environment does. It should also consist of several continuous and distinct activities. Consider a user walking through a jungle who stops where a banana leaf hangs ahead. His vision is now limited by that leaf, so he tries to lift it and look forward. With a particular gesture he must be able to move it upward and get a precise view of what lies ahead. After he releases the leaf, it should swing back in the opposite direction and oscillate until it becomes stable. That is the level of talk-back expected from the environment.
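The swing-back of the released leaf described above can be approximated with a damped spring, a standard technique for this kind of talk-back. The following is a minimal sketch in Java; the class, constants and method names are ours, not project code.

    // Damped-spring sketch for the released leaf: it swings back toward rest
    // and oscillates with decreasing amplitude until it settles.
    public class LeafSpring {
        double angle = 1.0;            // current deflection from rest (radians)
        double velocity = 0.0;         // angular velocity (radians/second)
        final double stiffness = 40.0; // pulls the leaf back toward rest
        final double damping = 4.0;    // bleeds off energy so the swing dies out

        // Advance the simulation by dt seconds; call once per frame.
        void update(double dt) {
            double accel = -stiffness * angle - damping * velocity;
            velocity += accel * dt;
            angle += velocity * dt;
        }

        // The leaf counts as stable once both deflection and speed are tiny.
        boolean isStable() {
            return Math.abs(angle) < 1e-3 && Math.abs(velocity) < 1e-3;
        }
    }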
The human eye receives a constant video stream, but we provide a 60 Hz output from the monitor. The eye cannot detect this and perceives the output as a continuous flow. How natural the video stream looks, however, depends heavily on its quality: if the stream consists of dull colors and faded images, the person will be able to tell that the vision is fake. The resolution of the graphical model should therefore be high, which improves the detail of the stream.
This demands high graphics processing unit (GPU) and CPU power. We need to refresh the graphical frame every 1/30 of a second and simultaneously draw the new frame. High-resolution frames consume a lot of processing power, but this constraint can be overcome by using a capable GPU in the PC.
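A fixed-timestep render loop is one common way to meet the 1/30 s refresh requirement mentioned above. The sketch below is illustrative only; updateScene and drawFrame stand in for whatever engine calls actually update and render the scene.

    // Fixed-timestep loop: update and draw one frame every 1/30 of a second.
    public class GameLoop {
        static final long FRAME_NANOS = 1000000000L / 30;

        public static void main(String[] args) throws InterruptedException {
            long next = System.nanoTime();
            while (true) {
                updateScene(); // apply user input, animate objects
                drawFrame();   // render the new frame
                next += FRAME_NANOS;
                long sleep = next - System.nanoTime();
                if (sleep > 0) {
                    Thread.sleep(sleep / 1000000L, (int) (sleep % 1000000L));
                } else {
                    next = System.nanoTime(); // fell behind; resynchronize
                }
            }
        }

        static void updateScene() { /* placeholder for engine update call */ }
        static void drawFrame()   { /* placeholder for engine render call */ }
    }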

The human eye has a conical view and concentrates mainly on the middle of it, and it is the main interface between the person and the graphical interface we provide. If the distance to the graphical interface is large, the user will feel that it is an artificially generated view on a monitor; it would be best if the display were very close to the eye. Because of constraints such as camera placement, however, we have to put the monitor at a medium distance [fig. 1]. The monitor should be placed at the middle of the conical view, since the eye focuses mainly there.
Since the human eye is capable of identifying 3D objects, we have to design the virtual environment in 3D; otherwise the brain will figure out that it is a 2D generated environment. It is therefore essential to draw 3D graphical models in the virtual environment and provide the necessary depth information.
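Perspective projection is the basic device that carries this depth information to the screen: objects farther from the camera map to smaller screen offsets. A minimal sketch, with the names and focal length chosen by us purely for illustration:

    // Perspective projection: the screen position shrinks as depth z grows,
    // which is one of the 3D cues discussed above. Assumes z > 0 (the point
    // is in front of the camera); the focal length here is arbitrary.
    static double[] project(double x, double y, double z) {
        double f = 1.0; // focal length of the virtual camera
        return new double[] { f * x / z, f * y / z };
    }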
We are going to embed suitable sounds for the objects; otherwise the user will not get a convincing experience from the environment. Each object must be given the sounds related to it. The collaboration of sounds and visuals makes the user far more engaged with the virtual environment.
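One simple model is to give every sounding object a clip and an audible range, with volume falling off as the user moves away. This is only a sketch under our own assumptions; the types and the file name are placeholders, not a real engine API.

    // Each scene object carries a looping sound whose volume falls off
    // linearly with the user's distance. Placeholder types, not engine code.
    class SoundEmitter {
        final String clip;     // e.g. "river.ogg" (hypothetical file name)
        final double maxRange; // beyond this distance the sound is inaudible

        SoundEmitter(String clip, double maxRange) {
            this.clip = clip;
            this.maxRange = maxRange;
        }

        // Full volume at the object, silence at maxRange and beyond.
        double volumeAt(double distanceToUser) {
            return Math.max(0.0, 1.0 - distanceToUser / maxRange);
        }
    }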
This virtual environment also embeds another essential property of the natural environment: its reaction time to a person's action. Most of the time the reaction is instantaneous, so the graphical environment should be implemented with that efficiency.
The next main consideration is the continuity of frame generation. When the user makes gestures, the AI identifies them and sends the data via a 2D object array [section 3]. New frames must be drawn according to this input, and the process should preserve the continuity of the scene; otherwise the result would look unnatural, and the user would be confused and disturbed.
Energy emitted or reflected from an object is what allows a person to see it [1]. In the virtual environment we need to decide the level of energy emission from each object in order to convey a real-world feeling. The main effect tied to this requirement is the brightness level of the objects, which helps increase how natural the virtual objects appear.
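This emitted-plus-reflected idea maps directly onto the standard shading formula: perceived brightness is the light the object emits plus the light it reflects, the latter scaled by Lambert's cosine law. A minimal sketch, with all names ours:

    // Brightness = emitted light + reflected light (Lambert's cosine law).
    // normal and toLight are assumed to be unit-length 3D vectors.
    static double brightness(double emitted, double albedo,
                             double[] normal, double[] toLight) {
        double cos = normal[0] * toLight[0]
                   + normal[1] * toLight[1]
                   + normal[2] * toLight[2];
        return emitted + albedo * Math.max(0.0, cos);
    }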
When considering the activities carried out by the user, we need to consider the speed of the action [5]. The user should be able to control the system through the speed, angle, rotation and plane of his hand movement. For example, consider a user sitting alone by a riverside. The environment should change over time while the area remains the same. Now the user picks up a small stone and throws it into the river. If he moves his hand vertically, the stone should sink at once; but if he moves his hand in a horizontal plane with enough speed, the stone should skip about two to four times and then sink.
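The stone-throw rule above can be expressed as a small decision function over the hand's speed and the plane of the throw. The thresholds below are our own guesses for illustration, not measured project values.

    // Decide how many times the thrown stone skips before sinking.
    // A steep or slow throw sinks at once; a fast, flat throw skips a few
    // times, capped at four as described in the text. Thresholds are assumed.
    static int skipCount(double speedMps, double angleToWaterDeg) {
        if (angleToWaterDeg > 45.0 || speedMps < 3.0) {
            return 0; // vertical or weak throw: sinks immediately
        }
        int skips = (int) Math.round(speedMps / 3.0); // faster = more skips
        return Math.min(4, Math.max(2, skips));
    }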

We normally see video games with a graphical interface that separates the user from the game. Our target is to give the user a real-world experience. In video games on the commercial market, we can see the user embedded in the game as in fig. 2.b. This scenario is fine for such games, since the user operates the keyboard or mouse of that particular PC. But in the real world, when a user acts, he can see his own body parts such as his hands and legs.
In our project the user acts in the real world, so the embedded-user scenario may not work. We therefore have to move to the independent user demonstrated in fig. 2.a. This assures the user that he is in the real world and joins the virtual world in a real, natural manner.

3. Input Array


The artificial neural network sends an array of data to the virtual environment. The AI predicts actions from the gestures of the person acting in front of the camera. This array contains several pieces of information about the user's actions, which helps the virtual environment change according to them [1].
It is a 2D array containing data objects about the user's movements, such as actions and coordinates. The actions describe the movements of the user's chest and head, which provide information about the direction he is looking or moving. The array also contains the coordinates of the user's hands and legs, which can be used to determine object movement in the environment by enabling interaction with the items at those coordinates. A sketch of one possible entry layout is shown below.
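This is a minimal sketch of one entry, with field names we invented from the description above; the actual layout produced by the neural network may differ.

    // One entry of the 2D input array sent by the AI. Field names are
    // assumptions based on the description in this section.
    class UserInput {
        String action;      // e.g. "HEAD_TURN_LEFT" or "CHEST_FORWARD"
        float[] leftHand;   // {x, y, z} coordinates of the left hand
        float[] rightHand;  // {x, y, z} coordinates of the right hand
        float[] leftLeg;    // {x, y, z} coordinates of the left leg
        float[] rightLeg;   // {x, y, z} coordinates of the right leg
    }
    // The environment then receives UserInput[][]: one row per time step,
    // one column per tracked body part or recognized action.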

4. Body Model


In this project we are going to design a 3D model similar to a human pose. This seems to contradict the facts provided previously, namely the argument for the independent user in fig. 2.a, so we cannot display the human model in the virtual environment. Instead we keep it invisible to the user but use it for our own purposes [3, 10].
We then use it to interact with objects at the given coordinates. We operate the invisible user model with the data in the input array, so it is straightforward to take data from the input array and assign it to the user model. Through the model we can then identify which objects are going to interact with the user, which makes interaction with the virtual environment more accurate.
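Here is a sketch of how the invisible model could be driven and queried, reusing the hypothetical UserInput entry from section 3; all types and methods are placeholders, not project code.

    // The invisible body model mirrors the user's pose and is used only for
    // collision queries; it is never rendered. Placeholder types throughout.
    interface SceneObject {
        boolean contains(float[] point); // is this point inside the object?
    }

    class BodyModel {
        float[] head = new float[3];
        float[] rightHand = new float[3];
        // ... remaining joints omitted for brevity ...

        // Copy the latest coordinates from the input array onto the model.
        void applyInput(UserInput in) {
            rightHand = in.rightHand;
        }

        // Report which scene object, if any, the user's hand is touching.
        SceneObject touchedBy(java.util.List<SceneObject> scene) {
            for (SceneObject o : scene) {
                if (o.contains(rightHand)) return o;
            }
            return null;
        }
    }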

5. Languages We Are Going to Use


While going through the research, we designed a graphical environment for the project. The project's main target is to interact with the user's movements in real time, and per the project requirements this interaction should be reflected through a rich graphical interface [7]. We therefore evaluated several options and ran some tests to verify what the available resources could do [14, 15]. We considered the following choices for developing the user-interactive interface:
• XNA
• JMonkey
• CryEngine

XNA

XNA [16] is a framework developed on top of the native implementation of the .NET Framework 2.0 on Windows. It runs on a version of the Common Language Runtime optimized for gaming to provide a managed execution environment. The runtime is available for Windows XP, Windows Vista, Windows 7 and Xbox 360. Since XNA games are written for the runtime, they can run on any platform that supports the XNA Framework with minimal or no modification. Games on the framework can technically be written in any .NET-compliant language, but only C# in the XNA Game Studio Express IDE and all versions of Visual Studio 2008 are officially supported.

JMonkey

jMonkeyEngine (jME) [17] is a high-performance 3D game framework written entirely in Java. OpenGL is supported via LWJGL, with JOGL support in development; for sound, OpenAL is supported, and input via keyboard, mouse and other controllers is supported as well. Most importantly, it runs on Windows, Mac OS and Linux: it is truly cross-platform thanks to the Java VM.

CryEngine

CryEngine [18] is a high-performance 3D game engine developed by Crytek, which recently released CryEngine 3. The new engine is being developed for Microsoft Windows, PlayStation 3 and Xbox 360. On the PC platform the engine is said to support development in DirectX 9, 10 and 11. It offers rich graphical features such as water reflection, a flow graph, a real-time soft particle system with an integrated FX editor, road and river tools, a vehicle creator, real-time dynamic global illumination, deferred lighting, and natural lighting with dynamic soft shadows.

We compared the technologies above and settled on jME, which is platform independent. XNA supports only Windows and the Xbox, and when comparing the maturity of XNA and jME, jME showed more efficient and effective performance. CryEngine is one of the greatest 3D game engines in terms of detail, but it demands a certain level of hardware and is a commercial product. jME stood out on both counts.
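For a flavor of what jME code looks like, here is a minimal "draw one box" sketch in the style of the jME 2 tutorials linked in [15]. The class and method names follow those tutorials, but treat the exact signatures as assumptions rather than verified API.

    import com.jme.app.SimpleGame;
    import com.jme.bounding.BoundingBox;
    import com.jme.math.Vector3f;
    import com.jme.scene.shape.Box;

    // Minimal jME 2-style application: opens a window and shows a 2x2x2 box.
    public class HelloJme extends SimpleGame {
        protected void simpleInitGame() {
            Box box = new Box("box", new Vector3f(-1, -1, -1),
                                     new Vector3f(1, 1, 1));
            box.setModelBound(new BoundingBox()); // needed for view culling
            box.updateModelBound();
            rootNode.attachChild(box);            // add the box to the scene
        }

        public static void main(String[] args) {
            new HelloJme().start();
        }
    }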

Bibliography


[1] D Thalmann, "Using Virtual Reality Techniques in the Animation Process," in Virtual Reality Systems. Academic Press, 1993, pp. 143-159.
[2] E Shenchang, "QuickTime VR – An Image-Based Approach to Virtual Environment Navigation," Commun. ACM, 1995.
[3] T Rodden, J Pycock, C Greenhalgh, S Benford, "Collaborative Virtual Environments," Communications of the ACM, vol. 44, no. 7, pp. 79-85, 2001.
[4] A Roorda and D Williams, "The arrangement of the three cone classes in the living human eye," Nature, vol. 397, pp. 520-522, 1999.
[5] M Fraser, C Heath, S Benford, C Greenhalgh, J Hindmarsh, "Establishing mutual orientation in virtual environments," in Proceedings of CSCW '96, Boston, pp. 67-76, 1996.
[6] R Waters, D Anderson, J Barrus, "Supporting large multiuser virtual environments," IEEE Comput. Graph. Appl., pp. 50-57, 1997.
[7] D Brutzman, "The Virtual Reality Modeling Language and Java," Communications of the ACM, vol. 41, no. 6, pp. 57-64, 1998.
[8] C Carolina, J Sandin, and A DeFanti, "Surround-Screen Projection-Based Virtual Reality: The Design and Implementation of the CAVE," in SIGGRAPH '93 Proceedings, 1993, pp. 132-142.
[9] K Karen, K Ashman, C Zastrow, Understanding Human Behavior and the Social Environment. Cengage Learning, 2009.
[10] S Benford, J Bowers, E Fahlén, C Greenhalgh, and D Snowdon, "User Embodiment in Collaborative Virtual Environments," in ACM Conf. Human Factors in Computing Systems, 1995, pp. 242-249.
[11] A Shields, F Tavera, L Elford, H Scullin, A Reed, "Virtual Reality and Parallel System Performance Analysis," vol. 28, no. 11, pp. 57-67, 1995.
[12] A Wingrave, A Bowman, "Design and Evaluation of Menu Systems for Immersive Virtual Environments," in Proceedings of the Virtual Reality 2001 Conference, 2001, pp. 149-156.
[13] A J Weigert, Self, Interaction, and Natural Environment: Refocusing Our Eyesight. SUNY Press, 1997.
[14] "Riemer's 2D & 3D XNA Tutorials" [Online]. Available: http://www.riemers.net/eng/Tutorials/XNA
[15] "Setting Up NetBeans IDE for jME 2.0.1" [Online]. Available: http://jmonkeyengine.org/wiki
[16] "Microsoft XNA" [Online]. Available: http://en.wikipedia.org/wiki/Microsoft_XNA
[17] "JMonkey Engine" [Online]. Available: http://en.wikipedia.org/wiki/JMonkey_Engine
[18] "Welcome to our new CryENGINE® 2 website" [Online]. Available: http://www.cryengine2.com
