VR Experiment Development: 360 Video versus CGI versus Mixed Media - Comparison & Cost Analysis

April 10, 2024

What are the pros and cons of VR experiments and experiences made from 360 video versus full CGI or a mix of both? WorldViz VR experts develop the right mix for you, depending on your needs.

In this post we discuss the different "levels" or "paradigms" of VR experience development and their associated cost ranges. Before starting, this explanatory video is worth a watch as it illustrates and explains the difference between “360 Video” and “Mixed Video and 3D Environment (Rotoscoped Avatars)” really well.

360 Video

This is the most "basic" VR experience and also the most popular for training as it has some distinct advantages but also some drawbacks.

This paradigm uses 360 degree video, typically captured with a 360 degree camera. A user then views the footage in a VR headset, looking around to experience the filmed scenario in an embodied and immersive way.

Currently we recommend this cost effective camera as one of the best options: https://www.insta360.com/product/insta360-oners/1inch-360

Think of the end result as being a lot like Google Street View: you can look in all directions, but you can't actually step into the image - it's like wearing a panoramic fishbowl on your head.

Interactivity options are somewhat limited but you can still have multiple choice questions appear as a GUI menu from which a user can select different answers using their controller (a screenshot of a typical example of this is shown below). 

A user in a VR headset watches a filmed scene play out and then selects the best answer from the multiple choice menu - a correct answer triggers a video giving affirmation and positive feedback, while a wrong answer triggers a video showing the negative consequences. In this way, quite detailed branching "choose your own adventure" style experiences can be built.
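The branching logic behind such experiences is simple to model as a graph of scenes. Below is a minimal sketch in plain Python; the scene names, clip filenames, and the `choose` callback are all illustrative - in a real project this would be wired to your VR engine's video player and menu GUI:

```python
# Minimal branching-scenario graph: each node plays a clip, then asks a
# multiple-choice question whose answer selects the next node.
# Scene names, clips, and questions below are illustrative only.

SCENARIO = {
    "intro": {
        "clip": "intro_360.mp4",
        "question": "A customer looks upset. What do you do?",
        "choices": {
            "Approach and ask how you can help": "good_outcome",
            "Wait for them to come to you": "bad_outcome",
        },
    },
    "good_outcome": {"clip": "positive_feedback_360.mp4", "choices": {}},
    "bad_outcome": {"clip": "negative_consequence_360.mp4", "choices": {}},
}

def run_scenario(graph, start, choose):
    """Walk the branching graph; `choose` maps (question, options) -> option."""
    path = []
    node_name = start
    while node_name is not None:
        node = graph[node_name]
        path.append(node["clip"])  # in a real app: play this 360 clip here
        if not node["choices"]:
            break                  # terminal node, scenario over
        answer = choose(node["question"], list(node["choices"]))
        node_name = node["choices"][answer]
    return path

# Example: a participant who always picks the first menu option.
path = run_scenario(SCENARIO, "intro", lambda q, opts: opts[0])
```

The same structure scales to arbitrarily deep branches - each new scene is just another entry in the dictionary, which also makes the "carefully script everything out" step auditable before any filming happens.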

Pros
  • Highly realistic environment as it is a real-world capture - you can get very detailed scenes that would otherwise be time consuming / expensive for an artist to recreate such as a detailed store environment, an outdoor scene, etc.
  • Production costs can be relatively low compared to 3D modeling as you really just need a 360 camera, a tripod, and a basic understanding of best practices for effective 360 filming in order to produce effective results.
  • Human actors look like actual people (because they are) and you completely avoid the "uncanny valley" effect which can arise from Computer Generated (CG) human characters.

Cons
  • You can't fully reconfigure the results - once you have filmed a scene, what you have is what you have to work with, whereas with CG environments you can move objects around, change avatar characteristics and dialogue, etc.
  • Good results take competent know-how to achieve - along the lines of producing a good student film, you need good actors, script writers, a director, etc. Lighting also poses a challenge: for well-lit interior scenes, you either film everything twice with lights on one side and then the other, combining the two 180 degree shots via manual stitching, or you leave the lights visible in the 360 video behind the user, where participants can see them if they look behind them in the scene.
  • Your options for movement and interactivity are pretty limited in 360 video - you can't walk through the scene or pick objects up like you can in fully CG environments. This restricted movement can sometimes be uncomfortable for participants, so you want to ensure the camera is placed at human eye level and is completely stationary to minimize discomfort.
  • There are also more limited data collection options with 360 video for researchers - since the video is treated by the computer program as one single object, you can't parse individual objects out of the video without manually tagging them, which eliminates options for 3D measurements such as the interpersonal distance between a user and an avatar. You can take eye tracking data on a scene, but you have to manually add regions of interest if you want to track when a person is looking at a specific object (i.e. part of the video), whereas with CG, all 3D objects have inherent qualities that allow you to tag them and measure different kinds of data around them (e.g. when a user approaches an object or character, or when they interact with it).
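For researchers who do go the 360 route, manually tagged regions of interest can be approximated as angular windows on the video sphere, checked against the headset's gaze direction each frame. A minimal sketch in plain Python - the ROI names and yaw/pitch bounds are purely illustrative, not from any particular toolkit:

```python
# Each ROI is a rectangle in yaw/pitch (degrees) on the 360 video sphere.
# Names and bounds below are illustrative only.

ROIS = {
    "cashier":   {"yaw": (-30, 10),  "pitch": (-10, 25)},
    "exit_sign": {"yaw": (150, 180), "pitch": (20, 45)},
}

def gaze_roi(yaw_deg, pitch_deg, rois=ROIS):
    """Return the name of the ROI the gaze direction falls in, else None."""
    yaw = (yaw_deg + 180) % 360 - 180   # normalize to [-180, 180)
    for name, r in rois.items():
        y0, y1 = r["yaw"]
        p0, p1 = r["pitch"]
        if y0 <= yaw <= y1 and p0 <= pitch_deg <= p1:
            return name
    return None
```

Logging the returned ROI name per frame gives a simple dwell-time record for each tagged part of the video.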

Cost
This is oftentimes the lowest cost option as you really can get away with a barebones budget, especially if you can leverage internal resources at a university like film production students.

If you want to have a branching experience where participants experience "right" and "wrong" answers, some degree of programming will be required and you will have to carefully script everything out.

If you get a basic system and training / consulting from WorldViz VR, the cost for this route is around $20K ($12.5K for hardware plus 40 hours of consulting and training). You would also want to budget an additional $1,500 for a 360 camera and tripod if your film department doesn't have one you can use. You can bump up the consulting and training hours if you want more hands-on guidance. Keep in mind that at this price range, the production will still need to be done on your end, but WorldViz can help guide you through the process.

Mixed Video and 3D Environment (Rotoscoped Avatars)

This option combines some of the benefits of a full CG environment with some of the benefits of 360 video, giving you a more flexible experience - but one with its own unique benefits and drawbacks.

In short, you still film actors, but instead of filming them "in situ" you film them against a green screen and then use film editing software to create what are akin to cardboard cut-out recordings of the individual actors, which are then placed in a 3D scene. We call these "rotoscoped avatars" - the screenshot below shows an example seen slightly from the side to highlight that they are in fact flat objects in a 3D scene.

These rotoscoped avatars give you more flexibility: you still have a filmed actor (thus avoiding the expensive process of creating a full CG character), but unlike with 360 video, each actor is treated as its own 3D object, so you can experiment with placing them anywhere in the VR environment.
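Because a rotoscoped avatar is a flat quad, it is common to rotate it each frame so it keeps facing the viewer (a "billboard"), turning around the vertical axis only so the actor stays upright. A sketch of that math in plain Python - the coordinate convention (x/z as the ground plane, yaw 0 facing +z) is an assumption, and in practice your engine likely offers this as a built-in option:

```python
import math

def billboard_yaw(avatar_pos, viewer_pos):
    """Yaw (degrees) that turns a flat avatar quad to face the viewer.

    Rotates around the vertical (y) axis only, so the actor stays
    upright and doesn't tilt when the viewer crouches or stands.
    Positions are (x, y, z) tuples; x/z is the ground plane.
    """
    dx = viewer_pos[0] - avatar_pos[0]
    dz = viewer_pos[2] - avatar_pos[2]
    return math.degrees(math.atan2(dx, dz))
```

Applying the returned yaw to the quad each frame hides the "flat" edge-on view from most vantage points.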

A user will also by default feel more embodied in a fully 3D CG scene as they have the ability to move around the environment with 6 degrees of freedom - i.e. they can actually take a step and move around in the 3D world.

To create the 3D world you can either model it from scratch using a program like Blender, find a "close enough" environment from a content repository like SketchFab or TurboSquid, or do what we have done in the screenshot example below which is use photogrammetry to fully capture a real world environment through scanning and photography and then feed it into a specialized program to make a 3D photo realistic environment.

Pros
  • This technique is likely still more cost effective than a full CG project with human avatars.
  • You can reshoot different actors and thus experiment with different scenarios more easily - this still involves reshooting but instead of filming a whole scene on location you just shoot the actor in front of a green screen.
  • Same advantages as 360 in regards to human characters - i.e. you avoid the “uncanny valley”.
  • Better data collection and interactivity options than 360 as you can take data on individual avatars and objects in the scene and move around naturally.

Cons
  • If you are not careful, the 2D character can look a little odd in a 3D scene - we typically have the character at a further distance or standing behind an object like a table to give the user a better sense of depth perception and downplay the "flat 2D-ness".
  • You still need film production and editing know-how to capture the green screen actors and then translate them into rotoscoped avatars.

Cost
The cost will typically be more than the budget of a 360 video but less than a full CG production.

Much of the project budget will go to the production of your 3D environment, whether that is paying artists to build it from scratch or specialists to scan and capture a specific setting for a photogrammetry environment. 

The entry level cost is around $30K for this path, with WorldViz VR acting as a consultation and training service: our experts provide advice, training, and documentation on how to implement these techniques for your study. Alternatively, if you want WorldViz VR to produce the complete experience as a "soup-to-nuts" service, the cost would likely be in the $50K+ range, depending on length and complexity.

Full CGI / All 3D Models (including human avatar characters)

This final option is building what most people would think of as a video game experience - all elements are CGI, including fully CG human avatars. In many ways this is the gold standard and we have produced successful human interaction studies for Stanford, NIH, and University of California with this approach. 

Developing realistic human avatars is a time and labor intensive process - fortunately new AI tools are emerging that will simplify this process. 

Avatars give you the most flexibility, as you can change their appearance, demeanor, and dialogue through programming and artwork. They also give you the best options for data collection, as you can measure minute details about how a participant interacts with them.
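As a rough illustration of this flexibility, experimental conditions can be expressed as plain data that the program swaps at load time, so the same study can run with different avatar appearances or scripts. The field names and values here are hypothetical, not a WorldViz API:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class AvatarCondition:
    model: str        # which character mesh/skin to load (hypothetical asset)
    demeanor: str     # which animation set to play, e.g. "friendly"
    dialogue: tuple   # lines the avatar speaks, in order

base = AvatarCondition(
    model="clerk_female_01",
    demeanor="friendly",
    dialogue=("Hi, can I help you find anything?",),
)

# A second experimental condition: same character, different demeanor.
neutral = replace(base, demeanor="neutral")
```

Keeping conditions as immutable data like this makes it easy to log exactly which configuration each participant experienced.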

A potential drawback of avatar-based experiences is the previously mentioned "uncanny valley" effect, where some participants become distracted by the unreal aspects of a nearly human figure.

Below is a screenshot from Susan Persky's NIH research which we helped develop from an art and programming perspective. It gives a good representation of typical VR research project avatar interaction experiences.

Pros
  • Highly flexible environment that can adapt over time.
  • Users are able to freely move and interact within the virtual scene.
  • Best data collection options in terms of tracking objects and participants interactions.
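As one concrete example of such a measurement, the interpersonal distance mentioned earlier can be logged per frame and summarized afterwards, because every avatar has a true 3D position. A minimal sketch - the sampling format and the 1.2 m threshold are assumptions for illustration, not a WorldViz API:

```python
import math

def proximity_time(samples, threshold=1.2):
    """Total time (seconds) the user spent within `threshold` meters
    of an avatar.

    `samples` is a list of (timestamp, user_pos, avatar_pos) tuples
    with (x, y, z) positions, assumed to be evenly spaced in time.
    The 1.2 m threshold is an illustrative "personal space" boundary.
    """
    if len(samples) < 2:
        return 0.0
    dt = samples[1][0] - samples[0][0]  # sampling interval
    return sum(dt for t, user, avatar in samples
               if math.dist(user, avatar) <= threshold)
```

The same per-frame log can feed other measures, such as minimum approach distance or time spent facing the avatar.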

Cons
  • High production costs.
  • Users can still be put off by CG avatars.

Cost
For the entry level, estimate $30K for hardware, training, and consulting, assuming you have internal resources capable of both the 3D graphics and the programming needed to build an effective experience - which could take 6-12 months, depending on how involved the project is.

For WorldViz VR to build a soup-to-nuts CG based trainer with human avatars, the minimum is around $75K, with much of the cost going to human avatar animation. Utilizing AI and other techniques such as motion capture can mitigate these costs.

Are you interested in a personal consultation with a WorldViz VR expert? Please contact us at sales@worldviz.com
