
VR for Group Forming Broadcasting

Published on Dec 19, 2017


The group we belong to distorts our perception of reality. This was the conclusion of “They Saw a Game”, a seminal 1954 case study of confirmation bias. The study documents the reactions of Princeton and Dartmouth students to a rather dirty football game in which both quarterbacks were removed from the game due to heavy injuries. It showed that participants’ judgments about who started the rough play were colored by which school they attended. This bias extended to details such as the number of rule infractions each team was seen to commit.

So how can two people see the same game but take away completely different realities? If we could get two groups of people to watch the same game and agree on a shared reality, we could find a way to depolarize populations.

Fast forward to today, where sporting and political events are broadcast for two audiences, home and away (or conservative and liberal). There are usually at least two broadcast trucks present at live events, each with its own production staff, engineers, and distribution capacity. Both trucks have access to the same camera feeds, although each serves a wildly different audience. Thus stories, shots, and commentary are colored by the interests of each audience. This channelization polarizes an audience into two groups, home and away. The traditional broadcasting architecture lends itself to picking a side. This style of broadcasting can be viewed below.

However, audiences are not easily sorted into two categories. There are those who follow fantasy leagues, in which they receive points based on the performance of individual players spread throughout the league. There are those who are interested in the fashion and celebrity appearances at the games. There are even those who are not interested in sports at all, but follow the live event for the commercials. What if, from a single event, it were possible to produce content that satisfies the complex interests of all these audiences? What if there were a different production truck for each of these interests? This is shown in the diagram below.

By enabling more groups to form, we can take an audience that is normally polarized along one axis, and allow it to form more complex and cross-cutting groups. By showing an event in its full complexity, or at least giving the audience the option to see it as such, one can begin to see a shared reality and reduce the tribalism associated with traditional broadcasting.

In order to enable a multitude of broadcasts from the same live event, one would traditionally require many production trucks, each with its own staff. We observe, however, that much of the raw footage, camera feeds, and clips can be shared among these trucks. We aim to ease the friction of producing a broadcast by creating a production truck that is easy to set up, customize, and share content with. We also observe that advances in VR enable us to make experiences that are larger than the screen of an app or a website. We propose, and have developed, a VR broadcasting truck that enables fast and inexpensive pop-up broadcasts for events. By reducing the cost of producing a broadcast, we hope to enable group-forming broadcast networks.


WhatsApp has shown that group-forming networks can be more engaging and expansive than phone or Facebook networks. The value of a traditional broadcast network grows linearly with the number of participants tuning in. The value of a phone network grows with the square of the number of participants. The number of possible groups, however, grows exponentially with the number of members in the network (on the order of 2^N for N members). This is illustrated below.
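As a rough check on these growth rates, the sketch below counts connections for each kind of network. It assumes that a "group" is any subset of two or more members, which is the usual reading of the group-forming argument; the function names are illustrative and not part of Viralcasting. Summing the binomial coefficients C(N, k) for k = 2..N gives the closed form 2^N - N - 1.

```python
from math import comb


def broadcast_links(n: int) -> int:
    """One-to-many broadcast: one connection per viewer."""
    return n


def pairwise_links(n: int) -> int:
    """Phone-style network: every unordered pair of members."""
    return n * (n - 1) // 2


def possible_groups(n: int) -> int:
    """Every subset with at least two members, summed by subset size."""
    return sum(comb(n, k) for k in range(2, n + 1))


def possible_groups_closed_form(n: int) -> int:
    """All 2**n subsets, minus the empty set and the n single-member sets."""
    return 2 ** n - n - 1


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        # The summed form and the closed form must agree.
        assert possible_groups(n) == possible_groups_closed_form(n)
        print(n, broadcast_links(n), pairwise_links(n), possible_groups(n))
```

For ten members this gives 10 broadcast links, 45 pairs, and 1013 possible groups, so the group count quickly dominates the other two.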

Previous work in this space by Dan Sawada enabled users to produce their own newscast. In this interface, users could produce their own daily-show program in which they incorporated clips from the news cycle. More broadly, YouTube and meme-ification on WhatsApp have demonstrated that this idea can work well for viralizing and redistributing content.

We were also inspired by emerging social media accounts such as House of Highlights on Instagram. House of Highlights is a popular Instagram account that produces highlight clips of NBA games. The account doesn’t focus on any individual team. This is a successful example of amateur broadcasting.

Viralcasting Studio:

By making broadcasting easy and accessible, we can enable more broadcasts out of the same events, leading to more stories than the two polarizing ones we see. We developed a VR broadcasting system which puts the broadcast truck on your head and makes it easier for anyone to produce a broadcast. We provide tools by which a producer can draw upon images, graphics, data, and live cameras to create a video stream equivalent to a broadcast. The system permits several people to cooperate and is designed to work for organized sports as well as a pop-up studio for breaking news or live events. We build the space in virtual reality and operate it through a VR/AR head-mounted display. Viralcasting explores using mixed reality as a tool for live, collaborative content creation.

The main advantages of the Viralcasting studio over a broadcast truck are:

  • customizability - multiple screens that you can position anywhere

  • direct manipulation - graphics can be moved and resized on screen

  • collaborative - clips can be shared

  • social media integration

The Viralcasting studio is implemented as an application for the Vive VR system. This system has two controllers and a headset that are tracked in 3D with sub-mm accuracy. The only equipment required is a VR headset and a PC. The system is compatible with saved videos, and will soon be extended to live video stream inputs.

This is demonstrated in this high-level diagram.

This culminates in the VR studio visible below:

A video presentation of this is here:


The VR system has been tested with up to 8 different video inputs. Each input can be placed on as many moveable screens as the user wants. The user can position these screens anywhere in the space in 3D. This enables the user to customize their space for efficiency or to use their spatial memory to locate a specific clip.

In the studio, the user can manipulate the output of regular cameras, or use new wide-field cameras which capture the whole field at once. In this mode, the user can select the window within this field that is of interest. The window responds to both scale and position through a two-handed pinch/zoom.
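One way the two-handed pinch/zoom could be mapped onto a crop window is sketched below. This is not the Viralcasting implementation: it assumes the two controller positions have already been projected into pixel coordinates of the wide-field frame, that spreading the hands enlarges the window, and that the window keeps a fixed aspect ratio. `Window`, `pinch_zoom`, and `min_width` are hypothetical names.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Window:
    cx: float              # window centre, in wide-field frame pixels
    cy: float
    width: float           # window width, in pixels
    aspect: float = 16 / 9  # width / height, kept fixed while zooming


def pinch_zoom(start, hands_start, hands_now, frame_w, frame_h, min_width=64.0):
    """Return the window after a two-handed gesture.

    hands_start / hands_now are ((x, y), (x, y)) controller positions,
    already projected into frame pixels (an assumption of this sketch).
    """
    (ax0, ay0), (bx0, by0) = hands_start
    (ax1, ay1), (bx1, by1) = hands_now

    # Scale follows the change in distance between the two hands.
    d0 = hypot(bx0 - ax0, by0 - ay0)
    d1 = hypot(bx1 - ax1, by1 - ay1)
    scale = d1 / d0 if d0 > 0 else 1.0

    # Position follows the movement of the midpoint between the hands.
    dx = (ax1 + bx1) / 2 - (ax0 + bx0) / 2
    dy = (ay1 + by1) / 2 - (ay0 + by0) / 2

    # Keep the window no larger than the frame and no smaller than min_width.
    max_width = min(frame_w, frame_h * start.aspect)
    width = min(max(start.width * scale, min_width), max_width)
    height = width / start.aspect

    # Clamp the centre so the whole window stays inside the frame.
    cx = min(max(start.cx + dx, width / 2), frame_w - width / 2)
    cy = min(max(start.cy + dy, height / 2), frame_h - height / 2)
    return Window(cx, cy, width, start.aspect)
```

Computing the result from the gesture's starting state, rather than accumulating small per-frame updates, avoids drift while the user holds the pinch.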

Video Inputs:

When the user steps into the Viralcasting studio, they have a number of small screens in front of them and one large main screen. The smaller screens show the video input feeds. Clicking on any of the smaller screens with the thumb-pad activates that feed and sends it to the main broadcast screen. This is the “program” monitor, the feed that gets transmitted.
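The relationship between input screens and the program monitor can be sketched as a small switcher. This is an illustrative model only: `Studio`, `add_screen`, and `click` are hypothetical names, and the real system routes video rather than feed names.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Studio:
    # screen id -> name of the input feed shown on that screen
    screens: Dict[str, str] = field(default_factory=dict)
    # feed currently on the program monitor (the one being transmitted)
    program: Optional[str] = None

    def add_screen(self, screen_id: str, feed: str) -> None:
        """Place a feed on a new screen; one feed may appear on many screens."""
        self.screens[screen_id] = feed

    def click(self, screen_id: str) -> str:
        """Send the clicked screen's feed to the program monitor."""
        feed = self.screens[screen_id]  # raises KeyError for unknown screens
        self.program = feed
        return feed


if __name__ == "__main__":
    studio = Studio()
    studio.add_screen("left", "cam1")
    studio.add_screen("right", "cam2")
    studio.click("right")
    print(studio.program)
```

Keeping screens separate from feeds reflects the text above: the same input can sit on several screens, and any of them can put it on air.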

On-Screen Graphics:

On the main screen, the user can directly manipulate the graphics that will be transmitted. When the user clicks and grabs a graphic, it moves along a grid on the screen. The graphic snaps to the grid to make it easier for the user to place it in a suitable position. By clicking on the right or left thumb-pad, the user can change the size of the graphic element. The user can load any .JPG or .PNG file as a graphic element.
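Grid snapping of this kind can be sketched as follows. The screen size, the 40-pixel cell, and the function names are assumptions made for illustration; the source does not state the grid spacing used in the studio.

```python
from math import floor


def snap(value: float, cell: int) -> int:
    """Round a coordinate to the nearest grid line (halves round up)."""
    return floor(value / cell + 0.5) * cell


def place_graphic(x, y, graphic_w, graphic_h,
                  screen_w=1920, screen_h=1080, cell=40):
    """Return the snapped top-left corner of a graphic, kept on screen.

    x, y are the raw top-left position from the controller, in pixels.
    graphic_w, graphic_h are the graphic's size in whole pixels.
    """
    # Largest grid-aligned corner that still keeps the graphic fully visible.
    max_x = (screen_w - graphic_w) // cell * cell
    max_y = (screen_h - graphic_h) // cell * cell
    sx = min(max(snap(x, cell), 0), max_x)
    sy = min(max(snap(y, cell), 0), max_y)
    return sx, sy


if __name__ == "__main__":
    print(place_graphic(613.2, 287.9, 400, 200))
```

A coarser cell makes placement faster but less precise; the cell size is the main parameter to tune for a given headset and screen distance.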


Viralcasting attempts to close the loop with the viewer of the broadcast by directly incorporating social media. In the studio, there is a Twitter widget which enables the broadcaster to scroll through the latest tweets. When the user finds a tweet they would like to put on the main screen, they can select it in the same way that they would select an input feed. The tweet is then pushed to the main screen.


Underneath every video input is a button with three bars. When the user clicks this button, a menu pops up with a set of thumbnails. Each thumbnail represents a clip that the user can put on the main screen. The user can scroll through the clips with the thumb-pad to select the appropriate one. When a clip is selected, it pops into a mini-screen to the right of the menu, which lets the user preview the clip in GIF form. In addition to displaying this mini-clip on the main screen, the user can click on the share button, which creates a copy of the clip that the user can pick up and move around. This enables a division of tasks, in which one user selects clips and shares them with other users, while other users focus only on switching between live feeds.

The decision to split a live feed into a set of thumbnail clips was made in order to optimize the VR experience. While the 3D-tracked controllers in VR are great for spatially placing and moving objects, they are not well suited to setting the start and stop times on a video timeline. On-surface interaction using a mouse is faster and more accurate for setting clip start and end times. Despite the superiority of a mouse for this task, searching through a video in VR by looking at visual thumbnails through time is an engaging way of going about it.
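A minimal sketch of turning a feed into thumbnail-sized clips is shown below. It assumes fixed-length segments, which is a simplification chosen for this example; the source does not say how clip boundaries are chosen. `segment_feed` and `clip_len_s` are hypothetical names.

```python
from typing import List, Tuple


def segment_feed(duration_s: float, clip_len_s: float = 10.0) -> List[Tuple[float, float]]:
    """Split a feed of the given duration into (start, end) clips in seconds.

    Every clip has length clip_len_s except possibly the last, which is
    shortened to end exactly at duration_s.
    """
    if clip_len_s <= 0:
        raise ValueError("clip_len_s must be positive")
    clips = []
    start = 0.0
    while start < duration_s:
        end = min(start + clip_len_s, duration_s)
        clips.append((start, end))
        start = end
    return clips


if __name__ == "__main__":
    # Each (start, end) pair would back one thumbnail in the clip menu.
    for start, end in segment_feed(25.0, 10.0):
        print(f"clip {start:.0f}s to {end:.0f}s")
```

With boundaries precomputed like this, the user only has to pick a thumbnail, which suits the controllers better than scrubbing a timeline.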

Andrew Lippman:

The equation is correct as far as it goes, but you need to solve it at least to show whether it is proportional to x**n or otherwise.

Andrew Lippman:

Mitigating effect. Distortion is too strong a word.

Likewise, rather dirty might be put as hotly contested.

Audiences are not easily categorized: yes, doing so is a rather blunt instrument, but you are losing the point here. You want people to share the same reality even when they have diverse interests. The paragraph as it is written says the opposite: we should see many games. What you are getting at is clearer in the next paragraph.

When we each get a segregated view, with little overlap in the representation or accompanying opinion, we strengthen the segregation; when we overlap them, we emphasize the shared experience while still allowing a diversity of perspectives.

There are inherent groups around any event, just as you say. Each has its own interest in watching the event. The key challenge, therefore, is to discover whether such groups foster unbreakable bubbles or whether the overlaps can unify our understanding of the facts. That is a question, not an answer, and your work is an attempt to clarify the discussion of it.

There is evidence that this can work. The conventional wisdom until recently was that the Internet, by allowing so many bubbles, was isolating the groups that quite naturally formed. Recent studies (cite Boxell) have shown that the reverse is closer to the truth: where there is no easy way to access diverse views, opinions tend to represent the dominant one provided by a dominant source, i.e., broadcast news and talk radio. Whereas the Internet-rich, younger communities may have seemingly hardened viewpoints, they are at least more exposed to differing presentations of daily events.

A physical analogy with respect to sports might be the difference between watching a game in a sports bar in each of the home towns, versus watching it in a neutral place, versus watching it in the home town of the other side. Mere exposure to this diversity of opinion might present a less parochial view. You could look to a correlation between those who travel and their tolerance of other ethnicities as an example. I am sure you could find references to some analogy like this.

Note that the above opens the door for you to bind the talk radio work with the sports broadcasting work. Part of your contribution is the framework to explore the underlying situation as well as suggest options to improve it.

When you reference Sawada, you also need to add the kid who Turner adopted who actually did a real version of it, using contributions that were sent to him. Sawada attacked the realtime aspect — the goal of his work was to create an alternative view that was minimally delayed from the real event yet could draw upon local resources and network feeds to present the perspective of the re-broadcaster.

Not sure, but I think the other kid tried to broaden and make more universal the whole event, post hoc.

Now you return to GFB. The idea, attributable to David Reed, is well presented mathematically, but you need to connect this with the goal. Facebook is the textbook example of group-forming. Predecessors include chat rooms, which were immensely popular (proportionately) when the Internet was smaller (look at Howard Rheingold’s earlier work). While Howard’s examples are groups, and they make the Internet different and more valuable than a telephone network, the switching cost was still high. This proved the concept of GFNs but stopped short of considering the positive and negative secondary social effects. Groups were better than not, but they had not scaled to the point of dividing society until Facebook caused us to examine that question.

I am not sure how to factor in the influence of niche broadcasting as Fox was when it started. I am sure there are others who address it well. You might point to that and leave it to the reader to explore more.

Finally, return to the technology: it reduces the switching costs and makes a multiplicity of telecasts possible.