Mixed Reality Privacy Threats and Open Research Directions
Note: This is an evolving document and subject to change. Comments are welcome either in the document or emailed to dhosfelt@mozilla.com


Mixed reality technologies are quickly evolving from research to mainstream, opening up questions of consumer privacy and security, as well as ethics. In addition to collecting personal data such as location, search queries, and verbal communication, these technologies can also collect large amounts of data about nonverbal behavior. While researchers are governed by institutional review boards (IRBs) and ethics guidelines, commercial manufacturers have no such limits, raising the question of how researchers can help influence more ethical and privacy-preserving technologies.

How can researchers help influence more ethical and privacy-preserving technologies?
The solution I propose is bipartite:
  1. Educate consumers and policymakers on risks and make recommendations
  2. Investigate technical mitigations for privacy threats

Both facets rely on first understanding the privacy threat landscape, which is the primary purpose of this document. First, we must understand and collate the current knowledge (+Open MR security questions: Background), and then identify the next avenues of research (+Open MR security questions: What-are-some-research-questio).

Roesner et al. propose three aspects for protection in AR: input, data access, and output. Most of the concerns listed here fall into the data access category, as MR applications require access to a variety of sensor data, which poses privacy threats to their users. However, it's important to note that MR applications will also face the same input validation and sanitization challenges as conventional applications. Additionally, users will need to place a great deal of trust in applications that can obscure or overlay their real-world senses, especially in the face of attacks such as the human joystick attack, wherein a user is manipulated into moving to a predefined physical location without their knowledge, and the overlay attack, wherein unwanted content, such as a fake road sign or a banner ad, is overlaid on a user's MR view.

The purpose of this document is to provide an overview of the data access threats and prompt research questions for the privacy and MR research communities. In order to recommend technical and regulatory mitigations, we need to have a full grasp of the range of the privacy problems. There’s a wide range of research left to do to inform mitigations and recommendations.

I’ve left this document as an editable space to expand our knowledge base and list of open questions that can reasonably be worked on at the moment. I’m happy to include additional background research that people recommend, but this isn’t intended to be a full survey paper, merely a description of what we know is possible.

Eye Tracking
Eye tracking captures a particularly powerful form of nonverbal communication, allowing complex and invasive inferences into an individual's psychology and identity. As eye trackers are now being fitted into mass-market devices, it's critical that we understand their potential privacy threats and propose mitigations. Generally, eye tracking measures the following metrics:
  • fixation: the moments that the eyes stop scanning the scene, holding the central foveal vision in place so the visual system can take in detailed information about what is being looked at
  • saccade: quick eye movements occurring between fixations
  • scanpath: path of eye movement during eye tracking
  • blink rate: blink rate can be used as an index of cognitive workload, but is also determined by other factors such as ambient light levels
  • pupil size: like blink rate, this is determined by ambient light levels, but also provides insight into the level of arousal caused by a scene
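To make the fixation and saccade definitions above concrete, here is a minimal sketch of velocity-threshold classification, one common way of separating the two in raw gaze samples. It is illustrative only: the function name, the `(t, x, y)` sample layout (seconds and gaze angles in degrees), the 30 deg/s threshold, and the 100 ms minimum duration are all assumptions of mine, not values taken from any particular eye tracker or SDK.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Fixation:
    start: float  # time of first sample in the fixation (seconds)
    end: float    # time of last sample in the fixation (seconds)
    x: float      # mean horizontal gaze angle (degrees)
    y: float      # mean vertical gaze angle (degrees)


def detect_fixations(
    samples: List[Tuple[float, float, float]],
    velocity_threshold: float = 30.0,  # deg/s, assumed cutoff
    min_duration: float = 0.1,         # seconds, assumed minimum
) -> List[Fixation]:
    """Group slow-moving gaze samples into fixations.

    Sample-to-sample movement faster than the threshold is treated
    as a saccade and ends the current fixation.
    """
    fixations: List[Fixation] = []
    current: List[Tuple[float, float, float]] = []

    def flush() -> None:
        # Keep the accumulated run only if it lasted long enough.
        if current and current[-1][0] - current[0][0] >= min_duration:
            n = len(current)
            fixations.append(Fixation(
                start=current[0][0],
                end=current[-1][0],
                x=sum(s[1] for s in current) / n,
                y=sum(s[2] for s in current) / n,
            ))
        current.clear()

    for prev, cur in zip(samples, samples[1:]):
        dt = cur[0] - prev[0]
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        velocity = math.hypot(cur[1] - prev[1], cur[2] - prev[2]) / dt
        if velocity < velocity_threshold:
            if not current:
                current.append(prev)
            current.append(cur)
        else:
            flush()  # a saccade ends the fixation in progress
    flush()
    return fixations
```

The ordered list this returns is effectively a scanpath. The point for privacy purposes is that very little processing is needed to turn a raw gaze stream into what a user looked at, in what order, and for how long.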

Gaze tracking is a particular focus of this document as it is the least studied in the privacy literature and offers the most privacy threat vectors, ranging from data inference threats to shaping future behaviors and identification threats.

Gait tracking
Gait is “an idiosyncratic feature of a person that is determined by, among other things, an individual’s weight, limb length, footwear, and posture combined with characteristic motion. Hence, gait can be used as a biometric measure to recognize known persons and classify unknown subjects.”

The gait cycle is the period from an initial contact of one foot to the following initial contact of the same foot, which is divided into three main tasks and eight phases.
Techniques for identifying users via gait in the literature mostly fall into three categories: floor-based, observation-based, and sensor-based. Newer sensor-based techniques are able to use the accelerometer in users' phones to identify them via their gait.
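As a rough sketch of how little a sensor-based approach needs to get started, the following estimates gait cycle durations from accelerometer magnitude. It assumes each step shows up as a local peak above a threshold, and that a full cycle (same foot to same foot) spans two steps. The function names, the threshold, and the toy signal are hypothetical; the published gait recognition systems use far richer features than this.

```python
from typing import List, Sequence


def detect_steps(times: Sequence[float],
                 accel: Sequence[float],
                 threshold: float) -> List[float]:
    """Return timestamps of local maxima in `accel` above `threshold`.

    Each such peak is treated as one foot contact (an assumption;
    real signals need filtering first).
    """
    steps = []
    for i in range(1, len(accel) - 1):
        is_peak = accel[i] > accel[i - 1] and accel[i] >= accel[i + 1]
        if is_peak and accel[i] > threshold:
            steps.append(times[i])
    return steps


def gait_cycle_durations(step_times: Sequence[float]) -> List[float]:
    """Duration of each gait cycle.

    Steps alternate between feet, so contacts of the same foot are
    two steps apart.
    """
    return [step_times[i + 2] - step_times[i]
            for i in range(len(step_times) - 2)]


# Toy signal sampled at 10 Hz: baseline near gravity with a spike
# every 0.5 s standing in for a foot contact.
times = [i * 0.1 for i in range(21)]
accel = [13.0 if i in (2, 7, 12, 17) else 9.8 for i in range(21)]

steps = detect_steps(times, accel, threshold=11.0)
cycles = gait_cycle_durations(steps)
```

Cycle duration is only one of many features, but it illustrates why an always-on inertial sensor in a phone or headset is enough to begin profiling how someone walks.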

Hand tracking and gesture recognition
Gesture recognition refers to tracking human gestures, matching them to a representation and converting them to semantically meaningful commands. For MR applications, the focus is on vision-based methods.

Hand tracking is also of particular interest in MR, especially given that hand tracking capabilities have now been released for the Oculus Quest.

Head pose estimation
Head pose is described by three rotational degrees of freedom: pitch, roll, and yaw. Much of the literature is vision-based, but in MR we're generally more concerned with sensor-based approaches, since the concern is what can be inferred about a user wearing a head-mounted display (HMD). Bystander privacy will remain an open, general problem.

Head pose estimation can also be used as a proxy for gaze estimation.
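A minimal sketch of why head pose can stand in for gaze: yaw and pitch alone give a forward direction, which can then be compared against the direction to an object in the scene. The coordinate convention here is an assumption (right-handed, +Y up, looking along -Z at zero rotation, positive yaw turning left, positive pitch looking up), and the function names are hypothetical. Roll is ignored because it does not change where the head points.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def head_forward(yaw_deg: float, pitch_deg: float) -> Vec3:
    """Unit vector the head is facing, under the assumed convention."""
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    return (
        -math.sin(yaw) * math.cos(pitch),
        math.sin(pitch),
        -math.cos(yaw) * math.cos(pitch),
    )


def angle_to_target_deg(forward: Vec3, head_pos: Vec3,
                        target_pos: Vec3) -> float:
    """Angle between the facing direction and the direction to a target."""
    to_target = tuple(t - h for t, h in zip(target_pos, head_pos))
    norm = math.sqrt(sum(c * c for c in to_target))
    if norm == 0:
        raise ValueError("target coincides with head position")
    cos_angle = sum(f * c for f, c in zip(forward, to_target)) / norm
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))
```

An application that logs this angle for each object in a scene gets a coarse attention signal without any eye tracker, which is why head pose data deserves similar scrutiny to gaze data.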


Data inferences

Gaze allows inferences into the following (non-exhaustive) medical and psychological conditions: