Mixed Reality Privacy Threats and Open Research Directions
Note: This is an evolving document and subject to change. Comments are welcome either in the document or emailed to dhosfelt@mozilla.com
Introduction
Mixed reality technologies are quickly evolving from research to mainstream, opening up questions of consumer privacy, security, and ethics. In addition to collecting personal data such as location, search queries, and verbal communication, these technologies can also collect large amounts of nonverbal behavior. While researchers are governed by institutional review boards (IRBs) and ethics guidelines, commercial manufacturers face no such limits.
How can researchers help influence more ethical and privacy-preserving technologies?
The solution I propose has two parts:
Educate consumers and policymakers on risks and make recommendations
Investigate technical mitigations for privacy threats
Roesner et al. propose three aspects of protection in AR: input, data access, and output. Most of the concerns listed here fall into the category of data access, since MR applications require access to a variety of sensor data that poses privacy threats to the user. However, it's important to note that MR applications will also face input validation and sanitization challenges similar to those of conventional applications. Additionally, users will need to place extreme amounts of trust in applications that can obscure or overlay their real-world senses, especially in the face of attacks such as the human joystick attack, in which a participant is manipulated into moving to a predefined physical location without their knowledge, and the overlay attack, in which unwanted content (for example, a fake road sign or a banner ad) is overlaid on a participant's MR view.
The purpose of this document is to provide an overview of the data access threats and prompt research questions for the privacy and MR research communities. In order to recommend technical and regulatory mitigations, we first need a full grasp of the privacy problems, and a wide range of research remains to be done to inform those mitigations and recommendations.
I’ve left this document as an editable space to expand our knowledge base and list of open questions that can reasonably be worked on at the moment. I’m happy to include additional background research that people recommend, but this isn’t intended to be a full survey paper, merely a description of what we know is possible.
Eye Tracking
Eye tracking captures a particularly powerful form of nonverbal behavior, allowing complex and invasive inferences into an individual's psychology and identity. As eye trackers are now being fitted into mass market devices, it's critical that we understand their potential privacy threats and propose mitigations. Generally, eye tracking measures the following metrics (a sketch of how the first two are derived from raw gaze samples follows this list):
fixation: the moments that the eyes stop scanning the scene, holding the central foveal vision in place so the visual system can take in detailed information about what is being looked at
saccade: quick eye movements occurring between fixations
scanpath: path of eye movement during eye tracking
blink rate: can be used as an index of cognitive workload, but is also determined by other factors such as ambient light levels
pupil size: like blink rate, this is determined by ambient light levels, but also provides insight into the level of arousal caused by a scene
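To make the first two metrics concrete, here is a minimal sketch of a velocity-threshold (I-VT) classifier that splits a stream of gaze samples into fixations and saccades. This is illustrative only, not drawn from any particular eye tracker's API: the sample format, the 30 degrees-per-second threshold, and the helper names are assumptions.

# Minimal I-VT (velocity-threshold) sketch: classify gaze samples into
# fixations and saccades. Sample format and threshold are assumptions.
import numpy as np

def classify_gaze(timestamps, x_deg, y_deg, velocity_threshold=30.0):
    """timestamps in seconds; x_deg/y_deg are gaze angles in degrees.
    Returns a boolean array: True where the sample belongs to a fixation."""
    dt = np.diff(timestamps)
    dx = np.diff(x_deg)
    dy = np.diff(y_deg)
    velocity = np.hypot(dx, dy) / dt          # angular velocity, deg/s
    return np.concatenate([[True], velocity < velocity_threshold])

def fixation_durations(timestamps, is_fixation):
    """Group consecutive fixation samples and return their durations (s)."""
    durations, start = [], None
    for i, fix in enumerate(is_fixation):
        if fix and start is None:
            start = timestamps[i]
        elif not fix and start is not None:
            durations.append(timestamps[i] - start)
            start = None
    if start is not None:
        durations.append(timestamps[-1] - start)
    return durations

The scanpath is then simply the ordered sequence of fixation locations, while blink rate and pupil size come directly from the tracker's per-sample pupil signal.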
Gaze tracking is a particular focus of this document because it is the least studied in the privacy literature and offers the most privacy threat vectors, ranging from data inference and identification threats to the shaping of future behavior.
Gait
The gait cycle is the period from the initial contact of one foot to the following initial contact of the same foot; it is divided into three main tasks and eight phases.
Søndrål, T. Using the human gait for authentication. Master's thesis, Gjøvik University College, 2005.
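As a rough illustration of why gait is usable as a biometric, the sketch below segments gait cycles from a wearable accelerometer by detecting heel-strike peaks in the acceleration magnitude. It is not a reproduction of the cited thesis; the sampling rate, peak parameters, and the choice of feature are illustrative assumptions.

# Sketch: segment gait cycles from accelerometer magnitude by detecting
# heel-strike peaks, then summarize cycle timing as a simple gait feature.
# Sampling rate and peak parameters are illustrative assumptions.
import numpy as np
from scipy.signal import find_peaks

def gait_cycle_times(accel_xyz, sample_rate_hz=100.0):
    """accel_xyz: (N, 3) array of accelerometer samples.
    Returns durations (s) of successive gait cycles (same-foot strikes)."""
    magnitude = np.linalg.norm(accel_xyz, axis=1)
    # Heel strikes show up as prominent peaks; require ~0.4 s between steps.
    peaks, _ = find_peaks(magnitude,
                          distance=int(0.4 * sample_rate_hz),
                          prominence=1.0)
    step_times = peaks / sample_rate_hz
    # A full gait cycle spans two successive contacts of the same foot,
    # i.e. every other detected step.
    return np.diff(step_times[::2])

Statistics over these cycle durations (and the acceleration waveform within each cycle) are the kind of features used to distinguish individuals by their walk.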
Gesture recognition
Gesture recognition refers to tracking human gestures, matching them to a representation, and converting them to semantically meaningful commands. For MR applications, the focus is on vision-based methods.
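One simple way to make the "matching them to a representation" step concrete is template matching with dynamic time warping (DTW) over a tracked trajectory. The sketch below is a generic illustration, not any particular product's recognizer; the template set and distance threshold are assumptions.

# Sketch: match a tracked gesture trajectory against stored templates with
# dynamic time warping (DTW), one simple way to map motion to a command.
# The template set and distance threshold are illustrative assumptions.
import numpy as np

def dtw_distance(a, b):
    """a, b: (N, D) and (M, D) trajectories (e.g. hand positions over time)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]

def recognize_gesture(trajectory, templates, threshold=5.0):
    """templates: dict mapping command name -> template trajectory.
    Returns the best-matching command, or None if nothing is close enough."""
    best_name, best_dist = None, np.inf
    for name, template in templates.items():
        d = dtw_distance(np.asarray(trajectory), np.asarray(template))
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist < threshold else None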
Hand tracking
Hand tracking is also of particular interest in MR, especially given that the Oculus Quest has now released hand tracking capabilities.
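To show the kind of raw signal a vision-based hand tracker exposes (and which both gesture recognition and behavioral inference start from), here is a sketch using MediaPipe Hands, chosen purely as an off-the-shelf example rather than what the Quest uses; the file path and confidence threshold are assumptions, and the API shown is the classic "solutions" interface, which may differ across library versions.

# Sketch: extract per-frame hand landmarks with an off-the-shelf
# vision-based tracker (MediaPipe Hands). Illustrative example only.
import cv2
import mediapipe as mp

def hand_landmarks(image_path):
    """Returns a list of hands; each hand is a list of 21 (x, y, z) tuples
    in normalized image coordinates."""
    image_bgr = cv2.imread(image_path)
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    with mp.solutions.hands.Hands(static_image_mode=True,
                                  max_num_hands=2,
                                  min_detection_confidence=0.5) as hands:
        results = hands.process(image_rgb)
    if not results.multi_hand_landmarks:
        return []
    return [[(lm.x, lm.y, lm.z) for lm in hand.landmark]
            for hand in results.multi_hand_landmarks]

Even this minimal output (21 3D landmarks per hand, per frame) is enough to feed the gesture matcher above or to infer things like handedness, tremor, or motor impairments.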
Head pose estimation
Head pose is estimated via three degrees of freedom (pitch, roll, and yaw). Much of the literature is vision-based, but in MR, we're generally more concerned with sensor-based approaches (since we're concerned with what we can tell about a user wearing an HMD; bystander privacy will remain an open, general problem).
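For a sense of how sensor-based estimation works, here is a minimal sketch using an HMD's IMU: pitch and roll from the gravity direction in the accelerometer, yaw by integrating the gyroscope (which drifts without a magnetometer or a fusion filter). The axis conventions and units are assumptions for illustration.

# Sketch: sensor-based head pose (pitch, roll, yaw) from HMD IMU data.
# Pitch/roll come from the gravity direction in the accelerometer; yaw is
# integrated from the gyroscope and drifts without further sensor fusion.
# Axis conventions and units are illustrative assumptions.
import numpy as np

def pitch_roll_from_accel(ax, ay, az):
    """Accelerometer reading (in g or m/s^2) while roughly stationary."""
    pitch = np.arctan2(-ax, np.hypot(ay, az))
    roll = np.arctan2(ay, az)
    return np.degrees(pitch), np.degrees(roll)

def integrate_yaw(gyro_z_deg_s, dt_s, yaw0_deg=0.0):
    """gyro_z_deg_s: per-sample angular rate about the vertical axis (deg/s);
    dt_s: sample interval in seconds. Returns the yaw estimate per sample."""
    return yaw0_deg + np.cumsum(np.asarray(gyro_z_deg_s) * dt_s)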