Formal performance guarantees for an approach to human in the loop robot missions

A key challenge in the automatic verification of robot mission software, especially critical mission software, is to be able to effectively model the performance of a human operator and factor that into the formal performance guarantees for the mission. We present a novel approach to modelling the skill level of the operator and integrating it into automatic verification using a linear Gaussian model parameterized by experimental calibration. Our approach allows us to model different skill levels directly in terms of the behavior of the lumped, robot plus operator, system. Using MissionLab and VIPARS (a behavior-based robot mission verification module), we present a comparison of our predicted performance guarantees for two missions in which a teleoperated quadrotor identifies a target for an autonomous ground robot to intercept: one mission in which the operator flies the quadrotor by line of sight to locate the target and one where the operator flies the quadrotor using its video feed. We demonstrate the effectiveness of our approach by comparing predicted performance to experimentally measured performance.


I. INTRODUCTION
Deploying an effective team of robots to search for, identify, and neutralize a hidden chemical, biological, or nuclear weapon of mass destruction will likely involve the use of multiple robots of different capabilities directed in some part by human operators. Due to the great potential for loss of life in such a situation, it is important to be able to establish a-priori performance guarantees for the mission. A key challenge is to model the performance of the human operator and factor this into the formal performance guarantees for the mission.
While research in automatic verification of robot software has typically followed the methods used in the general purpose software verification field [1], we have developed a novel approach to efficiently establish probabilistic performance guarantees for behavior-based robot software operating in uncertain environments [2]. A software module based on the approach, VIPARS, has been used to establish probabilistic performance guarantees for multiple-robot missions [3], for missions with probabilistic obstacle information [4], as well as for missions using probabilistic localization software [5]. We have also studied the usability of our system [6].
In this paper, we address the challenge of establishing formal performance guarantees for an approach to a multirobot mission in which a quadrotor is piloted by a human operator to search an area for a target, and once it is found, a ground robot is automatically directed to acquire the target. We present a novel approach to modelling the skill level of the operator and integrating it into automatic verification.
The paper is organized as follows: Section II presents a literature review. Section III describes the multiple robot mission, the role of the operator in the mission, and the required performance guarantees. Section IV begins with a brief overview of the verification approach, and then (Section IV.B) presents our novel approach to operator modelling for formal verification. Section V compares predicted and measured performance using this approach. Section VI discusses these results.

II. LITERATURE REVIEW
How human operator direction or monitoring can be integrated with autonomy or semi-autonomy is a topic of much research. This problem becomes even more difficult if it is necessary to obtain formal performance guarantees for a system that includes a human in the loop (see [7] for a review). Much work in that area falls into the category of verifying the human-machine interface (e.g., for medical equipment); however, our work in this paper falls into the category of verifying properties of a system that includes a human operator component. Webster et al. [8] model a human patient as the uncertain environment in which their Care-O-bot patient care robot works. Their principal concern is to formally guarantee the behavior of the robot with respect to its patient. In our application mission, the operator directly controls a quadrotor searching for the target. Thus, although there is uncertainty associated with the human operator's actions, the operator is an integral part of the robot mission attempting to achieve mission goals, and the effect of the operator on the mission is what must be modeled. Bolton et al. [9] use a Task Analytic Model to model the possible actions of a human driver using cruise control. In our work, the focus is not on representing the possible actions of the operator, but rather on the skill level and resultant uncertainty with which the operator pilots the quadrotor. As such, a key element is collecting performance data on the operator; Driggs-Campbell et al. [

To evaluate our system in providing performance guarantees for missions involving heterogeneous human-robot systems (Figure 3.1), we developed a simple biohazard search mission where the human-robot team is tasked to search for and locate a biohazard within a given environment. This mission with air and ground robots introduces beneficial heterogeneity into the WMD (e.g., biohazard) search missions we have addressed in our prior work [4]. Quadrotors have excellent speed and mobility, making them ideal platforms for searching. Their small payload, however, means they are unable to carry WMD counter-measures. In contrast, ground robots can easily carry WMD counter-measures, but take significantly longer to survey an area. By utilizing these systems together, the search time can be drastically decreased, and the ground robot limited to travelling directly to the target. Our previous work has verified both search-based missions and multi-agent missions. The primary research question for this mission is whether we can extend the verification techniques developed to heterogeneous human-robot systems.
The biohazard search mission can be broken down into three steps. First, the quadrotor launches, piloted by a human operator, to search for the target. During this period, the ground robot tracks the quadrotor via a camera. Once the operator locates a target, they fly the quadrotor directly over it. One of the onboard cameras faces downward and can detect the biohazard bucket used as the target in our implementation. Once the operator/quadrotor detects that the target is underneath, a message is sent to the ground robot, signaling that the target has been located.
In our approach to verification, mission designers first code their mission using the graphical editing tools of MissionLab [11] (e.g., Figure 3.2). This is then translated to VIPARS for automated verification.
When the ground robot receives the notification from the quadrotor that a target has been located, it uses the quadrotor's current position as an estimate of the target's position. As the quadrotor exits its task by flying back to home base, piloted by the human operator, the ground robot moves towards the target location. The MissionLab FSA of the ground robot is shown in Figure 3.2. The GoToTarget behavior moves the robot toward a target and stops at a distance (Rmax) from the target; the input to the behavior is the estimated location of the target.

IV. FORMAL VERIFICATION OF HUMAN IN THE LOOP
A brief introduction to our approach to formal verification is presented first to set the context for our proposed human operator model. In prior work [4] [2] [12], Lyons et al. designed a framework for verifying the performance of autonomous behavior-based robot missions in uncertain environments. MissionLab [11] mission software is autotranslated [12] to a process-algebra notation PARS (Process Algebra for Robot Schemas) for analysis. The interaction of the mission software with an environment model is analyzed to determine if the software will meet a performance criterion.

A. Automatic Verification with VIPARS
A behavior-based program and its environment are modeled in PARS as a set of interconnected processes, where a process is written as [2]:

P⟨u1,…,un⟩(i1,…,ij)(o1,…,ok)⟨v1,…,vm⟩ (4-1)

where u1,…,un are the initial values for the process variables, i1,…,ij and o1,…,ok are input and output port connections, and v1,…,vm are final result values of the process variables. Processes compute result values from initial values, and this computation may be influenced by any communications that occur over port connections. Processes can be defined as combinations of other processes using composition operators: parallel ('|'), disabling ('#') and sequential (';'). Bounded recursion is captured using tail-recursive process definitions [2], e.g.,

P⟨x⟩ = Q⟨x⟩⟨y⟩ ; P⟨y⟩ (4-2)

Process P first activates process Q with input value x. Q then delivers output value y, which is then used to recur P. A variable flow function fP relates the values of variables at the start of each recursive step of P to those at the end. The flow functions for atomic processes are specified a priori; those for composite processes (those defined as compositions of other processes) are composed from the flow functions of the component processes. This can be automated to generate flow functions given a set of processes [2], with complexity linear in the number of processes. Since any execution of eq. (4-2) is modeled by fP^n(x0), the n-fold application of fP, for n ≥ 1 and initial parameter value x0, we have a straightforward verification method. Unfortunately, not all processes are defined in this form.
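The role of the flow function can be illustrated with a minimal Python sketch. It is not part of PARS or VIPARS: the helper names `compose` and `iterate`, the scalar variable, and the two atomic flow functions are all hypothetical, chosen only to show how a composite flow function is built from component flow functions and then applied n times to an initial value x0.

```python
from functools import reduce
from typing import Callable

Flow = Callable[[float], float]


def compose(*flows: Flow) -> Flow:
    """Flow function of a sequential composition: apply the flows left to right."""
    return lambda x: reduce(lambda acc, f: f(acc), flows, x)


def iterate(flow: Flow, x0: float, n: int) -> float:
    """Apply the flow function n times to x0, i.e. compute f^n(x0)."""
    x = x0
    for _ in range(n):
        x = flow(x)
    return x


# Hypothetical atomic flow functions (assumptions for illustration only).
def f_a(x: float) -> float:
    return x + 1.0


def f_b(x: float) -> float:
    return 2.0 * x


# Composite flow function for one recursive step: f_b(f_a(x)) = 2 * (x + 1).
f_p = compose(f_a, f_b)

# Value of the variable after three recursive steps starting from x0 = 0.
x3 = iterate(f_p, 0.0, 3)
print(x3)
```

Each call to `compose` costs time proportional to the number of component flows, which mirrors the linear complexity noted above, although the real analysis works on symbolic process definitions rather than Python callables.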
The system to be verified is expressed as the parallel, communicating process composition (e.g., Sys below) of a robot controller (e.g., Ctr with variable r1) and an environment model (e.g., Env with variable r2), written:

Sys = Ctr⟨r1⟩(a)(b) | Env⟨r2⟩(b)(a) (4-3)

In eq. (4-3), the input of Ctr is connected to the output of Env, (a), and the input of Env is connected to the output of Ctr, (b). If only (4-3) were a sequential composition like (4-2), then we could extract flow functions for the combined interaction of controller and environment. So, in [2] a static analysis algorithm, Sysgen, was developed to rewrite parallel compositions of the form (4-3) into a sequential composition

Sys⟨r1,r2⟩ = Sys'⟨r1,r2⟩ ; Sys⟨fSys,r1(r1,r2), fSys,r2(r1,r2)⟩ (4-5)

where Sys' is referred to as the system period. Once Sysgen analysis is complete, a system flow function can be extracted from Sys'.
Random variables, e.g., r1 and r2, are represented as multivariate mixtures of Gaussians, and operations on random variables are automatically translated by VIPARS into operations on distributions [13]. Flow functions relate variable values at recursion step t of Sys' to those at t+1. In the final phase of VIPARS processing, extracted flow functions are converted to conditional probabilities. These are then the basis of a Dynamic Bayesian Network used to carry out forward propagation of probability distributions, to determine whether the combination of controller and environment will meet a performance specification. We have demonstrated that this approach is fast and accurate when validated against physical executions (most recently in [4]).
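As a rough illustration of forward propagation, the sketch below pushes a one-dimensional Gaussian mixture through a linear flow function with additive Gaussian noise. This is an assumption-laden simplification, not the VIPARS implementation: VIPARS uses multivariate mixtures and flow functions extracted from the mission, whereas the scalar model x' = a*x + b + w, the `Component` class, and all numeric values here are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    """One weighted Gaussian component of a 1-D mixture."""
    weight: float
    mean: float
    var: float


def propagate(mixture: List[Component], a: float, b: float,
              noise_var: float) -> List[Component]:
    """One forward step for x' = a*x + b + w, with w ~ N(0, noise_var).

    A linear map of a Gaussian is Gaussian, so each component is updated
    in closed form and the weights are unchanged.
    """
    return [Component(c.weight, a * c.mean + b, a * a * c.var + noise_var)
            for c in mixture]


def mixture_mean(mixture: List[Component]) -> float:
    """Mean of the mixture: the weighted sum of the component means."""
    return sum(c.weight * c.mean for c in mixture)


# Hypothetical bimodal initial belief over a scalar variable.
belief = [Component(0.5, 0.0, 1.0), Component(0.5, 4.0, 1.0)]

# Propagate through three time slices of the (assumed) linear flow function.
for _ in range(3):
    belief = propagate(belief, a=1.0, b=2.0, noise_var=0.25)

print(mixture_mean(belief), [c.var for c in belief])
```

Because the weights are untouched, a multimodal belief stays multimodal from slice to slice, which is the property that lets a mixture carry several distinct outcomes forward at once.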

B. Modeling a Human Operator
The combination of a specific human operator, with a specific skill level, guiding the quadrotor while searching for a target object, and the quadrotor motion to the target location, will be modelled as a single, lumped environment process. This is part of the environment model in verification because both the quadrotor and the human operator represent an external environment to which control signals are sent (the quadrotor) and from which input is taken (sensor input and the signal that the target has been found). The skill level of the operator can then be reflected in the quality of that result.
The environment model in PARS is

Env = Geometry | LumpedQH | Robot | Laser | Time. (4-6)

Other than the lumped quadrotor/operator model LumpedQH and the global time process Time, this environment is very like those we have used in prior work [5]. The Geometry process includes the probabilistic model of the space in which the mission is carried out, including any map information or knowledge of obstacles. The Robot and Laser processes are the experimentally calibrated models for the ground robot and laser sensor used in the mission. (The port connections and variable initializations have been omitted here for clarity.) The lumped model LumpedQH encapsulates the behavior of the operator in locating the target, and its output is a signal to the ground robot to proceed to the target.
If H is the set of such operator/quadrotor lumped models, and if h ∈ H is one model, corresponding to a specific operator and quadrotor, then the lumped model captures the accuracy with which that operator eventually identifies the target, p, and the time it takes that operator to identify the target, t, as the set

h = { P(t | d), P(p | d) } (4-7)

where P(.) is a probability density. We propose that the time t taken by this operator to find the target and the spatial error p in the location of the target are Normally distributed. While there may be many contextual parameters that influence these, we have chosen to start by modelling the distance to the target, d, as the principal parameter:

P(t | d) = N(μt(d), σt(d)), P(p | d) = N(μp(d), Σp(d)) (4-8)

We will consider μt(d), σt(d), μp(d), Σp(d) to be linear functions of d, yielding a linear Gaussian model, and also consider that Σp is diagonal with σx and σy.
In the environment model (4-6), the lumped process is written LumpedQH⟨h⟩()(dT,dP). The result of the Quadrotor/Operator mission is the time of detection, written to the port dT, and the location of the target, written to the port dP. The initial parameter h identifies the operator being modeled. For example, to model a more experienced operator, LumpedQH returns a short time value on dT and a target position on dP with small variance. In contrast, to model a less experienced operator, LumpedQH returns a longer time value on dT or a target position on dP with a larger variance.
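A linear Gaussian operator model of this kind can be sketched in a few lines of Python. Everything concrete below is an assumption made for illustration: the class names, the choice to make the standard deviation (rather than the variance) linear in d, the zero-mean spatial error, and all coefficients, which are placeholders and not the calibrated values from our experiments.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LinearGaussian:
    """Normal distribution whose mean and standard deviation are linear in d."""
    mean_slope: float
    mean_offset: float
    std_slope: float
    std_offset: float

    def mean(self, d: float) -> float:
        return self.mean_slope * d + self.mean_offset

    def std(self, d: float) -> float:
        return self.std_slope * d + self.std_offset

    def sample(self, d: float, rng: random.Random) -> float:
        return rng.gauss(self.mean(d), self.std(d))


@dataclass(frozen=True)
class OperatorModel:
    """Lumped operator/quadrotor model: detection time plus x and y spatial error."""
    time: LinearGaussian
    err_x: LinearGaussian
    err_y: LinearGaussian

    def sample(self, d: float, rng: random.Random) -> Tuple[float, Tuple[float, float]]:
        """Draw a detection time and an (x, y) location error for target distance d."""
        t = self.time.sample(d, rng)
        p = (self.err_x.sample(d, rng), self.err_y.sample(d, rng))
        return t, p


# Placeholder coefficients (NOT calibrated values): seconds and metres.
expert = OperatorModel(
    time=LinearGaussian(2.0, 5.0, 0.2, 1.0),
    err_x=LinearGaussian(0.0, 0.0, 0.02, 0.10),
    err_y=LinearGaussian(0.0, 0.0, 0.02, 0.10),
)
novice = OperatorModel(
    time=LinearGaussian(3.5, 8.0, 0.5, 2.0),
    err_x=LinearGaussian(0.0, 0.0, 0.06, 0.25),
    err_y=LinearGaussian(0.0, 0.0, 0.06, 0.25),
)

rng = random.Random(0)
print(expert.sample(10.0, rng), novice.sample(10.0, rng))
```

Keeping the x and y errors as separate scalar distributions corresponds to the diagonal covariance assumption; a full covariance would need a multivariate sampler instead.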

C. Human in the Loop Quadrotor Mission
In prior work we have autotranslated [12] and probabilistically verified and experimentally validated [4] [5] sensory triggered sequences of robot motions through environments with obstacles similar to this mission, so we will focus here only on the aspects of it that relate to the novel lumped Quadrotor/Operator model. The MissionLab mission software is mapped to the following VIPARS Mission processes:

(4-9)
As Section III explains, the ground robot runs the TrackQuadrotor behavior to track the quadrotor visually. The detection of the target by the human operator is signaled by the termination of the TargetDetected process and its transmission of the detection time on the port connection tT.

D. Continuous Time Step
Our prior verification work [2] has represented time as a discretized time step, but represented other variables as continuous; for example, the robot location or spatial accuracy. For this application, however, time needs to also be modelled as a continuous distribution. In this paper, the value of time will be given by another continuous random variable within the environment model, and consequently within the Dynamic Bayesian network slice for each time. This propagates time as a distribution (since random variables are represented in the DBN framework as Gaussian mixture distributions) from one DBN slice to the next. Both temporal uncertainty and time steps can vary between slices. Furthermore, the mixture model allows multimodal time steps. For example, a slice might generate a mode at t=5 with some variance and a second mode at t=10 with some variance, indicating the slice ends with high probability at one of these times.

Figure 5.1 illustrates a trial run of the search mission in a laboratory environment, where the target is represented by a red bucket and there are other non-target objects (e.g., boxes). A box is placed on the path between the ground robot and the target such that it occludes the robot's direct observation of the target. The mission starts with the human operator teleoperating the quadrotor to the target location. Once the target is detected, a signal is sent to the ground robot. The quadrotor is then flown back to a safe location. The ground robot uses the signaled location of the quadrotor as the estimated location of the target to move toward; it stops when it is within 1.5 m of the target. The ground robot is a Pioneer 3-AT equipped with a camera and a front-facing SICK laser scanner for obstacle avoidance. The ground robot is controlled by the FSA (Figure 3.2) presented earlier. The quadrotor is an Ascending Technologies Hummingbird. The quadrotor has two onboard cameras. One camera faces forward and is used for flying by video, while the other camera faces downward and is used for target detection. The object is detected when the biohazard is within the center field of view of the downward-facing camera onboard the quadrotor.

A. Validation
We examine two modes of teleoperation for the validation experiment: the quadrotor is teleoperated by a human operator either by line of sight (LOS) or through the video stream from the quadrotor (FBV). The experiment is run 50 times for each operation mode. The performance of the mission is evaluated with spatial and time criteria:
1. Rmax, success radius: the ground robot is required to be within this radius (e.g., 1.5 m) of the target.
2. Tmax, mission deadline: the mission needs to be completed under this time limit (e.g., 60 s).
For each trial, the time it takes for the team to complete the mission (t) and the distance of the ground robot from the target (r) are measured. The mission is only considered successful when both criteria are met:

Success = (r ≤ Rmax) and (t ≤ Tmax)
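The success test and the resulting empirical success probability can be written as a short Python sketch. The function names and the four trial records are hypothetical; they are not measurements from our experiments and only show how a set of (t, r) pairs is scored against chosen values of Tmax and Rmax.

```python
from typing import List, Tuple


def success(t: float, r: float, t_max: float, r_max: float) -> bool:
    """A trial succeeds only if both the time and the spatial criterion hold."""
    return t <= t_max and r <= r_max


def success_rate(trials: List[Tuple[float, float]],
                 t_max: float, r_max: float) -> float:
    """Fraction of trials (t, r) that meet both criteria."""
    hits = sum(success(t, r, t_max, r_max) for t, r in trials)
    return hits / len(trials)


# Hypothetical trial records: (completion time in s, final distance in m).
trials = [(42.0, 0.9), (55.0, 1.2), (63.0, 1.1), (48.0, 1.7)]

print(success_rate(trials, t_max=60.0, r_max=1.5))
```

Sweeping `t_max` or `r_max` over a range of values and plotting the returned fraction gives an empirical curve that can be compared against a predicted one.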

B. Calibration
To derive the linear Gaussian models for the lumped operator and quadrotor system, a series of calibration experiments was conducted for the Line of Sight (LOS) and Fly by Video (FBV) cases. In each case an operator was asked to fly the quadrotor from a known start position and find a target. The distance to the target, time to detection, and the spatial error in detection were recorded. A single operator was used for all trials. Linear Gaussian models of the time distribution and the spatial error distribution were then derived from these calibration measurements.

Dividing the performance guarantee into high-confidence regions and an uncertain region provides a succinct representation of the performance guarantee of the mission that is more readily comprehensible to the mission operator. Since the goal of verification is to ensure mission success before deployment, this information helps the mission operator in making such a decision. The high-confidence regions are where the probability of success is either 0 or 1.0; that is, the mission is either guaranteed to fail or guaranteed to succeed. The uncertain region lies between the high-confidence regions, where the probability of mission success is between 0 and 1.0. The mission operator should avoid operating in this region when designing a mission. Furthermore, most of the discrepancies between verification and validation lie in the uncertain region, where the verification error is nonzero.

For instance, Figure 5.2 shows the VIPARS verification and experimental validation results for the performance guarantee of the time criterion P(t ≤ Tmax), the probability that the cooperative search mission will be completed under the time limit, Tmax. Based on this result, the mission operator can easily discern that the mission is guaranteed to be successful for Tmax > 60 seconds, or has high confidence that the mission can be completed within 60 seconds. However, if the criterion is too stringent (e.g., Tmax = 30 seconds), then the mission would be unsuccessful, and the operator could modify mission parameters, such as speeding up the robot or using a different robot, or even abandon the mission. Figure 5.2 shows that the VIPARS verification of the mission closely resembles the actual performance of the mission in experimental validation.

Figure 5.3 shows the VIPARS and experimental results for the mission performance with the spatial criterion P(r ≤ Rmax), the probability that the ground robot reaches within Rmax radius of the target. We observed that while most of the verification errors lie within the uncertain region, some discrepancies between verification and validation exist near its boundary with the high-confidence (successful) region. Nonetheless, the validation result still has relatively high confidence, with probability of mission success greater than 0.95.

[Figure: overall mission success for different combinations of performance criteria.]
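The division into high-confidence and uncertain regions can be sketched as follows. This is a simplified illustration rather than the VIPARS computation: the completion time is assumed to be a one-dimensional Gaussian mixture, the single component (mean 45 s, standard deviation 3 s) is a placeholder and not a measured or predicted value, and the tolerance `eps` used to decide when a probability counts as 0 or 1.0 is our own assumption.

```python
import math
from typing import List, Tuple

# A mixture is a list of (weight, mean, std) tuples.
Mixture = List[Tuple[float, float, float]]


def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a Normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def prob_within(limit: float, mixture: Mixture) -> float:
    """P(value <= limit) for a Gaussian mixture: weighted sum of component CDFs."""
    return sum(w * normal_cdf(limit, m, s) for w, m, s in mixture)


def region(p: float, eps: float = 1e-3) -> str:
    """Classify a success probability into one of the three regions."""
    if p <= eps:
        return "high-confidence failure"
    if p >= 1.0 - eps:
        return "high-confidence success"
    return "uncertain"


# Placeholder completion-time distribution in seconds (an assumption).
time_mixture: Mixture = [(1.0, 45.0, 3.0)]

for t_max in (30.0, 45.0, 60.0):
    p = prob_within(t_max, time_mixture)
    print(t_max, round(p, 4), region(p))
```

The same functions apply to the spatial criterion by passing a mixture over the final distance r and using Rmax as the limit.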
To further analyze the results, we examine the effect of each performance criterion (Rmax and Tmax) on the other in terms of performance guarantees and verification errors in Figures 5.5 and 5.6. Figure 5.5 shows how the probability of success for the time criterion (Tmax) is affected by various specifications of the spatial criterion (Rmax). We observed that for Rmax within its own uncertain region, the choice of Rmax significantly impacted the performance guarantee of the time criterion P(t ≤ Tmax). Specifically, it caused P(t ≤ Tmax) to plateau at probability values other than 1.0 for different values of Rmax. Moreover, it also introduced significant verification errors. Similar observations are made in Figure 5.6 for the verification and validation of the spatial criterion P(r ≤ Rmax) at various values of the time criterion, Tmax.
Lastly, we also performed verification and validation of the cooperative search mission where the quadrotor is teleoperated with video streaming from a camera onboard the quadrotor (FBV operation mode) instead of line of sight (LOS). The results are summarized in Figure 5.7. For the time criterion (Figure 5.7a), while most of the verification errors lie within the uncertain region, there is some significant error near its boundary with the high-confidence successful region. For the spatial criterion (Figure 5.7b), the verification result is better than the LOS case (Figure 5.3) in the sense that all the verification errors occur within the uncertain region of the performance guarantee.
VI. CONCLUSION
In this paper, we have applied our approach to the probabilistic verification of behavior-based systems to obtain, and experimentally validate, formal performance guarantees for a heterogeneous robot mission in which a human operator pilots a quadrotor to search an area for a target, and once it is found, a ground robot is automatically directed to acquire the target. A novel approach to modeling the skill level of the operator is presented and integrated into VIPARS automatic verification. The method leverages a linear Gaussian model of skill, parameterized by experimental calibration. This approach allows us to model different skill levels directly in terms of the behavior of the lumped, robot/operator system.
We present a comparison of our predicted guarantees for two missions: one where a single operator flies the quadrotor by line of sight and one where the same operator flies the quadrotor using its video feed. The predicted and measured results in both cases are well aligned, and we discuss the specific details of each result. However, verification accuracy for the time criterion was lower in the FBV case than in the LOS case (Figure 5.2). We believe this can be attributed to the nonlinearity in the performance of the human-quadrotor system, which we approximated with a linear performance model, eq. (4-8), and with a single dependent variable, distance to target.
VII. REFERENCES