After some piddling with my program, I had it count missing data. I had about 20,000 points missing out of the half-million or so that I'm using. In short, about 4% of my data points were invalid (ranging from 1% to 7% by subject). That is higher than I expected. I hope that, with what I've learned about subject placement and lighting, that number can be reduced in the future.
The biggest issue with the missing points was that they were spread about randomly so that all subjects had data missing on at least one “trial” and almost all trials had data missing on at least one subject. That distribution makes analysis kinda tough. So, I went about replacing missing values with something that hopefully made sense.
Subject by subject, if data were missing between two valid gaze points, I simply filled in the missing values by linearly interpolating between those two points. If data were missing at the beginning or end of a recording, so that I did not have two data points to lerp between, I had to do something different. I started with the first valid data point, looked at the next valid data point, and took the vector between the two. I continued computing these "direction change" vectors between consecutive pairs of valid points until I had exhausted all the data points for a subject. Then I averaged all those vectors and called the result the "average change." If data were missing at the beginning of a recording, I advanced to the first valid data point and worked backwards, filling in each point by subtracting the average change (point n−1 = point n − average change). If data were missing at the end of a recording, I backed up to the last valid point and filled in forwards by adding the average change (point n+1 = point n + average change).
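The whole fill-in procedure can be sketched roughly like this. This is a hypothetical reconstruction, not the actual program: it assumes missing samples are recorded as 0.00 (as the post suggests) and works on a single coordinate at a time, so for real (x, y) gaze data you would run it once per axis; the function name `fill_gaps` is mine.

```python
import numpy as np

def fill_gaps(points, missing=0.0):
    """Fill missing samples (assumed to be recorded as `missing`):
    interior gaps get linear interpolation, edge gaps get
    average-change extrapolation."""
    pts = np.asarray(points, dtype=float)
    idx = np.flatnonzero(pts != missing)  # indices of valid samples
    if idx.size < 2:
        return pts  # not enough valid data to interpolate

    # Interior gaps: linearly interpolate between surrounding valid points.
    filled = np.interp(np.arange(len(pts)), idx, pts[idx])

    # "Average change": mean of the changes between consecutive valid points.
    avg_change = np.mean(np.diff(pts[idx]))

    # Leading gap: work backwards from the first valid point,
    # subtracting the average change at each step.
    for i in range(idx[0] - 1, -1, -1):
        filled[i] = filled[i + 1] - avg_change

    # Trailing gap: work forwards from the last valid point,
    # adding the average change at each step.
    for i in range(idx[-1] + 1, len(pts)):
        filled[i] = filled[i - 1] + avg_change

    return filled
```

For example, `fill_gaps([0.0, 2.0, 0.0, 4.0, 0.0])` interpolates the middle gap to 3.0 and extrapolates the trailing gap to 6.0 using the average change of 2.0 per step. One quirk of this sketch: treating 0.00 as the missing-value sentinel means a genuinely valid reading of exactly zero would also be overwritten.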
In the end, filling in all the missing values (recorded as 0.00) with these estimates had no real effect on the analyses or graphs.