Driver behavior while using Level 2 vehicle automation: a hybrid naturalistic study

Cooper, Joel M.; Crabtree, Kaedyn W.; McDonnell, Amy S.; May, Dominik; Strayer, Sean C.; Tsogtbaatar, Tushig; Cook, Danielle R.; Alexander, Parker A.; Sanbonmatsu, David M.; Strayer, David L.

doi:10.1186/s41235-023-00527-5

Original article
Open access
Published: 20 December 2023

Joel M. Cooper ORCID: orcid.org/0000-0002-7789-7704¹,
Kaedyn W. Crabtree²,
Amy S. McDonnell²,
Dominik May¹,
Sean C. Strayer¹,
Tushig Tsogtbaatar¹,
Danielle R. Cook¹,
Parker A. Alexander¹,
David M. Sanbonmatsu² &
David L. Strayer²

Cognitive Research: Principles and Implications volume 8, Article number: 71 (2023)

Abstract

Vehicle automation is becoming more prevalent. Understanding how drivers use this technology and its safety implications is crucial. In a 6–8 week naturalistic study, we leveraged a hybrid naturalistic driving research design to evaluate driver behavior with Level 2 vehicle automation, incorporating unique naturalistic and experimental control conditions. Our investigation covered four main areas: automation usage, system warnings, driving demand, and driver arousal, as well as secondary task engagement. While on the interstate, drivers were advised to engage Level 2 automation whenever they deemed it safe, and they complied by using it over 70% of the time. Interestingly, the frequency of system warnings increased with prolonged use, suggesting an evolving relationship between drivers and the automation features. Our data also revealed that drivers were discerning in their use of automation, opting for manual control under high driving demand conditions. Contrary to common safety concerns, our data indicated no significant rise in driver fatigue or fidgeting when using automation, compared to a control condition. Additionally, observed patterns of engagement in secondary tasks like radio listening and text messaging challenge existing assumptions about automation leading to dangerous driver distraction. Overall, our findings provide new insights into the conditions under which drivers opt to use automation and reveal a nuanced behavioral profile that emerges when automation is in use.

Key findings

Drivers were less likely to use automation when roadway demands were higher.
Secondary task engagements did not alarmingly change with automation usage (i.e., we only observed an increase in radio listening with Automation-L2).
Automation usage alarms increased over time, suggesting that drivers adopt a more relaxed interaction strategy with practice.
The use of automation did not, by itself, increase fatigue or fidgeting. Rather, drivers used automation when they were already at risk of fatigue (i.e., during situations of low driving demand).
Naturalistic Driving Research may benefit from true experimental control, especially in cases where driver behavior is contextually dependent (e.g., drivers may choose to use Automation only when they feel it is safe to do so).

Introduction

The rapid development and widespread availability of automated vehicles has sparked considerable interest in understanding their impact on driver behavior and safety. Automated vehicles hold promise for improving transportation safety, mobility, sustainability, and overall quality of life for billions of drivers worldwide. Vehicle automation intends to enhance safety by eliminating human error, which accounts for a considerable portion of traffic deaths in the United States (Iden & Shappell, 2006). It may also provide mobility solutions for those unable to drive due to age or disability (Alessandrini, 2015) and significantly reduce highway and city congestion (Makridis et al., 2018; Sener & Zmud, 2019). However, developing vehicle automation involves numerous challenges with many degrees of freedom (Musk, 2021); therefore, many automakers have taken small and incremental steps toward full autonomy over time.

To classify these steps and characterize the role of the driver at each stage of technological development, the Society of Automotive Engineers (SAE) has defined six levels of vehicle automation that gradually transition from full manual control (Level 0) to complete vehicle autonomy (Level 5). Level 1 automation is commonplace and entails Adaptive Cruise Control and Lane Keep Assist, which offer brake/acceleration and steering assistance to the driver. When Adaptive Cruise Control and Lane Keep Assist technologies are used simultaneously, these features form Level 2 partial automation (SAE, 2018). Level 2 partially automated vehicles are increasingly common on the roadways and are the focus of the current study. Herein, we often use the shorthand term Automation-L2 to refer to vehicles with Level 2 automation (e.g., simultaneous activation of Adaptive Cruise Control and Lane Keep Assist).

Under Level 2 partial vehicle automation, the driver must remain vigilant and continue to monitor the vehicle should the technology fail and the driver need to resume manual control. In this sense, the role of the driver shifts from being an active controller of the vehicle (as is typical in manual driving) to a passive monitor of the automated system (Endsley, 2017b). There is concern that this shared role may lead to safety issues related to driver attention and vigilance such that the monotony of automated driving may increase the likelihood for a driver to disengage with the driving environment. Decades of research on automation suggest this shared responsibility between the human and vehicle may negatively impact safety because it does not fully remove driver vigilance and oversight requirements, possibly resulting in driver fatigue, increases in secondary task engagement, and other unintended consequences. However, much of this research relies on driving simulations (e.g., Forster et al., 2019; Greenlee et al., 2018; Zangi et al., 2022), leaving real-world testing outcomes inconclusive and raising questions about the generalizability of these findings.

To address the critical research gaps in our understanding of vehicle automation, the current on-road study employs a hybrid research design that combines both naturalistic and experimental elements. Participants drove one of five commercially available Level 2 vehicles for 6 to 8 weeks on their daily work commute while their behavior was recorded via video cameras mounted in the vehicle. Once a week, participants were instructed not to use automation. This gave us a within-subject control condition to compare with the other days of the week, when participants were allowed to use automation. This innovative approach allowed for a more comprehensive investigation of how drivers interact with and adapt to vehicle automation systems in real-world scenarios.

The study explores four key considerations of human-automation interactions—the effect of familiarity on driver willingness to use automation, the effect of familiarity on proper use of the automated technology, the effect of automation on driver arousal and fatigue, and the effect of automation on secondary task engagement. By examining these factors, we aim to provide valuable insights into the safety concerns associated with automation use. Next, we explore each of these four considerations in lower-level vehicle automation usage and pose questions which are addressed in this research.

Automation usage

Research suggests that drivers' familiarity and experience with automation technologies such as Adaptive Cruise Control or Lane Keep Assist may influence usage patterns (Beggiato et al., 2015; Larsson, 2012). Initially, drivers may be hesitant to use automation due to lack of understanding or concerns about reliability. As they gain experience, they may become more comfortable and proficient. However, it is unclear how increased proficiency affects usage. Dunn et al. (2021) propose that experience with automation changes behavior through operational phases, but this has not been experimentally confirmed and likely depends on the driver's perception of control, usefulness, and reliability (Parasuraman & Riley, 1997).

To investigate the relationship between practice and vehicle automation usage, we observed drivers interacting with a Level 2 vehicle over a 6- to 8-week period. This design allowed us to examine two interrelated questions pertaining to automation usage and provide insight into the operational phases hypothesis proposed by Dunn et al. (2021):

Automation Usage Q1 – Does experience with automation change the frequency with which drivers activate the automation?
Automation Usage Q2 – How does the re-engagement time (after disengagement) change with practice?

System warnings and driving demand

Automation warnings occur for various reasons but are often related to driver state monitoring. These warnings arise when drivers fail to maintain sufficient steering torque or keep their eyes on the forward roadway. These warnings typically involve visual, auditory, and tactile cues such as vibrations through the steering wheel and seat. The specific types of warnings, their activation methods, and their intended messages to drivers vary depending on the vehicle's automation system and capabilities.

Research suggests that driver acceptance of system warnings is often low (Xu et al., 2021) and is influenced by factors such as the driver's experience and familiarity with the technology, as well as the perceived reliability and usefulness of the automation (Abe & Richardson, 2004; Large et al., 2017). Changes in the frequency of system warnings may result from changes in a driver's understanding of the warning cause, intent, and severity.

Warnings are also occasionally issued to request that drivers take over steering control due to poor driving conditions that the automation is not designed to handle. Although automated systems can function in challenging conditions, they are not currently intended for situations requiring extra driver caution and vigilance, such as in inclement weather or construction zones. The road-facing camera used in this research allowed us to code various types of poor conditions.

The frequency of system alarms and the continued use of vehicle automation in poor driving conditions reflect automation control strategies and the extent to which drivers remain functionally vigilant to the driving task (Fridman et al., 2019). This research addresses two distinct but interrelated questions:

Warnings & Demand Q1 – Does the frequency of system warnings change over time?
Warnings & Demand Q2 – Does the frequency of automation use change during poor conditions?

Automation and driver arousal—measured through fatigue and fidgeting

The relationship between Level 2 partial vehicle automation and arousal is complex and not fully resolved. Several research studies using driving simulations have found that automation use leads to an increase in driver passive fatigue, caused by under-arousal and boredom (Ahlström et al., 2021; Arefnezhad et al., 2022; Desmond & Hancock, 2000; Matthews et al., 2019). However, the controlled nature of these research designs often limits the types of natural countermeasures that drivers may employ to combat fatigue and under-arousal. For example, research has shown that secondary task interactions may, in some cases, protect against fatigue that arises during the use of automation (Feldhütter et al., 2019; Schömig et al., 2015), leading some to suggest secondary task use as a countermeasure for automation-related fatigue (Vogelpohl et al., 2019). However, complex secondary tasks can also distract from the driving task and result in slow resumption of vehicle control during a takeover request (Louw et al., 2015; Merat et al., 2014). Because this research on driver fatigue during automation use has primarily been conducted in simulators, it remains unclear if these findings can be extrapolated to real-world scenarios.

Fidgeting is defined by the Oxford Dictionary as making small movements, especially of the hands and feet, through nervousness or impatience. Research suggests that fidgeting is highly associated with mind wandering and inattention (Carriere et al., 2013) and is sometimes viewed as a distracting secondary task (Hasan et al., 2022). Fidgeting behaviors may therefore be indicative of a driver countermeasure to combat fatigue or boredom and a potential precursor to passive fatigue. Based on these definitions and findings, fidgeting behavior may serve as an indirect measure of driving task engagement, with lower rates of fidgeting suggesting higher driving engagement or potential fatigue, and higher rates of fidgeting suggesting lower driving engagement and possible mind wandering.

To investigate the link between Level 2 automation use and arousal, video recordings of participants' faces and hands were used to determine the frequency of fatigue and fidgeting behaviors:

Fatigue & Fidgeting Q1 – How do visual signs of driver fatigue relate to Level 2 automation use?
Fatigue & Fidgeting Q2 – How do visual signs of driver fidgeting relate to Level 2 automation use?

Secondary task engagement

Roadside observations of drivers suggest that they engage in non-driving related secondary tasks up to 32% of the time (Huisingh et al., 2015). With recent technological developments allowing for vehicle-phone pairing, voice control, and heads-up technology interactions, it is likely that this number is both underreported and growing. Behavioral analyses using the SHRP2 naturalistic driving dataset suggest that observable distractions are prevalent in 52% of normal baseline driving (Dingus et al., 2016). While the prevalence of handheld phone use for talking by drivers has gradually decreased, the prevalence of handheld device manipulation for activities such as texting and internet use has increased (NHTSA, 2021).

Several studies have indicated that drivers are more likely to engage in secondary tasks when vehicle automation is active (De Winter et al., 2014; Dunn et al., 2021; Endsley, 2017a; Naujoks et al., 2016; Reagan et al., 2021). Drivers are also able to more efficiently complete secondary tasks with automation than when manually driving (He & Donmez, 2019). The primary concern with secondary task engagements during automation use is that they reduce the driver’s ability to safely monitor the automation through a diversion of visual and cognitive resources (Gaspar & Carney, 2019) and decrease a driver’s ability to quickly resume full control of the vehicle (see Morales-Alvares et al., 2020 for review). A second concern is that the driver may develop automation-induced complacency over time. Results in two naturalistic driving studies analyzed by Dunn et al. (2021) suggest that driver complacency and willingness to engage in secondary tasks may develop through a series of phases. In the first phase, the learning phase, drivers become acquainted with the automation, including learning about its potential uses and limitations. During this phase, drivers may not fully trust the automation and may be unwilling to engage in tasks that are outside of their normal behavior.

However, as experience with the automation grows, drivers are suggested to transition into an integration phase (Saad et al., 2004), indicated by an increased willingness to divert attention from the roadway and toward secondary tasks. The existence of this type of phased learning has not, however, been demonstrated in a single study, and it is unclear whether this theory accurately characterizes the evolution of secondary task behaviors with automation use in the real world.

The current study adapted the secondary task coding scheme developed by Strayer et al. (2017), where each observable secondary task was coded by the type of task (e.g., texting, talking, etc.), mode of interaction (visual manual vs auditory vocal), and interface modality (cell phone vs vehicle interface). Each of these behaviors was coded over time, allowing us to address several interrelated questions:

Secondary Task Engagement Q1 – How does the frequency of secondary task use (non-driving related) change during Level 2 automation compared to Level 0 manual driving over time?
Secondary Task Engagement Q2 – How does the frequency of task type, mode of interaction (voice versus manual), and interface (cell phone versus In-Vehicle Information System [IVIS]) change during Level 2 automation compared to Level 0 manual driving?

Naturalistic and experimental driving approaches

The naturalistic driving approach, originally developed by the Virginia Tech Transportation Institute (Neale et al., 2005) and now used by researchers worldwide (Eenink et al., 2014; Fitch et al., 2013; Fridman et al., 2019), uses cameras placed in participant vehicles to passively collect video recordings of drivers during their normal use of the vehicle. This approach allows researchers to observe driving behavior as it occurs in real-world scenarios, while allowing drivers to act naturally.

Naturalistic driving research generates a continuous stream of video which can be challenging to transfer, catalog, and analyze. To help manage this complexity, several approaches have been developed to both identify events of interest and suitable sections of video to code for baseline behavior. In most cases, critical events are identified either through high-g events (Klauer et al., 2010) or through some form of machine learning (Fridman et al., 2017). Baseline driving epochs are then selected to match as closely as possible to the event of interest, with the exception that the event of interest is not found in the selected baseline video.

An innovative approach to sifting through naturalistic video data was employed by Fitch et al. (2013). They focused on coding an array of driver performance metrics both in the presence and absence of cell phone use. To establish a comparative baseline, epochs of driver performance were extracted from the 30-s window preceding any phone use. These epochs served as quasi-controls, enabling the researchers to gauge the extent to which cell phone usage disrupted conventional driving behaviors. Although this methodology does not offer the rigidity of a true experimental control design—given that participants had the freedom to choose when to engage with their phones—it provided a well-matched samples approach that was instrumental in isolating the effects of phone use on driving performance.

The validity of analytical techniques in naturalistic driving research hinges on a complex interplay of factors, most notably the contextual nature of the driver behaviors in question and the fidelity between baseline and event epochs. Specifically, if a behavior, such as automation usage, is environmentally contingent (i.e., drivers engage in it only under perceived safe conditions), it becomes crucial to ensure a precise contextual match between the baseline and event epochs. Any deviation in this respect can introduce confounding variables that compromise the study's validity and risk misinterpretation of the results. The absence of true experimental controls in naturalistic studies presents inherent challenges in establishing causal relationships (Carsten, Kircher, & Jamson, 2012).

Experimentally controlled evaluations of driver performance (e.g., laboratory research) are commonly used to gain insights into the potential safety concerns that may arise with vehicle automation. Within the driving domain, these come in several variations that range from simple tracking tasks (Strayer & Johnston, 2001) to complex scenario mock-ups using highly instrumented vehicles on climate-controlled test tracks (Gibson, 2015; Tan et al., 1998). The primary strength of tight experimental control is that it allows researchers to manipulate a single factor while holding all other factors constant. Unlike with the naturalistic driving approach, the performance baseline is often an identical or near-identical scenario. This allows for confident statements about causality. The challenge with these types of studies is generalizability, as naturalism is often sacrificed for control and observed behavior may not generalize to the real world.

In this within-subjects study, participants' behaviors were compared under two conditions: when they chose to use Level 2 automation and when they were instructed not to use it. Unlike the other research questions that will be addressed separately, analyses contrasting the experimental control condition with the naturalistic observations thread through the entirety of our study.

The current study

The current study expands on previous research in several key ways. First, all vehicles in the study were equipped with advanced driver-assistance systems that meet the SAE definition of Level 2 automation. Prior research has often used a mixture of Level 1 and Level 2 vehicles (e.g., Dunn et al., 2021). Second, through the introduction of a unique experimental control, this study was designed to systematically control environmental differences that could influence automation use, such as varying road conditions, weather, traffic density, and infrastructure. This is a unique and important manipulation that, to our knowledge, has never been done before. Finally, the current study tracks novice users for longer periods than previous studies, which will allow for in-depth analysis of how behavior changes as drivers become more familiar with advanced driver assistance systems. Through this novel experimental design, this study seeks to answer each of the various questions posed above related to driver usage and engagement during Level 2 automated driving.

Methods

The video data analyzed and presented in this manuscript form a subset of a larger research effort (see Fig. 1), which includes a 6–8-week naturalistic observation period (reported here), survey data collection (see Sanbonmatsu et al., 2023), and two 5-h on-road performance evaluations (see McDonnell et al., 2023). Additional details about the unique methods employed in each part of the project can be found in their respective reports. In this manuscript, we focus on the methods specific to the 6–8-week Naturalistic Driving portion of the larger research effort (see Fig. 1).

Participants

Participants in this study (N = 30, 12 females, 18 males) ranged in age from 18 to 55 (M = 35.73, SD = 9.34) and were recruited through online advertisements. For the 6–8-week naturalistic portion of the experiment, participants received an average compensation of $300. Eligibility criteria included having a valid U.S. driver's license, no at-fault accidents within the past two years (verified by driving records obtained through the University of Utah Division of Risk Management), and no prior experience with Level 2 automation. Participants were required to have a daily work commute of at least 20 min (40 min round trip) on a major local interstate and were instructed to use vehicle automation as often as they felt comfortable.

Materials

Vehicles: This study used five commercially available vehicles equipped with Level 2 automation: 2018 Tesla Model 3 AWD/Long Range with Autopilot, 2017 Tesla Model S with Autopilot, 2018 Cadillac CT6 with Supercruise, 2018 Volvo XC90 Momentum with Pilot Assist, and 2019 Nissan Rogue SL Premium with ProPILOT Assist. The distribution of participants that tested in each vehicle was as follows: eight in the Tesla Model S, six in the Tesla Model 3, one in the Cadillac CT6, six in the Nissan Rogue, and nine in the Volvo XC90. Participants were randomly assigned to a vehicle based on vehicle availability at the time of participant enrollment.

Cameras: Rosco-developed Dual-Vision XC4 cameras were installed under each vehicle's rear-view mirror. The cameras offered a view of both the forward roadway and the vehicle interior using a fish-eye lens. Additionally, an auxiliary camera captured either the screen behind the steering wheel or the screen between the front seats, depending on the location of vehicle state icons indicating automation status (see Fig. 2). Video data was stored on Rosco and Transcend brand SD cards, and the cameras automatically started and stopped recording when the vehicle was turned on or off.

Video Coding: Videos were processed for analysis using BORIS (Friard & Gamba, 2016). BORIS enabled coders to pre-specify activities of interest and then perform frame-by-frame video playback to mark the beginning and end of each behavior. Summary results for each coded video were output to .csv-formatted files, with each line in the file containing details about individual observations, such as the behavior, location within the video, and start and stop times of the coded behavior (see Fig. 3).

Procedure

In the initial experimental session, participants underwent comprehensive training comprising verbal, written, and video instructions on using vehicle automation. They also participated in a 1-h on-road practice session with real-time feedback and guidance on the automation, after which they completed an on-road performance evaluation both with and without automation. After finishing Experimental Session 1, participants received one of the five research vehicles, which they agreed to use on weekdays for commuting to and from work, without allowing other people inside, and to operate according to the law. Participants were encouraged to use vehicle automation on interstate segments of their commute as often as they felt comfortable. They used the vehicle on workdays for 6–8 weeks (subject to scheduling constraints related to the final evaluation) before completing the final experimental session and returning the vehicle (for more details on Experimental Sessions 1 and 2, see McDonnell et al., 2023). The 6–8-week Naturalistic Driving observation period is the focus of this research report (see Fig. 1).

Experimental control day. A unique component of this research is that each week, one randomly selected day was designated as an experimental control day, during which participants were instructed not to use vehicle automation the following day (see Automation: NO in Fig. 4). Control days were chosen at random and reassigned if they coincided with adverse weather unlike other drives that week. Videos from these days were coded and included in the analyses under the Experimental Control condition (see Fig. 4).

Naturalistic day. Due to the large volume of video data collected during daily commutes, we selected and coded only one day each week from the remaining days (Automation: YES in Fig. 4). This day was chosen at random, with the constraint that its weather closely matched that of the Experimental Control Day (e.g., if it was sunny on the control day, the Naturalistic Day was also sunny). Instances of automation use during this day were coded and analyzed under the Automation-L2 condition, while instances in which participants elected to drive manually were coded and analyzed under the Naturalistic Control condition (see Fig. 4).
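The weather-matched random selection described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the day names, the weather labels, the function name pick_naturalistic_day, and the use of exact label equality to stand in for "closely matched" weather are all assumptions made for the example.

```python
import random

def pick_naturalistic_day(week_days, control_day, rng):
    """Pick one non-control day whose weather label equals the control day's.

    week_days maps a day name to a weather label (hypothetical labels).
    Returns None when no remaining day has matching weather.
    """
    target = week_days[control_day]
    candidates = [day for day, weather in week_days.items()
                  if day != control_day and weather == target]
    return rng.choice(candidates) if candidates else None

# Hypothetical week of commutes with coarse weather labels.
week = {"Mon": "sunny", "Tue": "rain", "Wed": "sunny", "Thu": "sunny", "Fri": "rain"}
rng = random.Random(0)               # fixed seed so the sketch is repeatable
control = rng.choice(sorted(week))   # randomly designated control day
naturalistic = pick_naturalistic_day(week, control, rng)
```

The sketch omits the reassignment of control days that coincided with unusual adverse weather, which the procedure also describes.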

Data handling protocol

Video handling and selection. After the 6–8 weeks of naturalistic driving, participants completed the final experimental session and returned their vehicles (cf. Fig. 1). SD video cards were then removed from the vehicle cameras and processed for analysis. Videos were continuously recorded within participants' vehicles during daylight hours. However, our analysis was confined to segments of the video stream that captured interstate travel during the participants' commutes, specifically along major interstates within and surrounding the Salt Lake Valley (e.g., I-80, I-15, I-215). Prior to uploading and saving the videos, files were trimmed so that only commute driving on the regional interstates remained. Furthermore, video files were combined into AM and PM commutes for each day. Cleaned video files capturing highway driving during AM and PM commutes were uploaded to a secure server for analysis.

Video blinding. To minimize potential bias among coders, several procedures were implemented to blind them to the experimental condition present in the videos. This primarily involved a two-pass approach to video coding, wherein all behaviors except the state of automation were coded during the first pass. Automation indicators were obscured during video playback using strips of painter's tape positioned on the monitor. During the second pass, the tape was removed, and the automation state was recorded and integrated into the record. All other indicators of the experimental condition were eliminated, including file labels and other electronic data, until the final completion of each participant record, after which condition information was reintegrated.

Video coder training. Video reduction took place over approximately 1.5 years, involving several different reductionists. To ensure coding consistency, new reductionists underwent a three-week peer-to-peer training focused on coding quality and consistency, established through redundant coding and regular checks of inter-rater reliability.

Additional steps were taken to further ensure coding consistency. First, with each new participant, reductionists group-coded video from at least one drive, allowing them to determine if any unique or challenging behavior was likely to arise from the participant and to reach a consensus on how to handle such behavior if observed. Second, at least one video was group-coded each week, regardless of whether it was from a new participant. This strategy led to a target of 40% of all videos being redundantly coded. Finally, inter-rater reliability was continuously assessed using an Excel-generated script and BORIS's kappa score generator. An acceptable kappa score on the unaggregated raw coding was set to 0.6, which, when collapsed by coded task, led to scores above 0.9. If significant differences were found between observations, reductionists would review the video as a group to identify and correct discrepancies.
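To make the reliability check concrete, the following Python sketch computes Cohen's kappa for two coders who label the same sequence of time samples. It is offered under stated assumptions: the text does not say which kappa variant or sampling interval was used, and the function name cohens_kappa and the example labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders' labels over the same time samples."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    # Proportion of samples on which the two coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each coder's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical one-label-per-sample codes from two reductionists.
coder_1 = ["none", "fidget", "fidget", "none", "text", "none"]
coder_2 = ["none", "fidget", "none", "none", "text", "none"]
kappa = cohens_kappa(coder_1, coder_2)  # 5/7, about 0.71
```

In this toy example the coders disagree on one of six samples, and the resulting kappa of about 0.71 would clear the 0.6 threshold applied to the unaggregated coding.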

Videos were generally coded in real-time, but reductionists often had to rewatch complex sections to accurately code the start and stop of overlapping behaviors. This demanding process required significant focused attention, so reductionists were encouraged to take breaks as needed to maintain high performance levels.

Video coding rubric. A comprehensive and systematic coding scheme was developed to capture various participant behaviors, resulting in a video coding dictionary to guide video reduction. This dictionary included clear definitions of all behaviors of interest and examples of each behavior. To address the four sets of questions posed by this research, the following coding scheme was developed:

Automation Usage – Instances of automation engagement and disengagement were coded using the instrument-facing camera that captured an image of the screen displaying automation state. The use of automation activation controls served as a redundant marker of automation use and helped to disambiguate system state when icon visibility was poor.
System Warnings and Driving Demand – System warnings were marked as discrete events in the data file. Driving demand was operationalized as the sum of concurrent Poor Conditions present, with Low demand including no poor conditions, Moderate demand including one poor condition, and High demand including two or more poor conditions. Poor conditions were defined as weather, traffic, construction, emergency vehicles, or other events that could adversely affect driving.
Driver Arousal – Fatigue and fidgeting behaviors were coded as continuous events, meaning that the coders marked the start and stop times of each specific behavior. For fatigue, this included marking the beginning and end of visible signs of sleepiness, such as yawning, heavy eyelids, and nodding heads. For fidgeting, this included identifying the start and stop times of body movements lasting more than 3 s, such as touching the face, neck, head/hair, or moving hands to and from the steering wheel. Additionally, reaching and grabbing, and eating and drinking behaviors were grouped into fidgeting.
Secondary Task Engagement – This was a comprehensive class of behaviors, and detailed data were collected on each instance. Five core distracting activities were defined: Text Messaging, Calling and Dialing, Radio Listening, Navigation, and Video Interaction. Each of these activities was coded for modality of interaction, which included Visual-Manual or Auditory-Vocal, and interface, which included Cell Phone or In-Vehicle-Information-System (IVIS). For each trip (AM or PM commute), the coders recorded the start and stop times of these distracting activities, capturing the frequency and duration of each behavior. This allowed for a detailed analysis of distraction and inattention on a trip-by-trip basis, as well as for the entire day's drive. Furthermore, an aggregate measure was used to provide an overall assessment of secondary task engagement by summing all secondary task interactions across the various activities.
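The driving demand rule in the rubric above (zero, one, or two or more concurrent poor conditions) can be sketched as a small classifier. This is only an illustration, not the authors' code: the function name `demand_level` and the string labels for the poor-condition categories are hypothetical, and the study's own processing was done in R.

```python
from typing import Iterable

# Poor-condition categories named in the rubric; these string labels are
# hypothetical and chosen only for this illustration.
POOR_CONDITIONS = {"weather", "traffic", "construction", "emergency_vehicle", "other"}


def demand_level(active_conditions: Iterable[str]) -> str:
    """Map the concurrently active poor conditions to a demand level."""
    # Count distinct recognized poor conditions present at the same moment.
    count = len(set(active_conditions) & POOR_CONDITIONS)
    if count == 0:
        return "Low"
    if count == 1:
        return "Moderate"
    return "High"  # two or more concurrent poor conditions
```

For example, `demand_level(["weather", "traffic"])` falls in the High category, while an empty list falls in the Low category.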

Statistical analysis

BORIS provided a .csv file as output for each coded video, listing details for each behavior in separate columns with one row per behavior. To analyze these data, we generated several R scripts that converted outputs into a time-series format, with behaviors organized in columns, time represented by each row, and a binary task state indicator listed in each column. Organized in this structure, we were able to combine and collapse behaviors as required for various analyses. Transformations were primarily carried out using base R (R Core Team, 2022) and packages within the tidyverse (Wickham et al., 2019).
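The conversion from event rows to a binary time series can be illustrated with a minimal sketch. The authors used R scripts; this version uses Python only for illustration. The column names (`behavior`, `start`, `stop`), the one-second time step, and the function name `events_to_time_series` are all assumptions, and an actual BORIS export has a different layout.

```python
import csv
import io


def events_to_time_series(rows, behaviors, duration_s, step_s=1.0):
    """Return one record per time step with a 0/1 state for each behavior."""
    n_steps = int(duration_s / step_s)
    series = [
        {"time": i * step_s, **{b: 0 for b in behaviors}} for i in range(n_steps)
    ]
    for row in rows:
        if row["behavior"] not in behaviors:
            continue  # ignore behaviors that are not part of this analysis
        start, stop = float(row["start"]), float(row["stop"])
        for rec in series:
            # A behavior is "on" for every step inside its [start, stop) interval.
            if start <= rec["time"] < stop:
                rec[row["behavior"]] = 1
    return series


# Hypothetical coded events: one row per behavior instance.
sample = io.StringIO("behavior,start,stop\nFidgeting,1,3\nRadio Listening,2,5\n")
rows = list(csv.DictReader(sample))
ts = events_to_time_series(rows, ["Fidgeting", "Radio Listening"], duration_s=6)
```

With behaviors in columns and time in rows, overlapping behaviors can be combined by taking the maximum across columns at each time step.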

To account for sources of non-independence in the data (i.e., repeated measures within each participant) and allow for missing data, we analyzed our data with linear mixed-effects models using the lmer function found in the lmerTest library (Kuznetsova et al., 2017). Participant ID and the AM/PM drive indicator were included in all models as random intercepts, and, where appropriate, Session and Condition were input as predictor variables (see list below). Outcome variables were dictated by the specific question and included Fatigue, Fidgeting, Secondary Task, etc., as described in the video coding rubric. Likelihood ratio tests were run using the anova function in the stats package to test the significance of all effects, and pairwise comparisons were run using the contrasts function of the lmerTest library. Significance thresholds for all analyses were p < 0.05, p < 0.01, and p < 0.001, indicated by *, **, and ***, respectively, in the figures and tables that follow.

Predictor Variables of interest were:

Session – fixed continuous factor. This was the numerical indicator of week (e.g., 1, 2, 3, etc.). Session was handled as a continuous fixed factor for all relevant analyses but treated as discrete for plotting purposes.
Condition – fixed discrete factor with 3 levels. The "Automation: Yes" day provided two levels of Condition, which were Automation L2 and Naturalistic Control. The "Automation: No" day provided the third level of Condition, which was Experimental Control. Condition was also entered as a discrete fixed effect in relevant models.
Subject – random discrete factor. This was the simple subject identifier. Subject was modeled as a random intercept in all analyses.
AMPM – random discrete factor. Simple identifier of the AM or PM drives (e.g., the morning and evening commutes for each participant). AMPM was also entered as a random intercept in all analyses.
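For readers who prefer an explicit formula, one plausible form of the model implied by the predictors above is written out below. It is a sketch under stated assumptions, not the authors' reported specification: it assumes additive fixed effects with no Session by Condition interaction, Automation L2 as the reference level of Condition, and independent normally distributed random intercepts. The indices are illustrative: i for Subject, j for AMPM, and k for the observation.

```latex
y_{ijk} = \beta_0
        + \beta_1\,\mathrm{Session}_{ijk}
        + \beta_2\,\mathbb{1}[\mathrm{Condition}_{ijk} = \text{Naturalistic Control}]
        + \beta_3\,\mathbb{1}[\mathrm{Condition}_{ijk} = \text{Experimental Control}]
        + u_i + v_j + \varepsilon_{ijk},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
v_j \sim \mathcal{N}(0, \sigma_v^2),\quad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
```

In lmer formula syntax, a model of this shape would be written as Outcome ~ Session + Condition + (1 | Subject) + (1 | AMPM), where Outcome stands in for Fatigue, Fidgeting, or another coded measure.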

Results

Data overview

Video Record. We obtained video results from 30 participants, resulting in a total of 670 videos (353 Naturalistic, 317 Baseline). Within the baseline day, 26 of the videos contained instances of automation use, indicating a misunderstanding of the task for that day. These videos were excluded from the analysis, leaving 291 baseline videos. For each of the weeks 1–8, the following number of subjects were available to code: 26, 30, 30, 28, 27, 22, 17, and 12. Of the 670 available videos, 308 were double-coded, 76 were triple-coded, and 4 were coded by 4 different reductionists. In total, 1060 coding records were entered into the analysis. Results from redundantly coded videos were averaged together.
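The averaging of redundantly coded videos can be sketched as follows. This is an illustration under assumptions rather than the study's R code: each coding record is reduced to a hypothetical (video identifier, value) pair, and the values shown are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def average_across_coders(records):
    """Collapse multiple coding records per video into one averaged value."""
    by_video = defaultdict(list)
    for video_id, value in records:
        by_video[video_id].append(value)
    # A video coded once keeps its single value; others get the coder mean.
    return {video_id: mean(values) for video_id, values in by_video.items()}


# Hypothetical records: video "v1" was double-coded, "v2" was coded once.
records = [("v1", 10.0), ("v1", 14.0), ("v2", 5.0)]
averaged = average_across_coders(records)
```

After this step each video contributes one value per measure, regardless of how many reductionists coded it.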

By coding only 1 Naturalistic and 1 Experimental Control Day per week, we obtained 297 total hours of coded video, with just over half collected during Naturalistic driving (161 h). Overall, participants used automation between 25 and 99% of the time during the Naturalistic observation period, resulting in 124 h of video where participants engaged Level 2 automation.
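The hour totals above imply two aggregate proportions, worked out in the short sketch below. These are shares of pooled video hours computed from the reported totals, not per-participant averages, and the variable names are chosen only for this illustration.

```python
# Hour totals as reported in the text.
total_h = 297         # all coded video
naturalistic_h = 161  # coded video from Naturalistic driving
automation_h = 124    # coded video with Level 2 automation engaged

# Hours attributable to the Experimental Control day (by subtraction).
experimental_control_h = total_h - naturalistic_h

# Share of coded video collected during Naturalistic driving.
naturalistic_share = naturalistic_h / total_h

# Share of Naturalistic hours with automation engaged (pooled across drivers).
automation_share = automation_h / naturalistic_h

print(f"Naturalistic share of coded video: {naturalistic_share:.1%}")
print(f"Automation share of Naturalistic hours: {automation_share:.1%}")
```

The first ratio is consistent with "just over half", and the second, a pooled figure, sits within the 25 to 99% range of individual usage.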