Gestures are an integral component of human-to-human communication when the speaker is visually present to the listener. In the past several years, research has examined how computer-generated pedagogical agents can be designed to perform the four main gesture types and what this means for agent persona and learning outcomes. To date, however, research into agent gesturing has examined gestures in isolation, without other presentation strategies such as visual aids or verbal redundancy, in order to isolate the impact of gestures and to avoid overly "rich" displays of information.
The objective of this study is to explore the use of static images combined with varying frequencies of gestures, both to assess whether two visual inputs increase the risk of the split-attention effect and to investigate the potential for visual redundancy when two visual inputs coincide with narration. Data on cognitive load, agent persona, and learning outcomes (recall and transfer) were collected to measure participants' learning experience while acquiring procedural knowledge, specifically regarding the principles of lightning, for comparison with previous research.
A mixed-methods design was used, with 118 participants assigned to one of three gesture-frequency conditions (enhanced, average, none). Quantitative data were analysed using a random-effects linear regression model, while qualitative data were collected through individual interviews lasting 15–20 min.
The use of enhanced gesture frequency together with images may significantly increase intrinsic cognitive load, but gestures and images did not cause extraneous cognitive load. The enhanced-gesture condition significantly outperformed the no-gesture condition. Interviews indicated that, depending on the gesture condition, students selectively attended to the information they perceived as offering the greatest learning opportunity. Using two visual inputs did not cause split-attention, nor did it provide evidence of a visual redundancy effect.