Publications by Olov Engwall

Peer reviewed

Articles

[1]

S. Zeivots et al., "Reshaping Higher Education Designs and Futures : Postdigital Co-design with Generative Artificial Intelligence," Postdigital Science and Education, vol. 7, no. 4, pp. 1334-1374, 2025.

[2]

P. Gonzalez Oliveras, O. Engwall and A. Wilde, "Social Educational Robotics and Learning Analytics : A Scoping Review of an Emerging Field," International Journal of Social Robotics, vol. 17, no. 6, pp. 1113-1128, 2025.

[3]

O. Engwall, R. Cumbal and A. R. Majlesi, "Socio-cultural perception of robot backchannels," Frontiers in Robotics and AI, vol. 10, 2023.

[4]

R. Cumbal et al., "Stereotypical nationality representations in HRI : perspectives from international young adults," Frontiers in Robotics and AI, vol. 10, 2023.

[5]

O. Engwall et al., "Identification of Low-engaged Learners in Robot-led Second Language Conversations with Adults," ACM Transactions on Human-Robot Interaction, vol. 11, no. 2, 2022.

[6]

O. Engwall and J. David Lopes, "Interaction and collaboration in robot-assisted language learning for adults," Computer Assisted Language Learning, vol. 35, no. 5-6, pp. 1273-1309, 2022.

[7]

O. Engwall, J. D. Águas Lopes and R. Cumbal, "Is a Wizard-of-Oz Required for Robot-Led Conversation Practice in a Second Language?," International Journal of Social Robotics, vol. 14, no. 4, pp. 1067-1085, 2022.

[8]

O. Engwall et al., "Learner and teacher perspectives on robot-led L2 conversation practice," ReCALL, vol. 34, no. 3, pp. 344-359, 2022.

[9]

O. Engwall, J. David Lopes and A. Åhlund, "Robot interaction styles for conversation practice in second language learning," International Journal of Social Robotics, vol. 13, no. 2, pp. 251-276, 2021.

[10]

S. Dabbaghchian et al., "Simulation of vowel-vowel utterances using a 3D biomechanical-acoustic model," International Journal for Numerical Methods in Biomedical Engineering, vol. 37, no. 1, 2021.

[11]

M. Arnela et al., "MRI-based vocal tract representations for the three-dimensional finite element synthesis of diphthongs," IEEE Transactions on Audio, Speech, and Language Processing, vol. 27, no. 12, pp. 2173-2182, 2019.

[12]

S. Dabbaghchian et al., "Reconstruction of vocal tract geometries from biomechanical simulations," International Journal for Numerical Methods in Biomedical Engineering, 2018.

[13]

M. Arnela et al., "Influence of lips on the production of vowels based on finite element simulations and experiments," Journal of the Acoustical Society of America, vol. 139, no. 5, pp. 2852-2859, 2016.

[14]

M. Arnela et al., "Influence of vocal tract geometry simplifications on the numerical simulation of vowel sounds," Journal of the Acoustical Society of America, vol. 140, no. 3, pp. 1707-1718, 2016.

[15]

C. Koniaris, G. Salvi and O. Engwall, "On mispronunciation analysis of individual foreign speakers using auditory periphery models," Speech Communication, vol. 55, no. 5, pp. 691-706, 2013.

[16]

O. Engwall, "Analysis of and feedback on phonetic features in pronunciation training with a virtual teacher," Computer Assisted Language Learning, vol. 25, no. 1, pp. 37-64, 2012.

[17]

G. Ananthakrishnan, O. Engwall and D. Neiberg, "Exploring the Predictability of Non-Unique Acoustic-to-Articulatory Mappings," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 10, pp. 2672-2682, 2012.

[18]

G. Ananthakrishnan and O. Engwall, "Mapping between acoustic and articulatory gestures," Speech Communication, vol. 53, no. 4, pp. 567-589, 2011.

[19]

H. Kjellström and O. Engwall, "Audiovisual-to-articulatory inversion," Speech Communication, vol. 51, no. 3, pp. 195-209, 2009.

[20]

J. Beskow et al., "Visualization of speech and audio for hearing-impaired persons," Technology and Disability, vol. 20, no. 2, pp. 97-107, 2008.

[21]

O. Engwall and O. Bälter, "Pronunciation feedback from real and virtual language teachers," Computer Assisted Language Learning, vol. 20, no. 3, pp. 235-262, 2007.

[22]

O. Engwall et al., "Designing the user interface of the computer-based speech training system ARTUR based on early user tests," Behavior and Information Technology, vol. 25, no. 4, pp. 353-365, 2006.

[23]

O. Engwall, "Combining MRI, EMA and EPG measurements in a three-dimensional tongue model," Speech Communication, vol. 41, no. 2-3, pp. 303-329, 2003.

Conference papers

[24]

M. Jansson et al., "An initial exploration of semi-automated tutoring : How AI could be used as support for online human tutors," in Proceedings of the Fourteenth International Conference on Networked Learning, 2024.

[25]

A. M. Kamelabad, O. Engwall and G. Skantze, "Conformity and Trust in Multi-party vs. Individual Human-Robot Interaction," in Proceedings of the 24th ACM International Conference on Intelligent Virtual Agents, 2024.

[26]

R. Cumbal and O. Engwall, "Speaking Transparently : Social Robots in Educational Settings," in Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (HRI '24 Companion), March 11-14, 2024, Boulder, CO, USA, 2024.

[27]

R. Cumbal et al., "Shaping unbalanced multi-party interactions through adaptive robot backchannels," in IVA 2022 - Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents, 2022.

[28]

S. Gillet et al., "Robot Gaze Can Mediate Participation Imbalance in Groups with Different Skill Levels," in Proceedings of the 2021 ACM/IEEE International Conference on Human-Robot Interaction, 2021, pp. 303-311.

[29]

R. Cumbal et al., "“You don’t understand me!” : Comparing ASR Results for L1 and L2 Speakers of Swedish," in Proceedings Interspeech 2021, 2021, pp. 96-100.

[30]

R. Cumbal, J. David Lopes and O. Engwall, "Detection of Listener Uncertainty in Robot-Led Second Language Conversation Practice," in Proceedings ICMI '20: International Conference on Multimodal Interaction, 2020.

[31]

R. Cumbal, J. Lopes and O. Engwall, "Uncertainty in robot assisted second language conversation practice," in HRI '20: Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 2020, pp. 171-173.

[32]

J. Lopes, O. Engwall and G. Skantze, "A First Visit to the Robot Language Café," in Proceedings of the ISCA workshop on Speech and Language Technology in Education, 2017.

[33]

M. Arnela et al., "A semi-polar grid strategy for the three-dimensional finite element simulation of vowel-vowel sequences," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 2017, pp. 3477-3481.

[34]

S. Dabbaghchian et al., "Synthesis of VV utterances from muscle activation to sound with a 3d model," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 2017, pp. 3497-3501.

[35]

M. Arnela et al., "Finite element generation of vowel sounds using dynamic complex three-dimensional vocal tracts," in Proceedings of the 23rd international congress on sound and vibration : From ancient to modern acoustics, 2016.

[36]

S. Dabbaghchian et al., "Using a Biomechanical Model and Articulatory Data for the Numerical Production of Vowels," in Interspeech 2016, 2016, pp. 3569-3573.

[37]

S. Dabbaghchian, M. Arnela and O. Engwall, "Simplification of Vocal Tract Shapes with Different Levels of Detail," in Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK, 2015, pp. 1-5.

[38]

C. Koniaris, O. Engwall and G. Salvi, "Auditory and Dynamic Modeling Paradigms to Detect L2 Mispronunciations," in 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, Vol 1, 2012, pp. 898-901.

[39]

C. Koniaris, O. Engwall and G. Salvi, "On the Benefit of Using Auditory Modeling for Diagnostic Evaluation of Pronunciations," in International Symposium on Automatic Detection of Errors in Pronunciation Training (IS ADEPT), Stockholm, Sweden, June 6-8, 2012, 2012, pp. 59-64.

[40]

O. Engwall, "Pronunciation analysis by acoustic-to-articulatory feature inversion," in Proceedings of the International Symposium on Automatic detection of Errors in Pronunciation Training, 2012, pp. 79-84.

[41]

C. Koniaris and O. Engwall, "Perceptual differentiation modeling explains phoneme mispronunciation by non-native speakers," in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2011, pp. 5704-5707.

[42]

C. Koniaris and O. Engwall, "Phoneme Level Non-Native Pronunciation Analysis by an Auditory Model-based Native Assessment Scheme," in 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011, 2011, pp. 1157-1160.

[43]

G. Ananthakrishnan and O. Engwall, "Resolving Non-uniqueness in the Acoustic-to-Articulatory Mapping," in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2011, pp. 4628-4631.

[44]

G. Ananthakrishnan et al., "Using an Ensemble of Classifiers for Mispronunciation Feedback," in Proceedings of SLaTE, 2011.

[45]

S. Picard et al., "Detection of Specific Mispronunciations using Audiovisual Features," in Auditory-Visual Speech Processing (AVSP) 2010, 2010.

[46]

O. Engwall, "Is there a McGurk effect for tongue reading?," in Proceedings of AVSP : International Conference on Audio-Visual Speech Processing, 2010.

[47]

G. Ananthakrishnan et al., "Predicting Unseen Articulations from Multi-speaker Articulatory Models," in Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, 2010, pp. 1588-1591.

[48]

O. Engwall and P. Wik, "Are real tongue movements easier to speech read than synthesized?," in INTERSPEECH 2009 : 10th Annual Conference of the International Speech Communication Association 2009, 2009, pp. 824-827.

[49]

O. Engwall and P. Wik, "Can you tell if tongue movements are real or synthetic?," in Proceedings of AVSP, 2009.

[50]

G. Ananthakrishnan, D. Neiberg and O. Engwall, "In search of Non-uniqueness in the Acoustic-to-Articulatory Mapping," in INTERSPEECH 2009 : 10th Annual Conference of the International Speech Communication Association 2009, 2009, pp. 2799-2802.

[51]

N. Katsamanis et al., "Audiovisual speech inversion by switching dynamical modeling Governed by a Hidden Markov Process," in Proceedings of EUSIPCO, 2008.

[52]

O. Engwall, "Can audio-visual instructions help learners improve their articulation? : an ultrasound study of short term changes," in INTERSPEECH 2008 : 9th Annual Conference of the International Speech Communication Association 2008, 2008, pp. 2631-2634.

[53]

P. Wik and O. Engwall, "Can visualization of internal articulators support speech perception?," in INTERSPEECH 2008 : 9th Annual Conference of the International Speech Communication Association 2008, Vols 1-5, 2008, pp. 2627-2630.

[54]

G. Ananthakrishnan and O. Engwall, "Important regions in the articulator trajectory," in Proceedings of International Seminar on Speech Production, 2008, pp. 305-308.

[55]

D. Neiberg, G. Ananthakrishnan and O. Engwall, "The Acoustic to Articulation Mapping : Non-linear or Non-unique?," in INTERSPEECH 2008 : 9th Annual Conference of the International Speech Communication Association 2008, 2008, pp. 1485-1488.

[56]

H. Kjellström et al., "Audio-visual phoneme classification for pronunciation training applications," in INTERSPEECH 2007 : 8th Annual Conference of the International Speech Communication Association, 2007, pp. 57-60.

[57]

O. Engwall, "Evaluation of speech inversion using an articulatory classifier," in Proceedings of the Seventh International Seminar on Speech Production, 2006, pp. 469-476.

[58]

O. Engwall et al., "Feedback management in the pronunciation training system ARTUR," in Proceedings of CHI 2006, 2006, pp. 231-234.

[59]

O. Engwall, V. Delvaux and T. Metens, "Interspeaker Variation in the Articulation of French Nasal Vowels," in Proceedings of the Seventh International Seminar on Speech Production, 2006, pp. 3-10.

[60]

H. Kjellström, O. Engwall and O. Bälter, "Reconstructing Tongue Movements from Audio and Video," in INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, Vol. 1-5, 2006, pp. 2238-2241.

[61]

O. Engwall, "Articulatory synthesis using corpus-based estimation of line spectrum pairs," in 9th European Conference on Speech Communication and Technology, 2005, pp. 1909-1912.

[62]

E. Eriksson et al., "Design Recommendations for a Computer-Based Speech Training System Based on End User Interviews," in Proceedings of the Tenth International Conference on Speech and Computers, 2005, pp. 483-486.

[63]

O. Engwall, "Introducing visual cues in acoustic-to-articulatory inversion," in Interspeech 2005 : 9th European Conference on Speech Communication and Technology, 2005, pp. 3205-3208.

[64]

O. Bälter et al., "Wizard-of-Oz Test of ARTUR - a Computer-Based Speech Training System with Articulation Correction," in Proceedings of ASSETS 2005, 2005, pp. 36-43.

[65]

O. Engwall et al., "Design strategies for a virtual language tutor," in INTERSPEECH 2004, ICSLP, 8th International Conference on Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004, 2004, pp. 1693-1696.

[66]

O. Engwall, "From real-time MRI to 3D tongue movements," in INTERSPEECH 2004 : ICSLP 8th International Conference on Spoken Language Processing, 2004, pp. 1109-1112.

[67]

O. Engwall, "Speaker adaptation of a three-dimensional tongue model," in INTERSPEECH 2004 : ICSLP 8th International Conference on Spoken Language Processing, 2004, pp. 465-468.

Chapters in books

[68]

O. Engwall, "Augmented Reality Talking Heads as a Support for Speech Perception and Production," in Augmented Reality : Some Emerging Application Areas, Nee, Andrew Yeh Ching Ed., IN-TECH, 2011, pp. 89-114.

[69]

O. Engwall, "Assessing MRI measurements : Effects of sustenation, gravitation and coarticulation," in Speech production : Models, Phonetic Processes and Techniques, Harrington, J.; Tabain, M. Ed., New York : Psychology Press, 2006, pp. 301-314.

Non-peer reviewed

Articles

[70]

O. Engwall et al., "Editorial : Socially, culturally and contextually aware robots," Frontiers in Robotics and AI, vol. 10, 2023.

[71]

G. Ananthakrishnan, P. Wik and O. Engwall, "Detecting confusable phoneme pairs for Swedish language learners depending on their first language," TMH-QPSR, vol. 51, no. 1, pp. 89-92, 2011.

[72]

O. Engwall, "Feedback strategies of human and virtual tutors in pronunciation training," TMH-QPSR, vol. 48, no. 1, pp. 11-34, 2006.

[73]

O. Engwall, "Dynamical Aspects of Coarticulation in Swedish Fricatives : A Combined EMA and EPG Study," TMH Quarterly Status and Progress Report, pp. 49-73, 2000.

[74]

O. Engwall and P. Badin, "Collecting and Analysing Two- and Three-dimensional MRI data for Swedish," TMH Quarterly Status and Progress Report, pp. 11-38, 1999.

[75]

O. Engwall, "Vocal Tract Modeling i 3D," TMH Quarterly Status and Progress Report, pp. 31-38, 1999.

Conference papers

[76]

M. Arnela et al., "Effects of vocal tract geometry simplifications on the numerical simulation of vowels," in Pan European Voice Conference Abstract Book : Proceedings e report 104, 2015, p. 177.

[77]

S. Dabbaghchian, I. Nilsson and O. Engwall, "From Tongue Movement Data to Muscle Activation – A Preliminary Study of Artisynth's Inverse Modelling," in Parametric Modeling of Human Anatomy, PMHA 14, Aug 22-23, 2014, Vancouver, BC, CA, 2014.

[78]

J. Beskow, O. Engwall and B. Granström, "Resynthesis of Facial and Intraoral Articulation from Simultaneous Measurements," in Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS'03), 2003.

[79]

O. Engwall, "Evaluation of a System for Concatenative Articulatory Visual Synthesis," in Proceedings of the ICSLP, 2002.

[80]

O. Engwall, "Synthesizing Static Vowels and Dynamic Sounds Using a 3D Vocal Tract Model," in Proceedings of the 4th ISCA workshop on Speech Synthesis, 2001, pp. 81-86.

[81]

O. Engwall and P. Badin, "An MRI Study of Swedish Fricatives : Coarticulatory effects," in Proceedings of the 5th Speech Production Seminar, 2000, pp. 297-300.

[82]

O. Engwall, "Are Static MRI Data Representative of Dynamic Speech? : Results from a Comparative Study Using MRI, EMA, and EPG," in Proceedings of the 6th ICSLP, 2000, pp. 17-20.

Chapters in books

[83]

O. Engwall, "Datoranimerade talande ansikten," in Människans ansikten : Emotion, interaktion och konst, Adelswärd, V.; Forstorp, P-A. Ed., Stockholm : Carlssons Bokförlag, 2012.

[84]

O. Engwall, "Bättre tala än texta - talteknologi nu och i framtiden," in Tekniken bakom språket, Domeij, Rickard Ed., Stockholm : Norstedts Akademiska Förlag, 2008, pp. 98-118.

Theses

[85]

O. Engwall, "Tongue Talking : Studies in Intraoral Speech Synthesis," Doctoral thesis Stockholm : KTH, TRITA-TMH, 2002:04, 2002.

Other

[86]

O. Engwall, "Concatenative Articulatory Synthesis," (Manuscript).

[87]

O. Engwall et al., "Identification of low-engaged learners in robot-led second language conversations with adults," (Manuscript).

[88]

O. Engwall et al., "Learner and teacher perspectives on robot-led L2 conversation practice," (Manuscript).

Latest sync with DiVA:

2026-02-10 01:03:54 UTC
