
Publications by Jonas Beskow

Peer-reviewed

Articles

[1]
A. Deichler et al., "Learning to generate pointing gestures in situated embodied conversational agents," Frontiers in Robotics and AI, vol. 10, 2023.
[2]
S. Alexanderson et al., "Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models," ACM Transactions on Graphics, vol. 42, no. 4, 2023.
[3]
M. Cohn et al., "Vocal accommodation to technology: the role of physical form," Language sciences (Oxford), vol. 99, 2023.
[5]
G. Valle-Perez et al., "Transflower : probabilistic autoregressive dance generation with multimodal attention," ACM Transactions on Graphics, vol. 40, no. 6, 2021.
[6]
G. E. Henter, S. Alexanderson and J. Beskow, "MoGlow : Probabilistic and controllable motion synthesis using normalising flows," ACM Transactions on Graphics, vol. 39, no. 6, pp. 1-14, 2020.
[7]
K. Stefanov, J. Beskow and G. Salvi, "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition," IEEE Transactions on Cognitive and Developmental Systems, vol. 12, no. 2, pp. 250-259, 2020.
[8]
S. Alexanderson et al., "Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows," Computer graphics forum (Print), vol. 39, no. 2, pp. 487-496, 2020.
[9]
K. Stefanov et al., "Modeling of Human Visual Attention in Multiparty Open-World Dialogues," ACM Transactions on Human-Robot Interaction, vol. 8, no. 2, 2019.
[10]
S. Alexanderson et al., "Mimebot—Investigating the Expressibility of Non-Verbal Communication Across Agent Embodiments," ACM Transactions on Applied Perception, vol. 14, no. 4, 2017.
[11]
S. Alexanderson, C. O'Sullivan and J. Beskow, "Real-time labeling of non-rigid motion capture marker sets," Computers & graphics, vol. 69, no. Supplement C, pp. 59-67, 2017.
[12]
S. Alexanderson and J. Beskow, "Towards Fully Automated Motion Capture of Signs – Development and Evaluation of a Key Word Signing Avatar," ACM Transactions on Accessible Computing, vol. 7, no. 2, pp. 7:1-7:17, 2015.
[13]
S. Alexanderson and J. Beskow, "Animated Lombard speech : Motion capture, facial animation and visual intelligibility of speech produced in adverse conditions," Computer speech & language (Print), vol. 28, no. 2, pp. 607-618, 2014.
[14]
N. Mirnig et al., "Face-To-Face With A Robot : What do we actually talk about?," International Journal of Humanoid Robotics, vol. 10, no. 1, Art. no. 1350011, 2013.
[15]
S. Al Moubayed, G. Skantze and J. Beskow, "The Furhat Back-Projected Humanoid Head – Lip Reading, Gaze and Multi-Party Interaction," International Journal of Humanoid Robotics, vol. 10, no. 1, Art. no. 1350005, 2013.
[16]
S. Al Moubayed, J. Edlund and J. Beskow, "Taming Mona Lisa : communicating gaze faithfully in 2D and 3D facial projections," ACM Transactions on Interactive Intelligent Systems, vol. 1, no. 2, p. 25, 2012.
[17]
S. Al Moubayed, J. Beskow and B. Granström, "Auditory visual prominence : From intelligibility to behavior," Journal on Multimodal User Interfaces, vol. 3, no. 4, pp. 299-309, 2009.
[18]
J. Edlund and J. Beskow, "MushyPeek : A Framework for Online Investigation of Audiovisual Dialogue Phenomena," Language and Speech, vol. 52, pp. 351-367, 2009.
[19]
G. Salvi et al., "SynFace – Speech-Driven Facial Animation for Virtual Speech-Reading Support," Eurasip Journal on Audio, Speech, and Music Processing, vol. 2009, Art. no. 191940, 2009.
[20]
J. Beskow et al., "Visualization of speech and audio for hearing-impaired persons," Technology and Disability, vol. 20, no. 2, pp. 97-107, 2008.
[21]
B. Lidestam and J. Beskow, "Motivation and appraisal in perception of poorly specified speech," Scandinavian Journal of Psychology, vol. 47, no. 2, pp. 93-101, 2006.
[22]
B. Lidestam and J. Beskow, "Visual phonemic ambiguity and speechreading," Journal of Speech, Language and Hearing Research, vol. 49, no. 4, pp. 835-847, 2006.
[23]
J. Beskow, "Trainable articulatory control models for visual speech synthesis," International Journal of Speech Technology, vol. 7, no. 4, pp. 335-349, 2004.

Conference papers

[24]
F. Malmberg et al., "Exploring Latent Sign Language Representations with Isolated Signs, Sentences and In-the-Wild Data," in 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources, sign-lang@LREC-COLING 2024, 2024, pp. 219-224.
[25]
S. Mehta et al., "Fake it to make it : Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 1952-1964.
[26]
A. W. Werner, J. Beskow and A. Deichler, "Gesture Evaluation in Virtual Reality," in ICMI Companion 2024 - Companion Publication of the 26th International Conference on Multimodal Interaction, 2024, pp. 156-164.
[27]
S. Mehta et al., "Matcha-TTS : A fast TTS architecture with conditional flow matching," in 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings, 2024, pp. 11341-11345.
[28]
C. Tånnander et al., "Prosodic characteristics of English-accented Swedish neural TTS," in Proceedings of Speech Prosody 2024, 2024, pp. 1035-1039.
[29]
J. Gustafsson et al., "Casual chatter or speaking up? Adjusting articulatory effort in generation of speech and animation for conversational characters," in 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition, FG 2023, 2023.
[30]
A. Deichler et al., "Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation," in Proceedings of the 25th International Conference on Multimodal Interaction, ICMI 2023, 2023, pp. 755-762.
[31]
J. Gustafsson, É. Székely and J. Beskow, "Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters," in 23rd ACM International Conference on Intelligent Virtual Agents (IVA 2023), 2023.
[32]
J. Miniotaitė et al., "Hi robot, it's not what you say, it's how you say it," in 2023 32nd IEEE International Conference on Robot and Human Interactive Communication, RO-MAN, 2023, pp. 307-314.
[33]
S. Mehta et al., "OverFlow : Putting flows on top of neural transducers for better TTS," in Interspeech 2023, 2023, pp. 4279-4283.
[34]
S. Mehta et al., "Neural HMMs are all you need (for high-quality attention-free TTS)," in 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 7457-7461.
[35]
B. Moell et al., "Speech Data Augmentation for Improving Phoneme Transcriptions of Aphasic Speech Using Wav2Vec 2.0 for the PSST Challenge," in The RaPID4 Workshop : Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments, 2022, pp. 62-70.
[36]
A. Deichler et al., "Towards Context-Aware Human-like Pointing Gestures with RL Motion Imitation," in Context-Awareness in Human-Robot Interaction: Approaches and Challenges, workshop at 2022 ACM/IEEE International Conference on Human-Robot Interaction, 2022.
[37]
J. Beskow et al., "Expressive Robot Performance based on Facial Motion Capture," in INTERSPEECH 2021, 2021, pp. 2343-2344.
[38]
J. Beskow et al., "Expressive robot performance based on facial motion capture," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2021, pp. 2165-2166.
[39]
S. Wang et al., "Integrated Speech and Gesture Synthesis," in ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction, 2021, pp. 177-185.
[40]
P. Jonell et al., "Mechanical Chameleons : Evaluating the effects of a social robot's non-verbal behavior on social influence," in Proceedings of SCRITA 2021, a workshop at IEEE RO-MAN 2021, 2021.
[41]
K. Chhatre et al., "Spatio-temporal priors in 3D human motion," in IEEE ICDL Workshop on Spatio-temporal Aspects of Embodied Predictive Processing, 2021.
[42]
É. Székely et al., "Breathing and Speech Planning in Spontaneous Speech Synthesis," in 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 7649-7653.
[43]
P. Jonell et al., "Can we trust online crowdworkers? : Comparing online and offline participants in a preference test of virtual agents," in IVA '20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, 2020.
[44]
M. Cohn et al., "Embodiment and gender interact in alignment to TTS voices," in Proceedings for the 42nd Annual Meeting of the Cognitive Science Society : Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020, 2020, pp. 220-226.
[45]
S. Alexanderson et al., "Generating coherent spontaneous speech and gesture from text," in Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, IVA 2020, 2020.
[46]
P. Jonell et al., "Let’s face it : Probabilistic multi-modal interlocutor-aware generation of facial gestures in dyadic settings," in IVA '20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, 2020.
[47]
K. Håkansson et al., "Robot-assisted detection of subclinical dementia : progress report and preliminary findings," in 2020 Alzheimer's Association International Conference (ALZ), 2020.
[48]
C. Chen et al., "Equipping social robots with culturally-sensitive facial expressions of emotion using data-driven methods," in 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019), 2019, pp. 1-8.
[49]
É. Székely et al., "How to train your fillers: uh and um in spontaneous speech synthesis," in The 10th ISCA Speech Synthesis Workshop, 2019.
[50]
P. Jonell et al., "Learning Non-verbal Behavior for a Social Robot from YouTube Videos," in ICDL-EpiRob Workshop on Naturalistic Non-Verbal and Affective Human-Robot Interactions, Oslo, Norway, August 19, 2019, 2019.
[52]
É. Székely et al., "Off the cuff : Exploring extemporaneous speech delivery with TTS," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019, pp. 3687-3688.
[54]
Z. Malisz et al., "PROMIS: a statistical-parametric speech synthesis system with prominence control via a prominence network," in Proceedings of SSW 10 - The 10th ISCA Speech Synthesis Workshop, 2019.
[55]
P. Wagner et al., "Speech Synthesis Evaluation : State-of-the-Art Assessment and Suggestion for a Novel Research Program," in Proceedings of the 10th Speech Synthesis Workshop (SSW10), 2019.
[56]
É. Székely et al., "Spontaneous conversational speech synthesis from found data," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019, pp. 4435-4439.
[57]
Z. Malisz et al., "The speech synthesis phoneticians need is both realistic and controllable," in Proceedings from FONETIK 2019, 2019.
[58]
Z. Malisz, P. Jonell and J. Beskow, "The visual prominence of whispered speech in Swedish," in Proceedings of 19th International Congress of Phonetic Sciences, 2019.
[59]
D. Kontogiorgos et al., "A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction," in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018, pp. 119-127.
[60]
P. Jonell et al., "Crowdsourced Multimodal Corpora Collection Tool," in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018, pp. 728-734.
[61]
H.-J. Vögel et al., "Emotion-awareness for intelligent vehicle assistants : A research agenda," in Proceedings - International Conference on Software Engineering, 2018, pp. 11-15.
[62]
C. Chen et al., "Reverse engineering psychologically valid facial expressions of emotion into social robots," i 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 2018, s. 448-452.
[63]
A. E. Vijayan et al., "Using Constrained Optimization for Real-Time Synchronization of Verbal and Nonverbal Robot Behavior," i 2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, s. 1955-1961.
[64]
K. Stefanov och J. Beskow, "A Real-time Gesture Recognition System for Isolated Swedish Sign Language Signs," i Proceedings of the 4th European and 7th Nordic Symposium on Multimodal Communication (MMSYM 2016), 2017.
[65]
Z. Malisz et al., "Controlling prominence realisation in parametric DNN-based speech synthesis," i Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 2017, s. 1079-1083.
[66]
C. Oertel et al., "Crowd-Sourced Design of Artificial Attentive Listeners," i Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017, s. 854-858.
[67]
P. Jonell et al., "Crowd-powered design of virtual attentive listeners," i 17th International Conference on Intelligent Virtual Agents, IVA 2017, 2017, s. 188-191.
[68]
C. Oertel et al., "Crowdsourced design of artificial attentive listeners," i INTERSPEECH: Situated Interaction, Augusti 20-24 Augusti, 2017, 2017.
[69]
Y. Zhang, J. Beskow och H. Kjellström, "Look but Don’t Stare : Mutual Gaze Interaction in Social Robots," i 9th International Conference on Social Robotics, ICSR 2017, 2017, s. 556-566.
[70]
M. S. L. Khan et al., "Moveable facial features in a social mediator," i 17th International Conference on Intelligent Virtual Agents, IVA 2017, 2017, s. 205-208.
[71]
J. Beskow et al., "Preface," i 17th International Conference on Intelligent Virtual Agents, IVA 2017, 2017, s. V-VI.
[72]
K. Stefanov, J. Beskow och G. Salvi, "Vision-based Active Speaker Detection in Multiparty Interaction," i Grounding Language Understanding, 2017.
[73]
K. Stefanov och J. Beskow, "A Multi-party Multi-modal Dataset for Focus of Visual Attention in Human-human and Human-robot Interaction," i Proceedings of the 10th edition of the Language Resources and Evaluation Conference, 2016.
[74]
J. Beskow and H. Berthelsen, "A hybrid harmonics-and-bursts modelling approach to speech synthesis," in Proceedings 9th ISCA Speech Synthesis Workshop, SSW 2016, 2016, pp. 208-213.
[75]
S. Alexanderson, D. House and J. Beskow, "Automatic annotation of gestural units in spontaneous face-to-face interaction," in MA3HMI 2016 - Proceedings of the Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction, 2016, pp. 15-19.
[76]
K. Stefanov and J. Beskow, "Gesture Recognition System for Isolated Sign Language Signs," in The 4th European and 7th Nordic Symposium on Multimodal Communication, 29-30 September 2016, University of Copenhagen, Denmark, 2016, pp. 57-59.
[77]
K. Stefanov, A. Sugimoto and J. Beskow, "Look Who’s Talking : Visual Identification of the Active Speaker in Multi-party Human-robot Interaction," in 2nd Workshop on Advancements in Social Signal Processing for Multimodal Interaction 2016, ASSP4MI 2016 - Held in conjunction with the 18th ACM International Conference on Multimodal Interaction 2016, ICMI 2016, 2016, pp. 22-27.
[78]
S. Alexanderson, C. O'Sullivan and J. Beskow, "Robust online motion capture labeling of finger markers," in Proceedings - Motion in Games 2016 : 9th International Conference on Motion in Games, MIG 2016, 2016, pp. 7-13.
[79]
J. Beskow, "Spoken and non-verbal interaction experiments with a social robot," in The Journal of the Acoustical Society of America, 2016.
[80]
G. Skantze, M. Johansson and J. Beskow, "A Collaborative Human-Robot Game as a Test-bed for Modelling Multi-party, Situated Interaction," in Intelligent Virtual Agents, IVA 2015, 2015, pp. 348-351.
[81]
G. Skantze, M. Johansson and J. Beskow, "Exploring Turn-taking Cues in Multi-party Human-robot Discussions about Objects," in Proceedings of the 2015 ACM International Conference on Multimodal Interaction, 2015.
[82]
S. Al Moubayed et al., "Human-robot Collaborative Tutoring Using Multiparty Multimodal Spoken Dialogue," in 9th Annual ACM/IEEE International Conference on Human-Robot Interaction, Bielefeld, Germany, 2014.
[83]
S. Al Moubayed, J. Beskow and G. Skantze, "Spontaneous spoken dialogues with the Furhat human-like robot head," in HRI '14 Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction, 2014, p. 326.
[84]
J. Beskow et al., "Tivoli - Learning Signs Through Games and Interaction for Children with Communicative Disorders," in 6th Biennial Conference of the International Society for Augmentative and Alternative Communication, Lisbon, Portugal, 2014.
[85]
S. Al Moubayed et al., "Tutoring Robots: Multiparty Multimodal Social Dialogue With an Embodied Tutor," in 9th International Summer Workshop on Multimodal Interfaces, Lisbon, Portugal, 2014.
[86]
K. Stefanov and J. Beskow, "A Kinect Corpus of Swedish Sign Language Signs," in Proceedings of the 2013 Workshop on Multimodal Corpora : Beyond Audio and Video, 2013.
[87]
S. Alexanderson, D. House and J. Beskow, "Aspects of co-occurring syllables and head nods in spontaneous dialogue," in Proceedings of 12th International Conference on Auditory-Visual Speech Processing (AVSP2013), 2013, pp. 169-172.
[88]
S. Alexanderson, D. House and J. Beskow, "Extracting and analysing co-speech head gestures from motion-capture data," in Proceedings of Fonetik 2013, 2013, pp. 1-4.
[89]
S. Alexanderson, D. House and J. Beskow, "Extracting and analyzing head movements accompanying spontaneous dialogue," in Conference Proceedings TiGeR 2013 : Tilburg Gesture Research Meeting, 2013.
[90]
B. Bollepalli, J. Beskow and J. Gustafsson, "Non-Linear Pitch Modification in Voice Conversion using Artificial Neural Networks," in Advances in nonlinear speech processing : 6th International Conference, NOLISP 2013, Mons, Belgium, June 19-21, 2013 : proceedings, 2013, pp. 97-103.
[91]
S. Al Moubayed, J. Beskow and G. Skantze, "The Furhat Social Companion Talking Head," in Interspeech 2013 - Show and Tell, 2013, pp. 747-749.
[92]
J. Beskow et al., "The Tivoli System - A Sign-driven Game for Children with Communicative Disorders," in 1st Symposium on Multimodal Communication, Msida, Malta, 2013.
[93]
J. Beskow and K. Stefanov, "Web-enabled 3D Talking Avatars Based on WebGL and HTML5," in 13th International Conference on Intelligent Virtual Agents, Edinburgh, UK, 2013.
[94]
J. Edlund et al., "3rd party observer gaze as a continuous measure of dialogue flow," in Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012, 2012, pp. 1354-1358.
[95]
S. Alexanderson and J. Beskow, "Can Anybody Read Me? Motion Capture Recordings for an Adaptable Visual Speech Synthesizer," in Proceedings of The Listening Talker, 2012, p. 52.
[97]
S. Al Moubayed et al., "Furhat : A Back-projected Human-like Robot Head for Multiparty Human-Machine Interaction," i Cognitive Behavioural Systems : COST 2102 International Training School, Dresden, Germany, February 21-26, 2011, Revised Selected Papers, 2012, s. 114-130.
[98]
G. Skantze et al., "Furhat at Robotville : A Robot Head Harvesting the Thoughts of the Public through Multi-party Dialogue," i Proceedings of the Workshop on Real-time Conversation with Virtual Agents IVA-RCVA, 2012.
[100]
B. Bollepalli, J. Beskow and J. Gustafson, "HMM based speech synthesis system for Swedish Language," in The Fourth Swedish Language Technology Conference, 2012.
[101]
S. Al Moubayed, G. Skantze and J. Beskow, "Lip-reading : Furhat audio visual intelligibility of a back projected animated face," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012, pp. 196-203.
[102]
S. Al Moubayed et al., "Multimodal Multiparty Social Interaction with the Furhat Head," in 14th ACM International Conference on Multimodal Interaction, Santa Monica, CA, 2012, pp. 293-294.
[103]
S. Al Moubayed et al., "A robotic head using projected animated faces," in Proceedings of the International Conference on Audio-Visual Speech Processing 2011, 2011, p. 71.
[104]
S. Al Moubayed et al., "Animated Faces for Robotic Heads : Gaze and Beyond," in Analysis of Verbal and Nonverbal Communication and Enactment : The Processing Issues, 2011, pp. 19-35.
[105]
J. Beskow et al., "Kinetic Data for Large-Scale Analysis and Modeling of Face-to-Face Conversation," in Proceedings of International Conference on Audio-Visual Speech Processing 2011, 2011, pp. 103-106.
[106]
J. Edlund, S. Al Moubayed and J. Beskow, "The Mona Lisa Gaze Effect as an Objective Metric for Perceived Cospatiality," in Proc. of the Intelligent Virtual Agents 10th International Conference (IVA 2011), 2011, pp. 439-440.
[107]
S. Al Moubayed et al., "Audio-Visual Prosody : Perception, Detection, and Synthesis of Prominence," in 3rd COST 2102 International Training School on Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces : Theoretical and Practical Issues, 2010, pp. 55-71.
[108]
J. Edlund and J. Beskow, "Capturing massively multimodal dialogues : affordable synchronization and visualization," in Proc. of Multimodal Corpora : Advances in Capturing, Coding and Analyzing Multimodality (MMC 2010), 2010, pp. 160-161.
[109]
J. Beskow et al., "Face-to-Face Interaction and the KTH Cooking Show," in Development of multimodal interfaces : Active listening and synchrony, 2010, pp. 157-168.
[110]
J. Beskow and S. Al Moubayed, "Perception of Gaze Direction in 2D and 3D Facial Projections," in The ACM / SSPNET 2nd International Symposium on Facial Analysis and Animation, 2010, p. 24.
[111]
S. Al Moubayed and J. Beskow, "Perception of Nonverbal Gestures of Prominence in Visual Speech Animation," in Proceedings of the ACM/SSPNET 2nd International Symposium on Facial Analysis and Animation, 2010, p. 25.
[112]
S. Al Moubayed and J. Beskow, "Prominence Detection in Swedish Using Syllable Correlates," in Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, 2010, pp. 1784-1787.
[113]
S. Schötz et al., "Simulating Intonation in Regional Varieties of Swedish," in Speech Prosody 2010, 2010.
[114]
J. Edlund et al., "Spontal : a Swedish spontaneous dialogue corpus of audio, video and motion capture," in Proc. of the Seventh conference on International Language Resources and Evaluation (LREC'10), 2010, pp. 2992-2995.
[115]
S. Al Moubayed and J. Beskow, "Effects of Visual Prominence Cues on Speech Intelligibility," in Proceedings of Auditory-Visual Speech Processing AVSP'09, 2009.
[116]
F. López-Colino, J. Beskow and J. Colas, "Mobile SynFace : Talking head interface for mobile VoIP telephone calls," in Actas del X Congreso Internacional de Interaccion Persona-Ordenador, INTERACCION 2009, 2009.
[117]
J. Beskow, G. Salvi and S. Al Moubayed, "SynFace : Verbal and Non-verbal Face Animation from Audio," in Proceedings of The International Conference on Auditory-Visual Speech Processing AVSP'09, 2009.
[118]
J. Beskow, G. Salvi and S. Al Moubayed, "SynFace - Verbal and Non-verbal Face Animation from Audio," in Auditory-Visual Speech Processing 2009, AVSP 2009, 2009.
[119]
J. Beskow et al., "The MonAMI Reminder : a spoken dialogue system for face-to-face interaction," in Proceedings of the 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009, 2009, pp. 300-303.
[120]
S. Al Moubayed et al., "Virtual Speech Reading Support for Hard of Hearing in a Domestic Multi-Media Setting," in INTERSPEECH 2009 : 10th Annual Conference of the International Speech Communication Association 2009, 2009, pp. 1443-1446.
[121]
J. Beskow and L. Cerrato, "Evaluation of the expressivity of a Swedish talking head in the context of human-machine interaction," in Comunicazione parlata e manifestazione delle emozioni : Atti del I Convegno GSCP, Padova 29 novembre - 1 dicembre 2004, 2008.
[122]
J. Beskow et al., "Hearing at Home : Communication support in home environments for hearing impaired persons," i INTERSPEECH 2008 : 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, 2008, s. 2203-2206.
[123]
J. Beskow et al., "Innovative interfaces in MonAMI : The Reminder," i Perception In Multimodal Dialogue Systems, Proceedings, 2008, s. 272-275.
[124]
J. Beskow et al., "Recognizing and Modelling Regional Varieties of Swedish," i INTERSPEECH 2008 : 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, 2008, s. 512-515.
[125]
J. Beskow, B. Granström och D. House, "Analysis and synthesis of multimodal verbal and non-verbal interaction for animated interface agents," i VERBAL AND NONVERBAL COMMUNICATION BEHAVIOURS, 2007, s. 250-263.
[126]
J. Edlund och J. Beskow, "Pushy versus meek : using avatars to influence turn-taking behaviour," i INTERSPEECH 2007 : 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, 2007, s. 2784-2787.
[127]
E. Agelfors et al., "User evaluation of the SYNFACE talking head telephone," i Computers Helping People With Special Needs, Proceedings, 2006, s. 579-586.
[128]
J. Beskow, B. Granström och D. House, "Visual correlates to prominence in several expressive modes," i INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, 2006, s. 1272-1275.
[129]
J. Beskow och M. Nordenberg, "Data-driven synthesis of expressive visual speech using an MPEG-4 talking head," i 9th European Conference on Speech Communication and Technology, 2005, s. 793-796.
[130]
O. Engwall et al., "Design strategies for a virtual language tutor," i INTERSPEECH 2004, ICSLP, 8th International Conference on Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004, 2004, s. 1693-1696.
[131]
J. Beskow et al., "Expressive animated agents for affective dialogue systems," i AFFECTIVE DIALOGUE SYSTEMS, PROCEEDINGS, 2004, s. 240-243.
[132]
J. Beskow et al., "Preliminary cross-cultural evaluation of expressiveness in synthetic faces," i Affective Dialogue Systems, Proceedings, 2004, s. 301-304.
[133]
J. Beskow et al., "SYNFACE - A talking head telephone for the hearing-impaired," i COMPUTERS HELPING PEOPLE WITH SPECIAL NEEDS : PROCEEDINGS, 2004, s. 1178-1185.
[134]
K.-E. Spens et al., "SYNFACE, a talking head telephone for the hearing impaired," i IFHOH 7th World Congress for the Hard of Hearing. Helsinki Finland. July 4-9, 2004, 2004.
[135]
J. Beskow et al., "The Swedish PFs-Star Multimodal Corpora," i Proceedings of LREC Workshop on Models of Human Behaviour for the Specification and Evaluation of Multimodal Input and Output Interfaces, 2004, s. 34-37.
[136]
E. Agelfors et al., "A synthetic face as a lip-reading support for hearing impaired telephone users - problems and positive results," i European audiology in 1999 : proceeding of the 4th European Conference in Audiology, Oulu, Finland, June 6-10, 1999, 1999.
[137]
E. Agelfors et al., "Synthetic visual speech driven from auditory speech," i Proceedings of Audio-Visual Speech Processing (AVSP'99)), 1999.

Book chapters

[138]
G. Skantze, J. Gustafson and J. Beskow, "Multimodal Conversational Interaction with Robots," in The Handbook of Multimodal-Multisensor Interfaces, Volume 3 : Language Processing, Software, Commercialization, and Emerging Directions, Sharon Oviatt, Björn Schuller, Philip R. Cohen, Daniel Sonntag, Gerasimos Potamianos and Antonio Krüger, Eds., ACM Press, 2019.
[139]
J. Edlund, S. Al Moubayed and J. Beskow, "Co-present or Not? : Embodiment, Situatedness and the Mona Lisa Gaze Effect," in Eye gaze in intelligent user interfaces : gaze-based analyses, models and applications, Nakano, Yukiko; Conati, Cristina; Bader, Thomas, Eds., London : Springer London, 2013, pp. 185-203.
[140]
J. Edlund, D. House and J. Beskow, "Gesture movement profiles in dialogues from a Swedish multimodal database of spontaneous speech," in Prosodic and Visual Resources in Interactional Grammar, Bergmann, Pia; Brenning, Jana; Pfeiffer, Martin C.; Reber, Elisabeth, Eds., Walter de Gruyter, 2012.
[141]
J. Beskow et al., "Multimodal Interaction Control," in Computers in the Human Interaction Loop, Waibel, Alexander; Stiefelhagen, Rainer, Eds., Berlin/Heidelberg : Springer, 2009, pp. 143-158.
[142]
J. Beskow, J. Edlund and M. Nordstrand, "A Model for Multimodal Dialogue System Output Applied to an Animated Talking Head," in Spoken Multimodal Human-Computer Dialogue in Mobile Environments, Minker, Wolfgang; Bühler, Dirk; Dybkjær, Laila, Eds., Dordrecht : Springer, 2005, pp. 93-113.

Non-peer-reviewed

Conference papers

[143]
D. House, S. Alexanderson and J. Beskow, "On the temporal domain of co-speech gestures: syllable, phrase or talk spurt?," in Proceedings of Fonetik 2015, 2015, pp. 63-68.
[144]
S. Al Moubayed et al., "Talking with Furhat - multi-party interaction with a back-projected robot head," in Proceedings of Fonetik 2012, 2012, pp. 109-112.
[145]
S. Al Moubayed and J. Beskow, "A novel Skype interface using SynFace for virtual speech reading support," in Proceedings from Fonetik 2011, June 8 - June 10, 2011 : Speech, Music and Hearing, Quarterly Progress and Status Report, TMH-QPSR, Volume 51, 2011, pp. 33-36.
[146]
J. Edlund, J. Gustafson and J. Beskow, "Cocktail : a demonstration of massively multi-component audio environments for illustration and analysis," in SLTC 2010, The Third Swedish Language Technology Conference (SLTC 2010) : Proceedings of the Conference, 2010.
[147]
J. Beskow and B. Granström, "Goda utsikter för teckenspråksteknologi," in Språkteknologi för ökad tillgänglighet : Rapport från ett nordiskt seminarium, 2010, pp. 77-86.
[148]
J. Beskow et al., "Modelling humanlike conversational behaviour," in SLTC 2010 : The Third Swedish Language Technology Conference (SLTC 2010), Proceedings of the Conference, 2010, pp. 9-10.
[149]
J. Beskow et al., "Research focus : Interactional aspects of spoken face-to-face communication," in Proceedings from Fonetik 2010, Lund, June 2-4, 2010, pp. 7-10.
[150]
S. Schötz et al., "Simulating Intonation in Regional Varieties of Swedish," in Fonetik 2010, 2010.
[151]
J. Beskow and J. Gustafson, "Experiments with Synthesis of Swedish Dialects," in Proceedings of Fonetik 2009, 2009, pp. 28-29.
[152]
J. Beskow et al., "Project presentation: Spontal : multimodal database of spontaneous dialog," in Proceedings of Fonetik 2009 : The XXIIth Swedish Phonetics Conference, 2009, pp. 190-193.
[153]
S. Al Moubayed et al., "Studies on Using the SynFace Talking Head for the Hearing Impaired," in Proceedings of Fonetik'09 : The XXIIth Swedish Phonetics Conference, June 10-12, 2009, 2009, pp. 140-143.
[154]
J. Beskow et al., "Human Recognition of Swedish Dialects," in Proceedings of Fonetik 2008 : The XXIst Swedish Phonetics Conference, 2008, pp. 61-64.
[155]
F. López-Colino, J. Beskow and J. Colás, "Mobile SynFace : Ubiquitous visual interface for mobile VoIP telephone calls," in Proceedings of The second Swedish Language Technology Conference (SLTC), 2008.
[156]
J. Beskow et al., "Speech technology in the European project MonAMI," in Proceedings of FONETIK 2008, 2008, pp. 33-36.
[157]
S. Al Moubayed, J. Beskow and G. Salvi, "SynFace Phone Recognizer for Swedish Wideband and Narrowband Speech," in Proceedings of The second Swedish Language Technology Conference (SLTC), 2008, pp. 3-6.
[158]
J. Edlund, J. Beskow and M. Heldner, "MushyPeek : an experiment framework for controlled investigation of human-human interaction control behaviour," in Proceedings of Fonetik 2007, 2007, pp. 61-64.
[159]
J. Beskow, B. Granström and D. House, "Focal accent and facial movements in expressive speech," in Proceedings from Fonetik 2006, Lund, June 7-9, 2006, 2006, pp. 9-12.
[160]
C. Siciliano et al., "Evaluation of a Multilingual Synthetic Talking Face as a Communication Aid for the Hearing Impaired," in Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS'03), 2003, pp. 131-134.
[161]
J. Beskow, O. Engwall and B. Granström, "Resynthesis of Facial and Intraoral Articulation from Simultaneous Measurements," in Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS'03), 2003.
[162]
D. W. Massaro et al., "Picture My Voice : Audio to Visual Speech Synthesis using Artificial Neural Networks," in Proceedings of International Conference on Auditory-Visual Speech Processing, 1999, pp. 133-138.
[163]
M. M. Cohen, J. Beskow and D. W. Massaro, "Recent developments in facial animation : an inside view," in Proceedings of International Conference on Auditory-Visual Speech Processing, 1998, pp. 201-206.
[164]
J. Beskow, "Animation of talking agents," in Proceedings of International Conference on Auditory-Visual Speech Processing, 1997, pp. 149-152.
[165]
J. Beskow, "Rule-based visual speech synthesis," in Proceedings of the 4th European Conference on Speech Communication and Technology, 1995, pp. 299-302.

Book chapters

[166]
D. W. Massaro et al., "Animated speech : Research progress and applications," in Audiovisual Speech Processing, Cambridge University Press, 2012, pp. 309-345.

Theses

[167]
J. Beskow, "Talking Heads - Models and Applications for Multimodal Speech Synthesis," Doktorsavhandling : Institutionen för talöverföring och musikakustik, Trita-TMH, 2003:7, 2003.
Last synced with DiVA:
2024-12-22 02:14:25