Publications by Éva Székely

Peer-reviewed

Articles

[1]
É. Székely et al., "Facial expression-based affective speech translation," Journal on Multimodal User Interfaces, vol. 8, no. 1, s. 87-96, 2014.

Conference papers

[3]
S. Wang and É. Székely, "Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model," in 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings, 2024, pp. 6464-6474.
[4]
S. Mehta et al., "MATCHA-TTS: A FAST TTS ARCHITECTURE WITH CONDITIONAL FLOW MATCHING," i 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings, 2024, s. 11341-11345.
[5]
H. Lameris, É. Székely and J. Gustafsson, "The Role of Creaky Voice in Turn Taking and the Perception of Speaker Stance: Experiments Using Controllable TTS," in 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings, 2024, pp. 16058-16065.
[6]
S. Wang et al., "A Comparative Study of Self-Supervised Speech Representations in Read and Spontaneous TTS," i ICASSPW 2023 : 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, Proceedings, 2023.
[7]
E. Ekstedt et al., "Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2023, 2023, pp. 5481-5485.
[8]
H. Lameris, J. Gustafsson and É. Székely, "Beyond style: synthesizing speech with pragmatic functions," in Interspeech 2023, 2023, pp. 3382-3386.
[9]
I. Torre et al., "Can a gender-ambiguous voice reduce gender stereotypes in human-robot interactions?," in 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2023, pp. 106-112.
[10]
J. Gustafsson et al., "Casual chatter or speaking up? Adjusting articulatory effort in generation of speech and animation for conversational characters," i 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition, FG 2023, 2023.
[11]
J. Gustafsson, É. Székely and J. Beskow, "Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters," in 23rd ACM International Conference on Intelligent Virtual Agents (IVA 2023), 2023.
[12]
J. Miniotaitė et al., "Hi robot, it's not what you say, it's how you say it," in 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2023, pp. 307-314.
[13]
S. Mehta et al., "OverFlow : Putting flows on top of neural transducers for better TTS," i Interspeech 2023, 2023, s. 4279-4283.
[14]
A. Kirkland, J. Gustafsson and É. Székely, "Pardon my disfluency: The impact of disfluency effects on the perception of speaker competence and confidence," in Interspeech 2023, 2023, pp. 5217-5221.
[15]
H. Lameris et al., "Prosody-Controllable Spontaneous TTS with Neural HMMs," i International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023.
[16]
É. Székely, J. Gustafsson and I. Torre, "Prosody-controllable gender-ambiguous speech synthesis: a tool for investigating implicit bias in speech perception," in Interspeech 2023, 2023, pp. 1234-1238.
[17]
É. Székely, S. Wang and J. Gustafsson, "So-to-Speak: an exploratory platform for investigating the interplay between style and prosody in TTS," in Interspeech 2023, 2023, pp. 2016-2017.
[19]
M. P. Aylett et al., "Why is my Agent so Slow? Deploying Human-Like Conversational Turn-Taking," in HAI 2023 - Proceedings of the 11th Conference on Human-Agent Interaction, 2023, pp. 490-492.
[20]
S. Wang, J. Gustafsson and É. Székely, "Evaluating Sampling-based Filler Insertion with Spontaneous TTS," in LREC 2022: Thirteenth International Conference on Language Resources and Evaluation, 2022, pp. 1960-1969.
[21]
S. Mehta et al., "Neural HMMs are all you need (for high-quality attention-free TTS)," i 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, s. 7457-7461.
[22]
N. Ward et al., "Two Pragmatic Functions of Breathy Voice in American English Conversation," in Proceedings of the 11th International Conference on Speech Prosody, 2022, pp. 82-86.
[24]
S. Wang et al., "Integrated Speech and Gesture Synthesis," i ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction, 2021, s. 177-185.
[25]
A. Kirkland et al., "Perception of smiling voice in spontaneous speech synthesis," i Proceedings of Speech Synthesis Workshop (SSW11), 2021, s. 108-112.
[26]
É. Székely, J. Edlund and J. Gustafsson, "Augmented Prompt Selection for Evaluation of Spontaneous Speech Synthesis," in Proceedings of the 12th Language Resources and Evaluation Conference, 2020, pp. 6368-6374.
[27]
É. Székely et al., "Breathing and Speech Planning in Spontaneous Speech Synthesis," i 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, s. 7649-7653.
[28]
S. Alexanderson et al., "Generating coherent spontaneous speech and gesture from text," in Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, IVA 2020, 2020.
[29]
É. Székely, G. E. Henter and J. Gustafson, "Casting to Corpus: Segmenting and Selecting Spontaneous Dialogue for TTS with a CNN-LSTM Speaker-Dependent Breath Detector," in 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 6925-6929.
[30]
É. Székely et al., "How to train your fillers: uh and um in spontaneous speech synthesis," i The 10th ISCA Speech Synthesis Workshop, 2019.
[31]
L. Clark et al., "Mapping Theoretical and Methodological Perspectives for Understanding Speech Interface Interactions," in CHI EA '19: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, 2019.
[32]
É. Székely et al., "Off the cuff : Exploring extemporaneous speech delivery with TTS," i Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019, s. 3687-3688.
[33]
P. Wagner et al., "Speech Synthesis Evaluation: State-of-the-Art Assessment and Suggestion for a Novel Research Program," in Proceedings of the 10th Speech Synthesis Workshop (SSW10), 2019.
[34]
É. Székely et al., "Spontaneous conversational speech synthesis from found data," i Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019, s. 4435-4439.
[35]
S. Betz et al., "The greennn tree - lengthening position influences uncertainty perception," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2019, 2019, pp. 3990-3994.
[36]
É. Székely, P. Wagner and J. Gustafson, "The Wrylie-Board: Mapping Acoustic Space of Expressive Feedback to Attitude Markers," in Proc. IEEE Spoken Language Technology Conference, 2018.
[37]
É. Székely, J. Mendelson and J. Gustafson, "Synthesising uncertainty: The interplay of vocal effort and hesitation disfluencies," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017, pp. 804-808.
[38]
B. R. Cowan et al., "They Know as Much as We Do: Knowledge Estimation and Partner Modelling of Artificial Partners," in CogSci 2017 - Proceedings of the 39th Annual Meeting of the Cognitive Science Society: Computational Foundations of Cognition, 2017, pp. 1836-1841.
[39]
C. Oertel et al., "Using crowd-sourcing for the design of listening agents: Challenges and opportunities," in ISIAA 2017 - Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, Co-located with ICMI 2017, 2017, pp. 37-38.
[40]
É. Székely, M. T. Keane and J. Carson-Berndsen, "The Effect of Soft, Modal and Loud Voice Levels on Entrainment in Noisy Conditions," in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
[41]
Z. Ahmed et al., "A system for facial expression-based affective speech translation," in Proceedings of the Companion Publication of the 2013 International Conference on Intelligent User Interfaces Companion, 2013, pp. 57-58.
[42]
É. Székely et al., "Detecting a targeted voice style in an audiobook using voice quality features," i Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, 2012, s. 4593-4596.
[43]
É. Székely et al., "Evaluating expressive speech synthesis from audiobooks in conversational phrases," i International Conference on Language Resources and Evaluation. MAY 21-27, 2012., 2012, s. 3335-3339.
[44]
É. Székely et al., "Facial expression as an input annotation modality for affective speech-to-speech translation," i Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction, 2012.
[45]
M. Abou-Zleikha et al., "Multi-level exemplar-based duration generation for expressive speech synthesis," in Proceedings of Speech Prosody, 2012.
[46]
J. P. Cabral et al., "Rapidly Testing the Interaction Model of a Pronunciation Training System via Wizard-of-Oz," in Proceedings of the International Conference on Language Resources and Evaluation, 2012, pp. 4136-4142.
[47]
É. Székely et al., "Synthesizing expressive speech from amateur audiobook recordings," i Spoken Language Technology Workshop (SLT), 2012, s. 297-302.
[49]
É. Székely et al., "WinkTalk : a demonstration of a multimodal speech synthesis platform linking facial expressions to expressive synthetic voices," i Proceedings of the Third Workshop on Speech and Language Processing for Assistive Technologies, 2012, s. 5-8.
[50]
É. Székely et al., "Clustering Expressive Speech Styles in Audiobooks Using Glottal Source Parameters.," i 12th Annual Conference of the International-Speech-Communication-Association 2011 (INTERSPEECH 2011), 2011, s. 2409-2412.
[51]
P. Cahill et al., "UCD Blizzard Challenge 2011 entry," in Proceedings of the Blizzard Challenge Workshop, 2011.

Non-peer-reviewed

Conference papers

[52]
H. Lameris et al., "Spontaneous Neural HMM TTS with Prosodic Feature Modification," i Proceedings of Fonetik 2022, 2022.
Last synchronized with DiVA:
2024-11-20 00:07:31