Publications by Shivam Mehta

Peer-reviewed

Articles

[1]

R. Cumbal et al., "Stereotypical nationality representations in HRI: perspectives from international young adults," Frontiers in Robotics and AI, vol. 10, 2023.

Conference papers

[2]

C. Tånnander et al., "Beyond graphemes and phonemes: continuous phonological features in neural text-to-speech synthesis," in Interspeech 2024, 2024, pp. 2815-2819.

[3]

S. Mehta et al., "Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 1952-1964.

[4]

S. Mehta et al., "Matcha-TTS: A fast TTS architecture with conditional flow matching," in 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings, 2024, pp. 11341-11345.

[5]

S. Mehta et al., "Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech," in Interspeech 2024, 2024, pp. 2285-2289.

[6]

S. Mehta et al., "Unified speech and gesture synthesis using flow matching," in 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024), 2024, pp. 8220-8224.

[7]

A. Deichler et al., "Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation," in Proceedings of the 25th International Conference on Multimodal Interaction, ICMI 2023, 2023, pp. 755-762.

[8]

S. Mehta et al., "OverFlow: Putting flows on top of neural transducers for better TTS," in Interspeech 2023, 2023, pp. 4279-4283.

[9]

H. Lameris et al., "Prosody-Controllable Spontaneous TTS with Neural HMMs," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023.

[10]

S. Mehta et al., "Neural HMMs are all you need (for high-quality attention-free TTS)," in 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 7457-7461.

[11]

B. Moell et al., "Speech Data Augmentation for Improving Phoneme Transcriptions of Aphasic Speech Using Wav2Vec 2.0 for the PSST Challenge," in The RaPID4 Workshop: Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments, 2022, pp. 62-70.

Non-peer-reviewed

Conference papers

[12]

H. Lameris et al., "Spontaneous Neural HMM TTS with Prosodic Feature Modification," in Proceedings of Fonetik 2022, 2022.

Latest sync with DiVA:

2025-05-21 10:04:50 UTC