Personal information

United States, India

Biography

I am currently a senior applied scientist at Microsoft. I completed my PhD at Johns Hopkins University in the Center for Language and Speech Processing. Most recently I have been working on robust speech recognition in far-field, multi-talker scenarios. My PhD dissertation focused on improving ASR with front-end processing such as speech denoising, speech dereverberation, source separation, and multi-source localization.

Activities

Employment (3)

Microsoft: Redmond, WA, US

2022-10-03 to present
Employment
Source: Self-asserted source
Aswin Shanmugam Subramanian

Mitsubishi Electric Research Laboratories: Cambridge, MA, US

2021-09-01 to 2022-09-30
Employment
Source: Self-asserted source
Aswin Shanmugam Subramanian

Johns Hopkins University: Baltimore, Maryland, US

2017-09-01 to 2021-08-20 | PhD Research Assistant
Employment
Source: Self-asserted source
Aswin Shanmugam Subramanian

Education and qualifications (4)

Johns Hopkins University: Baltimore, Maryland, US

2016-09 to 2022-05 | Ph.D. (Electrical and Computer Engineering)
Education
Source: Self-asserted source
Aswin Shanmugam Subramanian

Johns Hopkins University: Baltimore, Maryland, US

2016-09 to 2017-12 | Master of Science in Engineering (Electrical and Computer Engineering)
Education
Source: Self-asserted source
Aswin Shanmugam Subramanian

Indian Institute of Technology Madras: Chennai, Tamil Nadu, IN

2012-12 to 2016-07 | Master of Science by Research (Computer Science)
Education
Source: Self-asserted source
Aswin Shanmugam Subramanian

SSN College of Engineering: Chennai, Tamil Nadu, IN

2008 to 2012 | B.Tech. (Information Technology)
Education
Source: Self-asserted source
Aswin Shanmugam Subramanian

Works (22)

TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings

IEEE/ACM Transactions on Audio, Speech, and Language Processing
2024 | Journal article
Contributors: Christoph Boeddeker; Aswin Shanmugam Subramanian; Gordon Wichern; Reinhold Haeb-Umbach; Jonathan Le Roux
Source: Crossref

Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks

IEEE/ACM Transactions on Audio, Speech, and Language Processing
2023 | Journal article
Contributors: Darius Petermann; Gordon Wichern; Aswin Shanmugam Subramanian; Zhong-Qiu Wang; Jonathan Le Roux
Source: Crossref

Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition

Computer Speech & Language
2022-09 | Journal article
Contributors: Aswin Shanmugam Subramanian; Chao Weng; Shinji Watanabe; Meng Yu; Dong Yu
Source: Crossref
Preferred source (of 2)

Directional ASR: A new paradigm for E2E multi-speaker speech recognition with source localization

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
2021 | Conference paper
EID: 2-s2.0-85114964965

Part of ISSN: 15206149
Contributors: Subramanian, A.S.; Weng, C.; Watanabe, S.; Yu, M.; Xu, Y.; Zhang, S.-X.; Yu, D.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Attention-Based ASR with Lightweight and Dynamic Convolutions

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
2020 | Conference paper
EID: 2-s2.0-85089209725

Part of ISSN: 15206149
Contributors: Fujita, Y.; Subramanian, A.S.; Omachi, M.; Watanabe, S.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

CHiME-6 Challenge: Tackling multispeaker speech recognition for unsegmented recordings

arXiv
2020 | Other
EID: 2-s2.0-85095031852

Part of ISSN: 23318422
Contributors: Watanabe, S.; Mandel, M.; Barker, J.; Vincent, E.; Arora, A.; Chang, X.; Khudanpur, S.; Manohar, V.; Povey, D.; Raj, D. et al.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

End-to-end ASR with adaptive span self-attention

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2020 | Conference paper
EID: 2-s2.0-85098227176

Part of ISSN: 19909772 2308457X
Contributors: Chang, X.; Subramanian, A.S.; Guo, P.; Watanabe, S.; Fujita, Y.; Omachi, M.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

End-to-end far-field speech recognition with unified dereverberation and beamforming

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2020 | Conference paper
EID: 2-s2.0-85098131286

Part of ISSN: 19909772 2308457X
Contributors: Zhang, W.; Subramanian, A.S.; Chang, X.; Watanabe, S.; Qian, Y.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
2020 | Conference paper
EID: 2-s2.0-85089242977

Part of ISSN: 15206149
Contributors: Subramanian, A.S.; Weng, C.; Yu, M.; Zhang, S.-X.; Xu, Y.; Watanabe, S.; Yu, D.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Significance of spectral cues in automatic speech segmentation for Indian language speech synthesizers

Speech Communication
2020 | Journal article
EID: 2-s2.0-85087589953

Part of ISSN: 01676393
Contributors: Baby, A.; Prakash, J.J.; Subramanian, A.S.; Murthy, H.A.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge

arXiv
2020 | Other
EID: 2-s2.0-85094722609

Part of ISSN: 23318422
Contributors: Arora, A.; Raj, D.; Subramanian, A.S.; Li, K.; Ben-Yair, B.; Maciejewski, M.; Zelasko, P.; García, P.; Watanabe, S.; Khudanpur, S.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020)
2020-05-04 | Other
Contributors: Shinji Watanabe; Michael Mandel; Jon Barker; Emmanuel Vincent; Ashish Arora; Xuankai Chang; Sanjeev Khudanpur; Vimal Manohar; Daniel Povey; Desh Raj et al.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Crossref Metadata Search

An investigation of end-to-end multichannel speech recognition for reverberant and mismatch conditions

arXiv
2019 | Other
EID: 2-s2.0-85093752224

Part of ISSN: 23318422
Contributors: Subramanian, A.S.; Wang, X.; Watanabe, S.; Taniguchi, T.; Tran, D.; Fujita, Y.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Generalized weighted-prediction-error dereverberation with varying source priors for reverberant speech recognition

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
2019 | Conference paper
EID: 2-s2.0-85078572064

Part of ISSN: 19471629 19311168
Contributors: Taniguchi, T.; Subramanian, A.S.; Wang, X.; Tran, D.; Fujita, Y.; Watanabe, S.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Speech enhancement using end-to-end speech recognition objectives

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
2019 | Conference paper
EID: 2-s2.0-85078046877

Part of ISSN: 19471629 19311168
Contributors: Subramanian, A.S.; Wang, X.; Baskar, M.K.; Watanabe, S.; Taniguchi, T.; Tran, D.; Fujita, Y.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2018 | Conference paper
EID: 2-s2.0-85054975722

Part of ISSN: 19909772 2308457X
Contributors: Chen, S.-J.; Subramanian, A.S.; Xu, H.; Watanabe, S.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Student-teacher learning for BLSTM mask-based speech enhancement

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2018 | Conference paper
EID: 2-s2.0-85054958811

Part of ISSN: 19909772 2308457X
Contributors: Subramanian, A.S.; Chen, S.-J.; Watanabe, S.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

TBT (Toolkit to build TTS): A high performance framework to build multiple language HTS voice

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2017 | Conference paper
EID: 2-s2.0-85039148047

Part of ISSN: 19909772 2308457X
Contributors: Ghone, A.S.; Nerpagar, R.; Kumar, P.; Baby, A.; Shanmugam, A.; Sasikumar, M.; Murthy, H.A.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Exploration of vowel onset and offset points for hybrid speech segmentation

IEEE Region 10 Annual International Conference, Proceedings/TENCON
2016 | Conference paper
EID: 2-s2.0-84962136619

Part of ISSN: 21593450 21593442
Contributors: Sarma, B.D.; Sharma, B.; Shanmugam, S.A.; Prasanna, S.R.M.; Murthy, H.A.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Significance of Pseudo-syllables in building better acoustic models for Indian English TTS

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
2016 | Conference paper
EID: 2-s2.0-84973394997

Part of ISSN: 15206149
Contributors: Vignesh, S.R.; Shanmugam, S.A.; Murthy, H.A.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Building speech synthesis systems for Indian languages

2015 Twenty First National Conference on Communications (NCC)
2015-02 | Other
Contributors: Abhijit Pradhan; Anusha Prakash; S Aswin Shanmugam; G R Kasthuri; Raghava Krishnan; Hema A Murthy
Source: Self-asserted source
Aswin Shanmugam Subramanian via Crossref Metadata Search

A hybrid approach to segmentation of speech using group delay processing and HMM based embedded reestimation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2014 | Conference paper
EID: 2-s2.0-84910072398

Part of ISSN: 19909772 2308457X
Contributors: Shanmugam, S.A.; Murthy, H.
Source: Self-asserted source
Aswin Shanmugam Subramanian via Scopus - Elsevier

Peer review (6 reviews for 1 publication/grant)

Review activity for Speech Communication (6)