Hi, I'm Rimita Lahiri.

I am a PhD researcher. My research focuses on understanding human behavior from multimodal behavioral signals in real-world settings.


I am a doctoral candidate in the Ming Hsieh Department of Electrical and Computer Engineering at the University of Southern California, Viterbi School of Engineering. I am a member of the Signal Analysis and Interpretation Laboratory and am currently advised by Prof. Shrikanth Narayanan. Prior to USC, I worked as a Researcher at TCS Innovation Labs, Kolkata, from 2016 to 2018. I received my Master's degree in 2016 from Jadavpur University, where I worked under the supervision of Prof. Amit Konar. I received my Bachelor's degree in Electronics and Communication Engineering from Heritage Institute of Technology, Kolkata, in 2014.

My research interests broadly span multimodal machine learning, behavioral signal processing, and applications of speech and language processing in the clinical domain. I am specifically interested in developing machine learning and deep learning based solutions that help us understand and analyze human behavioral traits in real-world settings.


USC logo
SAIL logo
Research Assistant
  • Leading and coordinating the research and collaboration efforts of the CARE project with 7 PhD students
  • Developing novel and efficient machine learning algorithms for speech and language processing tasks in child speech understanding, specifically in the domain of autism research
  • Developing contextual measures for understanding children's behavior using multimodal signals
  • Served as a mentor in the exploreCSR outreach workshop
  • Significantly contributing to grant proposals
August 2018 - Present | Los Angeles, USA
Applied Scientist Research Intern
  • Developed an ASR solution for the overlapped multi-talker use case and analyzed the impact of leveraging speaker embeddings in that setting.
May 2023 - August 2023 | Santa Clara, USA
Applied NLU Research Intern
  • Generated synthetic training data using state-of-the-art paraphrasing networks and analyzed its impact on the performance of intent/slot filling models.
May 2022 - August 2022 | Burlingame, USA
Summer Research Intern
  • Developed a meta-learning based ASR framework for the multilingual use case with enhanced performance.
May 2021 - August 2021 | Redmond, USA


  • August 2023: Presented our work on building robust speaker embeddings using self-supervised approaches at Interspeech 2023.
  • June 2023: Presented our work on modeling interactional entrainment using context-aware approaches at ICASSP 2023.
  • April 2023: Awarded the USC HCN pre-doctoral fellowship for the second year.
Projects

    Adversarial learning for speaker classification

    Exploration of adversarial learning strategies for speaker classification in child-inclusive interactions.

    • Computational approach for speaker classification in semi-naturalistic interactions between a child and an adult.
    • Supervised binary classification setup for speaker label prediction.
    • The framework is trained to address two main sources of variability: the age of the child and the data collection location.
    • Adversarial learning using two methods: inverted label loss and a gradient reversal layer.
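
    The two adversarial strategies named above can be illustrated with a small, framework-free sketch. This is not the project's implementation: the function names, the scaling factor `lam`, and the toy numbers are assumptions made for illustration only. The inverted label loss scores a domain prediction against the flipped label, while a gradient reversal layer passes features through unchanged and negates the gradient on the way back.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for a single predicted probability."""
    eps = 1e-12  # guard against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def inverted_label_loss(prob, domain_label):
    """Score the domain prediction against the flipped label.

    Minimizing this with respect to the encoder pushes it toward
    features that the domain classifier gets wrong.
    """
    return bce(prob, 1 - domain_label)


class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam going back."""

    def __init__(self, lam=1.0):
        self.lam = lam  # hypothetical reversal strength

    def forward(self, x):
        return list(x)

    def backward(self, grad_output):
        return [-self.lam * g for g in grad_output]


# Toy values (hypothetical): the domain classifier assigns 0.9 to domain 1.
standard = bce(0.9, 1)
inverted = inverted_label_loss(0.9, 1)

grl = GradientReversal(lam=0.5)
features = grl.forward([0.2, -1.0])
reversed_grad = grl.backward([1.0, -2.0])
```

    In this toy case the confident domain prediction has a low standard loss and a high inverted loss, which is the signal that would drive an encoder away from domain-specific features such as age or collection location.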
    Self-supervised approach for speaker classification

    Exploration of self-supervised learning strategies for speaker classification in child-inclusive interactions.

    • Investigation of the impact of additional pre-training with child speech.
    • Leveraging unlabelled target domain speech to improve speaker classification.
    • Considering two self-supervised algorithms: wav2vec 2.0 and WavLM.
    Modeling interpersonal synchrony

    Exploration of knowledge-driven statistical approaches for modeling interpersonal synchrony in clinical interactions involving children.

    • Introduction of 3 different objective measures of interpersonal synchrony across vocal and lexical modalities.
    • For vocal prosodic and spectral features, Dynamic Time Warping Distance (DTWD) and Squared Cosine Distance of (feature-wise) Complexity (SCDC) are used; for lexical features, Word Mover's Distance (WMD) is used.
    • The complementarity of these synchrony measures was studied by an ad-hoc classification experiment to distinguish between children with and without autism.
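
    As an illustration of the first measure listed above, here is a minimal sketch of a dynamic time warping distance between two one-dimensional feature tracks. It assumes an absolute-difference local cost and no windowing or normalization; the project's exact formulation may differ, and the variable names and toy contours are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy contours (hypothetical): the second track lingers on its first value.
child_track = [1.0, 2.0, 3.0, 2.0]
adult_track = [1.0, 1.0, 2.0, 3.0, 2.0]
distance = dtw_distance(child_track, adult_track)
```

    Because warping absorbs differences in timing, two tracks with the same shape but different pacing end up close together, which is why a measure of this kind suits turn-level comparisons between speakers.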
    Context-aware approach for modeling entrainment

    A context-aware computational approach for quantification of entrainment in dyadic interactions.

    • We use conformers to capture both local and global context.
    • We use cross-attention to jointly model both interlocutors in a dyadic interaction.
    • Experimental results show evidence of statistically significant association between the introduced measure and clinically meaningful behavioral codes.
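
    The cross-attention idea mentioned above can be sketched in plain Python as scaled dot-product attention in which one speaker's frames act as queries and the other speaker's frames act as keys and values. This is a simplified illustration, not the published model: it omits the conformer encoder and learned projections, and the names and toy vectors are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Attend from one speaker's frames (queries) to the other's (keys/values)."""
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the other speaker's value vectors
        outputs.append(
            [sum(wj * v[d] for wj, v in zip(w, values)) for d in range(len(values[0]))]
        )
    return outputs, weights


# Toy frame embeddings (hypothetical) for the two interlocutors.
speaker_a = [[1.0, 0.0], [0.0, 1.0]]
speaker_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = cross_attention(speaker_a, speaker_b, speaker_b)
```

    Each row of `attn` shows how strongly a frame from one speaker attends to each frame of the other, so both sides of the dyad enter the representation together.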


    Scripting Languages

    Bash scripting




    University of Southern California

    Los Angeles, USA

    2018 - Present

    Degree: PhD
    Department: Ming Hsieh Department of Electrical & Computer Engineering
    CGPA: 3.70/4.0

      Relevant Coursework:

      • Machine Learning
      • Applied Natural Language Processing
      • Foundations of Algorithms
      • Affective Computing

    Jadavpur University

    Kolkata, India


    Degree: Master of Engineering
    Department: Electronics and Telecommunication Engineering
    CGPA: 9.94/10.0

      Relevant Coursework:

      • Artificial Intelligence and Soft Computing
      • Digital Signal Processing
      • Robotics and Computer Vision
      • Computational Biology and Bioinformatics

    Heritage Institute of Technology, Kolkata

    Kolkata, India


    Degree: Bachelor of Technology
    Department: Electronics and Communication Engineering
    CGPA: 8.92/10.0