About me

I’m a first-year PhD student in the UKRI Centre for Doctoral Training in Artificial Intelligence and Music programme at C4DM, Queen Mary University of London, supervised by Dr. György Fazekas. My PhD research focuses on leveraging deep generative models and signal processing techniques for controllable and expressive voice synthesis. The projects I have built towards this goal are:

  • diffwave-sr: unsupervised speech super-resolution (bandwidth extension) using posterior sampling in diffusion models.
  • golf: a lightweight neural vocoder with glottal-flow models and differentiable LPC synthesis.

Before I started my PhD, I worked as a software engineer at HTC and Rayark Inc. I also do independent research in my spare time. My research interests include audio signal processing, music information retrieval, binaural audio, and machine learning.

I love making contributions on GitHub, where you can find my most starred projects.

I’m also the main contributor to making torchaudio.functional.lfilter differentiable. To see what I’m working on at the moment, check out my Kanban board on GitHub.
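As a rough illustration of why that matters, here is a minimal sketch (toy signal and arbitrary coefficient values, not taken from any specific project) of gradients flowing through torchaudio.functional.lfilter back to the filter coefficients, which is what lets an IIR filter be trained end to end:

```python
import torch
import torchaudio.functional as F

# Toy input: one channel, one second of noise at 16 kHz (made-up values).
waveform = torch.randn(1, 16000)

# Learnable first-order IIR coefficients (arbitrary starting values).
b_coeffs = torch.tensor([0.5, 0.5], requires_grad=True)   # numerator
a_coeffs = torch.tensor([1.0, -0.2], requires_grad=True)  # denominator, a[0] = 1

# lfilter is differentiable w.r.t. both the input and the coefficients.
filtered = F.lfilter(waveform, a_coeffs, b_coeffs, clamp=False)

# Any scalar loss propagates gradients back to the coefficients.
loss = filtered.pow(2).mean()
loss.backward()
print(b_coeffs.grad, a_coeffs.grad)
```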

My hobbies are music, manga, anime, and VTubers. My taste in music is broad, ranging from Heavy Metal, Djent, and Post-Hardcore to Happy Hardcore, Trance, Future Funk, Ani-Song, Lo-Fi, and more. I have also produced some music (see my portfolio page) under the name Y.C.Y.

If you want to contact me for any reason, you can reach me via email or Twitter. If you’re interested in my work and want to support it, consider buying me a coffee on GitHub.