My research concerns:
Modern machine learning and deep learning models rely on massive data and parameters to tackle challenging tasks,
which leads to expensive memory usage and training costs.
(1) How to reduce the computational complexity in the face of massive data interactions?
(2) How to resolve the curse of dimensionality when performing inference on high-dimensional data?
(3) How to handle complex geometry while maintaining numerical stability and efficiency?
My recent work aims to develop novel data-driven techniques for constructing hierarchical representations that handle pairwise interactions efficiently, addressing the challenges posed by general interactions, complex geometry, and high dimensionality. The ultimate goal is to reduce the high time and space complexity associated with support vector machines (SVMs), kernel density estimation, Gaussian processes, transformer models, and related methods.
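As a minimal illustration of why hierarchical representations tame pairwise interactions, the sketch below (a toy example of my own, not the techniques above) compresses the interaction block between two well-separated point clusters with a truncated SVD; hierarchical methods tile the full interaction matrix with such low-rank blocks to cut storage and matrix-vector cost.

# Toy sketch: low-rank compression of a well-separated pairwise interaction block,
# the basic building block of hierarchical (H-matrix style) representations.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 3))   # source cluster
Y = rng.uniform(4.0, 5.0, size=(400, 3))   # well-separated target cluster

# Dense pairwise interaction block K[i, j] = 1 / |x_i - y_j|: O(m*n) storage.
D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
K = 1.0 / D

# Because the clusters are well separated, K is numerically low rank:
# a truncated SVD K ~ (U * s) @ Vt stores only O((m + n) * r) numbers.
U, s, Vt = np.linalg.svd(K, full_matrices=False)
r = int(np.sum(s > 1e-8 * s[0]))
K_lr = (U[:, :r] * s[:r]) @ Vt[:r]

rel_err = np.linalg.norm(K - K_lr) / np.linalg.norm(K)
print(f"numerical rank {r} / {min(K.shape)}, relative error {rel_err:.1e}")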
Generating data with a particular structure plays a key role in various applications such as variational inference, computer graphics, numerical integration, and machine learning.
(1) How to improve a poor structure or distribution produced by an existing, possibly arbitrary, approach?
(2) Number-theoretic approaches are limited to cubes; can we design a distribution transformer for general geometries?
(3) How to measure uniformity of points on a general geometry such as a manifold?
My recent work aims to develop a general and robust way to transform poor distributions. By integrating physics principles into the design of deep neural networks, the new approach achieves superior results and works on complex manifolds. The work also benefits efficient hierarchical compression of possibly irregular data. Future work includes incorporating generative models such as normalizing flows for better efficiency and adapting the approach to the efficient solution of PDEs via adaptive mesh generation.
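As a toy illustration of the physics intuition only (not the learned neural-network approach above), the sketch below relaxes a badly clustered point set under pairwise repulsion; the relaxation spreads the points over the unit square, improving the very uneven initial coverage.

# Toy sketch: improving a poor point distribution by pairwise repulsion (an
# illustrative physics analogy, not the method described above).
import numpy as np

rng = np.random.default_rng(1)
pts = rng.normal(0.5, 0.05, size=(200, 2))         # badly clustered initial points

eps = 1e-12
for _ in range(500):
    diff = pts[:, None, :] - pts[None, :, :]       # pairwise displacements
    dist2 = (diff ** 2).sum(-1) + eps
    np.fill_diagonal(dist2, np.inf)                # ignore self-interaction
    force = (diff / dist2[..., None]).sum(axis=1)  # inverse-distance repulsion
    fmax = np.linalg.norm(force, axis=1).max() + eps
    pts = np.clip(pts + 0.01 * force / fmax, 0.0, 1.0)   # stay inside the unit square

# Crude uniformity check: occupancy of a 5x5 grid is far more balanced than for
# the initial clump, where essentially all 200 points sat in a single cell.
hist, _, _ = np.histogram2d(pts[:, 0], pts[:, 1], bins=5, range=[[0, 1], [0, 1]])
print(hist)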
Normalizing flows are a powerful class of neural network architectures
used in a wide range of applications such as variational inference, image generation, molecular dynamics, and cosmological data analysis.
A flow learns a bijective mapping from a latent base distribution to the target distribution.
(1) How to design a normalizing flow with an easy-to-compute inverse and log-likelihood?
(2) Can we guarantee the expressive power of the flow, i.e., the universal approximation property?
(3) Can parameters be unrestricted for fast training without hindering invertibility?
My recent work investigates a versatile monotonic normalizing flow architecture with unrestricted parameters. The new architecture allows efficient inverse and log-likelihood computation and is rigorously shown to be a universal approximator for monotonic normalizing flows. Experiments demonstrate overall better efficiency than state-of-the-art flow models. One interesting direction is to apply the model to adaptive mesh generation and distribution learning.
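For context, the sketch below shows the change-of-variables principle every normalizing flow relies on, using a hand-fixed elementwise affine map instead of the monotonic architecture above; the affine map has a closed-form inverse and log-determinant, which is exactly the property the architecture is designed to retain in a far more expressive setting.

# Toy sketch of a normalizing flow's inverse and log-likelihood computation,
# using a fixed elementwise affine map f(z) = exp(a) * z + b (strictly monotone).
import numpy as np

a, b = np.array([0.3, -0.5]), np.array([1.0, 2.0])

def forward(z):                # latent -> data
    return np.exp(a) * z + b

def inverse(x):                # data -> latent, available in closed form
    return (x - b) * np.exp(-a)

def log_likelihood(x):
    # log p_X(x) = log p_Z(f^{-1}(x)) + log |det J_{f^{-1}}(x)|
    z = inverse(x)
    log_pz = -0.5 * (z ** 2 + np.log(2 * np.pi)).sum(-1)   # standard normal base
    log_det = -a.sum()                                      # Jacobian of the inverse
    return log_pz + log_det

x = forward(np.random.default_rng(2).standard_normal((5, 2)))
print(log_likelihood(x))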
Partial Differential Equations (PDEs) with nonsmooth solutions are notoriously hard to solve numerically.
For challenging problems, naive discretizations yield hardly any accuracy.
(1) How to estimate the error without knowing the true solution?
(2) How to achieve the desired accuracy with minimal computational resources?
(3) Is it possible to generate a nearly optimal mesh without iterative refinement?
My past research concerns developing a posteriori error estimates that are robust to singularities of the solution and easy to compute. Inspired by my recent work on distribution learning and generative models, I have become interested in using deep learning to "learn" the ideal mesh distribution without having to iteratively estimate the error and refine the mesh. The deep learning approach enables a general treatment of a wide class of PDEs and liberates users from sophisticated mathematical analysis that is often limited to particular classes of PDEs.
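For contrast with the learned approach, the toy sketch below runs the classical estimate-mark-refine loop in 1D for a function with a square-root singularity, using a cheap gradient-jump indicator (an illustrative choice, not the estimators developed in this work); the resulting mesh concentrates near the singularity, which is precisely the distribution the deep learning approach aims to predict in one shot.

# Toy sketch of iterative estimate-mark-refine in 1D for u(x) = sqrt(x),
# which is singular at x = 0: bisect the cells where the slope of the
# piecewise-linear approximation jumps most across a shared node.
import numpy as np

u = np.sqrt
nodes = np.linspace(0.0, 1.0, 6)                  # initial coarse mesh

for _ in range(8):
    vals = u(nodes)
    slopes = np.diff(vals) / np.diff(nodes)       # per-cell slope of the interpolant
    h = np.diff(nodes)
    # Indicator at each interior node: slope jump scaled by the local mesh size.
    jumps = np.abs(np.diff(slopes)) * np.maximum(h[:-1], h[1:])
    k = int(np.argmax(jumps)) + 1                 # interior node with the largest jump
    # Bisect the two cells adjacent to that node.
    new_pts = [0.5 * (nodes[k - 1] + nodes[k]), 0.5 * (nodes[k] + nodes[k + 1])]
    nodes = np.sort(np.concatenate([nodes, new_pts]))

print("refined mesh concentrates near the singularity at x = 0:")
print(np.round(nodes, 4))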