Learning from Distant, High-level Human Supervision

State-of-the-art neural models have achieved impressive results on a range of NLP tasks but remain data hungry. Training (or fine-tuning) these models for a specific task or domain may require hundreds of thousands of labeled samples, which places a heavy labor and time burden on manual data annotation. Going beyond the standard instance-label training design, we are developing next-generation training paradigms for building neural NLP systems. The key ideas are to translate high-level human supervision into machine-executable, modularized programs for model training, and to draw on pre-existing knowledge resources for automatic data annotation. We focus on building new datasets and algorithms that digest high-level human supervision and make use of distant supervision, in order to accelerate model construction and improve the label efficiency of current NLP systems.
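
As a rough illustration of these two ideas, the sketch below writes one human rule as a small program and uses a toy knowledge base for distant supervision, then merges their votes into a weak label. It is a minimal sketch under stated assumptions, not our actual system: the knowledge base contents, the relation name born_in, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical knowledge base mapping (head, tail) entity pairs to a relation.
KB = {("Marie Curie", "Warsaw"): "born_in"}

ABSTAIN = None  # a labeling function returns this when it has no opinion


def lf_born_keyword(sentence, head, tail):
    """High-level human rule: the phrase 'was born in' signals born_in."""
    return "born_in" if "was born in" in sentence else ABSTAIN


def lf_distant_kb(sentence, head, tail):
    """Distant supervision: if both entities occur, copy the KB relation."""
    if head in sentence and tail in sentence:
        return KB.get((head, tail), ABSTAIN)
    return ABSTAIN


LABELING_FUNCTIONS = [lf_born_keyword, lf_distant_kb]


def weak_label(sentence, head, tail):
    """Majority vote over the labeling functions that did not abstain."""
    votes = [lf(sentence, head, tail) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]
```

Note that the sentence "Marie Curie moved from Warsaw to Paris." would also receive the label born_in from the knowledge-base lookup. Distant supervision produces this kind of noisy label, which is one reason the resulting annotations need further denoising before training.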

Common Sense Reasoning for Artificial General Intelligence

Humans rely on commonsense knowledge to make decisions in everyday situations, whereas even state-of-the-art AI models can make wrong decisions because they lack commonsense reasoning (CSR) ability. To teach machines to reason with common sense as humans do, we have been developing new reasoning methods and benchmark datasets for CSR. In the multiple-choice reasoning setting, we have focused on knowledge-aware methods that exploit commonsense knowledge graphs with graph neural networks. We have also been studying commonsense reasoning in generative and open-ended settings, which are closer to realistic applications (e.g., dialogue systems and search engines). Beyond the language modality, we are also studying CSR in multi-modal environments (e.g., language + vision). We hope our research in commonsense reasoning can become a fundamental building block for future Artificial General Intelligence (AGI) systems.

Learning with Structured Inductive Biases

Deep neural networks have demonstrated a strong capability to fit large datasets in order to master a task, but they also generalize poorly when transferred across tasks and domains. One main reason is that the common mechanisms shared between tasks (i.e., inductive biases), such as model components and constraints, are not explicitly specified in the model architectures. We are exploring various ways of designing structured inductive biases that are task-general and human-readable, and developing novel model architectures and learning algorithms to impose such inductive biases. The goal is NLP systems that run effectively in low-data regimes while demonstrating good task and domain transferability.

Knowledge Reasoning over Heterogeneous Data

Rule-based symbolic reasoning systems offer precise grounding and induction but fall short on fuzzy matching and handling uncertainty. In contrast, embedding-based reasoning methods follow a data-driven machine learning paradigm and can fit an effective model given a large amount of data, but they often generalize poorly. We are working on neural-symbolic reasoning methods that combine fuzzy reasoning with good generalization, and extending the reasoning target from static, graph-structured data to heterogeneous sources such as time-variant graph structures and unstructured text.
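
One simple way to picture the combination is a symbolic rule whose predicate is matched softly through embeddings rather than by exact string equality. The sketch below is only an illustration under stated assumptions: the two-dimensional embedding values are made up, and the predicate names, the threshold, and the soft_match function are hypothetical rather than part of any method described here.

```python
import math

# Toy predicate embeddings with made-up values, for illustration only.
EMB = {
    "born_in": [0.9, 0.1],
    "birthplace": [0.85, 0.2],
    "works_at": [0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def soft_match(rule_pred, fact_pred, threshold=0.9):
    """Score how well a fact's predicate matches a rule's predicate.

    An exact symbolic match scores 1.0. Otherwise the embedding similarity
    is used if it reaches the threshold, and the match is rejected (0.0)
    if it does not.
    """
    if rule_pred == fact_pred:
        return 1.0
    sim = cosine(EMB[rule_pred], EMB[fact_pred])
    return sim if sim >= threshold else 0.0
```

Here a rule written over born_in would still fire, with slightly reduced confidence, on a fact stated as birthplace, while a fact stated as works_at would be rejected. The rule keeps the reasoning step precise and inspectable, and the embeddings supply the fuzzy matching.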