Learning from Distant, High-level Human Supervision


State-of-the-art neural models have achieved impressive results on a range of NLP tasks but remain data-hungry to build. Training (or fine-tuning) these models for a specific task or domain may require hundreds of thousands of labeled samples, placing a heavy labor and time burden on manual data annotation. Going beyond the standard instance-label training design, we are developing next-generation training paradigms for building neural NLP systems. The key ideas are to translate high-level human supervision into machine-executable, modularized programs for model training, and to draw on pre-existing knowledge resources for automatic data annotation. We focus on building new datasets and algorithms for digesting high-level human supervision and exploiting distant supervision, in order to accelerate model construction and improve the label efficiency of current NLP systems.
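One common way to realize this idea is weak supervision via labeling functions: a human writes small heuristic programs (or points to an existing resource such as a lexicon), and their votes are aggregated into training labels. The sketch below is a minimal, hypothetical illustration of that pattern; the function names, the toy lexicon, and the majority-vote aggregation are our own illustrative choices, not a specific system described here.

```python
# Hypothetical sketch: high-level human supervision expressed as small,
# machine-executable labeling functions whose votes are aggregated into
# weak training labels. Names and the lexicon are illustrative.
from collections import Counter

ABSTAIN, POSITIVE, NEGATIVE = -1, 1, 0

# A pre-existing knowledge resource (here a tiny sentiment lexicon)
# reused as distant supervision.
LEXICON = {"awful": NEGATIVE, "excellent": POSITIVE}

def lf_contains_great(text):
    """Human-written heuristic: the word 'great' signals a positive label."""
    return POSITIVE if "great" in text.lower() else ABSTAIN

def lf_contains_terrible(text):
    """Human-written heuristic: the word 'terrible' signals a negative label."""
    return NEGATIVE if "terrible" in text.lower() else ABSTAIN

def lf_lexicon(text):
    """Distant supervision: label by lookup in the external lexicon."""
    for word, label in LEXICON.items():
        if word in text.lower():
            return label
    return ABSTAIN

LABELING_FUNCTIONS = [lf_contains_great, lf_contains_terrible, lf_lexicon]

def weak_label(text):
    """Aggregate labeling-function votes by majority; abstain on ties or no votes."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    (top, n), *rest = counts.most_common()
    if rest and rest[0][1] == n:
        return ABSTAIN  # conflicting labels tied
    return top

print(weak_label("An excellent, great movie"))  # two positive votes -> 1
print(weak_label("A terrible plot"))            # one negative vote -> 0
```

The weak labels produced this way can then be used to train a standard neural model, substituting cheap programmatic annotation for expensive manual labeling.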

Learning with Structured Inductive Biases


Deep neural networks have demonstrated a strong capability to fit large datasets and master a task, but at the same time show poor generalization in terms of task and domain transferability. One main reason is that the mechanisms shared across tasks (i.e., inductive biases), such as model components and constraints, are not explicitly specified in the model architectures. We are exploring various ways of designing structural inductive biases that are task-general and human-readable, and developing novel model architectures and learning algorithms to impose such inductive biases. This will yield NLP systems that run effectively in low-data regimes while demonstrating good task and domain transferability.
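One concrete way to impose a human-readable inductive bias is to encode a logical constraint as a differentiable penalty added to the training loss. The sketch below is our own illustrative example (not a specific published method): it relaxes the rule "PERSON(x) implies ENTITY(x)" into a soft penalty, so the bias is stated explicitly rather than left for the model to rediscover from data.

```python
# Illustrative sketch: a task-general constraint written as a differentiable
# penalty term. The rule, names, and weighting are hypothetical examples.

def implication_penalty(p_person, p_entity):
    """Soft relaxation of the rule PERSON(x) -> ENTITY(x).

    The rule is violated to the degree that the PERSON probability
    exceeds the ENTITY probability; zero when the rule is satisfied.
    """
    return max(0.0, p_person - p_entity)

def total_loss(task_loss, p_person, p_entity, lam=0.5):
    """Standard task loss plus the weighted constraint penalty.

    lam is a hyperparameter trading off data fit against the
    explicitly specified inductive bias.
    """
    return task_loss + lam * implication_penalty(p_person, p_entity)

# Constraint violated: PERSON is likely but ENTITY is not.
print(implication_penalty(0.9, 0.4))  # 0.5
# Constraint satisfied: no penalty is added.
print(implication_penalty(0.2, 0.8))  # 0.0
```

Because the same penalty can be attached to any model producing these probabilities, the constraint transfers across architectures and tasks, which is the point of making the bias explicit.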



Knowledge Reasoning over Heterogeneous Data


Rule-based symbolic reasoning systems have the advantage of precise grounding and induction, but fall short at fuzzy matching and handling uncertainty. In contrast, embedding-based reasoning methods are built on the data-driven machine learning paradigm and can fit effective models given large amounts of data, while lacking the symbolic systems' strength of precise generalization. We are working on neural-symbolic reasoning methods that combine fuzzy reasoning with good generalization, and extending the reasoning target from static, graph-structured data to heterogeneous sources such as time-variant graph structures and unstructured text.
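To make the embedding-based side concrete, the sketch below scores knowledge-graph triples with a TransE-style translation score (Bordes et al.: head + relation should lie near tail in embedding space). The two-dimensional embeddings are toy, hand-set vectors for illustration only; in practice they would be learned from data.

```python
# Sketch of embedding-based reasoning with a TransE-style score:
# a triple (head, relation, tail) is plausible when head + relation
# is close to tail. Embeddings here are toy hand-set vectors.

def transe_score(head, relation, tail):
    """Negative L2 distance between head+relation and tail; higher = more plausible."""
    dist = sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)) ** 0.5
    return -dist

# Toy 2-d embeddings (illustrative values, not learned).
emb = {
    "paris":      [0.0, 0.0],
    "france":     [1.0, 0.0],
    "berlin":     [0.0, 1.0],
    "capital_of": [1.0, 0.0],
}

# Fuzzy matching in action: rank candidate tails for (paris, capital_of, ?).
candidates = ["france", "berlin"]
best = max(candidates,
           key=lambda t: transe_score(emb["paris"], emb["capital_of"], emb[t]))
print(best)  # "france" scores higher under these toy vectors
```

Unlike a symbolic rule, this score degrades gracefully: nearby but inexact matches still receive partial credit, which is the fuzzy-matching strength that a neural-symbolic method would want to keep while recovering symbolic precision.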