The broad objective of my research is to broaden access to technology for processing human languages by enabling systems to learn from limited labeled data.
In the modern era of artificial intelligence (AI), developing natural language processing (NLP) systems requires large-scale annotated data. Unfortunately, most large-scale labeled datasets are available in only a handful of domains and languages; for the vast majority of domains and languages, few or no annotations exist to power automated NLP applications. Hence, one focus of low-resource NLP research is to learn language representations from resource-rich domains or languages and utilize them in low-resource applications. Representation learning has emerged as an indispensable ingredient of natural language understanding: it captures notions such as the meanings of words, how words combine to form concepts, and how concepts relate to a specific NLP task. However, many crucial research questions remain mostly unsolved, including how to bridge the gap between languages (or domains) to learn universal language representations, how well such representations transfer across languages or domains, and how to exploit learning signals from multiple related tasks or unlabeled resources to learn generalizable representations. My Ph.D. dissertation investigates new approaches to learning language representations that can be transferred across languages and domains. My research will benefit billions of users whose native language is resource-scarce and facilitate text processing in essential domains such as public health, scientific literature, and security and privacy, where annotations are expensive.
My career goal is to become a scientist in a leading artificial intelligence (AI) research lab. I want to contribute to the advancement of technology for processing human languages to benefit billions of users whose native language lacks sufficient resources to build statistical NLP solutions. While contributing to the advancement of AI research, I want to help and inspire young researchers to devote themselves to this discipline. To prepare myself as a research scientist and a mentor for young researchers, I have devoted significant effort to research, collaboration, mentoring, and professional service.
Research. My primary research interest is in natural language processing, with an emphasis on representation learning. My research seeks to make modern NLP applications available across a broad spectrum of languages and domains. In collaboration with my advisor, fellow professors, and graduate/undergraduate researchers, I have proposed a series of new representation learning techniques to address significant problems, such as embedding universal language syntax and transferring representations across languages and domains. The proposed models have been applied to a wide range of NLP applications, including text classification, question answering, and semantic parsing. My research also extends to other areas such as privacy-preserving personalized web search, modeling users' intent in web search, and keyphrase generation for contextual targeting. So far, I have authored and co-authored 13 published research papers on these topics during my Ph.D. study. In the future, I plan to further explore semi-supervised, unsupervised, and transfer learning techniques to facilitate NLP with limited labeled data.
Collaboration. I realize that collaboration is critical to conducting research. Therefore, besides working closely with my advisor and lab-mates, I have collaborated with researchers from other research labs and institutes, including the following. Hongning Wang (Associate Professor at UVa): we have co-authored five papers in the area of information retrieval; Nanyun Peng (Assistant Professor at UCLA): we have co-authored four papers on representation learning; Dat Duong (currently Scientist at NIH, UCLA Ph.D.): we have co-authored one paper on representation learning for Gene Ontology terms; Xueying Bai (currently Ph.D. student at Stony Brook University) and Chao Jiang (currently Ph.D. student at Georgia Tech): we have co-authored one paper on sentence representation learning; Zhisong Zhang (graduate student at CMU-LTI) and Xuezhe Ma (currently Research Assistant Professor at the University of Southern California): we have co-authored two papers on cross-lingual representation learning; Saikat Chakraborty (Ph.D. student at Columbia University): we have co-authored two papers on representation learning for programming languages; Jianfeng Chi (Ph.D. student at UVa): we have co-authored two papers on automating privacy policy analysis; and Xiao Bai and Soomin Lee (Researchers at Yahoo Research): we have co-authored one paper on keyphrase generation and are currently co-authoring another. Networking with researchers from different backgrounds has consistently brought interesting ideas that expanded my collaborations.
Teaching and mentoring. I enjoy advising young researchers, as it provides opportunities to manage and guide people, exchange thoughts and ideas, and build the interpersonal relationships that foster research. My experience as a lecturer before starting my Ph.D. helped me work as a teaching assistant in several undergraduate and graduate-level courses at the University of Virginia and UCLA. I have mentored five undergraduate/masters students through their capstone projects: Zhechao Huang (now Applied Scientist at Amazon), Puchin Chen (now SDE at Google), Sudharsan Krishnaswamy (now ML Engineer II at SoundHound Inc), Peter Kim (now ML Engineer at Buzzvil), and Scott Shi (now ML Engineer at Facebook). They worked on sentence representation learning, text summarization, question answering, question generation, and coreference resolution. Under my supervision, they gained the ability to solve problems in their research projects independently.
Presentations and professional services. One objective of conducting research is to disseminate new knowledge to a larger audience. I have presented my research five times at premier international conferences. A great opportunity for a researcher to give back to the community is through professional service. I have served as a program committee member for eleven top-tier international conferences, including ACL, EMNLP, NAACL, AAAI, IJCAI, NIPS, ICML, and SIGIR. In addition, I have reviewed papers for one research journal (ACM Transactions on Information Systems) and have participated in peer review for 40+ papers so far during my Ph.D. study. All these experiences have strengthened my enthusiasm for pursuing a research career. Upon graduation, I plan to join a research lab as a scientist, where I can gain more research experience and develop further skills. I will keep striving to provide my services to the community and to other people.
Cross-lingual transfer, which transfers models across languages, has tremendous practical value. It reduces the amount of annotated data required for a target language and is especially useful when the target language lacks resources. Transferring across languages is challenging, as it requires understanding and handling differences between languages at the levels of morphology, syntax, and semantics. One of the key challenges is the variation in word order among languages. For example, the Verb-Object pattern in English can hardly be found in Japanese. This challenge should be taken into account in model design. We posit that order-free models have better transferability than order-sensitive models because they are less prone to overfitting to language-specific word order features.
In the family of deep neural models, recurrent neural networks (RNNs) and Transformers are widely adopted to learn contextual word representations. RNNs are sequential by nature, which makes them reliant on word order; as a result, they risk encoding language-specific order information that does not generalize across languages. We characterize this as the order-sensitive property. Transformers, on the other hand, use self-attention to capture context in a way that is insensitive to order information. With carefully designed position representations, the self-attention mechanism can be more robust than RNNs to changes in word order. We refer to this as the order-free property.
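This contrast can be illustrated with a small numerical sketch in Python/NumPy (not code from our papers; the toy functions and weights below are hypothetical, introduced only for illustration): without position representations, scaled dot-product self-attention is permutation-equivariant, so reordering the input words merely reorders the outputs, whereas a simple RNN's final state changes when the words are shuffled.

```python
import numpy as np

def self_attention(X):
    # Scaled dot-product self-attention over rows of X, with no
    # position representations (the purely order-free case).
    scores = X @ X.T / np.sqrt(X.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ X

def rnn_last_state(X, W_h, W_x):
    # Minimal Elman-style RNN: consumes the rows of X in order and
    # returns the final hidden state (the order-sensitive case).
    h = np.zeros(W_h.shape[0])
    for x in X:
        h = np.tanh(W_h @ h + W_x @ x)
    return h

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))            # 5 "words", each an 8-dim vector
perm = np.array([4, 0, 3, 1, 2])       # a fixed reordering of the words

# Self-attention: permuting the input rows permutes the outputs identically.
assert np.allclose(self_attention(X[perm]), self_attention(X)[perm])

# RNN: permuting the input rows changes the final hidden state.
W_h = rng.normal(size=(8, 8))
W_x = rng.normal(size=(8, 8))
assert not np.allclose(rnn_last_state(X, W_h, W_x),
                       rnn_last_state(X[perm], W_h, W_x))
```

In practice a Transformer does add position representations, but because they are an explicit, separable input rather than being baked into the computation itself, the model retains the flexibility to down-weight order cues that do not transfer across languages.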
In our work published at NAACL 2019, we systematically investigate RNNs and Transformers for cross-lingual representation learning. We show that Transformers, as order-free models, generally outperform order-sensitive RNNs for cross-lingual transfer, especially when the source and target languages are distant (e.g., English and Japanese).