Cross-Lingual and Cross-Modal Limitations of Large Language Models

Institution

http://id.loc.gov/authorities/names/n79058482

Degree Level

Master's

Degree

Master of Science

Department

Department of Computing Science

Abstract

Large Language Models (LLMs), including Vision Large Language Models (VLLMs), herald a new research epoch in machine learning and computational linguistics. Although most LLMs are trained predominantly on English, many studies have confirmed their proficiency in various languages. Nonetheless, critical questions remain about how consistently they perform across languages. A similar concern arises for VLLMs regarding performance disparities across modalities. Moreover, while the remarkable competence of LLMs in solving downstream tasks is widely acknowledged, they still fall short of satisfactory performance on several tasks, which calls for further experimentation. In this thesis, we investigate cross-language generalization in LLMs by employing a novel prompt back-translation method. We investigate how the text and image modalities interact and compare by introducing a new concept called cross-modal consistency, and we propose a quantitative evaluation framework based on this concept. Additionally, we evaluate the performance of an LLM on two specific linguistic tasks: Lexicalization Generation and Lexical Gap Detection. We have also developed a novel algorithmic approach for comparative analysis. The findings show that LLMs struggle to provide accurate results for translation-variant tasks, that there is a significant inconsistency between the vision and language modalities within GPT, and that ChatGPT underperforms on the two evaluated downstream tasks, where it is significantly outperformed by our rule-based method.

Item Type

http://purl.org/coar/resource_type/c_46ec

License

Other License Text / Link

This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

en
