publications
* means Equal Contribution
- Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications (2024). Qianqian Xie, Dong Li, Mengxi Xiao, Zihao Jiang, Ruoyu Xiang, Xiao Zhang, Zhengyu Chen, Yueru He, Weiguang Han, Yuzhe Yang, Shunian Chen, Yifei Zhang, and 27 more authors. arXiv preprint. First open-source financial multimodal LLM: FinLLaVA-8B
Large language models (LLMs) have advanced financial applications, yet they often lack sufficient financial knowledge and struggle with tasks involving multi-modal inputs like tables and time series data. To address these limitations, we introduce Open-FinLLMs, a series of Financial LLMs. We begin with FinLLaMA, pre-trained on a 52 billion token financial corpus, incorporating text, tables, and time-series data to embed comprehensive financial knowledge. FinLLaMA is then instruction fine-tuned with 573K financial instructions, resulting in FinLLaMA-instruct, which enhances task performance. Finally, we present FinLLaVA, a multimodal LLM trained with 1.43M image-text instructions to handle complex financial data types. Extensive evaluations demonstrate FinLLaMA’s superior performance over LLaMA3-8B, LLaMA3.1-8B, and BloombergGPT in both zero-shot and few-shot settings across 19 and 4 datasets, respectively. FinLLaMA-instruct outperforms GPT-4 and other Financial LLMs on 15 datasets. FinLLaVA excels in understanding tables and charts across 4 multimodal tasks. Additionally, FinLLaMA achieves impressive Sharpe Ratios in trading simulations, highlighting its robust financial application capabilities. We will continually maintain and improve our models and benchmarks to support ongoing innovation in academia and industry.
@article{xie2024openfinllmsopenmultimodallarge,
  title = {Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications},
  author = {Xie, Qianqian and Li, Dong and Xiao, Mengxi and Jiang, Zihao and Xiang, Ruoyu and Zhang, Xiao and Chen, Zhengyu and He, Yueru and Han, Weiguang and Yang, Yuzhe and Chen, Shunian and Zhang, Yifei and Shen, Lihang and Kim, Daniel and Liu, Zhiwei and Luo, Zheheng and Yu, Yangyang and Cao, Yupeng and Deng, Zhiyang and Yao, Zhiyuan and Li, Haohang and Feng, Duanyu and Dai, Yongfu and Somasundaram, VijayaSai and Lu, Peng and Zhao, Yilun and Long, Yitao and Xiong, Guojun and Smith, Kaleb and Yu, Honghai and Lai, Yanzhao and Peng, Min and Nie, Jianyun and Suchow, Jordan W. and Liu, Xiao-Yang and Wang, Benyou and Lopez-Lira, Alejandro and Huang, Jimin and Ananiadou, Sophia},
  journal = {arXiv preprint},
  year = {2024},
  note = {First open-source financial multimodal LLM: FinLLaVA-8B},
}
- Do investors’ actions speak louder than words? (2024). Honghai Yu, Zhuo Chen, Yunmiao Zhang, Haining Wang, and Yifei Zhang. An extended version of an undergraduate thesis, accepted by the 21st Annual Conference on Financial Engineering and Risk Management
A large body of literature has examined whether posts on social media propagate noise or information. In this paper, we propose that both coexist on Chinese social media platforms but can be distinguished by posters’ trading behavior. Individuals may post articles on social media that do not reflect their true opinions, often for impression management purposes, resulting in inconsistency between their words and subsequent actions. Additionally, observing a poster’s trading behavior prior to posting can help assess the reliability of their expressed views.
- UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models. Yuzhe Yang*, Yifei Zhang*, Yan Hu*, Yilin Guo, Ruoli Gan, Yueru He, Mingcong Lei, Xiao Zhang, Haining Wang, Qianqian Xie, Jimin Huang, Honghai Yu, and 1 more author. NAACL Findings 2025. A User-Centric framework designed to evaluate LLMs’ ability to handle complex financial tasks
This paper introduces the UCFE: User-Centric Financial Expertise benchmark, an innovative framework designed to evaluate the ability of large language models (LLMs) to handle complex real-world financial tasks. The UCFE benchmark adopts a hybrid approach that combines human expert evaluations with dynamic, task-specific interactions to simulate the complexities of evolving financial scenarios. First, we conducted a user study involving 804 participants, collecting their feedback on financial tasks. Second, based on this feedback, we created a dataset that encompasses a wide range of user intents and interactions. This dataset serves as the foundation for benchmarking 12 LLM services using the LLM-as-Judge methodology. Our results show a significant alignment between benchmark scores and human preferences, with a Pearson correlation coefficient of 0.78, confirming the effectiveness of the UCFE dataset and our evaluation approach. The UCFE benchmark not only reveals the potential of LLMs in the financial sector but also provides a robust framework for assessing their performance and user satisfaction. The benchmark dataset and evaluation code are available.
@article{yang2024ucfeusercentricfinancialexpertise,
  title = {UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models},
  author = {Yang, Yuzhe and Zhang, Yifei and Hu, Yan and Guo, Yilin and Gan, Ruoli and He, Yueru and Lei, Mingcong and Zhang, Xiao and Wang, Haining and Xie, Qianqian and Huang, Jimin and Yu, Honghai and Wang, Benyou},
  journal = {NAACL Findings 2025},
  primaryclass = {q-fin.CP},
  note = {A User-Centric framework designed to evaluate LLMs' ability to handle complex financial tasks},
}
- TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets. Yuzhe Yang*, Yifei Zhang*, Minghao Wu*, Kaidi Zhang, Yunmiao Zhang, Honghai Yu, Yan Hu, and Benyou Wang. Best Paper Award in Financial AI @ ICLR 2025 & selected for oral presentation at ICLR 2025. A multi-agent framework that leverages LLMs to simulate socio-economic systems
The study of social emergence has long been a central focus in social science. Traditional modeling approaches, such as rule-based Agent-Based Models (ABMs), struggle to capture the diversity and complexity of human behavior, particularly the irrational factors emphasized in behavioral economics. Recently, large language model (LLM) agents have gained traction as simulation tools for modeling human behavior in social science and role-playing applications. Studies suggest that LLMs can account for cognitive biases, emotional fluctuations, and other non-rational influences, enabling more realistic simulations of socio-economic dynamics. In this work, we introduce TwinMarket, a novel multi-agent framework that leverages LLMs to simulate socio-economic systems. Specifically, we examine how individual behaviors, through interactions and feedback mechanisms, give rise to collective dynamics and emergent phenomena. Through experiments in a simulated stock market environment, we demonstrate how individual actions can trigger group behaviors, leading to emergent outcomes such as financial bubbles and recessions. Our approach provides valuable insights into the complex interplay between individual decision-making and collective socio-economic patterns.
@article{yang2025twinmarketscalablebehavioralsocialsimulation,
  title = {TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets},
  author = {Yang, Yuzhe and Zhang, Yifei and Wu, Minghao and Zhang, Kaidi and Zhang, Yunmiao and Yu, Honghai and Hu, Yan and Wang, Benyou},
  journal = {{Best Paper Award in Financial AI @ ICLR 2025 \& selected for oral presentation at ICLR 2025}},
  note = {A multi-agent framework that leverages LLMs to simulate socio-economic systems},
}