Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
Blog Post number 4
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2 
Publications
Scientific document processing: challenges for modern learning methods
Published in International Journal on Digital Libraries, 2023
A survey of modern neural network learning methods for scholarly document processing, addressing the discourse structure, interconnectivity, and multimodal nature of scientific publications
Recommended citation: Kashyap, Abhinav Ramesh, Yajing Yang, and Min-Yen Kan. (2023). "Scientific document processing: challenges for modern learning methods." International Journal on Digital Libraries. 24, 283–309.
DataTales: A Benchmark for Real-World Intelligent Data Narration
Published in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
A benchmark for real-world intelligent data narration using financial reports and market data
Recommended citation: Yang, Yajing, Qian Liu, and Min-Yen Kan. (2024). "DataTales: A Benchmark for Real-World Intelligent Data Narration." Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA.
KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration for Financial Data Narration
Published in Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
A knowledge-augmented hierarchical framework for financial data narration that leverages LLMs as domain experts
Recommended citation: Yang, Yajing, Tony Deng, and Min-Yen Kan. (2025). "KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration for Financial Data Narration." Findings of the Association for Computational Linguistics: EMNLP 2025. Suzhou, China.
Talks
From Data to Insights: LLMs for Financial Narration
Date/Time: Wednesday, 19 November 2025, 5:00-6:00pm
Teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.
