Abstract
This workshop introduces the concept of fine-tuning large language models (LLMs) and explores their potential applications in the humanities. It begins by outlining the basic principles of fine-tuning and highlighting key technical considerations, such as dataset preparation, parameter selection, and evaluation strategies. To illustrate these ideas in practice, we present an ongoing project that fine-tunes a vision–language model for Manchu optical character recognition (OCR). This case study demonstrates how adapting LLMs to specialized historical sources can unlock new possibilities for text analysis, digitization, and multilingual research. By bridging technical workflows with humanistic inquiry, the workshop shows how fine-tuned models can empower scholars to engage with rare and complex sources in innovative ways.
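To give a sense of the workflow the workshop covers, the sketch below shows what fine-tuning a vision–language model for OCR can look like in code. It is a minimal illustration under stated assumptions only: the Hugging Face TrOCR checkpoint (microsoft/trocr-base-handwritten), the dataset structure, and the hyperparameters are placeholders for exposition, not the actual pipeline of the Manchu OCR project.

```python
# Minimal fine-tuning sketch for a vision-language OCR model.
# Illustrative only: checkpoint, dataset format, and hyperparameters are assumptions,
# not the workshop's actual Manchu OCR setup.
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")


class OCRDataset(Dataset):
    """Dataset preparation: pairs of line images and their transcriptions."""

    def __init__(self, samples):
        self.samples = samples  # list of (PIL.Image, str) pairs

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        image, text = self.samples[idx]
        pixel_values = processor(images=image, return_tensors="pt").pixel_values.squeeze(0)
        labels = processor.tokenizer(text, return_tensors="pt").input_ids.squeeze(0)
        return {"pixel_values": pixel_values, "labels": labels}


def collate(batch):
    """Stack images and pad transcriptions; -100 labels are ignored by the loss."""
    return {
        "pixel_values": torch.stack([x["pixel_values"] for x in batch]),
        "labels": torch.nn.utils.rnn.pad_sequence(
            [x["labels"] for x in batch], batch_first=True, padding_value=-100
        ),
    }


def train(samples, epochs=3, lr=5e-5, batch_size=4):
    """Basic training loop; epochs, learning rate, and batch size stand in for
    the 'parameter selection' step discussed in the workshop."""
    loader = DataLoader(OCRDataset(samples), batch_size=batch_size,
                        shuffle=True, collate_fn=collate)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            loss = model(**batch).loss  # cross-entropy over predicted tokens
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
```

In practice, an evaluation step, such as tracking character error rate on a held-out set, would accompany this loop, corresponding to the evaluation strategies discussed in the session.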
Biography
Dr. Donghyeok Choi is a Postdoctoral Fellow in the Department of History at Hong Kong Baptist University. He received his Ph.D. in Culture Technology from KAIST with a dissertation on bureaucratic success and intergenerational mobility in the Joseon dynasty. His research lies at the intersection of digital humanities, quantitative history, and AI-driven analysis, with a particular focus on applying large language models to East Asian historical sources. His work spans studies of historical governance in Korea and the development of AI methods that support data-driven research in the humanities.