AI-assisted research across multilingual corpora

Practical workshop · 3 hours

Dr. Jakub Sypiański · Ca' Foscari University of Venice


About the workshop

This workshop shows how large language models (LLMs) can genuinely accelerate and enrich research — with no programming experience required. I focus on concrete applications: analysing and translating historical sources, automatically extracting structured data from large corpora, and intelligently searching literature. The session combines short presentations with live demonstrations on real research materials — including, where possible, text fragments submitted by participants in advance.


What you will learn:


Case study

As the main case study I present my ongoing research project on the medieval agricultural history of the Mediterranean, in which I use LLMs to process multilingual historical corpora at a scale previously out of reach. The workshop illustrates how this kind of AI-assisted reading can open up historical questions that are otherwise blocked by the sheer volume of sources.


Programme

Introduction and LLMs in practice

How models work, how they differ, how to write good prompts; exercise: analysing a historical source or a linguistics article.

AI in the humanities and linguistics

OCR and working with historical manuscripts, named-entity recognition, corpus processing, laboratory data analysis.

RAG and knowledge management

NotebookLM, Obsidian with local AI, LightRAG.

Exercise and discussion

Extracting structured data from your own sources; the role of AI in academic teaching.


Instructor

Dr. Jakub Sypiański is a historian and Arabist with a PhD in History (Sorbonne, 2024). He is a postdoctoral research fellow in the ERC SSE1K project at Ca' Foscari University of Venice. His research covers intellectual exchanges between Byzantium and the Islamic world (7th–11th c.), the environmental history of the medieval Mediterranean, and AI in humanities research, a subject on which he taught a course at the Sorbonne.


Contact

[email protected] · sypian.ski