Linguists using a supercomputer – not so long ago, that might have seemed strange. But now we live in the age of AI, in which popular language models such as ChatGPT have turned the world upside down. Sara Budts and Yoshi Malaise from the Brussels Digital Text Lab (B-TXT) are taking advantage of this development.
B-TXT was founded in May 2025 to support researchers who work with large amounts of language material but lack the technical expertise to have computers do the work. Linguist Sara Budts and computer scientist Yoshi Malaise are working together to develop tools that help researchers in languages and the humanities conduct their research. They prefer to do this using the university's own infrastructure. "Firstly, because commercially available tools are often insufficient," explains Malaise. "They are designed, for example, to summarise reports or to recognise invoices and ensure that they end up in the right department. They are rarely useful for medieval texts. There are commercially available alternatives, but they are quite expensive in themselves and, on top of that, you often have to pay for the use of the servers. With the huge amounts of data required for many research projects, this becomes virtually unaffordable. And that is even more unjustifiable if you have powerful and efficient servers at your disposal."
Yoshi Malaise and Sara Budts
"Many researchers work with voice recordings, and mainstream tools often struggle with these too – think of recordings of children's voices, for example," Budts continues. "Those tools are not optimised for this, whereas we can really focus on that if needed. That, incidentally, is a second good reason to work on local servers. When working with children, very strict agreements are usually made regarding privacy. In such cases, you cannot use online applications where you do not know exactly what will happen to the data you enter."
Budts and Malaise mainly focus on researchers from faculties such as education, social and political sciences, history, and language and literature – not exactly fields you would immediately associate with state-of-the-art computer technology. Unfairly so, according to Budts. "The history department, for example, has just completed a so-called citizen science project in which ordinary citizens transcribed large quantities of witness statements from the Bruges police from the eighteenth and nineteenth centuries. These are now available digitally, allowing us to use them to build an open source model that can digitise photos of historical documents almost autonomously. This will also enable other researchers or archive institutions to make their own material more easily accessible. These are areas where the Tier 1 supercomputer can prove its worth because it can process enormous amounts of data – in our case, texts – in a very short time. The applications we develop in this way often run on the Tier 2 supercomputer, which is a slightly older model."
Do Budts and Malaise feel like outsiders among Tier 1 users? "I have the impression that they are pleased to see us," Budts remarks. "Our presence demonstrates that non-technical research groups also require rapid computing power – and are capable of utilising it efficiently."
Sara Budts studied Language and Literature at KU Leuven, but wrote her master's thesis on artificial intelligence at the University of Antwerp. She is currently a postdoctoral researcher at VUB and works at B-TXT, where her responsibilities include coordination, consulting and application development.
Yoshi Malaise studied Computer Science at VUB and is responsible for technical support: ensuring that the equipment and applications run smoothly, and estimating how much computing power is needed to develop the tools and models that B-TXT has in mind.