Domain-specific Embeddings for Question-Answering Systems: FAQs for Health Coaching


  • Andreas Martin FHNW University of Applied Sciences and Arts Northwestern Switzerland, Intelligent Information Systems Research Group, Riggenbachstrasse 16, 4600 Olten, Switzerland
  • Charuta Pande FHNW University of Applied Sciences and Arts Northwestern Switzerland, Intelligent Information Systems Research Group, Riggenbachstrasse 16, 4600 Olten, Switzerland
  • Sandro Schwander FHNW University of Applied Sciences and Arts Northwestern Switzerland, Intelligent Information Systems Research Group, Riggenbachstrasse 16, 4600 Olten, Switzerland
  • Ademola J. Ajuwon Department of Health Promotion and Education, College of Medicine, University of Ibadan, Nigeria
  • Christoph Pimmer Swiss Tropical and Public Health Institute, Education and Training Department, Kreuzstrasse 2, 4123 Allschwil, Switzerland



Conversational AI, Chatbot, Coaching, Natural Language Processing, Healthcare, HIV


FAQs are widely used to respond to users’ knowledge needs within knowledge domains. While large language models (LLMs) might be a promising way to address user questions, they are still prone to hallucinations, i.e., inaccurate or wrong responses, which can lead to serious problems, including ethical issues. For our healthcare coaching chatbot for young Nigerian HIV clients, meeting the clients’ information needs through FAQs is one of the main coaching requirements. In this paper, we explore whether domain knowledge in HIV FAQs can be represented as text embeddings to retrieve similar questions matching user queries, thus improving the chatbot’s understanding and the users’ satisfaction. Specifically, we describe our approach to developing an FAQ chatbot for the domain of HIV. We used a predefined FAQ question-answer knowledge base in English and Pidgin, co-created by HIV clients and experts from Nigeria and Switzerland. The results of the post-engagement survey show that the chatbot mostly understood the users’ questions, identified relevant matching questions, and retrieved appropriate responses.
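
The retrieval idea summarized above, matching a user query to the most similar predefined FAQ question and returning its stored answer, can be illustrated with a minimal sketch. This is not the system described in this paper: the `embed` function is a toy bag-of-words stand-in for a trained text-embedding model, and the FAQ entries, the function names, and the `threshold` value are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical FAQ entries with placeholder answers (not from the paper's knowledge base).
FAQ = [
    ("How is HIV transmitted?", "Placeholder answer about transmission."),
    ("How often should I take my medication?", "Placeholder answer about medication."),
    ("Where can I get tested for HIV?", "Placeholder answer about testing."),
]


def embed(text):
    """Toy stand-in for a text-embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, faq=FAQ, threshold=0.3):
    """Return (question, answer, score) for the best match, or None if below threshold."""
    query_vec = embed(query)
    score, question, answer = max(
        (cosine(query_vec, embed(q)), q, a) for q, a in faq
    )
    if score < threshold:
        return None  # no sufficiently similar FAQ question; the bot should not guess
    return question, answer, score


match = retrieve("where do I get tested for hiv")
no_match = retrieve("what is the weather today")
```

Because the answer is always taken verbatim from the curated knowledge base, a sketch of this kind cannot produce a hallucinated response; at worst it returns no match. In practice, the toy `embed` function would be replaced by a domain-adapted embedding model, and the threshold would need to be tuned on real user queries.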






Empowering Machine Learning and Large Language Models with Domain and Commonsense Knowledge