Overview
HEAL is back for CHI 2025! This workshop aims to address the current "evaluation crisis" in LLM research and practice by bringing together HCI and AI researchers and practitioners to rethink LLM evaluation and auditing from a human-centered perspective. Recent advances in Large Language Models (LLMs) have already shaped numerous real-world applications and will shape many more. However, these models also pose significant risks to individuals and society. To mitigate these risks and guide future model development, responsible evaluation and auditing of LLMs are essential.
The CHI 2025 Workshop on Human-centered Evaluation and Auditing of Language Models (HEAL@CHI'25) will explore topics including understanding stakeholders' needs and goals in evaluating and auditing LLMs, establishing human-centered evaluation and auditing methods, developing tools and resources to support these methods, building community, and fostering collaboration.
Special Theme - Mind the Context: For this year's HEAL, we introduce the theme of "mind the context" to encourage attendees to engage with specific contexts in LLM evaluation and auditing. This theme encompasses a range of topics: the usage contexts of LLMs (e.g., evaluating the capabilities and limitations of LLM applications in mental wellness care, or translation in high-stakes scenarios), the context of the evaluation or audit itself (e.g., who uses LLM evaluation tools, and how should we design these tools with that context in mind?), and more. We purposefully leave "context" open to interpretation, so as to encourage diversity in how participants conceptualize and operationalize this key concept in LLM evaluation and auditing.