Explainable NLP for Conversational Clinical Language

2026

This project investigated how explainable data‑mining and NLP techniques behave when applied to conversational clinical language, a form of text that differs substantially from formal medical documentation. The focus was not on maximising predictive accuracy, but on understanding how different modelling and explanation approaches relate to the linguistic properties of patient‑doctor dialogue.

Using a synthetic patient‑doctor conversation dataset (MedSynth), the project began with a corpus‑linguistic analysis to characterise the vocabulary and register of conversational clinical language. Patient and clinician turns were analysed separately using keyword and comparative sub‑corpus techniques, revealing a mixed register combining lay symptom descriptions with partially technical clinical terminology. This linguistic profile was used as an external reference point for evaluating model explanations.
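The comparative keyword step can be sketched with a standard log‑likelihood (G2) keyness score between the two speaker sub‑corpora. The toy utterances below are illustrative stand‑ins, not MedSynth data, and the tokeniser is deliberately naive:

```python
# Keyness sketch: score each word in the patient sub-corpus against the
# clinician sub-corpus with a log-likelihood (G2) statistic.
import math
from collections import Counter

def tokenize(texts):
    return [w for t in texts for w in t.lower().split()]

def keyness(target_counts, ref_counts):
    """G2 keyness of each target-corpus word relative to a reference corpus."""
    n1, n2 = sum(target_counts.values()), sum(ref_counts.values())
    scores = {}
    for word, a in target_counts.items():
        b = ref_counts.get(word, 0)
        e1 = n1 * (a + b) / (n1 + n2)   # expected frequency in target
        e2 = n2 * (a + b) / (n1 + n2)   # expected frequency in reference
        g2 = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0))
        scores[word] = g2
    return scores

# Illustrative placeholder turns, not real MedSynth dialogue.
patient = tokenize(["my chest hurts when i breathe", "i feel dizzy and sick"])
doctor = tokenize(["any dyspnoea on exertion", "we will order an ecg today"])

scores = keyness(Counter(patient), Counter(doctor))
top = sorted(scores, key=scores.get, reverse=True)[:5]
print(top)
```

Words frequent in patient turns but absent from clinician turns receive high scores, which is the mechanism that surfaces the lay-versus-technical register contrast described above.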

For the modelling task, ICD‑10 chapter‑level classification was chosen as a controlled prediction problem that avoids the sparsity and ambiguity of fine‑grained diagnostic codes. An interpretable baseline model was implemented using TF‑IDF features and XGBoost, with patient and doctor language vectorised separately to preserve speaker provenance. Model behaviour was analysed using standard classification metrics, which were treated as contextual evidence rather than optimisation targets.
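A minimal sketch of that baseline design: two independent TF‑IDF spaces, one per speaker, concatenated so every feature retains its provenance. The data and chapter labels are illustrative placeholders, and sklearn's GradientBoostingClassifier stands in for the XGBoost model actually used in the project:

```python
# Speaker-separated TF-IDF baseline: vectorise patient and doctor turns
# independently, then concatenate the feature spaces before classification.
from scipy.sparse import hstack
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Illustrative toy dialogues and ICD-10 chapter labels (not project data).
patient_turns = ["my chest hurts", "bad cough for weeks",
                 "my knee is swollen", "ankle pain after running"]
doctor_turns = ["possible angina", "sounds bronchitic",
                "likely effusion", "maybe a sprain"]
labels = ["IX", "X", "XIII", "XIII"]

vec_patient = TfidfVectorizer()
vec_doctor = TfidfVectorizer()
# Each column of X belongs unambiguously to one speaker's vocabulary.
X = hstack([vec_patient.fit_transform(patient_turns),
            vec_doctor.fit_transform(doctor_turns)]).tocsr()

clf = GradientBoostingClassifier(random_state=0).fit(X, labels)
preds = clf.predict(X)
print(preds)
```

Keeping the vocabularies separate means a later attribution step can report not just *which* token mattered but *who said it*.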

To assess explainability, SHAP was applied to extract feature‑level attributions for representative diagnostic categories. These explanations were compared against the corpus‑derived register profile to determine whether highly influential features aligned with linguistically distinctive or diagnostically meaningful vocabulary. The analysis showed that strong predictive features did not always correspond to register‑distinctive language, highlighting the limitations of feature attribution when used in isolation.

In parallel, a locally hosted large language model (Llama 3.1) was applied in a zero‑shot setting to generate both ICD‑10 chapter predictions and natural‑language explanations via chain‑of‑thought reasoning. While the LLM produced fluent and clinically plausible explanations, its reasoning relied heavily on abstract, register‑neutral clinical language and often operated at a finer diagnostic granularity than the task definition specified.
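The zero‑shot setup can be sketched as a prompt template plus a parser for the model's final answer line. The serving stack (e.g. an OpenAI‑compatible local endpoint for Llama 3.1) is assumed and not shown, and the canned reply below is an illustrative placeholder, not real model output:

```python
# Zero-shot prompting sketch: build the prompt, then extract the chapter
# label from the final "CHAPTER:" line of the model's chain-of-thought reply.
PROMPT_TEMPLATE = """You are a clinical coding assistant.
Read the patient-doctor conversation below and assign ONE ICD-10 chapter.
Think step by step, then answer on the last line as: CHAPTER: <roman numeral>

Conversation:
{dialogue}"""

def parse_chapter(reply: str) -> str:
    """Pull the chapter label off the last 'CHAPTER:' line of the reply."""
    for line in reversed(reply.strip().splitlines()):
        if line.upper().startswith("CHAPTER:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no CHAPTER line found")

prompt = PROMPT_TEMPLATE.format(dialogue="Patient: my chest hurts ...")

# Illustrative placeholder reply, not actual Llama 3.1 output.
canned_reply = "The symptoms suggest a circulatory problem.\nCHAPTER: IX"
print(parse_chapter(canned_reply))
```

Constraining the answer to a fixed final line keeps the free‑form chain‑of‑thought text available for qualitative analysis while making the prediction itself machine‑readable.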

Comparing these approaches revealed systematic differences in explanatory behaviour: SHAP explanations were tightly coupled to surface lexical features, while LLM explanations abstracted away from conversational evidence. Crucially, neither explanation paradigm aligned straightforwardly with the empirically observed linguistic register of the input data. This finding demonstrates that explainability cannot be evaluated meaningfully without understanding the language on which models operate.
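One way to operationalise this alignment check is a simple overlap score between the top‑attributed tokens and the corpus‑derived register keywords. Both token lists below are illustrative placeholders, not project results:

```python
# Overlap@k: fraction of the top-k attributed tokens that are also
# register-distinctive keywords from the corpus analysis.
def overlap_at_k(attrib_tokens, keyword_tokens, k=10):
    top = set(attrib_tokens[:k])
    return len(top & set(keyword_tokens)) / len(top)

# Hypothetical example lists, ordered by attribution strength.
shap_top = ["pain", "chest", "today", "really", "breath", "week"]
register_keywords = ["chest", "breath", "dizzy", "cough", "pain"]

score = overlap_at_k(shap_top, register_keywords, k=6)
print(score)
```

A low score flags the situation reported above: features the model leans on that are not distinctive of the conversational register, and vice versa.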

Overall, the project demonstrates an end‑to‑end data‑mining workflow combining corpus analysis, interpretable machine learning, and LLM‑based explanation generation. It highlights the importance of grounding explainability evaluation in data properties rather than treating explanations as purely internal model artefacts, particularly in high‑stakes domains such as healthcare.

Code & Artifacts

Code and analysis are available in a private repository and can be shared on request.