The entropic tongue: Disorganization of natural language under LSD

This placebo-controlled study (n=20) suggests that speech produced under the influence of LSD (75 μg) exhibits more entropy than speech produced under placebo. This allowed machine learning programs to identify speech produced under the influence of LSD without analyzing semantic content.

Abstract

“Serotonergic psychedelics have been suggested to mirror certain aspects of psychosis, and, more generally, elicit a state of consciousness underpinned by increased entropy of ongoing neural activity. We investigated the hypothesis that language produced under the effects of lysergic acid diethylamide (LSD) should exhibit increased entropy and reduced semantic coherence. Computational analysis of interviews conducted at two different time points after 75 μg of intravenous LSD verified this prediction. Non-semantic analysis of speech organization revealed increased verbosity and a reduced lexicon, changes that are more similar to those observed during manic psychoses than in schizophrenia, which was confirmed by direct comparison with reference samples. Importantly, features related to language organization allowed machine learning classifiers to identify speech under LSD with accuracy comparable to that obtained by examining semantic content. These results constitute a quantitative and objective characterization of disorganized natural speech as a landmark feature of the psychedelic state.”

Authors: Camila Sanz, Carla Pallavicini, Facundo Carrillo, Federico Zamberlan, Mariano Sigman, Natalia Mota, Mauro Copelli, Sidarta Ribeiro, David J. Nutt, Robin L. Carhart-Harris & Enzo Tagliazucchi

Summary

Serotonergic psychedelics have been suggested to mirror certain aspects of psychosis, and to elicit a state of consciousness underpinned by increased entropy of on-going neural activity. Language produced under the effects of lysergic acid diethylamide (LSD) exhibits increased entropy and reduced semantic coherence.

1. Introduction

Psychedelics are compounds with the potential to deeply alter the conscious state of the user. They have been proposed to induce a transient and reversible state resembling psychosis. Peculiar language structure, in turn, has been found to predict conversion to psychosis in individuals at risk.

Studies show that psychedelics expand the repertoire of brain configurations, increase neural entropy, render speech less predictable, and enhance free association. This may provide insight into the topography and content of the human mind with far greater depth than is ordinarily possible.

In this paper, we used automated natural language processing to determine the effects of psychedelics on the organization and content of natural language. We compared the results with speech collected during a placebo condition and with two reference samples comprising patients diagnosed with schizophrenia and bipolar disorder.

2.1. Participants and protocol

Twenty healthy volunteers underwent functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) scans. The fMRI scan was conducted at the peak of the LSD effects and the MEG scan 225 min post-infusion; participants were interviewed about their general experience at two time points after infusion.

This study was approved by the National Research Ethics Service and conducted by Imperial College London under a Home Office license.

2.2. Speech pre-processing

The transcribed interviews were preprocessed using the Natural Language Toolkit (NLTK) Python library, and the most relevant parts of speech were identified and counted for both conditions and times.

2.3. Shannon’s information entropy

Shannon’s information entropy represents the average rate at which information is produced by a source.
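
As a minimal sketch, assuming the entropy is estimated from the empirical word-frequency distribution of a transcript (the summary does not state the exact estimator), the quantity is H = sum over words of p(w) * log2(1 / p(w)). The function name `word_entropy` and the toy sentences are hypothetical.

```python
import math
from collections import Counter

def word_entropy(tokens):
    """Shannon entropy (in bits) of the empirical word distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum_w p(w) * log2(1 / p(w)), with p(w) = count(w) / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())

repetitive = "the the the the".split()
varied = "colours moved across the wall".split()
print(word_entropy(repetitive))  # 0.0 (a single repeated word carries no information)
print(word_entropy(varied))      # log2(5) bits: five equally likely words
```

Under this estimator, a transcript that keeps returning to the same few words scores low, while one that spreads its words evenly over a large vocabulary scores high.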

2.4. Word2Vec word embedding model

A Word2Vec model was trained with the Google News corpus to encode the interviews, and the semantic distance between words was determined by measuring the cosine of the angle between their vectors.
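
The cosine measure can be sketched as follows. This is an illustration on toy three-dimensional vectors, not the study's pipeline: real Word2Vec embeddings have hundreds of dimensions, and the helper names `cosine_similarity` and `cosine_distance` are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    """One common convention: distance = 1 - cosine similarity."""
    return 1.0 - cosine_similarity(u, v)

# Toy vectors standing in for word embeddings (made-up values).
word_a = [1.0, 0.0, 0.0]
word_b = [1.0, 1.0, 0.0]
word_c = [0.0, 0.0, 1.0]
print(cosine_similarity(word_a, word_b))  # about 0.707: a 45-degree angle
print(cosine_similarity(word_a, word_c))  # 0.0: orthogonal, unrelated in this toy space
```

Because the measure depends only on the angle, two vectors that point the same way are maximally similar regardless of their lengths.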

Polysemic words can introduce noise in the semantic analysis, but if the proportion of polysemic words is similar between both conditions, this noise should cancel out when performing statistical comparisons between groups.

Following previous work by Bedi and colleagues, ten features were constructed by computing the cosine distance between a word’s Word2Vec vector and a set of pre-defined terms. These terms were obtained using topic modelling with latent semantic analysis.

2.5. Rank of the word embedding matrix

We created a method to measure the minimum dimensionality required to fit a subject’s discourse without information loss. This method uses a matrix rank to quantify the smallest subspace where the vectors can be represented without information loss.

This new metric can be understood as a compressibility score. It measures the minimum dimension of the underlying vector space that is needed to embed the words used in the report without significant loss of semantic information.
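
To make the idea of rank concrete, here is a small sketch that computes the rank of a matrix whose rows are word vectors, using Gaussian elimination with a numerical tolerance. It is not the authors' implementation: the function name, the tolerance value and the toy vectors are assumptions, and in practice a library routine such as `numpy.linalg.matrix_rank` would normally be used instead.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix given as a list of equal-length rows.

    Uses Gaussian elimination with partial pivoting; entries whose
    magnitude is at most `tol` are treated as zero.
    """
    m = [[float(x) for x in row] for row in rows]
    if not m:
        return 0
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot: this column adds no new dimension
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

# Three toy word vectors; the third is the sum of the first two,
# so the words span only a 2-dimensional subspace.
embedding = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
print(matrix_rank(embedding))  # 2
```

In this picture, a report whose word vectors are mutually redundant can be embedded in fewer dimensions, which is the sense in which the metric acts as a compressibility score.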

2.6. Graph-based speech analysis

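A graph represents the structure of speech as a set of nodes connected by edges, with each word represented as a node and each edge linking consecutive words. The word count (WC) indicates the number of words in the interview.

Density (the number of edges relative to the number of possible edges between nodes) and Average Shortest Path (ASP, the mean length of the shortest paths between pairs of nodes) are two global properties of a graph.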

The average total degree, largest strongly connected component, average clustering coefficient, and the number of edges connecting a node to its neighbors are calculated for a network.

Local metrics: L1: number of edges linking a node with itself, L2: number of loops containing two nodes, L3: number of loops containing three nodes.
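
The local metrics can be illustrated on a toy transcript. The sketch below assumes a directed graph with one node per distinct word and one edge per distinct pair of consecutive words; whether repeated edges are counted separately is not stated in this summary, and the function names are hypothetical (this is not the SpeechGraphs code).

```python
def speech_graph(tokens):
    """Directed word graph: distinct words as nodes, distinct
    consecutive-word pairs as edges."""
    nodes = set(tokens)
    edges = set(zip(tokens, tokens[1:]))
    return nodes, edges

def loop_counts(nodes, edges):
    """Return (L1, L2, L3): loops over one, two and three distinct nodes."""
    successors = {n: set() for n in nodes}
    for a, b in edges:
        successors[a].add(b)
    # L1: edges linking a node with itself.
    l1 = sum(1 for a, b in edges if a == b)
    # L2: pairs a -> b -> a; each pair is seen once from each endpoint.
    l2 = sum(1 for a, b in edges if a != b and (b, a) in edges) // 2
    # L3: cycles a -> b -> c -> a; each is seen once from each of its nodes.
    l3 = 0
    for a in nodes:
        for b in successors[a]:
            if b == a:
                continue
            for c in successors[b]:
                if c != a and c != b and a in successors[c]:
                    l3 += 1
    return l1, l2, l3 // 3

tokens = "i i see i hear see".split()
nodes, edges = speech_graph(tokens)
print(len(tokens), len(nodes), len(edges))  # 6 3 5 (word count, nodes, edges)
print(loop_counts(nodes, edges))            # (1, 1, 1)
```

In the toy transcript, "i i" produces the self-loop, "i see i" the two-node loop, and "i hear see" followed by the earlier "see i" closes the three-node loop.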

SpeechGraphs v1.0 was used to analyze the interviews and to compute metrics for the non-semantic aspects of speech.

To rule out the potential confound of increased verbosity, 1000 different random graphs were computed using the words from each interview. The average was used as normalization for the machine learning binary classifier.
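
One way to picture this normalization is sketched below. It assumes that each random graph is built by shuffling the word order of the interview and that a metric is normalized by dividing it by its mean over the random graphs; both are assumptions, since the summary does not give these details. The names `edge_count` and `normalized_metric` are hypothetical.

```python
import random

def edge_count(tokens):
    """Number of distinct directed edges between consecutive words."""
    return len(set(zip(tokens, tokens[1:])))

def normalized_metric(tokens, metric, n_random=1000, seed=0):
    """Metric of the real word sequence divided by its mean over
    `n_random` shuffled versions of the same words."""
    rng = random.Random(seed)
    shuffled = list(tokens)
    total = 0.0
    for _ in range(n_random):
        rng.shuffle(shuffled)  # same words, random order
        total += metric(shuffled)
    mean_random = total / n_random
    if mean_random == 0:
        return float("nan")
    return metric(tokens) / mean_random

tokens = "the wall the wall the wall moved and then the wall stopped".split()
print(normalized_metric(tokens, edge_count))
```

Because the shuffled versions contain exactly the same words, a value below 1 here would indicate that the real word order produces fewer distinct edges than chance, independently of how long the interview is.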

2.7. Non-semantic speech disorganization index

To compare the current sample with a psychopathological sample, we computed a speech disorganization index combining measures of graph connectedness weighted by psychotic symptomatology. This index has been shown to correlate with negative symptoms of psychosis and distinguish patients with schizophrenia from individuals without schizophrenia. We compared the disorganization index of participants under LSD and placebo (PCB) with the disorganization index of individuals diagnosed with schizophrenia, bipolar disorder, and individuals free from psychotic symptoms.

2.8. Machine learning classifier

A 5-fold cross-validated random forest classifier was applied to the speech graphs of participants to distinguish between the psychedelic state (LSD) and the ordinary waking state (PCB). Two different models were implemented, the first one using semantic similarity and the second one using non-semantic features.

3. Results

The corpus was divided into four groups: interviews under LSD/PCB at time 1/time 2; analyses were conducted between pairs of conditions at separate times after infusion.

3.1. Word frequency and semantic content

The most frequent words used in the interviews under LSD were related to perception, whereas the most frequent words used in the interviews under PCB were related to tension and vigilance.

Based on this result, and on previous studies that applied topic modeling to retrospective reports of psychoactive drug experiences, we found that PCB and LSD increased the average semantic similarity to “mood” at both time 1 and time 2.

3.2. Speech graphs

The speech graphs of subjects under LSD and PCB show differences in the amount and organization of nodes and edges, as well as in the normalized speech graph metrics. Subjects under LSD used a smaller vocabulary in these interviews.

3.3. Speech disorganization index

A disorganization index was calculated for speech samples from individuals manifesting psychotic symptoms with diagnosis of schizophrenia or bipolar disorder. The disorganization index for the LSD condition was different from the schizophrenia group, but not from the bipolar disorder group or from the control group.

3.4. Entropy, semantic variability and rank

We found that subjects under the effects of LSD had a higher Shannon’s entropy and a higher minimum rank required to represent the word embedding matrix without loss of information.

3.5. Machine learning classifier

We performed 1000 iterations of stratified cross-validation on a binary random forest classifier based on semantic and non-semantic features. The mean AUC with semantic features was 0.7570 ± 0.0003, compared to 0.507 ± 0.004 obtained with label shuffling (p = 0.015).
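
The logic of comparing an AUC against label shuffling can be sketched in a few lines. This is a simplified illustration, not the study's procedure: it shuffles labels against a fixed set of hypothetical classifier scores rather than retraining a random forest on each shuffle, and all names and numbers below are made up.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive example scores higher than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def shuffle_p_value(labels, scores, n_perm=1000, seed=0):
    """Observed AUC and the fraction of label shuffles that reach it."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auc(shuffled, scores) >= observed:
            at_least += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical scores: 1 = interview under LSD, 0 = interview under PCB.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
observed, p = shuffle_p_value(labels, scores)
print(observed, p)
```

An AUC near 0.5 under shuffled labels is what chance performance looks like, which is why the shuffled value serves as the reference for the reported AUC.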

4. Discussion

Several neuropsychiatric conditions can be diagnosed by analyzing the flow of natural language produced by patients. This method is objective, automatic and cost-effective.

The production of language is profoundly affected in psychoses, with different patterns observed in bipolar and schizophrenic psychosis, but speech produced in the psychedelic state has received less quantitative attention. The present work addresses this gap by revealing that the psychedelic state is characterized by considerably unconstrained speech.

Semantic analysis was used to differentiate language produced under LSD vs. PCB. Terms related to visual and auditory perception featured prominently under LSD. We extracted words from a corpus of first-person reports of psychedelic experiences to determine the frequency of occurrence in PCB reports. These words were consistent with previous semantic association studies and with the neurophysiological effects of LSD.

Under LSD, subjects presented higher Shannon’s information entropy, semantic variability, and rank of the word embedding matrix compared to PCB, indicating increased speech disorganization and more sudden “jumps” in the content of one’s discourse.

The analysis of speech graphs revealed that LSD increased the verbosity of speech while reducing the lexicon, and that the topics covered were more diverse and varied. The present findings situate LSD’s effects closer to the quality of speech seen in manic patients than those with schizophrenia, and confirm the difference from the connectedness pattern associated with schizophrenia diagnosis and the similarity with the speech patterns associated with bipolar disorder and matched controls.

Standard diagnostic criteria might fail to provide a robust and consistent characterization of psychiatric conditions. Future work should specifically combine computational linguistic analyses with in-depth biological profiling.

The hypothesis that LSD induces a transient state of psychosis has been questioned from different perspectives. The results reported here are supportive of the entropic brain model, which states that higher entropy should manifest at the level of subjective experience.

In this regard, semantic discontinuities are intrinsic to the content of the narration, and are thus likely to be preserved even in retrospective accounts of subjective experiences or thought processes. However, speech graphs characterize how spontaneous speech production itself can be disrupted under the effects of LSD.

LSD has basic yet profound effects on human consciousness, which are difficult to study via classic behavioral paradigms. Analysis of spontaneously produced speech is therefore a convenient way to approach the quantification of a drug’s behavioral effects.

Future studies should conduct a more exhaustive examination of language produced under the acute effects of LSD, such as asking subjects to narrate a recent dream or a past experience they consider meaningful.

Previous work has shown that increased neural entropy under psychedelics is predictive of enduring psychological changes, and that natural language processing could represent a particularly useful approach for screening, monitoring and predicting treatment response in conditions associated with rigid thinking and behavioral patterns.

5. Conclusions

We characterized natural language under the acute effects of LSD, and showed that speech becomes more disorganized under LSD, aligning more closely with speech seen in manic psychoses than in cases of schizophrenia. Non-semantic features can classify interviews under the drug vs. PCB with performance comparable to that obtained with semantic features.

CRediT authorship contribution statement

Camila Sanz, Carla Pallavicini, Facundo Carrillo, Federico Zamberlan, Mariano Sigman, Natalia Mota, Mauro Copelli, Sidarta Ribeiro, David Nutt, Robin Carhart-Harris, Enzo Tagliazucchi contributed to this work.

Acknowledgements

CS, CP, FC, FZ, MS, ET and NM are supported by CONICET, CNPq, FINEP, FAPERN and the Tamas family. RCH is supported by the Alex Mosley Charitable Trust.

Authors

Authors associated with this publication with profiles on Blossom

Enzo Tagliazucchi
Enzo Tagliazucchi is the head of the Consciousness, Culture and Complexity Group at the Buenos Aires University, a Professor of Neuroscience at the Favaloro University, and a Marie Curie fellow at the Brain and Spine Institute in Paris. His main interest is the study of human consciousness as embedded within society and culture.

David Nutt
David John Nutt is a great advocate for looking at drugs and their harm objectively and scientifically. This got him dismissed as ACMD (Advisory Council on the Misuse of Drugs) chairman.

Robin Carhart-Harris
Dr. Robin Carhart-Harris is the Founding Director of the Neuroscape Psychedelics Division at UCSF. Previously he led the Psychedelic group at Imperial College London.