The Experience Elicited by Hallucinogens Presents the Highest Similarity to Dreaming within a Large Database of Psychoactive Substance Reports

This paper compared over 20,000 Erowid ‘trip’ reports with over 200,000 dream reports to evaluate the semantic similarities between experiences elicited by psychoactive substances and those of dreams. The analysis found that hallucinogens (vs. sedatives, stimulants, etc.) elicited experiences with the highest semantic similarity to dreams.

Abstract

“Ever since the modern rediscovery of psychedelic substances by Western society, several authors have independently proposed that their effects bear a high resemblance to the dreams and dreamlike experiences occurring naturally during the sleep-wake cycle. Recent studies in humans have provided neurophysiological evidence supporting this hypothesis. However, a rigorous comparative analysis of the phenomenology (“what it feels like” to experience these states) is currently lacking. We investigated the semantic similarity between a large number of subjective reports of psychoactive substances and reports of high/low lucidity dreams, and found that the highest-ranking substance in terms of the similarity to high lucidity dreams was the serotonergic psychedelic lysergic acid diethylamide (LSD), whereas the highest-ranking in terms of the similarity to dreams of low lucidity were plants of the Datura genus, rich in deliriant tropane alkaloids. Conversely, sedatives, stimulants, antipsychotics, and antidepressants comprised most of the lowest-ranking substances. An analysis of the most frequent words in the subjective reports of dreams and hallucinogens revealed that terms associated with perception (“see,” “visual,” “face,” “reality,” “color”), emotion (“fear”), setting (“outside,” “inside,” “street,” “front,” “behind”) and relatives (“mom,” “dad,” “brother,” “parent,” “family”) were the most prevalent across both experiences. In summary, we applied novel quantitative analyses to a large volume of empirical data to confirm the hypothesis that, among all psychoactive substances, hallucinogen drugs elicit experiences with the highest semantic similarity to those of dreams. Our results and the associated methodological developments open the way to study the comparative phenomenology of different altered states of consciousness and its relationship with non-invasive measurements of brain physiology.”

Authors: Camila Sanz, Federico Zamberlan, Earth Erowid, Fire Erowid & Enzo Tagliazucchi

Summary

Serotonergic psychedelics have been suggested to mirror certain aspects of psychosis, and to elicit a state of consciousness underpinned by increased entropy of on-going neural activity. Language produced under the effects of lysergic acid diethylamide (LSD) exhibits increased entropy and reduced semantic coherence.

1. Introduction

Psychedelics are compounds with the potential to deeply alter the conscious state of the user. They have been used to induce a transient and reversible psychosis-like state, which is relevant here because peculiar language structure has been found to predict conversion to psychosis in individuals at risk.

Studies show that psychedelics expand the repertoire of brain configurations, increase neural entropy, render speech less predictable, and enhance free association. This may provide insight into the topography and content of the human mind with far greater depth than is ordinarily possible.

In this paper, we used automated natural language processing to determine the effects of psychedelics on the organization and content of natural language. We compared the results with speech collected during a placebo condition and with two reference samples comprising patients diagnosed with schizophrenia and bipolar disorder.

2.1. Participants and protocol

Twenty healthy volunteers underwent functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) scans and were interviewed about their general experience. The fMRI scan was conducted at the peak of the LSD effects, and the MEG scan 225 min post-infusion.

This study was approved by the National Research Ethics Service and conducted by Imperial College London under a Home Office license.

2.2. Speech pre-processing

The transcribed interviews were preprocessed using the Natural Language Toolkit (NLTK) Python library, and the most relevant parts of speech were identified and counted for both conditions and times.

2.3. Shannon’s information entropy

Shannon’s information entropy represents the average rate at which information is produced by a source.
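As a rough illustration, the entropy of a transcript can be estimated from its word-frequency distribution. The sketch below assumes entropy is computed over single-word frequencies, which may differ from the exact procedure used in the paper; the function name and the toy sentences are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(words):
    """Return the Shannon entropy (in bits) of the word-frequency distribution."""
    counts = Counter(words)
    total = sum(counts.values())
    # H = -sum(p * log2(p)) over the observed word probabilities
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy transcripts, not data from the study
repetitive = "the cat the cat the cat the cat".split()
varied = "the cat saw a bright red moving shape".split()

print(shannon_entropy(repetitive))  # two equally likely words
print(shannon_entropy(varied))      # eight equally likely words
```

A transcript that keeps returning to the same few words yields a low value, while one that spreads its words evenly yields a high value.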

2.4. Word2Vec word embedding model

A Word2Vec model was trained with the Google News corpus to encode the interviews, and the semantic distance between words was determined by measuring the cosine of the angle between their vectors.
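The cosine measure can be sketched in a few lines. The three-dimensional vectors below are made-up stand-ins for real Word2Vec embeddings (which typically have hundreds of dimensions), and the function name is an assumption for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional vectors standing in for Word2Vec embeddings
toy_vectors = {
    "see": [0.9, 0.1, 0.0],
    "visual": [0.8, 0.2, 0.1],
    "street": [0.0, 0.3, 0.9],
}

sim_close = cosine_similarity(toy_vectors["see"], toy_vectors["visual"])
sim_far = cosine_similarity(toy_vectors["see"], toy_vectors["street"])
print(sim_close, sim_far)
```

Values near 1 indicate vectors pointing in nearly the same direction (semantically close words), and values near 0 indicate unrelated words; a distance can be derived as one minus the similarity.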

Polysemic words can introduce noise in the semantic analysis, but if the proportion of polysemic words is similar between both conditions, this noise should cancel out when performing statistical comparisons between groups.

Following previous work by Bedi and colleagues, ten features were constructed by computing the cosine distance between a word’s Word2Vec vector and a set of pre-defined terms. These terms were obtained using topic modelling with latent semantic analysis.

2.5. Rank of the word embedding matrix

We created a method to measure the minimum dimensionality required to fit a subject’s discourse without information loss. This method uses a matrix rank to quantify the smallest subspace where the vectors can be represented without information loss.

This new metric can be understood as a compressibility score. It measures the minimum dimension of the underlying vector space that is needed to embed the words used in the report without significant loss of semantic information.
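To make the idea concrete, the sketch below computes the rank of a small matrix whose rows play the role of word vectors. It uses plain Gaussian elimination with a numerical tolerance, which is an assumption: the criterion the authors use for "significant loss" is not given here, and the toy matrix is invented.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of equal-length rows) via Gaussian elimination."""
    m = [list(map(float, r)) for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Use the remaining row with the largest entry in this column as pivot
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot: this column adds no new dimension
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
        if rank == len(m):
            break
    return rank


# Hypothetical embedding matrix: four "words" that all lie in a 2-D plane
toy_embeddings = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [2, 0, 0],
]
print(matrix_rank(toy_embeddings))
```

Here four word vectors span only two dimensions, so the discourse they represent is highly compressible in the sense described above.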

2.6. Graph-based speech analysis

A graph represents the structure of speech as a set of nodes connected by edges. The word count (WC) indicates the number of words in the interview.

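Density and Average Shortest Path (ASP) are two global properties of a graph: density relates the number of edges to the number of possible edges between nodes, and ASP is the mean length of the shortest paths between pairs of nodes.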

The average total degree, largest strongly connected component, average clustering coefficient, and the number of edges connecting a node to its neighbors are calculated for a network.

Local metrics: L1: number of edges linking a node with itself, L2: number of loops containing two nodes, L3: number of loops containing three nodes.
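The local metrics above can be sketched as follows. The code assumes a directed graph with one node per distinct word and one edge per observed succession of two words; it does not reproduce the SpeechGraphs software, and the function names and toy sentence are hypothetical.

```python
def speech_graph(words):
    """Directed graph: one node per distinct word, one edge per observed succession."""
    nodes = set(words)
    edges = set(zip(words, words[1:]))
    return nodes, edges


def loop_counts(nodes, edges):
    """Return (L1, L2, L3): self-loops, two-node loops, three-node loops."""
    l1 = sum(1 for a, b in edges if a == b)
    # Count each reciprocal pair once by only looking at the a < b direction
    l2 = sum(1 for a, b in edges if a < b and (b, a) in edges)
    successors = {n: set() for n in nodes}
    for a, b in edges:
        if a != b:
            successors[a].add(b)
    # Each cycle a->b->c->a is found once per starting node, so divide by 3
    l3 = sum(
        1
        for a in nodes
        for b in successors[a]
        for c in successors[b]
        if c != a and a in successors[c]
    ) // 3
    return l1, l2, l3


# Hypothetical toy utterance, not data from the study
words = "i saw saw the light and the light saw i".split()
nodes, edges = speech_graph(words)
word_count = len(words)
l1, l2, l3 = loop_counts(nodes, edges)
print(word_count, len(nodes), len(edges), l1, l2, l3)
```

In this toy utterance the repeated "saw saw" produces a self-loop, "i saw ... saw i" a two-node loop, and "the light and the" a three-node loop.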

SpeechGraphs v1.0 was used to analyze the interviews and to compute metrics for the non-semantic aspects of speech.

To rule out the potential confound of increased verbosity, 1000 different random graphs were computed using the words from each interview. The average was used as normalization for the machine learning binary classifier.
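The normalization step might look like the sketch below, which shuffles the word order 1000 times, recomputes a graph metric on each shuffled sequence, and averages the result. Dividing the observed value by that average is an assumption, since the text does not say whether a ratio or a difference was used; the edge count is only one example metric, and all names are hypothetical.

```python
import random


def count_edges(words):
    """Number of distinct directed edges between consecutive words."""
    return len(set(zip(words, words[1:])))


def random_baseline(words, metric, n_shuffles=1000, seed=0):
    """Average of `metric` over graphs built from randomly reordered words."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(words)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += metric(shuffled)
    return total / n_shuffles


# Hypothetical toy utterance, not data from the study
words = "i saw saw the light and the light saw i".split()
observed = count_edges(words)
baseline = random_baseline(words, count_edges)
normalized = observed / baseline  # assumption: normalization as a ratio
print(observed, baseline, normalized)
```

Because the shuffled sequences keep the same words and the same length, any difference between the observed and baseline values reflects word order rather than verbosity.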

2.7. Non-semantic speech disorganization index

To compare the current sample with a psychopathological sample, we computed a speech disorganization index combining measures of graph connectedness weighted by psychotic symptomatology. This index has been shown to correlate with negative symptoms of psychosis and distinguish patients with schizophrenia from individuals without schizophrenia. We compared the disorganization index of individuals on LSD and placebo (PCB) with the disorganization index of individuals diagnosed with schizophrenia, bipolar disorder, and individuals free from psychotic symptoms.

2.8. Machine learning classifier

A 5-fold cross-validated random forest classifier was applied to the speech of participants to distinguish between the psychedelic state (LSD) and the waking state (placebo, PCB). Two different models were implemented, the first one using semantic similarity and the second one using non-semantic features.

3. Results

The corpus was divided into four groups: interviews under LSD/PCB at time 1/time 2; analyses were conducted between pairs of conditions at separate times after infusion.

3.1. Word frequency and semantic content

The most frequent words used in the interviews under LSD were related to perception, whereas the most frequent words used in the interviews under PCB were related to tension and vigilance.

Based on this result, and on previous studies that applied topic modeling to retrospective reports of psychoactive drug experiences, we found that PCB and LSD increased the average semantic similarity to “mood” at both time 1 and time 2.

3.2. Speech graphs

The speech graphs of subjects under LSD and PCB show differences in the amount and organization of nodes and edges, as well as in the normalized speech graph metrics. Subjects under LSD used a smaller vocabulary in these interviews.

3.3. Speech disorganization index

A disorganization index was calculated for speech samples from individuals manifesting psychotic symptoms with diagnosis of schizophrenia or bipolar disorder. The disorganization index for the LSD condition was different from the schizophrenia group, but not from the bipolar disorder group or from the control group.

3.4. Entropy, semantic variability and rank

We found that subjects under the effects of LSD had a higher Shannon’s entropy and a lower minimum rank required to represent the word embedding matrix without loss of information.

3.5. Machine learning classifier

We performed 1000 iterations of stratified cross-validation on a binary random forest classifier based on semantic and non-semantic features. The mean AUC with semantic features was 0.7570 ± 0.0003, compared to 0.507 ± 0.004 obtained with label shuffling (p = 0.015).

4. Discussion

Several neuropsychiatric conditions can be diagnosed by analyzing the flow of natural language produced by patients. This method is objective, automatic and cost-effective.

The production of language is profoundly affected in psychoses, with different patterns observed in bipolar and schizophrenic psychosis. The present work extends this characterization to the psychedelic state, revealing that it is marked by considerably unconstrained speech.

Semantic analysis was used to differentiate language produced under LSD vs. PCB. Terms related to visual and auditory perception featured prominently under LSD. We extracted words from a corpus of first-person reports of psychedelic experiences to determine the frequency of occurrence in PCB reports. These words were consistent with previous semantic association studies and with the neurophysiological effects of LSD.

Under LSD, subjects presented higher Shannon’s information entropy, semantic variability, and rank of the word embedding matrix compared to PCB, indicating increased speech disorganization and more sudden “jumps” in the content of one’s discourse.

The analysis of speech graphs revealed that LSD increased the verbosity of speech while reducing the lexicon, and that the topics covered were more diverse. These findings situate LSD’s effects closer to the quality of speech seen in manic patients than in those with schizophrenia. The disorganization index supports this: speech under LSD differed from the connectedness pattern associated with a schizophrenia diagnosis and resembled the patterns associated with bipolar disorder and matched controls.

Standard diagnostic criteria might fail to provide a robust and consistent characterization of psychiatric conditions. Future work should specifically combine computational linguistic analyses with in-depth biological profiling.

The hypothesis that LSD induces a transient state of psychosis has been questioned from different perspectives. The results reported here are supportive of the entropic brain model, which states that higher entropy should manifest at the level of subjective experience.

In this regard, semantic discontinuities are intrinsic to the content of the narration, and are thus likely to be preserved even in retrospective accounts of subjective experiences or thought processes. However, speech graphs characterize how spontaneous speech production itself can be disrupted under the effects of LSD.

LSD has basic yet profound effects on human consciousness, which are difficult to study via classic behavioral paradigms. Analysis of spontaneously produced speech is therefore a convenient way to approach the quantification of a drug’s behavioral effects.

Future studies should conduct a more exhaustive examination of language produced under the acute effects of LSD, such as asking subjects to narrate a recent dream or a past experience they consider meaningful.

Previous work has shown that increased neural entropy under psychedelics is predictive of enduring psychological changes, and that natural language processing could represent a particularly useful approach for screening, monitoring and predicting treatment response in conditions associated with rigid thinking and behavioral patterns.

5. Conclusions

We characterized natural language under the acute effects of LSD, and showed that speech becomes more disorganized under LSD, aligning more closely with speech seen in manic psychoses than in cases of schizophrenia. Non-semantic features can classify interviews under the drug vs. PCB with performance comparable to that of semantic features.

CRediT authorship contribution statement

Camila Sanz, Carla Pallavicini, Facundo Carrillo, Federico Zamberlan, Mariano Sigman, Natalia Mota, Mauro Copelli, Sidarta Ribeiro, David Nutt, Robin Carhart-Harris, Enzo Tagliazucchi contributed to this work.

Acknowledgements

CS, CP, FC, FZ, MS, ET and NM are supported by CONICET, CNPq, FINEP, FAPERN and the Tamas family. RCH is supported by the Alex Mosley Charitable Trust.
