Natural Language as a Window into the Subjective Effects and Neurochemistry of Psychedelic Drugs

This preprint review (2021) explores the link between serotonergic psychedelics and language production. It shows that language production can be used to predict the therapeutic outcomes of individual psychedelic experiences and that, therefore, brief interviews obtained before, during and after the experience may expand the range of potential scientific conclusions.


“Psychedelics are drugs capable of eliciting profound alterations in the subjective experience of the users, sometimes with long-lasting consequences. Because of this, psychedelic research tends to focus on human subjects, given their capacity to construct detailed narratives about the contents of their consciousness experiences. In spite of its relevance, the interaction between serotonergic psychedelics and language production is comparatively understudied in the recent literature. This review is focused on two aspects of this interaction: how the acute effects of psychedelic drugs impact on speech organization regardless of its semantic content, and how to characterize the subjective effects of psychedelic drugs by analyzing the semantic content of written retrospective reports. We show that the computational characterization of language production is an emergent powerful tool to predict the therapeutic outcome of individual experiences, relate the effects elicited by psychedelics with those associated with other altered states of consciousness, draw comparisons between the psychedelic state and the symptomatology of certain psychiatric disorders, and investigate the neurochemical profile and mechanism of action of different psychedelic drugs. We conclude that researchers studying psychedelics can considerably expand the range of their potential scientific conclusions by analyzing brief interviews obtained before, during and after the acute effects. Finally, we list a series of questions and open problems that should be addressed to further consolidate this approach.”

Authors: Enzo Tagliazucchi


Psychedelic drugs can affect the subjective experience of the users, sometimes with long-lasting consequences. This review focuses on two aspects of this interaction: how the acute effects of psychedelic drugs impact on speech organization regardless of its semantic content, and how to characterize the subjective effects of psychedelic drugs.

Humans and other animals display a natural tendency to consume drugs that transiently modify their behavior, cognition and overall state of consciousness. Some humans have adopted an exploratory attitude towards drugs, experiencing their effects and then communicating to others the nature of their subjective experiences.

Due to the complexity of psychedelic effects, it is almost impossible for users to communicate the nature of their subjective experience without resorting to language. Animal models are not sufficient to understand psychedelic effects.

Language plays an important role in the investigation of serotonergic psychedelics, yet studies of the interaction between these drugs and language production are relatively underrepresented in the literature. In this review, we will discuss the advantages and limitations of unconstrained natural language reports compared to other tools to quantify subjective experience.

The consensus view is that research on psychedelic drugs must necessarily include human participants, yet what is the adequate methodology to investigate these participants and their subjective experiences?

Impaired attention limits the use of paradigms from cognitive neuroscience, such as the standard approach of measuring performance in a task designed to evaluate specific domains of human cognition and their relationship with brain function. Self-reported questionnaires have been widely used to study how psychedelic compounds affect cognition, conscious perception, thought processes, beliefs, attitudes, and personality traits in humans. Scores on these questionnaires have been correlated with objective measurements of brain activity obtained using neuroimaging techniques.

The study of natural and spontaneous goal-oriented behavior is uncommon, mainly due to the difficulty of extracting meaningful objective and quantitative data from unstructured and unconstrained observations.

The completeness of subjective reports communicated via natural language is limited by the participants’ capacities for recall and introspection. Guided interviews could help to capture subjective experience in more detail. Unconstrained reports of psychedelic experience are ubiquitous, and there are large online repositories of such reports. While the usefulness of this data is hindered by several unknowns and limitations, it nevertheless represents one of the largest sources of information concerning the acute effects of psychedelic drugs.

During the acute effects of psychedelics, 5-HT2A receptors, the main pharmacological target of these drugs, are activated in the brain, and this is reflected in the activation of regions involved in language production and processing.

Psychedelic drugs can modify the production and understanding of language, including both semantic and non-semantic features. These language alterations could be characteristic of specific neuropsychiatric disorders, which could arise from neurochemical alterations at the level of networks of neurons.

To analyze natural language reports of psychedelic drug use, it is necessary to extract reliable and quantitative information. NLP methods are used to do this, and we will discuss only studies based on NLP applied to transcripts.

The analysis of written reports can be subdivided into two domains: semantic analyses (inferring the meaning intended by the subjects) and nonsemantic analyses (analyzing other aspects of language that are, in general, independent of meaning).

Texts are first pre-processed to reduce the proliferation of different but semantically related terms, and to remove stopwords. Semantic analyses usually start from a term-document matrix, but a sparse term-document matrix is likely to yield misleading results. Moreover, two terms might never co-occur in the documents, but at the same time frequently occur together with a third term, thus being semantically related.
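The pre-processing and term-document steps can be sketched as follows. This is a minimal illustration, not the pipeline used in the reviewed studies: the stopword list, the `crude_stem` helper (a stand-in for a real stemmer or lemmatizer) and the three example reports are all invented for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real pipelines use far longer lists).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "i", "was", "were", "my", "it"}

def crude_stem(token):
    """Strip a few common English suffixes so related word forms share one term."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Lowercase, tokenize, drop stopwords, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]

def term_document_matrix(documents):
    """Return (vocabulary, matrix) with one row per term and one column per document."""
    counts = [Counter(preprocess(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocabulary]
    return vocabulary, matrix

# Invented example reports, not taken from any dataset.
reports = [
    "The colors were glowing and the walls were breathing",
    "I saw glowing colors in the music",
    "The music dissolved my sense of time",
]
vocabulary, matrix = term_document_matrix(reports)
zeros = sum(row.count(0) for row in matrix)
print(vocabulary)
print(f"{zeros} of {len(matrix) * len(reports)} entries are zero")
```

Even with three short reports, most entries of the matrix are zero, which is the sparsity problem the paragraph above refers to.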

Latent semantic analysis (LSA) reduces the number of linearly independent rows in the term-document matrix by lowering its rank, and then estimates the similarity between pairs of documents by computing the cosine similarity or the correlation coefficient between the corresponding columns of the rank-reduced matrix.
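A rough sketch of the idea, using only the Python standard library: the rank reduction is obtained here from the leading eigenvectors of the document-by-document Gram matrix, computed by power iteration with deflation. This is one possible way to get the truncated decomposition for a toy case, not necessarily how the reviewed studies implemented LSA (which would normally rely on a library SVD routine). The matrix and the choice `k=2` are invented for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def top_eigenpairs(sym, k, iterations=500):
    """Leading k eigenpairs of a symmetric positive semidefinite matrix
    (power iteration with deflation; adequate for small toy matrices)."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = [float(i + 1) for i in range(n)]  # deterministic, non-uniform start
        for _ in range(iterations):
            w = [dot(row, v) for row in work]
            norm = math.sqrt(dot(w, w))
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(row, v) for row in work])
        pairs.append((eigenvalue, v))
        for i in range(n):  # remove the component just found
            for j in range(n):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return pairs

def lsa_document_vectors(matrix, k):
    """Coordinates of each document (column) in the rank-k latent space.
    Cosines between these vectors equal cosines between the columns
    of the rank-k approximation of the term-document matrix."""
    columns = list(zip(*matrix))
    gram = [[dot(a, b) for b in columns] for a in columns]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(max(lam, 0.0)) * vec[j] for lam, vec in pairs]
            for j in range(len(columns))]

# Invented toy matrix: rows are terms, columns are five hypothetical reports.
matrix = [
    [1, 0, 0, 0, 0],  # visual
    [1, 1, 0, 0, 0],  # color
    [0, 1, 1, 0, 0],  # pattern
    [0, 0, 1, 0, 0],  # geometry
    [0, 0, 0, 1, 0],  # fear
    [0, 0, 0, 1, 1],  # panic
    [0, 0, 0, 0, 1],  # anxiety
]
columns = list(zip(*matrix))
raw_similarity = cosine(columns[0], columns[2])   # reports 0 and 2 share no terms
vectors = lsa_document_vectors(matrix, k=2)
lsa_similarity = cosine(vectors[0], vectors[2])   # both are linked through report 1
cross_topic = cosine(vectors[0], vectors[3])      # unrelated vocabulary
print(raw_similarity, round(lsa_similarity, 3), round(cross_topic, 3))
```

Reports 0 and 2 have no terms in common, so their raw cosine similarity is zero, yet after rank reduction they become nearly identical because each shares a term with report 1. This is the "third term" effect that motivates LSA.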

LSA and word embeddings are used to estimate the semantic similarity between words, and can be used to define metrics of semantic coherence, which can be used to identify altered thought processes in psychiatric patients.
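One simple coherence metric, sketched below, is the mean cosine similarity between the vectors of consecutive words. This is only one of several possible definitions (published metrics often compare sentences or sliding windows instead), and the two-dimensional `embeddings` table is made up for the example rather than taken from a trained model.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def semantic_coherence(tokens, embeddings):
    """Mean cosine similarity between the vectors of consecutive known words."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    similarities = [cosine(a, b) for a, b in zip(vectors, vectors[1:])]
    return sum(similarities) / len(similarities)

# Hypothetical 2-d word vectors; real embeddings have hundreds of dimensions.
embeddings = {
    "color": [1.0, 0.1],
    "light": [0.9, 0.2],
    "pattern": [0.8, 0.3],
    "tax": [0.1, 1.0],
    "invoice": [0.2, 0.9],
}

coherent = semantic_coherence(["color", "light", "pattern"], embeddings)
incoherent = semantic_coherence(["color", "tax", "light", "invoice"], embeddings)
print(round(coherent, 3), round(incoherent, 3))
```

A sequence that jumps between unrelated topics yields a lower score than one that stays within a single semantic neighborhood.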

Non-semantic methods treat texts as sequences of tokens arranged according to syntactical rules, regardless of their meaning. One such approach represents words as nodes of a graph and links consecutive words with edges. The topological properties of the resulting graphs contain information useful to characterize drug-induced language alterations, as well as abnormalities specific to certain neuropsychiatric disorders.
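A minimal sketch of a word graph follows. It assumes a common construction in which each distinct word is a node and each pair of consecutive words is a directed edge; the specific measures computed here (node count, edge count, repeated edges, largest connected component) are illustrative choices, and the sentence is invented.

```python
from collections import defaultdict

def speech_graph(tokens):
    """Nodes are distinct words; directed edges link each word to the next one.
    Edge values count how often the same transition is repeated."""
    edges = defaultdict(int)
    for current, following in zip(tokens, tokens[1:]):
        edges[(current, following)] += 1
    return set(tokens), dict(edges)

def largest_connected_component(nodes, edges):
    """Size of the largest component, ignoring edge direction."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best

tokens = "i saw the light and the light saw me".split()
nodes, edges = speech_graph(tokens)
repeated = sum(1 for count in edges.values() if count > 1)
print(len(nodes), len(edges), repeated, largest_connected_component(nodes, edges))
```

None of these measures depends on what the words mean, only on how they are sequenced.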

Semantic methods are useful to investigate retrospective subjective reports in terms of their content, and non-semantic methods are useful to determine how the overall structure of verbal expression is modified during the acute effects.

Psychedelics have been investigated as agents capable of eliciting states of altered consciousness and cognition that are similar to psychosis. The characteristics of unconstrained speech produced during the acute effects of psychedelics remain relatively underexplored from the perspective of NLP.

Sanz and colleagues applied semantic and non-semantic methods to interviews conducted after intravenous infusion of 75 µg of LSD. They found that the acute effects of the drug reduced semantic coherence and increased Shannon’s entropy.
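For reference, Shannon's entropy of a discrete distribution is H = -Σ p_i log2(p_i). The sketch below applies it to the distribution of word frequencies in a transcript, which is an assumption made for illustration; the summary does not specify which distribution the study used.

```python
import math
from collections import Counter

def shannon_entropy(tokens):
    """Entropy (in bits) of the word-frequency distribution of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

repetitive = "the the the the".split()
varied = "colors melted into music".split()
print(shannon_entropy(repetitive), shannon_entropy(varied))
```

A transcript that repeats a single word carries zero bits per word, while four equally frequent words carry two bits per word, so a more varied and less predictable vocabulary produces higher entropy.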

The psychotomimetic hypothesis and the entropic brain hypothesis predict similar changes in language production in schizophrenic and manic psychotic patients, whereas antipsychotic medications show the opposite effects, leading to a slower articulation rate, increased pausing, shorter utterances, and an overall lower rate of information production. Sanz et al. investigated LSD, a drug presenting biphasic effects with a gradual shift from serotonergic to dopaminergic action.

Sanz and colleagues found that subjects with significant previous experience with psychedelic drugs had impaired semantic coherence and produced fewer spoken words in total, but these results were different from those found in individuals suffering from alcoholism.

Natural speech produced during or immediately after the acute effects of psychedelics can be used to compare 5-HT2A receptor activation with other altered states of consciousness, such as rapid eye movement (REM) sleep.

Huxley described books as glistening with brighter colors, a profounder significance, and being on the point of leaving the shelves to thrust themselves on his attention.

After returning to baseline, the author describes the effects of the drug, recalling how certain aspects of visual perception were profoundly altered. We can hypothesize that the experience included at least some of these effects.

Coyle et al. used supervised machine learning methods to separate between reports of different drugs based on the associated vocabulary frequency vectors, showing that different families of compounds were linked to narratives with distinctive semantic content.
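To illustrate classification from vocabulary frequency vectors, the sketch below uses a nearest-centroid classifier. This is a deliberately simple stand-in, not the supervised method used by Coyle et al., and the training texts and labels are invented rather than drawn from any report database.

```python
import math
from collections import Counter, defaultdict

def frequency_vector(text, vocabulary):
    """Relative frequency of each vocabulary word in the text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    return [counts[word] / total for word in vocabulary]

def train_centroids(texts, labels, vocabulary):
    """Average the frequency vectors of all training texts sharing a label."""
    grouped = defaultdict(list)
    for text, label in zip(texts, labels):
        grouped[label].append(frequency_vector(text, vocabulary))
    return {label: [sum(column) / len(vectors) for column in zip(*vectors)]
            for label, vectors in grouped.items()}

def classify(text, centroids, vocabulary):
    """Assign the label whose centroid is closest in Euclidean distance."""
    vector = frequency_vector(text, vocabulary)
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))

# Invented training reports and labels, for illustration only.
texts = [
    "bright colors and moving shapes",
    "visions of colors and patterns",
    "numb body and detached mind",
    "floating detached from my numb body",
]
labels = ["psychedelic", "psychedelic", "dissociative", "dissociative"]
vocabulary = sorted({word for text in texts for word in text.split()})
centroids = train_centroids(texts, labels, vocabulary)

print(classify("shapes and colors everywhere", centroids, vocabulary))
print(classify("my body felt numb and detached", centroids, vocabulary))
```

A real analysis would hold out test reports and use many thousands of documents; the point here is only that word frequencies alone can separate families of compounds.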

Sanz and colleagues used LSA to investigate the relationship between the pharmacological mechanism of action of a variety of drugs and their associated Erowid reports. Figure 2 shows the unsupervised classification of several drugs based on the semantic similarity of their associated subjective reports. The unsupervised classification respects traditional categories such as antidepressants and antipsychotics, psychedelics, dissociatives, entactogens, stimulants, and sedatives, among others.

According to anecdotal reports, dreaming during sleep is characterized by vivid multimodal imagery, altered sense of the relationship between the self and body boundaries, loss of the sense of agency, suppressed metacognitive function and heightened emotional reactivity. According to a large-scale analysis of subjective reports, psychedelics are the closest to dreams of high lucidity, and deliriant compounds are the closest to dreams of low lucidity. Ketamine has the most similar subjective effects to those reported after “near death experiences”.

These studies foreshadow a possible research program to pursue a taxonomy of conscious states based on data-driven similarity metrics. However, the idea of levels of consciousness is problematic because it fails to take into account the acute effects of serotonergic psychedelics.

Figure 2 (bottom panel) represents an interim solution to the open problem of defining states of consciousness: it is possible to classify states of consciousness based on their pairwise similarity, and their hierarchical clustering could allow the identification of conscious states at different levels of temporal granularity.
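Hierarchical clustering from pairwise similarities can be sketched as follows. The example uses average-linkage agglomerative clustering, which is an assumed choice (the summary does not name the linkage method), and the similarity values and state labels are invented placeholders rather than results from the paper.

```python
def average_linkage(similarity, labels):
    """Repeatedly merge the two clusters with the highest mean pairwise
    similarity. Returns the merge history as (members, similarity) tuples."""
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                score = sum(similarity[i][j] for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, a, b)
        score, a, b = best
        merged = clusters[a] + clusters[b]
        merges.append((sorted(labels[i] for i in merged), score))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical labels and similarity values, made up for this example.
labels = ["psychedelic_1", "psychedelic_2", "dissociative_1", "dissociative_2"]
similarity = [
    [1.00, 0.90, 0.30, 0.20],
    [0.90, 1.00, 0.25, 0.30],
    [0.30, 0.25, 1.00, 0.80],
    [0.20, 0.30, 0.80, 1.00],
]
merges = average_linkage(similarity, labels)
for members, score in merges:
    print(members, round(score, 4))
```

Cutting the merge history at different similarity thresholds gives coarser or finer groupings of states, which is what allows a taxonomy at several levels of granularity.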

The similarity between the subjective experience of different conscious states could be estimated based on the semantic distance between reports, without constraining the participants to answer a predefined set of questions.

The relationship between the mechanism of action of different drugs and the subjective effects they elicit is well-characterized at the molecular and cellular levels, but little is known about the downstream effects on large-scale activity patterns that correlate with cognition and conscious experience.

Zamberlan and colleagues hypothesized that the more similar two drugs are at the molecular level, the closer their subjective experience.

The semantic content of subjective reports is positively correlated with the similarity of binding affinity profiles for serotonergic psychedelics. If 5-HT2A agonism is the only relevant mechanism of action, how can different psychedelic drugs present a variety of different effects?

Natural language processing can be used to inform the relationship between subjective effects and neurochemical action of psychedelic compounds. If the correlation between binding affinity and semantic similarities increases when the binding affinity profile is restricted to a subset of receptors, it could be concluded that activity at this subset of receptors is more specific to determine the subjective effects of the compound.
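The comparison between two kinds of similarity can be sketched as a correlation over all pairs of compounds. The code below computes the Pearson coefficient between the upper triangles of two similarity matrices; the choice of Pearson (rather than a rank correlation) is an assumption, and both matrices contain invented values for four unnamed compounds.

```python
import math
from itertools import combinations

def upper_triangle(matrix):
    """Values for each unordered pair of compounds, excluding the diagonal."""
    return [matrix[i][j] for i, j in combinations(range(len(matrix)), 2)]

def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

# Hypothetical similarity matrices for four compounds (made-up numbers).
affinity_similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
semantic_similarity = [
    [1.00, 0.80, 0.30, 0.20],
    [0.80, 1.00, 0.25, 0.30],
    [0.30, 0.25, 1.00, 0.70],
    [0.20, 0.30, 0.70, 1.00],
]
r = pearson(upper_triangle(affinity_similarity), upper_triangle(semantic_similarity))
print(round(r, 3))
```

Repeating the calculation with affinity similarities computed from different subsets of receptors, and comparing the resulting coefficients, is one way to ask which receptors best account for the reported effects.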

Taking this analysis one step further, it is possible to decompose the subjective reports into topics, which can be mapped to the binding affinity at different neurotransmitter receptors. This allows us to understand the subjective effects linked to high binding affinity at different sites.

Psychedelic drugs elicit their effects through interactions with proteins, which in turn recruit different second messengers, which in turn affect neural dynamics and form networks that extend through cortical and sub-cortical areas.

We can find similarity between compounds at different levels, such as molecular similarity between drugs or semantic similarity between drug-induced narratives. The correlation between semantic similarities and those obtained using data at each other level increases with the proximity to the top of the hierarchy.

After decades of little to no research, a surge of studies has demonstrated the potential usefulness of psychedelics to treat psychiatric disorders such as depression and anxiety. It is reasonable to hypothesize that natural language reports carry relevant information to determine the success and predict the outcome of interventions using these compounds.

The conceptual framework introduced in this study was further developed by Cox and colleagues, who applied it to a much larger dataset of natural language narratives to predict therapeutic outcomes.

Natural language processing can be used to analyze the emotional content of the speakers, as well as their ongoing thought processes, and this information could be used to improve the design and implementation of therapy sessions assisted by psychedelic compounds.

We have reviewed several studies illustrating how the analysis of written or spoken natural language reports can assist the investigation of the subjective effects and mechanisms of action of psychedelic compounds. However, this approach has several limitations, some of which are intrinsic to the acquisition and analysis of natural language.

Individual characteristics of the participants could be manifest in their reports, confounding attempts to objectify their content using automated methods. Using standardized psychometric questionnaires can overcome this limitation, as can training the interviewers to get the most out of the participants and their reports.

Large online databases (e.g. Erowid’s Experience Vaults) present additional problems associated with unknown or underinformed variables, such as the precise nature of the compounds that were consumed, their dosage, whether drugs were consumed alone or in combination with others, subject demographics, mental health status and past history of drug use.

In spite of its potential shortcomings, the analysis of natural language reports can still be considered a promising tool to tackle research questions about the nature of subjective experience, in particular, those about effects and neurochemical action of serotonergic psychedelics.

When used to predict the outcome of psychedelic treatments, natural language reports have a higher accuracy than questionnaires.

The semantic similarity between drug use reports is related to other metrics of similarity between drugs and their effects on the brain.

Language is our main everyday vehicle for the expression of ideas, emotions, plans, and subjective, inner feelings. The articles reviewed here show how this capacity can be leveraged for the scientific exploration of serotonergic psychedelics.

Study details

Study characteristics
Theory Building

0 Humans


Authors associated with this publication with profiles on Blossom

Enzo Tagliazucchi
Enzo Tagliazucchi is the head of the Consciousness, Culture and Complexity Group at the University of Buenos Aires, a Professor of Neuroscience at the Favaloro University, and a Marie Curie fellow at the Brain and Spine Institute in Paris. His main interest is the study of human consciousness as embedded within society and culture.


Institutes associated with this publication

University of Buenos Aires
UBA is home to the Consciousness, Culture and Complexity & Phalaris Labs. Both labs are led by Enzo Tagliazucchi.