
Progress in endophytic fungi secondary metabolites: biosynthetic gene cluster reactivation and advances in metabolomics



Fungal endophytes exhibit symbiotic relationships with their host plants but have recently emerged as sources for synthesizing important varieties of secondary metabolites (SMs). Many of these metabolites have shown significant importance as antibacterial, antifungal, antitumor, and anticancer drugs, leading to their exploration in medicine and pharmaceuticals.

Main body of the abstract

The endophytes' biosynthetic gene clusters (BGCs) encode the enzymes that produce these SMs. However, exploitation of fungal endophytes has been limited by the difficulty of triggering cryptic BGCs and by the loss of secondary metabolite production during extended growth in artificial culture media. This review investigates the array of SMs produced by endophytic fungi. It identifies methods for awakening and exploiting silent BGCs to produce novel natural metabolites and explores recent advancements in metabolomics platforms used to profile SMs. Silent BGCs can be activated using various methods, including co-cultivation, the one strain many compounds (OSMAC) approach, epigenetic modification, heterologous expression, and cluster-specific transcription factor methods.

Short conclusion

The methods reviewed effectively enhance the expression of silent BGCs, leading to a significant increase in secondary metabolite production. Meanwhile, metabolomic profiling using liquid or gas chromatography coupled with mass spectrometry offers several opportunities to uncover the complexity and chemical diversity of bioactive compounds. This review has thus given insight into the significance of the methods used to reactivate BGCs from endophytes and the importance of the various techniques used for their metabolomic profiling.


Endophytic fungi are filamentous fungi found living within plant tissues without causing harm to their hosts (Hashem et al. 2023). This symbiotic relationship between plants and endophytic fungi leads them to produce bioactive compounds that help defend the plants from pathogens and improve their growth, development, and reproduction (Pillay et al. 2022). They have attracted significant attention due to their ability to produce several secondary metabolites (SMs) similar to those produced by their host plants (Singh and Kumar 2023).

In addition, SMs produced by these organisms provide defence against predators and act as virulence factors and as transport and differentiation agents (Kjærbølling et al. 2019; Jha et al. 2023). The most explored and effective bioactive compounds include antibiotics, antifungal agents, anticancer agents, and immunosuppressants (Zhgun 2023; Lv et al. 2024). However, these metabolites also include toxic compounds such as aflatoxins, known to be highly carcinogenic, and gliotoxins, known to suppress immune function and prevent angiogenesis (Kjærbølling et al. 2019; Ye et al. 2021).

Most of these SMs are produced in response to genetic signals coordinated by several genes arranged in contiguous biosynthetic gene clusters (BGCs) (Pfannenstiel and Keller 2019; Shen et al. 2022). However, only a small fraction of BGCs produce the current naturally derived antibiotics and pharmaceutical compounds. Harvesting these gene clusters has been a major research challenge because they are mostly silent or cryptic, particularly when the fungi are cultivated under laboratory conditions (Keller 2018; Qi et al. 2021; Zhang et al. 2024). Since many of these inactive genes remain unexplored, studies have suggested that they could represent the largest reservoir of SMs (Okada and Seyedsayamdost 2017; Figueiredo et al. 2021).

To this effect, researchers have devised various methods of activating cryptic BGCs (Scherlach and Hertweck 2021; Hur et al. 2023). Additionally, proper separation and identification of bioactive compounds are essential in discovering and optimizing various bioactive components. Mass spectrometry coupled with LC or GC and NMR-based analysis paves the way for accurate profiling of these metabolites (Panda et al. 2021). Therefore, this review investigates methods of awakening and exploiting silent BGCs to produce novel natural metabolites and explores recent advancements in metabolomics platforms used to profile SMs.

Main text

Endophytic fungi and their associated secondary metabolites

Endophytic fungi are as diverse as the plant species that host them worldwide. Endophytic fungi in plants are primarily Ascomycetes and their anamorphs, although they can also be Basidiomycetes, Zygomycetes, and Oomycetes (Tan et al. 2018; Gong et al. 2019).

Several antimicrobial compounds produced by endophytic fungi are important because of their effectiveness against pathogens that have developed resistance to antibiotics. However, Hashem et al. (2023) reported that secondary metabolites from fungal endophytes are strongly affected by many factors, such as the sample collection time, environmental conditions, the site or habitat of the plants (extreme habitats such as saline habitats, very high altitudes, rainforests, deserts, swamps, and marshes are preferred), the source of nutrition, the host plant tissue (root, foliar, seeds), and the type of plant (angiosperms and gymnosperms). Some of the most essential fungal endophytes and their bioactive properties are given below (Table 1):

Table 1 Endophytic fungi and their bioactive properties

The population of endophytic fungi is not static, and it has been reported that medicinal plants tend to have a higher incidence of these pharmaceutical-producing organisms than their nonproducing counterparts (Gioia et al. 2020). Some of them have even been implicated as carrying the underlying blueprint for the medicinal activity of some of these plants. They are, hence, essential to the existence of these organisms (Gioia et al. 2020).

Mechanisms of secondary metabolites production by biosynthetic gene clusters

Biosynthetic gene clusters (BGCs) are organized in several ways to synthesize secondary metabolites in fungi. According to Zhgun (2023), these gene clusters contain one or more backbone genes, which encode the core enzymes responsible for producing the core structure of the resulting metabolites, together with genes that encode enzymes which tailor the core to obtain varying products. Thus, the types of SMs produced mainly depend on the type of core enzymes encoded by the cluster. These core enzymes, according to Keller (2018), include synthases or synthetases, e.g. terpene synthase, nonribosomal peptide synthetase (NRPS), and polyketide synthase (PKS), while the tailoring enzymes include hydroxylases, epimerases, and methyltransferases. The author further noted that BGCs might contain transcription factors that regulate other genes within the cluster, genes that encode proteins that mitigate toxicity, and genes with unknown functions (Keller 2018).

In endophytic fungi, there are four main types of secondary metabolites produced by BGCs: polyketides, terpenoids, alkaloids, and nonribosomal peptides (Pusztahelyi et al. 2015). The mechanisms of production are as discussed below:

BGCs with core genes for nonribosomal peptide synthetase and polyketide synthase

By this mechanism, nonribosomal peptides and polyketides are created using large modular megasynthases known as NRPS and PKS. These enzymes contain catalytic domains assembled into a polypeptide chain, which are required to polymerize amino acids (NRPS) and acyl groups such as acetyl-CoA and malonyl-CoA (PKS) (Zhgun 2023).

Core catalytic domains of NRPS

The assembly of a nonribosomal peptide by an NRPS involves a series of repeating steps catalysed by the coordinated actions of three core catalytic domains: adenylation (A), thiolation (T, also called the peptidyl carrier protein, PCP), and condensation (C, peptide bond formation). Each NRPS module contains the catalytic domains needed to incorporate a single amino acid. The A domain activates the substrate as an aminoacyl-AMP intermediate and then transfers the amino acid to the 4'-phosphopantetheine (4'-Ppant) arm of the neighbouring T domain, forming an aminoacyl thioester intermediate. The T domain then delivers this intermediate to the C domain for peptide bond formation or chemical modification. Together, these three core domains comprise a minimal NRPS module. Finally, the completed chain is delivered to a thioesterase (Te) domain, which releases the final product (Miller and Gulick 2016). This fourth domain is often found at the C-terminus of the NRPS and catalyses either the hydrolysis of the nonribosomal peptide from the NRPS or its intramolecular cyclization and release (Fig. 1).
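The module logic described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Module` class, the `assemble` function, and the three-residue example are hypothetical names introduced here for illustration, and the chemistry of each domain is reduced to a bookkeeping step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One NRPS module; the loading module has no condensation (C) domain."""
    substrate: str            # amino acid selected by the adenylation (A) domain
    has_c_domain: bool = True


def assemble(modules, release="hydrolysis"):
    """Walk a toy assembly line and return the released peptide as text."""
    if release not in ("hydrolysis", "cyclization"):
        raise ValueError("release must be 'hydrolysis' or 'cyclization'")
    chain = []
    for index, module in enumerate(modules):
        # Every module after the first needs a C domain to form a peptide bond.
        if index > 0 and not module.has_c_domain:
            raise ValueError(f"module {index} lacks a C domain")
        # A domain activates the residue, T domain tethers it, C domain joins
        # it to the upstream chain. Here all three steps are a simple append.
        chain.append(module.substrate)
    peptide = "-".join(chain)
    # Te domain: hydrolysis gives a linear peptide, cyclization a cyclic one.
    return f"cyclo({peptide})" if release == "cyclization" else peptide


# Hypothetical three-module line (illustrative residues only).
line = [Module("Val", has_c_domain=False), Module("Leu"), Module("Phe")]
print(assemble(line))
print(assemble(line, release="cyclization"))
```

The order of modules fixes the order of residues, which is the point of the modular arrangement: changing or reordering modules changes the product.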

Fig. 1 Mechanism of action of the core domains in the synthesis of NRPS (Felnagle et al. 2008)

Core catalytic domains of polyketide synthase

Iterative type I and II PKSs are more common in fungi. They are called iterative because they can catalyse similar reactions on different substrate sites (Hang et al. 2016). Elaborating further, Zhgun (2023) asserted that type I and II PKSs contain a single module that uses its catalytic domains cyclically to produce a metabolite. An acyl unit is attached to the module by an acyltransferase domain and then polymerized on the acyl carrier protein domain (Fig. 2). The product is then transferred to the start of the module, and the next acyl unit is attached (Zhgun 2023). The diverse organisation of modules in NRPS and PKS results in the production of different metabolites from fungi. Thus, future research can investigate how these processes can be exploited to produce medically important natural products.
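The cyclic reuse of a single module can be made concrete with simple carbon bookkeeping. The sketch below assumes the standard textbook case, which is not spelled out in the text above: an acetyl-CoA starter (two carbons) and malonyl-CoA extender units, each of which adds two carbons and releases one CO2 per decarboxylative condensation. The function name `polyketide_chain` is hypothetical.

```python
def polyketide_chain(extension_cycles, starter_carbons=2):
    """Carbon bookkeeping for a linear polyketide chain.

    Assumes an acetyl starter (2 carbons by default) and malonyl extender
    units: each cycle adds 2 carbons to the chain and releases one CO2.
    """
    if extension_cycles < 0:
        raise ValueError("extension_cycles must be non-negative")
    return {
        "carbons": starter_carbons + 2 * extension_cycles,
        "malonyl_units": extension_cycles,
        "co2_released": extension_cycles,
    }


# Three iterations of the same module give an eight-carbon (tetraketide) chain.
print(polyketide_chain(3))
```

Three extension cycles correspond to a tetraketide backbone, the chain length of small aromatic polyketides such as orsellinic acid.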

Fig. 2 Scheme of reactions in polyketide synthase. ACP: acyl carrier protein, AT: acyltransferase, KS: ketosynthase, KR: ketoreductase, DH: dehydratase, ER: enoylreductase (Risdian et al. 2019)

BGCs with core genes for terpene cyclase

Terpenoids represent the largest group of naturally occurring metabolites, with diverse applications and over 80,000 compounds currently known (Bian et al. 2017). They are produced using terpene cyclase (TPC) as the core enzyme. TPCs work by condensing and cyclising the linear backbone chains derived from isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). The IPP backbones are first condensed into diphosphate isoprenoid precursors (green). Thereafter, terpene cyclase and the tailoring enzymes use these precursors to synthesize terpenes (blue) and terpenoids (yellow), respectively, as seen in Fig. 3 (Gonzalez-Hernandez et al. 2023).
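The condensation of C5 units into linear precursors follows a simple arithmetic that a short sketch can show. The precursor names (GPP, FPP, GGPP) and terpene classes used below are standard isoprenoid nomenclature rather than terms taken from the text above, and the function name `prenyl_precursor` is hypothetical.

```python
# Linear prenyl diphosphates formed from one DMAPP (C5) plus n IPP (C5 each).
PRECURSORS = {
    10: ("geranyl diphosphate (GPP)", "monoterpene"),
    15: ("farnesyl diphosphate (FPP)", "sesquiterpene"),
    20: ("geranylgeranyl diphosphate (GGPP)", "diterpene"),
}


def prenyl_precursor(ipp_units):
    """Return (carbons, precursor, terpene class) for DMAPP + n IPP units."""
    if ipp_units < 1:
        raise ValueError("at least one IPP unit must be condensed onto DMAPP")
    carbons = 5 + 5 * ipp_units
    precursor, terpene_class = PRECURSORS.get(carbons, ("not tabulated", "not tabulated"))
    return carbons, precursor, terpene_class


for n in (1, 2, 3):
    print(n, prenyl_precursor(n))
```

Only the three shortest precursors are tabulated here; longer chains are reported as "not tabulated" rather than guessed.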

Fig. 3 Biosynthesis of terpenes and terpenoids by the terpene cyclase enzyme (Gonzalez-Hernandez et al. 2023)

Hybrid BGCs with genes for different backbone enzymes

Studies have shown that mixed BGCs contain genes that encode enzymes for the production of hybrids, for example, NRPS/PKS hybrids, which may contain NRPS module parts and PKS module parts on the same gene (Zhgun 2023). Robey et al. (2021) noted that the mechanism of action of BGCs with this property is based on inter-polypeptide linkers at the NRPS and PKS termini. For the PKS region, the KS domain uses malonyl-CoA loaded on the ACP, with the support of the AT domain, to elongate an acetyl-CoA starter unit. Once each elongation is completed, the KR, DH, and ER domains carry out the β-keto processing steps (Boettger and Hertweck 2013). For the NRPS region, the major steps of NRPS catalysis occur first. Once these are completed, the condensation (C) domain of the NRPS catalyses the binding of the polyketide to the amino acid, producing an amide (Fig. 4). This amide can then be released either by reductive release followed by Knoevenagel condensation, catalysed by the reductase (R) domain, to form a pyrrolinone, or by a Dieckmann cyclization, catalysed by the terminal (D) domain, to form a tetramic acid derivative (Fig. 5) (Boettger and Hertweck 2013).

Fig. 4 Domain organisation of fungal PKS-NRPS hybrid. MT: methyltransferase, KR: ketoreductase, A: adenylation domain, T: thiolation domain, R: reductase domain (Boettger and Hertweck 2013)

Fig. 5 Release mechanisms of polyketide–amino acid hybrids (tetramic acids, black) and (pyrrolinones, grey) (Boettger and Hertweck 2013)

Tailoring enzymes

These include a wide array of enzymes that modify the products obtained from the biosynthesis of secondary metabolites. These modifications occur through enzymatic activities, including methylation, epimerisation, oxidative hydroxylation, and translocation (Zhgun 2023). Thus, in addition to the genes for the core structure, BGCs also encode these enzymes, which tailor the products into the known biosynthetic compounds.

Reactivation of fungal silent biosynthetic gene clusters

Endophytic fungi exist in diverse tissues and organs of healthy plants and retain an association with their hosts for at least a portion of their life cycle without causing evident symptoms of infection in the host plant. They asymptomatically colonize living tissues of healthy plants (Li and Lou 2017). These characteristics of endophytic fungi provide a new approach to producing bioactive secondary metabolites via industrial fermentation, as well as new ideas and methods for improving the accumulation of bioactive components in medicinal plants and ensuring the long-term development of traditional medical resources.

Endophytic fungi can directly produce beneficial secondary metabolites by fermentation under controlled settings (Wang et al. 2011), but separation from their hosts invariably results in degradation of this capacity (Kusari et al. 2014). The capacity is usually eroded, and is challenging to regain, when endophytic fungi are cultivated alone for a long time. Many studies have shown that the biosynthetic potential of fungi is far from being fully exploited, with enormous potential for the development of compounds with more chemical novelty and intriguing bioactivities (Gakuubi et al. 2021). One barrier to describing these undiscovered SMs is that most biosynthetic gene clusters (BGCs) are not transcriptionally expressed, because secondary metabolic gene clusters are silenced under standard laboratory conditions (Kusari et al. 2014). This accounts for why only a minority of the potential chemical structures is produced. Such silent genetic loci are called "cryptic" or "orphan" pathways.

Fortunately, numerous techniques, including the one strain many compounds (OSMAC) approach (Schwarz et al. 2021), heterologous expression (Meng et al. 2022), promoter engineering (Yu et al. 2020), and other genetic engineering strategies (Ochi and Hosaka 2012), have been devised and used effectively for interrogating these silent BGCs and increasing the chemical diversity of fungal SMs. A growing body of research suggests that microbial co-culture has a higher impact on microbe growth and metabolism than axenic culture (Pan et al. 2019).


Microorganisms do not exist in isolation in nature but rather coexist in all settings. They generally interchange and share chemical signals, including SMs. There is widespread agreement that the interaction of microbes is a driving force in creating SMs (Caesarea et al. 2020). One famous example is the unanticipated discovery of penicillin in a culture of Staphylococcus aureus contaminated by Penicillium notatum (Fleming 1929). Since then, hundreds of co-cultivation investigations have been described for mining natural products, particularly novel compounds not previously found in mono-cultivation.

Co-culture, as one of the most widely utilized OSMAC approaches, is a simple and highly efficient strategy for activating silent BGCs for the synthesis of novel SMs. By replicating naturally existing conditions, the microbial co-culture approach can boost antibiotic efficacy in crude extracts, raise the yield of known SMs, produce analogues of recognized metabolites, and initiate hitherto undetected bioactive component pathways (Li et al. 2021).

The co-culture strategy is a successful method for awakening silent BGCs in fungal strains to produce cryptic SMs, and it typically consists of three approaches: fungal–fungal, fungal–bacterial, and fungal–host co-cultures.

Fungal–fungal co-culture

Fungal–fungal co-cultivation activates silent gene clusters and stimulates natural product production. Citrifelins with a distinct tetracyclic framework were isolated from a co-culture of Penicillium citrinum and Beauveria felina (Meng et al. 2015). The co-culture of two marine fungi activated a rare type of 2-alkenyl-tetrahydropyran and deactivated the antifungal metabolite pyridoxatin, giving methyl-pyridoxatin, revealing a complicated offensive and defensive fungal–fungal interaction (Shang et al. 2017). Wang et al. (2022) carried out fungal–fungal co-cultivation of the endophytic fungus Epicoccum dendrobii with the model fungus Aspergillus nidulans or other filamentous fungi. Identification and genetic characterization of the SMs led to the discovery that a partial loss-of-function mutation of VeA is required for mediating the coculture-wide SM change in A. nidulans. Meanwhile, HPLC analysis and subsequent separation enabled the identification of 14 aspernidine derivatives, including eight new ones, encoded by the active pkf gene cluster. Comprehensive transcriptome data and subsequent genetic alterations also demonstrated that the transcription factor SclB regulated SM synthesis via VeA1. This genomic evidence suggests that a VeA1-containing velvet complex mediates SclB activation of silent gene clusters in the fungal–fungal system.

Fungal–bacterial co-culture

In a fungal–bacterial co-culture system, the fungus is frequently utilized as the host strain, while the bacterium is the guest. The polyketide pestalone (Cueto et al. 2001), the large macrolactone ibomycin (Robbins et al. 2016), four diterpenoid libertellenones (Oh et al. 2005), N-formyl alkaloids (Zuck et al. 2011), and hybrid polyketide synthase-nonribosomal peptide synthetase (PKS-NRPS)-derived tetramic acid analogues (Whitt et al. 2014) were all discovered in bacterial–fungal co-cultures.

Moreover, the work by Schroeckh et al. (2009) showed that physical interactions between Streptomyces hygroscopicus and Aspergillus nidulans were significant in activating polyketide biosynthesis, resulting in the creation of orsellinic acid and its anti-osteoporosis derivatives F-9775 A and B.

Fungal–host co-culture

Endophytic microbes are common and have been found in all plant species investigated. The interaction between the host plant Dendrobium officinale and Paraphaeosphaeria verruculosa was found to have a significant influence on the generation of anti-phytopathogenic metabolites (Hu et al. 2020).

Shipley et al. (2020) reported a new antifungal 2,4-cyclopentadiene-1-one from the endophyte–host co-culture (Nigrospora oryzae, Irpex lacteus, and the host plant Dendrobium officinale) in PDB, and six new anti-feedant polyketides from the co-culture of Paraphaeosphaeria verruculosa and Dendrobium officinale in PDB. Figure 6 shows a typical example of the activation of BGCs by fungal co-culture.

Fig. 6 Fungal co-culture strategy for activation of BGC (Xu et al. 2023)

One strain many compounds (OSMAC) method

The one strain many compounds (OSMAC) method is an important approach used to improve the production of secondary metabolites and increase the chances of obtaining novel compounds of medical value (Hashem et al. 2023). The principle behind OSMAC is that a single microbial strain is capable of producing several different compounds when cultivated under differing conditions, which can lead to the discovery of novel bioactive metabolites (Romano et al. 2018).

According to Hewage et al. (2014), endophytic fungi are most suited for this method because several strains of these fungi have been shown to change their metabolite profiles after long storage periods in culture media. Several cultivation parameters must be changed to utilize the OSMAC approach successfully (Zahroh et al. 2022). Hashem et al. (2023) grouped these parameters into physical properties (temperature, aeration, pH) and media composition (carbon and nitrogen source, salinity, trace elements, metal ions).
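Because OSMAC depends on varying several cultivation parameters at once, the number of conditions grows quickly. The sketch below enumerates a full-factorial screening grid; the parameter names and levels are hypothetical placeholders chosen for illustration, not values recommended by the studies cited above.

```python
from itertools import product

# Hypothetical screening levels for a few physical and media parameters.
PARAMETERS = {
    "temperature_c": [20, 25, 30],
    "ph": [5.0, 6.5],
    "carbon_source": ["glucose", "sucrose"],
    "salinity_pct": [0, 3],
}


def osmac_conditions(parameters):
    """Return every combination of parameter levels as a list of dicts."""
    names = list(parameters)
    levels = (parameters[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*levels)]


conditions = osmac_conditions(PARAMETERS)
print(len(conditions))   # 3 * 2 * 2 * 2 = 24 cultures
print(conditions[0])
```

Even this small grid requires 24 cultures per strain, which is why in practice a subset of conditions is often screened first.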

To demonstrate the efficiency of OSMAC, Bode et al. (2002), who introduced the concept, were the first to show that changing the temperature, aeration, and shape of the culture flask used to cultivate Aspergillus ochraceus improved the yield of SMs. This organism would normally produce one metabolite, known as "aspinonene", but under the altered conditions it produced fifteen additional metabolites. More recently, Supratman et al. (2019) isolated the endophytic fungus Clonostachys rosea B5-2 and cultivated it on a solid rice medium supplemented with apple juice. This resulted in a change in the metabolic processes of the fungus and induced the production of four new compounds, together with the (-)-vertinolide that it normally produces. Subsequently, fermentation techniques have become more advanced, allowing for the combination of different parameters under varying conditions to induce the production of novel compounds (Supratman et al. 2019).

Epigenetic modification

Genome mining has been a significant advancement in detecting cryptic metabolic pathways producing secondary metabolites (Ramesha et al. 2021). Epigenetics is a study area that deals with reversible changes in gene expression without changing the DNA sequence (Singh 2023). Research has led to the use of epigenetics to change the expression of genes that encode the secondary metabolites in microorganisms to produce novel compounds or increase the production of secondary metabolites (Bind 2021). The process of using epigenetic mechanisms to change the gene expression of microbes for specific purposes is known as epigenetic modification (Singh 2023). Epigenetic modification includes DNA methylation, RNA interference, and histone post-translational modification (Bind et al. 2022). This process involves the use of small molecules (epigenetic modifiers) that inhibit DNA methyltransferase (DNMT), histone deacetylase (HDAC), and histone acetyltransferase (HAT), thereby inducing changes in the chromatin and activating silent biosynthetic gene clusters to produce a variety of secondary metabolites (Xue et al. 2023). The mechanism of epigenetic modification is shown in Fig. 7.

Fig. 7 Mechanism of epigenetic modification

Furthermore, Hashem et al. (2022) asserted that epigenetic modification also involves the overexpression of activator or repressor genes or the deletion of some genes, which leads to genetic changes. The most common and successful HDAC inhibitors used include suberoylanilide hydroxamic acid (SAHA), sodium butyrate, and valproic acid, while the most common DNMT inhibitor is 5-azacytidine (Makhwitine et al. 2023). Ramesha et al. (2021) conducted a study to determine the effect of epigenetic modification on the production of secondary metabolites by Nigrospora sphaerica. Sodium butyrate showed the highest induction of cryptic metabolites (22); SAHA produced 19 new metabolites, while valproic acid produced 10. Although these chemicals have been successful, research needs to focus on understanding biosynthetic genes and pathways to select and maintain the most effective pathways that can produce the best metabolites.

Chromatin remodelling

As with epigenetic modification, chromatin remodelling involves changes or mutations in the heterochromatin structure of microbial genes. These mutations have been exploited to influence secondary metabolism (Pillay et al. 2022). Studies have shown that changes in histone deacetylase activity can lead to the transcription of genes that encode for metabolites of medical and pharmaceutical importance (Ding et al. 2020).

Shwab et al. (2007) conducted a study using A. nidulans to determine the impact of histone deacetylase deletion on secondary metabolism. The authors found that the deletion or inactivation of HDACs activated silent BGCs that utilized novel biosynthetic pathways to produce bioactive secondary metabolites. In A. nidulans, the deletion of hdaA, an HDAC-encoding gene, bypassed the need for laeA (loss of aflR expression), thereby leading to the production of penicillin and sterigmatocystin.

This research further expanded the idea that the hdaA gene is a suppressor of BGCs in several filamentous fungi (Pillay et al. 2022). This occurrence has been studied in Calcarisporium arbuscula (Mao et al. 2015) and Penicillium (Ding et al. 2020). However, it is essential to note that deletion of the HDAC gene also negatively impacts the growth, differentiation, and survival of fungal cells (Mao et al. 2015). Therefore, strategies should be carefully adopted to ensure that chromatin is remodelled in such a way that the development and proliferation of microorganisms are not affected in the process.

Heterologous expression of BGCs

Heterologous expression of BGCs has been identified as an important process for characterizing gene clusters from organisms that are difficult to culture in the laboratory or not easily manipulated (Kjærbølling et al. 2021). Liu et al. (2021a, b) note that heterologous expression is used to activate silent gene clusters with the potential to produce novel SMs or increase the production of known SMs. According to the authors, the process of heterologous expression involves three steps: cloning the BGCs, engineering them, and transforming them into the desired heterologous hosts. Further, Kang and Kim (2021) noted that selecting a suitable heterologous host remains a critical factor for ensuring the success of the heterologous expression of natural BGCs. The selected host depends on the products targeted and the aim of its application.

To select a suitable host, Xu et al. (2022) asserted that the physiological characteristics of the original host strain, the characteristics of the BGCs, and the required substrates must be considered. Assessing the selected heterologous host against these criteria helps to determine how close it is to the original strain. This matters because the closer the two are, the more likely they are to share similar codon patterns and the more efficiently the transcription factors will work (Xu et al. 2022).
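One simple way to quantify how similar the codon patterns of two organisms are is to compare codon frequency vectors from representative coding sequences. The sketch below uses cosine similarity as an illustrative stand-in; it is an assumption made here, not the method of Xu et al. (2022), and established indices such as the codon adaptation index are more commonly used in practice. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    if not cds or len(cds) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of 3 long")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds), 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def codon_similarity(cds_a, cds_b):
    """Cosine similarity of two codon frequency vectors (0 to 1)."""
    fa, fb = codon_frequencies(cds_a), codon_frequencies(cds_b)
    dot = sum(value * fb.get(codon, 0.0) for codon, value in fa.items())
    norm_a = sqrt(sum(v * v for v in fa.values()))
    norm_b = sqrt(sum(v * v for v in fb.values()))
    return dot / (norm_a * norm_b)


# Toy sequences only; real comparisons would use many genes per organism.
donor = "ATGGCTGCTGAAGAA"
host = "ATGGCCGCCGAGGAG"
print(round(codon_similarity(donor, host), 3))
```

A value near 1 indicates similar codon preferences; a low value suggests that codon optimisation of the cluster may be worth considering before transfer.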

In the same vein, Pham et al. (2021) noted that other elements are necessary to ensure that gene expression is successful for the production of secondary metabolites from microorganisms. These include the engineering of new hosts, the construction of new cluster DNA constructs, promoter engineering, vector and cloning methods, and ribosomal binding site (RBS) tuning (Kjærbølling et al. 2019; Pham et al. 2021). Traditionally, heterologous gene expression involves constructing large DNA libraries and screening them using PCR to identify clones with important BGCs (Kang and Kim 2021). However, recent advancements in genomic engineering, such as CRISPR-Cas9, and in structural biology have yielded new and highly efficient strategies that allow BGCs to be cloned from DNA without constructing libraries (Liu et al. 2021a, b).

Cluster-specific transcription factors

Transcription factors (TFs) are a group of DNA-binding proteins that are specific to a genetic sequence and required to modulate gene expression (Wang et al. 2021). Cluster-specific or pathway-specific transcription factors (CSTFs or PSTFs) are encoded within a specific BGC and regulate the SMs produced by that BGC. Thus, studies have investigated the possibility of using cluster-specific transcription factors to activate the specific BGCs in which they are located (Kjærbølling et al. 2021).

Over-expression of a CSTF has been successfully used to activate BGCs in filamentous fungi. For example, Bergmann et al. (2007) placed apdR, a CSTF of A. nidulans, under the control of a regulated alcohol dehydrogenase promoter (alcAp), which allowed for the transcription of the target BGC (apd) under controlled conditions. Two novel products, aspyridones A and B, were obtained, and the cryptic PKS-NRPS hybrid pathway involved in their production was elucidated (Bergmann et al. 2007; Wang et al. 2021). However, Wang et al. (2021) asserted that a proper understanding of the regulatory mechanisms of these TFs is important to discovering new SMs, as overexpression does not always activate BGCs.

Other molecular-based strategies that have been demonstrated include RNA polymerase and the manipulation of transcriptional activators and repressors (Begani et al. 2018; Mozsik et al. 2022). Table 2 provides an overview of other strategies used to activate cryptic BGCs and induce the production of metabolites.

Table 2 Strategies used to activate silent BGCs

Metabolic profiling: mass spectrometry (LC, GC) and NMR methods in the identification of endophytic metabolites

Metabolic profiling offers opportunities to uncover the complexity and chemical diversity of bioactive compounds (Gupta et al. 2021). Mass spectrometry (MS)-based metabolic profiling allows for the most cost-effective and sensitive elucidation and characterization of known and undiscovered fungal bioactive compounds (Amberg et al. 2017). MS can be used alone, by direct infusion of samples, or can be coupled to chromatographic techniques such as liquid chromatography (LC) and gas chromatography (GC) for high-throughput analysis. LC–MS, GC–MS, and NMR constitute the predominant methods for metabolic profiling. The techniques, analytical software, instrumentation, and statistical or computational methods employed in these analyses are constantly evolving. Therefore, studies highlighting advances in this field are essential to the discovery of novel bioactive compounds. A 2021 study by Spina and colleagues demonstrated the identification of compounds at very low concentrations in endophytic extracts from Leucojum aestivum using LC–MS and GC–MS. To fully leverage MS detection, additional orthogonal chromatographic separation is required to distinguish between isomeric and isobaric structures, which cannot be distinguished by MS or MS/MS alone (Harrieder et al. 2022).
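The difficulty with isomeric and isobaric compounds can be seen from a monoisotopic mass calculation. The sketch below is a minimal illustration: the amino acid examples (leucine/isoleucine as isomers, lysine/glutamine as a nominally isobaric pair) are standard textbook cases chosen here and do not come from the cited studies, and the small parser handles only simple formulas made of C, H, N, and O.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C6H13NO2'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


leucine = isoleucine = "C6H13NO2"      # isomers: same formula, same mass
lysine, glutamine = "C6H14N2O2", "C5H10N2O3"   # same nominal mass, 146

print(round(monoisotopic_mass(leucine), 4))
print(round(monoisotopic_mass(lysine), 4), round(monoisotopic_mass(glutamine), 4))
```

Isomers share an identical exact mass, so no mass analyser can separate them by m/z alone, whereas the nominally isobaric pair differs only in the second decimal place and needs sufficient resolving power or a chromatographic separation.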

This section focuses on the methodology and new advancements in the different chromatographic techniques employed in mass spectrometry and nuclear magnetic resonance, highlighting their advantages and shortfalls.


Liquid chromatography–mass spectrometry (LC–MS) is one of the most widely used techniques for the analysis and profiling of metabolites. Coupling LC to MS has led to the identification of a wide variety of molecules of varying polarity in complex biological samples. Reversed-phase (RP) separation is currently the dominant technique in LC–ESI–MS; however, it covers only mid-polar to nonpolar metabolites. Some examples include nonpolar amino acids like glycine, alanine, tryptophan, tyrosine, and valine. In contrast, hydrophilic interaction chromatography (HILIC) works better for the analysis of polar metabolites. While HILIC presents a noteworthy alternative for the separation of these metabolites, its adoption remains less prevalent than that of RP chromatography (Harrieder et al. 2022; Gupta et al. 2021).

A mass spectrometer is typically made up of an ion source, a mass analyser, and a detector. The ion source converts sample molecules into ions; the mass analyser resolves these ions in a time-of-flight tube or an electromagnetic field before they are measured by the detector (Zhou et al. 2012; Plumb et al. 2023). LC–MS was initially not widely adopted because MS ion sources were not compatible with a continuous liquid stream. The development of the electrospray ionization (ESI) source was a major advance, and several other ion sources are now available, including atmospheric pressure chemical ionization (APCI), atmospheric pressure photoionization (APPI), and fast atom bombardment (FAB) (Zhou et al. 2012; Gupta et al. 2021).

In a comparison of four ion sources for LC–MS analysis, Laaniste et al. (2019) concluded that ESI is the ion source of choice for trace analysis: it gave the lowest limits of detection (LoDs) and the widest linear range, and it was less affected by matrix effect (ME). ESI also allows desorption and ionization of a broad spectrum of molecules directly from the liquid phase. Large numbers of ions are created through charge exchange in solution, and because ESI is a "soft ionization" technique, the analyte retains very little residual energy, which prevents fragmentation during ionization (Bowen and Northern 2010; Banerjee and Mazumdar 2012). Tienaho et al. (2019) used ultrahigh-performance liquid chromatography coupled with electrospray ionization mass spectrometry to identify 220 of 318 metabolites in hot water extracts of endophytic fungi.

Notably, in certain ESI sources, heat is applied to improve the efficiency of the desolvation process (Clarke 2017). Shortcomings of ESI include uneven, compound-specific responses, limited ionization of nonpolar compounds, and susceptibility to signal alteration by the sample matrix. The matrix effect is an increase (ion enhancement) or decrease (ion suppression) in response, usually caused by altered ionization efficiency of the compounds of interest owing to co-eluting components of the same matrix. Although low in ESI, ME affects reliability, linearity, and accuracy, potentially leading to unreliable quantification. In ESI, the matrix effect can arise through various mechanisms, such as competition between matrix constituents and target analytes for the available charges in the liquid phase (Beccaria and Cabooter 2020).
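Ion enhancement and ion suppression can be expressed numerically. The Python sketch below uses one common convention, assumed here rather than taken from the cited studies: the analyte response in a post-extraction spiked matrix is compared with the response at the same concentration in neat solvent. The function names and the peak-area values are hypothetical.

```python
def matrix_effect_percent(response_in_matrix: float, response_in_solvent: float) -> float:
    """Signed matrix effect in percent.

    Negative values indicate ion suppression, positive values ion
    enhancement, and zero indicates no matrix effect.
    """
    if response_in_solvent <= 0:
        raise ValueError("solvent response must be positive")
    return (response_in_matrix / response_in_solvent - 1.0) * 100.0


def describe(me_percent: float) -> str:
    """Label the direction of the matrix effect."""
    if me_percent < 0:
        return "ion suppression"
    if me_percent > 0:
        return "ion enhancement"
    return "no matrix effect"


# Hypothetical peak areas for one analyte at the same spiked concentration.
suppressed = matrix_effect_percent(response_in_matrix=7500.0, response_in_solvent=10000.0)
enhanced = matrix_effect_percent(response_in_matrix=12000.0, response_in_solvent=10000.0)

print(f"{suppressed:.1f}% -> {describe(suppressed)}")
print(f"{enhanced:.1f}% -> {describe(enhanced)}")
```

Other conventions report the ratio itself, so that 100% means no effect; whichever form is used should be stated alongside the results.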

The sensitivity and resolution of a mass spectrometer depend largely on the mass analyser used for ion separation. Mass analysers fall into two categories: high resolution (for example, Orbitrap) and low resolution (mainly quadrupole analysers). They differ in their ability to distinguish compounds with minute mass differences. High-resolution analysers are particularly vital in untargeted metabolomics, where they are crucial for elucidating the chemical composition of complex and unknown mixtures. Targeted metabolomics, by contrast, emphasizes the quantification of specific product ions, achieving heightened sensitivity and specificity by focusing on one or a few m/z values. The choice of mass analyser is therefore a powerful lever in advancing our understanding of bioactive compounds and their applications (La Barbera et al. 2017; Segers et al. 2019). Nagarajan et al. (2021) also reported that high-resolution mass spectrometry coupled with liquid chromatography (LC-HRMS) has the unique advantage of capturing both targeted and nontargeted metabolites from endophytic fungi.
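The ability to discern minute mass differences is usually quantified as resolving power, R = m/Δm, and mass accuracy is commonly reported in parts per million (ppm). The Python sketch below is a minimal illustration of both quantities. The function names and the two example m/z values are assumptions chosen for illustration, and the exact definition of Δm (for example, peak width at half maximum) varies between instrument types.

```python
def required_resolving_power(mz_a: float, mz_b: float) -> float:
    """Approximate resolving power (m / delta m) needed to separate two peaks."""
    delta = abs(mz_a - mz_b)
    if delta == 0:
        raise ValueError("identical m/z values cannot be resolved by mass")
    return ((mz_a + mz_b) / 2.0) / delta


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Mass accuracy of a measurement in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Two hypothetical peaks with the same nominal mass (146).
needed = required_resolving_power(146.0691, 146.1055)
print(f"resolving power needed: about {needed:.0f}")

# A hypothetical measurement 0.0002 m/z units above the theoretical value.
print(f"mass error: {ppm_error(200.0002, 200.0000):.2f} ppm")
```

Peaks closer together than this pair demand proportionally higher resolving power, which is where high-resolution analysers become necessary.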

Tandem mass spectrometry (MS/MS), on the other hand, integrates multiple mass analysers within a single mass spectrometer. This configuration enables consecutive stages of ion separation, with fragmentation between the stages. The most commonly used MS/MS analyser for quantification is the triple quadrupole (QqQ). The first quadrupole selects the precursor ions generated by the ionization source; the isolated precursor ions then pass to the collision cell (second quadrupole), where they undergo fragmentation; and the third quadrupole selects one or several product ions. The Orbitrap analyser may also operate in MS/MS mode; it traps ions in an electrostatic field between an outer and a central electrode, where they oscillate, and it combines nonselective full-scan MS spectra with high mass accuracy. According to Pan et al. (2020), a high-resolution Orbitrap MS can reach a resolving power of approximately 500,000 (at m/z 200). Generally, the quadrupole time-of-flight (Q-TOF) instrument is the most commonly used mass spectrometer in untargeted metabolomics and can also be used for quantification. Another emerging technique is ion mobility spectrometry (Niessen and Falck 2015; Rozanova et al. 2021).
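Resolving power figures for Orbitrap instruments are quoted at a reference m/z because the value changes across the mass range. The Python sketch below assumes the commonly cited approximation that, at a fixed acquisition time, Orbitrap resolving power falls with the inverse square root of m/z. The function name and default arguments are illustrative, with the reference value of 500,000 at m/z 200 taken from the text above.

```python
import math


def orbitrap_resolving_power(
    mz: float,
    reference_power: float = 500_000.0,
    reference_mz: float = 200.0,
) -> float:
    """Estimate resolving power at a given m/z from a reference specification.

    Assumes resolving power scales as 1 / sqrt(m/z) at fixed transient length.
    """
    if mz <= 0:
        raise ValueError("m/z must be positive")
    return reference_power * math.sqrt(reference_mz / mz)


for mz in (200.0, 400.0, 800.0):
    print(f"m/z {mz:.0f}: about {orbitrap_resolving_power(mz):,.0f}")
```

Under this assumption the resolving power at m/z 800 is half the value specified at m/z 200, which is worth bearing in mind when profiling larger metabolites.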

LC–MS-based methods compensate for some shortfalls of GC–MS analysis through their compatibility with nonvolatile and thermally unstable metabolites. LC–MS/MS streamlines sample processing by bypassing complex and time-consuming sample preparation. Furthermore, it facilitates the separation and identification of elusive drug metabolites, thereby enhancing analytical specificity, and it significantly improves signal-to-noise ratios and sensitivity through multiple reaction monitoring (MRM) (Liu et al. 2021a, b).

In RP-LC, components are separated according to their hydrophobicity through adsorption to the stationary phase. The stationary phase is nonpolar, typically consisting of either porous silica bonded to alkyl chains (C4, C5, C8, and C18) or divinylbenzene (DVB), an inert nonpolar compound. For larger, intact proteins, the shorter alkyl chains (C4 and C8) are preferred because they are considerably less retentive than the longer chains, whereas longer chains such as C18 are typically used for smaller, less hydrophobic proteins (≤ 10 kDa); larger proteins are more hydrophobic and therefore interact strongly with the C18 matrix (Boone and Adamec 2016; Gupta et al. 2021). When the sample is loaded in a polar mobile phase, hydrophobic solutes adsorb to the hydrophobic stationary phase, so the more hydrophilic molecules elute first. Increasing the proportion of organic solvent in the mobile phase (such as acetonitrile, methanol, or ethanol; acetonitrile is the most commonly used) lowers its polarity and weakens the hydrophobic interaction between the stationary phase and the solutes, allowing the solutes to elute. More hydrophobic solutes require a higher concentration of organic solvent in the mobile phase for successful elution.
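The gradual increase in organic solvent described above is normally programmed as a gradient table. The Python sketch below interpolates the percentage of organic solvent (often labelled %B) at any time point. The gradient values, the constant `GRADIENT`, and the function name `percent_organic` are hypothetical and do not represent a method from the cited literature.

```python
from bisect import bisect_right

# Hypothetical gradient program: (time in minutes, % organic solvent).
# Starts highly aqueous, ramps linearly, then holds at high organic content.
GRADIENT = [(0.0, 5.0), (20.0, 95.0), (25.0, 95.0)]


def percent_organic(time_min: float, program=GRADIENT) -> float:
    """Linearly interpolate the % organic solvent at a given time."""
    times = [point[0] for point in program]
    if time_min <= times[0]:
        return program[0][1]
    if time_min >= times[-1]:
        return program[-1][1]
    # Locate the gradient segment that contains time_min.
    index = bisect_right(times, time_min)
    (t0, b0), (t1, b1) = program[index - 1], program[index]
    return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)


for t in (0.0, 10.0, 20.0, 22.5):
    print(f"{t:4.1f} min: {percent_organic(t):.1f}% organic")
```

Hydrophilic solutes would elute early in such a program, while strongly retained hydrophobic solutes elute only near the high-organic end.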


GC–MS stands out as another effective, consistently dependable, and robust tool extensively utilized in metabolic profiling and research. Its reliability stems from the electron impact (EI) hard ionization technique, which ensures consistent molecular fragmentation, making GC–MS a unique asset for identifying metabolites. GC stands as the foremost analytical technique for the separation of volatile compounds (Zeki et al. 2020).

Separation depends on differences in both the boiling points and the polarity of the analytes. After ionization, the analyte ions are separated according to their mass-to-charge ratio (m/z). The choice of stationary phase is important for selective analysis and is based on the polarity of the analytes; the stationary phase can be polar, semi-polar, or nonpolar, with nonpolar phases predominating in GC–MS (Zeki et al. 2020; Prodhan et al. 2019). In standard EI devices, ionization occurs at 70 electronvolts (eV). The column effluent flows into a heated ionization source under high vacuum, where a collector voltage extracts electrons from a tungsten filament; the voltage applied to the filament dictates the energy of the electrons. These high-energy electrons excite the neutral analyte molecules, leading to ionization and fragmentation (Gupta et al. 2021). Kanjana et al. (2019) carried out GC–MS analysis of ethyl acetate extracts of the fungal endophytes Chaetomium globosum, Cladosporium tenuissimum, and Penicillium janthinellum; the study confirmed the presence of biologically active phytocompounds in these endophytes and the pharmacological potential of the host medicinal plants Passiflora foetida, Memecylon edule, and Justicia adhatoda.

Chemical ionization (CI) is also used, although less frequently than electron impact. This is because CI yields few fragments, and the libraries available for analysis and identification are also limited. Its use in untargeted metabolomics stems from its soft ionization, which applies low energy to the molecules and thereby reveals critical information on the molecular weight of the metabolite (Prodhan et al. 2019).

The mass spectra are generally regarded as consistent across instruments made by various manufacturers and across instruments equipped with different types of mass analysers, such as quadrupoles, time of flight, and so on (McNair et al. 2019). The spectra acquired are then carefully compared with established reference libraries such as the National Institute of Standards and Technology (NIST) library, Golm library, Metlin, MassBank, and Fiehn library for precise assessment (Schauer et al. 2005). Matyushin et al. (2020) also highlighted the application of deep learning ranking for the identification of small molecules using low-resolution electron ionization mass spectrometry.
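Library comparison of this kind rests on a similarity score between the acquired spectrum and each reference spectrum. The Python sketch below uses a plain cosine similarity on nominal-mass spectra as a simplified stand-in. It is not the scoring algorithm of NIST or any other library named above, which typically apply additional m/z and intensity weighting, and the spectra, compound labels, and function names are hypothetical.

```python
import math

Spectrum = dict[int, float]  # nominal m/z -> relative intensity


def cosine_score(query: Spectrum, reference: Spectrum) -> float:
    """Cosine similarity between two centroided spectra (0 to 1)."""
    shared = set(query) & set(reference)
    dot = sum(query[mz] * reference[mz] for mz in shared)
    norm_query = math.sqrt(sum(v * v for v in query.values()))
    norm_reference = math.sqrt(sum(v * v for v in reference.values()))
    if norm_query == 0 or norm_reference == 0:
        return 0.0
    return dot / (norm_query * norm_reference)


def best_match(query: Spectrum, library: dict[str, Spectrum]) -> str:
    """Return the name of the library entry with the highest score."""
    return max(library, key=lambda name: cosine_score(query, library[name]))


# Hypothetical reference spectra and one hypothetical acquired spectrum.
library = {
    "compound_A": {43: 100.0, 57: 80.0, 71: 40.0, 85: 20.0},
    "compound_B": {77: 100.0, 105: 90.0, 122: 60.0},
}
acquired = {43: 95.0, 57: 85.0, 71: 35.0, 85: 25.0, 99: 5.0}

for name, reference in library.items():
    print(f"{name}: {cosine_score(acquired, reference):.3f}")
print("best match:", best_match(acquired, library))
```

Because EI fragmentation at 70 eV is reproducible, even a simple score of this type separates a close match from an unrelated spectrum in this toy example.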

GC and MS systems are highly complementary. Using GC–MS analysis, Farhat and colleagues (2022) identified several compounds from Fusarium solani, some of which had not been reported before. GC–MS can examine a vast number of compounds and is supported by standardized libraries of metabolite spectra, which facilitate rapid and precise qualitative assessment of metabolites; this has made it a frequently used method in metabolomics. The hard electron impact ionization produces unique fragmentation patterns for identifying metabolites. The development of commercial and in-house libraries for metabolite identification gives GC–MS an edge over LC–MS. The primary challenge of GC–MS systems lies in the need to reduce the pressure from atmospheric levels in the GC to a vacuum of 10⁻⁵ to 10⁻⁶ Torr before samples enter the MS. This coupling, characterized by a significant pressure reduction, is achieved through an interface; a common interface in use today is fused silica tubing. Another drawback of GC–MS systems is their incompatibility with less volatile compounds (McNair et al. 2019; Liu et al. 2021a, b).


Nuclear magnetic resonance (NMR) spectroscopy examines nuclei by observing how they interact with radiofrequency electromagnetic radiation when placed in a strong magnetic field. NMR analyses compounds using the hydrogen spectrum (¹H NMR), carbon spectrum (¹³C NMR), or phosphorus spectrum (³¹P NMR), thereby providing insights into molecular properties (Wang et al. 2023).
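The radiofrequency at which each nucleus resonates scales linearly with the magnetic field strength (the Larmor relation). The Python sketch below estimates resonance frequencies for the three nuclei mentioned, using rounded reference values for the gyromagnetic ratio divided by 2π. The field strength of 11.74 T and the function name are illustrative assumptions, not parameters from the cited studies.

```python
# Gyromagnetic ratio divided by 2*pi, in MHz per tesla (rounded reference values).
GAMMA_BAR_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "31P": 17.235}


def larmor_frequency_mhz(nucleus: str, field_tesla: float) -> float:
    """Resonance frequency in MHz for a nucleus in a static field B0."""
    return GAMMA_BAR_MHZ_PER_T[nucleus] * field_tesla


# An 11.74 T magnet corresponds to a nominal "500 MHz" spectrometer,
# named after its approximate 1H resonance frequency.
for nucleus in ("1H", "13C", "31P"):
    print(f"{nucleus}: {larmor_frequency_mhz(nucleus, 11.74):.1f} MHz")
```

The lower frequency of ¹³C and ³¹P relative to ¹H at the same field is one reason these nuclei are observed on separate channels.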

NMR reduces the need for chromatographic separation, chemical derivatization, and sample treatment. It is easily measurable, nondestructive, and unbiased, making it essential for identifying new compounds (Talukdar et al. 2021). In addition, NMR offers exceptional reproducibility and requires minimal sample preparation. It is quantitative without targeting specific compounds, is highly automatable, and enables high-throughput analysis of large numbers of compounds without the need for standards. Compounds such as organic acids, polyols, alcohols, sugars, and many other highly polar compounds that are poorly detected by LC–MS can be detected using NMR. NMR has proven highly valuable in providing detailed insights into the chemical composition, structural elucidation, and molecular identification of endophytic fungal metabolites. Notably, NMR can identify metabolites in solution at concentrations greater than 1 µM, even for compounds not previously reported and with little or no prior documentation. However, these advantages are sometimes outweighed by the fact that analytical techniques such as LC–MS and GC–MS are comparatively more sensitive than NMR, with limits of detection 10 to 100 times lower (Emwas et al. 2019; Gupta et al. 2021). The advantages and limitations of MS and NMR spectroscopy as analytical tools in endophytic metabolite profiling are shown in Table 3.

Table 3 Advantages and limitations of MS and NMR spectroscopy as an analytical tool in profiling of endophytic-derived metabolites


In conclusion, although scientists have made deliberate efforts to elucidate endophytic fungi, the picture remains far from exhaustive, because endophytes are as diverse as the plant species found in all parts of the world. Given the benefits they hold, continued research into those benefits, the organisms responsible for antimicrobial action, and the nature of the compounds involved remains an open-ended task. Researchers have reported reactivation most extensively through co-culture and OSMAC, probably because of their ease and low cost compared with genetic-based methods, which are more expensive and require appropriate expertise. NRPS and PKS can be organized in varied ways in fungi, resulting in the diverse metabolites produced. Furthermore, since all the metabolic profiling methods discussed are effective, albeit with varied shortcomings, the method chosen will depend on the desired results. Indeed, endophytic fungi present an interesting field of research, inviting continued exploration of the benefits inherent in this beautiful piece of nature.

Availability of data and materials

Not applicable.



Adenylation domain


Acyl carrier protein


Atmospheric pressure chemical ionization


Atmospheric pressure photoionization




Biosynthetic gene cluster


Chemical ionization




Dimethylallyl diphosphate




Electron impact




Electrospray ion source


Fast atom bombardment


Gas chromatography


Gas chromatography-mass spectrometry


Hydrophilic interaction chromatography


Isopentenyl diphosphate






Liquid chromatography


Liquid chromatography-mass spectrometry


High-resolution mass spectrophotometry with liquid chromatography


Limits of detection


Matrix effect


Mass spectrometry


Tandem mass spectrometry




National Institute of Standards and Technology


Nuclear magnetic resonance


Nonribosomal peptide synthetase


One strain many compounds


Polyketide synthase


Triple quadrupole


Quadrupole time-of-flight


Reductase domain


Reverse phase


Secondary metabolite


Thiolation domain


Terpene cyclase


Velvet gene





Author information

Authors and Affiliations



RFZ and AAJ reviewed studies on Biosynthesis Gene cluster. FBI reviewed studies on endophytic fungi and their bioactive properties. SEP reviewed studies on reactivation of secondary metabolites. LBA and OSO reviewed studies on metabolic profiling. RFZ, AKA, and RNA reviewed the initial drafts. All authors have read and approved the manuscript.

Corresponding author

Correspondence to Rahmat Folashade Zakariyah.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

The publishers and authors whose works were reused consented to the review and have been duly cited.

Competing interests

The authors declare that they have no competing interests.


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Zakariyah, R.F., Ajijolakewu, K.A., Ayodele, A.J. et al. Progress in endophytic fungi secondary metabolites: biosynthetic gene cluster reactivation and advances in metabolomics. Bull Natl Res Cent 48, 44 (2024).
