Background

The Dietary Supplement Health and Education Act (DSHEA) of 1994 (United States Public Law 103–417) amended the Federal Food, Drug and Cosmetic Act by defining as a dietary supplement any product (other than tobacco) that contains a vitamin, mineral, herb or other botanical, or amino acid and is intended as a supplement to the diet [1]. The law also amended Title IV of the Public Health Service Act by establishing the Office of Dietary Supplements (ODS) within the Office of the Director, US National Institutes of Health (NIH) (42 U.S.C. 287c-3). The mission of the ODS is to “strengthen knowledge and understanding of dietary supplements by evaluating scientific information, stimulating and supporting research, disseminating research results, and educating the public to foster an enhanced quality of life and health for the US population.”

In the United States, the regulatory category into which products fall is determined by the intended use of the product. The DSHEA was significant because it established dietary supplements as a separate legal category and defined a framework for Food and Drug Administration (FDA) regulation of this category [2]. It also established the regulatory framework for supplements as foods, not drugs, set rules for what information labels must contain, and gave FDA the authority to write supplement-specific Good Manufacturing Practices (GMPs) based on a food model.

When the DSHEA became law in 1994, there were an estimated 600 US dietary supplement manufacturers, producing about 4,000 products [3]. According to Food and Drug Administration estimates, there were more than 29,000 different dietary supplement products on the market by the year 2000, with an average of 1,000 new products being added annually [4]. Readers interested in an overview of the pertinent regulations should consult Israelsen and Barrett [5]. Growth of the marketplace was driven by increased consumer demand, often following publicity about the utility and efficacy of a particular herb. For example, in 1996 Linde et al. [6] published a meta-analysis of randomized clinical trials of the herb St. John’s wort (Hypericum perforatum L.) in the British Medical Journal (BMJ), which concluded that the herb was more effective than a placebo for treating certain types of mild-to-moderate depression. A June 27, 1997 broadcast of the television news magazine 20/20 (Using Herb St. John’s Wort to Treat Depression) highlighted the BMJ article, and sales of the herb increased rapidly. Other mainstream media outlets quickly followed with articles on herbal medicine [7, 8], and the entire industry grew quickly over the next few years [9].

Later, the mainstream media began to take a closer look at the subject of herbs. In 1998, the Los Angeles Times commissioned a survey of St. John’s wort products purchased from retail stores. The laboratory hired by the newspaper was asked to measure the amount of the phytochemical marker compound hypericin. The newspaper reported that three of ten tested products contained no more than 50% of the hypericin content declared on the label, and that another four of ten contained less than 90% of the label claim [10]. A later survey conducted by the Boston Globe in 2000 reported similar results [11]. Publications in the peer-reviewed scientific literature have also found problems with quantitative label claims. Draves and Walker observed that only two products (of 54) had a total naphthodianthrone (hypericin) concentration within 10% of the label claim [12]. Edwards and Draper examined levels of berberine and hydrastine in 20 goldenseal (Hydrastis canadensis) products and found that ten of seventeen root products met alkaloid content standards proposed by the United States Pharmacopeia (USP), and that five products contained little or no hydrastine, unusual berberine:hydrastine ratios, and additional peaks not observed with other products [13]. Harkey et al. tested the ginsenoside contents of 24 commercial ginseng products and found that concentrations of marker compounds differed significantly from labeled amounts [14].
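The label-claim comparisons quoted above reduce to simple arithmetic: the measured amount of a marker compound is expressed as a percentage of the labeled amount and then compared with a cut-off. The following is a minimal sketch of that calculation. The function names, the product names, and all numerical values are hypothetical and are not taken from any of the cited surveys; the cut-offs (50%, 90%, and within 10% of claim) simply mirror the figures mentioned in the text.

```python
def percent_of_label_claim(measured_mg: float, labeled_mg: float) -> float:
    """Measured content expressed as a percentage of the labeled amount."""
    if labeled_mg <= 0:
        raise ValueError("labeled amount must be positive")
    return 100.0 * measured_mg / labeled_mg


def claim_category(percent: float) -> str:
    """Bucket a result using the cut-offs quoted in the surveys above."""
    if percent <= 50.0:
        return "no more than 50% of claim"
    if percent < 90.0:
        return "less than 90% of claim"
    if percent <= 110.0:
        return "within 10% of claim"
    return "more than 10% above claim"


# Hypothetical products: (measured hypericin, labeled hypericin), in mg per dose.
hypothetical_products = {
    "product A": (0.40, 0.90),
    "product B": (0.75, 0.90),
    "product C": (0.88, 0.90),
}

for name, (measured, labeled) in hypothetical_products.items():
    pct = percent_of_label_claim(measured, labeled)
    print(f"{name}: {pct:.0f}% of label claim ({claim_category(pct)})")
```

Note that the outcome of such a comparison depends on the analytical method used to obtain the measured value, which is the subject of the discussion that follows.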

In contrast to pharmaceutical products, there are frequently several analytical methods available for the determination of phytochemicals in botanical products. An unpublished 2002 literature search performed for an AOAC International review panel found 22 different methods for constituents of St. John’s wort. Many of these methods were based on different physical principles that would be expected to lead to different numerical results for the target constituents (e.g., spectrophotometry vs. chromatography). A review of the scientific literature also reveals that validation of methods within individual publications is often lacking, as is cross-validation among methods. In the absence of documentation of method performance, researchers wishing to analyze a particular ingredient or product are met with an abundance of methods whose reliability is unknown or with no methods at all. As a result, there are very few commonly accepted standard methods of analysis available. Pharmacopoeial methods, where they exist, may be of some help, but those methods are designed to fit within a monograph “system,” and their limits of applicability are often constrained. In this context, lack of standard methods often leads investigators new to the field to invent a method when they become interested in a particular ingredient. This has led to the publication in peer-reviewed journals of results that cannot be confirmed. The report of the presence of colchicine in ginkgo and echinacea products following discovery of this compound in the placental blood of women who were consuming dietary supplements is an example of this phenomenon [15], and repeated attempts to reproduce those results have failed [16, 17].

Product quality

Product quality is one of the greatest question marks facing consumers, clinicians, regulators, and researchers [18]. As expected, there are some differences between definitions of quality for herbal preparations and for chemically synthesized products. Overall, however, the fundamental quality parameters are the same for both: identity, purity, and content determination (i.e., strength) [19]. The DSHEA does not set a framework for quality except to state that manufacturers are prohibited from introducing products posing “significant or unreasonable risk” into interstate commerce; must get pre-market approval from FDA for “New Dietary Ingredients”; must follow labeling regulations (accuracy, label disclaimers, notification of claims); and must have substantiation that claims are truthful and not misleading [1]. Compliance with quality standards set out in official compendia is voluntary. This lack of regulatory guidance means that working definitions of quality within the supplement industry vary, and quality parameters used by individual companies in the US range from the simple to the complex. For example, for some companies, quality is ascertained by determining that the material was grown or wildcrafted organically, that the correct plant species and plant part (or extracts thereof) are present in the product, and that the product has been manufactured in a sanitary fashion (i.e., in compliance with GMPs) [20]. Along with identity criteria, other companies may set (or follow) explicit specifications for microbial load, adventitious agents (poisonous and otherwise), and content of desirable and undesirable natural chemicals (see the USP, American Herbal Pharmacopoeia (AHP), or European Pharmacopoeia (EP)).
Swanson [21] and NIH’s National Center for Complementary and Alternative Medicine [22] have provided a useful overview of the types of parameters that researchers and others should keep in mind when performing research or establishing quality assurance procedures for botanicals.

Analytical challenges I

Analytical challenges in quality assurance range from establishing the identity of the botanical source from which an extract was derived to measuring the amount of one or more desirable or undesirable constituents, such as pesticides, toxic elements, natural toxins, or marker compounds. Analytical methods are intended to generate reliable, accurate data for use by manufacturers or regulators for quality control or enforcement actions, respectively. Reliability, accuracy, precision, and specificity are the keys to the utility of a method, but analysts must take steps to prove that any method they use has these features, especially if the method is to be used in a critical setting such as a quality control laboratory, a regulatory enforcement action, or a clinical laboratory. Fortunately, there are systematic approaches to demonstrating that a particular method for a particular constituent yields accurate and precise data.
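As an illustration of what demonstrating accuracy and precision can look like in practice, the sketch below computes two figures that are commonly reported in validation work: percent recovery of a known (spiked) amount as an indicator of accuracy, and percent relative standard deviation (RSD) of replicate results as an indicator of precision. The choice of these two statistics, the function names, and the replicate values are assumptions made for illustration only; they are not drawn from any specific validation guideline discussed here.

```python
from statistics import mean, stdev


def percent_recovery(replicates: list[float], spiked_amount: float) -> float:
    """Mean measured amount as a percentage of the known spiked amount."""
    if spiked_amount <= 0:
        raise ValueError("spiked amount must be positive")
    return 100.0 * mean(replicates) / spiked_amount


def percent_rsd(replicates: list[float]) -> float:
    """Sample standard deviation as a percentage of the mean."""
    if len(replicates) < 2:
        raise ValueError("at least two replicates are required")
    return 100.0 * stdev(replicates) / mean(replicates)


# Hypothetical replicate results (mg/g) for a sample spiked at 10.0 mg/g.
hypothetical_replicates = [9.8, 10.1, 9.9, 10.2, 10.0]

print(f"recovery: {percent_recovery(hypothetical_replicates, 10.0):.1f}%")
print(f"RSD:      {percent_rsd(hypothetical_replicates):.2f}%")
```

Acceptable ranges for both figures depend on the analyte, its concentration, and the matrix, so no pass/fail limits are built into the sketch.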

Manufacturers need methods applicable throughout the manufacturing process, while regulators require versatile methods that can be used for the same analyte(s) in a number of dissimilar finished products. The ability of a particular method to fit the specified purpose is one element of method validation that is important but often overlooked. In addition to the difficulties noted above, botanicals are complex mixtures that originate from biological sources. Such products and their ingredients pose particular analytical challenges for a number of reasons, and method validation has proven difficult. Raw materials are invariably “irregular” because their chemical composition depends on factors such as geographical origin, weather, harvesting practices, etc., while finished products frequently contain multiple botanical ingredients [23].

Over the past five decades or so, there has been an evolution in the production, sale, and use of botanical products. In the 1970s, botanical products were largely sifted, cut, or powdered plant material in the form of a tablet, capsule, tea, or tincture. More recently, products are often carefully controlled extracts of plant material that have been spray-dried onto a solid carrier or diluent and then formed into a hard or soft capsule or tablet. There are a number of advantages to such techniques. These include savings in shipping costs as a result of reduced bulk, the ability to produce dosage forms that are more uniform in their composition, and the ability to preferentially concentrate the desirable constituents of a plant while leaving behind undesirable constituents. However, the ability to affirm identity and quality using simple and inexpensive techniques such as microscopy or chemical spot testing is lost when ingredients are processed and traded in this manner.

Globally, most countries have regulatory schemes for therapeutic botanicals that differ from that of the US, considering them a class of drugs (phytomedicines) rather than a class of foods (dietary supplements). The EP defines herbal drugs as “mainly whole, fragmented or cut, plants, parts of plants, algae, fungi, lichen in an unprocessed state, usually in dried form but sometimes fresh” [24]. Herbal drugs are precisely defined by their Latin binomial name (genus, species, variety, and author), and are identified using their macroscopic and microscopic descriptions and any further tests that may be required (for example, thin-layer chromatography). The EP monographs for herbal drugs contain less detailed specifications than one might expect and depend heavily on descriptive botany (micro- and macro-anatomy) with some rudimentary chemistry when necessary. Herbal drug monographs include those for some familiar plants, including St. John’s wort, ginkgo leaf, ginseng root, elder flower, equisetum stem, and others.

The raw material extract feed stocks used in the preparation of commercial phytomedicines are termed “herbal drug preparations” by the EP [25]. These materials are obtained by subjecting herbal drugs to treatments such as extraction, distillation, expression, fractionation, purification, concentration, or fermentation. The physical form of an “extract” in the marketplace may vary. They may occur as liquids (liquid extracts and tinctures), semi-solids (soft extracts), or solids (dry extracts). Standardized extracts are those that have been adjusted within an acceptable tolerance to a given content of constituents with known therapeutic activity. Standardization is achieved by adjusting the extract with inert material or by blending batches of extracts. Examples of EP-standardized extracts are Ipecacuanha liquid extract, standardized Belladonna leaf dry extract, standardized senna leaf dry extract, standardized licorice ethanolic liquid extract, standardized Frangula bark dry extract, and standardized aloes dry extract (made from Aloe ferox Miller and its hybrids and A. barbadensis Miller). Note that extracts of many well known herbs such as St. John’s wort are not considered to be standardized extracts in the EP because the “active” constituents are unknown. To account for extracts for which the true active constituents are unknown, the EP has defined “quantified extracts” as those adjusted to a defined range of constituents (marker compounds) that may or may not be therapeutically active, but that are not now “known” to be responsible for the therapeutic activity of the plant [25].
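The two routes to standardization mentioned above, adjusting with inert material and blending batches, both follow from a simple mass balance. If a batch of mass m with constituent content c (% w/w) is diluted with a mass x of inert material to reach a target content t, then c·m/(m + x) = t, so x = m·(c/t − 1). If two batches with contents a and b are blended, the mass fraction f of the first batch needed to reach t satisfies f·a + (1 − f)·b = t. The sketch below works through both cases; the function names and numbers are hypothetical, and the sketch ignores practical matters such as assay uncertainty and the acceptable tolerance around the target.

```python
def inert_mass_needed(batch_mass_kg: float, content_pct: float, target_pct: float) -> float:
    """Mass of inert material (kg) that dilutes a batch down to the target content."""
    if not 0 < target_pct <= content_pct:
        raise ValueError("target must be positive and no greater than the batch content")
    return batch_mass_kg * (content_pct / target_pct - 1.0)


def blend_fraction(content_a_pct: float, content_b_pct: float, target_pct: float) -> float:
    """Mass fraction of batch A in an A/B blend that meets the target content."""
    low, high = sorted((content_a_pct, content_b_pct))
    if low == high or not low <= target_pct <= high:
        raise ValueError("target must lie between two different batch contents")
    return (target_pct - content_b_pct) / (content_a_pct - content_b_pct)


# Hypothetical example: a 100 kg batch assayed at 3.0% is adjusted to 2.5%.
print(f"inert material to add: {inert_mass_needed(100.0, 3.0, 2.5):.1f} kg")

# Hypothetical example: batches at 3.0% and 2.0% are blended to reach 2.5%.
print(f"fraction of batch A:   {blend_fraction(3.0, 2.0, 2.5):.2f}")
```

Dilution can only lower the content, which is why the first function rejects a target above the assayed value; reaching a higher target requires blending with a richer batch.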

As noted, the use of well-defined extracts aids in the production of a reproducible and uniform product, and the definition of “standardization” published by the American Herbal Products Association recognizes that it is possible to manufacture a standardized extract without knowing the therapeutically active constituent(s) [26]. Marker compounds are constituents of a plant selected as surrogate indicators of “quality” by manufacturers and/or researchers because they are thought to be unique to the target plant species (and therefore are an aid to plant identification) or because they are thought to be associated with the biological activity of the plant. Continuing research has overturned such assumptions in a number of cases. For instance, the naphthodianthrone hypericin was once thought to be unique to St. John’s wort (Hypericum perforatum L.) and was therefore considered a good species identifier. An early in vitro assay of St. John’s wort and hypericin also indicated that the compound was a monoamine oxidase inhibitor. More recent studies have proposed that hyperforin and/or a series of flavonol glycosides also contribute to the beneficial antidepressant activity of the herb [27] as well as to potential herb/drug interactions [28, 29].

Analytical challenges II

In the United States, systematic evaluations of botanical products are in their infancy. The main drivers of biomedical research in this area are taxpayer-funded trials conducted under the aegis of the U.S. National Institutes of Health. Two recent publications have highlighted the importance of product quality determination and the need to define and describe the nature of interventions to be used in biomedical research. Over the past decade, several authors have published scales for evaluating the quality of published clinical trials that assess the reliability of the conclusions drawn from those trials so that evidence-based standards of medical care may be developed. These systems include the Jadad scale [30] and the Consolidated Standards of Reporting Trials Statement (CONSORT) [31]. The best trials are considered randomized, placebo-controlled, double-blind studies, but variations between trial designs and their descriptions still allow for nominally similar designs to vary in quality. Elements of these rating systems allow readers to rate the strengths of individual trial designs. Items such as sample size, intent-to-treat analysis, degree and adequacy of blinding (single-blind, double-blind), and randomization scheme (and the means of achieving randomization) were assigned numerical scores from worst to best. Trials with good overall scores are said to be less biased. These rating systems were designed with a focus on the mechanics of the clinical trial itself and assumed that the interventions used in the trials had been well characterized. In 2005, Gagnier et al. [32] recognized that in the emerging field of biomedical research on natural products, adequate characterization and description of the intervention was not always present in trial publications, and they proposed a revised CONSORT Statement that includes scoring for the quality of the description of the intervention. 
As a result of the growing realization of the nature and extent of this problem, Wolsko et al. [33] performed a systematic evaluation of the literature of herbal clinical trials in order to quantify the reliability of the clinical botanical literature in terms of product quality. A review of 81 studies found that only 12 of the 81 (15%) reported having performed quantitative analysis on the intervention. Of these, only eight reported results of the analyses. The authors also reported that only 40 of the 81 botanical studies (49%) identified the plant source of the intervention by Latin name (recall the discussion on “Herbal Drugs” presented earlier). Only eight (10%) of the studies identified the plant part used in the study, and only 23 (28%) of the studies described the extraction/processing method used to prepare the study intervention. In 2006, Gagnier et al. [34] published an extensive review of 206 randomized controlled trials of herbal medicines and scored them against 42 separate CONSORT checklist items. A detailed description of the results is not within the scope of this discussion, but only 54% of the reviewed trials provided a precise description of the intervention. As a result of the marketplace problems noted in the “Background” and these more recent concerns about the quality of recent natural products trials, the National Center for Complementary and Alternative Medicine (NCCAM) at NIH has instituted a policy that requires grant applicants to demonstrate to NCCAM and grant reviewers that the materials to be used in NCCAM-funded research are adequately characterized and described [22]. Investigators have struggled to fulfill this mandate as they have discovered that there are few reliable published analytical methods and reference materials available.

Role of the Office of Dietary Supplements

The DSHEA empowered the FDA to establish current good manufacturing practices (GMPs) for dietary supplements, and a proposed rule has been published [19]. The law requires that any enforcement action taken against dietary supplement products use “publicly available” methods. In response to concerns about the lack of properly validated publicly available methods and general concerns about product quality, the US Congress directed the Office of Dietary Supplements (ODS) at NIH to accelerate an ongoing method validation process [35].

The ongoing process has been a collaboration between AOAC International (Gaithersburg, MD, USA) and various stakeholder groups, including representatives of the dietary supplement industry, regulatory and other governmental bodies, consumer groups, nongovernmental organizations, and research scientists. The ODS program began with a meeting held to identify needs that might be met by the new program. Stakeholders advised ODS that the new program should emphasize basic quality issues such as identity and contamination and should use existing frameworks for method development and validation [36]. The program itself began by providing funds to the FDA’s Center for Food Safety and Applied Nutrition (CFSAN) for the purpose of validating analytical methods through the AOAC Official Methods℠ program. Under the auspices of the ODS program, the AOAC/stakeholder collaboration identified supplement ingredients for study and set priorities for the order in which they would be pursued. Following prioritization, available analytical methods collected by the group were subjected to an “NIH-style” peer review process for the purpose of selecting the most promising methods for further study. To date, nine supplement methods have been collaboratively studied, with seven approved as first action Official Methods. Collaborative study reports have been published for five of the approved methods (ephedrine alkaloids in botanicals/dietary supplements by HPLC/UV [37], ephedrine alkaloids in plasma/urine by LC–MS [38], glucosamine in raw materials and finished products [39], beta-carotene in raw materials and finished products [40], and flavonol glycosides in Ginkgo biloba raw materials and finished products [41]). Collaborative study manuscripts are in press for “Phytosterols in Saw Palmetto raw materials and finished products” and for “Aristolochic acid in raw materials and finished products.” Single-laboratory validation studies of these methods have been published [42, 43].
There are an additional nine single-laboratory validation studies in progress, and 38 additional ingredients in various stages of study. These include S-adenosyl-L-methionine (SAMe), chondroitin sulfate, St. John’s wort, L-carnitine, B vitamins, black cohosh, Ω-3 fatty acids, soy isoflavones, green tea catechins, lutein, turmeric, ginger, milk thistle, African plum, and flax seed.

In addition to progress on Official Methods, the past five years have seen the emergence of a community of analytical chemists, dietary supplement researchers, and others concerned with dietary supplement analytical methodology. In the two years before the launch of the ODS program, there were no dietary supplement publications in the Journal of AOAC International. Between the launch of the program in 2002 and the present, there have been over 100 dietary supplement methods published in the journal.

Some method development work is being performed by the Food Composition Laboratory at the United States Department of Agriculture (USDA) through an interagency agreement with ODS. The USDA agreement will result in the development of analytical methods for phenolic glycosides in foods and supplements. In addition, ODS has recently entered into agreements with experts at the FDA’s CFSAN to develop or extend validated methods for the determination of mycotoxins, pesticides, and toxic elements in dietary supplement raw materials and finished products.

Stakeholders identified three distinct needs in regard to reference materials. First was the need for biological reference materials to be used for botanical identity confirmation. This effort has been hampered by a number of technical issues, not least of which is defining the nature of such materials. The second need was for the production and dissemination of calibration standards needed for the development, validation, and routine use of supplement methods. Calibration standards are the single chemical entities needed to construct calibration curves for quantitative analysis and analyte identity confirmation. With some exceptions, the high cost of well characterized high purity plant secondary metabolites for use as calibration standards has been a major impediment to method development and validation. The usual practice used by researchers who needed to make an analytical measurement of a plant constituent was for the individual investigator to isolate pure compounds from the plant of interest and use that material in the investigation. This process is time-consuming, expensive, gives low yields, and is limited to the laboratory of the individual researcher. In addition, the compounds may also be unstable in pure form. In order to expedite the development and validation of methods, ODS has sponsored research into small- to medium-scale isolation methods for the production of pure compounds as well as the acquisition of these compounds for use in collaborative studies. It has also sponsored research into methods for stabilizing labile compounds. While quantities of the materials produced for collaborative studies are large by historical standards, they remain quite small in marketplace terms. ODS has therefore funded the production of larger quantities of very high purity standards for national standard-setting bodies, such as the USP. 
To date, the program has contracted for the production of berberine hydrochloride, hydrastine, beta-sitosterol, gingerol, shogaol, actein, 27-deoxyactein, and eleutherosides B and E as USP reference materials. Approximately 20 other materials have also been produced for use in collaborative and pre-collaborative studies, including a stabilized hyperforin material in a liquid format and anthocyanin standards for various Vaccinium species.
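The role of calibration standards in quantitative analysis can be illustrated with a simple external calibration curve: solutions of the pure compound at known concentrations are measured, a straight line is fitted to the instrument response, and the line is inverted to estimate the concentration in an unknown sample. The sketch below assumes a linear response and uses an ordinary least-squares fit written out by hand; the function names, concentrations, and responses are hypothetical, and real methods may require weighting, a different model, or checks on linearity that are omitted here.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired calibration points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("calibration concentrations must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_response(response: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to estimate an unknown concentration."""
    if slope == 0:
        raise ValueError("calibration slope is zero; the line cannot be inverted")
    return (response - intercept) / slope


# Hypothetical calibration standards: concentration (ug/mL) vs. detector response.
standard_conc = [1.0, 2.0, 5.0, 10.0]
standard_resp = [3.0, 5.0, 11.0, 21.0]

slope, intercept = fit_line(standard_conc, standard_resp)
unknown = concentration_from_response(13.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, unknown={unknown:.2f} ug/mL")
```

The estimate is only as good as the purity assigned to the standard used to prepare the calibration solutions, which is why well-characterized, high-purity materials matter.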

The final need was for matrix reference materials. This part of the ODS program has been conducted in partnership with the National Institute of Standards and Technology (NIST) in the US Department of Commerce. One of the most important features of any analytical testing protocol is the ability of the analyst to verify whether or not the analytical instrument is operating properly and whether the assay has been performed correctly. One useful way to make this determination is to perform the method on a material that has had values for the analyte of interest assigned to it through a formal certification process. If the analyst succeeds in reproducing the certified values, then he or she can be confident that the analysis was performed properly. These materials are used when evaluating analytical method performance in individual laboratory settings and are useful tools in laboratory proficiency testing. Briefly, the NIST materials are produced as suites of supplement raw materials and finished products for which certified values for selected chemical constituents are established using a rigorous standard approach. Each suite consists of properly identified dried, powdered botanical raw material, a commercial raw material extract, and one or more representative commercial products. The NIST process involves obtaining authentic botanical raw material, developing and validating analytical methods (if none exist) to determine the compounds to be certified, using two or more methods and laboratories to analyze for the compounds of interest, and assigning values to be written into a Certificate of Analysis [44]. Materials are then packaged appropriately and made available for purchase. There are currently 15 suites of NIST Standard Reference Materials (SRM™) in process, with five (Ephedra, Ω-3 fatty acids in cod liver oil, Ginkgo biloba, β-carotene in carrot oil, and multivitamin/multimineral tablets) now available.
Those in process are bitter orange, saw palmetto, α-tocopherols, green tea, various Vaccinium berries, Ω-3 fatty acids in seed oils, and vitamins D, B6, and B12 in serum. Acquisitions in progress are soy, St. John’s wort, and black cohosh [45].
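The check described above, in which an analyst runs the method on a certified material and asks whether the certified value was reproduced, can be sketched as a comparison of the laboratory mean with the certified value and its stated uncertainty. The version below is deliberately simplified: it accepts the result if the laboratory mean falls within the certified value plus or minus the expanded uncertainty, and it ignores the laboratory's own measurement uncertainty, which a fuller treatment would combine with that of the certificate. The function name and all values are hypothetical and do not come from any actual Certificate of Analysis.

```python
from statistics import mean


def agrees_with_certificate(
    replicates: list[float],
    certified_value: float,
    expanded_uncertainty: float,
) -> bool:
    """True if the laboratory mean lies within certified value +/- expanded uncertainty."""
    if not replicates:
        raise ValueError("at least one replicate result is required")
    if expanded_uncertainty < 0:
        raise ValueError("expanded uncertainty must not be negative")
    return abs(mean(replicates) - certified_value) <= expanded_uncertainty


# Hypothetical certificate: 5.0 mg/g with an expanded uncertainty of 0.2 mg/g.
lab_results = [4.9, 5.1, 5.0]
print(agrees_with_certificate(lab_results, certified_value=5.0, expanded_uncertainty=0.2))
```

A failed check does not by itself show where the problem lies; it indicates only that the instrument, the procedure, or the calibration should be investigated before results on unknown samples are reported.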

Additional projects include funding of single-laboratory validation studies of ingredients that are deemed important by NIH or FDA but that are not highly ranked by the AOAC stakeholder committee process (e.g., constituents of bitter orange [46] and aflatoxins in botanicals [47]), as well as a study validating thin-layer chromatographic fingerprinting methods for determining botanical identity. This latter project is in keeping with stakeholder recommendations about pursuing methods for verifying plant identity, and is complemented by funding for an electronic herbarium pilot project and for the production of a handbook of botanical microscopy to replace the botanical drug microscopy texts of the late nineteenth and early twentieth century. An additional project related to plant identification is the development of a system for the identification, isolation, and characterization of compounds in plants that may indicate the presence of undesirable plant species (adulterants) in the botanical raw material or finished product. Identification of these “negative marker” compounds by bioassay-directed fractionation will be followed by the development and validation of analytical methods for these compounds.

Conclusion

An enormous challenge remains in developing, validating, and disseminating methods and reference materials for the 40,000 supplement products projected to be on the US market by 2010 [4]. Acceleration and expansion of the program will continue as the supplement analytical community grows. Recognition of new opportunities has already begun to reshape the program. Beginning in 2007, there will be an increasing emphasis on funding single-laboratory validation studies and on performance-based analytical methods. Part of this shift is made possible by an ODS-funded laboratory proficiency pilot program administered by NIST, but the emphasis on rapid dissemination and small-scale laboratory studies were both key recommendations made by stakeholders at the beginning of the program [36].