Comparative efficacy and tolerability of 32 oral antipsychotics for the acute treatment of adults with multi-episode schizophrenia: a systematic review and network meta-analysis



Summary

Background
Schizophrenia is one of the most common, burdensome, and costly psychiatric disorders worldwide in adults.
Antipsychotic drugs are its treatment of choice, but there is controversy about which agent should be used. We therefore attempted to compare and rank antipsychotics by quantifying the information from randomized controlled trials.

Methods
We performed a network meta-analysis of placebo-controlled and head-to-head randomized controlled trials to compare 32 antipsychotics. We searched EMBASE, MEDLINE, PsycINFO, PubMed, BIOSIS, Cochrane Library, WHO International Clinical Trials Registry Platform (ICTRP) and ClinicalTrials.gov (last search 8th January 2019).
Two authors independently selected studies and extracted data. Our primary outcome was change in overall symptoms measured with standardised rating scales. Additionally, we extracted data for seven efficacy and nine safety outcomes. Differences in the findings of the studies were explored in meta-regressions and sensitivity analyses. Effect size measures were standardized mean differences (SMDs), weighted mean differences (WMDs) or risk ratios (RRs) with 95% credible intervals (CrIs). Confidence in the evidence was assessed using CINeMA (Confidence in Network Meta-Analysis). The study protocol was registered at PROSPERO (CRD42014014919).
The confidence in evidence, depending on the treatment comparison, was often low or very low.

Interpretation
There are some efficacy differences between antipsychotics, but most of them are gradual rather than discrete.
Differences in side effects, many of which are serious, are marked. These findings will aid clinicians in balancing risks versus benefits of those drugs available in their countries. They should consider the importance of each outcome, the patients' medical problems, and their preferences.

Funding
The project was funded by the German Ministry of Education and Research (FKZ01KG1406).

Research in context
Evidence before this study
Treatment with antipsychotic drugs is the standard for the acute phase of schizophrenia according to most national and international guidelines. Nevertheless, their use is strongly debated due to side effects and possible brain changes. As many antipsychotic drugs are available according to the WHO, it would be important to know how each of the many substances compares to each other, ideally ranked in a hierarchy. Due to their diverging receptor binding profiles, different antipsychotics can vary considerably in their efficacy and side effect profile. Even though there is a large evidence base of randomized clinical trials for the acute treatment of schizophrenia, many evidence gaps remain, as many substances have never been compared directly in trials. This is especially true for older antipsychotics. We searched PubMed for network meta-analyses on the acute treatment of schizophrenic patients with antipsychotics. Using the search terms "antipsychotic" AND "schizophrenia" AND (network meta-analysis OR multiple treatment meta-analysis), we found several relevant systematic reviews in our last search in October 2018.
Some examined special subgroups like children or first-episode patients. Others were restricted to special populations like Chinese or Japanese people. Some focused on special outcomes like weight gain or glucose. One network meta-analysis from 2013 did not include the newly approved antipsychotics cariprazine and brexpiprazole and examined only the two older antipsychotics haloperidol and chlorpromazine. Furthermore, not all clinically important efficacy and safety outcomes were studied in this report. Altogether, we found no comprehensive and systematic network meta-analysis comparing older and newer antipsychotics for the acute treatment of schizophrenia over several efficacy and safety outcomes.

Added value of this study
To our knowledge, the present analysis is the largest network meta-analysis carried out in the field of schizophrenia, based on 402 studies including 53,463 participants randomized to 32 different older and newer antipsychotics or placebo. The addition of two new and sixteen old antipsychotics is a major extension of a previous report. We investigated ten additional important outcomes, such as specific aspects of efficacy, quality of life and many more side effects, and a number of methodological issues, such as placebo response and sample sizes. The primary outcome was reduction of overall schizophrenic symptoms, but we also examined other domains: reduction in positive and negative symptoms, dropouts, depression, quality of life and functioning. Newer and older

Introduction
Schizophrenia is a common (1% of the world population is afflicted), debilitating disorder lasting most of the patient's life, with a huge burden for patients and a total cost of 155.7 billion USD in the US alone. 1,2 Antipsychotics are the mainstay of treatment, 3 but are associated with important side-effects which can cause serious disability including death. 4 Many drugs are available according to the WHO, 5 which vary considerably in their affinity to different synaptic receptors, leading to possibly diverging efficacy and safety profiles. Many guidelines recommend newer antipsychotics as the treatment of choice, but older antipsychotics are less costly and still used worldwide, especially in low-income countries. 6 Moreover, neither these old ones nor the recently approved antipsychotics brexpiprazole and cariprazine have been compared in a comprehensive network meta-analysis. 7 However, a clear understanding of the relative risks and benefits is essential for informed decision-making. The project extends our previous work that combined evidence from 212 randomized trials on 15 antipsychotics on seven outcomes to 32 antipsychotics and placebo from 402 randomized trials and seventeen efficacy and safety outcomes, with change in overall symptoms as the primary outcome. 3 The aim of this systematic review is to better inform clinical practice and mental health policies by comparing all licensed second-generation, as well as sixteen first-generation antipsychotics, in the acute treatment of adults with schizophrenia.

Search strategy and selection criteria
Presentation of the results follows the PRISMA guidelines for network meta-analyses (appendix 1). 8 We registered the protocol with PROSPERO (CRD42014014919, appendix 2).
We included randomized controlled trials (RCTs) in adults with acute symptoms of schizophrenia or related disorders (such as schizophreniform, or schizoaffective disorders). We excluded studies in patients with treatment resistance, first episode, predominant negative or depressive symptoms, concomitant medical illness, and relapse prevention studies.
We included all second-generation ("atypical") antipsychotics (SGAs) available in Europe or the US, placebo and a selection of first-generation ("typical", "conventional") antipsychotics (FGAs) (benperidol, chlorpromazine, clopenthixol (cis- and trans-isomer), flupenthixol, fluphenazine, haloperidol, levomepromazine, loxapine, molindone, penfluridol, perazine, perphenazine, pimozide, sulpiride, thioridazine, thiothixene, trifluoperazine, zuclopenthixol (cis-isomer)) guided by a survey among 50 international schizophrenia experts. 9 We excluded intramuscular formulations because they are primarily used for relapse prevention (long-acting) or emergency use (short-acting). We included all flexible-dose studies since these allow the investigators to titrate to the optimum dose for the individual patient. In fixed-dose studies, we included target to maximum doses according to the "International Consensus Study on Antipsychotic dose". 10 If studies used several doses, we averaged the results of the individual arms using weighted means. 11 We included published and unpublished RCTs comparing one antipsychotic with another or with placebo. Trials in which antipsychotics were used as an augmentation or combination strategy were excluded. For subjective outcomes (e.g. overall change in symptoms), we included only double-blind studies, because a lack of blinding can exaggerate differences between treatments in this area. 12 For objective outcomes, open studies were included (see appendix 18 for full details). We included short-term studies with a follow-up period between three and 13 weeks. 13 Studies with a high risk of bias in sequence generation or allocation concealment according to the Cochrane Collaboration's risk of bias tool were excluded. 11 We a priori excluded studies from mainland China due to serious quality concerns. 14
If consensus was not reached, a third reviewer (SL) was contacted. Study authors were contacted in case of missing or unclear information.
For dichotomous data, we assumed that participants lost to follow-up would not have responded (conservative approach). Missing standard deviations were estimated from test statistics or by using the mean standard deviation of the remaining studies. 15 Risk of bias in RCTs for the primary outcome was independently assessed by the same reviewers as above using the Cochrane Collaboration's risk of bias tool (Appendix 19). 11 The overall risk of bias was classified into high, medium and low as proposed in a large network meta-analysis for antidepressants. 16
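The pooling of several dose arms into a single arm, mentioned above, can be illustrated with a minimal sketch. The text only states that weighted means were used; the formula for the pooled standard deviation below is the standard one for merging subgroups and is an assumption here, as are the function name and the example numbers.

```python
import math

def combine_arms(arms):
    """Pool several arms of one study into a single arm.

    arms: list of (n, mean, sd) tuples, one per dose arm (hypothetical input format).
    Returns (n_total, pooled_mean, pooled_sd).
    """
    n_total = sum(n for n, _, _ in arms)
    # Sample-size-weighted mean of the arm means.
    pooled_mean = sum(n * m for n, m, _ in arms) / n_total
    # Total sum of squares = within-arm part + between-arm part.
    within = sum((n - 1) * sd ** 2 for n, _, sd in arms)
    between = sum(n * (m - pooled_mean) ** 2 for n, m, _ in arms)
    pooled_sd = math.sqrt((within + between) / (n_total - 1))
    return n_total, pooled_mean, pooled_sd

# Hypothetical example: two fixed-dose arms with PANSS change scores.
print(combine_arms([(50, -18.0, 20.0), (55, -22.0, 21.0)]))
```

The between-arm term matters when the dose arms differ in their mean change; ignoring it would understate the standard deviation of the merged arm.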

Outcomes
The single primary outcome was change in overall symptoms of schizophrenia as measured by rating scales such as the Positive and Negative Syndrome Scale (PANSS), the Brief Psychiatric Rating Scale (BPRS) or any other published scale. 17 Different scales were combined using standardized mean differences (SMD). Secondary outcomes were all-cause discontinuation, discontinuation due to inefficacy, responder rates (study defined), change in positive, negative and depressive symptoms, quality of life and social functioning, measured by means of published rating scales. The following major side-effects were examined: use of antiparkinson drugs as a measure of extrapyramidal side-effects, akathisia, weight gain in kg, >7% weight gain, prolactin levels, sedation/somnolence, QTc prolongation and at least one anticholinergic side-effect (appendix 18).
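To make the idea of combining different rating scales concrete, the following sketch computes a study-level standardized mean difference with the small-sample (Hedges) correction. Whether the authors applied this particular correction is not stated in the text, so it is an assumption; the function name and the example values are hypothetical.

```python
import math

def hedges_g(n1, mean1, sd1, n2, mean2, sd2):
    """Standardized mean difference (drug minus comparator) with Hedges' correction.

    Returns (g, variance_of_g). Negative values favour group 1 when
    lower scores mean fewer symptoms.
    """
    df = n1 + n2 - 2
    # Pooled standard deviation across both groups.
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (mean1 - mean2) / sd_pooled
    # Small-sample bias correction factor.
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    variance = (n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2))
    return g, variance

# Hypothetical example: PANSS change of -20 (drug) vs -15 (placebo), SD 10 in both arms.
g, var = hedges_g(50, -20.0, 10.0, 50, -15.0, 10.0)
print(round(g, 3), round(var, 4))
```

Because the difference is divided by the pooled standard deviation, a PANSS-based and a BPRS-based study end up on the same unitless scale and can enter the same synthesis.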

Data analysis
We conducted a network meta-analysis combining direct and indirect treatment comparisons in a Bayesian hierarchical model using the rjags package (appendix 4). 18 Effect sizes were risk ratios (RRs) for dichotomous outcomes and standardized mean differences (SMDs) for continuous outcomes, except weight gain, QTc prolongation and prolactin elevation, for which weighted mean differences (WMDs) were applied. Data were combined using a random effects model. Treatments were ranked using the surface under the cumulative ranking curve (SUCRA) probabilities. The transitivity assumption was evaluated by comparing the distribution of potential effect modifiers (placebo response, publication year, sample size, baseline severity, mean age and percentage male) across studies grouped by comparison (appendix 5). We assumed a common heterogeneity parameter across the various treatment comparisons, and presented the between-study variance τ² for each outcome. We characterized the amount of heterogeneity as low, moderate or high using the first and third quantiles of their empirical distributions. 19 We statistically evaluated consistency (the agreement of the various sources of evidence) using the design-by-treatment test 20 and by separating indirect from direct evidence 21 using the R netmeta package (appendix 13). 22 We explored residual heterogeneity and inconsistency by several a priori defined meta-regressions (with covariates: placebo response rate, study sample size, study publication year, baseline severity, sponsoring, study duration, mean age, percentage men) and sensitivity analyses (excluding studies at high risk of bias, that did a completer analysis, with imputed standard deviations, placebo controlled studies, with duration more than six weeks, that were published before 1990, that were considered as failed trials, with unfair dose comparison (studies in which the lowest dose was 50% less than the largest dose in olanzapine equivalents; e.g. a study comparing haloperidol 2 mg/day (olanzapine equivalent dose = 4 mg) with olanzapine 10 mg/day was excluded), and excluding placebo arms) (details in appendices 4, 16 and 17). We used contour-enhanced funnel plots and the trim-and-fill method for the primary outcome to investigate the presence of small-study effects, whereby small studies give different results from the large studies, for all comparisons against placebo and against haloperidol. 23,24 The certainty of evidence produced by the synthesis for each outcome was evaluated using the framework described in Salanti et al. 25 and implemented using the CINeMA (Confidence in Network Meta-Analysis) web application, which allows grading the confidence in the results as high, moderate, low or very low (details in appendix 15). 26 For the primary outcome we examined the confidence of evidence of all comparisons; for the remaining outcomes we examined the comparisons of antipsychotics versus placebo.
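The SUCRA ranking used in this section can be sketched as follows. The function takes, for each treatment, the probabilities of occupying each rank (as they might be estimated from posterior samples) and averages the cumulative probabilities over the first a - 1 ranks, where a is the number of treatments. The input layout, the function name and the example probabilities are assumptions for illustration and are not taken from the study's analysis code.

```python
def sucra(rank_probs):
    """Compute SUCRA values from rank probabilities.

    rank_probs[k][j] is the probability that treatment k takes rank j + 1,
    with rank 1 being the best. Each row must sum to 1.
    Returns one SUCRA value in [0, 1] per treatment (1 = always best).
    """
    scores = []
    for probs in rank_probs:
        n_treatments = len(probs)
        cumulative = 0.0
        total = 0.0
        # Sum the cumulative rank probabilities over ranks 1 .. a - 1.
        for p in probs[:-1]:
            cumulative += p
            total += cumulative
        scores.append(total / (n_treatments - 1))
    return scores

# Hypothetical three-treatment network.
example = [
    [0.7, 0.2, 0.1],  # mostly ranked first
    [0.2, 0.6, 0.2],  # mostly ranked second
    [0.1, 0.2, 0.7],  # mostly ranked last
]
print(sucra(example))
```

A treatment that is certain to be best obtains a SUCRA of 1 and one that is certain to be worst obtains 0, so the values summarise the whole rank distribution rather than only the probability of being ranked first.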

Role of the funding source
The funder of the study had no role in study design, data collection, data analysis, data interpretation or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Results
The search identified 53,791 citations and 2,827 full-text articles were retrieved. We included 550 reports from 402 studies with 53,463 participants (PRISMA flowchart figure 7, detailed appendix 6). The full list of included studies can be found in appendix 7. The sample had the following characteristics: mean age 37.40 years, 50.09% males and a mean illness duration of 11.90 years. Further characteristics are summarized in tables 1 (overall) and 2 (individual antipsychotics). We excluded studies with high risk of bias for randomization and allocation, but methods for sequence generation and allocation concealment were often not described in detail and therefore coded as unclear.
Risk of bias is presented in appendix 8. A total of 218 (54%) studies with 40,815 (76%) participants presented usable results for change in overall symptoms (Fig. 1). Figure 2a presents the overall change in symptoms for all antipsychotics compared to placebo as a reference. Figure 2a includes a color code, reflecting the strength of recommendations according to CINeMA. It also includes the sample sizes which should be considered together with the credible intervals as measures of the uncertainty of the estimates. All antipsychotics were associated with larger improvement compared to placebo (Fig. 2a), and the differences were statistically significant except for penfluridol, pimozide, perazine, trifluoperazine, fluphenazine and levomepromazine. The SMDs for drugs with significant results ranged from -0.89 (95% CrI -1.08;-0.71) for clozapine to -0.26 (95% CrI -0.4;-0.12) for brexpiprazole. Table 3 presents the detailed results for all possible antipsychotic comparisons ordered according to their SUCRA ranking. This table needs to be consulted for the question of which drug is more efficacious than another drug.
Results of network meta-analysis are presented in the left lower half and results from pairwise meta-analysis in the right upper half, if available. Table 3 shows a pattern that clozapine, amisulpride, zotepine, olanzapine and risperidone reduced overall symptoms significantly more than many other drugs. Most differences between the remaining drugs were small or very uncertain.
192 studies with 35,115 participants reported study-defined response rates, using very different cut-offs (for discussion see appendix 20.1). All antipsychotics had higher response rates compared to placebo. RRs for significant results ranged
226 studies (56%) reported all-cause discontinuation rates for 42,672 participants (80%). All drugs except pimozide had at least trends for lower discontinuation rates compared to placebo. RRs for significant results ranged from 0.52 (95% CrI: 0.11;0.95) for clopenthixol to 0.9 (95% CrI: 0.85;0.95) for haloperidol (Fig. 2f). Examining discontinuation due to inefficacy, we found results comparable to those for the primary outcome, overall change in symptoms (appendix 20.2).
134 studies (33%) with 26,904 participants (50%) reported anticholinergic side effects. Most drugs were associated with a higher risk of anticholinergic side effects compared to placebo. This outcome can be influenced by use of anticholinergic rescue medication, which is often needed for classical antipsychotics. Significant evidence was present for risperidone, haloperidol, olanzapine, clozapine, iloperidone, chlorpromazine, zotepine, thioridazine and quetiapine (RR range: 1.31-4.17) (Fig. 3g).
Heterogeneity was low (to moderate) for most outcomes, moderate to high for use of antiparkinson medication and high for prolactin elevation. Separating indirect from direct evidence (SIDE test), the percentage of inconsistent comparisons was between 2% and 26% for all outcomes, except for quality of life with 50% inconsistent comparisons; this outcome was therefore examined only in a pairwise meta-analysis (appendix 12). In addition, prolactin results were significantly inconsistent according to the design-by-treatment interaction test. As prolactin values vary widely between men and women and between assays used in different laboratories, we also applied SMDs, and heterogeneity and inconsistency were substantially lower (appendix 13, 20.4).
The most important differences in terms of study characteristics were that older antipsychotics had less placebo response than newer ones and that the antipsychotics differed in their median baseline severity across studies (appendix 5). These potential threats to the transitivity assumption and other potential effect modifiers were addressed by meta-regressions and sensitivity analyses of the primary outcome. We excluded antipsychotics studied in fewer than 100 participants. The degree of placebo response, which has increased over the years, 27 had the greatest influence on the heterogeneity. The effect sizes of the individual antipsychotics changed after accounting for response to placebo, but the overall hierarchy did not materially change (fig. 4a and appendix 9). This finding was corroborated by removing placebo arms or placebo-controlled studies in sensitivity analyses (fig. 4b, 4c and appendix 10). Publication year, mean participants' age, baseline severity, percentage of male patients, sample size and sponsoring also did not materially change relative treatment effects compared to the unadjusted analysis (details and plots in appendix 9). Sensitivity analyses removing studies with overall high risk of bias, completer analyses, imputed standard deviations, duration more than six weeks, unfair dose comparisons, failed trials and trials conducted before 1990 did not materially affect the results (appendix 10).
The quality of the evidence was overall poor. Concerning the primary outcome we judged the confidence in the evidence for 75% of the comparisons with placebo to be low or very low (Figure 2a), and this was the case for 92% of the comparisons of two antipsychotic drugs (Appendix 15.2). Figure 5 shows that many older antipsychotics are among those with poor CINeMA ratings, and that for older drugs often no evidence was available at all for several secondary outcomes.
We conducted two contour-enhanced funnel plots of the primary outcome. Comparing all antipsychotics to haloperidol did not reveal any asymmetry and the SMD did not change using the trim-and-fill method (Fig. 6a). In contrast comparing all antipsychotics to placebo revealed that smaller trials exaggerate the effectiveness of the active interventions versus placebo (Fig. 6b). SMD changed from 0.45 to 0.38, confirming an earlier analysis. 27

Discussion
The present analysis is the largest network meta-analysis in the field of schizophrenia, based on 402 studies including 53,463 participants randomized to 32 different first- and second-generation antipsychotics or placebo. We extended our previous report by two SGAs and 15 FGAs and by investigating ten additional important outcomes, such as specific aspects of efficacy, quality of life and many more side effects, and a number of methodological issues, such as placebo response and sample sizes. 3 All antipsychotic drugs reduced overall symptoms more than placebo with mean effect sizes between -0.89 and -0.03 (median -0.42). We emphasize that, as can be derived from overlapping credible intervals in the figures comparing the drugs with placebo, most differences between drugs are gradual (not statistically significant) rather than discrete (statistically significant). With few exceptions, only clozapine, amisulpride, zotepine, olanzapine and risperidone were significantly more efficacious for the primary outcome than other drugs. We emphasize that readers should consult the league tables which provide this information (table 3). Amisulpride was among the most efficacious antipsychotics, but no placebo-controlled study was available, making this evidence entirely indirect. Nevertheless, amisulpride was significantly superior to placebo in elderly patients (SMD=0.86) and in patients with predominant negative symptoms (SMD=0.47). 28,29 Mainly newer antipsychotics provided data separately for positive and negative symptoms, but the picture was similar to overall change in symptoms. However, all included studies focused on positive symptoms, as studies with predominant negative symptoms were excluded in this analysis and evaluated separately. 28 It is impossible to clarify in populations with positive symptoms whether differences in negative symptoms relate to primary or just secondary negative symptoms.
That many drugs improved depressive symptoms more than placebo may also mainly reflect a reduction of anxiety and distress associated with schizophrenia. Nevertheless, lurasidone and quetiapine are licensed for major depression. So is flupenthixol, but we did not find an antidepressant effect for it, based on sparse data (63 participants). 30 Many antipsychotics did not have data for quality of life, an important outcome for patients as it combines efficacy and safety. If reported, most drugs showed better effects than placebo. Some but not all drugs also outperformed placebo in terms of social functioning in these short-term studies, an outcome associated with recovery and social reintegration.
As all-cause discontinuation combines efficacy and tolerability, it has been used as a measure of effectiveness in the CATIE trial. 31 If reported separately, more patients dropped out due to inefficacy (40%) than due to adverse events (20%) in the included trials, so that all-cause discontinuation is mainly an efficacy measure.
As antipsychotics are often taken for a long period, side-effects play an important role in morbidity and adherence, and may affect cognition. 32 Antipsychotics very often scored worse than placebo for side-effect outcomes, with different profiles. In general, older antipsychotics were often associated with more extrapyramidal motor side-effects and prolactin elevation (with notable exceptions such as amisulpride, paliperidone and risperidone), whereas many newer antipsychotics produced more weight gain and sedation. We consider weight gain to be a good proxy for metabolic side-effects in this already dense review. 33 Specific metabolic side effects such as glucose, insulin, HOMA-IR, total cholesterol, LDL cholesterol, HDL cholesterol and triglycerides will be addressed in future reviews. In contrast to our previous report, we present QTc prolongation in original units (msec), which makes clinical interpretation easier. Lurasidone and the partial dopamine agonists were the most benign drugs.
Looking at efficacy and safety outcomes, many older antipsychotics, limited by few direct comparisons, performed well compared to newer antipsychotics. This finding is important, because in low- and middle-income countries second-generation antipsychotics may not be affordable. However, older studies with negative results could have remained unpublished more frequently than today, when all clinical trials and their results should be registered. In an analysis of all antipsychotics compared to placebo, contour-enhanced funnel plots suggested unpublished studies (Fig. 6b).
Our analysis had limitations. We used strict inclusion criteria to obtain a homogeneous sample, but nevertheless, the included studies were conducted over a period of more than sixty years, during which study characteristics changed.
Checking for consistency revealed only few inconsistent loops and low-to-moderate heterogeneity in most outcomes (appendix 13), but the overall power to detect inconsistency is low. 16 The major exceptions were quality of life, for which an NMA was not calculated, and prolactin increase. As prolactin results may depend on the laboratory assay used, we calculated SMDs in addition to WMDs (appendix 20.4), which strongly reduced heterogeneity. Still, the fact that clozapine and zotepine significantly reduced prolactin compared to placebo (with wide credible intervals) may be a statistical artefact, because only two small trials were available. The most important threat to the transitivity assumption of NMA was the increase of placebo response over the years, 27,34 because adjusting for placebo response in a meta-regression strongly reduced heterogeneity (tau) by 60%-63% (Appendix 9.1). In this meta-regression model the ranking was not substantially different from the primary analysis. Removing placebo arms, placebo-controlled studies and failed studies in sensitivity analyses did not materially change the results. Nor did meta-regressions of six other moderators and further sensitivity analyses, speaking for the robustness of the findings. The NMA results were overall consistent with those of pairwise meta-analyses (Table 3) and single studies.
For example, a study comparing brexpiprazole with placebo and quetiapine found that brexpiprazole was better than placebo, but worse than quetiapine, similar to the hierarchy of our analysis. 35 In a long-term study (sponsored by asenapine's manufacturer) olanzapine was significantly better than asenapine. 36 Thus, we do not believe that placebo response explains all efficacy differences between the compounds. Nevertheless, it is possible that the statistical methods could not fully account for the heterogeneity. 27 Our decision to exclude studies from mainland China limits the generalisability of the results to this country. A recent literature and telephone interview study revealed that the majority of Chinese trials continue to be of low quality. 14 Chinese reports are usually very short and as communication with the authors is often difficult due to language problems, risk of bias is difficult to assess. We had therefore a priori decided to exclude Chinese studies.
Clinical trials exclude suicidal patients, and the severely ill are unlikely to be included in modern trials as giving informed consent is often not possible for them. With a mean duration of illness of twelve years, our sample consisted mainly of chronic patients, who are known to respond worse compared to first-episode patients. 37 These factors reduce generalisability.
For feasibility reasons our risk of bias assessment focussed on the primary outcome, but strictly speaking risk of bias is outcome specific (appendix 19). Moreover, the evidence for many secondary outcomes, e.g. social functioning, was based on much lower sample sizes compared to the primary outcome (42576 vs. 5199).
These limitations clearly limited the strength of the derivable recommendations. This is most clearly, albeit not only, the case for older antipsychotics, as their effect sizes are based mainly on one or two studies with sample sizes lower than 100. Small sample sizes leave room for small-trial effects which may have inflated some results. For example, the large effect of clozapine concerning reduction of negative symptoms is based on 63 participants, as clozapine is mainly studied in treatment-resistant patients, who were excluded from the analysis. The contribution of indirect evidence is large for older drugs, resulting in wide credible intervals, higher uncertainty and lower confidence in the evidence evaluated by CINeMA (Fig. 5). The generally smaller amount of data available for old drugs, except for perphenazine which had more evidence of good quality from a large trial, 31 is highlighted in the figures and must thus be taken into account in the interpretation of all findings.
As so many antipsychotic options are available, our results should help to find the most suitable drug for the individual patient, balancing the side-effect profiles and efficacy of different drugs. We confirm that antipsychotics differ more in their side-effects than in their efficacy. We believe that efficacy differences between compounds exist, but it is a problem that their measurement is based on subjective rating scales. The development of objective efficacy measures would make interpretation easier. Finally, clinicians must remember that the reported results are averages and that response and side effects may vary considerably in individual patients.

Declaration of interests
MH has received speaker's honoraria from Janssen and Lundbeck. SL has received honoraria for consulting from LB Pharma, Lundbeck, Otsuka, TEVA, Gedeon Richter, Recordati, LTS Lohmann, and Boehringer Ingelheim; and for lectures from Janssen, Lilly, Lundbeck, Otsuka, Sanofi-Aventis, and Servier. AC is supported by the National