Research Review: Recommendations for reporting on treatment trials for child and adolescent anxiety disorders – an international consensus statement

Background: Anxiety disorders in children and young people are common and bring significant personal and societal costs. Over the last two decades, there has been a substantial increase in research evaluating psychological and pharmacological treatments for anxiety disorders in children and young people and exciting and novel research has continued as the field strives to improve efficacy and effectiveness, and accessibility of interventions. This increase in research brings potential to draw together data across studies to compare treatment approaches and advance understanding of what works, how, and for whom. There are challenges to these efforts due largely to variation in studies’ outcome measures and variation in the way study characteristics are reported, making it difficult to compare and/or combine studies, and this is likely to lead to faulty conclusions. Studies particularly vary in their reliance on child, parent, and/or assessor-based ratings across a range of outcomes, including remission of anxiety diagnosis, symptom reduction, and other domains of functioning (e.g., family relationships, peer relationships). Methods: To address these challenges, we convened a series of international activities that brought together the views of key stakeholders (i.e., researchers, mental health professionals, young people, parents/caregivers) to develop recommendations for outcome measurement to be used in treatment trials for anxiety disorders in children and young people. Results and Conclusions: This article reports the results of these activities and offers recommendations for selection and reporting of outcome measures to (a) guide future research and (b) improve communication of what has been measured and reported. We offer these recommendations to promote international consistency in trial reporting and to enable the field to take full advantage of the great opportunities that come from data sharing going forward.


Introduction
Anxiety disorders are among the most frequent mental health problems experienced by children and young people (CYP; i.e., youth under 19 years of age), affecting around 6.5% of CYP worldwide (Polanczyk, Salum, Sugaya, Caye, & Rohde, 2015). Anxiety disorders in CYP are associated with current social and academic adjustment difficulties, and with the development of other mental health problems and adverse outcomes in critical life domains in adulthood (Asselmann, Wittchen, Lieb, & Beesdo-Baum, 2018; Essau, Lewinsohn, Olaya, & Seeley, 2014; Lieb et al., 2016; de Lijster et al., 2018). The high prevalence rates and the impact on current and future functioning call for effective interventions for CYP with anxiety disorders.
The first RCTs were designed primarily to study the efficacy of particular treatments. Although these RCTs have enabled conclusions about efficacy, most of them were not designed to draw clear conclusions about the relative efficacy of different treatment approaches and about which treatments work best (and are most durable) for which CYP, for which specific disorders or symptom clusters, and in which settings (e.g., mental health institutions, research settings, schools). This information is critical for developments in understanding and practice and to inform end users, policymakers, and clinicians on the effects they can expect from different interventions in different contexts. However, attempts to address these issues, for example through network meta-analyses (Zhou et al., 2018) and through combining individual patient data from multiple trials (e.g., Skriner et al., 2019), are limited by a lack of consistency and clarity in how outcomes are measured, administered, and reported. Earlier trials were also limited by a lack of guidance on trial reporting, and the information provided in papers was quite heterogeneous. Such a lack of consistency in how trial procedures are reported limits the replicability of studies and the examination of potential treatment predictors, mediators, and moderators.
Since the end of the 1990s, this state of affairs has improved somewhat following the development of the CONSORT (Consolidated Standards of Reporting Trials) Statement to enhance transparency and consistency in reporting on RCTs (Begg et al., 1996). Recently, an extension was published for reporting on social and psychological interventions (Grant et al., 2018) to supplement the more biomedical approach of the original CONSORT statement. These guidelines provide an important framework and trial reporting quality appears to have improved since CONSORT was published, including in RCTs for CYP with anxiety disorders (e.g., James, James, Cowdrey, Soler, & Choke, 2013; Reynolds et al., 2012; Warwick et al., 2017). However, these general guidelines were not designed to address issues that are age and/or condition-specific. For instance, across trials for CYP with anxiety disorders, there is a broad range of ways in which diagnostic outcomes, symptom measures, and other secondary outcomes are used. Reports vary by the informant (e.g., parent or child) and by what exactly they are reporting on (e.g., comorbidity or functioning; see sections below and Warwick et al., 2017). Similarly, there are treatment characteristics in CYP that vary yet have not been addressed in general guidelines, for example, how and to what extent parents should be involved in treatment. We require transparent and consistent reporting on these characteristics in order to inform clear guidance going forward.
This article addresses these issues by providing recommendations for the selection and reporting of outcome measures and treatment characteristics in RCTs for CYP with anxiety disorders. Individual trials need to specify the most relevant primary and secondary outcome measures to achieve trial-specific aims, so it is not our intention to be prescriptive about particular diagnostic interviews or questionnaires that researchers should use. Instead, our aim is to operationalize a small core set of key recommendations so that data can be measured consistently and meaningfully combined, while also keeping it feasible to report all the recommended features alongside any trial-specific measures. Along with the core key recommendations, we formulated additional suggestions that we encourage researchers to report on where feasible. We also aim to outline the key issues to consider when deciding which specific measures to choose, and we make recommendations to ensure clear and consistent reporting. It is important to note that the recommendations were formulated as guidelines for reporting on clinical trials, and not for routinely collecting outcomes in clinical practice (for guidance on this instead see e.g., the standard formulated by the International Consortium for Health Outcomes Measurement (ICHOM); Working Group on anxiety, depression, OCD and PTSD in children and young people; ICHOM, 2019).
European Association for Behavioural and Cognitive Therapies (EABCT) in Sofia, Bulgaria, in September 2018. We invited expert mental health professionals and researchers in the field from across geographical regions and at varying career stages. We identified four topics: (1) diagnostics, (2) anxiety symptoms, (3) secondary outcomes, and (4) treatment characteristics. Each topic was introduced within a plenary session followed by discussion with either the entire group or subgroups, to reach consensus and define recommendations.
Our main goal was to provide guidelines in terminology and to formulate a minimal set of recommendations to enhance consistency in reporting across trials. In the process, much discussion was devoted to distinguishing topics that were regarded as 'key recommendations' from topics that would be good or interesting to report, yet not feasible to report on in every trial. We formulated 'recommendations' (recommended to report) and 'suggestions' (encouraged to report), respectively. Volunteers signed up to write a draft of each topic in subgroups, using an online platform to share documents. Additional experts who expressed interest in joining but who were unable to attend the meeting were added to the subgroups. All drafts were edited and integrated into one full draft, which was then presented to a panel of patient representatives and a panel of clinical professionals. Stakeholder input was integrated in the second draft. The main conclusions of the second draft were then presented at the annual meeting of the Special Interest Group (childhood anxiety) of the Association for Behavioral and Cognitive Therapies (ABCT) in Washington DC in November 2018, where we requested feedback and further input from three subgroups of approximately 15 further participants. Some of those participants then joined the consensus group and provided more input, which was again edited and integrated into the final manuscript.
In parallel with this process, we consulted key stakeholders, aiming to establish broader views on measuring treatment outcomes. We included parents and CYP with experience with treatment for anxiety problems in the UK and conducted two discussion groups with mental health professionals with experience of assessing and treating child and adolescent anxiety disorders in mental health institutions in the Netherlands and the UK. We asked each group their thoughts about meaningful measures in the context of RCTs in anxiety disorders in CYP.

Stakeholder input
CYP feedback suggested that establishing a diagnosis was important at the start of treatment to facilitate access to relevant information, increase understanding, and to access support for the situation. However, CYP felt that the most important outcome was generally 'feeling better', in combination with progress on individualized goals that could be assessed regularly throughout the treatment. To capture this, CYP suggested that it may be most meaningful to assess changes in how anxiety impacts the individual's functioning. Short questionnaires with positively framed items on meaningful aspects were preferred over long questionnaires with multiple negatively framed items that they felt could make them feel worse.
The parents agreed that a diagnosis was useful to enable information seeking and understanding for family members, and stressed the importance of reducing symptoms and impairment in daily functioning. Parents also highlighted their interest in obtaining a more personalized 'story' of the individual CYP incorporating the history and the impact on the family beyond receiving a diagnosis because they felt a diagnosis did not necessarily reflect the full picture of the problem. Mental health professionals and parents stressed the importance of assessing anxiety symptom reduction as an important outcome. They particularly prioritized change in interference in daily functioning in various domains including school, family, and peers.
Mental health professionals described the aim of treatment as getting the CYP back on their developmental trajectory (e.g., academic, social, activities). Mental health professionals further valued the use of individual or personalized treatment goals that are specific to and meaningful for the child or family. For example, for one family the most important outcome following treatment for separation anxiety disorder may be for the child to go to sleep independently, whereas for another child, it may be to attend school consistently and/or on time. Such personalized treatment goals may also be broader than just anxiety symptom reduction, for example, 'taking part in out of school activities' or 'spending more time with friends', following the family's needs and wishes. Professionals highlighted that parent and CYP perspectives may be different and both are important to create a full picture. They also noted that, in some cases, a useful treatment outcome could reflect the family's confidence in managing current and/or future anxiety problems, rather than full remission. We considered these views when making our recommendations for reporting and for future measurement development.

Issues and guidelines
Diagnostic outcomes. Semi-structured diagnostic interviews (e.g., Anxiety Disorder Interview Schedule for DSM-IV-Child and Parent Version (ADIS-C/P; Silverman, Albano, & Barlow, 1996); Kinder-DIPS-OA (Schneider, Pflug, In-Albon, & Margraf, 2017)) have been most commonly used in trials of CYP with anxiety disorders to assess a range of anxiety disorders, common comorbid disorders, and associated functional impairment. There is reasonable overlap in the interview schedules' content, but there can be marked variation in the outcomes that are derived and reported from these diagnostic interviews in RCTs. Variations include absence/presence of a specific diagnosis, the principal diagnosis, or all diagnoses, the total number of diagnoses, the number/type of symptoms, and interference/severity ratings (e.g., 'Clinician Severity Ratings'; CSRs). There is also variation in how these outcomes are assessed, including whether child, parent, and/or assessor report are used, how diagnoses and interference/severity ratings are assigned, and how a composite summary is established. We outline below key considerations and recommendations to encourage greater consistency in reporting and assessing diagnostic outcomes. Since the ADIS-C/P interview is the most commonly used diagnostic interview, we use examples that fit with this interview, but the recommendations can be applied to reporting on diagnostics derived from other interview schedules.
'Remission' outcomes. Remission, or absence of anxiety diagnoses, is often the primary outcome in child anxiety treatment trials (e.g., Silverman et al., 1999; Spence, Donovan, & Brechman-Toussaint, 2000) and meta-analyses (e.g., James et al., 2013). However, there is variation in how remission is defined and reported (Warwick et al., 2017). Some studies report the absence of the pretreatment principal anxiety disorder (i.e., typically established as the most interfering and, as such, may also be the most salient problem for which the family seeks help). In contrast, some other studies report the absence of a subset of anxiety disorders (e.g., the disorders that would have made a child eligible for the trial), the absence of all anxiety disorders, or the absence of all anxiety and nonanxiety disorders. Depending on the research aims, the type of treatment (e.g., whether it is multianxiety disorder-focused, where the treatment protocol can target a range of anxiety disorders, or disorder-specific, where treatment targets one particular anxiety disorder), and available resources, researchers may prioritize one remission outcome over another. For example, where a treatment targets a specific anxiety disorder (e.g., social anxiety disorder or specific phobia), researchers may be most interested in recovery from this target disorder. However, if the aim is to establish whether or not treatment gains extend to other anxiety disorders, recovery from all anxiety disorders will be the most relevant outcome. Post-treatment and follow-up assessment procedures also vary; in some cases, the full structured diagnostic interview is administered, but in others only anxiety disorders, or a subset of anxiety disorders, or the pretreatment anxiety disorders are assessed. Consistency in reporting remission outcomes is critical to allow data to be meaningfully compared and combined across child anxiety treatment trials.
The high rate of comorbidity and overlap in symptoms among anxiety disorders in children and adolescents (e.g., Kendall et al., 2010; Waite & Creswell, 2014) means outcomes may vary substantially depending on which remission indices are used. In particular, focusing exclusively on the pretreatment principal diagnosis fails to consider the presence of common comorbid anxiety diagnoses, or the emergence of new anxiety diagnoses; and accurate differential diagnosis is reliant on a comprehensive assessment of all anxiety disorders. Note that, in line with DSM-5 classification, OCD and PTSD are typically regarded as comorbid nonanxiety disorders. We recommend that researchers assess all anxiety disorders post-treatment and at subsequent follow-ups (including those disorders that were not present at pretreatment) in trials of multianxiety disorder-focused treatments. There was some discussion on whether or not to recommend reporting on all anxiety diagnoses in trials on specific anxiety disorders, weighing the disadvantages against the advantages. Perceived disadvantages include the effort required and that the aim is not necessarily for effects to generalize to all anxiety disorders. Advantages mentioned were the consistency and opportunities for comparisons across trials. In conclusion, we recommend that researchers routinely report remission outcomes in terms of both (a) absence of principal anxiety disorder diagnosis and (b) absence of all anxiety disorder diagnoses in trials of multianxiety disorder-focused treatments, and this is also encouraged for anxiety disorder-specific treatments.
Reporting on nonanxiety comorbid disorders (such as depression) is encouraged but was not regarded as key to the minimal set of consistent variables to report on; however, these are likely to be particularly important considerations with increasing child age (Merikangas et al., 2010), particularly as there is evidence that anxiety disorders may be a gateway condition to other mental health problems (e.g., Beesdo-Baum et al., 2015;Kessler et al., 2012). To promote consistent and transparent terminology when reporting remission outcomes, we recommend researchers use the template for reporting remission outcomes in Table S1.
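To make the two recommended remission indices concrete, the following is a minimal sketch in Python of how (a) absence of the pretreatment principal anxiety disorder and (b) absence of all anxiety disorders might be derived from post-treatment diagnostic data. The `Assessment` structure, its field names, and the function names are hypothetical and chosen for illustration only; they are not part of any interview schedule or of the template in Table S1.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class Assessment:
    """One participant at one time point (hypothetical structure)."""
    principal_pre: str              # pretreatment principal anxiety disorder
    anxiety_dx_now: FrozenSet[str]  # anxiety disorders meeting criteria now


def remission_flags(a: Assessment) -> Dict[str, bool]:
    """Return both remission indices for a single participant."""
    return {
        # (a) the pretreatment principal anxiety disorder is no longer present
        "principal_absent": a.principal_pre not in a.anxiety_dx_now,
        # (b) no anxiety disorder is present, including newly emerged ones
        "all_anxiety_absent": len(a.anxiety_dx_now) == 0,
    }


def remission_rates(assessments: Iterable[Assessment]) -> Dict[str, float]:
    """Proportion of participants meeting each remission index."""
    flags = [remission_flags(a) for a in assessments]
    if not flags:
        raise ValueError("at least one assessment is required")
    return {
        key: sum(f[key] for f in flags) / len(flags)
        for key in ("principal_absent", "all_anxiety_absent")
    }
```

In this sketch a participant whose principal disorder has remitted but who still meets criteria for another anxiety disorder counts toward index (a) but not index (b), which is why the two indices can diverge and why reporting both is informative.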
Child, parent, and clinician ratings in diagnostic interviews. Structured and semi-structured diagnostic interviews typically include independent child and parent interviews, with guidelines on procedures for assigning diagnoses based on the combination of symptoms and associated interference/severity ratings. However, trials vary in use of (a) child and/or parent interviews; and (b) information provided by one or more informants to assign and report diagnoses and interference/severity ratings. Administering independent child and parent interviews is standard practice and joint parent and child interviews can be more acceptable in specific contexts (Ishikawa et al., 2019). However, there are circumstances where the child or parent interview only is used, for example, because of the specific sample, the purpose of the trial, the age of participants, available resources, or participant burden. Information collected from child and/or parent interviews, together with assessor ratings, is then used and combined (in various ways) to assign diagnoses and interference/severity ratings. Table 1 details descriptors of each potential reporter combination, together with the associated process for assigning diagnoses/interference ratings. Given the wide variability in how researchers have made decisions about diagnostic outcomes, we recommend that researchers use these descriptors to detail how diagnoses/interference ratings were assigned, and to indicate which reporter or combination of reporters is reported to promote transparency and clarity. We recommend that, wherever possible, researchers report 'consensus composite ratings' based on child and parent interviews, that is, assessor ratings assigned following discussion with a supervisor/independent clinician or the study team to derive consensus. Alternatively, where only one interview is used, we suggest the corresponding 'consensus rating' is used.
Researchers may also choose to report diagnoses/interference ratings based on other reporters (e.g., child report, parent report), but this information should be additional to consensus ratings.
Where independent child and parent interviews are used, a key consideration is how best to combine information from these two interviews to assign diagnoses. This presents a challenge for researchers, given the common discrepancies between reporters, and particularly that the extent of the discrepancy may be influenced by clinically relevant factors, including child age, gender, ethnicity, socio-economic status, social desirability, problem type, parent mental health, and aspects of the parent-child relationship (e.g., De Los Reyes & Kazdin, 2005). While other approaches have been suggested for dealing with the challenge of informant discrepancies (e.g., De Los Reyes & Kazdin, 2005), our primary aim in this paper is to promote consistency in trial reporting, so we have prioritized methods that are currently commonly used, can be followed systematically, and minimize potentially arbitrary approaches. In this way, a substantial degree of unknown variance may be reduced, which is precisely the reason why the field has long moved away from using unstructured clinical interviews. As such, we recommend that researchers continue to follow standard guidelines (as outlined in the ADIS-IV-C/P, and recent ADIS-5-C/P, manuals; for instance, Silverman & Albano, 2020) and apply the 'OR-rule', that is, a diagnosis is assigned if the symptoms are reported by either the child OR the parent. This is consistent with recommendations made by Comer and Kendall (2004), who concluded that 'employing the "OR-rule" increases the likelihood that all clinical cases are included. Given the poor diagnostic agreement identified among parents and children, this rule seems more practical than the more conservative "AND-rule", in which a disorder is considered present only if the reports of all informants meet criteria for that disorder'.
If a different approach is taken, it is critical that researchers explicitly report this and describe their approach, and this should be couched in further research to determine the potential impact on findings (in terms of reliability and reported outcomes) of taking an alternative approach.
While we advocate the use of the 'OR-rule' in order to promote consistency, it is important to reach the most parsimonious diagnostic conclusion; in other words, the most appropriate single diagnosis should be applied to explain essentially the same symptoms. For example, a child may meet diagnostic criteria for social anxiety disorder based on the child report and criteria for generalized anxiety disorder based on the parent report. However, if the excessive worries reported by the parent are exclusively related to fear of embarrassment or social evaluation, and not accounted for by more general worries about control, predictability, or negative affect, then the assessor should consider assigning a social anxiety disorder diagnosis only.
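The difference between the 'OR-rule' and the 'AND-rule' can be illustrated with a short Python sketch. The function name `combine_reports` and the use of plain sets of disorder labels are assumptions made for illustration; the sketch covers only the mechanical combination of child-based and parent-based diagnoses and deliberately does not attempt to automate the parsimony judgement described above, which remains the assessor's task.

```python
from typing import Set


def combine_reports(child_dx: Set[str], parent_dx: Set[str], rule: str = "OR") -> Set[str]:
    """Combine diagnoses derived from child and parent interviews.

    'OR'  : a diagnosis is assigned if either informant's report meets criteria.
    'AND' : a diagnosis is assigned only if both informants' reports meet criteria.
    """
    if rule == "OR":
        return child_dx | parent_dx   # set union
    if rule == "AND":
        return child_dx & parent_dx   # set intersection
    raise ValueError(f"unknown rule: {rule!r}")


# Hypothetical example mirroring the case described in the text.
child = {"social anxiety disorder"}
parent = {"generalized anxiety disorder"}

candidates = combine_reports(child, parent, rule="OR")
# 'candidates' now holds both diagnoses; the assessor would still review whether
# the parent-reported worries are better explained by social anxiety disorder alone.
```

With these inputs the 'AND-rule' would assign no diagnosis at all, which illustrates why it is the more conservative of the two rules when informant agreement is low.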
Clinician Severity Ratings (CSRs). The most widely reported interference/severity ratings derived from structured diagnostic interviews are CSRs. CSRs are assigned for each individual anxiety (and comorbid) diagnosis and, as detailed in Table 1, are generated by assessors for diagnoses based on (a) child report, (b) parent report, and (c) combined child and parent report. However, in addition to differences in how information from different reporters is combined, there is further variation in how CSRs are used across child anxiety treatment trials. CSRs are intended to provide a measure of severity and interference (Silverman et al., 1996), but this is not applied consistently. Our discussions indicate that some research groups use CSRs as a measure of severity, others focus on interference, and others use a conglomerate of the two. To help ensure CSRs are assigned consistently across trials, we urge researchers to continue to adhere to the guidance provided in interview schedules on assigning CSRs (e.g., Silverman & Albano, 2020; Silverman et al., 1996). CSRs are summarized and reported in different ways, including, for example, the CSR for the pretreatment principal anxiety diagnosis, the sum of CSRs across anxiety diagnoses, or the average CSR across all anxiety diagnoses. There are also issues related to whether the entire continuum of CSRs is reported. Research groups differ in whether they assign and report CSRs across the whole scale or only for those children who would meet diagnostic criteria by virtue of having the required symptoms and a composite CSR of four or more (on a 0-8 scale). If ratings are reported dimensionally from 4-8, but then are assigned a 0 for all ratings below 4, the resulting scale is a combination of categorical and dimensional scales, which raises questions about the appropriate statistical analyses. Summing and averaging CSRs across disorders can also be problematic.
Reporting CSRs can nevertheless provide important information about the degree of impairment associated with disorders that are present following treatment, both where the pretreatment principal anxiety disorder persists, or is in partial remission (CSR = 1-3), and where comorbid disorders remain or emerge. Given the issues raised above regarding the distribution of CSR data, we recommend that CSRs are primarily used to inform decisions about diagnoses; however, if they are reported, we encourage researchers to include reporting of these data in categorical form, that is, to report the frequencies of participants assigned to each CSR for (a) the pretreatment principal anxiety disorder, and (b) the most-impairing disorder at each time point. Table 2 provides a summary of the recommendations for reporting on diagnostic outcomes. Critically, across diagnostic outcomes, it is important that researchers report on how assessors were trained and to what criterion, the inter-rater reliability that was established, how supervision was conducted, and how reliability was ensured throughout the trial.
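As an illustration of reporting CSRs in categorical form, the sketch below tabulates how many participants were assigned each rating on the 0-8 scale for a given disorder and time point. The function names and the example ratings are hypothetical; the threshold of four is taken from the description above, and a rating at or above it does not by itself establish a diagnosis, since the required symptoms must also be present.

```python
from collections import Counter
from typing import Dict, Iterable

CSR_MIN, CSR_MAX = 0, 8    # scale range described in the text
CLINICAL_THRESHOLD = 4     # CSR of four or more, alongside the required symptoms


def csr_frequencies(csrs: Iterable[int]) -> Dict[int, int]:
    """Frequency of participants at every CSR value, including empty categories."""
    ratings = list(csrs)
    for rating in ratings:
        if not CSR_MIN <= rating <= CSR_MAX:
            raise ValueError(f"CSR out of range: {rating}")
    counts = Counter(ratings)
    # Report the whole continuum so that ratings below 4 are not collapsed to 0.
    return {value: counts.get(value, 0) for value in range(CSR_MIN, CSR_MAX + 1)}


def n_at_or_above_threshold(csrs: Iterable[int]) -> int:
    """Number of ratings at or above the clinical threshold (not a diagnosis count)."""
    return sum(1 for rating in csrs if rating >= CLINICAL_THRESHOLD)


# Hypothetical post-treatment CSRs for the pretreatment principal anxiety disorder.
post_principal = [0, 2, 4, 4, 6, 8]
table = csr_frequencies(post_principal)
```

Tabulating the full 0-8 range in this way keeps subclinical ratings (1-3) visible, which is the information that is lost when ratings below four are recoded to zero.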

Continuous measures of reported symptoms and functional interference
This section focuses on important considerations in the selection of measures and respondents to assess outcomes in terms of (a) anxiety symptoms and (b) interference caused by anxiety.
Constructs: What to measure? Different assessment measures have been designed for different contexts and purposes, such as screening in the general population, early identification of those at risk, clarifying the nature, severity and impairment associated with clinical disorders, and assessing the impact of interventions. Below we focus on a minimal set of key recommendations for measures of treatment outcome.

Table 2. Summary of recommendations for reporting on diagnostic outcomes
For multianxiety disorder-focused treatments (and where possible for disorder-specific treatments), assess all anxiety disorders at each time point and report on the absence of all anxiety disorder diagnoses.
Use the template for reporting remission outcomes provided in Table S1.
Use descriptors provided in Table 1 to provide clarity on how diagnoses and interference/severity ratings were assigned and which reporter combination was used.
Where possible, report consensus composite ratings based on child and parent interviews.
Where independent child and parent interviews are administered, use the 'OR-rule' to assign diagnoses (or report if using another approach), but be mindful to ensure the same symptoms are not assigned to multiple diagnoses.
Follow guidance provided in interview schedules when assigning CSRs in order to assign diagnoses and report as categorical data.

2020 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for Child and Adolescent Mental Health

Anxiety symptoms. Historically, treatment trials have most commonly included CYP with various anxiety disorders and have used 'multianxiety disorder-focused' treatment protocols that can be applied across a range of disorders (e.g., Ginsburg et al., 2011; Hudson et al., 2009; Kendall, Hudson, Gosch, Flannery-Schroeder, & Suveg, 2008; Thirlwall et al., 2013). Appropriately therefore, the bulk of studies conducted in this area have included measures of anxiety symptoms that cover symptoms of the major anxiety disorders specified in the DSM, for example, separation anxiety disorder, social anxiety disorder, generalized anxiety disorder, specific phobia, and panic/agoraphobia. This approach has been valuable in allowing the comparison of results across clinical trials. Scales covering the different anxiety disorders have the advantage that they can report on both a broad range of anxiety symptoms as well as symptoms of specific anxiety domains, thus leaving the option of evaluating and comparing the impact of broad and specifically targeted treatments on broad and specific subscales. For instance, one could evaluate the impact of a multianxiety disorder-focused CBT program on total anxiety symptoms as well as on social anxiety symptoms specifically, and compare that to the broad and specific impacts of a social anxiety disorder-focused treatment. As such, we recommend the inclusion of a multidimensional measure of anxiety symptoms that includes a total score for anxiety in general, even when the trial targets a specific anxiety disorder. For an overview of specific measures that may be used for this purpose (at particular ages), we refer to the recent review by Spence (2018).
Where a study focuses upon the treatment of a specific anxiety disorder, we recommend including a measure that focuses on symptoms of the specific disorder concerned, in addition to a broader measure of anxiety. Such a focused measure may potentially be a subscale of one of the multidimensional anxiety measures described above as long as there is evidence that it is a psychometrically sound measure of the specific construct of interest (e.g., see Reardon, Spence, Hesse, Shakir, & Creswell, 2018).
Interference associated with anxiety symptoms. We recommend that a measure of the level of day-to-day interference associated with anxiety is included and reported in clinical trials, both because CYP, parents, and clinicians emphasized its importance as a meaningful outcome, and because recent evidence suggests that interference measures align better with diagnostic outcomes than measures of symptoms (Evans, Thirlwall, Cooper, & Creswell, 2017). While there are specific measures of interference caused by anxiety (e.g., the Child Anxiety Impact Scale (CAIS; Langley, Bergman, McCracken, & Piacentini, 2004); the Child Anxiety Life Interference Scale (CALIS; Lyneham et al., 2013); the Child Sheehan Disability Scale adapted for child anxiety (CSDS; Whiteside, 2009)), other assessment tools (e.g., the Pediatric Anxiety Rating Scale (PARS; RUPP Anxiety Study Group, 2002)) typically produce one total scale for anxiety covering various aspects including severity and interference of anxiety symptoms (as well as frequency, distress, and avoidance). We recommend that if such a measure is used, the item level data are also reported, and not only the 'total anxiety' score that conflates symptoms and interference.
Other considerations in selecting parent and child report measures
Psychometric properties. First and foremost, when choosing which measures of the above constructs to include in a trial, the psychometric strength (e.g., test-retest reliability, internal consistency, construct validity) of the measure is paramount. Importantly, the measure should be able to discriminate between clinical and nonclinical levels of anxiety and be sensitive to treatment gains in children and young people. Having said this, it is important to highlight that there are limited data on the sensitivity and specificity of clinical cutoff points for determining diagnostic status on the basis of child anxiety questionnaires (for exceptions see e.g., DeSousa, Salum, Isolan, & Manfro, 2013; Evans et al., 2017; Rynn et al., 2006). Furthermore, typically, measures have been developed with children aged between 8 and 17 years, from white, two-parent families of relatively high socio-economic status, and we have limited knowledge of the relevance and sensitivity of these measures across cultures, minority groups, ages, and genders. While examples of culture-bound anxiety syndromes have been identified in young adults (e.g., Essau, Sasagawa, Chen, & Sakano, 2012), studies to date have suggested that commonly used measures for children are robust across particular ethnic groups (e.g., Pina, Little, Knight, & Silverman, 2009; Skriner & Chu, 2014). Future research should further focus on increasing knowledge of applicability of questionnaires to children and families who do not fall into this narrow demographic.

Reporting. There is considerable inconsistency between trials in what is reported with respect to continuous measures. The results should be presented in sufficient detail to enable inclusion of studies within meta-analyses based on mean scores, standard deviations, and number of informants at each time point for each condition. Given significant variability in the availability of normative data for different child anxiety measures, and variations between age and sex, we recommend that researchers report mean raw scores and standard deviations rather than standardized scores (such as T-scores). With the above limitations in mind, we encourage future research to also show change from pre- to post-treatment (and follow-up) relative to the clinical cutoffs to provide relevant information to inform decision making for families, clinicians, and policymakers.

Informants. Parent-child agreement on anxiety symptoms is low to moderate (Popp, Neuschwander, Mannstadt, In-Albon, & Schneider, 2017). Informants disagree due to the varied perspectives on the child's experience across different contexts (Kraemer et al., 2003), although this varies with child age (Silverman & Eisen, 1992). As a result, the use of multiple informants (child, parent(s), clinician, and teacher) should be carefully considered when conducting youth anxiety research. With respect to parental report, although including two parents increases the richness of perspective due to potentially different views (e.g., Moreno, Silverman, Saavedra, & Phares, 2008), it often leads to high levels of missing data for one parent and increases the number of analyses reported in the trial papers (Hudson et al., 2014). 
The low to moderate agreement between parents on youth anxiety symptoms (e.g., Fjermestad, Nilsen, Johannessen, & Karevold, 2017; Villabø, Gere, Torgersen, March, & Kendall, 2012) also means that combining data from parents is not recommended as it leads to a potential loss of valuable information and makes interpretation difficult. Therefore, as complete data sets from all parents or caregivers are not possible or practical, we recommend prioritizing a primary caregiver and, importantly, ensuring consistent data collection from that reporter at each time point within the trial. Teacher report may provide valuable insight into a child's functioning at school, especially for children at primary school level where individual teachers typically have a high level of contact with individual children (e.g., Reardon et al., 2018). However, changes in teachers from year to year, together with multiple teachers being responsible for the child with differing depth of knowledge of the child's symptoms and interference, particularly in the high school years, can make use of teacher report within and across clinical trials problematic. Thus, because of these issues and concerns, we are not including teacher report in our list of key recommendations for clinical trials.
An important issue arising from the inclusion of multiple informants is resolving discrepancies in the data they provide with respect to outcome. For example, data from one informant (e.g., primary caregiver) may demonstrate significant change following treatment while another (e.g., child) may not. We recommend that researchers always prespecify which reporters will be prioritized (if any). For example, although Weisz et al. (2017) showed that treatment effect sizes did not differ according to the type of informant on anxiety symptoms, a number of studies with preadolescent children have failed to show treatment effects on child-reported anxiety measures in the presence of significant change on diagnostic outcomes and parent report (e.g., Rapee et al., 2017). In line with this, it has been found that, at least for preadolescent children, parent report is more often consistent with diagnostic outcomes (Evans et al., 2017). As such, for studies with preadolescent children, we encourage researchers to prioritize parent report, to maximize consistency across studies.
Development and age. There are considerable developmental differences across childhood and adolescence, and symptom measures may be more or less suitable for specific age groups (see also Spence, 2018). Indeed, there is a lack of well-validated self-report measures for children under the age of 8 years, with studies most typically relying on parent report for this age group (e.g., Rapee et al., 2005), and, as noted above, for preadolescent children, parent report appears to more closely align with clinician assessments (Evans et al., 2017). Within trials, participants may cover age groups with marked developmental differences and assessments may also be conducted at time points between which there may have been substantial developmental shifts. The challenge for researchers is how to achieve a balance between consistency in measurement use within (and between) trials and ensuring that age-appropriate measures are used. The most important consideration is that age-appropriate measures are used at each assessment point, so researchers must select measures that are appropriate for their whole participant age-range at the outset of the study (and many questionnaire measures of anxiety have been validated across fairly wide age ranges). Where follow-up assessments are being administered some time after the initial assessment, it is possible that different questionnaire measures will need to be used (e.g., if adolescents have become adults). One example of how that has been managed in previous studies comes from Saavedra et al. (2010), where both youth and adult measures were administered in parallel at all assessment points, and developmentally sensitive scale scores were then generated based on item response theory models. 
However, currently little is known about the factorial (in)variance of measures with age or development, or indeed with a range of other characteristics that may influence responses to these questionnaires (though there are exceptions, e.g., Glod et al., 2017; Pina et al., 2009). Going forward, further studies of the factorial (in)variance of our commonly used scales are required to gain a greater understanding of how demographic and other characteristics affect responses.
A summary of the recommendations for reporting on continuous measures of symptoms and functional interference is given in Table 3.
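The reporting recommendation for continuous measures (mean raw scores, standard deviations, and number of informants at each time point, for each measure and each condition) can be illustrated with a short sketch. This is only a minimal example under stated assumptions: the record fields (`measure`, `informant`, `condition`, `time_point`, `score`) and the function name `summarize_raw_scores` are hypothetical and are not taken from any template in this article.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_raw_scores(records):
    """Return n, mean, and SD of raw scores per reporting cell.

    A cell is one (measure, informant, condition, time_point) combination.
    Field names are hypothetical, chosen for illustration only.
    """
    cells = defaultdict(list)
    for record in records:
        # Skip missing responses so that n counts informants who actually reported.
        if record["score"] is None:
            continue
        key = (
            record["measure"],
            record["informant"],
            record["condition"],
            record["time_point"],
        )
        cells[key].append(record["score"])

    summary = {}
    for key, scores in cells.items():
        # The sample SD is undefined for a single observation.
        sd = stdev(scores) if len(scores) > 1 else None
        summary[key] = {"n": len(scores), "mean": mean(scores), "sd": sd}
    return summary
```

Keeping the cells separate by informant, rather than averaging across child and parent report, follows the point made above that combining informants loses information.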
Reporting on sample and treatment characteristics: specific considerations for trials of treatments for CYP with anxiety disorders
Detailed reporting of sample and treatment characteristics is needed to support study replication, as well as to enable meaningful comparison of effects between trials and treatment types and to facilitate dissemination of effective interventions. Guidelines now exist for the reporting of trial characteristics; however, in addition to the detailed reporting required by the CONSORT and CONSORT-SPI statements (Grant et al., 2018), there are several factors specific to CYP anxiety intervention trials that require consideration and for which we provide reporting recommendations here. We recognize that the recommendations and suggestions for reporting that follow provide a lot of detail about the interventions, which would traditionally be beyond what could be reported in a journal article. However, with growing use of supplementary materials by journals, the routine provision of this information will improve access to critical data for making comparisons across treatment studies and integrating data across treatment studies.
Sample characteristics. As highlighted by Warwick et al. (2017), further research is required to determine applicability of intervention effects across different participant groups and to detect potential moderators of treatment outcome. However, to facilitate such evaluations, it is imperative that researchers provide detailed pretreatment demographic and clinical characteristics of the sample in the context of the intervention. Specifically, we recommend that researchers report on the following baseline demographics: sex and/or gender (n, %), and age (range, mean, SD). We also encourage researchers to report on family living arrangements, ethnicity or country of birth, and socio-economic status; however, we recognize that there will be regional variations in how these data are collected and interpreted, so we encourage use of national standards for collecting and reporting. We recommend researchers report on principal anxiety disorder at baseline, the mean and range of the number of comorbid anxiety disorders, and the frequency of each anxiety disorder (e.g., % of children meeting criteria for social anxiety disorder as a comorbid disorder). Moreover, we encourage reporting of the number of comorbid (nonanxiety) mental health disorders and frequency of each disorder.
Treatments: contextual, structural, and content characteristics. Recent meta-analyses have identified specific directions for future child anxiety intervention research (e.g., James et al., 2013;Reynolds et al., 2012), highlighting the need for study designs and reporting to address questions about how treatments work, for whom, how to adapt treatments for different contextual situations, and how treatments can be provided in the most cost-effective way (Seligman & Ollendick, 2011). Combining data from across trials can help to address some of these questions; however, this requires explicit reporting of what the different interventions include and about the context in which the interventions were conducted. This should include a description of the contextual characteristics (target population(s), theory, empirical evidence; see Table S2), structural characteristics (dose/number of sessions, frequency, sequence, modality, mode of delivery; see Table S3) and content (treatment techniques and steps; see Table S4 for an example). If separate treatment elements are delivered to children, adolescents, and parents, specific information should be provided for each element. We recognize that different intervention approaches are required for young people of different ages and that this may be evidenced through different treatment content, delivery modes, treatment structure, and levels of parental involvement (if any). For this reason, we recommend that when age-specific treatments are provided within a trial, treatments are described separately and with sufficient contextual detail to allow effective comparison and understanding. While these issues all clearly apply to psychological interventions, they also apply to the support that is provided alongside a pharmacological intervention and should also be reported in these contexts. When describing treatments, particular attention should be paid to describing parental involvement. 
Table 3 Summary of recommendations for continuous measures of reported symptoms and functional interference
Include a multidimensional measure of anxiety symptoms that provides a total score on anxiety symptoms as well as subscales for symptoms of specific anxiety disorders, even when the trial targets a specific anxiety disorder.
Include a measure of target symptoms where relevant (e.g., include social anxiety symptoms measure, if social anxiety disorder is the target of the treatment; ensuring that this is psychometrically reliable and valid for the specific target).
Include a measure of interference or impact caused by anxiety, separate from anxiety symptoms.
Report mean raw scores and standard deviations and number of informants at each time point, for each measure and for each condition.

Recent meta-analyses have looked at the effect of including parental involvement in CBT for children with anxiety disorders, with some finding no beneficial effect of parental involvement (e.g., Lebowitz, Marin, Martino, Shimshoni, & Silverman, 2019; Vigerland et al., 2016), and others finding better long-term outcomes than when CBT is conducted with parents alongside child treatment sessions (Kreuze et al., 2018; Manassis et al., 2014). These mixed findings are hard to fully understand given the great variation in the type of and extent of parental factors targeted in treatment studies. Some treatments, for example, are solely delivered to or via parents (e.g., Cobham, 2012; Lebowitz et al., 2019; Thirlwall et al., 2013). To overcome this problem going forward, special attention must be paid to clearly describing the involvement of parents in treatment, and how this may differ where trials include, for example, both children and adolescents. If there is a separate intervention for parents (e.g., workbook, parent sessions, online material), this should be described in detail and separately using the columns of the templates in Table S3. Specifically, we recommend that intervention protocol descriptions are presented for all versions of the treatment delivered (e.g., reflecting variations for participants of different ages) and capture the number and frequency of parent sessions, parent involvement in home exercises, presence of parents during child treatment sessions, whether treatment involved the presence of one or two parents, strategies for providing information to parents in separated families, and mode of delivery for parents. Even when there is no specific treatment content for parents, we recommend researchers report on any planned interactions with parents that may represent active treatment components. 
For example, parents might receive psychoeducation together with the child, be involved as a 'therapist/coach' for homework purposes, be involved in exposure, or be allowed to have separate contact with the therapist during treatment, all of which may have an impact on the child's adherence and parental management of anxiety. We recommend that this detail is reported in line with the templates provided in Table S3.
Treatments: content characteristics. RCTs of interventions for child and adolescent anxiety disorders tend to focus on 'treatment packages'. For example, CBT, the most extensively evaluated treatment for child anxiety disorders, typically consists of multiple strategies or components, the most common components being exposure (included in 88% of treatment packages), cognitive techniques (62%), relaxation (54%), psychoeducation (42%), and modeling (34%) (Higa-McMillan, Francis, Rith-Najarian, & Chorpita, 2016). Programs targeting anxiety disorders, on average, contain six to seven strategies, with 29 different CBT strategies described in 27 studies (Frechette-Simard, Plante, & Bluteau, 2018). These findings highlight that treatment packages vary considerably in content, although this is not always clear from the research papers. In addition, treatment packages often offer treatment in a specific sequence and dosage and with multiple modalities. As noted above, the content of parental components also often varies considerably. For example, the meta-analysis by Manassis et al. (2014) highlighted potential differences in the roles of contingency management, transfer of control, and family functioning in parent focused components of treatments for child anxiety disorders.
We recommend that the content of the treatment studied in an RCT should be described in terms of all treatment components per session, separately for the child and parent components. The template in Table S4 provides an example of such a description of the content of the treatment(s). Furthermore, where researchers have adapted an existing treatment package for a specific (e.g., cultural) context, any modifications and adaptations should be reported explicitly (e.g., translation; tailoring of terms; adding, deleting, or reordering elements; Stirman, Miller, Toder, & Calloway, 2013).
Dosage of treatment received. It is important to report the dose of treatment actually received by participants in trials evaluating both psychopharmacology and psychotherapy. Regarding medication, CONSORT provides guidelines on how to report on a drug intervention (i.e., 'the drug name, dose, method of administration (such as oral, intravenous), timing and duration of administration, conditions under which interventions are withheld, and titration regimen if applicable'; Moher et al., 2010) and further, specific recommendations regarding childhood anxiety disorders do not seem necessary. However, in child and adolescent anxiety psychotherapy trials, meta-analyses have revealed much variation in how the actual dose of treatment received is reported (James et al., 2013) with particular gaps in reporting of the actual number of sessions 'attended' or 'received', or time duration (i.e., number of weeks) over which sessions were completed. This can lead to misconceptions about intervention dose and associated effects. Indeed, there is emerging evidence that some delivery modes (e.g., online programs) are associated with slower completion of treatment sessions (Jolstedt et al., 2018;March, Spence, & Donovan, 2008;Spence et al., 2011;Vigerland et al., 2016) and that, sometimes, treatment could be shortened in response to subject characteristics such as anxiety severity (Pettit, Silverman, Rey, Marin, & Jaccard, 2016). As such, we recommend that researchers detail actual intervention uptake by participants. That is, the average number of sessions and time spent in sessions actually attended or received by (both child and parent) participants at the posttreatment or later follow-up assessments should be reported as well as the actual total treatment duration (weeks) using a template such as that in Table S5. 
If applicable, the received treatment dosage at follow-up may be provided separately, for example, in the context of online interventions where sessions may continue after the post-treatment (or follow-up) assessment. Note that the current recommendation only includes a quantitative evaluation (how many sessions or hours were received). In addition, and if feasible, it may be fruitful to also engage in more in-depth evaluations of adherence (e.g., McLeod et al., 2019).
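As an illustration of the quantitative dose reporting described above, the following sketch computes the average number of sessions actually attended and the average actual treatment duration in weeks for each trial condition. It is a minimal example under stated assumptions: the participant fields (`condition`, `sessions_attended`, `first_session`, `last_session`) and the function name `dose_received` are hypothetical, and the output does not reproduce the layout of the Table S5 template.

```python
from datetime import date
from statistics import mean


def dose_received(participants):
    """Summarize treatment dose actually received, per condition.

    Each participant is a dict with hypothetical keys: 'condition',
    'sessions_attended' (int), and 'first_session' / 'last_session'
    (datetime.date objects for the sessions actually attended).
    """
    by_condition = {}
    for participant in participants:
        by_condition.setdefault(participant["condition"], []).append(participant)

    report = {}
    for condition, group in by_condition.items():
        # Actual duration is measured from first to last attended session,
        # not from the number of weeks planned in the manual.
        weeks = [
            (p["last_session"] - p["first_session"]).days / 7 for p in group
        ]
        report[condition] = {
            "n": len(group),
            "mean_sessions_attended": mean(p["sessions_attended"] for p in group),
            "mean_duration_weeks": mean(weeks),
        }
    return report
```

Child and parent participation could be summarized by calling the same function on separate child and parent attendance records, so that the two are reported side by side rather than merged.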
Given the variations in the treatment components employed in child and adolescent anxiety interventions, it has been difficult to accurately identify key active components. As such, we also encourage researchers to provide information on the proportion of participants adhering to specific treatment components (e.g., exposure, relaxation; for an example see McLeod et al., 2019), as well as on additional strategies that were delivered that were not detailed in the protocol. Such reporting has become particularly relevant in recent times with the emergence of new treatment modalities such as online delivery, within which detailed analytics are available regarding participation in treatment (e.g., Jolstedt et al., 2018). Consequently, there is a unique opportunity to understand which components and modalities are most acceptable and effective for child and adolescent anxiety disorders.
Treatment and hidden time. Both 'hidden' and 'unplanned' time are commonly encountered in child and adolescent anxiety treatment trials (e.g., amount, content and delivery mode of preparation, administration, supervision, liaising with schools or teachers, conducting motivational interviewing with parents to enhance engagement, crisis management). We realize that many of these aspects vary as a function of local customs and therapist experience. Nevertheless, collecting and describing these aspects will be a good start to obtain insight into the real-life costs of implementing treatments. We suggest that the amount of hidden time is presented using the template provided in Table S6. Specification of how potential missing sessions were replaced is also encouraged, for example, if missing a group session resulted in additional therapist time through an individual face-to-face session. Table 4 provides a summary of the recommendations on reporting on sample and treatment characteristics.

Discussion
We identified consensus recommendations to supplement standard guidelines for reporting on treatment trials focused on child and adolescent anxiety disorders to enable international consistency and to facilitate comparison across trials. Our recommendations were influenced by the views of CYP, parents, and professionals, who particularly highlighted the importance of measures of interference or impairment caused by anxiety. This is a key component of any diagnostic assessment; however, in reflection of these views, we recommended that child and parent-reported interference measures are obtained in their own right. Of note, CYP, parents, and professionals also value the assessment of goals, highlighting an area for future measurement development and evaluation going forward (for example, through validation of measures such as the Goals Based Outcomes (e.g., Law & Jacob, 2015) in the context of treatment of child and adolescent anxiety disorders). Finally, stakeholders highlighted the importance of measures not being overly lengthy, feeling relevant, and not being overly negatively framed. We highlight these issues as ongoing challenges for the field. In terms of reducing burden, going forwards there is clearly great potential for the use of online assessments, algorithms, and measurement of wider variables (e.g., through activity monitoring) to both increase efficiency of assessment and to enable more consistent reporting across trials.
Table 4 Summary of recommendations on reporting on sample and treatment characteristics
Report on the following baseline demographics: sex and/or gender (n, %) and age (range, mean, SD).
Report on principal anxiety disorder at baseline (i.e., the most interfering anxiety disorder), the number of comorbid anxiety disorders and number of each anxiety disorder.
Provide detailed description of treatment protocol including contextual characteristics and structural characteristics, both for child and parent interventions. We encourage researchers to use the templates provided in Tables S2 and S3.
Provide a detailed description of the content of the intervention (focus and components per session, separately for child and parent interventions). We encourage researchers to use the template provided in Table S4.
Report on participant received dosage of structural elements of the intervention. We encourage researchers to use the template provided in Table S5.
Report on planned and unplanned therapist/treatment delivery time, including deviations from protocol and additional active treatment occasions. We encourage researchers to use the template provided in Table S6.

When making choices between recommendations and suggestions, we decided to discard those constructs that deserve a more extensive assessment than we could recommend for all trials, which means that particular areas that will often be worthy of assessment, such as broader health-related quality of life and age-appropriate functioning in different domains (e.g., school, home, hobbies, peers), while encouraged, are not 'recommended'. We also did not make recommendations where issues are already well covered by CONSORT guidelines and where issues are not specific to reporting in the context of treatment trials for CYP with anxiety disorders, such as reporting on treatment integrity, or on adverse effects of treatments. Despite our attempts to retain a focused list of recommendations, we have recommended that reports are obtained from multiple informants, in particular, clinicians, CYP and primary caregivers. Taking this further, Rith-Najarian et al. (2017) recommend, and we concur, that to demonstrate treatment efficacy, significant effects should be found with data gathered from more than one reporter (e.g., clinical assessors and primary caregiver and/or adolescent). Thus, we recommend researchers include data from the reports of clinical assessors as well as both children and parents and state, a priori, which reporters will be prioritized in drawing conclusions.
We also recommended that researchers provide detailed information on the treatment that was provided, including global treatment components, their dosage, and sequence. The option to include supplementary materials alongside journal papers means that this sort of information can now be provided as standard. As a next step, it may be important to use more comprehensive standardized reporting templates to describe the details of treatment components or techniques (e.g., cognitive restructuring, problem solving, exposure) that are used in a treatment protocol (e.g., Bodden, Nauta, Kuijpers, Stone, & Stikkelbroek, 2016; Chorpita, Becker, Daleiden, & Hamilton, 2007; Chorpita & Daleiden, 2009). Such a taxonomy of treatment contents would allow for consistent definitions and descriptions of components (e.g., name (e.g., exposure), definition (e.g., exposing the client to the anxious situation or object in vivo or in imagination), rationale (e.g., rationale based on habituation, self-efficacy, inhibitory learning, etc.)). Such detailed reporting would also facilitate integrated analyses of processes of change and recovery including those relating to mediational processes (e.g., Silverman et al., 2019) and differential patient trajectories (e.g., Skriner et al., 2019).
It is important to highlight that the aim of the current paper is to increase consistency in reporting in clinical research trials, and some of the recommended measures may not be relevant or feasible in routine clinical practice. The topic of assessment in the context of routine clinical practice warrants its own consensus statement (see Szatmari, Offringa, Butcher, & Monga, 2019) as recently developed by ICHOM (2019). The inclusion of measures of both symptoms and interference provides a bridge to compare data from research trials and that obtained in routine practice. Further evaluation of measures to ensure appropriate and accurate clinical cutoffs are available will be critical to enable both researchers and clinicians to use these measures to establish whether clinically significant outcomes have been obtained. We deliberately focused on general principles of assessment, rather than recommending specific measurement tools, as we hope to influence both trial reporting going forwards and how existing data is used, for example, in meta-analyses. Finally, we focused specifically on treatment outcome measures; however, going forwards a similar approach to the consistent measurement of putative maintenance mechanisms will benefit data sharing to promote understanding of how treatments work and how to improve them.

This paper describes the outcomes of discussions from a working group that set out to develop a common language and both core and aspirational principles to promote greater consistency in reporting on trials of treatments for child and adolescent anxiety disorders. The recommendations, suggestions and templates that we have presented are not intended to constrain researchers, but instead we hope they will provide a mechanism to broaden and expand our science by allowing us to bring treatment datasets together from around the world in meaningful ways. 
While we took a very deliberate focus on treatment trials for anxiety disorders in children and adolescents, we recognize that many of the issues raised will have parallels in treatment trials for other mental health problems in children and adolescents, and we encourage others to take similar steps to promote consistent reporting to ensure that the integration of trial data will create far more than the sum of its parts, allowing us to address overdue questions in child and adolescent mental health research, such as what works for whom and how.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article:
Table S1. Template for reporting remission outcomes.
Table S2. Template for reporting contextual characteristics of the treatment.
Table S3. Template for reporting structural characteristics of the treatment (as planned in the manual).
Table S4. Template with example of treatment content.
Table S5. Template for reporting dosage of treatment received across groups.
Table S6. Template for reporting treatment and therapist delivery time across groups.