It’s misleading to say 100 studies were not included in the Cass Review

20 May 2024

What was claimed

Around 100 studies were not included in the Cass report.

Our verdict

This is misleading. Two systematic reviews considered the evidence of 103 relevant studies to inform the Cass report. Of these, 101 were not graded high-quality, and 43 low-quality studies were excluded from the reviews’ conclusions. All the studies were included for assessment, however.

“Around 100 studies have not been included in the Cass report, and we need to know why.”

Dawn Butler MP, 15 April 2024.

The Labour MP Dawn Butler claimed in Parliament in April that around 100 studies were not included in the Cass Review of gender identity services for children and young people. She later posted a video of her comment on X (formerly Twitter).

This followed a number of similar claims from campaign groups and large social media accounts that around 100 studies had been either excluded, “discarded”, “disregarded”, not included or rejected.

The claim that 100 studies were not included is misleading, however, because it implies that 100 studies were not considered by the review at all.

In fact, the Cass report was informed by systematic reviews by the University of York, two of which assessed the evidence on puberty blockers and cross-sex hormones in 103 relevant studies. Of these, 43 studies rated low-quality were not included in later syntheses of the reviews’ findings, but all the other studies were included.

The report’s author, Dr Hilary Cass, has herself called Ms Butler’s claim “completely wrong” in an interview with the Times. In an FAQ page subsequently published on its website, the Review also said: “All high quality and moderate quality reviews were included in the synthesis of results. This totalled 58% of the 103 papers.”

Ms Butler has since corrected the record in Parliament. But we’ve continued to see various versions of the claim that 100 studies were not included in the Cass Review circulate on social media since then.

Politicians, charities, campaigners and people on social media should take care to talk accurately about evidence, especially in medicine, where false or misleading information may harm people who use it to make decisions about their health.

Where the claim came from—and how it was corrected

When Full Fact emailed Ms Butler’s office on 17 April, her office told us that her claim was based on a briefing she had received from the LGBTQ+ charity Stonewall, and it shared this briefing with us.

The briefing said Stonewall considered that Cass’s approach to the evidence was flawed, and claimed that 100 studies had not been included.

We asked Stonewall for a source for the 100 studies figure, and it told us on 19 April that it was a reference to the systematic reviews conducted by the University of York, which categorised all but two out of 103 studies as either moderate- or low-quality.

Stonewall also said it felt the low-quality studies should have been included. The basis for this opinion wasn’t clear, but it seemed to rest at least partly on an incorrect belief that the reviews had assessed the evidence with the GRADE system, which usually requires studies to be randomised controlled trials (RCTs) in order to be considered high-quality. (Researchers often use established systems for assessing evidence, in order to make the process as objective as possible.)

In fact, the reviews assessed the evidence using a modified version of the Newcastle-Ottawa scale, which doesn’t require studies to be RCTs to be considered high-quality. Indeed, Professor Catherine Hewitt, who worked on the reviews, later told More Or Less that they didn’t actually find any relevant RCTs at all [3:44].

Stonewall also published a statement on the evidence supporting the Cass report, in which it said “clarity is needed” over the report’s methodology and the way it handled studies that did not include a control group. The statement did not repeat the claim about 100 studies not being included.

The statement was later edited in several ways to remove the demand for clarity and the reference to control groups, and add the line: “We also welcome Dr Cass's recent clarification of the evidence considered in the review.”

On 20 April, Stonewall sent us another email saying that it had met with Dr Cass three days earlier, and that she had clarified that the moderate-quality studies were considered.

Then on 21 April it sent us a statement for publication, saying: “We are grateful to Dr Cass for taking the time to clarify that both 'high' and 'moderate' quality research were considered as part of the evidence review, both in the media and directly to trans and LGBTQ+ organisations.” The charity later told us that it had not intended to mislead with its earlier statements.

Finally, on 22 April, Ms Butler’s office notified us that she had raised a point of order to correct the record in Parliament. In a subsequent post on X (formerly Twitter), Ms Butler said: “I inadvertently misled the House by quoting a figure from a Stonewall briefing.”

At the time of writing, Ms Butler’s post with the original claim has not been deleted.

Were these studies ‘included’?

As we’ve said, the 103 studies in question were found during two systematic reviews that were commissioned by Dr Cass to inform the report.

A systematic review is a type of scientific research that collects the available evidence on a given subject using a system for searching archives of published literature according to specific rules.

Across both reviews, all the studies were said to be “included” at the beginning of each review, and all were graded for the quality of their evidence and sorted into three categories: high (two), moderate (58) and low (43).
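The percentages quoted around these gradings can be checked with some simple arithmetic. The short Python sketch below is only an illustration: it uses the counts given above, and the names `grades` and `share` are made up for this example rather than taken from the reviews.

```python
# Counts of studies by quality grade, as reported across the two reviews.
grades = {"high": 2, "moderate": 58, "low": 43}

total = sum(grades.values())                        # all studies assessed
synthesised = grades["high"] + grades["moderate"]   # included in the synthesis
not_high = total - grades["high"]                   # not graded high-quality


def share(part, whole):
    """Return part/whole as a percentage, rounded to the nearest whole number."""
    return round(100 * part / whole)


print(f"Total studies assessed: {total}")
print(f"Included in synthesis: {synthesised} ({share(synthesised, total)}%)")
print(f"Left out of synthesis: {grades['low']} ({share(grades['low'], total)}%)")
print(f"Not graded high-quality: {not_high} ({share(not_high, total)}%)")
```

On these counts, 60 of the 103 studies (58%) fed into the synthesis, while 101 (98%) were not graded high-quality. These two different measures are easily confused.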

The quality of evidence is very important. A study may produce a clear finding, but if that finding is based, for example, on a very small group of patients, or a non-representative group, or a group that isn’t followed up for long enough, then the finding may not be reliable.

As Professor Hewitt told More Or Less: “If you include low-quality evidence it can tell you an answer and you’re not sure if you should believe that answer or not.” [4:55] Dr Cass told the programme: “This particular body of evidence is uniquely poor compared to almost any other body of evidence that the University of York has looked at.” [5:30]

So while it is true that the evidence in the moderate-quality studies was considered less reliable, it is misleading to say it was “not included” by these reviews, which describe their findings at length in their “Synthesis of outcomes” sections.

Evidence from moderate-quality studies also forms part of the reviews’ overall conclusions. For instance, the conclusion box on the first page of the review on hormones says: “Moderate quality evidence suggests mental health may be improved during treatment, but robust study is still required.”

And the review of puberty blockers says in the results section on its first page: “Synthesis of moderate-quality and high-quality studies showed consistent evidence demonstrating efficacy for suppressing puberty.”

The discussion sections towards the end of both reviews include references to many of the studies graded as moderate-quality, as does the Cass Review’s final report document.

Were 100 studies effectively excluded?

The systematic reviews were trying to find out what the available evidence said about the effect of puberty-blockers and hormones on children, and how reliable that evidence was.

The two brief conclusions of the papers are worth quoting in full:

“There are no high-quality studies using an appropriate study design that assess outcomes of puberty suppression in adolescents experiencing gender dysphoria/incongruence. No conclusions can be drawn about the effect on gender-related outcomes, psychological and psychosocial health, cognitive development or fertility. Bone health and height may be compromised during treatment. High-quality research and agreement on the core outcomes of puberty suppression are needed.”

systematic review, puberty blockers

“There is a lack of high-quality research assessing the outcomes of hormone interventions in adolescents experiencing gender dysphoria/incongruence, and few studies that undertake longterm follow-up. No conclusions can be drawn about the effect on gender-related outcomes, body satisfaction, psychosocial health, cognitive development or fertility. Uncertainty remains about the outcomes for height/growth, cardiometabolic and bone health. There is suggestive evidence from mainly pre–post studies that hormone treatment may improve psychological health although robust research with long-term follow-up is needed.”

systematic review, cross-sex hormones

Versions of these paragraphs also appear in the final Cass Review report.

So it is certainly true that grading so many studies low or moderate in quality resulted in conclusions that there is little high-quality evidence available. In this sense, it would be possible to argue that the evidence in these studies has been effectively ‘excluded’ from the most serious consideration.

But it is not true that this evidence has been excluded from all consideration, as the quality of every study was assessed. The Review itself says: “The approach to the assessment of study quality was the same as would be applied to other areas of clinical practice.”

It is also not true, as some online have claimed, that only RCTs were considered high-quality in these reviews. Neither of the two papers graded high was an RCT. On this point, the Review says: “There were no randomised control studies identified in the systematic reviews, but other types of studies were included if they were well designed and conducted.”

Similar claims have been shared widely online

It’s not clear where the idea first formed that 100 studies were not included in the Cass Review.

On 9 April, shortly before the final report was published, a clinical instructor at Harvard Law School, Alejandra Caraballo, posted on X that the Review “disregarded nearly all studies because they weren't double blinded controlled studies”. The musician Billy Bragg later made similar claims on X.

This may have arisen because NICE conducted two separate systematic reviews in 2020, which did use the GRADE system, under which evidence from RCTs starts at high quality. Ms Caraballo’s post included screenshots from the NICE reviews.

Some accounts on X also cited an article about the Cass Review in the BMJ which said “only one” of the studies in each of the papers was of “high quality” or “sufficiently high quality”. Some people seem to have thought this meant that only high-quality studies were included in the review.

A statement by the Professional Association for Transgender Health Aotearoa (PATHA) in New Zealand said on 11 April that “101 out of 103 studies were discarded”, and a statement from TransActual, a UK campaign group, said that the report’s findings rested on “excluding 98% of the relevant evidence” (with 101 being 98% of 103).

TransActual later removed this claim from the statement and, following contact from Full Fact, changed a line in its briefing document.

Full Fact approached Mr Bragg and PATHA for comment. Ms Caraballo told us she still believed the Cass Review “disregarded a substantial portion of the available medical evidence based on subjective criteria” but did not directly address our question about the reason she gave for some studies being “disregarded”.

Image courtesy of Richard Townshend

We took a stand for good information.

As detailed in our fact check, Dawn Butler has corrected the record in Parliament and Stonewall edited a statement on the evidence supporting the Cass report.

This article is part of our work fact checking potentially false pictures, videos and stories on Facebook. You can read more about this—and find out how to report Facebook content—here. For the purposes of that scheme, we’ve rated this claim as missing context because all 103 relevant studies were included in the review, of which 60 rated high- or moderate-quality were included in the synthesis of results.
