Some context: Steve posted the URL of an article about a German district court prohibiting the circumcision of male Jewish infants, even when the circumcision is performed on religious grounds. Bnonn commented in the combox of Steve's post in support of the German court's decision and drew an analogy between male/female circumcision and male/female genital mutilation. A debate ensued.
I entered the fray in the combox of this post. I only made a few comments, to which Sarah took the time and effort to reply, but unfortunately life, the universe, and everything intervened, and regrettably I never ended up responding to her.
If people are interested, they can read all the past posts and comments here.
However, I'd now like to say, first of all, I'm sorry for my delay!
But secondly I'd like to try and rectify my non-response. Well, sorta. On the one hand, I'm afraid I won't be able to respond to every point made by Bnonn and Sarah in the past threads. I wish I could but unfortunately I don't think I have the time to respond to each point let alone to the cited medical journal papers. It'd be akin to writing a full-length book review to review each research study, at least if I wanted to do justice to the study. But on the other hand, I hope what I'll try to say will prove more helpful overall for everyone.
Lastly, I realize this sounds like a cop-out, but I should say upfront that I don't think I'll actually have much time to continue dialoguing beyond this post either. Once again, apologies in advance!
All that said:
- I'd like to raise only one specifically non-medical point if that's okay. My understanding is the medical advantages and/or disadvantages of circumcision are ultimately irrelevant for Orthodox Jews. For them the covenant of circumcision (brit milah) is a religious commandment and duty similar to eating kosher foods or wearing tzitzit on their tallit. So they'd seek to obey it regardless of its medical advantages and/or disadvantages. But more knowledgeable and intelligent minds can speak to this far better than I can.
- As far as I'm concerned, my contention has been and still is that the medical evidence against circumcision is inconclusive. I see both advantages and disadvantages in circumcision, but I don't see the disadvantages decisively favoring an anti-circumcision stance. Of course, I have no problem if I'm proven wrong. After all I'm happy to accept the best medical science on the issue. But this is my position at the present time.
- Since the debate over circumcision is based in large part on the medical literature, it's important to understand how to properly evaluate the medical literature. Ideally those of us citing the medical literature to support this or that point would all understand how to vet the medical literature prior to citing it, and therefore be able to focus on citing relevant literature and not cite studies which aren't actually relevant or which are flawed in some way. But unfortunately I can't see how one can achieve this ideal without at least some knowledge of and experience in research. It'd be quite unfair to expect someone who has never done research to waltz onto PubMed or to pick up a medical journal and immediately be able to distinguish between the merits and demerits of study x over and against the merits and demerits of study y. Especially when, for example, many if not most scientists spend much of their time and indeed careers honing these skills.
- Still, maybe it's at least possible to begin to learn how to think in categories which would better help one appreciate how to accurately assess medical research papers. That'll be my little goal now. Although I'd never think to call myself a researcher, I do have some knowledge about how it works, so I'll try to be a helpful guide in pointing people in the right direction. I'm certain many others could do a far better job. But since they're probably occupied with more important matters, I'm afraid people will have to make do with the likes of me. Consider this a (very) poor man's guide to evidence-based medicine (EBM)!
If people would like further and better information, I can say I've relied in part on a few articles collectively called "How to read a paper" from the BMJ. I haven't read them all, but what I've read seems to indicate this collection is a decent place to begin.
Of course, those familiar with EBM are more than welcome to weigh in, clarify, correct, expand, and so forth if they please. Also, I don't plan to cover statistics, not even basic statistics (e.g. standard deviation, normal distribution, null hypothesis, confidence intervals, p values, false positives/negatives, odds ratios, t-tests, chi-squared tests), except in passing, since I think that'd be too complicated for the purposes of this post. But statisticians are more than welcome to comment too.
- Without further ado:
a. The first step in answering a question to which we don't know the answer is to make sure to ask the right question!
In medicine, one can attempt to formulate an answerable question by breaking the question down into four components: person or population; intervention; comparison or control; and outcome. The person or population component asks who are the relevant people. The intervention component asks what is the drug or procedure or test that one is seeking to perform. The comparison or control component asks what is the alternative to the intervention that one is considering. And the outcome component asks what one hopes to accomplish for the person or population.
Let's take the example of circumcision. The person or population involved would be male infants or better yet male infants in a particular group. The intervention would be circumcising these male infants. The comparison would be not circumcising these male infants. And the outcome would be to lower the risk of or prevent a particular disease such as penile cancer or HIV/AIDS or syphilis.
A question one could ask based on the above would be: Among male infants in sub-Saharan Africa, does circumcision help lower the risk of HIV infection in comparison to non-circumcision? Maybe we could be more specific and include, say, common types of circumcision procedures such as the Plastibell device, Gomco clamp, or Mogen clamp. But I don't wish to make this too detailed at the moment. Again I just want to get the gist of it across.
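For those who find a bit of code easier to follow than prose, here's a minimal sketch in Python of the four-component question described above. The class name and field names are my own inventions for illustration, not any sort of standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicoQuestion:
    """The four components of an answerable clinical question."""
    population: str    # who are the relevant people?
    intervention: str  # what drug, procedure, or test is being considered?
    comparison: str    # what is the alternative to the intervention?
    outcome: str       # what do we hope to accomplish?

    def as_sentence(self) -> str:
        # Assemble the components into a single research question.
        return (
            f"Among {self.population}, does {self.intervention} "
            f"{self.outcome} in comparison to {self.comparison}?"
        )


question = PicoQuestion(
    population="male infants in sub-Saharan Africa",
    intervention="circumcision",
    comparison="non-circumcision",
    outcome="help lower the risk of HIV infection",
)
print(question.as_sentence())
```

If any one of the four fields is hard to fill in, that's usually a sign the question isn't yet specific enough to be answerable.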
b. With this in mind, let's move on to study design. Study design is vital because a poorly designed study could give poor or misleading results.
There are various ways one can design a study in order to address a medical research question. For starters we can distinguish between two types of study designs - primary and secondary. A primary study is a study which attempts to research a specific issue. It's what most published medical research is all about. Primary studies are also known as empirical studies. A secondary study, however, is a study which attempts to summarize, make inferences from, and perhaps ask further questions about the primary study or studies it is reviewing.
There are three broad types of primary studies: experiments, clinical trials, and surveys. Experiments can be thought of as studies performed in a controlled and artificial environment such as a laboratory, often on animals. Clinical trials can be thought of as studies which administer an intervention such as a drug to one group of patients and a placebo to a different but equivalent group to see whether the intervention makes a difference. And surveys can be thought of as studies which involve the measurement of something or some effect among a group of people.
There are four broad types of secondary studies: reviews, guidelines, decision analyses, and economic analyses. Reviews include systematic reviews and meta-analyses. Systematic reviews and meta-analyses are rigorous studies which look at all the relevant published primary literature on a topic such as the circumcision of male infants in sub-Saharan Africa lowering the risk of HIV infection. Meta-analyses integrate numerical data from all the relevant published primary literature on a topic. Guidelines make inferences from primary studies about what clinicians should do. Decision analyses make use of primary studies to generate probability trees which can be used to make choices about clinical treatment and management. Economic analyses use the results of primary studies to say whether it is a good use of resources to follow a particular course of action.
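To give a flavor of what it can mean for a meta-analysis to "integrate numerical data," here's a minimal sketch of one common approach, fixed-effect inverse-variance pooling. I'm not claiming any particular meta-analysis does it exactly this way, and the three effect estimates below are made-up numbers purely for illustration.

```python
import math


def pool_fixed_effect(estimates, standard_errors):
    """Combine study estimates, weighting each by 1 / (standard error squared)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical log risk ratios and standard errors from three imaginary studies.
log_risk_ratios = [-0.50, -0.30, -0.70]
standard_errors = [0.20, 0.10, 0.40]

pooled, pooled_se = pool_fixed_effect(log_risk_ratios, standard_errors)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled log risk ratio: {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
```

The point to notice is that more precise studies (those with smaller standard errors) count for more in the pooled result, so one large, careful study can outweigh several small ones.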
c. We can further distinguish between study designs: cross-sectional vs. longitudinal studies.
Cross-sectional studies are studies which are carried out at a particular point in time, for example a survey like a population census.
Longitudinal studies follow a group of people over time. A group of people can be termed a cohort, which is a group of people with common features or experiences (e.g. a birth cohort is a group of people from a particular period of time such as baby boomers or Gen Y). Longitudinal studies can be prospective or retrospective. Prospective longitudinal cohort studies follow a group of people forward from a particular point in time. Retrospective longitudinal cohort studies look back at what has already happened to a group of people, for instance by going through existing records.
d. There are other types of studies as well. For example, there are case reports, which generally speaking are published studies on a particular patient. Since case reports usually involve only one patient, they are considered anecdotal evidence for the most part.
Also, there are qualitative or interpretative studies which look at non-numerical aspects or those which cannot be easily quantified. These are a bit harder to pigeon-hole in terms of significance to medical practice. An example might be sexual satisfaction in circumcised vs. uncircumcised men.
e. At this point I should note many medical papers or studies can fail in several respects. Common reasons include:
- The study doesn't address an important scientific issue.
- The study was not original.
- The study did not test the stated hypothesis.
- The study should've used a different study design, one more appropriate to answering the question posed by the study.
- The study encountered practical problems like failing to recruit participants which led the researchers to compromise the study in some fashion.
- The study had too small a sample size to yield reliable results.
- The study was poorly controlled or uncontrolled.
- The study's researchers made unjustifiable conclusions based on the study.
- The study involved a significant conflict of interest (e.g. financial gain to be made by a researcher or by a sponsor of the research if a particular outcome of the study resulted; bias was poorly guarded against).
- The study involved or failed to minimize systematic bias.
- The study was not continued for a long enough period to make the results credible.
- The study violated an ethical standard.
And so on and so forth.
f. Let's turn to validity and bias.
The validity of a study is the extent to which the study and/or the results of the study are likely to be true and free of bias. As such, bias can affect the validity of a study. Bias is the introduction of systematic error into a study which distorts the final results.
There are two types of validity - internal and external. Internal validity refers to how well the study has been run and reported on. External validity concerns whether the results of a study can be applied to a specific patient or situation.
Validity might ask questions like whether the two groups (e.g. circumcised vs. uncircumcised for sexual satisfaction) were representative and comparable, whether the outcome measurements were accurate, and whether there was a placebo effect. For the groups to be representative and comparable, researchers would look into factors like age, gender, socioeconomic group, disease status, relevant risk factors, etc. For the outcome measurements to be accurate, researchers would look into factors like measurement bias and measurement error.
There are various types of biases. Observer bias is when one observer or researcher consistently under-reports or over-reports a particular variable. Selection bias is when people selected for the study aren't actually representative of the population to which the results should apply. Information bias is when measurements are incorrectly recorded. Confounding bias is the presence of factors or variables which are unevenly distributed between the two groups under study and which in turn influence the effect that is being studied.
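Confounding is probably the hardest of these to picture, so here's a small worked example in Python. The counts are entirely hypothetical and chosen by me to make the arithmetic easy. The "exposed" and "unexposed" labels stand for any two groups being compared, and "low-risk" and "high-risk" stand for any confounding factor that is unevenly distributed between them.

```python
# Hypothetical counts: (people with the outcome, total people) per stratum and group.
COUNTS = {
    "low-risk":  {"exposed": (10, 100),  "unexposed": (40, 400)},
    "high-risk": {"exposed": (160, 400), "unexposed": (40, 100)},
}


def risk(events, total):
    return events / total


def stratum_risk_ratios(counts):
    """Risk ratio (exposed vs. unexposed) within each stratum of the confounder."""
    return {
        stratum: risk(*groups["exposed"]) / risk(*groups["unexposed"])
        for stratum, groups in counts.items()
    }


def crude_risk_ratio(counts):
    """Risk ratio when the strata are lumped together and the confounder is ignored."""
    totals = {}
    for group in ("exposed", "unexposed"):
        events = sum(groups[group][0] for groups in counts.values())
        people = sum(groups[group][1] for groups in counts.values())
        totals[group] = risk(events, people)
    return totals["exposed"] / totals["unexposed"]


print(stratum_risk_ratios(COUNTS))
print(crude_risk_ratio(COUNTS))
```

Within each stratum the two groups have identical risk (10% vs. 10%, and 40% vs. 40%), so the exposure makes no difference at all. Yet lumped together the exposed group looks roughly twice as badly off (170/500 vs. 80/500), purely because it contains more high-risk people. That's the sort of distortion a study has to guard against.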
g. Many factors can affect validity and bias a study. For example, were the patients selected similar enough to one another to be relevant?
Were the patients who were studied randomized? How were they randomized?
Is the study blinded? If so, who was blinded - the participants, the analysts, the investigators/clinicians? FYI: Single-blinded usually means the patient or the investigator is unaware of the treatment assignment. Double-blinded usually means both the patient and the investigator are unaware of the treatment assignment. Triple-blinded usually means the patient, the investigator, and those that adjudicate the study like the analysts or those who are monitoring the study on behalf of the investigators or researchers are unaware of the treatment assignment. Open is where everyone is aware of the treatment assignment.
Was the study properly followed up on? What was the duration of follow-up? Was the follow-up complete?
Were the patients analyzed in their original allocated groups?
And so on and so forth.
h. People will just have to take my word for the following at this point since I think it'd be too much to delve into medical statistics here. P values are a measure of probability. The magic number in a study is a p value ≤ 0.05. Roughly speaking, a p value is the probability of seeing a result at least as extreme as the one observed if there were in fact no real effect (i.e. if the null hypothesis were true). So if a p value is ≤ 0.05, then in a world where the intervention did nothing, a result like this would turn up by chance alone no more than 5% of the time, or about 1 time in 20 (100/5). Strictly speaking this isn't the same as saying there's a less than 5% probability that the result was due to chance, though it's often loosely put that way. To reiterate, a p value ≤ 0.05 is the magic number that's conventionally considered statistically significant in most medical studies.
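For anyone who'd like to see the 0.05 threshold in action, here's a minimal simulation sketch in Python. It runs a couple thousand imaginary studies in which the two groups are drawn from exactly the same distribution, so there is no real effect to find, and counts how often p ≤ 0.05 turns up anyway. The function names, the sample sizes, and the choice of a simple z-test with a known standard deviation are all my own assumptions for the sake of illustration.

```python
import random
from statistics import NormalDist, mean


def two_sided_p_value(group_a, group_b, sigma=1.0):
    """Two-sample z-test p value, assuming a known standard deviation sigma."""
    n_a, n_b = len(group_a), len(group_b)
    standard_error = sigma * (1.0 / n_a + 1.0 / n_b) ** 0.5
    z = (mean(group_a) - mean(group_b)) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_studies=2000, n_per_group=30, alpha=0.05, seed=0):
    """Fraction of no-effect studies that still reach p <= alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Both groups come from the same distribution: the null hypothesis is true.
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if two_sided_p_value(a, b) <= alpha:
            hits += 1
    return hits / n_studies


print(f"share of no-effect studies with p <= 0.05: {false_positive_rate():.3f}")
```

The share should land somewhere near 0.05, which is the point: even when nothing is going on, roughly 1 study in 20 will look "significant." That's one reason a single study with p ≤ 0.05 shouldn't be treated as the final word.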
i. Studies can vary in significance in terms of relevance to the practice of medicine. Here is the hierarchy of the various studies in order of most significant to medical practice at the top to least significant at the bottom:
- Systematic reviews and meta-analyses
- Randomized controlled trials
- Crossover studies
- Cohort studies
- Case-control studies
- Cross-sectional surveys
- Case reports
j. Finally, I should like to note that the Cochrane Library is the single best resource for EBM. It's not comprehensive on every conceivable medical topic, but it does contain several hundred thousand controlled trials. Physicians, medical scientists, and other researchers routinely use Cochrane for research (along with other resources).
- Anyway I hope this helps give people some sense of what is involved when attempting to undertake a study. Again I realize it's a mere smattering of relevant introductory concepts in EBM. And again EBM experts and statisticians are more than welcome to correct, clarify, and so forth. But perhaps it's at least enough to give people a better idea of how to begin to assess a medical scientific study or paper.
- Let's close by asking how some of this might apply to a particular study.
As I've said, I won't have time to thoroughly review a specific study. But perhaps I can make one or two quick comments to help people think about such things for themselves.
Sarah helpfully pointed out a study:
I believe I included the "fine-touch pressure threshholds" study in my sources
I'm not exactly sure if this study is the same study Sarah had in mind when she made the comment. But it's what I found. In any case, whether or not Sarah meant to refer to the study, I'll try to use it to illustrate a couple of the things I talked about above.
a. If we scroll down to the conflicts of interest bit, here's what it says:
None declared. Source of funding: National Organization of Circumcision Information Resource Centers. The director of National Organization of Circumcision Information Resources Centers (MFM) was involved in the design and conduct of the study; collection and interpretation of the data; and review, or approval of the manuscript.
Now if we go to the National Organization of Circumcision Information Resource Centers website, it's an organization which advocates intactivism:
Dedicated to making a safer world, NOCIRC is a 501(c)(3) educational non-profit organization committed to securing the birthright of male, female, and intersex children and babies to keep their sex organs intact. On March 15, 1986, a group of healthcare professionals in the San Francisco Bay Area, led by Marilyn Milos RN, announced the (1985) founding of the National Organization of Circumcision Information Resource Centers (NOCIRC); the first national clearinghouse in the United States for information about circumcision. In its first decade, NOCIRC grew into an international network and now has more than 110 centers worldwide.
At the moment I'm not suggesting a study funded by a particular organization can't be a good study. But the fact that it was funded by NOCIRC, which is a known anti-circumcision organization, and that its director was "involved in the design and conduct of the study" would raise a couple of question marks. I would think many would want to ask further questions about issues like conflict of interest (even though they apparently declared none) and bias for starters, no?
b. The study is meant to measure "fine-touch pressure thresholds in the adult penis." In order to do this a Semmes-Weinstein monofilament touch-test was employed.
There are different types of tests. For example, there are screening tests vs. diagnostic tests. The stated objective of the study, which relied on the Semmes-Weinstein monofilament touch-test, is:
To map the fine-touch pressure thresholds of the adult penis in circumcised and uncircumcised men, and to compare the two populations.
Is the Semmes-Weinstein monofilament touch-test able to sufficiently meet this objective?
Furthermore, the conclusion gained from the study is stated as:
The glans of the circumcised penis is less sensitive to fine touch than the glans of the uncircumcised penis. The transitional region from the external to the internal prepuce is the most sensitive region of the uncircumcised penis and more sensitive than the most sensitive region of the circumcised penis. Circumcision ablates the most sensitive parts of the penis.
But among other questions, one wonders: First, is the "map" accurate? Second, is the "map" representative of the male population? Third, how do the conclusions necessarily follow from mapping the fine-touch pressure thresholds of the adult penis?
Not all tests are necessarily equally relevant. For example, a urine test wouldn't necessarily be relevant to diagnosing a blood-borne virus. An EKG (for the heart) wouldn't be relevant to diagnosing a brain tumor. Is the Semmes-Weinstein monofilament touch-test relevant to detecting and likewise accurately measuring penile sensitivity?
BTW, here is a (French) video of a monofilament in action. It's used to test diabetic neuropathy in the feet. Maybe it's a perfectly suitable test for diabetic neuropathy in the feet. But what's perfectly suitable in one part of the body isn't necessarily perfectly suitable in another part of the body.
Even if it is, is having the glans penis lightly poked with this monofilament (which as I understand it is a nylon filament calibrated to bend once a set amount of force is applied) relevant to penile sensitivity during sexual intercourse?
Different men could experience sensation differently. I could be wrong but this seems to me to call for a more qualitative study.
Neuronal stimulus is one aspect in male sexual stimulation. But we shouldn't forget the psychic element in male sexual stimulation. For example, nocturnal emissions during dreams and without neuronal stimulus can occur in many adolescent males.
c. Other obvious questions we could ask are how representative the sample studied was and how well the study was designed (study methodology). But it looks like this article has already addressed such questions as well as others.
- People might consider applying some of the aforementioned EBM concepts to papers like the ones found here.