ReCAPP Research Glossary
To have an influence on. For example, nutrition affects our health.
A factor that precedes and is associated with a specific outcome — but does not necessarily cause the outcome. For example, living in a community with a high unemployment rate is an antecedent of adolescent pregnancy.
The relationship of the occurrence of two events, without evidence that the first event being investigated actually causes the second event. For example, malaria occurs in warm climates with proper breeding conditions for certain types of mosquitoes. Those conditions are in association with the spread of malaria. The actual cause of malaria is the malaria parasite.
BDI Logic Model
BDI Logic Model stands for "Behavior-Determinants-Intervention" logic model. Such models are diagrams that identify the causal relationships among interventions, the determinants of behaviors, the behaviors themselves, and one or more health goals. Because the process for creating BDI logic models involves specifying first the health goal, then the behaviors affecting that health goal, then the determinants of those behaviors, and finally interventions that can markedly affect those determinants, the "B" comes before the "D" which comes before the "I" in the name.
That quality of a measurement device that tends to result in a misrepresentation of what is being measured in a particular direction. For example, the questionnaire item "Don't you agree that the president is doing a good job?" would be biased in that it would generally encourage more favorable responses.
Analyzing the association between two variables. For example, the bivariate analysis found that exposure to secondhand smoke and lung cancer are positively associated.
Case Control Study
An experimental approach in research where two groups of subjects are studied and compared: an experimental group and a control group.
An in-depth exploration of one particular case (situation or subject) for the purpose of gaining depth of understanding into the issues being investigated. For example, out of 30 in-depth interviews, one may be singled out for a case study.
The relation between a cause and its effects. If the relation is high, we say the causality is high. For instance, there is a high causality between having unprotected intercourse with someone infected with Chlamydia and becoming infected with Chlamydia.
The chi-square test is performed to test whether two variables can be considered statistically independent. When the chi-square statistic is large enough that its P-value falls below a predetermined significance level (such as 0.05), the null hypothesis of independence is rejected.
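As a rough illustration of this test, the sketch below computes the chi-square statistic for a 2x2 table of counts using only the Python standard library. The counts and the function name `chi_square_2x2` are hypothetical, and the P-value shortcut is valid only for one degree of freedom, which is the case for a 2x2 table.

```python
import math

def chi_square_2x2(table):
    """Return (chi-square statistic, P-value) for a 2x2 table of counts.

    Assumes one degree of freedom, which holds only for a 2x2 table.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows are two groups, columns are outcome yes/no.
table = [[30, 10], [20, 40]]
chi2, p_value = chi_square_2x2(table)
print(round(chi2, 2), p_value < 0.05)  # 16.67 True
```

With these made-up counts the P-value falls below 0.05, so the null hypothesis of independence would be rejected at that level.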
A cohort is a group of beings (commonly people) who are surveyed or studied over a period of time. Typically, data are collected from a cohort at multiple points in time, such as before and after an intervention.
The degree of certainty that a statistical prediction is accurate. Generally, a confidence level of 95% to 99% is considered acceptable.
A variable that can be expressed by a large (sometimes infinite) number of score values. For example, height, temperature, and grade point average are continuous variables. Contrast with dichotomous variable.
A group of subjects to whom no experimental stimulus (e.g., a health promotion program) is administered and who should resemble the experimental group in all other respects. The comparison of the control group and the experimental group at the end of the experiment points to the effect of the experimental stimulus.
A measure of association between two variables. It measures how strongly the variables are related, or change, with each other. If two variables tend to move up or down together, they are said to be positively correlated. If they tend to move in opposite directions, they are said to be negatively correlated.
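One common way to quantify this is Pearson's correlation coefficient, which ranges from -1 to +1. The glossary does not name a specific coefficient, so the choice of Pearson's r here is an assumption, and the data below are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: one series rises with x, the other falls.
x = [1, 2, 3, 4, 5]
rises_with_x = [2, 4, 6, 8, 10]
falls_with_x = [10, 8, 6, 4, 2]

r_positive = pearson_r(x, rises_with_x)  # close to +1: positively correlated
r_negative = pearson_r(x, falls_with_x)  # close to -1: negatively correlated
```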
The process of assessing and comparing both the costs (financial cost, resources needed, etc.) and the benefits of a course of action (such as an intervention or a research study) in order to determine if it is a desired course of action. For example, the cost-benefit analysis of the condom dissemination program found the program to be cost efficient — the cost of 75 cents per participating teen was far less than the cost of otherwise expected unintended pregnancies.
Cross Sectional Study
A study that is based on observations representing a single point in time. Contrast with longitudinal study.
Information collected and organized for analysis or decision-making.
The logical model in which specific expectations of hypotheses are developed on the basis of general principles. For example, starting from the principle that all college deans are mean people, you might anticipate that your dean won't let you change courses. The anticipation would be the result of deduction. Contrast with induction.
That variable that is assumed to depend on or be caused by another variable (called the independent variable). For example, if you find that teen pregnancy is partly a function of the amount of formal education received by teens, then teen pregnancy is being treated as a dependent variable. Formal education is the independent variable.
Determinants (also called "behavioral determinants") are the factors that have a causal influence on some outcome. For example, "being in love" or "going with someone" are determinants or factors that affect the initiation of sex among people, and both the availability of alcohol and perceived peer norms about alcohol use are determinants or factors that affect adolescent drinking. Determinants can include both risk and protective factors. Determinants differ slightly from antecedents. Antecedents must be related to some outcome and must logically precede that outcome, but they do not necessarily cause the outcome. In contrast, determinants imply causality.
A categorical variable that can place subjects into only two groups, such as male/female, alive/dead, or pass/fail.
Factors are considered more distal from an outcome if they are logically more distant from that outcome. For example, community poverty is a factor that affects teen pregnancy rates. However, community poverty is conceptually quite distinct from teen pregnancy and is therefore considered distal. In contrast, motivation to avoid pregnancy is also related to teen pregnancy rates, but it is closer conceptually and is therefore considered more proximal.
Something brought about by a cause or agent — a result. For example, slurred speech is a common effect of drinking too much alcohol.
Programs are considered evidence-based if there exists good evidence that they have a positive impact on the outcomes that they are designed to change. For example, some sex and HIV education programs are considered evidence-based because their impact on sexual or contraceptive behavior has been carefully measured with experimental designs, and these studies produced strong evidence that the programs actually changed behavior in a desirable direction.
A research design used to investigate cause-and-effect relationships between interventions and outcomes. Experimental designs are controlled studies that use random assignment to place subjects in control groups and experimental groups and then compare the differences in outcomes.
A group of subjects to whom an experimental stimulus (e.g., a health education program) is administered and who should resemble the control group in all other respects. The comparison of the control group and the experimental group (also called the "treatment group") at the end of the experiment points to the effect of the experimental stimulus.
Research that looks for patterns, ideas, or hypotheses rather than trying to test or confirm hypotheses. For example, one exploratory research project interviewed 100 adoptive parents to learn what their common struggles and issues were in raising adoptive children.
The number of times a number is multiplied by itself, usually written as an exponent. For example, 8³ is 8 to the third power and means 8 × 8 × 8 = 512.
In intervention research, fidelity commonly refers to the extent to which an intervention is implemented as intended by the designers of the intervention. Thus, fidelity refers not only to whether or not all the intervention components and activities were actually implemented, but whether they were implemented in the proper manner.
A written/typed record of events and observations kept by a researcher. For example, a child development researcher might keep field notes as she observes toddlers interacting in a play group.
A qualitative research technique in which an experienced moderator leads a group of respondents (usually 8-12 persons) through an informal discussion of a selected problem or issue, allowing group members to talk freely about their thoughts, opinions, feelings, attitudes, and misconceptions about the issue.
Formative research (also called "formative evaluation") is research that is conducted for the primary purpose of improving the quality of the intervention. This may be in contrast to research that is conducted to determine how an intervention was implemented (process evaluation) or whether the intervention had intended effects (outcome evaluation). Because formative research is designed to improve the intervention, it is often conducted during the development of an intervention and during the first few years of implementation.
The number of occurrences of a specified event within a given interval.
The extent to which research findings can be applied to more than the specific observations upon which they are based. Sometimes this involves the generalization of findings from a sample to a population.
Health behavior refers to any type of behavior that has an impact on the health of the beings involved. For example, avoiding sex or always using contraception are health behaviors that affect the health goal of avoiding unintended pregnancy.
A testable prediction of a relationship between two or more factors. An example of a hypothesis is: "If an adolescent girl has an older sister who gets pregnant as a teen, she is more likely to get pregnant."
The classical approach to assessing the statistical significance of findings. Basically it involves comparing observed sample findings with theoretically expected findings (the hypothesis). This comparison allows one to compute the probability that the observed outcome could have been due to chance alone. The comparison is also used to decide whether a null hypothesis should be rejected.
The number of new cases of a defined condition that occur during a specified period of time in a defined population. Contrast with prevalence.
An independent variable is presumed to cause or determine a dependent variable. For example, if we discover that dancing ability is partly a function of hours of dance class, then "hours of dance class" is the independent variable, and dancing ability is the dependent variable. Note that any given variable might be treated as independent in one part of an analysis and dependent in another part of an analysis. Dancing ability might become an independent variable in the explanation of cardiovascular health.
The logical model in which general principles are developed from specific observations. See also deduction.
The extent to which the results of a study can be attributed to the treatment, rather than flaws in the research design. In other words, the degree to which one can draw valid conclusions about the causal effects of one variable on another. Contrast with external validity.
Level of Significance
More fully, the level of statistical significance. The probability (abbreviated "p") that a result would be produced by chance (sampling or random error). The lower the p, the less likely it is that the result arose from chance or error, and the more likely the finding is statistically significant.
A type of response format used in surveys developed by Rensis Likert. Likert items have responses on a continuum and response categories such as "strongly agree," "agree," "disagree," and "strongly disagree."
A relationship between two variables that can be described by a straight line when variable values are plotted on a graph. The more the plotted points tend to fall along a straight line, the stronger the linear relationship.
Logistic Regression Analysis
A kind of regression analysis commonly used when the dependent variable is dichotomous (e.g., "yes" and "no" scored 0 or 1). This type of analysis is useful to predict whether something will happen or not (e.g., had sex or used contraception).
A study design involving the collection of data at different points in time (e.g., three months, six months, and 12 months after an intervention). Contrast with cross sectional study.
A research design in which subjects are matched on characteristics that might affect their reaction to a treatment. For example, once pairs of matched subjects are determined, one member of each pair is assigned to a group receiving treatment (experimental group), and the other is assigned to the control group and does not receive treatment. A study that uses random assignment to place its subjects is considered more rigorous than a study that uses matching.
An average, computed by totaling the values of several observations and dividing by the number of observations. For example, if the ages of five men are 16, 17, 20, 54, and 88, the mean age of the men would be 39. Compare to median and mode.
A measure of central tendency representing the value of the "middle" case in a rank-ordered set of observations. For example, if the ages of five men are 16, 17, 20, 54, and 88, the median age would be 20. Compare to mean and mode.
Mediating outcomes are the effects of an intervention that, in turn, have an impact upon other even more important phenomena. For example, an HIV education program may increase knowledge and change values about early initiation of sex (mediating outcomes), effects which, in turn, lead to a delay in the initiation of sex.
The combining of data from several different research studies to gain a better overview of a topic than what was available in any single investigation. Data obtained from combined studies must be comparable in order to be evaluated by this method.
A measure of central tendency representing the most frequently observed value or attribute. For example, if the ages of five adolescents are 16, 17, 17, 18, and 19, the mode age would be 17. Compare to mean and median.
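The three measures of central tendency (mean, median, and mode) can be checked against the glossary's own age examples with Python's standard `statistics` module. This is only a small sketch, and the variable names are illustrative.

```python
import statistics

# Ages used in the mean and median examples.
ages_men = [16, 17, 20, 54, 88]
# Ages used in the mode example.
ages_adolescents = [16, 17, 17, 18, 19]

mean_age = statistics.mean(ages_men)            # 39
median_age = statistics.median(ages_men)        # 20
modal_age = statistics.mode(ages_adolescents)   # 17
```

Note how the two oldest men pull the mean (39) well above the median (20), which is why the median is often preferred for skewed data.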
The analysis of simultaneous relationships among several variables. Examining simultaneously the effects of age, sex, and social class on sexual debut would be an example of multivariate analysis.
Describing two events, conditions, or variables which cannot occur at once. For example, subjects in a study cannot be both female and male, HIV positive and HIV negative, pregnant and not pregnant, for those are mutually exclusive categories. They could, however, be both female and HIV positive because those are not mutually exclusive.
A process of identifying problems and needs in a target population to make decisions, set priorities, set objectives, and explore alternative approaches or methods to aid in the planning and implementation of programs. Needs assessments can be conducted using a variety of tools including surveys, interviews and focus groups.
The null hypothesis represents a theory that has been put forward, whether because it is believed to be true or because it is to be used as the basis of argument, but has not been proved. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug.
The quality of relating to facts or conditions without distortion by personal feelings, prejudices, or interpretations. Objectivity is an extremely important quality for researchers to possess. Objectivity allows researchers to make conclusions based on the data which may directly oppose their own personal beliefs.
The odds ratio is a measure of association in which a value of "1.0" means that there is no relationship between variables. The value of an odds ratio can be less than or greater than 1.0. The size of any relationship is measured by the difference (in either direction) from 1.0. An odds ratio less than 1.0 indicates an inverse or negative association. An odds ratio greater than 1.0 indicates a positive relation.
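As a minimal sketch of how an odds ratio is computed from a 2x2 table, the function below divides the odds of an outcome in one group by the odds in another. The function name, argument names, and counts are all hypothetical.

```python
def odds_ratio(exposed_yes, exposed_no, unexposed_yes, unexposed_no):
    """Odds of the outcome among the exposed divided by odds among the unexposed."""
    odds_exposed = exposed_yes / exposed_no
    odds_unexposed = unexposed_yes / unexposed_no
    return odds_exposed / odds_unexposed

# Hypothetical counts: 30 of 40 exposed and 20 of 60 unexposed had the outcome.
positive_association = odds_ratio(30, 10, 20, 40)   # 6.0, greater than 1.0
# Swapping the groups gives the inverse association.
negative_association = odds_ratio(20, 40, 30, 10)   # about 0.17, less than 1.0
# Identical odds in both groups give no relationship.
no_association = odds_ratio(10, 10, 10, 10)         # 1.0
```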
A description or set of criteria for defining a variable or condition with objectivity. For example, the operational definition of an obese person could be one who weighs more than 120% of his or her "ideal weight" as defined by an insurance company chart.
Outcome evaluation is a type of evaluation that measures the effects of an intervention. Typically, the emphasis is on the measurement of desired intended effects, but sometimes the impact on possible negative effects is also measured. Outcome evaluation is in contrast to formative research or process evaluation. Sometimes, but not always, people distinguish "impact evaluation," which measures effects on short-term mediators, from outcome evaluation, which measures effects on longer-term health goals.
A collaborative approach to conducting research or program evaluations where the research process is controlled by the people in the program or community — not solely the professional researchers. The purpose of this type of shared inquiry is to educate the people involved in the program and improve the nature of the practice being studied. Participatory research has been used with great success in schools and in international programs.
The P-value, or probability value, of a statistical hypothesis test (e.g., chi-square test) is the probability of getting a value of the test statistic as extreme as or more extreme than that observed by chance alone, if the null hypothesis is true. Small p-values suggest that the null hypothesis is unlikely to be true. The smaller the p-value, the more convincing the rejection of the null hypothesis.
The number of units with a certain characteristic divided by the total number of units in the sample and multiplied by 100. For example, the percentage that represents five out of 20 boys is 5 divided by 20 x 100, which is 25%.
A group of persons (or institutions, events or other subjects of study) that one wishes to describe or wishes to generalize about. The number of members in the population is generally referred to as N. In contrast, n generally refers to the number of members in a sample drawn from the population.
A pretest is a test given (or measurement taken) before an experimental treatment begins. A posttest is given after the experimental treatment. By contrasting the results of the pretest with those of the posttest, researchers gain evidence about the effects of the treatment.
The total number of cases of a defined condition present in a specific population at a given time. Contrast with incidence.
A probability provides a quantitative description of the likely occurrence of a particular event. Probability is conventionally expressed on a scale of zero to one. A rare event has a probability close to zero. A very common event has a probability close to one. The probability of drawing a spade from a pack of 52 well-shuffled playing cards is 13 spades divided by 52 cards, or .25.
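The playing-card example can be reproduced with Python's `fractions` module, which keeps the ratio exact before converting it to a decimal. This is just a small sketch with illustrative variable names.

```python
from fractions import Fraction

spades = 13
cards_in_pack = 52

# Probability = favorable outcomes divided by all equally likely outcomes.
p_spade = Fraction(spades, cards_in_pack)   # reduces to 1/4
p_spade_decimal = float(p_spade)            # 0.25
```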
Random selection procedures used to ensure that each unit of the sample is chosen on the basis of chance. All units of the study population should have an equal, or at least a known, chance of being included in the sample.
Prompts used during interviews to assist respondents in answering the interview questions. For example, when asked what he liked about being a health educator, the respondent said, "I like feeling like I help people." The interviewer used the follow-up probing question, "How do you think you have helped people as a health educator?" to get a more specific response.
Process evaluation is a type of evaluation that measures the implementation of an intervention. For example, it may assess the extent to which the components and activities of an intervention were actually implemented, the qualities of the implementation, the number of people who participated, and participants' reaction to the intervention.
The ratio of the number of cases possessing a property (numerator) to the total number of cases observed (denominator). For example, if 5 out of 20 girls participated in a workshop, the proportion of girls participating in the workshop is 5 divided by 20, which is .25.
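A proportion and a percentage describe the same quantity on different scales. The short sketch below uses the counts from the example above (5 out of 20); the variable names are illustrative.

```python
participants = 5
total_girls = 20

# Proportion: cases with the property divided by all cases observed.
proportion = participants / total_girls          # 0.25
# Percentage: the same ratio expressed per 100.
percentage = 100 * participants / total_girls    # 25.0
```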
Any factor whose presence is associated with an increased protection from a disease or condition. For example, good school performance is a protective factor against adolescent pregnancy.
Factors are considered more proximal to an outcome if they are logically closer to that outcome. Compare to distal.
The examination and interpretation of non-numerical observations for the purpose of discovering underlying meanings and patterns of relationships. For example, analyzing training participants' written comments on how a training could be improved is a form of qualitative analysis.
The numerical representation and manipulation of observations for the purpose of describing and explaining the phenomena that those observations reflect. For example, analyzing training participants' ratings (between one and five) of the overall training is a form of quantitative analysis.
Quasi Experimental Design
Quasi experimental design is similar to experimental design, but does not include the random assignment of subjects. The quasi experimental design is, therefore, not as strong, and it's much harder to establish causal relationships between events and conditions.
A sampling technique in which a group of subjects is selected for a study from a larger group (population). Each individual is chosen entirely by chance, and each member of the population has a known, but possibly non-equal, chance of being included in the sample. By using random sampling, the likelihood of bias is reduced and true experimental design can be achieved.
The quantity, amount or degree of something being measured in a specific period of time. An example of a rate is the teen pregnancy rate which is usually expressed in the number of pregnant teens per 10,000 teens within one year's time.
Numerical expression which indicates the relationship in quantity, amount or size. For example, in a class with one teacher and 15 students, the teacher to student ratio is one to 15.
A data analysis approach that is used to predict one variable by knowing one or more other variables. Regression analysis is used to answer such questions as "How well can I predict the values of one variable, such as frequency of sex (Y), by knowing the values of another variable, such as attitudes about sex and/or having an older boyfriend/girlfriend (X)?"
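The simplest case, predicting Y from a single X with a straight line, can be sketched with ordinary least squares. The glossary does not specify a fitting method, so least squares is an assumption here, and the function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data in which y is exactly 2 * x + 1.
x_values = [1, 2, 3, 4]
y_values = [3, 5, 7, 9]

slope, intercept = fit_line(x_values, y_values)   # 2.0 and 1.0
predicted_y = slope * 5 + intercept               # prediction for x = 5
```

Real data rarely fall exactly on a line, so in practice the fitted line summarizes a trend rather than reproducing every observation.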
The quality of measurement method that suggests that the same data would have been collected each time in repeated observations of the same phenomenon. In the context of a survey, we would expect that the question "Did you attend church last week?" would have higher reliability than the question "About how many times have you attended church in your life?" Compare with validity.
The percentage of responses received to a particular survey, question, or other measurement tool. For example, if 100 surveys were mailed out and 80 surveys were completed and returned, the response rate for the survey would be 80%. If only 40 out of the 80 respondents answered question five, the response rate for question five would be 50%.
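The two response rates in this example use different denominators: surveys mailed for the survey as a whole, and surveys returned for a single question. A minimal sketch, with a hypothetical helper name:

```python
def response_rate(responses, possible):
    """Percentage of possible responses that were actually received."""
    return 100 * responses / possible

surveys_mailed = 100
surveys_returned = 80
answered_question_five = 40

survey_rate = response_rate(surveys_returned, surveys_mailed)                   # 80.0
question_five_rate = response_rate(answered_question_five, surveys_returned)    # 50.0
```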
Research is considered rigorous if the methods of the research study are designed and implemented so that very strong evidence is produced to substantiate the study's conclusions. For example, a study would probably be considered rigorous if it employed an experimental design with random assignment and measurement of possible outcomes both before (pretest survey) and well after (post-test or follow-up survey), had a large sample size, measured outcomes reliably, and was conducted with proper statistical analyses.
The selection of a number of study subjects from a defined study population.
A type of research in which data collected by others are reanalyzed. For example, Dr. More is using data from the three West Coast studies on sexually transmitted disease incidence in youth to study trends in syphilis incidence in California.
A non-probability sampling method often employed in field research. In snowball sampling, each person interviewed may be asked to suggest additional people for interviewing.
A measure of the spread or dispersion of a set of data. The more widely the values are spread out, the larger the standard deviation. For example, given two separate lists of exam results from a class of 30 students in which one list ranges from 31% to 98% and the other from 82% to 93%, then the standard deviation would be larger for the results of the first exam. Standard deviation is calculated by taking the square root of the variance.
A quantity that is calculated from a sample of data. It is used to give information about unknown values in the corresponding population. For example, the average of the data in a sample is a statistic used to give information about the overall average in the population from which the sample was drawn.
A gauge of the sensitivity of a statistical test, that is, its ability to detect relationships. Specifically, the probability of rejecting a null hypothesis when it is false — and therefore should be rejected. In general, the statistical power increases with your sample size. Also called "Power" or a "Test."
A general term referring to the unlikeliness that relationships observed in a sample could be attributed to sampling error alone. It is customary to describe one's finding as statistically significant when the obtained result is among those that, theoretically, would occur no more than five out of 100 times when the only factors operating are the chance variations that occur whenever random samples are drawn.
A statement or set of statements designed to explain a phenomenon or class of phenomena. For example, Social Learning Theory describes how human behavior is a product of environmental, social and personal factors.
Using more than one method to study the same thing. For example, if you were interested in people's attitudes toward environmental issues, you could look at patterns of voting behaviors for environmental candidates and issues; or you could interview leaders of the Sierra Club, the Nature Conservancy, and similar groups; or you could conduct a survey of a representative sample of the entire population. Or you could do all three and put the results together, in which case you could say that you had used a research strategy of triangulation.
Type I Error
In a hypothesis test, a Type I error occurs when the null hypothesis is rejected when it is, in fact, true. That is, it is wrongly rejected. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug. A Type I error would occur if a conclusion was made that the two drugs produced different effects when, in fact, there was no difference between them. A Type I error is often considered to be more serious, and therefore more important to avoid than a Type II error.
Type II Error
In a hypothesis test, a Type II error occurs when the null hypothesis is not rejected when it is, in fact, false. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug. A Type II error would occur if it were concluded that the two drugs produced the same effect when, in fact, they produced different ones. Contrast with Type I error.
The study, analysis, or classification of things based on types.
The quality of a measurement tool that suggests the tool accurately reflects the concept that it is intended to measure. For example, your IQ would seem a more valid measure of your intelligence than would the number of hours you spend in the library. It is important to realize that the ultimate validity of a measure can never be proven. Compare to reliability.
The measure of the "spread" of a distribution of random variables (or observations) about its average value. The larger the variance, the more scattered the observations on average. Taking the square root of the variance produces the standard deviation.
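The link between variance and standard deviation can be shown with Python's `statistics` module. The scores below are hypothetical, and the sketch assumes the population formulas (dividing by N); sample formulas divide by N - 1 and give slightly larger values.

```python
import math
import statistics

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: average squared distance from the mean.
variance = statistics.pvariance(scores)   # 4
# Standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)             # 2.0
```

`statistics.pstdev(scores)` returns the same standard deviation directly.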
Any information given different weight in calculations. For example, a final examination counts twice as much as (is weighted double) the midterm.
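The exam example can be worked through as a weighted average. The scores below are hypothetical; only the 2-to-1 weighting of the final over the midterm comes from the example above.

```python
def weighted_average(values, weights):
    """Average in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical scores: midterm 80, final 90.
exam_scores = [80, 90]
exam_weights = [1, 2]   # the final counts twice as much as the midterm

course_grade = weighted_average(exam_scores, exam_weights)   # 260 / 3
simple_average = sum(exam_scores) / len(exam_scores)         # 85.0
```

Because the higher final score carries more weight, the weighted grade lands above the simple average.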
This glossary was developed with contributions from the following resources:
- The Practice of Social Research, by Earl Babbie. Wadsworth Publishing, 1986.
- Internet Glossary of Statistical Terms by Howard S. Hoffman of Bryn Mawr College
- Hyper Stat Glossary
- STATLETS User Manual, NWP Associates, Inc., copyright 1997
- Statistics Glossary by Valerie J. Easton & John H. McColl
- Dictionary of Statistics and Methodology: A Nontechnical Guide for the Social Sciences (2nd ed.) by Paul W. Vogt, Thousand Oaks: Sage Publications, 1999.
- Dictionary of Public Health Promotion and Education: Terms and Concepts by Naomi N. Modeste, Thousand Oaks: Sage Publications, 1996.