October 22, 2015

Religion and Science

Appendix A: About the Survey

The bulk of the analysis in this report stems from a Pew Research Center survey conducted by telephone with a national sample of adults (18 years of age or older) living in all 50 U.S. states and the District of Columbia. The results are based on 2,002 interviews (801 respondents were interviewed on a landline telephone and 1,201 were interviewed on a cellphone). Interviews were completed in English and Spanish by live, professionally trained interviewing staff at Princeton Data Source under the direction of Princeton Survey Research Associates International from Aug. 15 to Aug. 25, 2014.

Survey Design

A combination of landline and cell random digit dial (RDD) samples was used to reach a representative sample of all adults in the United States who have access to either a landline or cellular telephone. Both the landline and cell RDD samples were disproportionately stratified by county, based on estimated incidences of African-American and Hispanic respondents, to increase the representation of those groups. Within each stratum, phone numbers were drawn with equal probabilities. The landline samples were list-assisted and drawn from active blocks containing one or more residential listings; the cell samples were not list-assisted but were drawn through systematic sampling from dedicated wireless 100-blocks and shared-service 100-blocks with no directory-listed landline numbers.

Margin of sampling error

Statistical results are weighted to correct known demographic discrepancies, including the disproportionate stratification of the sample. The margins of error table shows the unweighted sample sizes and the error attributable to sampling that would be expected at the 95% level of confidence for different groups in the survey.

The survey’s margin of error is the largest 95% confidence interval for any estimated proportion based on the total sample – the one around 50%. For example, the margin of error for the entire sample is ±3.1 percentage points. This means that in 95 out of every 100 samples drawn using the same methodology, estimated proportions based on the entire sample will be no more than 3.1 percentage points away from their true values in the population. Sampling errors and statistical tests of significance used in this report take into account the effect of weighting. In addition to sampling error, one should bear in mind that question wording and practical difficulties in conducting surveys can introduce error or bias into the findings of opinion polls.
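The arithmetic behind the reported ±3.1 points can be sketched as below. The report does not publish its design effect, so the value near 2.0 used here is an assumption chosen to illustrate how weighting inflates the simple random-sample margin to roughly the reported figure.

```python
import math

def margin_of_error(n, p=0.5, z=1.96, design_effect=1.0):
    """95% margin of error for a proportion, optionally inflated by a
    design effect to reflect weighting and stratification."""
    return z * math.sqrt(p * (1 - p) / n) * math.sqrt(design_effect)

# Simple random sample of 2,002: about +/-2.2 percentage points.
print(round(margin_of_error(2002), 3))                     # → 0.022
# With an assumed design effect of 2.0 (not published in the report),
# the margin grows to roughly the reported +/-3.1 points.
print(round(margin_of_error(2002, design_effect=2.0), 3))  # → 0.031
```

The margin is computed at p = 0.5 because that is where the binomial variance p(1 − p) is largest, which is why the text calls it the largest confidence interval for any estimated proportion.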

Interviewing procedures

All interviews were conducted using a Computer Assisted Telephone Interviewing (CATI) system, which ensured that questions were asked in the proper sequence with appropriate skip patterns. CATI also allowed certain questions and certain answer choices to be rotated, eliminating potential biases from the sequencing of questions or answers.

For the landline sample, half of the time, interviewers asked to speak with the youngest adult male currently at home and the other half of the time asked to speak with the youngest adult female currently at home, based on a random rotation. If no respondent of the initially requested gender was available, interviewers asked to speak with the youngest adult of the opposite gender who was currently at home. For the cellphone sample, interviews were conducted with the person who answered the phone; interviewers verified that the person was an adult and could complete the call safely.

Both the landline and cell samples were released for interviewing in replicates, which are small random samples of each larger sample. Using replicates to control the release of the telephone numbers ensures that the complete call procedures are followed for all numbers dialed. As many as seven attempts were made to contact every sampled telephone number. The calls were staggered at varied times of day and days of the week (including at least one daytime call) to maximize the chances of making contact with a potential respondent.
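Releasing a sample in replicates amounts to partitioning the full list of sampled numbers into small random batches and dialing one batch at a time. A minimal sketch, with illustrative names and batch sizes (the report does not state the replicate size it used):

```python
import random

def release_replicates(sample_numbers, replicate_size, seed=0):
    """Split a sample of phone numbers into small random batches
    (replicates). Dialing one replicate at a time ensures every released
    number receives the full call procedure, e.g. up to seven attempts,
    before additional numbers are opened for interviewing."""
    rng = random.Random(seed)
    shuffled = list(sample_numbers)
    rng.shuffle(shuffled)
    return [shuffled[i:i + replicate_size]
            for i in range(0, len(shuffled), replicate_size)]
```

Because each replicate is itself a random subsample, interviewing can stop after any completed replicate without biasing the sample toward easy-to-reach numbers.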

Questionnaire development

Pew Research Center developed the questionnaire. The design of the questionnaire was informed by consultation with a number of staff at the Pew Research Center, senior staff of the American Association for the Advancement of Science (AAAS) and several outside advisers. Questionnaire development is an iterative process. A pilot study was conducted Aug. 5-6, 2014, with 101 adults living in the continental U.S. The sample was drawn from fresh RDD landline phone numbers (n=25) and a sample of cellphone numbers from respondents interviewed in recent RDD omnibus studies (n=76). The tested questionnaire included a number of open-ended questions to gauge what respondents had in mind when thinking about the positive and negative effects of science on society. As a final step, a traditional pretest was conducted Aug. 12, 2014, with 24 adults living in the continental U.S. The sample was drawn from fresh RDD landline phone numbers and a sample of cellphone numbers from respondents interviewed in recent RDD omnibus studies. The interviews were conducted in English under the direction of Princeton Survey Research Associates International. The interviews tested the questions planned for the study questionnaire in the full survey context. The final questionnaire lasted about 22 minutes, on average.


Weighting

Several stages of statistical adjustment, or weighting, are used to account for the complex nature of the sample design. The weights account for numerous factors including (1) the different, disproportionate probabilities of selection in each stratum, (2) the overlap of the landline and cell RDD sample frames and (3) differential nonresponse associated with sample demographics.

The first stage of weighting accounts for different probabilities of selection associated with the number of adults in each household and each respondent’s telephone status.7 This weighting also adjusts for the overlapping landline and cell RDD sample frames and the relative sizes of each frame and each sample. Due to the disproportionately stratified sample design, the first-stage weight was computed separately for each stratum in each sample frame.
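The logic of a first-stage weight can be sketched as the inverse of each respondent's approximate probability of selection. This is a textbook dual-frame adjustment, not the survey's exact formula: the frame sampling fractions below are illustrative inputs, and landline selection is divided among the adults in the household, while dual users, reachable through both frames, receive a lower weight.

```python
def first_stage_weight(has_landline, has_cell, n_adults,
                       p_frame_ll, p_frame_cell):
    """Base weight = 1 / (approximate probability of selection).

    p_frame_ll / p_frame_cell: chance a given landline or cell number is
    drawn from its frame (hypothetical sampling fractions, not the
    survey's actual ones). Within a landline household, one adult is
    selected, so that probability is split across n_adults.
    """
    p = 0.0
    if has_landline:
        p += p_frame_ll / n_adults
    if has_cell:
        p += p_frame_cell
    return 1.0 / p
```

With equal frame fractions, a dual-frame respondent ends up with a smaller weight than an otherwise identical landline-only respondent, which is the overlap adjustment the paragraph describes.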

After the first-stage weight adjustment, two rounds of poststratification were performed using an iterative technique known as raking. The raking matches the selected demographics to parameters from the U.S. Census Bureau’s 2012 American Community Survey data.8 The population density parameter was derived from 2010 census data. The telephone usage parameter came from an analysis of the July-December 2013 National Health Interview Survey.9 Raking was performed separately for those asked each form of the questionnaire using sample balancing, a special iterative sample-weighting program that simultaneously balances the distributions of all variables using a statistical technique called the Deming Algorithm. The raking corrects for differential nonresponse that is related to particular demographic characteristics of the sample. This weight ensures that the demographic characteristics of the sample closely approximate the demographic characteristics of the population.
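Raking (iterative proportional fitting) can be illustrated with a short, self-contained sketch. It cycles through the control variables, scaling each category's weights toward its population share until all margins match; the variable names, categories and targets below are illustrative, not the survey's actual control totals.

```python
def rake(weights, data, targets, n_iter=100, tol=1e-10):
    """One round of raking (iterative proportional fitting).

    weights: starting weight per respondent.
    data:    one dict per respondent mapping variable name -> category.
    targets: variable name -> {category: target population share}.
    """
    w = list(weights)
    for _ in range(n_iter):
        max_dev = 0.0
        for var, margin in targets.items():
            total = sum(w)
            # current weighted total in each category of this variable
            cur = {lvl: 0.0 for lvl in margin}
            for wi, row in zip(w, data):
                cur[row[var]] += wi
            max_dev = max(max_dev,
                          max(abs(cur[lvl] / total - margin[lvl])
                              for lvl in margin))
            # scale each category's weights up or down toward its target
            for i, row in enumerate(data):
                lvl = row[var]
                if cur[lvl] > 0:
                    w[i] *= margin[lvl] * total / cur[lvl]
        if max_dev < tol:
            break
    return w
```

Each pass leaves the most recently adjusted variable matched exactly and slightly disturbs the others, which is why the procedure must iterate until all margins agree at once.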

The first round of raking was done individually for three racial/ethnic groups (Hispanics, non-Hispanic blacks, and all other non-Hispanics). The variables matched to population parameters for each race/ethnicity group were gender, age, education and region. The variables matched to population parameters for Hispanic respondents also included nativity (U.S. born versus foreign born). The variables for other non-Hispanic respondents also included race (white race versus some other or mixed race).

A second round of poststratification raking was performed on the total sample for each form. Each form was raked to the following demographic variables: gender by age, gender by education, age by education, census region, race/ethnicity, population density and household telephone status (landline only, cellphone only, or both landline and cellphone).

About the Multivariate Regression Analyses

The regression analyses described in this report are based on the full sample of U.S. adults in the survey who provided a response on each topic. Results from many of these analyses are shown in the Pew Research report, “Americans, Politics and Science Issues;” results from other analyses described here are available upon request.

The analysis is based on the weighted sample, thus adjusting for differences in the probability of selection and for nonresponse differences across groups.10 Results are based on a 0.05 level of statistical significance. Respondents who said they did not know the answer to the dependent-variable question are omitted from each analysis. The independent variables used in each analysis are as follows: gender (women compared with men); race and ethnicity (non-Hispanic blacks, Hispanics and other or mixed race as compared with non-Hispanic whites); age; education (having a postgraduate degree, college degree or some college as compared with those having a high school degree or less education); science knowledge (those with more as compared with less knowledge about science, based on an index of six items); party affiliation (Republicans and leaning Republicans, and those with no affiliation who do not lean toward either party, as compared with Democrats and leaning Democrats); political ideology (conservatives and moderates as compared with liberals); frequency of worship attendance (comparing those attending services weekly or more often, and those attending monthly/yearly, with those who seldom/never attend); and religious affiliation. Religious affiliation variables include classification as an evangelical Protestant, mainline Protestant, Catholic, some other Christian (such as Mormon or Orthodox) or some other religion (such as Jewish, Muslim or Hindu), as compared with the religiously unaffiliated.

For several issues, separate analyses included the variables described above in addition to one or two other factors such as perceptions of scientific consensus about the topic.11

The total number of respondents in each analysis ranges from roughly 1,614 (when religious factors are included in the model) to a possible maximum of 2,002, depending on the number of missing responses to either an independent variable in the model or to the dependent variable. The dataset will be publicly available for secondary analysis through the Pew Research Center website in the coming months.

As with the earlier report, each conceptual factor of interest – in this report, either religious affiliation or frequency of religious service attendance – is classified as having a strong, medium or weak effect in explaining people’s views across the set of science-related topics. “Strong” factors are defined here as those with at least one statistically significant independent variable in the related set that is estimated to change the predicted probability of people’s views by at least one half of a standard deviation. “Medium” factors have a statistically significant predictor whose change in predicted probability is less than one half of a standard deviation of the independent variable. If no independent variable in the set meets the criteria for a strong or medium effect, the factor is classified as having a “weak” effect. Note, however, that if the only significant predictor in the set of religious affiliation variables was either other Christian or other religion, the factor was classified as weak. Similarly, if the only significant predictor in the set of religious service attendance variables was monthly/yearly service attendance, the factor was classified as weak.
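The core of this decision rule can be expressed compactly. The sketch below uses illustrative field names ('p', 'delta', 'sd') and omits the special downgrade cases for predictors such as other Christian or monthly/yearly attendance described above.

```python
def classify_factor(predictors, alpha=0.05):
    """Classify one conceptual factor as strong, medium or weak.

    Each predictor is a dict with:
      'p'     - significance level of the coefficient,
      'delta' - estimated change in predicted probability,
      'sd'    - standard deviation of the independent variable.
    Strong: a significant predictor whose |delta| is at least half an SD.
    Medium: a significant predictor below that threshold.
    Weak:   no significant predictor in the set.
    """
    significant = [pr for pr in predictors if pr["p"] < alpha]
    if any(abs(pr["delta"]) >= 0.5 * pr["sd"] for pr in significant):
        return "strong"
    if significant:
        return "medium"
    return "weak"
```

For instance, a significant predictor with a 0.25 change in predicted probability against an independent variable with SD 0.44 clears the 0.22 threshold and makes the factor strong, consistent with the 0.21-0.23 range cited below for the religion measures.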

These classifications are designed to help readers assess the broader patterns underlying public attitudes across a large set of topics, but they are, of course, dependent on the criteria used. Note that judging the relative effect size against the standard deviation of the independent variable means that independent variables with more variability require a greater change in predicted probability to be classified as strong than those with less variability. Measures of religious affiliation and frequency of religious service attendance have similar levels of variability; the change in predicted probability for either factor to be considered strong is between 0.21 and 0.23.

  7. Telephone status refers to whether respondents have only a landline telephone, only a cellphone, or both kinds of telephone.
  8. ACS analysis was based on all adults, excluding those living in institutional group quarters.
  9. See Blumberg, Stephen J., and Julian V. Luke. 2014. “Wireless Substitution: Early Release of Estimates From the National Health Interview Survey, July-December 2013.” National Center for Health Statistics.
  10. The analysis was conducted in Stata using the svy command to incorporate the survey weights. The changes in predicted probability were calculated using the prchange command in the SPost package developed by J. Scott Long and Jeremy Freese; calculations of changes in predicted probability hold all other factors at their unweighted means.
  11. We also ran a number of logistic regression analyses, not shown here, to test the degree to which the findings we present are consistent across alternative model specifications. For example, we ran models for the 21 dependent measures with the exact same set of independent factors.