Studying Second-Generation Immigrants: Methodological Challenges and Innovative Solutions
By Douglas D. Heckathorn
How many second-generation children are not fluent in English? Which ones
have earned college degrees? Why have members of the second generation chosen
certain types of occupations and not others?
These questions are not only interesting to researchers but also relevant
for policymakers. In order to study the US-born children of immigrants, commonly
called the second generation, researchers need both demographic information and
qualitative information that can only be learned through surveys and interviews.
Although imperfect, demographic information is readily available from the
US Census Bureau. In contrast, researchers have no easy "list" they
can use to find and contact second-generation immigrants they would like to
survey or interview.
The first part of this article will discuss Census Bureau data and the second
part will examine ways to survey and interview the second generation, with
a particular focus on a relatively new methodology called respondent-driven
sampling.
The US Census Bureau provides three types of data relevant to studying the
second generation: the decennial census, the American Community Survey (ACS),
and the Current Population Survey (CPS).
The decennial census, last conducted in 2000, aimed to reach every person
in the United States, regardless of their status. The 2000 census asked respondents
for their country of birth but did not ask for their parents' country
of birth. As a result, the 2000 census did not identify the number of adults
born in the United States who have one or more foreign-born parents. Therefore,
the 2000 census can only tell researchers about second-generation members who
still live with their parents; the majority of this population is under age
On a positive note, the 2000 census provides detailed information about the
children's parents. The education level and income of parents, for instance,
can help researchers understand trends among the youngest members of the second
generation.
Meant to provide up-to-date statistical "snapshots" of communities
between decennial census years, ACS was fully rolled out in 2005 and will be
conducted each year through 2010 and beyond. ACS, which is sent to 250,000
addresses per month, does not have as large a sample as the decennial
census but will have collected enough information by the summer of 2010 to
report data on individual census tracts, the smallest geographic unit.
Like the decennial census, ACS does not ask for parents' country
of birth and thus can only be used to gather information about children of
immigrants who live with their parents.
The following information about the second generation is available from 2000
census and 2005 ACS data:
- Where the children of immigrants and their parents live (state and certain
levels of geography for 2000 census; areas with populations of 65,000 or more
for 2005 ACS)
- Ages of children and parents
- Country of origin of children's parents
- Year in which the parents arrived in the United States
- Level of self-reported English ability of the children and their parents
- Grade level of children
- Parents' employment
- Parents' occupation
- Parents' education level
- Parents' income level and whether they are above or below the
federal poverty line
CPS, specifically the March supplement, does ask respondents about their parents' country
of birth. This makes it possible for researchers to obtain information about
members of the second generation of any age. However, second-generation adults
who have established their own households cannot be "matched" with
their immigrant parents, and thus nothing can be said about parents' characteristics.
It must also be noted that CPS surveys only 50,000 households per month, a
far smaller sample than ACS. Consequently, data for any single year can be
analyzed only at the national level. By pooling several years of CPS data,
researchers can increase the sample size enough to conduct analysis at the
state or large metropolitan area level. However, the sample would still be too
small to examine the characteristics of, for example, second-generation
Dominican adults in a particular suburb.
The following information about the second generation is available from the
CPS March supplement:
- Marital status
- Education level
- Income level and whether the individual is above or below the federal poverty
line
- Welfare status
Interview and Survey Methodology
If a researcher is interested in surveying foreign-born Chinese parents and
their US-born children in a particular New York City neighborhood, census data
can only be so helpful. By law, the Census Bureau must protect and keep confidential
the information respondents provide. In other words, researchers cannot
obtain from the Census Bureau the addresses or phone numbers of those who meet
the research criteria.
Indeed, a challenge to the study of second-generation immigrants is the lack
of a comprehensive public list, termed a "sampling frame," from
which representative samples can be drawn.
In contrast, general population surveys can draw on telephone records, property
tax rolls, voter registrations, and other public lists of residents or residences. Similarly,
studies of special groups such as physicians or lawyers can use lists of those
who hold professional licenses. However, no comparable lists exist for
immigrants, including the second generation.
Of course, lists can be constructed based on general population surveys, but
in some settings this is infeasible because the target population (e.g.,
immigrants from a particular country or region) is such a small part of the
general population that costs would be prohibitive. Another reason, also relevant
to immigrants and their children, is that some groups' social networks
are difficult for outsiders to penetrate.
For all these reasons, immigrants are an example of what is now termed a "hidden" or
"hard-to-reach" population. The importance of developing means
for sampling these populations has been recognized for several decades because
these populations are important to many research areas, including arts and
culture, public policy, and public health.
Existing approaches to sampling hard-to-reach populations have their problems.
The first approach relies on institutional records to find population members.
However, using such records has limitations because institutions do not draw
their members at random.
Voluntary associations, such as social clubs and professional associations,
tend to oversample the more fortunate within a population. For example, in
a study of jazz musicians, union members earned 50 percent to 100 percent more
than nonmembers, and they were nearly 10 years older.
In contrast, it is well known that involuntary institutions, such as prisons
and jails, tend to oversample the dispossessed. Similarly, location-based
samples are valid only for geographically concentrated populations. Samples
of ethnic communities, for example, miss those who live in other communities.
Despite these limitations, samples drawn from an institution or location provide
a valid statistical basis for generalizing to the entire institution or location. However,
this provides a valid sample only of that nonrandom portion of the population
that is accessible via institutions or locations.
The second approach to sampling hidden populations relies on social networks,
as in snowball sampling (referrals from initial subjects generate additional
subjects) and other chain-referral methods. These methods are appealing
because respondents are reached through connections to relatives, friends,
and acquaintances, and hence the sample can reach even those who lack institutional
affiliations or those who reside outside of ethnic communities.
Chain-referral methods also tend to reduce nonresponse bias, because respondents are referred
by those with whom they already have trusting relationships. This is
especially important when studying vulnerable or stigmatized groups, such as
unauthorized immigrants. Consequently, network-based samples have more
comprehensive coverage than institutional or location samples.
However, chain-referral methods have been regarded as convenience rather than
probability sampling methods because of biases inherent in snowball-type
approaches. One is the oversampling of those who are well connected (i.e.,
those with larger personal networks), since more recruitment paths lead to
them. Biases also result when some groups recruit more effectively, and hence
their distinctive recruitment patterns shape the sample.
Owing to these biases, results from a chain-referral sample cannot be validly
generalized to the population from which the sample was drawn. Hence
the dilemma: statistical validity with limited coverage of the target population,
or broader coverage but conclusions that cannot be generalized.
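The oversampling of well-connected people can be made concrete with a small calculation. The sketch below assumes an idealized chain-referral process in which each person is reached with probability proportional to personal network size; the function names and the network sizes are hypothetical illustration values, not data from any study.

```python
from fractions import Fraction


def mean_degree(degrees):
    """Plain average network size across the whole population."""
    return Fraction(sum(degrees), len(degrees))


def expected_sample_mean_degree(degrees):
    """Expected network size of a sampled person when the chance of being
    reached is proportional to network size (an idealizing assumption)."""
    return Fraction(sum(d * d for d in degrees), sum(degrees))


# Hypothetical population: eight people with 2 contacts each, two with 20.
population_degrees = [2, 2, 2, 2, 2, 2, 2, 2, 20, 20]

print("population mean network size:", float(mean_degree(population_degrees)))
print("expected mean in the sample: ",
      float(expected_sample_mean_degree(population_degrees)))
```

Under these assumptions the average network size among sampled people is well above the population average, which is why unweighted chain-referral results overstate whatever traits the well-connected share.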
Respondent-Driven Sampling: A New Approach
Respondent-driven sampling (RDS) resolves this dilemma by converting chain-referral
into a probability sampling method, thereby providing the means for combining
broad coverage of the target population with the ability to generalize study
results to the population from which the sample was drawn. This method has
been used to study jazz musicians and Vietnam War-era draft resisters, and
in more than 20 other countries to study intravenous drug users, gay men, prostitutes,
and street youth.
In RDS, as in other snowball-type samples, respondents recruit peers,
who then recruit their friends and acquaintances who qualify for entry into
the sample, who in turn recruit their peers, so that the sample expands through
successive waves of peer recruitment.
Tests of RDS have shown that if referral chains are sufficiently long — that
is, if the chain-referral process consists of enough waves or cycles of recruitment — the
composition of the final sample with respect to key characteristics and behaviors
will become independent of the seeds from which it began. To create long
chains, respondents need to be recruited by their peers rather than by researchers. Also, the researchers need to set a recruitment quota so a few respondents cannot do all the recruiting.
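The independence from seeds can be illustrated by treating peer recruitment as a Markov chain, in which the group of each recruit depends only on the group of the recruiter. The sketch below is a simplified two-group model; the transition probabilities and function names are hypothetical and chosen only to show the convergence.

```python
def next_wave(composition, transitions):
    """Advance the expected composition by one recruitment wave."""
    groups = list(transitions)
    return {
        g: sum(composition[r] * transitions[r][g] for r in groups)
        for g in groups
    }


def composition_after(seed_group, transitions, waves):
    """Expected composition of a wave when every seed is from one group."""
    composition = {g: 1.0 if g == seed_group else 0.0 for g in transitions}
    for _ in range(waves):
        composition = next_wave(composition, transitions)
    return composition


# Hypothetical transition probabilities:
# outer key = recruiter's group, inner key = recruit's group.
transitions = {
    "first": {"first": 0.6, "second": 0.4},
    "second": {"first": 0.3, "second": 0.7},
}

for waves in (1, 3, 6, 12):
    from_first = composition_after("first", transitions, waves)
    from_second = composition_after("second", transitions, waves)
    print(waves, round(from_first["first"], 4), round(from_second["first"], 4))
```

With these illustrative numbers, the share of first-generation recruits in a wave approaches the same value (3/7) whether all seeds were first generation or all were second generation, which is the sense in which long chains wash out the choice of seeds.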
The researchers keep track of who recruited whom and their numbers of social
contacts. A mathematical model of the recruitment process then weights
the sample to compensate for nonrandom recruitment patterns, thereby producing
statistically unbiased results.
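One way such a weighting model can work is sketched here for two groups, A and B. It assumes ties are reciprocal, so the number of ties running from A to B equals the number running from B to A; population shares then follow from the cross-group recruitment proportions and the mean network sizes. The function name and all input values are hypothetical, and this is a simplified illustration rather than the full estimation procedure.

```python
def estimate_group_a_share(s_ab, s_ba, mean_net_a, mean_net_b):
    """Estimate the population share of group A from recruitment data.

    s_ab: share of A's recruits who belong to B
    s_ba: share of B's recruits who belong to A
    mean_net_a, mean_net_b: mean personal network sizes of A and B

    Reciprocity assumption: P_a * N_a * S_ab == P_b * N_b * S_ba,
    combined with P_a + P_b == 1.
    """
    numerator = s_ba * mean_net_b
    return numerator / (numerator + s_ab * mean_net_a)


# Hypothetical inputs: group A has smaller networks than group B.
share_a = estimate_group_a_share(s_ab=0.4, s_ba=0.3,
                                 mean_net_a=10.0, mean_net_b=20.0)

# Same recruitment pattern, but pretending both groups have equal networks.
share_a_equal_nets = estimate_group_a_share(s_ab=0.4, s_ba=0.3,
                                            mean_net_a=10.0, mean_net_b=10.0)

print("adjusted for network size:", round(share_a, 3))
print("ignoring network size:    ", round(share_a_equal_nets, 3))
```

In this illustration group A is harder to reach because its members have fewer contacts, so accounting for network size raises its estimated share relative to what the recruitment pattern alone would suggest.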
RDS analyses can also provide information on the social network connections
among respondents. In the case of the Chicago Latino data set, compiled by
Jesus Ramirez-Valles in 2004, it is possible to measure immigrant groups' insularity
(see Table 1). Here insularity is measured by the homophily index (the degree
to which people tend to form ties with others who resemble them).
Table 1. Recruitment by Immigration Status (Recruitment Count; Transition Probability)
Rows: immigration status of the person who recruited. Columns: immigration
status of the recruit (first generation, number; second generation, number),
total distribution of recruits, and mean network size.
The first generation is the most insular, with a homophily index of .32. This
indicates that 32 percent of the time its members form a tie to another member of
the first generation, and the rest of the time they form ties consistent with random
mixing (i.e., forming ties without regard to immigration status). Natives
have a similar index of .26, so they are also substantially insular. In
contrast, the second generation has a minimal index of .10, indicating that
it serves as a bridge connecting the first generation to natives because 90
percent of its ties are formed irrespective of immigration status.
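The interpretation of the index given above can be written out as a short calculation. The sketch below assumes the usual definition in which the index compares a group's share of in-group ties with its share of the population; the function names and the 50 percent population share are hypothetical, since the article does not report the underlying population shares.

```python
def homophily_index(tie_share, population_share):
    """Homophily index for one group.

    tie_share: share of the group's ties that stay inside the group
    population_share: the group's share of the population (must be
    strictly between 0 and 1)

    Returns 0 under random mixing, 1 when every tie is in-group, and a
    negative value when in-group ties are rarer than chance.
    """
    difference = tie_share - population_share
    if difference >= 0:
        return difference / (1 - population_share)
    return difference / population_share


def implied_in_group_share(h, population_share):
    """In-group tie share implied by a non-negative index h: a fraction h
    of ties is formed in-group, the rest by random mixing."""
    return h + (1 - h) * population_share


# Hypothetical: a group that makes up half the population with index .32.
share = implied_in_group_share(0.32, 0.5)
print("implied in-group tie share:", round(share, 2))
print("index recovered from it:   ", round(homophily_index(share, 0.5), 2))
```

Read this way, an index of .32 does not mean that only 32 percent of ties are in-group: the randomly formed remainder also lands inside the group in proportion to the group's size.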
The applicability of RDS to the study of an immigrant group depends on the density
of ties through which its members are linked. For studies of the second generation,
the empirical question is whether ties among them are dense enough to sustain
a robust chain-referral process. If not, members of the first generation
or natives may also have to be included in the sampling frame to provide indirect
links among members of the second generation. Establishing a sense of
trust, important in other RDS studies, will be equally important in RDS studies
of immigrant groups.
Abdul-Quader, Abu S., Douglas D. Heckathorn, Courtney McKnight, Heidi Bramson,
Chris Nemeth, Keith Sabin, Kathleen Gallagher, and Don C. Des Jarlais. 2006. "Effectiveness
of Respondent Driven Sampling for Recruiting Drug Users in New York City: Findings
From a Pilot Study." Journal of Urban Health, 83: 459-476.
Erickson, Bonnie H. 1979. "Some Problems of Inference from Chain Data." Sociological
Heckathorn, Douglas D. 1997. "Respondent-Driven Sampling: A New Approach to the
Study of Hidden Populations." Social Problems 44:174–99.
Heckathorn, Douglas D. 2002. "Respondent-Driven Sampling II: Deriving Statistically
Valid Population Estimates from Chain-Referral Samples of Hidden Populations."
Social Problems 49:11–34.
Heckathorn, Douglas D., and Joan Jeffri. 2003. "Social Networks of Jazz Musicians," pp.
48-61 in Changing the Beat: A Study of the Worklife of Jazz Musicians, Volume
III: Respondent-Driven Sampling: Survey Results by the Research Center for Arts
and Culture, National Endowment for the Arts Research Division Report #43, Washington
Kalton, Graham. 1983. Introduction to Survey Sampling. Newbury Park, CA: Sage
Ramirez-Valles, Jesus, Douglas D. Heckathorn, Raquel Vázquez, Rafael
M. Diaz, and Richard T. Campbell. 2005. "From Networks to Populations: The Development
and Application of Respondent-Driven Sampling Among IDUs and Latino Gay Men." AIDS
and Behavior, 9(4):387-402.
Salganik, Matthew J. and Douglas D. Heckathorn. 2004. "Sampling and Estimation
in Hidden Populations Using Respondent-Driven Sampling." Sociological Methodology,
Sudman, Seymour, and Graham Kalton. 1986. "New Developments in the Sampling of
Special Populations." Annual Review of Sociology 12:401–29.
Thompson, S. K. and O. Frank. 2000. "Model-Based Estimation with Link-Tracing Sampling
Designs." Survey Methodology, 26(1):87-98.
2002-2013 Migration Policy Institute.
All rights reserved.
Migration Information Source, ISSN 1946-4037