Opportunity Youth in Texas: Data & Research

Analyzing the Causes and Effects of Youth Disconnection in Texas

The Ray Marshall Center’s (RMC) research on opportunity youth (OY) in Texas employs novel data and methods to build knowledge around the causes and effects of youth disconnection. The ESTOY project also aims to provide actionable data that will help find youth who are likely to benefit from services or preventative measures, and to identify policies across time and administrative units that have resulted in better outcomes for youth. The RMC plans to continue improving this work by incorporating more recent data and linking our OY identification with recent strategies observed in campus- and district-level policies. We are also establishing data-sharing agreements with service providers in our four key regions to identify variation in rates and methods of engagement with youth, and we will draw on our rich panel data to estimate the causal impacts of this variation using matched counterfactual groups.

The current presentation focuses on our work to date in order to shed light on the types of insights we will be able to generate as we move forward with ESTOY. We present, in broad terms, how we use administrative data to identify opportunity youth and discuss how our methods will fill a gap in existing research.

Highlighted Data Research Results

Here are four recent projects:

1. Mapping

2.

3.

4.

ESTOY’s Cohort-Based Method Using Linked Administrative Data

The research highlighted here aims to build generalizable knowledge about OY and to provide specific answers about rates and patterns of OY disconnection in Texas and our four regions. To do so, researchers at the RMC use data from UT Austin’s Education Research Center (ERC), which provides a rich, near-comprehensive view of the education and work histories of all youth attending public school in Texas who do not opt out. The ERC links primary and secondary education data with quarterly wage data from unemployment insurance records. We track cohorts of individuals on a quarterly basis across a range of behaviors and attributes related to opportunity youth status, education, and labor. Using these records, and making our best effort to account for individuals who leave Texas and thus leave the ERC’s records, we determine each youth’s OY status in every quarter by identifying periods in which they are absent from all administrative records for at least one quarter. Our definition of OY is “any youth not working or studying for at least three months”. Because of the granularity of this data, we are also able to identify thresholds of disconnection, such as persistent disconnection, which consists of four or more consecutive quarters (a year or more) of disconnection.
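The quarterly identification rule described above can be sketched as follows. This is a minimal illustration and not the project's code: the function name, the use of integer quarter indices, and the example person are all hypothetical, and the sketch ignores the attrition adjustment for people who leave Texas.

```python
from typing import List, Set


def quarterly_disconnection(
    n_quarters: int,
    enrolled: Set[int],
    employed: Set[int],
) -> List[bool]:
    """Return one flag per quarter.

    A quarter is flagged True when the person appears in neither the
    education records (`enrolled`) nor the wage records (`employed`).
    Quarter indices run from 0 to n_quarters - 1 (an assumed encoding).
    """
    return [
        q not in enrolled and q not in employed
        for q in range(n_quarters)
    ]


# Hypothetical person: enrolled in quarters 0-3, wages in quarters 3 and 6.
flags = quarterly_disconnection(8, enrolled={0, 1, 2, 3}, employed={3, 6})
print(flags)
```

In this example the person is flagged as disconnected in quarters 4, 5, and 7, and as connected in quarter 6 because a wage record exists for it.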

We map more than 100 variables onto a balanced quarterly panel that includes 329,000 individuals per age cohort for ~52 quarters (approximately 1.7 billion cells of data per age cohort, or 329,000 individuals × 52 quarters × 100 variables, so far, with more to come). We aggregate these person-level data points to analyze geographic (i.e. county, school district, campus) and demographic (i.e. race/ethnicity, secondary socioeconomic status) outcomes over time. The result is a novel data set that offers a uniquely detailed view of OY status and the within-person upstream causes and downstream effects. Because we start with large-N, population-level estimates, this is the first time that researchers are able to examine small units that are typically infeasible for survey-based methods (rural units, campuses).

The test cohort is composed of all individuals in Texas public schools who are (expected to be) 16 years old on September 1, 2010 in the Texas Education Agency (TEA) data. We iterate through each year of secondary education and collect the identifiers (replacements for PEIMS and SSN) for all cohort members (n ≈ 329,000). This includes, for example, 15-year-olds in 2009 and an individual who moved to Texas at age 17 (or any subsequent age) and is observed in any Texas public school on September 1, 2011 (or any subsequent September 1st). Cohort members are also placed into mutually exclusive county/district/campus cohorts, based on their final appearance in the secondary education data (i.e. graduation or dropout).
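The two assignment rules in this paragraph can be illustrated with a short sketch. It assumes a birth date is available for each person, which may not match how expected age is actually derived in the TEA data; the function names and record layout are hypothetical.

```python
from datetime import date
from typing import List, Tuple

REFERENCE_DATE = date(2010, 9, 1)  # cohort reference date from the text
COHORT_AGE = 16


def age_on(birth_date: date, on: date) -> int:
    """Age in whole years on the given date."""
    years = on.year - birth_date.year
    # Subtract one if the birthday has not yet occurred in that year.
    if (on.month, on.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def in_cohort(birth_date: date) -> bool:
    """True if the person is 16 on September 1, 2010, whenever observed."""
    return age_on(birth_date, REFERENCE_DATE) == COHORT_AGE


def final_campus(records: List[Tuple[date, str]]) -> str:
    """Campus at the last appearance in secondary education data.

    `records` holds (snapshot_date, campus_id) pairs (assumed layout).
    """
    return max(records, key=lambda record: record[0])[1]


# A student who arrives in Texas at 17 still belongs to the cohort,
# because membership depends on age at the reference date only.
print(in_cohort(date(1994, 3, 15)))
print(final_campus([(date(2011, 9, 1), "campus_A"), (date(2012, 9, 1), "campus_B")]))
```

Because the campus assignment uses only the final appearance, each cohort member lands in exactly one campus group, which keeps the groups mutually exclusive.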

The figure below visually describes the key measurement points involved in cohort assignment for the current study cohort. For those 297,308 youth who are visible in the administrative data, we continue to track them through all post-secondary and wage data until we catch up to the most recently released ERC data.

ESTOY Cohort Design: Counts and Rates of Individuals in the 2010/2011 Cohort of 16- to 24-Year-Olds


While the large number of observations and detailed panel data allow us to answer questions that have eluded past researchers, there are some limitations to using administrative data confined to the borders of Texas. We cannot track individuals who leave for private schooling or homeschooling, so we exclude students with those listed “exit reasons” in the TEA data. Additionally, we cannot track individuals who leave Texas (except for post-secondary study), and their absence from the administrative data would be misidentified as disconnection. For this reason, we prospectively check four years into the future from each quarter and move any individual who does not appear in the data for this duration into an attrition group, which is removed from both the numerators and denominators for all calculations presented on the site. A final limitation that we face, along with the numerous researchers using unemployment insurance (UI) data, is that we cannot observe self-employment or gig work, as those earnings are generally not covered by UI wage reporting. However, if those individuals are long-term, self-employed workers without other wage labor, the attrition adjustments should identify and remove them from the sample, and if the gig work is in addition to a wage position or study, they are still correctly identified as working.
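The four-year look-ahead can be sketched as a scan for the first quarter that begins 16 consecutive quarters with no records. This is one possible reading of the rule, not the project's implementation: the function name and the choice to flag only when a full 16-quarter window is available are assumptions.

```python
from typing import List, Optional

LOOKAHEAD_QUARTERS = 16  # four years, as described in the text


def attrition_start(
    observed: List[bool],
    lookahead: int = LOOKAHEAD_QUARTERS,
) -> Optional[int]:
    """Return the first quarter that starts `lookahead` unobserved quarters.

    `observed[q]` is True when the person appears in any administrative
    record in quarter q. Returns None when no full window of absence
    exists, including when the panel is shorter than the window.
    """
    for start in range(len(observed) - lookahead + 1):
        if not any(observed[start:start + lookahead]):
            return start
    return None


# Hypothetical person: a short gap, one more record, then a long absence.
history = [True] * 6 + [False] * 3 + [True] + [False] * 20
print(attrition_start(history))
```

Under this sketch, the three-quarter gap counts as disconnection, while the quarters from the returned index onward would be treated as attrition and dropped from the calculations.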

While we plan to generate data on more recent cohorts, we believe this data is useful because the underlying socioeconomic factors causing these patterns move slowly in the aggregate. A single campus or district might be able to implement meaningful change quickly; however, the large aggregations in this analysis are likely to generalize to the present, especially given that more recent cohorts pass through the confounder of the COVID pandemic.

Key Concepts and Definitions

Opportunity youth: any individual aged 16 to 24 who is neither working nor studying. In our identification strategy, this is any individual who attends public school at any time during the panel and is neither working nor studying for at least three months but less than four years.

Persistent disconnection: any individual who is neither working nor studying for one full year or more (assuming non-attrition). We often focus on persistent disconnection because existing research tends to look at long time horizons, such as a year. Practically, this higher threshold also eliminates noise from momentary absences that, for many, do not represent meaningful streaks of disconnection. This measure is prospective in nature, meaning that for each youth, in every quarter, we look forward in the panel to check whether a current moment of disconnection ends up lasting a year. This properly allocates the start and stop of such periods.
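One way to picture the prospective check is as a run-length pass over a person's quarterly disconnection flags: every quarter that belongs to a run of at least four consecutive disconnected quarters is marked as persistent, starting from the first quarter of the run. The sketch below is illustrative only; the function name is hypothetical, and it treats a run cut short by the end of the panel as not persistent, which is an assumption.

```python
from typing import List, Optional

PERSISTENT_MIN_QUARTERS = 4  # one full year


def persistent_flags(
    disconnected: List[bool],
    min_run: int = PERSISTENT_MIN_QUARTERS,
) -> List[bool]:
    """Mark each quarter that lies inside a long-enough disconnection run."""
    flags = [False] * len(disconnected)
    run_start: Optional[int] = None
    # A trailing False acts as a sentinel that closes a run at panel end.
    for q, is_out in enumerate(disconnected + [False]):
        if is_out and run_start is None:
            run_start = q
        elif not is_out and run_start is not None:
            if q - run_start >= min_run:
                for i in range(run_start, q):
                    flags[i] = True
            run_start = None
    return flags


# Hypothetical history: a two-quarter gap, then a five-quarter gap.
history = [False, True, True, False, True, True, True, True, True, False]
print(persistent_flags(history))
```

Here only the five-quarter gap is marked, and it is marked from its first quarter, which is what allows the start and stop of a persistent period to be dated correctly.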

Retrospective, panel measures: Many of the analyses discussed on the site examine each individual’s entire panel of data to generate a summary statistic, especially those examining aggregated data, such as demographic groups, campuses, districts, or counties. For example, we analyze the importance of campus-level attributes by calculating the percent of youth from that campus who have any instance of persistent disconnection during all quarters from ages 16 to 24. This stands in contrast to other measures that calculate rates of disconnection on a quarterly basis.
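The campus-level example above amounts to collapsing each person's panel into a single yes/no value and then taking a percentage within each campus. A minimal sketch, assuming each full-panel individual has already been reduced to a (campus, ever persistently disconnected) pair; the function name and input layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def campus_persistent_rate(
    people: Iterable[Tuple[str, bool]],
) -> Dict[str, float]:
    """Percent of each campus's youth with any persistent disconnection.

    Each item is (campus_id, ever_persistent), one per individual.
    """
    totals: Dict[str, int] = defaultdict(int)
    ever: Dict[str, int] = defaultdict(int)
    for campus, ever_persistent in people:
        totals[campus] += 1
        ever[campus] += int(ever_persistent)
    return {campus: 100.0 * ever[campus] / totals[campus] for campus in totals}


# Hypothetical individuals from two campuses.
people = [
    ("campus_A", True), ("campus_A", False),
    ("campus_A", False), ("campus_A", True),
    ("campus_B", False), ("campus_B", False),
    ("campus_B", True), ("campus_B", False),
]
print(campus_persistent_rate(people))
```

Each person contributes once regardless of how many quarters they were disconnected, which is what distinguishes this summary from a quarterly rate.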

Attrition adjustments: as with all observational studies, imperfect measurement is to be expected. Approximately 20% of cohort members are flagged as likely attrition by age 24 when they do not appear in the data for 16 consecutive quarters. While concerning, removing these individuals ensures that absent individuals cannot strongly influence our results. It is worth noting that this rate would be low for a survey panel persisting nine years. All retrospective measures only consider full-panel individuals, while quarterly or panel analysis drops individuals prospectively from numerators and denominators on a quarterly basis.
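For the quarterly measures, dropping attrited individuals from both the numerator and the denominator can be sketched as below. The encoding (True for disconnected, False for connected, None for attrited) and the function name are assumptions made for illustration.

```python
from typing import List, Optional


def quarterly_rates(
    panel: List[List[Optional[bool]]],
) -> List[Optional[float]]:
    """Share of non-attrited individuals who are disconnected, per quarter.

    panel[i][q] is True if person i is disconnected in quarter q,
    False if connected, and None once the person is in the attrition group.
    """
    if not panel:
        return []
    rates: List[Optional[float]] = []
    for q in range(len(panel[0])):
        # Attrited person-quarters leave both numerator and denominator.
        active = [row[q] for row in panel if row[q] is not None]
        rates.append(sum(active) / len(active) if active else None)
    return rates


# Hypothetical four-person panel; two people attrit in the last quarter.
panel = [
    [False, True, True],
    [False, False, None],
    [True, True, None],
    [False, False, False],
]
print(quarterly_rates(panel))
```

In the last quarter the rate is computed over the two remaining individuals only, so absent individuals neither inflate nor dilute the disconnection rate.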

Masking: In many of the plots we mask data (that is, strategically omit information) to ensure the protection of private data. This could entail censoring at a certain value or excluding units that contain thin slices of data.
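Excluding units with thin slices of data is commonly done with a minimum cell size. The sketch below shows the general idea only: the threshold of 10 is a placeholder, since the text does not state the value used on the site, and the function name is hypothetical.

```python
from typing import Dict, Optional

MIN_CELL_SIZE = 10  # placeholder threshold; not taken from the source


def mask_small_cells(
    counts: Dict[str, int],
    rates: Dict[str, float],
    min_size: int = MIN_CELL_SIZE,
) -> Dict[str, Optional[float]]:
    """Replace the rate with None for units with too few individuals."""
    return {
        unit: (rate if counts[unit] >= min_size else None)
        for unit, rate in rates.items()
    }


# Hypothetical campuses: the second has too few youth to report.
counts = {"campus_A": 240, "campus_B": 6}
rates = {"campus_A": 18.5, "campus_B": 33.3}
print(mask_small_cells(counts, rates))
```

A masked unit is kept in the output with an empty value instead of being deleted, so a plot can show that the unit exists while withholding its rate.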