Underemployment: Methodology

Download Report

The Permanent Detour: Underemployment’s Long-term Effects on the Careers of College Graduates

Press Release

Pomp and Circumstances: New Study Finds Most College Graduates Who Start Out Underemployed, Stay There


Permanent Detour: Further Notes on Underemployment Methodology


Part One: Data Sources Used in the Report

The data used in this paper, The Permanent Detour: Underemployment’s Long-term Effects on the Careers of College Graduates, produced with Strada Institute for the Future of Work, were primarily extracted from Burning Glass Technologies’ unique data assets: a database of more than 800 million job postings providing a detailed view into the jobs and skills that employers demand and a database of over 80 million resumes illuminating the actual career progression of American workers. We also drew from federal surveys and administrative data sets relating to degree completion, majors, and workers’ earnings.

Resume Data

The analyses of workers’ career outcomes drew on Burning Glass’ resume database, which captures the detailed work history and education of millions of workers across the U.S. Resumes are collected from Burning Glass’ partners. Resumes were included in this study if they met the following criteria: the worker has a bachelor’s degree and at least five years of work experience thereafter. The analyses in this report were based on 4 million resumes that met these criteria. Further details about our treatment of the resume data are described in Part Three.

This report was based on aggregate career path and skills data and no personally identifiable information was used by researchers. Burning Glass Technologies has developed a database of millions of recent resumes. When a resume enters the system, the name, address, and other identifying details are encrypted so that they are not accessible to the research team. Researchers compile resumes with similar characteristics so that they can determine which types of transitions and career progressions commonly occur at a population level.

Job Postings Data

To supplement traditional sources of labor market data with more detailed information on employer demand for jobs, skills, and specific credentials, Burning Glass mined its comprehensive database of over 800 million online job postings. Burning Glass collects job postings from close to 50,000 online job boards, newspapers, and employer sites on a daily basis and de-duplicates postings for the same job, whether it is posted multiple times on the same site or across multiple sites. Burning Glass then applies detailed text analytics to code the specific jobs, skills, and credentials requested by employers.


O*NET

O*NET[1] is a government-sponsored, publicly available database containing hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy. O*NET tracks job trends and analyzes skill level by occupation, that is, whether the skills necessary for a particular job are taught in high school, entail some college, or require a bachelor’s degree or more. The O*NET database was initially populated with data collected from occupation analysts; this information is updated through ongoing surveys of each occupation’s worker population and occupation experts.

American Community Survey

The American Community Survey (ACS)[2] is an ongoing annual survey of Americans that provides data on jobs and occupations, educational attainment, and veteran status, among other topics.

Part Two: Methodology to Define Underemployment

Previous studies on underemployment have often based their definitions on data from O*NET. Occupations in which greater than 50% of respondents to O*NET’s surveys have a bachelor’s degree or higher are classified as college-level jobs. Those in which less than 50% of respondents have at least a bachelor’s degree are classified as noncollege jobs.

A weakness of this approach is that the skills and credential requirements for many jobs, particularly in technical areas, may evolve more quickly than O*NET can update them. Further, reliance on such surveys can lead to what’s known as incumbent worker bias. Since the occupational assessments are based on the skills of all workers in a field, they likely include individuals who gained regular or recurrent on-the-job training (such as medical assistants or technicians) as well as older workers who may have entered the occupation when requirements were different from today’s (like pharmacists, physical therapists, or executive assistants). This bias may either over- or underestimate requirements, thus inaccurately reflecting the expectations for today’s new employees.

Our analysis addressed this shortcoming by using an approach based on the levels of education requested in recent job postings. If more than 50% of job postings for an occupation over the past three years (2015-2017) requested a bachelor’s degree or higher, we considered it a college-level job. Using this method, we redefined 45 occupations from noncollege-level occupations in the O*NET classification to college-level occupations in our analysis. A total of 18 occupations shifted from college level to the noncollege level.
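The posting-based rule above can be sketched in a few lines of Python. This is a minimal illustration, not Burning Glass’ actual pipeline: the function name, the input shape (one pair per de-duplicated posting), and the placeholder occupation codes are assumptions made for the example.

```python
from collections import defaultdict


def classify_occupations(postings, threshold=0.5):
    """Label each occupation 'college' or 'noncollege'.

    postings: iterable of (occupation_code, requests_bachelors_or_higher)
    pairs, one per posting (hypothetical input shape).
    An occupation is 'college' only if MORE than `threshold` of its
    postings request a bachelor's degree or higher.
    """
    totals = defaultdict(int)
    with_degree = defaultdict(int)
    for code, requests_degree in postings:
        totals[code] += 1
        if requests_degree:
            with_degree[code] += 1
    return {
        code: "college" if with_degree[code] / totals[code] > threshold else "noncollege"
        for code in totals
    }


# Toy data with placeholder codes (not real postings).
sample = [
    ("OCC-A", True), ("OCC-A", True), ("OCC-A", False),   # 2 of 3 request a degree
    ("OCC-B", True), ("OCC-B", False),                    # exactly half
]
labels = classify_occupations(sample)
```

Because the rule says “more than 50%”, an occupation in which exactly half of the postings request a degree stays noncollege in this sketch.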

Occupations that shift from “Non-college” to “College” based on the Burning Glass analysis.

11-1021.00  General and Operations Managers
11-3011.00  Administrative Services Managers
11-3051.00  Industrial Production Managers
11-3071.01  Transportation Managers
11-9013.02  Farm and Ranch Managers
13-1031.02  Insurance Adjusters, Examiners, and Investigators
13-1071.00  Human Resources Specialists
13-1121.00  Meeting, Convention, and Event Planners
13-1199.01  Energy Auditors
13-1199.06  Online Merchants
13-2071.01  Loan Counselors
13-2072.00  Loan Officers
13-2081.00  Tax Examiners and Collectors, and Revenue Agents
15-1143.01  Telecommunications Engineering Specialists
17-3012.01  Electronic Drafters
17-3012.02  Electrical Drafters
17-3029.06  Manufacturing Engineering Technologists
17-3029.07  Mechanical Engineering Technologists
19-4031.00  Chemical Technicians
19-4092.00  Forensic Science Technicians
19-4099.01  Quality Control Analysts
23-2011.00  Paralegals and Legal Assistants
25-1194.00  Vocational Education Teachers, Postsecondary
27-1013.00  Fine Artists, Including Painters, Sculptors, and Illustrators
27-1022.00  Fashion Designers
27-3042.00  Technical Writers
27-4032.00  Film and Video Editors
29-1124.00  Radiation Therapists
29-2033.00  Nuclear Medicine Technologists
29-2099.07  Surgical Assistants
33-1021.01  Municipal Fire Fighting and Prevention Supervisors
33-3021.03  Criminal Investigators and Special Agents
33-9021.00  Private Detectives and Investigators
41-1012.00  First-Line Supervisors of Non-Retail Sales Workers
41-3011.00  Advertising Sales Agents
41-4011.00  Sales Representatives, Wholesale and Manufacturing, Technical and Scientific Products
41-4011.07  Solar Sales Representatives and Assessors
43-1011.00  First-Line Supervisors of Office and Administrative Support Workers
43-4011.00  Brokerage Clerks
43-4041.01  Credit Authorizers
43-6011.00  Executive Secretaries and Executive Administrative Assistants
43-9081.00  Proofreaders and Copy Markers
45-1011.07  First-Line Supervisors of Agricultural Crop and Horticultural Workers
45-2011.00  Agricultural Inspectors

Occupations that shift from “College” to “Non-college” based on the Burning Glass analysis

11-9141.00  Property, Real Estate, and Community Association Managers
13-2021.02  Appraisers, Real Estate
17-3011.01  Architectural Drafters
19-1031.03  Park Naturalists
19-4041.02  Geological Sample Test Technicians
19-4091.00  Environmental Science and Protection Technicians, Including Health
21-1094.00  Community Health Workers
25-3021.00  Self-Enrichment Education Teachers
27-2022.00  Coaches and Scouts
29-1071.01  Anesthesiologist Assistants
29-1141.03  Critical Care Nurses
29-2012.00  Medical and Clinical Laboratory Technicians
29-2053.00  Psychiatric Technicians
29-9012.00  Occupational Health and Safety Technicians
39-9032.00  Recreation Workers
39-9041.00  Residential Advisors
43-4051.03  Patient Representatives

Part Three: Overview of Resume Analyses

The resume data set is a Burning Glass Technologies proprietary data set, sourced from Burning Glass partners. This data set includes information about an individual’s demographics, career path, and employers.

The resume data set contains information about a candidate’s location, level of educational attainment, the institutions at which the candidate studied, the major, as well as any certifications held. The data set also contains information about a candidate’s career path, for example, occupation and time spent in any workplace and role, years of experience, employer name and location, and industry. In addition, a candidate may list skills and the years of experience with any particular skill.

All personally identifiable information such as name, address, and contact information is encrypted and not available to researchers.

Resume Sample Selection

To capture the work history, educational attainment, and resulting underemployment of workers over the life of their careers, Burning Glass selected a total of 4 million resumes for inclusion in this study, based on the following criteria:

  1. Individuals in the selected group must have commenced their first job during or after the year 2000, where an individual’s first job was classified as the first job listed on a resume.
  2. The time worked in the first job must have been longer than six months to avoid internships and other short-term projects.
  3. Job seekers must have occupational information about a first job and the job five years later. For a subsample of resumes, we also assessed underemployment 10 years later, where job data was available.
  4. Job seekers must hold a bachelor’s degree or higher. This restriction was imposed because the underemployment of workers was calculated within the sample of workers with bachelor’s degrees or higher.
  5. At each point in the analysis, job seekers must have had civilian employment, as military occupations have a distinct hiring system for which research on underemployment is not germane.
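The five criteria above amount to a record-level filter. The sketch below shows one way to express them in Python; the `ResumeRecord` fields are hypothetical names chosen for illustration and do not reflect the actual schema of the Burning Glass resume database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResumeRecord:
    # All field names are illustrative assumptions, not the real schema.
    first_job_start_year: int
    first_job_months: float
    first_job_occupation: Optional[str]
    occupation_after_5_years: Optional[str]
    has_bachelors_or_higher: bool
    first_job_is_civilian: bool
    job_after_5_years_is_civilian: bool


def meets_selection_criteria(r: ResumeRecord) -> bool:
    """Apply the five sample-selection criteria to one resume."""
    return (
        r.first_job_start_year >= 2000                 # 1. first job began in 2000 or later
        and r.first_job_months > 6                     # 2. first job lasted longer than six months
        and r.first_job_occupation is not None         # 3. occupation known for the first job...
        and r.occupation_after_5_years is not None     #    ...and for the job five years later
        and r.has_bachelors_or_higher                  # 4. bachelor's degree or higher
        and r.first_job_is_civilian                    # 5. civilian employment at each point
        and r.job_after_5_years_is_civilian
    )


# Toy record with a placeholder occupation code.
example = ResumeRecord(2004, 18, "OCC-A", "OCC-B", True, True, True)
keep = meets_selection_criteria(example)
```

The 10-year subsample would add an analogous check on the occupation held 10 years after the first job, where that information is available.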

Coding Occupation and Education from Resumes

For this analysis, we collected information for our samples based on an individual’s occupation in a first job, five years later, and 10 years later (where available). Our occupation coding is based on the occupational definitions provided by O*NET,[1] which extends the US Department of Labor’s Standard Occupational Classification System.[2]

Occupation coding is conducted according to a proprietary classification system developed by Burning Glass, which combines human-generated rules and machine learning systems to assign each job to the correct occupational category.

We analyzed job seekers’ education by categorizing the undergraduate program of study according to the National Center for Education Statistics’ Classification of Instructional Programs (CIP).[3]

Predicting Gender in Resumes

To study the effect of gender on underemployment, we used the gender R package to determine the gender of an individual in the resume sample. The R package uses an estimated date of birth (1970-2000) and the first name from the resume to predict the gender of an individual based on historical Social Security Administration data.[4,5] Using this approach, we estimated the probability that each individual in the sample was of a particular gender and used a cutoff probability of 0.6 or higher to conclude that an individual was of the predicted gender. Individuals for whom no gender prediction was possible were not included in the sample for the gender-specific analyses. The gender analysis was done prior to further analysis of the data, and the gender data available to researchers was attached to anonymized records. At no time were names or other personally identifiable information available to researchers.
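The cutoff step can be illustrated with a short Python sketch. It does not reproduce the gender R package; it only shows how a 0.6 threshold could be applied to a name-based probability once one is available. The function name and the assumption that the two probabilities sum to 1 are ours.

```python
from typing import Optional


def assign_gender(prob_female: Optional[float], cutoff: float = 0.6) -> Optional[str]:
    """Return 'female', 'male', or None (excluded from gender analyses).

    prob_female: estimated probability that the first name belongs to a
    woman, or None when no prediction was possible. Assumes the male
    probability is 1 - prob_female (an assumption of this sketch).
    """
    if prob_female is None:
        return None                      # no prediction possible
    if prob_female >= cutoff:
        return "female"
    if 1.0 - prob_female >= cutoff:
        return "male"
    return None                          # too ambiguous to classify


# Toy probabilities, not values from the report.
results = [assign_gender(p) for p in (0.9, 0.1, 0.5, None)]
```

How the study treated names whose probability fell below the cutoff for both genders is not stated; this sketch leaves them unassigned, which is our assumption.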

Part Four: Calculating Expected Salary Using Data from the American Community Survey

Since resumes do not typically include salary information, we used Census Bureau data from the American Community Survey to estimate salary based on the occupational and demographic characteristics of each worker. We used pooled one-year samples from 2012 to 2017. We focused on individuals aged 22 to 27 (recent college graduates) and restricted the sample to those who were working and in the labor force, had at least a bachelor’s degree, were not enrolled in any educational program, and worked at least 30 hours per week.[6] For these individuals, we examined gender, major, occupation, and salary. Incomes were restricted to those between $15,000 and $200,000 per year.
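The sample restrictions described above can be written as a single predicate. The following Python sketch uses made-up field names; they are not actual ACS or IPUMS variable names, and the inclusive treatment of the income bounds is an assumption.

```python
def in_acs_salary_sample(person: dict) -> bool:
    """Check whether one survey respondent meets the stated restrictions.

    All dictionary keys are hypothetical field names for illustration.
    """
    return bool(
        2012 <= person["survey_year"] <= 2017            # pooled one-year samples
        and 22 <= person["age"] <= 27                    # recent college graduates
        and person["employed"]                           # working and in the labor force
        and person["has_bachelors_or_higher"]
        and not person["enrolled_in_school"]
        and person["usual_hours_per_week"] >= 30
        and 15_000 <= person["annual_income"] <= 200_000  # bounds assumed inclusive
    )


# Toy respondent, not real survey data.
respondent = {
    "survey_year": 2015,
    "age": 24,
    "employed": True,
    "has_bachelors_or_higher": True,
    "enrolled_in_school": False,
    "usual_hours_per_week": 40,
    "annual_income": 42_000,
}
included = in_acs_salary_sample(respondent)
```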

To estimate the cost of underemployment, we estimated the average salary of underemployed and appropriately employed workers using a weighted average based on the distribution of majors.

To account for the fact that underemployed graduates have different educational preferences, we used the proportion of graduates in each major who were underemployed to weight the salary of the appropriately employed group. The goal of this exercise was to estimate what the salary of the underemployed group would have been, if they were properly employed, based on their chosen field of study. It is important to note that this approach leads to a smaller salary gap between the underemployed and appropriately employed than if we had used the educational distribution of the appropriately employed group to estimate their average salary.
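A small worked example may make the reweighting concrete. The sketch below reads the paragraph above as: average the per-major salaries of appropriately employed graduates, weighted by how the underemployed group is distributed across majors. That reading, the function name, and all numbers are assumptions for illustration, not figures from the report.

```python
def weighted_mean_salary(mean_salary_by_major: dict, weight_by_major: dict) -> float:
    """Weighted average of per-major mean salaries.

    weight_by_major can hold counts or shares; it is normalized here.
    """
    total_weight = sum(weight_by_major.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(
        weight * mean_salary_by_major[major]
        for major, weight in weight_by_major.items()
    )
    return weighted_sum / total_weight


# Toy inputs: mean salary of APPROPRIATELY EMPLOYED graduates in two majors.
appropriate_salary = {"Major A": 60_000, "Major B": 40_000}
# Hypothetical counts of graduates per major in each group.
underemployed_counts = {"Major A": 1, "Major B": 3}
appropriate_counts = {"Major A": 3, "Major B": 1}

# Counterfactual: what the underemployed mix of majors would earn if appropriately employed.
counterfactual = weighted_mean_salary(appropriate_salary, underemployed_counts)
# Alternative: weighting by the appropriately employed group's own mix of majors.
own_mix = weighted_mean_salary(appropriate_salary, appropriate_counts)
```

In this toy case the underemployed group is concentrated in the lower-paying major, so weighting by its distribution gives 45,000 rather than 55,000, which shows why this choice produces the smaller salary gap.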

We found that, on average, underemployed graduates made $37,330, while those appropriately employed made $47,470, a difference of $10,140, or approximately 27% more than the underemployed average.


  1. O*NET Resource Center, “The O*NET-SOC Taxonomy,” accessed April 30, 2018.
  2. Bureau of Labor Statistics, “Standard Occupational Classification,” accessed April 30, 2018.
  3. See the National Center for Education Statistics’ CIP documentation for more information. For the purpose of this analysis, we merged CIP code 14 (Engineering) and CIP code 15 (Engineering Technologies and Engineering-Related Fields) and treated them as a single major.
  4. This R package uses historical data sets from the U.S. Social Security Administration, the U.S. Census Bureau (via IPUMS USA), and the North Atlantic Population Project to provide predictions of gender for first names for particular countries and time periods.
  5. Blevins, Cameron, and Lincoln Mullen. “Jane, John… Leslie? A Historical Method for Algorithmic Gender Prediction.” DHQ: Digital Humanities Quarterly 9, no. 3 (2015).
  6. Here, we follow the same selection criteria as in Abel, Jaison R., and Richard Deitz. “Underemployment in the early careers of college graduates following the Great Recession.” In Education, Skills, and Technical Change: Implications for Future US GDP Growth. University of Chicago Press, 2017.

