Using Earnings Metrics for Higher Education Accountability

Dr. Robert Kelchen, an associate professor at Seton Hall University and an expert on financial aid, was commissioned by Higher Learning Advocates, a bipartisan higher education advocacy group, to write an article about using earnings metrics for accountability.

Dr. Kelchen opens by noting that the main reason most students attend college is for a better future, which includes better jobs and higher wages. He writes that the benefits of college are not clear for everyone, particularly for students who don’t graduate or don’t focus on the right degree.

Historically, the federal government hasn’t held colleges directly accountable for their students’ outcomes. Indirectly, the Department of Education has focused on requiring colleges to maintain their students’ loan cohort default rates below a certain threshold.

Professor Kelchen states that the availability of income-driven repayment plans and the short time period tracked for loan defaults mean that most colleges remain below the threshold during the time that is counted. Only a few small colleges end up being at risk of losing federal financial aid.

According to Dr. Kelchen, the lack of accountability combined with concerns about student outcomes has led to calls for the federal government to develop earnings-based metrics of student success and tie them to federal financial aid eligibility. Based on available research on state efforts to tie state funding to student outcomes, he believes earnings metrics used for consumer information or high-stakes accountability should be designed with caution and with students from traditionally underrepresented groups in mind.

I could not agree more with this point, particularly given that there is no national student database that tracks the outcomes for all college students. There are existing data sources available, and Dr. Kelchen discusses the pros and cons of each.

Source #1: The College Scorecard

Earnings data was added to the College Scorecard in 2015. Initially, it was institution-level earnings of former undergraduate students who utilized federal aid (either Pell Grants or loans). In 2019, the Trump administration added earnings data by field of study for both undergraduate and graduate programs, provided that there are enough graduates for those programs (which I believe is a minimum of 30 per cohort year, not cumulatively).

Dr. Kelchen writes that there are several challenges in using this data. The first challenge is that not all students are included, because the data covers only undergraduate students who ever received Pell Grants or student loans. That group represents approximately 70 percent of all undergraduates.

He further notes that the highest coverage rate is with for-profit colleges (90 percent) and the lowest with community colleges (63 percent). Dr. Kelchen adds that students who do not receive federal financial aid tend to be from higher-income families than those who receive federal aid.

Regardless of whether any of these “averages” are correct, I would argue that none of them fairly represent colleges and universities that fall outside the norm. At the institution that I led for the last 18 years, American Public University System (APUS), approximately 72 percent of our graduates do not borrow through the Federal Student Aid program. That is far removed from the 90 percent figure that Dr. Kelchen cited for the for-profit colleges’ and universities’ average.

While the vast majority of APUS students do not borrow, it is not because they come from wealthy families. Instead, it is because APUS has recognized for more than 20 years that the members of the military that it serves cannot afford to pay out-of-pocket costs for college.

For years, APUS has pegged its tuition for servicemembers at the cap for Department of Defense tuition reimbursement, eliminated fees, and provided textbooks through grants at no cost to the undergraduate student. Because APUS tuition, textbooks, and fees are covered by employer reimbursement and its online-only curriculum does not require room and board to attend, the majority of APUS graduates do not borrow and their earnings outcomes are not included in the College Scorecard.

As a result, the salaries and debt levels reported by the Scorecard do not represent the majority of APUS students or graduates. I am sure that there are other institutions that have their own story as to why the data in the College Scorecard does not represent their institution.

Another challenge posed by using metrics from the College Scorecard is that data from students who graduated is combined with data from those who dropped out. Thus, the current metrics understate earnings for graduates and overstate earnings for dropouts.
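The pooling effect can be illustrated with a minimal Python sketch. The earnings figures below are made up for illustration and are not drawn from the College Scorecard; the only point is that a median computed over both groups falls between the two group medians.

```python
import statistics

# Made-up annual earnings, for illustration only.
graduate_earnings = [40_000, 45_000, 50_000, 55_000, 60_000]
dropout_earnings = [20_000, 25_000, 30_000]

graduate_median = statistics.median(graduate_earnings)
dropout_median = statistics.median(dropout_earnings)

# A single institution-level figure pools both groups together.
pooled_median = statistics.median(graduate_earnings + dropout_earnings)

print(f"Graduates: {graduate_median:,.0f}")
print(f"Dropouts:  {dropout_median:,.0f}")
print(f"Pooled:    {pooled_median:,.0f}")
```

With these invented numbers, the pooled figure sits below the graduates' median and above the dropouts' median, so it describes neither group well.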

Dr. Kelchen also notes that the College Scorecard uses a definition of institution based on the Office of Postsecondary Education identification number (OPEID). However, some larger systems of higher education report multiple units under one OPEID.

Examples of this reporting cited by Professor Kelchen are Rutgers, Ohio State, and Penn State, where earnings data for graduates of the main campus are combined with data for graduates of the branch campuses. According to Dr. Kelchen, this issue affects about 20 percent of all campuses.

A fourth challenge with the Scorecard data is that reported earnings are too close to graduation to be meaningful. Unlike the institutional data, the program level data contains information on graduates only, covering both undergraduate and graduate programs. It only provides the median earnings measured one year after graduation.

The fifth challenge in using Scorecard data is that information is often combined across multiple programs at the same credential level in order to meet sample size requirements (Dr. Kelchen reports that the sample size is not published, but he believes it to be 20 graduates receiving financial aid; I have heard from other sources that 30 graduates are required). Individual programs are typically classified using the six-digit Classification of Instructional Programs (CIP) code, while the College Scorecard uses four-digit CIP codes.
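The roll-up from six-digit to four-digit CIP codes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the graduate counts, and the minimum cell size of 30 are hypothetical (the actual threshold is not published, as noted above), and this is not the Department of Education's actual procedure.

```python
from collections import defaultdict
from typing import Dict, Optional

# Hypothetical minimum cell size; the real value is unpublished
# (20 and 30 are both mentioned above as possibilities).
MIN_GRADUATES = 30


def to_four_digit(cip6: str) -> str:
    """Truncate a six-digit CIP code such as '52.0301' to its four-digit form '52.03'."""
    series, detail = cip6.split(".")
    return f"{series}.{detail[:2]}"


def roll_up(programs: Dict[str, int], min_graduates: int = MIN_GRADUATES) -> Dict[str, Optional[int]]:
    """Sum graduate counts by four-digit CIP; None marks a cell too small to report."""
    totals: Dict[str, int] = defaultdict(int)
    for cip6, graduates in programs.items():
        totals[to_four_digit(cip6)] += graduates
    return {cip4: (n if n >= min_graduates else None) for cip4, n in totals.items()}


# Made-up graduate counts for three six-digit programs.
example = {"52.0301": 18, "52.0305": 14, "52.0801": 12}
rolled = roll_up(example)
print(rolled)
```

In this invented example, two programs that are each too small to report on their own clear the threshold once combined, while the third remains suppressed. The cost is that the reported figure no longer describes any single program.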

Source #2: The Post-Secondary Employment Outcomes Database

Maintained by the U.S. Census Bureau, the Post-Secondary Employment Outcomes (PSEO) Database is considered to be an experimental program and only includes data from 47 colleges from 2001 through 2016. The focal earnings measures are the 25th, 50th, and 75th percentiles of earnings measured one, five, and ten years after graduation.

Unlike the College Scorecard, the PSEO does not restrict the sample to students who received federal financial aid. The PSEO data excludes individuals who reported zero earnings in three or more quarters of the measured calendar year.
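As a rough illustration of how the exclusion rule and the percentile measures interact, here is a minimal Python sketch. The quarterly earnings are made up, summing four quarters to get annual earnings is an assumption, and the interpolation method passed to `statistics.quantiles` is a choice made for this example rather than the Census Bureau's documented method.

```python
import statistics
from typing import List, Tuple


def is_included(quarterly_earnings: List[float]) -> bool:
    """Keep a person unless they had zero earnings in three or more quarters."""
    zero_quarters = sum(1 for q in quarterly_earnings if q == 0)
    return zero_quarters < 3


def earnings_percentiles(people: List[List[float]]) -> Tuple[float, float, float]:
    """Return the 25th, 50th, and 75th percentiles of annual earnings for included people."""
    annual = [sum(quarters) for quarters in people if is_included(quarters)]
    p25, p50, p75 = statistics.quantiles(annual, n=4, method="inclusive")
    return p25, p50, p75


# Made-up quarterly earnings; the last person is excluded by the rule.
cohort = [
    [5_000, 5_000, 5_000, 5_000],
    [7_500, 7_500, 7_500, 7_500],
    [10_000, 10_000, 10_000, 10_000],
    [12_500, 12_500, 12_500, 12_500],
    [15_000, 15_000, 15_000, 15_000],
    [0, 0, 0, 9_000],
]
p25, p50, p75 = earnings_percentiles(cohort)
print(p25, p50, p75)
```

Because the rule removes people with little or no attachment to the labor market, the reported percentiles describe those who were working for most of the year, not every graduate.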

Similar to the College Scorecard, the PSEO reports data at the OPEID level. Undergraduate programs are reported at the four-digit Classification of Instructional Programs (CIP) level, but master’s and Ph.D. programs are reported at the two-digit CIP.

The PSEO appears to require a minimum of 30 graduates in a CIP to report earnings. It also reports data for three combined cohorts for bachelor’s degree graduates and five cohorts for other credentials. The College Scorecard uses two combined cohorts.

Source #3: State Data Sources

Earnings data has been collected for years by many states, thanks to more than $800 million in funding since 2006 from the U.S. Department of Education. All states except New Mexico have received funding, and 41 states have active systems. Notable examples cited by Dr. Kelchen include Florida, Utah, Virginia, and Texas.

One of the challenges of these datasets is that they usually do not include students who leave the state after college. Another challenge is that some states only collect data on students who attended public institutions of higher education and not private colleges and universities.

After a thorough discussion of the primary sources for earnings metrics, Dr. Kelchen argues that six questions remain about measurement, the mission of higher education, and keeping equity in focus.

Question 1: What types of earnings metrics are appropriate?

Median earnings are generally used because mean earnings are skewed by a small percentage of individuals with extremely high incomes. However, a median earnings metric may not shed any light on how many former students are struggling to survive without receiving public benefits.
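A small Python sketch shows why a single very high earner pulls the mean away from the typical outcome. The ten earnings values are invented for illustration.

```python
import statistics

# Made-up annual earnings for ten former students; the last one is an extreme outlier.
earnings = [24_000, 28_000, 30_000, 32_000, 35_000,
            38_000, 40_000, 45_000, 50_000, 900_000]

mean_earnings = statistics.mean(earnings)
median_earnings = statistics.median(earnings)

# Count how many people earn less than the mean.
below_mean = sum(1 for e in earnings if e < mean_earnings)

print(f"Mean:   {mean_earnings:,.0f}")
print(f"Median: {median_earnings:,.0f}")
print(f"{below_mean} of {len(earnings)} earn less than the mean")
```

In this invented cohort, nine of the ten former students earn less than the mean, while the median stays close to what most of them actually earn. Neither number, however, says how many fall below a subsistence level.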

Dr. Kelchen proposes that in addition to a median earnings measure, there should be two measures that focus on financially vulnerable students. The first measure would provide information about the 10th or 25th percentiles of the income distribution for a reasonable worst-case scenario versus the median outcome. This could be combined with measures at the top end (75th or 90th percentiles) to provide students and policymakers with more information about the range.

The second measure proposed by Dr. Kelchen is one of return on investment or economic self-sufficiency. This measure would not punish students who choose to pursue lower-paid fields.

The College Scorecard’s measure of the percentage of students earning more than the typical high school graduate’s average salary is one possibility. Other possibilities include setting thresholds at certain percentages of the federal poverty line or taking into account a student’s financial investment in their education.
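A threshold-based measure of this kind could be sketched as follows. The poverty-line value, the multiple, the high school benchmark, and the earnings list are all placeholders chosen for illustration; none of them are actual federal figures or Scorecard values.

```python
from typing import List

# Placeholder values, not actual federal figures.
POVERTY_LINE = 15_000
POVERTY_MULTIPLE = 1.5
HIGH_SCHOOL_BENCHMARK = 30_000


def share_above(earnings: List[float], threshold: float) -> float:
    """Fraction of former students whose earnings exceed the threshold."""
    return sum(1 for e in earnings if e > threshold) / len(earnings)


# Made-up annual earnings for eight former students.
earnings = [12_000, 18_000, 22_000, 26_000, 31_000, 36_000, 44_000, 52_000]

share_self_sufficient = share_above(earnings, POVERTY_MULTIPLE * POVERTY_LINE)
share_above_high_school = share_above(earnings, HIGH_SCHOOL_BENCHMARK)

print(f"Above 150% of poverty line:   {share_self_sufficient:.1%}")
print(f"Above high school benchmark: {share_above_high_school:.1%}")
```

The same cohort produces different shares depending on which threshold is chosen, which is why the choice of benchmark matters as much as the earnings data itself.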

Naturally, there are limitations to any proposed metric. Dr. Kelchen provides the example of students who attend cosmetology programs: they receive tips as a substantial portion of their income, and many tips are unreported.

Another limitation is whether the observed earnings reflect the types of students enrolled by the college. Comparing Ivy League universities with regional public universities is one example of a perceived bias. Another limitation is that working adult students with previous employment history may receive a higher salary after completing their education than a 22-year-old.

Question 2: When should earnings be measured? Should earnings metrics vary over time?

Given the need to manage student loans, Dr. Kelchen argues that students and their families are concerned about the ability to earn enough after graduation to repay their student loans. At the same time, as a society, we are concerned more about the lifetime benefits of higher education than the short-term benefits.

Because of lags in calculating and reporting outcomes, data may be out of date. One example provided by Dr. Kelchen is that the College Scorecard’s current measure of median earnings ten years after college is based on students who began in the 2003-04 and 2004-05 academic years. Colleges may have made positive or negative changes to their programs after those cohorts, and the resulting outcomes would not be reported for years.

Question 3: Should credential level or field of study be taken into account?

Dr. Kelchen writes that the first issue is whether there should be variation in when earnings are measured. Shorter-term measures are more appropriate for shorter-term credentials. Growth in earnings over time is higher for bachelor’s degree graduates than for associate degree graduates or certificate holders. Using a shorter term of measurement would understate the return for a bachelor’s graduate.

Some fields of study, however, are not well-suited for short-term earnings metrics. One example is medical doctors. Generally, medical doctors pursue residencies for two to four years after graduation, and residency salaries are lower than professional salaries post-residency.

Another issue is whether earnings thresholds or other metrics should vary across field of study or credential. Metrics based on self-sufficiency should be the same across all programs, but for programs whose graduates’ income may include tips, those tips should be counted.

Question 4: Should metrics reflect all students, or just graduates?

Dr. Kelchen reflects that students and their families focus on graduates because they expect to graduate. However, the nationwide four-year graduation rate is 44 percent. This provides a strong case for including all students or at least publishing metrics that highlight the differences between dropouts and graduates.

At the program level, Dr. Kelchen points out that it is easier to focus on graduates than dropouts, primarily because many undergraduates drop out before selecting a major. This should not be an issue for certificates and graduate programs. Dr. Kelchen recommends that program-level metrics at the undergraduate level be for graduates only, given the limitations on collecting data.

Question 5: How should debt be factored into earnings metrics, if at all?

Dr. Kelchen states that a challenge is what types of debt should be included. The College Scorecard excludes Parent PLUS, Perkins, and private loans, which provide substantial funding for some students.

The repayment period is problematic as well. The College Scorecard uses the standard ten-year repayment period, but the income-driven repayment programs allow students to pay based on their income for up to 20 years. Dr. Kelchen states that self-sufficiency metrics and a return on investment metric could alleviate some of these concerns.

Question 6: How should earnings metrics be used during recessions?

A key challenge, according to Dr. Kelchen, is that colleges’ performance is influenced by the broader economy. Any adjustments to accountability metrics should automatically occur based on predefined triggers, such as unemployment reaching a specified threshold. These adjustments could be applied regionally vs. nationally.
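A predefined trigger of this kind could be expressed as a simple rule. The sketch below is hypothetical: the 8 percent threshold, the region names, and the unemployment rates are invented, and what the adjustment would actually change is left open because the paper does not specify it.

```python
from dataclasses import dataclass

# Hypothetical trigger level, in percent; not taken from the paper.
UNEMPLOYMENT_TRIGGER = 8.0


@dataclass
class Region:
    name: str
    unemployment_rate: float  # percent


def adjustment_applies(region: Region, trigger: float = UNEMPLOYMENT_TRIGGER) -> bool:
    """An adjustment switches on automatically once unemployment reaches the trigger."""
    return region.unemployment_rate >= trigger


# Made-up regional unemployment rates.
regions = [Region("Region A", 9.2), Region("Region B", 5.1)]
adjusted = {r.name: adjustment_applies(r) for r in regions}
print(adjusted)
```

Evaluating the rule region by region means a college in a hard-hit area could receive an adjustment even when the national rate stays below the trigger.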

Dr. Kelchen writes that while student loan debt and student economic security are very important, policymakers need to proceed with caution before implementing metrics that might make colleges and universities less accessible for traditionally underrepresented groups of students.

Going forward, Dr. Kelchen predicts that higher-stakes accountability initiatives are more likely to be at the program level for graduates only, while institutional earnings metrics will be used more for consumer information. He believes that the metrics should focus on identifying programs whose students cannot become economically self-sufficient or that do not provide value to society. Setting a relatively low performance bar that is consistently applied across all sectors of higher education creates an opportunity for bipartisan legislation that would protect students from the lowest-quality programs.

I concur with Dr. Kelchen’s conclusion. His paper is thorough and covers the majority of the issues, as well as asking crucial questions. At the same time, some of these issues could be resolved if Congress allowed for the creation of a national student database.

One issue that was not resolved, however, is that of transfer students. If a student incurs much of his or her debt at a higher-cost institution and transfers to a lower-cost institution, should the lower-cost institution include the other debt in its return on investment calculation?

Assuming the calls for accountability continue to come from policymakers and legislators, the choices need to consider all of these issues and more. The next few years will be interesting ones for higher education institutions, students and their families, and policymakers.