When a Survey is Self-Published versus Peer-Reviewed

I first read about the Economic Well-Being of U.S. Households in 2021 report in an email from UPCEA. The email referenced a Higher Ed Dive article written by senior editor Rick Seltzer.

The Higher Ed Dive article was titled Adults who borrowed for college doubt higher ed’s value, survey says. One of the key statistics cited in the article’s introduction was that 40% of borrowers with outstanding debt said the benefits of their education exceeded the costs, while 63% of those who borrowed and paid off their debt said the same, as did 51% of those who attended college but never took on debt.

My interest in the article increased when I read the following sentence: “Those who attended for-profit colleges had greater difficulty repaying loans, even after accounting for differences in race and ethnicity, parental education, whether a college was a two-year or four-year institution, and the institution’s selectivity.” This sentence was followed by a quote from the report: “This suggests that the high payment difficulty rates for attendees of for-profit institutions reflect characteristics of the schools and is not simply due to the characteristics of their students.”

As someone who has published more than a few papers on online student retention using statistical methods of analysis, I was curious how the original report explained the data, as well as the methodology used for the analysis.

According to the methodology section at the back of the original report (page 85), the survey was fielded from October 29, 2021 through November 22, 2021. It was the ninth year of the survey, which has been conducted annually in the fourth quarter since 2013. Ipsos, a consumer research firm, administered the survey using its nationally representative, probability-based online panel. Respondents were selected using address-based sampling. Of the 18,322 panel members contacted to take the 2021 survey, 11,965 participated and completed the survey. A small number of respondents were removed for a final sample of 11,874. Survey respondents were compensated at rates ranging from $5 to $25 for a completed survey. The survey was designed to be representative of adults age 18 and older living in the U.S. and was weighted to reflect March 2021 data. Approximately one-third of the 2021 respondents also participated in the 2020 survey. Participants reported the names of the schools they attended, and the researchers looked up those names and classified each institution as two-year or four-year and as public, private non-profit, or for-profit.
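For readers unfamiliar with the mechanics, “weighted to reflect March 2021 data” generally refers to post-stratification: respondent weights are adjusted until the weighted sample matches known population benchmarks. The short Python sketch below illustrates a generic raking procedure with made-up categories and margins; it is only an illustration of the technique, not the procedure Ipsos or the Federal Reserve actually used, which the report does not describe.

import pandas as pd

def rake(df, margins, weight_col="weight", iterations=25):
    # Iteratively adjust weights so weighted category shares match target margins.
    df = df.copy()
    df[weight_col] = 1.0
    for _ in range(iterations):
        for var, targets in margins.items():
            shares = df.groupby(var)[weight_col].sum() / df[weight_col].sum()
            factors = {category: targets[category] / shares[category] for category in targets}
            df[weight_col] *= df[var].map(factors)
    return df

# Made-up respondents and population margins; a real weighting scheme would use
# official benchmarks and many more dimensions.
respondents = pd.DataFrame({
    "age_group": ["18-34", "35-54", "55+", "18-34", "55+"],
    "education": ["hs", "college", "college", "college", "hs"],
})
margins = {
    "age_group": {"18-34": 0.30, "35-54": 0.33, "55+": 0.37},
    "education": {"hs": 0.37, "college": 0.63},
}
print(rake(respondents, margins))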

Although the Federal Reserve makes the survey data available in multiple statistical formats, along with a code book, the report itself is noticeably weak when it comes to providing specific data for its sorts in the form of tables or figures. While the code book provides the specific responses to each of the survey questions, there is no discussion of the statistical techniques used to reach the report’s conclusions. In fact, the code book shows that a few questions were answered by only about half of the respondents. The footnotes mention interpolation, but as the groups get smaller, interpolation can be tricky.

For example, according to recent Census data reporting the highest level of education for adults in the U.S. 25 and older, 8.9% had less than a high school diploma or equivalent, 27.9% had a high school diploma as their highest level of school completed, 14.9% had completed some college but not a degree, 10.5% had an associate degree as their highest level of school completed, 23.5% had a bachelor’s degree as their highest degree, and 14.4% had completed an advanced degree such as a master’s, professional, or doctoral degree. Said another way, roughly 63% of adults 25 and older in the U.S. had attempted or completed some level of college.

If we assume that roughly 63% of the weighted survey sample of 11,874 attended or graduated from college, the number of respondents with some college experience should approximate 7,500. If we multiply that number by the percentage of U.S. college students who attend for-profit institutions (approximately 5 percent, per the Department of Education), approximately 375 of the respondents attended for-profit colleges and universities. (Note: the for-profit college student population has fluctuated over the years, and the percentage was as high as 10% for a few years, but I am using 5%, the current percentage, for this discussion. The appropriate number is likely somewhere between 375 and 750.) I tried to find the precise number in the survey responses in the code book, but because this classification was determined outside of the survey, the actual numbers are not included. When responses are sorted by type of institution attended, given the small sample sizes and response rates, the outcomes may or may not be reasonable.
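For what it is worth, the back-of-envelope arithmetic above can be restated as a short calculation. The college-attendance share and the 5% and 10% for-profit enrollment shares are the rough figures discussed in this post, not numbers drawn from the report itself.

# A rough check on the size of the for-profit subgroup, using the figures above.
total_respondents = 11_874       # final 2021 sample after removals
share_with_college = 0.63        # approximate Census share of adults 25+ with any college
for_profit_share_low = 0.05      # approximate current share of students at for-profits
for_profit_share_high = 0.10     # rough historical high for that share

college_respondents = total_respondents * share_with_college
low_estimate = college_respondents * for_profit_share_low
high_estimate = college_respondents * for_profit_share_high

# Prints roughly 7481, 374, and 748: a small slice of an 11,874-person sample.
print(round(college_respondents), round(low_estimate), round(high_estimate))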

The section of the report that discusses the overall value of education begins on page 67. The opening paragraph notes that 70 percent of adults had enrolled in an educational degree program beyond high school at the time of the survey (broadly consistent with my estimate above). Reading through the report in detail reveals a statement on page 74: “Greater difficulties with loan repayment among attendees of for-profit institutions may partly reflect the lower returns on degrees from these institutions.” That statement is not based on the survey data; according to its footnote, it is sourced from a 2012 paper by David Deming, Claudia Goldin, and Lawrence Katz titled The For-Profit Postsecondary School Sector: Nimble Critters or Agile Predators? In the abstract of that paper, the authors write, “We find that relative to these other institutions, for-profits educate a larger fraction of minority, disadvantaged, and older students, and they have greater success at retaining students in their first year and getting them to complete short programs at the certificate and AA levels. But we also find that for-profit students end up with higher unemployment and “idleness” rates and lower earnings six years after entering programs than do comparable students from other schools and that, not surprisingly, they have far greater default rates on their loans.” Reviewing that 2012 article here would be a distraction, but I found some of its findings questionable when I first read it years ago.

The report continues with the quote mentioned in the article I first read. “Indeed, when accounting for race and ethnicity, parents’ education, level of institution (two or four year), and institution selectivity, the relationship between for-profit institution attendance and being behind on student loan payments persists. This suggests that the high payment difficulty rates for attendees of for-profit institutions reflect characteristics of the schools and is not simply due to the characteristics of their students.”

My problem with this language is that the report does not explain what the statisticians did to account for the differences in the variables mentioned. Did they sort the data in an attempt to normalize the population relative to non-profit schools? We already know (from the Deming, Goldin, and Katz paper cited, among others) that for-profits educate a higher percentage of minority and older students. We also know from college student retention studies (and the many literature reviews that cite them) that the background characteristics of the college student (race, number of parents living in the household, parents’ educational background, high school GPA, SAT/ACT scores, etc.) are the most influential variables determining college persistence.

How was the data sorted? What method did the researchers use to determine that these characteristics of individuals reflect characteristics of the schools? Did they compare the data as sorted for this analysis to a similar sort for community colleges (another group with non-selective admissions requirements)? As I noted before, the report provides no supporting data or statements. Controlling for these demographics implies that the sample was sorted into ever-smaller groups, once again putting the accuracy of the outcome in doubt. I suggest that the silence indicates that this is an inferred statement and not one supported by any reliable data.
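For context, one common way analysts “account for” variables like these is a regression with demographic controls. The Python sketch below is a hypothetical illustration of that approach, with invented file and variable names; nothing in the report confirms that this, or any particular, method was used, which is precisely the problem.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical respondent-level file built from the public-use data;
# the file name and column names below are invented for illustration.
df = pd.read_csv("shed_2021_borrowers.csv")

# Logistic regression: probability of being behind on student loan payments,
# with controls for the variables the report says it accounted for.
# (Survey weights are omitted here to keep the sketch short.)
model = smf.logit(
    "behind_on_payments ~ for_profit + C(race_ethnicity) + C(parents_education)"
    " + two_year + C(selectivity)",
    data=df,
).fit()
print(model.summary())

# A coefficient on for_profit that stays positive and significant after adding
# the controls is what "the relationship persists" would mean in this framing.
# By itself, that does not show the difficulty reflects school characteristics
# rather than unmeasured student characteristics, which is the inference the
# report makes without showing its work.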

In the Inside Higher Ed article titled Student Debt’s Impact on Perceived Value of College, written by Doug Lederman, I noted that he chose not to quote the report’s language attributing poor results to the characteristics of the institutions.

The Federal Reserve’s survey provides some useful data about individuals’ perceptions of the value of a college education and the debt taken on to pay for it. I suggest reading the Inside Higher Ed article if you want a quick read and, for a fuller understanding, reading the entire report. However, the lack of transparency about how the researchers associated institutional characteristics with individual data concerns me.

Research papers submitted to journals usually go through a double-blind review process in which the journal’s editor sends the paper, without the authors’ names, to two reviewers who provide critiques and a recommendation for or against publication. It is the editor’s choice to accept the paper with or without the recommended changes, which are provided to the authors. When a government entity or government-affiliated entity publishes data, many people assume it is right. The Department of Education usually makes extensive datasets available to researchers unless the data is privacy-protected. As the amount of published data increases, it is important to examine our processes for transparency, including roadmaps to the analysis and the logic behind the assumptions or linkages made in the analyses. My dive into the numbers provided here left me with more unanswered questions. It is unfortunate that many readers will assume this report is accurate without understanding the limitations of statistical samples sorted through control variables.

Photo credit: Tada Images – stock.adobe.com
