For more than two decades, I have looked for innovations and advances in technology that can improve outcomes in education. Thanks to Moore’s Law, the costs of many technologies have decreased. What hasn’t decreased are the costs of programmers, trainers, and training time required for faculty, staff, and students. The return on investment of deploying the technology is an important factor for buyers and sellers alike.
Another important change influenced by technology has been the availability of free content on the Internet. Open Educational Resources (OER) continue to proliferate with many curated collections hosted by institutions, states, foundations, and other entities.
I recently learned of a research organization called Open Syllabus. Open Syllabus collects and analyzes millions of course syllabi for the purpose of “helping instructors develop classes, libraries manage collections, and presses develop books.” According to their website, they have collected over 9 million English-language syllabi from 140 countries. Approximately half of the syllabi come from classes that assign course readings. Fascinatingly, they use machine learning techniques to extract citations, dates, fields, and other metadata from the syllabi that they collect.
Open Syllabus’ OER Metrics tracks the titles assigned in syllabi and orders them by how often they are assigned across the collection. If a title appears more than once on a syllabus, it counts only once. If it appears in suggested reading, it still counts. The researchers have not yet found a way to distinguish primary reading from secondary reading, but it wouldn’t surprise me if that becomes possible at some point. The most frequently taught text, The Elements of Style, is ranked #1 (note: The Elements of Style was taught when I attended college in the 1970s). The researchers have developed a Score for each title that converts its rank to a 1-100 scale. The top 16 titles have a score of 100, whereas low-count titles have scores of 1 or 2.
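To make the counting rule concrete, here is a minimal Python sketch of how I picture it. It counts each title once per syllabus, ranks titles by count, and converts rank to a 1-100 score. The function names and the placeholder titles are mine, and the linear rank-to-score mapping is purely an assumption; Open Syllabus does not spell out its actual conversion in the material I read.

```python
from collections import Counter


def count_assignments(syllabi):
    """Count, for each title, how many syllabi mention it at least once."""
    counts = Counter()
    for titles in syllabi:
        # set() drops repeats, so a title counts only once per syllabus
        counts.update(set(titles))
    return counts


def rank_titles(counts):
    """Rank titles by count (1 = most assigned); tied titles share a rank."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    ranks = {}
    previous_count, previous_rank = None, 0
    for position, (title, count) in enumerate(ordered, start=1):
        if count != previous_count:
            previous_rank = position
            previous_count = count
        ranks[title] = previous_rank
    return ranks


def score_from_rank(rank, total_titles):
    """ASSUMED linear mapping of rank onto 1-100; not Open Syllabus' formula."""
    if total_titles <= 1:
        return 100
    return round(100 - 99 * (rank - 1) / (total_titles - 1))


# Hypothetical syllabi: each inner list is the readings found on one syllabus.
syllabi = [
    ["The Elements of Style", "The Elements of Style", "Title B"],
    ["The Elements of Style"],
    ["Title B", "Title C"],
    ["The Elements of Style"],
]
counts = count_assignments(syllabi)
ranks = rank_titles(counts)
for title, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(rank, title, counts[title], score_from_rank(rank, len(ranks)))
```

Note that the first syllabus lists The Elements of Style twice but contributes only one to its count, which matches the rule described above.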
One of the questions that I had was how the machine learning technology identified the titles extracted from the syllabi. Part of that process is matching groupings of words against a master catalog of titles. The master catalog is composed of titles from The Library of Congress, Open Library, Crossref, and the Open Textbook Library. The syllabus explorer tool identifies citations by looking for title/author name combinations in the syllabi. Matches are run through a neural network “trained to distinguish citations from other occurrences of the same words.” The neural network process is 90 percent as accurate as human labeling.
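The first stage of that process, finding title/author combinations, can be sketched in a few lines of Python. This is only my illustration of the idea, not Open Syllabus' code: the catalog format, the function names, and the 200-character window are all assumptions, and the neural network that filters the candidates afterward is not shown.

```python
import re


def normalize(text):
    """Lowercase and turn punctuation into spaces so formatting is ignored."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_candidates(syllabus_text, catalog, window=200):
    """Return catalog entries whose title and author surname appear close together.

    `catalog` is a hypothetical list of dicts with non-empty "title" and
    "author" keys. `window` (an assumed value) is how many characters on
    either side of the title are searched for the surname.
    """
    text = " " + normalize(syllabus_text) + " "
    candidates = []
    for entry in catalog:
        # Pad with spaces so only whole-word matches are accepted
        title = " " + normalize(entry["title"]) + " "
        surname = " " + normalize(entry["author"]).split()[-1] + " "
        start = text.find(title)
        while start != -1:
            nearby = text[max(0, start - window): start + len(title) + window]
            if surname in nearby:
                candidates.append(entry)
                break
            start = text.find(title, start + 1)
    return candidates


# Hypothetical one-entry catalog and two snippets of syllabus text.
catalog = [{"title": "The Elements of Style", "author": "William Strunk"}]
citation = "Required: Strunk, W. The Elements of Style. Bring it to every class."
passing_mention = "In week 3 we discuss the elements of style in modern prose."

print(find_candidates(citation, catalog))
print(find_candidates(passing_mention, catalog))
```

The second snippet contains the same words as the title but no author name, so it produces no candidate. Cases where both the words and a surname appear by coincidence are the ones a trained classifier would have to sort out.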
Best practices in constructing syllabi include stating the desired learning outcomes for the course. The researchers write that they can reliably extract the individual learning outcomes included in syllabi. They report that their dataset contains approximately 20 million unique learning outcomes. I found a map that they created of the learning outcomes landscape to be fascinating for its clustering of learning outcomes by academic field. The researchers used 3 million (15 percent of the total) learning outcomes to construct this map. I have appended it below.
The researchers cite a statistic from Lumina that there are over 3,000 learning outcome frameworks utilized in the United States. They believe that their machine learning techniques could facilitate course transfer by matching readings and learning outcomes for syllabi.
The researchers responsible for the creation and development of Open Syllabus acknowledge that they do not have a business model that could provide long-term sustainability for the project. They have received financial support from The Arcadia Fund, The ECMC Foundation, The Sloan Foundation, The Hewlett Foundation, and The Templeton Foundation.
Over the medium to long term, the Open Syllabus researchers’ goal is to build a Syllabus Commons. Universities will share syllabi through the Commons as well as support the work of the Open Syllabus Project. Partnering schools will receive deeper curricular analytics as well as other services.
I think the Open Syllabus Project (OSP) is fascinating from several perspectives. Unlike some other projects, the researchers have attempted to collect as many English-language syllabi as they can. Extracting and ranking titles and citations creates information that may indicate academic relevance for a specific reading. Collecting designated learning outcomes for courses is interesting on a few fronts. Sadly, colleges and universities are not as good at publishing their assessments of learning, and OSP does not indicate that it will attempt to collect them. Being able to evaluate a title based on the frequency of its assignment in a course is one indicator of reliability. Being able to evaluate a title based on the learning outcomes of the students who use it is an even better indicator of effectiveness.
While linking assessments measuring learning to learning outcomes from syllabi may be a long-term goal that is not attainable currently, I recently met with a company, BibliU, that has a product that might be linkable with the OSP. BibliU hosts a textbook and OER platform for colleges and universities. One component of their platform allows faculty to see data from students’ reading assignments such as when an assignment was read, how many pages were viewed, and how long the student(s) took to read the assignment. I haven’t seen a research report linking the time spent on reading assignments to course completions, but with data like that accessible from BibliU, an astute researcher might be able to correlate the thoroughness of reading with final grades.
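For readers curious what such a correlation would involve, here is a minimal Python sketch that computes a Pearson correlation coefficient between reading time and final grades. The numbers are invented purely for illustration; they are not BibliU data, and I am assuming a simple per-student export of minutes read and final grade.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / sqrt(spread_x * spread_y)


# Invented example values, one entry per student (NOT real data).
minutes_reading = [30, 45, 60, 90, 120]
final_grades = [68, 72, 75, 85, 88]

r = pearson(minutes_reading, final_grades)
print(f"r = {r:.2f}")
```

A value near 1 would suggest that students who spend more time reading tend to earn higher grades, though a correlation alone would not show that the reading caused the better grades.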
The Open Syllabus Project is one that I will continue to track. For anyone involved in developing and teaching courses, I recommend it as a tool for checking whether their courses are relevant and current. Given the volume of data available, this platform is a big-data researcher’s dream.