An intense debate has opened up on the Creative Commons Open Education email list. It extends discussions that have been brewing for some time about whether Open Education practitioners should support or fight against Large Language Model developers scraping web publications, without either attribution or explicit permission, to use as training data for generative AI.

This week the debate heated up following the advertisement of a webinar featuring a presentation by Dave Wiley:

The University of Regina's OEP Program invites you to a special online presentation by Dr. David Wiley. Dr. Wiley is widely recognized as one of the founders of and key thinkers surrounding the open movement in education.

Date: Thursday September 19, 2024

Abstract:

For over 25 years, the primary goal of the open education movement has been increasing access to educational opportunity. And from the beginning of the movement the primary tactic for accomplishing this goal has been creating and sharing OER. However, using generative AI is a demonstrably more powerful and effective way to increase access to educational opportunity. Consequently, if we are to remain true to our overall goal, we must begin shifting our focus from OER to generative AI.

There was near-instant pushback on the list. Heather Ross wrote:

I’m really troubled by so many in the open movement seeing GenAI as a natural fit with OER. OER aligns with several of the UN SDGs and is being used to integrate sustainability into curriculum, teaching about how all disciplines are tied to the SDGs. GenAI is an environmental nightmare. OER is being used to integrate EDI and Indigenization into curriculum. GenAI, programmed by those of dominant groups, often fails to represent or misrepresents members of marginalized communities. Taking what isn’t yours to create something new without giving credit, having permission, or considering the impact on others isn’t innovation or acting in the spirit of open. It’s colonization. OER has always called for recognition of the work’s creators and contributors and gratitude for their willingness to share it openly. Any gratitude toward GenAI-created work that was taught on copyrighted works against the copyright holder’s permission will ring hollow. During my comprehensive exam, a committee member asked me what the difference between OER and Napster was. At the time, that was easy to answer. Most OER was created by authors who willingly released their work with an open license. Napster was the sharing of music without the artist’s permission. If I were asked that question now, it would be a lot harder to answer.

And Dave Wiley came back to say:

It feels like we spent the second full decade of the OER movement, from 2008 - 2018, running non-stop workshops about copyright and the Creative Commons licenses. We had to spend ten years that way because there are certain fundamentals about copyright and licensing that a person has to understand before they can participate in the OER movement in a way that goes beyond reusing content created by others.

The same is true for generative AI. People who want to participate as something more than reusers of generative AI tools created by others will need at least some proficiency in prompt engineering, retrieval augmented generation, fine-tuning, and other topics. I agree that smaller models running locally is where this all needs to go eventually, which means additional understanding will be needed in techniques like quantizing, pruning, and distilling the knowledge of larger models into smaller ones so these models can fit (and run) on edge devices like consumer laptops and phones. 

There are strong analogs between the revise and remix potentials created by openly licensed content and the revise and remix potentials created by openly licensed model weights. And the overall educational potential is far greater for open weights than open content. But without some baseline understanding of how generative AI works it will be difficult to participate (productively) in these kinds of conversations. It looks like we might have another decade of dry, technical, arcane professional development workshops ahead of us. :)

This is some of the territory I'm going to cover in the talk in a couple of weeks.

Stephen Downes weighed in with a post on his blog entitled "What is the Soul of Open Education?"

I've had my disagreements with Wiley over the years but we are in agreement on this point. Now what it means to say "increase access to educational opportunity" may be another point of contention; creating startups and making money isn't my idea of progress. But we agree on the potential of AI.....

If it takes (AI) a fraction of the resources it used to take to create a useful and usable OER, even if it has to be corrected for misrepresentation, then there is far more opportunity for people in under-represented groups to create resources where they see themselves reflected in the materials being used in learning. AI-assisted transcription and translation, resource recommendation, community formation and more can also help members of marginalized groups.

There were many more contributions and I am sure we have only seen the start of this debate. But it seems a very important one for the future of Open Education and for Open Education practitioners wrestling with AI.

More to follow.

The Creative Commons Open Education Platform is a space for open education advocates and practitioners to identify, plan and coordinate multi-national open education content, practices and policy activities to foster better sharing of knowledge.

This platform is open to all interested people working in open education.

You can join the email list at cc-openedu [at] googlegroups [dot] com
