Video Cloning for Education: Not That Great
This video is not me. It is an avatar created with the AI technology HeyGen. This second video where I speak some of the same content, but in French, is also not me. I don’t speak any French. This technology has a big wow factor, but what are the implications for teaching and learning?
This is part of an ongoing series where we look at emerging AI technologies to discuss what we might use these things for in the classroom. We previously looked at voice cloning technology, specifically 11labs, and the visual production platform from Gamma. While there are ample opportunities for fraud and abuse with this technology, I want to stay in my lane, so these won’t be the focus of this piece.
Video cloning gets people’s attention. When I sent this to campus I received a lot of responses from folks who were freaked out and/or impressed. The technology is remarkable, but I am not immediately convinced of the teaching and learning implications. The image of a university professor imparting knowledge to a group of attentive (bored) students in an auditorium is a bit dated. Most of us invested in the research on teaching and learning don’t do it this way. Standing in front of a room and narrating content is probably the least important part of your job. My daughter is 9; if I handed her a script, she could stand in front of a room and deliver a lecture on Section 230 of the Communications Decency Act.
Training my HeyGen avatar took some time. I had the benefit of a studio quality experience and some actual professionals in video production helping me set the room and interpret the instructions. After a few attempts we had workable training data. This technology is also comparatively expensive.
There are some legitimate use cases, like quickly producing companion videos that summarize content such as an article or a book chapter. The ability to produce content in different languages seems ethically fraught (although I can’t quite put my finger on why), but it does open up possibilities for meeting students where they are.
In my teaching I mainly run flipped classrooms and record a lot of vignette videos helping students understand key concepts, guiding them through assignments, or connecting threads from the semester. Rarely is seeing me an important part of the learning experience. If I wanted to outsource this work I would be more likely to use voice cloning with a slide deck. To be honest, there just don’t seem to be that many cases where viewing a human torso is critical to student learning.
This technology is also not quite “there” yet. Videos are capped at short intervals and the rendering can run into snags (the French video linked above cuts off too early). You could stitch together some of the videos to create something longer, but it is hard to see that as a time saver. This will probably get there at some point, but we are not at that point yet.
In summation, I don’t think this is a transformative tool for higher education, but it does get people to pay attention.