OpenAI’s latest toy is Sora, a creation that does text-to-video the way ChatGPT does text-to-text and DALL-E does text-to-image. I was curious how it would handle very simple single-word prompts, which sometimes lead to interesting results when I try them on Google searches.
Just after signing in, I decided that, well, the most important thing was that I’d like to see a good video. So I entered the single-word prompt “good” and got the following storyboard in response:
I approved the storyboard without further ado. So, here! This is what Sora thinks is good:
I was given a choice between two versions of each of the three videos presented here—I picked the one I liked better. I used the default settings to keep the exercise as scientific as possible—requesting more variations and tweaking the descriptions are among the adjustments one can make.
Next I tried the word “palindrome.” I’ve noticed ChatGPT has done a poor job rendering palindromes in the past, so I wanted to see what Sora would do. This time, the storyboard came in two parts:
This suggested Sora might understand the concept of “palindrome” at least a little, although the “visual palindrome effect” sounded more like an ambigram. How’d it work out?
Not too well. Even allowing for whatever the middle’s trying to do, it couldn’t even spell the end of “palindrome” correctly—it’s -drome, not -dome.
A five-second video doesn’t have a lot of room to be deep, but it can at least be funny. For my last exercise, I asked Sora to riff on the word “funny.” This time I didn’t review the storyboard: I just asked it to give me “funny” and watched what came out.
Well, it appears Sora has learned a few things from watching the internet.
Next: A gathering of namesakes!