It competes with some really cool tools like Deforum, Plazmapunk, and Decoherence, which sort of chain image-to-image generations together, frame after frame, to give a cool animation effect.
But we haven't really had true text-to-video, where, the same way we'd use Stable Diffusion or Midjourney for images, we type in what we want to see and actually get a video of that. ModelScope is a promising tool to get us there!