Making the cut: AI Video Tools
Generative AI will transform video content creation. Video content has exploded in the past decade, with the average consumer now watching ~17 hours of video per week. The diversity of video content is as vast as its volume - from <30s short videos on TikTok to long 3hr+ movies (h/t Nolan). Despite its ubiquity, video content creation remains a resource- and time-intensive endeavor, requiring investment in equipment & software, numerous takes, and complex editing & sound engineering skills. A large, multi-segment and effort-intensive market is the perfect breeding ground for AI-led disruption, which is already well underway. Runway’s powerful AI video editing tools are used by top-tier productions such as The Late Show and the Oscar-winning movie Everything Everywhere All at Once. Synthesia’s video platform serves 50k+ businesses in creating corporate training, marketing, and customer service videos - reportedly reducing production time by a remarkable 95%. Consequently, the startup landscape is humming with activity - strong consumer interest, significant VC dollars and a growing array of tools catering to diverse video use cases and creators.
Market: Thriving space with multiple tools finding relevance across different use cases & niches
Incumbents: Apple & Adobe have leading video-editing tools, but late to Gen AI
Pre-2017, the video creators’ toolkit largely consisted of Adobe (Premiere Pro & After Effects) or Apple (iMovie & Final Cut Pro). Both have been largely focused on high-end video production and require significant time and training to use well. Meanwhile, tools for casual creators and simpler video content were fairly limited, with most creators using rudimentary editing features offered by YouTube, Instagram and other platforms (basic cutting, filters, framing etc.).
Despite being industry heavyweights, both Apple and Adobe have been relatively late in developing AI for video. It wasn't until April 2023 that Adobe officially unveiled its entry into the generative AI sphere with 'Firefly for Video,' which is still in development. Apple is only rumored to be working on its own AI offerings.
Scale startups: Early AI pioneers have become unicorns by cracking Enterprise use cases and building strong tech differentiation
Runway and Synthesia were pioneers who began experimenting with generative AI in 2017-18 and have emerged as market-leading players. Both companies were born out of cutting-edge machine learning research, were founded by researchers and have built their own proprietary foundation models - but they target different use cases.
Runway has focused on developing video editing and visual effects (VFX) tools targeted at professional editors, filmmakers and broadcasters. Runway has built a deep tech moat with its best-in-class AI video models and launched the first open-ended text-to-video tool (Runway Gen-2) in March ‘23.
Synthesia has carved its niche by deploying AI avatars at an enterprise level - creating video with narration from human-like avatars used for learning & development, customer service and marketing needs among businesses. Today, it counts ~35% of Fortune 100 companies as users.
Surge of new tools: Emergence of new tools leveraging a large creator market and innovating around niche use cases, modalities
The video tools space is an attractive playing ground for new startups to build and innovate given the presence of a large lower-end market. Casual creators and small businesses building video content for social media present a large market with low switching costs - ideal for new startups to gain traction, build market voice, get feedback to improve the product and eventually either target higher-value segments of the market or build a strong brand that can sustain itself in the creator market. This is in contrast to markets like Legal Tech, where the lack of a lower-end market has kept new startups from making a mark and established players have been the primary beneficiaries of AI. The diversity of the video market also presents opportunities for startups to build specialized solutions designed for different use cases and needs.
As a result, there is a plethora of new players in the space specializing across different dimensions:
- Sub-use cases within video generation and editing - tools specializing in use cases such as script-to-video (InVideo), blog-to-video (Pictory), URL-to-video (Oxolo), overdubbing (Descript), AI eye-contact (Captions), video personalization (Tavus) and style enhancement (Pixop).
- Different types of Avatars - within the AI-generated avatar use case, startups are specializing in options such as choosing from default AI models (Yepic), creating a digital likeness of a particular real person (HeyGen) and creating animated avatars (Elai).
- Downstream use cases like Video recapitulation - recapitulation tools are relatively new entrants, focusing on extracting insights from content (Veedo), generating summaries (VidCatter), and creating shorter clips from longer videos (Clippah).
Funding Landscape: Strong VC interest with multiple deals across stages in the last 12 months
Robust fundraising environment
Fundraising in AI Video Tools has been more robust than other AI application markets we’ve looked at, like AI Coding, Legal and even Copywriting tools. While in other markets we’ve seen funding dollars go primarily towards incumbents adding AI features and pioneering startups that have quickly emerged as scale players, there has been a stronger willingness to bet on new startups in the video space (evident in the high number of funded startups in the space, most of which have received funding in the last 12 months). This is likely driven by a combination of -
- Large market opportunity with room for several players - Video content creation is a $150B+ market growing fast (12-15%), and the value realization from AI Video tools is high. For example, compared to text-based AI applications, where accuracy / hallucination has been a frustration for many customers, AI video tools typically cut production time / cost so significantly that customers are more willing to work through multiple revisions with the AI.
- Ability to differentiate in niche formats, use cases - The video content market is very diverse in terms of participants (from casual social media creators and enterprises to large media houses), formats (TikTok to films) and use cases (entertainment, social media, learning & development, marketing, personal messages, announcements etc.). New startups therefore have opportunities to differentiate by specializing in these formats / use cases (e.g. Synthesia specialized in enterprise learning & development before branching out; ByteDance’s CapCut, designed for editing videos for TikTok, was one of the most downloaded apps of 2022).
- Relatively lower threat from the Big Tech ecosystem - Unlike markets like AI Coding & Copywriting, where players like Microsoft and Google are leveraging their ecosystems to create end-to-end offerings, incumbents among Video tools (Adobe, Apple) have been slow to add AI video and don’t have the overwhelming ecosystem advantages of Microsoft and Google (i.e. best-in-class proprietary foundational models + established applications + strong distribution).
Top VCs as well as Strategic investors active in the market
Sequoia has been especially active in the space with 5 bets (incl. its India and China arms, pre-separation), while Andreessen Horowitz, Accel, Kleiner Perkins and Lightspeed have also made multiple bets in the space. Nvidia has backed both unicorns in the space, Runway and Synthesia, as a strategic investor. The potential value unlock is significant, as video is a compute-heavy AI application and Nvidia supplies most of the hardware used to make it possible. OpenAI’s startup fund has also invested in the space with Descript.
Business-focused tools attracting the lion’s share of funding
Unsurprisingly, business tools have attracted a large share of funding in the space. Enterprise tools are preferred by investors given their higher stickiness and customer lifetime value. All of the top 5 funded tools are enterprise focused and ~75% of all funded startups are targeting business use cases.
Website Traction: Strong consumer interest in AI Video; Veed, D-ID and Fliki seeing standout traction
Website visits are a strong indicator of traction for AI Video Tools given most tools have web apps for users.
Website traffic and growth for AI Video Tools are astronomical, with 10 startups seeing >5M visits in Q2 and a significant number of startups (10+) in the space seeing >50% growth QoQ. For contrast, among AI Coding Tools only 3 startups averaged >5M visits in the same period, with 1 among the top 10 growing >50% QoQ. Even in the more popular AI Copywriting tools space, only 5 startups averaged >5M visits and no startup saw >50% growth QoQ.
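As a rough illustration of the two screens used above (>5M visits in a quarter and >50% QoQ growth), the sketch below shows how they could be computed from quarterly visit counts. The tool names and visit numbers are hypothetical placeholders, not data from this analysis, and the sketch assumes the visit threshold is applied to the latest quarter's total.

```python
from dataclasses import dataclass


@dataclass
class Traffic:
    name: str
    q1_visits: float  # total website visits in the previous quarter
    q2_visits: float  # total website visits in the latest quarter


def qoq_growth(prev: float, curr: float) -> float:
    """Quarter-over-quarter growth as a fraction (0.5 means +50%)."""
    if prev <= 0:
        raise ValueError("previous-quarter visits must be positive")
    return (curr - prev) / prev


def screen(tools, min_visits=5_000_000, min_growth=0.5):
    """Return (names above the visit threshold, names above the growth threshold)."""
    high_traffic = [t.name for t in tools if t.q2_visits > min_visits]
    fast_growing = [
        t.name for t in tools
        if qoq_growth(t.q1_visits, t.q2_visits) > min_growth
    ]
    return high_traffic, fast_growing


# Hypothetical numbers for illustration only.
sample = [
    Traffic("tool_a", 4_000_000, 6_400_000),  # high traffic and fast growth
    Traffic("tool_b", 9_000_000, 9_900_000),  # high traffic, slow growth
    Traffic("tool_c", 500_000, 2_000_000),    # low traffic, fast growth
]

high_traffic, fast_growing = screen(sample)
print("Above 5M visits:", high_traffic)
print("Above 50% QoQ growth:", fast_growing)
```

Keeping the two screens separate mirrors how the counts are reported above: a tool can clear the visit threshold without clearing the growth threshold, and vice versa.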
Fliki.ai, launched in 2021, has seen over 300% QoQ growth in its website visits this year. Fliki’s growth can be attributed to its versatility - it offers an advanced tool that converts text, audio, or blogs to video, with voice cloning and voiceover features, an array of voice and language options and sophisticated editing controls for pronunciation, speech rate, dialect and pitch.
D-ID differentiates by enabling users to animate still images into talking avatars that sync with any script. Chat.DID, an interactive web app that allows people to video chat with artificial intelligence, made D-ID the first platform to offer a combination of digital avatars and GPT-powered communication. Additionally, a broad range of integrations and API access further make it a standout product.
Veed surpassed 1 million users in 2022 and is particularly popular as an all-in-one online video editing platform among social media creators and YouTubers. Veed differentiates by providing a remarkably broad feature set that spans image-to-video, music-to-video, transcription / subtitles and avatars, to name a few, along with a massive library of templates for different use cases (casual videos, explainer videos, product videos, training videos) as well as platforms (Facebook, Instagram, TikTok, Twitter, YouTube).
Product Reviews: Pictory, Descript and Fliki emerging as product leaders
Pictory has built a loyal customer base of creators, especially podcasters and YouTubers, who have lauded the tool for its audio-sync feature (matching the audio track to lip movements in the video), its beginner-friendly UI/UX and a well-stocked library from its partnership with Shutterstock.
Descript has received appreciation for its fundamentally different approach of applying a universally familiar document-editing UX to video editing, making it easier for people with little video editing experience to achieve great results.
InVideo has become a preferred tool among creators due to its drag-and-drop interface and 24x7 customer support. The platform's mobile application facilitates video editing on the move. Users especially highlight its ‘unlimited’ video templates, which span categories from YouTube and business videos to specialized formats like holidays, sports and education.
- Fast-maturing startup ecosystem, expect multiple startups to scale and attract big fundraises in the next 1-2 years
- New unique use cases to emerge, video tools focused on live-streaming could present a significant opportunity
- Runway, Synthesia have created deep differentiation, likely to continue to be market pioneers on new features and product leaps
- Adobe is the sleeping giant, Firefly for Video will be a major factor in the market as it matures
- Discussion on the threats created by AI-generated video to intensify
While fundraising has been hot already over the last year, expect bigger rounds on the horizon as startups see strong consumer traction and products mature. Descript is a clear candidate to be the next unicorn in the space. Fliki, Pictory, Veed, D-ID and InVideo would be our bets for fundraises in the next 1-2 years.
The depth and diversity of the video market will incentivize startups to build for more niche use cases. Live-streaming is a relatively untouched use case for video tools, and could present a significant opportunity given the size of the streaming market (including creators as well as live broadcasts like sports, news etc.).
Runway and Synthesia have built best-in-class tools backed by deep research and proprietary foundation models - it will be difficult for new players to erode this advantage. Hence, we expect Runway & Synthesia to lead development of new cutting-edge tools & features.
Adobe’s Firefly tools for image editing were groundbreaking - generative fill and prompt-based edits have been game-changing and allowed Adobe to maintain a strong position in the Enterprise market vs Stable Diffusion and Midjourney. While Firefly’s video suite has been slow to release and is still nascent, it will be one to keep an eye on as it matures.
AI-generated video has garnered valid concern around a) its misuse to spread misinformation, create sexually explicit content and enable identity theft (e.g. through deepfakes) and b) its impact on artists and artistic expression (e.g. the writers’ strike in Hollywood has AI as a core issue). This discussion will intensify as governments and courtrooms race to add checks and balances to fast-improving AI capabilities.