The Overflowing Stack: AI Coding Tools
Coding is arguably the single best application of large language models (LLMs) to date in terms of measurable productivity impact. GitHub Copilot, a tool that has become synonymous with coding AI, is used by more than 1 million developers to write ~45% of their code on average, saving ~55% of their coding time. The success of Copilot has catalyzed a fast-growing market for AI Coding Tools: Big Tech players are competing for primacy with their foundational models, multi-billion dollar unicorns are building AI feature sets into their products, startups are coming up with innovative use cases, and individual developers are contributing to a thriving open-source ecosystem. Here is an introduction to the market and what to watch out for in 2023.
Market: 100+ tools with Big Tech, Unicorns and Startups each competing uniquely
Big Tech: Competing across code-writing “Copilots” and foundational models
- Microsoft-OpenAI: Dominating the market with the most popular coding assistants (Copilot and ChatGPT), a strong distribution advantage through GitHub and VS Code, and the primary foundational models (Codex, GPT-4) that new AI coding tools are being built on.
- Google: A recent entrant, Google is competing on several fronts - adding code support to Bard, partnering with Replit for distribution (through Replit’s IDE) and joint development, and providing access to their Codey foundational model for developers to build coding tools.
- AWS: Launched CodeWhisperer to compete among coding assistants, based on an in-house LLM. Relying on AWS infrastructure for distribution.
- Salesforce: Taking an open-source approach to the market through its CodeGen and CodeT5 foundational models. Largely focused on providing developers tools to build apps within Salesforce (vs competing with other Big Tech players).
Unicorns: Building, acquiring and partnering to add AI features in their core tools
Unicorns are adding AI-based features to their tools through in-house development (e.g., Replit's Ghostwriter, Sourcegraph's Cody), acquisitions (e.g., CircleCI's Ponicode, Datadog's Codiga), and strategic partnerships for distribution and joint development (e.g., Replit-Google, Hugging Face-ServiceNow).
Startups: Building for niche use cases and downstream automation
While some startups are competing head-on with Copilot with their own coding assistants (e.g. Magic, Tabnine, BlackBox, Codeium), much of the startup innovation is focused on:
- Code assistants for niche use cases - e.g. Warp and Fig provide AI code writers for command-line interfaces; Second and Debuild provide AI agents focused on web apps.
- Specific downstream use cases - e.g. Replay and Moderne offer AI-based automation for code reviews / testing; Redocly and Mintlify do AI-based auto-documentation. These tools typically offer a high degree of automation for these specific use cases.
Individual Developers: Building quality-of-life improvements
A highly engaged community of indie developers is building small open-source projects, typically wrappers on top of foundation models that add quality-of-life improvements (e.g. improving AI workflows within a specific IDE, or AI assistants for very specific tasks like SQL queries).
Funding landscape: Muted startup funding amid Big Tech dominance, specific / niche use cases attracting bets
Funding for new, AI-native coding tools has been fairly slow, especially when compared to the big deals in AI Copywriting Tools (Jasper - $125M, Tome - $75M, Typeface - $65M). Investors have instead doubled down on Unicorns and Big Tech (e.g. Replit’s $100M raise, OpenAI’s $300M raise). Even some of the better-funded startups in the space, like Diffblue, Swimm and Tabnine, are tools that were started before the AI revolution and adopted AI-based features subsequently.
This trend is understandable given the dominance of GitHub Copilot in the space. Unlike copywriting, where the first wave of AI-based tools came from innovative startups (with the likes of Microsoft, Google and Adobe entering the fray relatively later), Copilot was launched roughly a year after GPT-3’s release. A combination of first-mover advantage and solid distribution (GitHub + VS Code) pulled the market towards it. The demise of Kite, a promising code-writing assistant with $17M in funding, is a case in point for why investors are hesitant to fund startups competing head-on with Copilot and other Big Tech / Unicorn tools.
VC bets in the space have thus gone to startups serving niche use cases (e.g. Warp and Fig, providing AI code assistance for command-line interfaces) or specializing in downstream developer activities (e.g. CodiumAI, Replay and Moderne for automated testing / bug identification; ReadMe and Mintlify for automated documentation).
Let’s take a look at the startups in the space through an alternative-data lens.
Developer traction: Cursor, Warp, Fig, Redocly and Mintlify seeing strong developer traction
GitHub activity tends to be a strong indicator of traction, specifically for developer tools, as GitHub is the primary platform where developers interact with tools and contribute to open-source projects. A majority of AI-Coding Tools have taken open-source approaches to rapidly innovate on their products. GitHub stars (similar to “follow” on social media) are a direct indicator of a project’s popularity on GitHub.
Fig, Redocly, Tabnine, Warp and Cursor have very popular repos on GitHub (for reference, <0.002% of repos on GitHub have >10K stars). Cursor, Warp and Fig have the fastest-growing repos among these tools (~10,500, ~2,200 and ~1,100 stars added in the last quarter respectively), followed by Redocly and Mintlify.
Cursor, an OpenAI-funded startup that launched an AI-based Integrated Development Environment (IDE) in March this year, is seeing remarkable traction and reached 10K stars by the end of March. Its value proposition is natural AI-based workflows within the IDE (e.g. ChatGPT-style code chat that can access your code, the ability to see inline diffs of AI edits, and auto-debug by hovering over errors).
Warp and Fig provide AI coding solutions for Command Line Interfaces (CLIs), a niche use case where developers are seeing value. Given that CLIs are used widely, if intermittently, by developers for various tasks (e.g. server administration, remote server operations, version control, scripting), these startups seemingly have a strong platform for scaling their solutions.
Redocly and Mintlify are emerging as popular solutions for documentation. Redocly especially is standing out - with no publicly announced funding to its name.
Tabnine is a well-established code assistant (launched in 2013) that has been able to hold its own against Copilot by offering code privacy, local hosting, broad language support, multiple IDE integrations and the ability to train the model on your own codebase. It will be interesting to see how it fares as the race for coding assistants heats up among Big Tech.
Website traction: BlackBox, Tabnine have built impressive user bases
Website visits tend to be a good indicator of traction for most SaaS businesses, including developer tools, as they are typically accessed through the startup’s website.
Apart from the tools discussed earlier, BlackBox shows strong traction on its website. BlackBox provides a GitHub Copilot alternative that offers unique features like code search (without leaving the coding environment), code extraction from videos, broad language support and multiple IDE integrations. As with all coding assistants, the challenge for BlackBox lies in maintaining its traction against Big Tech offerings as competition intensifies.
Cursor’s website visits also validate the superb traction indicated by GitHub data.
Key insights and things to look out for in 2023
Big Tech battle
- Microsoft-OpenAI are the dominant players in the market.
- The big question is how Google and Amazon offerings will fare against them. Expect a fast pace of new feature releases, with each player trying to one-up the other (as we saw with Google I/O).
Startups
- Fig, Redocly, Mintlify, Cursor, BlackBox and Warp are standout early-stage startups.
- Fig, Redocly, Mintlify and Cursor would be our bets for fundraises this year (given Warp raised in 2022).
- For the likes of BlackBox and Tabnine, the challenge will be staying competitive with fast-improving Big Tech / Unicorn coding assistants.
New use cases
- The thriving open-source participation is likely to yield new innovation. Expect innovation on two dimensions: new use cases across the software life cycle (e.g. AI in hosting, CI/CD, feature testing, cloud management) and step-ups in automation, i.e. moving from ‘Copilots’ to ‘Autopilots’.
Furthermore, two macro factors are worth keeping an eye on, given their potential to significantly impact market evolution:
Game-changing new foundation models
- Google has already announced its new model ‘Gemini’ is in training.
- OpenAI is rumoured to be readying a new open-source model; this will provide another boost to startup innovation in the space.
- Recent research specifically on AI coding, such as Stanford’s paper on Parsel and Microsoft’s self-play method, suggests imminent improvements.
Regulation
- General regulation of LLMs is on the horizon; Sam Altman’s appearance before Congress provides a glimpse into the future.
- More specifically, the outcome of Microsoft and OpenAI’s legal battle over copyrighted code will be a defining moment for how AI coding tools evolve.
2023 is shaping up to be a pivotal year for AI coding tools. The stage is set for intense competition among Big Tech companies, exciting innovation from startups, and potential regulatory shifts. One thing is clear - coding is faster and easier than ever before.