
State of AI & predictions for 2024


It is Jan 7, 2024, a little over a year since ChatGPT was released to the public. Here are some of my thoughts on where AI is today, as well as some predictions of where it’s going.

This has been the year of generative AI, and what a technology moment it has been. As Hemingway would say: the LLM revolution happened in two ways, gradually and then suddenly. The research had been baking for years, and this year everything fell into place.

Soon after the first modality, text, demonstrated immense reasoning and generation capabilities, the other modalities started pouring out as fast followers. Image models, pioneered by Midjourney, have reached incredible levels of quality and artistry. This is well illustrated by Nick’s tweet of the before/after fitness pic for Midjourney.

Video is still early, but coming out of its shell with products like Pika and Runway. We’re starting to see news being personalized for you and delivered via a video feed of a generated human, which you can interact with.

Finally, audio models, both text-to-speech and speech-to-text, have reached a level that can fool any human. I personally use the “voice” version of ChatGPT regularly to brainstorm while I walk my dog. We sped past the Turing test so fast that no one even batted an eyelash. What now?


Let’s take a breath and start by looking at where products built with these models are today. After that, I will share some of my predictions of where all of this is going in 2024. In this post I’m focusing on products, markets and people. There’s a lot to say by going deep into the technical details of architectures and infrastructure of AI, but I’m leaving that for a future post.

Today


Beyond the most basic “run a model against a prompt” products like ChatGPT and Midjourney, we’re starting to see new, major product categories emerge. Three really stood out to me this year:

Consumer assistants

These are by far the most common; we’re having a chatbot renaissance. Entrepreneurs from 2017 must be watching, annoyed, as their vision of the future from back then is now a reality.

The recipe here is simple: 1) pick a domain, 2) apply LLMs to it, and 3) profit. These are sometimes called copilots, but I like the term assistant. Examples include Harvey in the legal space, companion GPTs like character.ai, life guides like Dot, wellness or nutrition assistants like Welling, doctor GPTs, and of course the O.G.: ChatGPT.

This is the most obvious idea that the new performance of LLMs unlocks, and it is therefore being tackled by many teams all around. OpenAI is trying to build a marketplace of such “GPTs”, but I predict this will flop. The examples above are incredibly nuanced and take a lot of work and custom architectures to get right. Simply giving a prompt and a bit of custom data to an LLM does not make a reliable specialized assistant; it takes a lot of product details, which in turn require nuanced engineering (memory, retrieval, optimized architectures, personalization, and reliability are all significant challenges for these products).
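To make the point concrete, here is a toy sketch of why a specialized assistant is more than “a prompt and some data”: even the most minimal retrieval step is its own piece of engineering. Everything below is hypothetical, the keyword-overlap scoring stands in for embedding-based search, and the actual model call is omitted; no real product works exactly like this.

```python
# Toy sketch: a "specialized assistant" needs more than a prompt.
# Even this minimal retrieval step is real engineering: chunking,
# ranking, and prompt assembly all have to be built and tuned.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap (a stand-in for
    embedding-based search in a real system)."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble a retrieval-augmented prompt to send to an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Tenants must give 30 days notice before moving out.",
    "The security deposit is refundable within 14 days.",
    "Pets are allowed with a monthly fee.",
]
prompt = build_prompt("How much notice must a tenant give?", docs)
# `prompt` would then be sent to an LLM; the call is omitted here.
```

And this is only retrieval; memory, personalization, and reliability each add comparable layers of work on top.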

There are some curious missing players here: why isn’t Alexa powered by one of these yet? What about Siri? That has got to be the worst AI in the world right now. I’m a little surprised at how slow Amazon and Apple have been in this race; they’re obviously building something for Alexa and Siri. Maybe they simply have strong conviction that their distribution advantage is so untouchable that they can take their time: Apple is in the business of shipping later than the market but shipping better, and winning. Or maybe they’re just bureaucratic slowcoaches. If you have insider insight, please let me know.

Where are Apple and Amazon?

Code generation

This is an interesting category, and arguably could be folded into the consumer assistants above, but it’s big and unique enough that it’s worth analyzing by itself.

Here’s why coding assistants, and products which leverage LLMs to understand, iterate on, and generate code are so interesting:

  1. Code is written in formalized languages (TypeScript, Python, Java, …), and the formal syntax makes LLMs terrific at predicting the next token.
  2. There is a lot of code to train on.
  3. Developers, the people building products with AI, are domain experts, which means they can evaluate the performance of the AI product without requiring another expert. This is a big deal: if you’re a developer building a legal GPT, the speed at which you can conclude that an experiment was good or bad is significantly slower than for a developer building a coding assistant. It’s all about shorter feedback loops, which compound. For this reason I am seeing coding agents progress faster than any other category.
  4. Finally, developers love automation, and LLMs are the new automation fruit, simply too delicious to pass up, so these products benefit from a very competitive landscape.

I’m seeing amazing innovation in this area: CodeStory and many others building VS Code forks or extensions, and Sweep.dev working on getting LLMs to behave like junior engineers and take care of minor features, testing, and smaller bugs. Cody and GitHub Copilot have very high penetration among developers as far as I can tell.

I expect developers to be the first to benefit from the LLM wave, just like they were the first to take advantage of computers.

Unstructured document parsing

This one is pretty boring but true nonetheless. Data pipelines with previously complex heuristic logic are being migrated to LLMs, and this is happening all over the place. Well, it’s happening in particular at large companies, full of backend engineers whose job it is to do some kind of ETL between obscure services.

LLMs are fantastic at classification tasks and fuzzy feature extraction from semi-structured data sources (think web scraping, or pulling info from a PDF). I have just described a huge portion of what backend engineers in large companies do every day: ingest some data from a source, transform it, and load it into the next service. It’s hard to overstate how common this type of work is, and how well LLMs perform at it (even weaker models, like GPT-3.5).
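The shape of such a pipeline step is roughly the sketch below: prompt the model for structured JSON, parse it, and validate before loading downstream. The prompt wording and the `call_llm` stub are illustrative assumptions (the stub returns a canned response so the example is self-contained); in production this would be a call to a hosted or local model.

```python
import json

# Sketch of an LLM-backed extraction step replacing heuristic parsing
# in an ETL pipeline. `call_llm` is a stub, not a real API.

EXTRACTION_PROMPT = """Extract the vendor, total, and currency from the
invoice text below. Respond with only a JSON object with keys
"vendor", "total", and "currency".

Invoice text:
{text}
"""

def call_llm(prompt: str) -> str:
    # Stub standing in for a real model call; returns a canned
    # response so this example runs without network access.
    return '{"vendor": "Acme Corp", "total": 1249.99, "currency": "USD"}'

def extract_invoice_fields(text: str) -> dict:
    """Ask the model for structured output, then validate it
    before loading it into the next service."""
    raw = call_llm(EXTRACTION_PROMPT.format(text=text))
    fields = json.loads(raw)
    missing = {"vendor", "total", "currency"} - fields.keys()
    if missing:
        raise ValueError(f"model response missing fields: {missing}")
    return fields

fields = extract_invoice_fields("ACME CORP ... TOTAL DUE: $1,249.99")
```

The validation step matters: unlike a regex, the model can return malformed or incomplete JSON, so the pipeline has to check the output before trusting it.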

I won’t spend too much time on this, but after talking with larger companies, I’ve observed this to be one of the biggest applications of the technology in production today. It feels more like a productivity increase than a quantum leap in outcomes, if you ask me.

An image I made using Midjourney and their --tile parameter.



Tomorrow: my predictions

Superproductivity of engineers: software commoditization

My first prediction is about developers. Given the previously discussed attention on code understanding and generation, we will see a remarkable productivity boost for software engineers. Combined with a huge increase in the number of engineers entering the market, this leads us towards a big shift: code is about to become a commodity.

Graph of CS degrees quickly overtaking humanities ones

The productivity gains will likely take the shape of AIs writing simple code, behaving like junior engineers, and accelerating debugging workflows. I personally don’t believe we’re going to reach high-level autonomy in 2024, that is, telling an AI to build a whole system or product for you without supervision and getting something that works. It will still be a copilot, but productivity should shoot up 2x to 10x depending on how well developers learn to leverage the tools.

Thus, code will commoditize. Which leads me to my next point.

The rising importance of taste, product sense, and design

As software becomes commoditized, the differentiating factor for successful products will increasingly be not just what your software can do, but how it does it. Product sense, taste, and design will emerge as key skills. They’ve mattered up until now, of course, but for a lot of products, the bottleneck has been engineering. If you’re a developer, it’s a great time to level up your design and product skills, and improve your attention to detail. This shift will rebalance the value of skill sets within the tech industry, placing greater emphasis on roles that have sometimes been seen as secondary to core engineering, such as UI/UX designers and product managers.

I predict the ratio of non-technical to technical members of AI products will trend up in 2024.

Models moving to the edge

I’m a huge believer in open source, inside and outside of AI. Today, proprietary models are ahead (OpenAI’s ChatGPT in particular): big models served as black boxes by a few incumbents. However, the trend is clear: open-source models are catching up (Llama, Mistral) and getting us closer to a future where we can push this stuff to the edge, running on devices we own, not a computer in the cloud.

This makes a lot of sense. I believe in a world where a lot of smaller open-source, fine-tuned models run on the edge, each performing a particular task in a privacy-friendly, low-latency, and cost-optimized way (the edge device pays for the compute).

This advancement will not only enhance existing applications but also enable new types of experiences, particularly in areas where real-time processing is crucial, such as real-time translation, and augmented reality.

I think that by the end of 2024, we’ll see significantly more people running tools like GitHub Copilot with local models on their machines. Video calls will leverage AI models to improve your video in real time, once again running locally. I predict that edge devices will beef up in the short term so that we can usher this world forward. It’s worth noting that OpenAI, an $80B juggernaut led by a brilliant businessperson, will not just lie down and die here. They will fight back in interesting ways, the most obvious of which is going vertical and building a hardware device optimized to run their models on the edge.

I predict that most of us will regularly use products running models directly on our phones and laptops by the end of 2024.

Streaming probably won’t matter

You know how right now, all of the LLM apps stream tokens back onto the screen, printing words one by one as if the AI were typing? ChatGPT of course, but most GPT-4-powered apps choose this UX, because waiting 25s for a response is absolutely awful.

When the response time gets down to around 1s, however, the streaming UX flips and becomes a negative. Have you tried streaming with a super fast model? Text appearing that fast on your screen is jarring to read, a definite no-no.

I predict that the dominant streaming UX will disappear for text models. It’s minor, but it will accelerate things, because streaming semantics are much harder to work with technically: you don’t know the number of tokens until the stream is finished, it’s harder to parallelize the UX, and joining third-party data is non-obvious (we had to design our own protocol for this in Axflow).
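The difference between the two modes can be sketched in a few lines. `fake_stream` below is a stand-in for an LLM API that yields tokens as they are generated; the point is that the streaming consumer has to render incrementally and only learns the total length once the stream is exhausted, while the buffered consumer gets the whole response in one step.

```python
from typing import Iterator

# Stand-in for an LLM API that yields tokens as they are generated.
def fake_stream() -> Iterator[str]:
    for token in ["The", " answer", " is", " 42", "."]:
        yield token

# Streaming consumer: render each chunk as it arrives. You only know
# the total token count (and can only join in other data) after the
# stream is exhausted.
rendered = []
count = 0
for token in fake_stream():
    rendered.append(token)  # a UI would paint each chunk here
    count += 1

# Buffered consumer: with ~1s responses, one join is all you need.
text = "".join(fake_stream())
```

With fast models, the buffered version is both simpler to build and nicer to read, which is why I expect it to win.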

Apple, Amazon & friends have joined the chat

Companies like Apple and Amazon with their massive distribution networks and big data moats are poised to make significant strides.

My submission for Apple's AI logo.


Their delayed entry into the LLM space might be strategic, or it might be because their employees are talking about building this stuff in meetings instead of actually building it (my bet is on the latter). Either way, it’s allowing them to potentially develop more refined, possibly superior products. Moreover, companies with rich data resources, like Nike or Disney, have a unique opportunity to create amazing personalized and immersive experiences for their customers. These experiences could range from enhanced retail experiences to next-level entertainment, leveraging their vast content libraries and consumer data.

My prediction is that by the end of 2024, Apple and Amazon will be major players, just as large as OpenAI (or Microsoft, I suppose) and Google are today.

Software will disrupt itself

Since 2000, software has been eating the world. But now AI is eating software. I predict that many businesses with hundreds or even thousands of employees providing a software-only service (think a typical SaaS company like Calendly or Slack) will get their economics absolutely destroyed. A decent engineer will likely be able to write a Slack-like application, definitely good enough to cancel the $500k/year contract, in a couple of months.


Remember how I said above that there will be a lot more engineers? The buy-versus-build equation is about to get much more competitive, which I think will hurt the technology sector’s culture a lot. I expect Silicon Valley to go through a bit of a shock because of this. There is a lot of entitlement built in, which is downstream of huge margins. This one is hard to predict, but I’m very curious to see how it evolves. The most measurable part of this, in my opinion, is that companies will get leaner: “revenue per employee” has to go up. One of the values at my current company is that lean teams are superior, a strongly held belief that I think is about to be proven again and again in 2024.

I predict a few major companies valued at > $1B today will disappear entirely by the end of the year.

Tiny billion dollar companies

With the previously discussed lowered cost of tools and increased accessibility of AI technologies, startups can operate leaner than ever. This agility means they can experiment and pivot with greater ease, leading to a more dynamic and innovative startup ecosystem. It also means a higher failure rate as the barrier to entry lowers and competition intensifies. Nevertheless, this churn is beneficial in the long run: it fosters a culture of rapid iteration, the startups that fail can do so cheaply, and the ones that succeed give consumers more choices and better products. Ultimately, this is great news for consumers and innovation.

I predict that the average number of employees in unicorns created after 2023 will be significantly lower than before 2023.

Everyone is a hacker now

Everyone is a hacker now!

New AI-first hardware will fail

Some companies have come out with radically innovative hardware, the first of these being Humane’s AI Pin. I predict we’ll see more of these in 2024. Sam Altman is apparently in talks with hardware manufacturers (and Jony Ive?) to build his own dedicated AI hardware.

I personally have a hard time believing that we’ll carry a separate device from our smartphones. Smartphones already have all of our data, AI-optimized chips, and full market penetration. The only play I could see working is a new AI-first mobile phone, which might be something Microsoft or Amazon attempts (both have lots of FOMO since they missed the mobile wave).

I predict non-smartphone AI devices will fail. The AI device of the future is likely an iPhone or Android phone with a dedicated GPU chip for AI.

🚀 Conclusion 🏎️

I’m a techno-optimist, so I couldn’t be more excited as we move into 2024. I believe our world will change more next year than in any previous year of my career. Open source will catch up with proprietary models and help push inference to (maybe new categories of) edge devices everywhere. AI is going to catalyze competition and productivity, and software is about to disrupt itself.

AI is a centralizing technology, so the companies that benefit most are the incumbents with large data and distribution moats. For this reason, we might all be talking mostly about Siri 2.0 next year, with ChatGPT being old news: one of a few competitors with less access to your private data, and therefore less useful as a consumer app. Consumers will benefit from many domain-specific AI assistants, developers will enter a flywheel of productivity co-working with AI, and companies are about to get very lean.


Notes:

- I didn’t cover search, and I think it is an important domain which is also changing. Metaphor and Perplexity are very interesting, but I personally haven’t yet been hooked, so I don’t have strong opinions on where this is going yet.

- I didn’t cover art, but I’m super excited about it. As when computers first appeared, artists and devs seem to be the first to benefit from AI. I’ve been able to make things with Midjourney which I love, and which I could never have hoped to achieve before.