By Varun Aggarwal and Kuldeep Yadav
The era of first-generation AI products is over, hopefully! Some of these were simply prompts on ChatGPT/Dall-E/StableDiffusion to demonstrate a use case. Many of them got millions of users, no, sorry, viewers on Twitter and then vanished a day later. Then there were others that were thin wrappers on ChatGPT and masqueraded as usable products. Or a nice chat UI/UX was stamped on existing or new products in the hope that it would deliver magic. Everyone was competing to build them quicker: 30 days, 30 hours, 30 minutes, and even 30 seconds. They went viral too, and then landed quietly in some corner without delivering value.
It is not that AI is not disruptive or that significant value cannot be created. That needs hard work and patience. Instead, there has been huge FOMO, creating a lot of heat (in lost dollars), noise (on social media), and little light (real value). Industry reports and economists have already started pointing out the lack of productivity gains from AI, the failed projects, and the untenable bubble we have created.
It is now time to make the 2nd generation of AI products. This foray has already started. If you are now over getting 10 minutes of fame on Twitter and mean real business, here are some principles:
1. Stop building AI for novices, land value for domain experts
You can easily generate a sales email for a recently hired, untrained salesperson. Write a nice prompt, provide some context and the person's older emails, and whoosh! A new email is generated. The person will happily use it: s/he is as clueless about what is good vs bad as your algorithm. The email will drive away the customer and piss off the boss!
On the other hand, an expert user will simply throw such an email into the trash can! They will want personalization and the right use of their favorite elements in the email. They will demand reasons for the new elements you suggest, and they will judge the good from the bad. They will want the autonomy to invoke AI suggestions on certain parts of the email to improve them – this will require a smart UI/UX. Further, they would love seamless integration of information to contextualize the message (e.g. recent news about the company from its website, social media, or quarterly earnings). And finally, they would want the tool to improve over time: take less of their time, be more tuned to their needs, and deliver more value.
This is where we need to go. Such AI will deliver great value to the users and lead to actual productivity gains. It will really help the novice user, not deceive them.
It will improve AI itself!
2. AI needs to be well-tested and reliable
Your AI product needs to work. That means that if you are generating an image, a video, or an email, it should not work only for the one use case you engineered to show it off. Based on your target audience (a function of the industry, geography, and level of users), you need to draw a boundary around the kinds of inputs your tool must support and work well with. You then need to test your product extensively across the distribution of possible inputs. Furthermore, if the tool encounters inputs it cannot possibly handle, rather than generating an unintelligent response, it should inform the user about its limits.
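A minimal sketch of that last point, assuming a hypothetical email tool whose support boundary is defined by language and brief length. The names `SUPPORTED_LANGUAGES`, `MAX_WORDS`, `check_scope`, and `generate_email` are made up for illustration, and the model call is a placeholder:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical support boundary, chosen for illustration only.
SUPPORTED_LANGUAGES = {"en", "hi"}
MAX_WORDS = 300


@dataclass
class Response:
    ok: bool
    text: str


def check_scope(brief: str, language: str) -> List[str]:
    """Return the reasons a request falls outside the supported boundary."""
    problems = []
    if language not in SUPPORTED_LANGUAGES:
        problems.append(f"language '{language}' is not supported")
    if not brief.strip():
        problems.append("the brief is empty")
    elif len(brief.split()) > MAX_WORDS:
        problems.append(f"the brief is longer than {MAX_WORDS} words")
    return problems


def generate_email(brief: str, language: str) -> Response:
    problems = check_scope(brief, language)
    if problems:
        # Tell the user about the limit instead of producing a poor output.
        return Response(False, "Cannot handle this request: " + "; ".join(problems))
    # Placeholder for the real model call.
    return Response(True, f"[draft email in {language} for: {brief}]")
```

A real boundary would be far richer than two checks, but the shape is the same: decide what is in scope first, and say so plainly when a request is not.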
We all know generative AI is hard to test: it produces subjective outputs, and it is not easy to automatically detect whether they are fit for purpose. This is and will continue to be a major area of innovation. Thankfully, there are theoretical frameworks for how to do this, offline/online evaluations, metrics, and tools for continuous observability.
Beyond the theory, evaluating models is part art and part engineering: build a reasonable benchmark set, iteratively engineer and innovate, look out for test set impurity (test examples leaking into what the system was built on), put in the right guardrails, and constantly monitor. This is what will differentiate the good from the bad.
Over the last year, there has been rapid progress in LLM evaluation and monitoring tools, and product builders should exploit these to build compelling products.
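As a rough illustration of two of these steps, the sketch below flags benchmark prompts that also appear among the examples the system was built on, and computes a simple pass rate. All function names are hypothetical, the overlap check is a naive normalized exact match, and `generate` and `is_acceptable` stand in for your own model call and acceptance check:

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical prompts match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_contaminated(benchmark: list, seen_examples: list) -> list:
    """Return benchmark prompts that also appear among the examples used to build the system."""
    seen = {normalize(e) for e in seen_examples}
    return [p for p in benchmark if normalize(p) in seen]


def pass_rate(benchmark: list, generate, is_acceptable) -> float:
    """Fraction of benchmark prompts whose output passes the acceptance check."""
    if not benchmark:
        raise ValueError("benchmark set is empty")
    passed = sum(1 for p in benchmark if is_acceptable(p, generate(p)))
    return passed / len(benchmark)
```

Exact matching after normalization only catches the crudest leaks; paraphrased or partially overlapping examples would need a fuzzier comparison.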
3. It is not one model fits all
People think the omniscient, omnipotent being has finally descended on earth: one model that will do everything and serve all use cases. Unfortunately, there isn't even one human being that fits all tasks. Different AI models and engineering layers provide trade-offs on cost, latency and deployment feasibility. Based on your use case, you need to select and engineer a model.
For example, if the application is a real-time video avatar or video editing, you will need low latency in model response, through AI, engineering, or both. You may need more than one model. For example, you may use a fast, relatively inaccurate model for lip syncing during editing and a high-quality, high-latency model for the final rendering. Or, if you need the service cost to be low, you need a smaller model, caching, dynamic model selection based on the query, or a combination of these methods.
The bottom line is that you need to work hard and smart with your model(s) and MLOps to land value. Quick off-the-shelf models are good for prototypes and MVPs, but their utility stops there. Building real-world products will require careful selection and orchestration of multiple models.
One real-world product that has adopted this is Cursor, an AI-native coding IDE that uses a combination of off-the-shelf and custom-trained models to deliver a truly delightful experience to its users.
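A minimal sketch of dynamic model selection combined with caching. The two tiers ("small" and "large"), the 50-word routing rule, and the `call_model` placeholder are assumptions for illustration, not any vendor's API:

```python
from functools import lru_cache


def pick_model(query: str, final_render: bool = False) -> str:
    """Route to the large model only when quality matters more than latency or cost."""
    if final_render or len(query.split()) > 50:
        return "large"
    return "small"


def call_model(model: str, query: str) -> str:
    """Placeholder for a real inference call."""
    return f"{model}: response to '{query}'"


@lru_cache(maxsize=1024)
def answer(query: str, final_render: bool = False) -> str:
    """Identical requests are served from the cache instead of calling a model again."""
    return call_model(pick_model(query, final_render), query)
```

In practice the routing rule is the interesting part: it might be a classifier, a confidence score from the small model, or an explicit user action such as pressing "render".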
4. Invent an AI-first Workflow
The other much-talked-about point is AI delivering value within the user's workflow, in the interface s/he already works in by default. This is much needed, but the real disruption is where the workflow itself is AI-first.
AI builders need to dig deeper into the domain (e.g. legal) and map and understand the entire user workflow. They need to evaluate how the end-to-end workflow can be AI-first or AI-native. It is not a band-aid on a product first made without the power of generative AI. Rather, conceptualize the product and build the technical architecture with the gen-AI revolution in mind.
For example, ask the question: what will an AI-first YouTube be? This needs the boldest of entrepreneurs to disrupt the incumbents. The time has come.
5. Turn human-AI friction into great AI-human collaboration
AI needs to work well with humans. One needs to balance automation with human autonomy. A simple example: let us say AI writes something for you. You don’t like it, so you throw it away and get frustrated. What if, instead, the product allowed you to tell the AI what level of edit you need (light/medium/heavy) and gave you the response in a typical review mode, where you can accept or reject suggestions? The AI can also track your changes to its generated output and learn to personalize to your needs. It may look like a fairly simple feature, but most developers don’t think this way and end up rendering the product unusable.
Once again, Cursor is a good example of this. It gives you line-by-line edits in code and the ability to accept/delete. Further, the UI/UX allows specific queries and tasks to be done on the code.
If you haven’t built your AI product yet, there is nothing to worry about. You haven’t missed the bus. You can make your own – a bus, which is not a bus – but rather a new way to think about (gen-AI first) transport. That is how AI has to be thought of.
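The review-mode idea can be sketched with Python's standard `difflib`: diff the user's text against the AI rewrite, present each difference as a separate suggestion, and apply only the ones the user accepts. This is an illustrative sketch with hypothetical function names, not a description of how Cursor or any other product implements it:

```python
import difflib


def suggest_edits(original: str, rewritten: str) -> list:
    """Turn an AI rewrite into word-level suggestions the user can accept or reject."""
    a, b = original.split(), rewritten.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        {"start": i1, "end": i2, "old": a[i1:i2], "new": b[j1:j2]}
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_accepted(original: str, suggestions: list, accepted: set) -> str:
    """Apply only the suggestions whose index is in `accepted`; keep the rest as written."""
    words = original.split()
    # Apply from the end so earlier word positions stay valid.
    for idx in sorted(accepted, reverse=True):
        s = suggestions[idx]
        words[s["start"]:s["end"]] = s["new"]
    return " ".join(words)
```

Logging which suggestions get accepted or rejected gives the product a signal it could later use for personalization; the edit level (light/medium/heavy) would be an input to the model call that produces the rewrite.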