Creating Effective GenAI Products: Assessing Your Preparedness
Checklist for leadership and executives
Building an AI product comes with unique challenges. In addition to the complexities that a software product presents, an AI product must account for a variety of data and operational-environment challenges, adding extra layers of requirements. I covered the unique challenges of building AI products, and how they differ from classical software products, in my earlier article here.
Generative AI (GenAI) presents further unique challenges and opportunities. I refer to Large Language Models (LLMs) here, but the takeaways from this post easily extend to products beyond LLMs: reasoning models, agentic AI, and so on.
LLMs have interesting and potentially fundamental implications for how AI products ought to be designed. They can pose novel complications arising from the proprietary nature of data, products, and domains, as well as from a rapidly evolving but still unclear regulatory framework.
Building viable GenAI products, ones that are reliable, quantifiably differentiated, favorable in terms of cost-benefit trade-offs, and sustainable in the long term, is non-trivial. How do you go about building GenAI products that have a realistic chance of success?
Below are some key guidelines, presented as a high-level checklist, that will allow leaders in the space to tune out the hype and focus on the necessary aspects of LLM productization:
1. Do you have a well-defined product and a viable business model?
Many AI productization efforts tend to ride on hype. The first order of business is to differentiate hype from reality. The fact that there is a better LLM, for instance, doesn't mean it will immediately translate into a better or superior product. The fact that a potential use-case is discussed widely doesn't mean that it is a viable, or even feasible, product. The fact that the "industry" tells you that only deep pockets and massive investments can drive AI development and productization doesn't make it so. The hype culture produces outsized and unrealistic predictions and assertions, especially when judged against the timelines presented.

In fact, many infrastructure efforts across various organizations can be attributed to hype rather than verified, quantifiable ROI. There is a reason why LLM revenues do not outrun the Capex investments made across the board (the current Capex-to-return ratio is about 50:1), at least not yet (recent developments around efficiency may keep them at risk for an extended period; more on that below). Part of the Capex spend may even be justifiable as future readiness, but currently the use-cases hitting production are very limited, and long-term initiatives with no clear line of sight to value rarely survive in the industry unless they are riding on hype. This lack of a clear path to profitability does not result from a dearth of compute or GPUs; it stems from the fundamental characteristics, and the current state, of the AI on offer.
Put effort into understanding not just the requirements of the market but also into confirming that you have a viable business model (regardless of whether your target customers are internal or external), grounded in a realistic view of the technology's capabilities and in informed due diligence, not in blindly following the hype. Also, be very clear about what your product's differentiation is and how easily it can be overcome, especially in the rapidly evolving landscape of AI.
For instance, the emphasis and the narrative have already been shifting: from model scaling to inference-time scaling, from imminent AGI to more realistic (if still hyperbolic) capability mapping, from the inevitability of hyperscaling to potentially low-cost possibilities, from amazing model benchmark accuracies to real-world reliability and robustness needs, from cost-agnostic development to efficiency, from AI model glitter to actual products and their challenges.
And there are always moments when things seem to change significantly along dimensions where hype might have driven your earlier bets, as seems to be the case with the Deepseek development (and others following). Will your product survive, and ideally thrive, in such an evolving landscape?
Check for the following:
Does your product have a moat? Sam Altman recently mentioned two categories of products in the LLM space while speaking of opportunities for entrepreneurs. One addresses a broad class of problems under the assumption that the models will keep getting better. The second focuses on a very small, niche problem and relies on the current SOTA model (with the assumption that model capabilities will not grow beyond a small delta from where they are now). The latter has a much higher chance of either losing out or being confined to a very small opportunity space, rendering it less interesting. While this is overly simplified, it isn't a new observation, and it provides a window into what has always held true in the AI space, even before LLMs came onto the scene.
The core point of any entrepreneurship is to make sure you are addressing a wide enough problem, have limited reliance on the underlying models and powering technology (focusing more on the actual problem space), and don't rely on assumptions of technological stagnation (software, hardware, ecosystem, even efficiency). In fact, this goes back to the central basis of establishing a business vision and strategy: products shouldn't be designed inside-out (bottom-up, technology-first) but rather outside-in (market-first, solving a wide needs-space).
That said, it is important to understand the fundamental limitations of the technology too, since certain areas take much longer to become reliable than a business timeline will allow. For instance, compute costs for LLMs are seemingly coming down significantly already (interestingly, this development happened as I was putting ideas together for this piece), but the technology's limitations, such as the hallucination problem, quite likely are here to stay. Moats built on lower costs or extremely narrow use-cases are typically overcome much more easily.

If you are building GenAI products in a verticalized enterprise, you want to focus on your vision and strategy, not on building core GenAI technology from scratch. In GenAI-age parlance, the value will increasingly accrue at the application layer, and that narrative, despite the events of last week around cost and efficiency, is still intact. In fact, Deepseek may pave the way for much wider adoption of AI, given that cost constraints may be significantly eased. But again, it is very important that your product addresses a real use-case, reliably.
Do you have a clearly articulated differentiation for the product? Technological differentiation alone (like a better AI model) becomes increasingly difficult to defend as barriers to entry fall and core technological capabilities become commoditized. Moreover, (benchmark) performances will likely converge further. Therefore, the product's differentiation should be clear and well articulated.
Are your use-cases defined within the limits of well-understood technology capabilities? The success of a product reliant on a core underlying technology ultimately depends on alignment between the technology's capabilities and the product's promises. For instance, LLMs are not ready for safety- or mission-critical systems yet, but there are a variety of use-cases where errors (whether via hallucination or other factors) may be acceptable, with the benefits making up for the deficiencies. It is extremely important to understand and establish a favorable risk-reward trade-off to exploit opportunities.
Is the relationship between product efficacy and AI model efficacy established? AI model accuracy may not naturally translate into product-quality improvement, so it is important to understand these dependencies. It is crucial to comprehend how your model metrics impact product metrics and, more importantly, to establish product metrics that are relevant to the users of the product. Also, generic incremental improvements in model metrics, such as those on public benchmarks, may not be highly relevant to the product in question. I have discussed this more here; the sketch below makes the point concrete.
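Concretely, consider tracking a model-level metric and a product-level metric over the same logged interactions. This is a minimal sketch with a hypothetical log schema (the field names, answers, and metrics are placeholders, not a prescribed instrumentation scheme); it shows how the model can be "right" against a reference answer while the user still fails to get what they need.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    model_answer: str
    reference_answer: str      # offline label behind the model metric
    user_accepted: bool        # did the user accept the answer as-is?
    escalated_to_human: bool   # did the session fall back to support?

def model_accuracy(logs: list[Interaction]) -> float:
    """Model-level metric: exact match against reference answers."""
    return sum(i.model_answer == i.reference_answer for i in logs) / len(logs)

def task_completion_rate(logs: list[Interaction]) -> float:
    """Product-level metric: user got what they needed, no escalation."""
    return sum(i.user_accepted and not i.escalated_to_human for i in logs) / len(logs)

logs = [
    Interaction("Refund within 30 days", "Refund within 30 days", True, False),
    # Correct against the reference, yet the user escalated anyway: a win
    # for the model metric, a loss for the product metric.
    Interaction("Refund within 30 days", "Refund within 30 days", False, True),
    Interaction("Refund within 60 days", "Refund within 30 days", False, False),
]
print(f"model accuracy:  {model_accuracy(logs):.2f}")        # 0.67
print(f"task completion: {task_completion_rate(logs):.2f}")  # 0.33
```

The numbers aside, the design point is that both metrics are computed from the same logs, so any divergence between them is visible continuously rather than discovered after launch.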
Are your offerings cost feasible? Do you have a handle on the cost of building, maintaining, and deploying LLM products? The cost landscape in the LLM world is much less predictable than in non-LLM AI or regular software offerings; see the section on the economics of GenAI below. Have clarity on whether the market would be willing to pay for this product, i.e., whether the product addresses an unmet market need at a feasible price. This will naturally drive an understanding of the investment involved and whether a sizeable, realizable ROI exists (or can be reasonably projected with growth). Even though GenAI businesses may not be experiencing it yet, the funding landscape will tighten. Consequently, it will be increasingly important to prioritize profitability, bootstrapping, and a return to operational efficiency.
Do you understand the difference between brute-force and smart LLM integration? While an LLM product may seem very attractive, in many cases there are simpler AI solutions to the problem. From a product perspective, it is important to understand where an LLM is actually needed and where simpler approaches may be preferable. Often, LLM-powered products will need to use both LLM and non-LLM AI depending on the need. Making smart choices can have significant implications for the reliability, cost, and performance of the product, as in the routing sketch below.
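Here is a minimal, hypothetical routing sketch (the FAQ table, regex, and helper functions are placeholders): deterministic lookups and narrow pattern matching handle the cheap, predictable cases, and only genuinely open-ended queries fall through to the model.

```python
import re

# Deterministic lookup table: the cheapest, most reliable path.
FAQ_ANSWERS = {
    "refund policy": "Refunds are accepted within 30 days of purchase.",
    "business hours": "We are open 9am-6pm, Monday through Friday.",
}

def lookup_order_status(order_id: str) -> str:
    return f"Order {order_id}: shipped."       # stub for the sketch

def call_llm(query: str) -> str:
    return "LLM-generated answer (stub)."      # stub for the sketch

def route(query: str) -> str:
    q = query.lower().strip()
    # 1. Keyword match against curated answers: no model call at all.
    for key, answer in FAQ_ANSWERS.items():
        if key in q:
            return answer
    # 2. Narrow, well-defined patterns (here, order-id lookups): a regex
    #    or a small classifier beats an LLM on cost and reliability.
    match = re.search(r"order\s*#?(\d{6,})", q)
    if match:
        return lookup_order_status(match.group(1))
    # 3. Only open-ended queries reach the (expensive) LLM.
    return call_llm(query)

print(route("What is your refund policy?"))           # no LLM call
print(route("Where is order #123456?"))               # regex path
print(route("Compare your plans for a small team."))  # falls through to LLM
```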
Do you have a viable business model and GTM strategy? Have you validated your hypotheses about the uptake of your product and its ability to be adopted, integrated, and employed in a manner that realizes a direct line of sight to value (both for you and for your customers, whether enterprise or consumer)?
User subscription-based business models have shown great adoption but poor (read: mildly to significantly negative) ROI when it comes to wrappers built around foundational models (read: ChatGPT). This is partly due to cost, but also due to the inherent reliability limitations and behavioral unpredictability of the underlying models. The bigger reason, however, is the cost-value misalignment for enterprise-level offerings. It is important to have a sensible and defensible business model.
For instance, in search offerings, companies often charge a premium for LLM-powered products, with the API call costs frequently transferred to customers. While this is appealing in the short term, the longer-term implications aren't well understood, especially if API costs keep rising. It is therefore important for your business model to account for such challenges. This can also be complemented by proper GTM support, such as giving customers proper cost visibility, articulating the value of your offering, or training customers on efficient utilization.
2. Do you understand the economics of GenAI?
With increasing scale and the rapid pace of AI developments, products can end up with much more complex economics, significantly affecting the cost to deploy, operate, and maintain. Further, products that rely on still-evolving and maturing capabilities have inherent reliability challenges in the initial phases and are bound to need additional effort to mitigate unintended outcomes and build robustness and reliability. Various costs come into play when adopting GenAI internally and deploying it commercially. For instance:
Cost of development: Cost creep is quite common in LLM product development and utilization, whether it comes from experimentation, developer errors, bugs, model development, model refinement, model maintenance (fine-tuning, retraining, adapting), usage (inference-time costs), mode of usage (user interaction time, mechanism), and so on. As noted above, the cost landscape in the LLM world is much less predictable than in non-LLM AI or regular software offerings; as I was writing this, that claim was validated by the new economics of Deepseek, while others have seen similar issues. A back-of-the-envelope sensitivity check, like the sketch below, is a useful first step.
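This sketch models monthly serving cost as a function of traffic and token volumes. All prices and volumes are hypothetical placeholders; substitute your provider's actual rate card and your own usage assumptions.

```python
def monthly_inference_cost(
    requests_per_day: int,
    input_tokens: int,        # avg prompt size, incl. any retrieved context
    output_tokens: int,       # avg completion size
    price_in_per_1k: float,   # $ per 1K input tokens (check your rate card)
    price_out_per_1k: float,  # $ per 1K output tokens
) -> float:
    per_request = (input_tokens / 1000) * price_in_per_1k \
                  + (output_tokens / 1000) * price_out_per_1k
    return per_request * requests_per_day * 30

# Costs scale roughly linearly with usage, so model the quiet launch and
# the viral scenario, not just the average case.
for rpd in (1_000, 10_000, 100_000):
    cost = monthly_inference_cost(rpd, input_tokens=2_000, output_tokens=500,
                                  price_in_per_1k=0.0025, price_out_per_1k=0.01)
    print(f"{rpd:>7,} req/day -> ${cost:,.0f}/month")
```

Even this toy version surfaces the central question: if a usage spike multiplies your serving bill tenfold, does your pricing model absorb it, pass it on, or break?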
Making the right investments: While advances such as Deepseek have shown that LLM development can be done at a fraction of the cost, they immediately raise questions about dedicated resource investments for purpose-built products. While generic data centers and other investments may still be able to pivot and/or expand to other uses (still non-trivial), that luxury may not be affordable for dedicated products or projects, and such shifts may rattle your entire business model if you suddenly cannot justify your pricing. The same goes for other investment decisions, in areas such as talent (the focus of another piece).
Cost of deployment: Are your deployment and utilization costs predictable and manageable? One of the reasons API call costs are transferred to customers for various GenAI-powered products is precisely this unpredictability. Costs rise significantly with usage (at least for now) and may introduce additional bottlenecks for the product and/or the customer. If unreliability due to hallucinations becomes a problem, the offering is negatively impacted. There are other "costs" of deployment that can result in liabilities too (if you are familiar with the manufacturing world, think of margin calls on product recalls or field issues). These costs may pose significant risks to the business (see section 4 below).
3. Have you made informed technology choices?
Are your technical choices for the product well informed? Are you using a managed product or relying on open source? Are you reliant on a single provider? Do you have the flexibility to transition or change providers if a better or more cost-effective one comes along? Are you relying on in-house models, and if so, how do they impact your business plan amid increasing and evolving competition? Have you performed due diligence on what these choices entail?
GenAI product journeys aren't just about choosing a base foundation model or a specific managed ChatGPT-like offering. There are other technical choices, spanning training apparatus, Ops, deployment channels, fine-tuning, domain adaptation, and so on. It is important that you have a clear understanding of and position on these choices, avoid building unnecessary dependencies, and account for the (inevitable) disruptions coming in any part of the GenAI product lifecycle. One practical defense is a thin abstraction over the model provider, as sketched below.
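This is a minimal sketch (the interface and provider names are hypothetical) of isolating the product from any single LLM provider, so that switching is a configuration change rather than a rewrite.

```python
from typing import Protocol

class TextGenerator(Protocol):
    def generate(self, prompt: str, max_tokens: int = 256) -> str: ...

class ManagedProviderClient:
    """Wraps a hosted API (the actual call is stubbed for the sketch)."""
    def generate(self, prompt: str, max_tokens: int = 256) -> str:
        return f"[managed] response to: {prompt[:40]}"

class SelfHostedClient:
    """Wraps an open-source model served in-house (stubbed)."""
    def generate(self, prompt: str, max_tokens: int = 256) -> str:
        return f"[self-hosted] response to: {prompt[:40]}"

def build_generator(provider: str) -> TextGenerator:
    # The only place in the codebase that knows which provider is in use;
    # everything else depends on the TextGenerator interface.
    registry = {"managed": ManagedProviderClient, "self_hosted": SelfHostedClient}
    return registry[provider]()

llm = build_generator("managed")   # switching providers is a config change
print(llm.generate("Summarize the Q3 report."))
```

The same seam is also the natural place to attach logging, cost accounting, and evaluation hooks, so those investments survive a provider change.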
Another important point is to understand the potential implications of these technological choices not just for you but for your downstream customers too. For instance, does picking open source in any way introduce IP, security, or other risks for your product that may impact your customers or even your internal operations? Do these choices result in data transfers outside of the relevant jurisdictions? Do they violate data-privacy or other safeguards that are guaranteed as part of your product?
A detailed understanding of technology choices, not just from the performance perspective but also from the perspectives of business impact, risk, and credibility, is increasingly critical.
4. Have you accounted for the regulatory, compliance, (re)liability, and security implications?
Understand the entire risk landscape, which is unique, unclear, and can quickly become challenging when it comes to GenAI. Here are some areas you may want to address:
Risks from internal adoption: Many teams and individuals across the organization are employing GenAI capabilities in a range of their daily tasks, whether summarizing notes, drafting messages, creating content (for instance, for sales and marketing), or developing code. Most of these are individual efforts using either free offerings or personal subscriptions, which means the respective privacy policies come into play. It is important to understand the associated risk landscape.
In the absence of proper governance over the use of GenAI capabilities, the risks of compromising proprietary data, sensitive information, business secrets, and customer and third-party information, among others, are real dangers. Moreover, the risks also extend to potentially problematic jurisdictions, depending upon how the engaged services (such as ChatGPT, Perplexity, or Deepseek) treat the information provided to them.
For instance, in addition to collecting log information and other technical data as ChatGPT does, Deepseek's privacy policy mentions that it also collects keystroke patterns and rhythms. Further, Deepseek has different rules on how data is stored and shared, especially with government entities (seemingly subtle but crucial differences given the rapidly changing geopolitical landscape). Leaders should be aware of such implications and ramifications as they adopt capabilities, both for internal use and for product integration, since these risks and the associated potential liabilities may extend not just to the business but also to downstream customers.
Risks from careless GenAI integration: GenAI's capabilities, limitations, and constraints extend to the products it supports. Understanding these risks as the products are deployed is critical, both from a product-safety perspective and to assess, understand, and mitigate potential liability. Also, be wary of the risks and constraints that may arise from upstream or third-party data and AI technologies that the product depends on.
For instance, a recent Air Canada chatbot glitch resulted in incorrect information being communicated to users, leading to legal action. While Air Canada's chatbot, which 'invented' incorrect information, was a pre-LLM-era chatbot, the lessons are more general. Another example is the Chevy chatbot incident, which highlighted the jailbreaking risk with LLMs, potentially resulting in data compromise, reputational damage, and other liabilities.
It is important to build guardrails when deploying AI models, which requires robust testing, validation, and verification mechanisms. The risks with GenAI-powered products can be much higher and can accumulate much more quickly owing to the scale, speed, and scope of their applications and use. A minimal sketch of what such guardrails can look like follows below.
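This is a deliberately simplified sketch (the blocked-topic policy, regex, and stubs are placeholders for real moderation, validation, and redaction layers) of wrapping an LLM call in input and output guardrails so that off-policy responses never reach the user unchecked.

```python
import re

BLOCKED_TOPICS = ("internal pricing", "employee records")  # example policy
FALLBACK = "I can't help with that. Let me connect you with a human agent."

def call_llm(prompt: str) -> str:
    return "Stubbed model response."   # stand-in for the real model call

def guarded_reply(user_input: str) -> str:
    # Input guardrail: refuse off-policy requests before spending a call.
    if any(topic in user_input.lower() for topic in BLOCKED_TOPICS):
        return FALLBACK
    reply = call_llm(user_input)
    # Output guardrails: redact strings that look like leaked identifiers
    # and block replies that stray into off-policy territory.
    reply = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[REDACTED]", reply)  # SSN-like
    if any(topic in reply.lower() for topic in BLOCKED_TOPICS):
        return FALLBACK
    return reply

print(guarded_reply("What is your refund policy?"))
print(guarded_reply("Tell me about internal pricing."))  # blocked at input
```

In production, each of these checks would be backed by tested policies and adversarial test suites; the structure, not the specific rules, is the point.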
Accounting for the evolving regulatory environment: The deployment complexity of GenAI products should also be addressed in the context of rapidly evolving regulatory and standardization frameworks across industries, geographies, and applications.
Addressing security: Finally, there can be implications of AI products for customer, user, and provider data security, business risk, and so on. Recent research has shown the risk of adversarial malware attacks on the generative AI ecosystem via the exploitation of chat agents, highlighting the evolving challenges on the cybersecurity front. It is important that the product addresses, flags, and, where possible, mitigates such risks arising from product use post-deployment. The legal aspects of product licensing, use, and updates need to incorporate these guardrails too.
Product reliability: There are still many unknowns about LLMs (mostly unknowns, in fact). As Subbarao Kambhampati mentions in this interview, LLMs have 'fractal intelligence' (a phrase attributed to Andrej Karpathy), meaning we don't know why they do what they do. In many cases we see impressive things and can see benefits for products, but we need to be careful because we don't know the failure modes. Nor is failure evident or visible in many cases. For instance, an LLM can hallucinate its way along without raising any flags, while the product built on it can have disastrous real-world consequences.
Liability: Product reliability, safety, and security are all intertwined with business liability costs. It is important to plan around the risk landscape surrounding (re)liability, safety, and security, along with business and societal risks, depending on the product context. For instance, since LLM behavior isn't well understood, it may be important to build guardrails around product performance and have a clear mechanism to control, or even disengage, the product if needed. The cost of such disengagement (a shut-off switch) is another area whose dependencies should be understood. Does shutting off your LLM service come at the cost of service disruption? If it results in service deterioration, is that deterioration within acceptable limits? Are there mitigation plans in place? In addition to such disruptions, as mentioned above, the product may lead to business liabilities depending upon the context (not to mention the potential for real-world consequences, as exemplified above), be it in the form of financial, regulatory, IP, reputational, or privacy-related liabilities. A sketch of a runtime kill switch with graceful degradation follows below.
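This is a minimal sketch (the flag store and stubs are hypothetical) of a product-level kill switch: the LLM path can be disengaged at runtime, degrading to a deterministic fallback rather than disrupting the whole service.

```python
import os

def llm_answer(query: str) -> str:
    return "Stubbed LLM response."     # stand-in for the real model call

def deterministic_answer(query: str) -> str:
    # Degraded but predictable path: canned answers, plain search results,
    # or a handoff to human support.
    return "Our assistant is temporarily unavailable. Here are our FAQs: ..."

def answer(query: str) -> str:
    # The flag could live in a config service, a feature-flag store, or an
    # env var; the point is that it flips at runtime, without a redeploy.
    llm_enabled = os.environ.get("LLM_ENABLED", "true").lower() == "true"
    if not llm_enabled:
        return deterministic_answer(query)
    try:
        return llm_answer(query)
    except Exception:
        # Fail closed: any failure on the LLM path degrades the service
        # rather than disrupting it.
        return deterministic_answer(query)

print(answer("How do I change my booking?"))
```

The same pattern covers planned disengagement (regulatory or safety triggers) and unplanned failures alike, which is exactly the dependency question raised above: what does the product look like with the LLM switched off?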
— —