Working with LLMs

AI, and LLMs in particular, is "all the rage." The potential value is meaningful, but unlocking it takes more than "ask ChatGPT."

Let's dig into the immediate potential, steps to unlock it, and touch on valid concerns.

This post is not a prediction of the grand future LLMs will bring; it is focused on actionable things we can do today.

The Potential

Today, LLMs can already handle valuable tasks in domains like data processing, text generation, workflow automation, and more.

Three areas that are especially interesting to me:

  1. Workflow automation -> handling of slightly dynamic but repetitive tasks
  2. Research & reporting -> querying databases and processing text data to build research / reports
  3. New UI/UX Approaches -> simplified but dynamic apps powered by LLMs

Workflow Automation

Say your customer support wants to better handle customer feedback, share with appropriate team members and save the feedback for reporting.

Without LLM automation, customer support would need to take great notes, write an email to the right team members, and enter the feedback into a CRM form.

With LLM automation, our tool can process the feedback, automatically route it to the right team, and structure it into the CRM. Now your customer support team just has to validate the work done by the LLM.
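The flow above can be sketched in a few lines. This is a minimal illustration, not a real integration: `call_llm` is a hypothetical placeholder for whatever model API you use, and the prompt, teams, and JSON shape are assumptions.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns a canned response here."""
    return json.dumps({
        "summary": "Checkout fails on mobile Safari.",
        "team": "engineering",
        "sentiment": "negative",
    })

def process_feedback(raw_feedback: str) -> dict:
    """Ask the LLM to structure free-form feedback for routing and the CRM."""
    prompt = (
        "Extract a one-line summary, the owning team "
        "(engineering, product, or billing), and sentiment from this "
        f"customer feedback. Reply as JSON.\n\nFeedback: {raw_feedback}"
    )
    structured = json.loads(call_llm(prompt))
    # A human still validates before the record is routed or saved.
    structured["needs_review"] = True
    return structured

record = process_feedback("I can't check out on my phone, the button does nothing!")
print(record["team"])  # the team the feedback gets routed to
```

The key design point is the `needs_review` flag: the LLM does the tedious structuring, but a person stays in the loop to validate before anything is sent.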

Let's focus on two camps of potential: Synthesis and Disintermediation

Synthesis:

  • Summarize a series of emails into a project plan
  • Create research reports
  • Play role of "copywriting buddy" for marketing

Disintermediation:

  • Text to SQL
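Text to SQL is the clearest example of disintermediation: the LLM sits between a plain-language question and the database, removing the analyst from the loop. A minimal sketch, with the LLM translation step stubbed out (in practice the model would be given the schema and the question; the table and data here are made up):

```python
import sqlite3

def question_to_sql(question: str) -> str:
    # Hypothetical stand-in for the LLM step: in practice the model
    # translates the question into SQL given the table schema.
    return (
        "SELECT product, SUM(qty) AS units FROM sales "
        "GROUP BY product ORDER BY units DESC LIMIT 1"
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, qty INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("widget", 12), ("gadget", 30), ("widget", 5)])

sql = question_to_sql("What was my top selling product last week?")
top = conn.execute(sql).fetchone()
print(top)  # ('gadget', 30)
```

The non-obvious work in a real system is validating the generated SQL (read-only access, schema checks) before executing it against production data.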


How to drive impact with AI?

Most AI projects fail. In my experience, they fail for three reasons:

  1. Shaky foundation to build on
  2. Poorly defined model "delivery" & objectives
  3. Lack of TLC post deployment

Avoiding these three mistakes will play a huge role in your success (+ having a strong data science team or partner).

Shaky Foundation

Let's mix metaphors a bit. AI/ML models have an endless appetite for data and will "eat" whatever they are given. Model performance heavily relies on the quality of the data they consume.

Checklist:

  1. We need to source ingredients (data collection)
  2. We need a kitchen (database)
  3. We need a cook (data prep, analysis, and manipulation)
  4. We need a server (deliver data to model)

Take predicting the proper amount to bid on a Google search ad. Do we collect past bids, relevant details, and clicks/conversions? Where is this data stored? Has anyone done exploratory data analysis on it (bonus points for automated data quality checks)?
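Automated data quality checks don't need to be elaborate to be useful. A minimal sketch for the ad-bid example; the column names and rules are illustrative, not from a real schema:

```python
def check_bid_rows(rows: list[dict]) -> list[str]:
    """Flag rows that would quietly poison model training."""
    issues = []
    for i, row in enumerate(rows):
        # Bids must exist and be non-negative.
        if row.get("bid") is None or row["bid"] < 0:
            issues.append(f"row {i}: missing or negative bid")
        # The funnel can't run backwards: conversions <= clicks.
        if row.get("clicks", 0) < row.get("conversions", 0):
            issues.append(f"row {i}: more conversions than clicks")
    return issues

rows = [
    {"bid": 1.25, "clicks": 40, "conversions": 3},
    {"bid": -0.10, "clicks": 12, "conversions": 1},   # bad bid
    {"bid": 0.80, "clicks": 2, "conversions": 5},     # impossible funnel
]
print(check_bid_rows(rows))
```

Run on every data load, checks like these catch upstream breakage before the model "eats" it.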

Poorly Defined

Delivery

We have two straightforward options that meaningfully change the approach:

  1. Make predictions on a schedule and store them in a database (e.g., predict nightly)
  2. Make predictions on demand, log predictions for future evaluation

Both approaches work well; the choice depends on data availability and objectives.
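The two delivery patterns can be sketched side by side. Everything here is illustrative: `predict_bid` stands in for a trained model, and the table schema is an assumption.

```python
import sqlite3
import datetime

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE predictions (item TEXT, bid REAL, created_at TEXT)")

def predict_bid(item: str) -> float:
    """Hypothetical stand-in for the trained model."""
    return 1.10 if item == "shoes" else 0.75

# Option 1: batch on a schedule -- precompute and store.
def nightly_batch(items: list[str]) -> None:
    now = datetime.datetime.now().isoformat()
    for item in items:
        conn.execute("INSERT INTO predictions VALUES (?, ?, ?)",
                     (item, predict_bid(item), now))

# Option 2: on demand -- compute at request time, log for future evaluation.
def predict_on_demand(item: str) -> float:
    bid = predict_bid(item)
    conn.execute("INSERT INTO predictions VALUES (?, ?, ?)",
                 (item, bid, datetime.datetime.now().isoformat()))
    return bid

nightly_batch(["shoes", "hats"])
print(predict_on_demand("shoes"))
```

Note that both paths write to the same predictions table: logging every prediction, however it was produced, is what makes future evaluation possible.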

Objectives

A seemingly straightforward step can cause outsized "pain" when not done well.

Hint: an objective of "make thing better" is bad; "increase measurable aspect of thing" is good.

Expanding on our Google search ads example. We want to predict how much to bid on an ad. What do we want to happen? Bidding zero always minimizes cost, bidding $1M maximizes clicks...

There are many metrics to optimize. Thoughtfully aligning our objectives with larger business goals is critical.

We could have plenty of objectives: increase clicks, conversions, or customers; maximize return on ad spend or customer lifetime value; lower customer acquisition cost.

Each of these objectives has trade-offs. Increasing conversions could blow up spend; maximizing return on ad spend could meaningfully slow customer growth.

If cost efficiency is the business goal, optimizing the ratio of conversion revenue to acquisition cost is a good objective.
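A quick worked example makes these metrics concrete. The numbers are toy figures, not from any real campaign:

```python
# Illustrative campaign figures.
spend = 5_000.0
clicks = 4_000
conversions = 200
revenue = 12_000.0
new_customers = 150

roas = revenue / spend        # return on ad spend
cac = spend / new_customers   # customer acquisition cost
cpc = spend / clicks          # cost per click

print(f"ROAS: {roas:.2f}, CAC: ${cac:.2f}, CPC: ${cpc:.2f}")
```

The same campaign looks different through each lens: a 2.4x ROAS may look healthy while a $33 CAC is too high for the product's lifetime value, which is exactly why the objective has to be chosen against the business goal rather than picked by default.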

TLC Post Deployment

Deploying an AI model is exciting. It is easy to "set & forget," but AI models require continuous monitoring and optimization.

Stay up to date on:

  • Does the input data to the model shift over time?
  • Are my evaluation metrics changing?
  • Did a model retraining "break" performance?
  • Are we collecting the right data for future optimizations?

If these questions are not managed, it is inevitable that performance will eventually degrade.
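Even the first question on that list, input drift, can start as something very simple. A crude sketch that compares a feature's recent mean against its training-time baseline; real monitoring would use proper tests (population stability index, KS test), and the tolerance here is an arbitrary assumption:

```python
import statistics

def mean_shift(baseline: list[float], recent: list[float],
               tolerance: float = 0.25) -> bool:
    """Flag when a feature's recent mean drifts too far from training."""
    base = statistics.mean(baseline)
    curr = statistics.mean(recent)
    return abs(curr - base) / abs(base) > tolerance

training_bids = [1.0, 1.2, 0.9, 1.1, 1.0]
recent_bids = [1.6, 1.7, 1.5, 1.8, 1.6]

print(mean_shift(training_bids, recent_bids))  # True -> investigate
```

The value of even a crude check is that it turns "the model quietly got worse" into an alert someone actually sees.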

Past work

Here is a non-exhaustive overview of work I've done with ML & AI.

A Disclaimer of Sorts

Each of these projects was a unique problem based on the business, data availability & amount, and other constraints.

We can operate within most constraints but not all.

One additional note: examples are framed in a business context. Most projects were done end-to-end -> data prep & cleaning, problem definition, model training, evaluation, deployment & monitoring.

Pricing & Inventory

Promotion Pricing Optimization:

  • What: Model that predicted the item(s) most receptive to promotion + the optimal discount for sales and margin dollars.
  • Value: Drives better return on promotional dollars and increases sales or margin dollars.

Inventory Optimization:

  • What: Suggest optimal order volume to meet demand of promoted items.
  • Value: Minimize lost sales from out of stock items and lost inventory from over ordering.

Marketing & Sales

Match acquisition cost and customer lifetime value

  • What: A Google AdWords bidding model that identified the amount to bid that maximized customer acquisition at an acceptable customer lifetime value threshold.
  • Value: Allowed business a controlled way to grow customers without getting "upside down" on return on ad spend.

Lead Scoring & Allocation:

  • What: Identified prospects with the highest conversion likelihood and paired each prospect with the "best fit" sales rep.
  • Value: Doubled sales conversion at end of three month "calibration" period.

General

Text to SQL:

  • What: User asks a natural-language question like "what was my top selling product last week?" and the tool returns a SQL query (and its data output).
  • Value: Enables business to be better informed without relying on technical resources for answers.

Impact of user action:

  • What: A user books a service today; what value does this create over the next three months relative to a similar user who doesn't book today?
  • Value: Improves understanding of quality of service provided and enables better planning.