Language First

Ten years ago, many companies were concerned with a concept called "Mobile First," which prioritized developing software that worked well on a new interface gaining massive user adoption: the smartphone.

In the near future, many companies will be concerned with a concept known as "Language First." This approach involves developing software that works well with an emerging interface gaining widespread adoption: Large Language Models (LLMs).

The chatbot interface has made LLMs famous. However, much points toward LLMs being great not only for chat but also for interacting with software systems through language. Because of LLMs' reasoning capabilities and their ability to use tools, this new interface can relieve humans of much of the cognitive load involved in learning and using complex systems.

Soon, we as humans will be able to interact with far more complex software systems without being experts. Imagine if everyone could handle the most complex systems relevant to their daily lives as an expert would.

So how are we as software developers going to design software for this future, where "Language First" principles dominate?

As the "Mobile First" paradigm was obsessed with graphical user interfaces and user experiences, the "Language First" paradigm will be obsessed with user intent. 

What problems does the user have that they can express in plain, non-expert language, and how can we use AI to understand those problems and orchestrate functionality to help the user achieve their goals?

This will involve organizing systems of prompts to build the reasoning framework for orchestrating functionality. It will also involve indexing and organizing functionality (essentially endpoints) and processes to give the LLMs the best possible conditions to do a good job.
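
As a rough illustration of what this indexing can look like, here is a minimal sketch using the OpenAI Python SDK's tool-calling interface. The endpoint names, their parameters, and the model name are hypothetical examples of mine, not anything prescribed above.

```python
# Minimal sketch: exposing two hypothetical endpoints as "tools" so an LLM
# can map a plain-language user intent onto concrete functionality.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A tiny "functionality index": each entry describes one endpoint in a form
# the model can reason about. Both endpoints are hypothetical.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_customer_balance",
            "description": "Look up the outstanding balance for a customer.",
            "parameters": {
                "type": "object",
                "properties": {"customer_id": {"type": "string"}},
                "required": ["customer_id"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "create_invoice",
            "description": "Create and send an invoice to a customer.",
            "parameters": {
                "type": "object",
                "properties": {
                    "customer_id": {"type": "string"},
                    "amount": {"type": "number"},
                },
                "required": ["customer_id", "amount"],
            },
        },
    },
]

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name
    messages=[{"role": "user", "content": "Bill customer 42 for 99 dollars."}],
    tools=tools,
)

# The model returns the tool call(s) it believes match the user's intent;
# our own code is then responsible for executing the chosen endpoint.
print(response.choices[0].message.tool_calls)
```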

Much like search intent, which is central to natural-language information retrieval, user intent will be the main focus in functionality search and execution.

To be continued... (I'm not done writing yet 😅)

Beautiful AI-based Products

If your product has a UX element that aligns user interactions with the goal of your AI models, then you have a beautiful AI-based product.

Think of Midjourney. You write a prompt, get 4 generated images. Midjourney's goal is to generate the best possible images given a prompt. The user's desire is to get the best possible image from a prompt. There is 100% alignment between the product's and the user's goals. And you don't need to give advanced instructions to Midjourney users on selecting the best picture. Their human intuition guides them effortlessly. The result is invaluable data for Midjourney to train their models.
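
To make that data flywheel concrete, here is a minimal sketch of how a "pick one of four" interaction could be captured as a preference label. The function name, storage format, and fields are my own hypothetical choices, not Midjourney's actual pipeline.

```python
# Minimal sketch: when the user picks one of four generated images, record
# that choice as a human preference judgment for later model training.
import json
import time

def log_preference(prompt: str, candidate_ids: list[str], chosen_id: str) -> None:
    """Append one preference record to a local JSONL log (hypothetical store)."""
    record = {
        "timestamp": time.time(),
        "prompt": prompt,
        "candidates": candidate_ids,
        "chosen": chosen_id,  # the user's pick: a free, high-quality label
    }
    with open("preferences.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")

# Example: the user generated four images and upscaled the second one.
log_preference("a lighthouse at dusk", ["img_1", "img_2", "img_3", "img_4"], "img_2")
```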

Think of Netflix. Occasionally, while you're watching Netflix, you will be asked, 'Are you still watching X?' Netflix understands how crucial this subtle piece of information is for their recommender systems and the user experience of the product. The recommender system needs to know if you are actually watching to determine what to recommend next. And if you fell asleep, it's convenient that Netflix can predict this and automatically stop the series for you, so you don't have to search for where you stopped watching. When you are prompted with 'Are you still watching X?', you as a user are fully motivated to answer, and your answer gives Netflix almost exactly the information they need to evaluate their system.

More examples of products that do this are Google Search and every social media feed. Huge companies are built by integrating UX and AI to craft superior products. Yet, not many consider AI and UX in this integrated manner. It should be a consideration in every UX decision made.

Prompting Patterns: The Clarification Pattern

The more I use ChatGPT and develop software using LLM APIs, the more I realize that context is essential for LLMs to provide high-quality answers. When I use ChatGPT and receive unsatisfactory answers, it's typically due to a lack of information about the problem I'm presenting or my current situation. I often notice that I might be ambiguous about the task I want ChatGPT to solve, or ChatGPT perceives the issue in a manner I hadn't anticipated. However, I've observed that by adopting a simple pattern, I can significantly reduce these challenges, consistently leading to more accurate responses.

The pattern is as follows (see the code sketch after the list):

  1. Me: I instruct ChatGPT to perform a task. I tell it not to respond immediately but to ask clarifying questions if any aspect of my instruction is unclear.
  2. ChatGPT: Asks clarifying questions.
  3. Me: I answer the questions and tell it again not to execute the instruction but to ask further clarifying questions if any part of my answers is unclear.
  4. ChatGPT: It does one of two things.
    a) Asks additional clarifying questions. If this happens, return to step 3.
    b) Indicates it has no further questions. If this is the case, proceed to step 5.
  5. Me: I give the command to execute the instruction.
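
In code, the loop might look like this minimal sketch using the OpenAI Python SDK. The model name and the exact stopping phrase are assumptions I've made for illustration.

```python
# Minimal sketch of the Clarification Pattern as a conversation loop.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

messages = [{
    "role": "user",
    "content": (
        "Write a data-retention policy for our app. Do not answer yet: "
        "if anything about this task is unclear, ask clarifying questions. "
        "Reply exactly 'NO FURTHER QUESTIONS' once everything is clear."
    ),
}]

while True:
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})

    if "NO FURTHER QUESTIONS" in answer:  # step 4b: nothing left to clarify
        break

    # Steps 3 and 4a: answer the questions, then invite further questions.
    print(answer)
    user_answers = input("Your answers: ")
    messages.append({
        "role": "user",
        "content": user_answers + "\nAgain, do not execute the instruction yet; "
        "ask further clarifying questions if any of my answers are unclear.",
    })

# Step 5: give the command to execute with the full context built up above.
messages.append({"role": "user", "content": "Now execute the original instruction."})
final = client.chat.completions.create(model="gpt-4o", messages=messages)
print(final.choices[0].message.content)
```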

I call this the "Clarification Pattern." Recognizing this approach shifted my perspective from viewing prompt engineering solely as individual prompts to thinking in terms of human-AI conversations. Through these dialogues, I can build valuable context by clarifying ambiguities in both my understanding and that of ChatGPT, thus providing ChatGPT with the optimal conditions to deliver an excellent response.

Text Classifiers are an Underrated Application of LLMs

Before LLMs really became a thing, getting up and running with a text classifier for a non-standard problem from scratch, including the annotation of a dataset for training, would probably take at least 3 weeks of full-time work. At 40 hours a week, that amounts to 7,200 minutes. Today, getting up and running with a classifier using LLMs requires only writing a prompt, which takes about a minute.
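
For illustration, the one-minute version might look like this minimal sketch with the OpenAI Python SDK; the label set, example message, and model name are hypothetical.

```python
# Minimal sketch: a single prompt turns a general LLM into a text classifier.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def classify(text: str) -> str:
    """Classify a customer message into one of three hypothetical labels."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name
        messages=[{
            "role": "user",
            "content": (
                "Classify the following customer message as exactly one of: "
                "billing, bug_report, feature_request.\n"
                f"Message: {text}\n"
                "Answer with the label only."
            ),
        }],
    )
    return response.choices[0].message.content.strip()

print(classify("I was charged twice this month."))  # expected: billing
```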

That's a 7,200x productivity gain in the initial process of working with text classifiers.

One thing to note, however, is that in the 1-minute prompt scenario, you have collected zero data and therefore have nothing to measure your classifier's performance against. But since you now have a classifier, you can annotate much more efficiently using an active learning approach, and you have 7,199 minutes left to knock yourself out evaluating your classifier.
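
Here is one way that could look: a minimal sketch of model-assisted annotation (a simplified stand-in for a full uncertainty-based active learning loop) that reuses the hypothetical classify function from the sketch above.

```python
# Minimal sketch: let the classifier propose labels, have a human confirm or
# correct them, and save the result as evaluation/training data.
import json

unlabeled = [
    "I was charged twice this month.",
    "The app crashes when I log in.",
    "Please add a dark mode.",
]

with open("annotations.jsonl", "a") as f:
    for text in unlabeled:
        suggestion = classify(text)  # model pre-labels the example
        answer = input(f"{text!r} -> {suggestion}? [Enter = accept, or type a label]: ")
        label = answer.strip() or suggestion  # human confirms or corrects
        f.write(json.dumps({"text": text, "label": label}) + "\n")
```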

Everybody talks about chatbots and agents as the hot new thing, but honestly, a 7,200x productivity gain in text classifier development is also pretty huge!

Thought Debugging

Previously, tuning text classifiers required annotated datasets. This process entailed splitting the datasets into training and test sets, fine-tuning the model, and measuring its performance. Often, improving accuracy meant analyzing incorrect predictions to hypothesize about what the model failed to understand about the problem. Solutions for improving performance could involve adding more annotations, tweaking the annotation protocol, or adjusting preprocessing steps.

However, with the rise of Large Language Models (LLMs), the focus has shifted towards crafting effective prompts rather than constructing datasets. If a model doesn't respond accurately, you refine the prompt to accommodate potential misunderstandings rather than fine-tuning the model's weights. A significant advantage of LLMs is their ability to explain the reasoning behind their predictions. This interactive approach allows users to probe the model's understanding and refine prompts further. Because these models can articulate their thought processes, you gain not only better performance but also a technique that can be termed "Thought Debugging": diagnosing and correcting the model's reasoning by reading its own explanations.
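
As a minimal sketch of what this can look like in practice: ask the model for its reasoning alongside the label, and read the explanation whenever the label is wrong. The labels, example message, and model name here are hypothetical.

```python
# Minimal sketch of "Thought Debugging": request the reasoning behind a
# classification so prompt gaps become visible.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name
    messages=[{
        "role": "user",
        "content": (
            "Classify this message as billing, bug_report, or feature_request: "
            "'I can't find the export button anywhere.'\n"
            "First explain your reasoning step by step, then give the label."
        ),
    }],
)

# If the label is wrong, the explanation usually reveals the misunderstanding,
# e.g. an unstated definition of what counts as a bug, which tells us exactly
# what to clarify in the next version of the prompt.
print(response.choices[0].message.content)
```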