Prompt Engineering Guide

My personal guide to prompt engineering for the best LLM experience.

Written by Zaid Mukaddam

Mar 15, 2024

In 2020, a video made the rounds on Twitter showing a screenshot of a developer typing a task into a text box on a rudimentary website, something like: “Create a React component that does X and Y.” Shortly afterward, an answer popped up in the form of code. Even though the quality of the result seemed mediocre, this task→result loop, handled entirely by a machine, stuck in my head. The model behind it was GPT-3.

A few years later, OpenAI released GPT-3.5 and GPT-4. These systems are no longer just a trend; they are increasingly used in the mainstream. I use GPT daily myself, and the models’ output has become genuinely usable. Schools and universities are probably only now getting to grips with the tool.

Developing large language models is no longer the exclusive domain of start-ups. Big tech corporations are pouring hundreds of millions into competitive models of their own, with the intent of eventually surpassing OpenAI. Recently, the French AI start-up Mistral AI unveiled a series of models that are not only freely available under the Apache license but are claimed to match the calibre of GPT-3.5 and GPT-4.

I Am Afraid of Being Replaced

AI processes structured patterns. Code is essentially a structured pattern. Programming is therefore a natural skill for AIs.

... That’s what I think.

AI doesn't operate on human cognitive principles; it sequences words into the pattern that best fits a task, a process that resembles programming rather than genuine reasoning. Put simply, programming produces a 'teabag' of useful text, ready to be infused, that is, executed, by a machine; the core aim is problem-solving. The craft of coding lies in writing clear and durable code.

Historically, humans have decided what to program and carried out the implementation. This dynamic might shift in the future, with AIs increasingly acting as task-executing 'worker bees.' Junior developers may find it harder to enter the industry, given how replaceable their tasks are, while senior developers will need to evolve in step with maturing AIs.

Even if AIs could one day program flawlessly, one critical role would remain: decision making. That means choosing the programming language and the framework, and keeping the project maintainable, because there is no single blueprint for clean code.

GPT Prompt Strategies

Strategies for Better Results

Over the past few months, through constant use and trial and error, I have learned how to put together task packages for AI in a way that delivers meaningful results. In the context of artificial intelligence, writing clearly defined instructions, i.e. describing the task to be performed by the AI, is called prompt engineering.

OpenAI has now published its own guide to prompt engineering, superseding the countless blog posts that drew conclusions about the ideal way to write tasks from their authors' own prompting experience.

In the following, I've summarized the key points in simple language so that you can get better results when prompting GPT-4 (and probably its successors as well). This summary may be of little use to you right now. But perhaps in a few years, when some variant of AI is more widely used in your work, you'll remember it and think: “There was something.”

1. Write Clear Instructions

AI models cannot read minds. If the output is too long, ask for a shorter answer. Likewise, if the results are too superficial, ask for expert-level text. If you want a specific format (e.g. a table), say so. The less guessing the model has to do, the better.

Approaches:

- Include details in your query to get more relevant answers.
- Ask the model to adopt a persona.
- Use delimiters to clearly mark distinct parts of the input.
- Specify the steps required to complete a task.
- Provide examples of the desired output.
- Specify the desired length of the output.
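As a concrete illustration, here is a minimal sketch using the OpenAI Python SDK (v1.x). The model name, the prompts, and the contrast between the vague and the clear variant are my own illustrative choices, not examples from OpenAI's guide.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A vague prompt forces the model to guess length, format, and audience.
vague = "Explain HTTP caching."

# A clear prompt pins down persona, audience, length, and output format.
clear = (
    "You are a senior backend engineer. Explain HTTP caching to a junior "
    "developer in at most 150 words, then list the relevant response "
    "headers in a two-column Markdown table (header, purpose)."
)

response = client.chat.completions.create(
    model="gpt-4",  # any chat-capable model works here
    messages=[{"role": "user", "content": clear}],
)
print(response.choices[0].message.content)
```

The clear variant leaves the model almost nothing to guess: persona, audience, length, and format are all pinned down.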

2. Provide Reference Texts

Language models tend to hallucinate when asked about esoteric topics, or for quotes and URLs, for example. The number of incorrect answers can be reduced by providing reference texts, i.e. context for the question. Ask the model to use the references as the basis for its answer.
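A small sketch of this strategy, again with the OpenAI Python SDK: the reference text and the triple-quote delimiter convention are illustrative assumptions; any trusted source text works the same way.

```python
from openai import OpenAI

client = OpenAI()

# `reference` stands in for whatever trusted source text you have at hand.
reference = (
    "Mistral AI released several open-weight models in 2023 under the "
    "Apache 2.0 license, including Mistral 7B and Mixtral 8x7B."
)

# Delimit the reference and explicitly allow an "I don't know" answer.
prompt = (
    "Answer the question using only the reference text delimited by triple "
    "quotes. If the answer cannot be found there, reply \"I don't know\".\n\n"
    f'"""{reference}"""\n\n'
    "Question: Under which license were Mistral AI's models released?"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```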

3. Break Complex Tasks Into Simpler Subtasks

Complex tasks tend to have higher error rates than simple ones. In software development, it is therefore good practice to break a complex system down into a series of modular components. The same applies to tasks given to a language model. In addition, the response to an earlier subtask can be used to construct the input for a later one.

Approaches:

- Use intent classification to identify the most relevant instructions for a query.
- For dialogue applications with very long conversations, summarize or filter the previous dialogue.
- Summarize long documents piecewise and construct a full summary recursively.
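A minimal sketch of such a decomposition: the two-step split (classify first, then ask a narrower follow-up) and the bug-report example are my own, chosen only to show how one subtask's answer feeds the next.

```python
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """Single-turn helper around the chat completions endpoint."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

bug_report = "The app crashes when uploading files larger than 2 GB on iOS 17."

# Subtask 1: classify the ticket, a small task with a constrained output.
category = ask(
    "Classify this bug report as UI, backend, or platform. "
    f"Reply with a single word.\n\n{bug_report}"
)

# Subtask 2: feed the first answer into a narrower follow-up task.
questions = ask(
    f"A {category} bug was reported. List three clarifying questions "
    f"to send to the reporter.\n\n{bug_report}"
)
print(questions)
```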

4. Give the Model Time to “Think”

When you add up the prices of items in a supermarket, you may not know the total immediately, but you can work it out given a moment. Similarly, models make reasoning errors when they try to answer immediately instead of taking time to work out an answer. Asking for a chain of thought before the final answer helps the model arrive at correct answers more reliably.

Approaches:

- Instruct the model to work out its own solution before rushing to a conclusion.
- Use an inner monologue or a sequence of queries to hide the reasoning process from the end user.
- Ask the model whether it missed anything on previous passes.
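A short sketch mirroring the supermarket analogy above; the exact wording used to request step-by-step working is one common phrasing, not the only one.

```python
from openai import OpenAI

client = OpenAI()

# Asking for the working *before* the answer gives the model room to "think".
prompt = (
    "First work through the problem step by step, showing each partial sum. "
    "Only then state the final total on its own line.\n\n"
    "A customer buys 3 apples at 0.40 each, 2 loaves of bread at 1.85 each, "
    "and a bottle of milk for 1.15. What is the total?"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```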

End of article. If you spot a typo or have thoughts about this article, feel free to write me. 🙆‍♂️