OpenAI Cookbook - UT (unit tests)

Published 2023-04-26 15:26:34, author: fxjwind

Unlike the original example, I did not use the API; I used ChatGPT 3.5 directly.

Building a good UT prompt takes several steps, three in this case. I copied each step's prompt into ChatGPT and pasted the result back.

Complex tasks, such as writing unit tests, can benefit from multi-step prompts.
In contrast to a single prompt, a multi-step prompt generates text from GPT-3 and then feeds that text back into subsequent prompts. This can help in cases where you want GPT-3 to explain its reasoning before answering, or brainstorm a plan before executing it.

In this notebook, we use a 3-step prompt to write unit tests in Python using the following steps:

  1. Given a Python function, we first prompt GPT-3 to explain what the function is doing.
  2. Second, we prompt GPT-3 to plan a set of unit tests for the function.
    • If the plan is too short, we ask GPT-3 to elaborate with more ideas for unit tests.
  3. Finally, we prompt GPT-3 to write the unit tests.
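The "too short" check in step 2 can be made concrete. The following is a minimal sketch, assuming the plan comes back as a markdown bullet list and that the first bullet marker is already part of the prompt (so it does not appear in the completion). The helper names `count_top_level_bullets` and `needs_elaboration` and the sample plan are hypothetical, for illustration only.

```python
def count_top_level_bullets(plan_completion: str) -> int:
    """Count top-level "-" bullets in a completion that continues a prompt ending in "-".

    Sub-bullets are indented, so "\n-" matches only top-level bullets. The first
    bullet's marker lives in the prompt, not in the completion, hence the +1.
    """
    return plan_completion.count("\n-") + 1


def needs_elaboration(plan_completion: str, approx_min_cases_to_cover: int = 7) -> bool:
    """Return True when the plan covers fewer categories than the target."""
    return count_top_level_bullets(plan_completion) < approx_min_cases_to_cover


# Hypothetical completion: three categories, each with one indented example.
sample_plan = (
    " Simple palindromes\n"
    "    - \"racecar\"\n"
    "- Non-palindromes\n"
    "    - \"hello world\"\n"
    "- The empty string\n"
    "    - \"\""
)

print(count_top_level_bullets(sample_plan))  # 3
print(needs_elaboration(sample_plan))        # True, since 3 < 7
```

A plan that is not formatted as "-" bullets counts as a single category under this rule, so the check only works when the model follows the bullet format set up by the prompt.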

The original code example appends each step's question and answer to the prompt and resubmits the whole text.

I suspected GPT might keep the context by itself, so I asked each question separately.
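The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the cookbook's code: `fake_complete` is a hypothetical stand-in for a model call (it only records the prompts it receives), and the step prompts are shortened placeholders. In the cumulative variant every request carries the earlier prompts and answers. In the standalone variant each request carries only its own prompt, so continuity depends entirely on what the chat session remembers.

```python
from typing import Callable, List


def run_cumulative(step_prompts: List[str], complete: Callable[[str], str]) -> str:
    """Append each step's prompt and answer to one growing text and resubmit it."""
    text = ""
    for step_prompt in step_prompts:
        text += step_prompt
        text += complete(text)  # the model sees everything written so far
    return text


def run_standalone(step_prompts: List[str], complete: Callable[[str], str]) -> List[str]:
    """Send each step's prompt by itself, with no earlier text attached."""
    return [complete(step_prompt) for step_prompt in step_prompts]


received: List[str] = []  # prompts seen by the stand-in model


def fake_complete(prompt: str) -> str:
    """Hypothetical stand-in for a model call: records the prompt, returns a canned answer."""
    received.append(prompt)
    return f" <answer {len(received)}>\n"


steps = ["Explain the function.\n- First,", "Plan the unit tests.\n-", "Write the unit tests.\n"]

full_text = run_cumulative(steps, fake_complete)
cumulative_prompts = list(received)

received.clear()
run_standalone(steps, fake_complete)
standalone_prompts = list(received)

# The last cumulative request still contains step 1's prompt; the standalone one does not.
print(steps[0] in cumulative_prompts[-1])  # True
print(steps[0] in standalone_prompts[-1])  # False
```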

def unit_test_from_function(
    function_to_test: str,  # Python function to test, as a string
    unit_test_package: str = "pytest",  # unit testing package; use the name as it appears in the import statement
    approx_min_cases_to_cover: int = 7,  # minimum number of test case categories to cover (approximate)
    print_text: bool = False,  # optionally prints text; helpful for understanding the function & debugging
    text_model: str = "text-davinci-002",  # model used to generate text plans in steps 1, 2, and 2b
    code_model: str = "code-davinci-002",  # if you don't have access to code models, you can use text models here instead
    max_tokens: int = 1000,  # can set this high, as generations should be stopped earlier by stop sequences
    temperature: float = 0.4,  # temperature = 0 can sometimes get stuck in repetitive loops, so we use 0.4
    reruns_if_fail: int = 1,  # if the output code cannot be parsed, this will re-run the function up to N times
) -> str:
    """Outputs a unit test for a given Python function, using a 3-step GPT-3 prompt."""

    # Step 1: Generate an explanation of the function

    # create a markdown-formatted prompt that asks GPT-3 to complete an explanation of the function, formatted as a bullet list
    prompt_to_explain_the_function = f"""# How to write great unit tests with {unit_test_package}

In this advanced tutorial for experts, we'll use Python 3.9 and `{unit_test_package}` to write a suite of unit tests to verify the behavior of the following function.
```python
{function_to_test}
```

Before writing any unit tests, let's review what each element of the function is doing exactly and what the author's intentions may have been.
- First,"""

    text_color_prefix = "\033[97m"  # white, for reading against a dark background; use \033[30m (black) on a light one
    print("\033[92m"+"------------------------------STEP1--------------------------", end="\n")
    print(text_color_prefix + prompt_to_explain_the_function, end="\n")  # end="\n" prints a newline after the prompt

    # the original code sends the prompt to the API here, using \n\n as a stop sequence to stop at the end of the bullet list
    # in this version the completion is the answer copied back by hand from ChatGPT
    explanation_completion = " the function is_palindrome takes a single string argument s. The function returns a boolean value indicating whether s is a palindrome. A palindrome is a word, phrase, number, or other sequence of characters that reads the same forward and backward. The function compares s to its reverse string using slicing notation (s[::-1]). If s and its reverse string are the same, then is_palindrome returns True. Otherwise, it returns False."

    print("\033[92m"+"------------------------------STEP2a--------------------------", end="\n")

    # Step 2: Generate a plan to write a unit test
    # create a markdown-formatted prompt that asks GPT-3 to complete a plan for writing unit tests, formatted as a bullet list
    prompt_to_explain_a_plan = f"""
A good unit test suite should aim to:
- Test the function's behavior for a wide range of possible inputs
- Test edge cases that the author may not have foreseen
- Take advantage of the features of `{unit_test_package}` to make the tests easy to write and maintain
- Be easy to read and understand, with clean code and descriptive names
- Be deterministic, so that the tests always pass or fail in the same way

`{unit_test_package}` has many convenient features that make it easy to write and maintain unit tests. We'll use them to write unit tests for the function above.

For this particular function, we'll want our unit tests to handle the following diverse scenarios (and under each scenario, we include a few examples as sub-bullets):
-"""

    # append this planning prompt to the results from step 1
    prior_text = prompt_to_explain_the_function + explanation_completion
    full_plan_prompt = prior_text + prompt_to_explain_a_plan  # no color codes here: they are for display only

    print(text_color_prefix + prompt_to_explain_a_plan, end="")
    # send the prompt to the API, using \n\n as a stop sequence to stop at the end of the bullet list

    plan_completion = ""

    print("\033[92m"+"------------------------------STEP2b--------------------------", end="\n")
    # Step 2b: If the plan is short, ask GPT-3 to elaborate further
    # this counts top-level bullets (e.g., categories), but not sub-bullets (e.g., test cases)
    elaboration_needed = plan_completion.count("\n-") + 1 < approx_min_cases_to_cover  # adds 1 because the first bullet is not counted
    prompt_to_elaborate_on_the_plan = f"""

In addition to the scenarios above, we'll also want to make sure we don't forget to test rare or unexpected edge cases (and under each edge case, we include a few examples as sub-bullets):
-"""

    print(text_color_prefix + prompt_to_elaborate_on_the_plan, end="")

    elaboration_completion = ""
    print("\033[92m"+"------------------------------STEP3--------------------------", end="\n")
    # Step 3: Generate the unit test
    # create a markdown-formatted prompt that asks GPT-3 to complete a unit test
    starter_comment = ""
    if unit_test_package == "pytest":
        starter_comment = "Below, each test case is represented by a tuple passed to the @pytest.mark.parametrize decorator"
    prompt_to_generate_the_unit_test = f"""

Before going into the individual tests, let's first look at the complete suite of unit tests as a cohesive whole. We've added helpful comments to explain what each line does.
```python
import {unit_test_package}  # used for our unit tests

{function_to_test}

#{starter_comment}"""


    print(text_color_prefix + prompt_to_generate_the_unit_test, end="")

    unit_test_completion = ""  # placeholder: nothing is fetched from the API in this version
    return unit_test_completion

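In practice, at the first step ChatGPT already gave a complete answer outright, along with a tutorial on how to write the unit tests. I copied out only the part that describes what the function does.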

The second and third steps did yield more cases.

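At the fourth step, GPT had already forgotten the context and went straight to explaining @pytest.mark.parametrize. This may be why the original sample code concatenates the earlier context onto each prompt sent to GPT.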

Either way, concatenating everything gives the text below, in which the answers obtained from GPT are marked with underline and italics.

In this advanced tutorial for experts, we'll use Python 3.9 and `pytest` to write a suite of unit tests to verify the behavior of the following function.
```python
def is_palindrome(s): return s == s[::-1]
```

Before writing any unit tests, let's review what each element of the function is doing exactly and what the author's intentions may have been.
- First, the function is_palindrome takes a single string argument s.
The function returns a boolean value indicating whether s is a palindrome. A palindrome is a word, phrase, number, or other sequence of characters that reads the same forward and backward.
The function compares s to its reverse string using slicing notation (s[::-1]). If s and its reverse string are the same, then is_palindrome returns True. Otherwise, it returns False.

 

A good unit test suite should aim to:
- Test the function's behavior for a wide range of possible inputs
- Test edge cases that the author may not have foreseen
- Take advantage of the features of `pytest` to make the tests easy to write and maintain
- Be easy to read and understand, with clean code and descriptive names
- Be deterministic, so that the tests always pass or fail in the same way

`pytest` has many convenient features that make it easy to write and maintain unit tests. We'll use them to write unit tests for the function above.
For this particular function, we'll want our unit tests to handle the following diverse scenarios (and under each scenario, we include a few examples as sub-bullets):
- Test a simple palindrome with no spaces:
    - "racecar"
    - "level"
- Test a palindrome with spaces:
    - "A man a plan a canal Panama"
    - "Was it a car or a cat I saw?"
- Test a string that is not a palindrome:
    - "hello world"
    - "python"
- Test an empty string:
    - ""
- Test a palindrome with lowercase letters:
    - "deified"
    - "radar"
- Test a palindrome with uppercase letters:
    - "LEVEL"
    - "MOM"
- Test a palindrome with mixed case:
    - "RaCeCaR"
    - "Aibohphobia"
- Test a palindrome with special characters:
    - "A man, a plan, a canal: Panama"
    - "A Santa at NASA"
- Test a string with non-alphanumeric characters:
    - "!@#$%^&*()"
    - "1 2 3 2 1" (with spaces)
    - "race-car"

In addition to the scenarios above, we'll also want to make sure we don't forget to test rare or unexpected edge cases (and under each edge case, we include a few examples as sub-bullets):
- Test a very long palindrome:
    - "A man a plan a canal Panama" repeated 100 times
- Test a string with a very large number of characters:
    - A string with 10,000 characters
- Test a string with non-ASCII characters:
    - "révéler"
    - "über"
- Test a string that is a palindrome, but with only one character repeated multiple times:
    - "aaaaaa"
- Test a string with only one character:
    - "a"
- Test a string with only two characters:
    - "aa"
    - "ab"
- Test a string with leading or trailing spaces:
    - " racecar "
    - " level"
- Test a string with leading or trailing non-alphanumeric characters:
    - ".racecar"
    - "level@"
- Test a string with multiple lines:
    - "A man a plan a canal Panama"
- Test a string with mixed whitespace characters:
    - "A man a plan a canal Panama"
Before going into the individual tests, let's first look at the complete suite of unit tests as a cohesive whole. We've added helpful comments to explain what each line does.
```python
import pytest # used for our unit tests

def is_palindrome(s): return s == s[::-1]

#Below, each test case is represented by a tuple passed to the @pytest.mark.parametrize decorator

 

The final answer from GPT is as follows. It gave two versions: a plain version and a version using the @pytest.mark.parametrize decorator.

The results look fine.
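Independently of the model's answer, the expected values for a few of the planned inputs can be checked by hand. The sketch below uses plain assertions, so it runs without pytest. The `expected` table is my own illustration and is not ChatGPT's output. Because `is_palindrome` compares the raw string with its reverse, inputs that are palindromes only after removing spaces, punctuation, or case differences evaluate to False for this implementation.

```python
def is_palindrome(s): return s == s[::-1]


# Expected results for the function exactly as written (no normalization of case or spacing).
expected = {
    "racecar": True,
    "": True,
    "a": True,
    "ab": False,
    "LEVEL": True,
    "RaCeCaR": True,                       # the casing itself is symmetric
    "Aibohphobia": False,                  # "A" at the start, "a" at the end
    "A man a plan a canal Panama": False,  # spaces and case break the symmetry
    " racecar ": True,                     # one space on each side is symmetric
    " level": False,                       # leading space only
    "race-car": False,                     # the hyphen is off-center
    "1 2 3 2 1": True,
}

for text, want in expected.items():
    assert is_palindrome(text) == want, (text, want)

print("all", len(expected), "cases match")
```

Tests generated from the plan should therefore assert False for phrases such as "A man a plan a canal Panama", unless the function is changed to normalize its input first.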