### System Info

CUDA: 12.1
Python: 3.10
Rust: 1.75.0

### Reproduction
Run the launcher in Docker and mount the socket to the host with the CLI below:

```shell
docker run --gpus all --shm-size 1g -v /tmp:/tmp -v /root/Project/text-generation-inference/ink-tgi/models:/data ghcr.io/huggingface/text-generation-inference:1.4 --model-id TinyLlama/TinyLlama-1.1B-Chat-v1.0
```
Get the model's tokenizer_config.json and change the chat_template as below:

```json
"chat_template": "{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n'}}{% if message['tool_calls'] %} {{''}} {% else %} {{message['content'] + eos_token}} {% endif %}\n{% elif message['role'] == 'tool' %}\n{{ '<|tool|>\n' +message['name'] + '\n'+ message['content'] + eos_token }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}",
```
Run the router on the host with:

```shell
cd router
cargo run -- --tokenizer-config-path /root/Project/text-generation-inference/ink-tgi/router/tokenizer_config.json
```
Call the endpoint with a function-calling curl request following the OpenAI interface.
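For illustration, a request of roughly this shape could be used; the route, port, and `model` value here are assumptions based on the Python test further below, not verified against the router:

```shell
# Hypothetical example request; adjust host, port, and model to your setup.
curl http://localhost:3000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "tgi",
        "messages": [{"role": "user", "content": "What is the weather like in Paris?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
                    },
                    "required": ["location"]
                }
            }
        }],
        "tool_choice": "auto"
    }'
```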
- The `Message` struct in `lib.rs` is missing the `tool_calls` attribute required by the OpenAI spec.
- The `ToolCall` struct in `lib.rs` declares `id` as `u32`, but it must be a `String` according to the OpenAI spec (see the sketch after this list).
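For illustration, a minimal sketch of the shapes the OpenAI spec implies for these types; field names follow the spec, and this is not the actual `lib.rs` code:

```rust
// Sketch of OpenAI-spec-compliant message and tool-call shapes.
// Not TGI's actual definitions; shown to make the two mismatches concrete.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub struct Message {
    pub role: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub content: Option<String>,
    // Missing today: assistant messages need an optional `tool_calls` field.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tool_calls: Option<Vec<ToolCall>>,
}

#[derive(Serialize, Deserialize)]
pub struct ToolCall {
    // Must be a String per the OpenAI spec (e.g. "call_abc123"), not u32.
    pub id: String,
    pub r#type: String, // always "function"
    pub function: FunctionCall,
}

#[derive(Serialize, Deserialize)]
pub struct FunctionCall {
    pub name: String,
    // The spec serializes arguments as a JSON-encoded string.
    pub arguments: String,
}
```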
### Expected behavior

The router must serve an interface that supports the function-calling implementations of LangChain and other LLM application frameworks.
You can test it with the Python code below:
```python
import openai
import json

client = openai.OpenAI(
    api_key="",  # can be anything
    base_url="http://localhost:3000/v1"  # NOTE: replace with the IP address and port of your TGI router
)

# Example dummy function hard coded to return the same weather.
# In production, this could be your backend API or an external API.
def get_current_weather(location, unit="fahrenheit"):
    """Get the current weather in a given location"""
    if "tokyo" in location.lower():
        return json.dumps({"location": "Tokyo", "temperature": "10", "unit": "celsius"})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "72", "unit": "fahrenheit"})
    elif "paris" in location.lower():
        return json.dumps({"location": "Paris", "temperature": "22", "unit": "celsius"})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})

def run_conversation():
    # Step 1: send the conversation and available functions to the model
    messages = [{"role": "user", "content": "What's the weather like in San Francisco, Tokyo, and Paris?"}]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location"],
                },
            },
        }
    ]
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=messages,
        tools=tools,
        tool_choice="auto",  # auto is default, but we'll be explicit
    )
    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls
    # Step 2: check if the model wanted to call a function
    if tool_calls:
        # Step 3: call the function
        # Note: the JSON response may not always be valid; be sure to handle errors
        available_functions = {
            "get_current_weather": get_current_weather,
        }  # only one function in this example, but you can have multiple
        messages.append(response_message)  # extend conversation with assistant's reply
        # Step 4: send the info for each function call and function response to the model
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_to_call = available_functions[function_name]
            function_args = json.loads(tool_call.function.arguments)
            function_response = function_to_call(
                location=function_args.get("location"),
                unit=function_args.get("unit"),
            )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )  # extend conversation with function response
        second_response = client.chat.completions.create(
            model="gpt-3.5-turbo-1106",
            messages=messages,
        )  # get a new response from the model where it can see the function response
        return second_response

print(run_conversation())
```