ChatGPT Streaming Output
References
Checking whether a message violates policy (moderation): https://platform.openai.com/docs/guides/moderation/quickstart
How to enable streaming: https://platform.openai.com/docs/api-reference/chat/create#chat/create-stream
stream (boolean, optional, defaults to false)
How to keep a continuous conversation context: https://platform.openai.com/docs/guides/chat/introduction; this can be tested at https://platform.openai.com/playground?mode=chat, as shown below. At the start of a conversation, the assistant content is empty.
# Including the conversation history is helpful when the user's instructions refer to earlier messages.
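Keeping context simply means resending the accumulated history in the messages array on every call. A minimal sketch (the strings are placeholder content, not from a real response):

```python
# Turn 1: only the user message
messages = [{"role": "user", "content": "Explain bubble sort"}]

# After the first call returns, append the assistant's reply...
messages.append({"role": "assistant", "content": "Bubble sort works by ..."})

# ...then append the follow-up question and send the WHOLE list again,
# so the model can resolve references like "its" against the earlier turns.
messages.append({"role": "user", "content": "What is its time complexity?"})

print([m["role"] for m in messages])  # → ['user', 'assistant', 'user']
```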
In Python, the assistant's reply can be extracted with response['choices'][0]['message']['content'].
Every response will include a finish_reason. The possible values for finish_reason are:
- stop: API returned complete model output
- length: Incomplete model output due to max_tokens parameter or token limit
- content_filter: Omitted content due to a flag from our content filters
- null: API response still in progress or incomplete
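Putting the two points above together, extracting the reply and checking finish_reason looks like this (the response dict here is hardcoded in the documented shape rather than fetched from the API):

```python
# Sample chat completion response in the documented (non-streaming) shape;
# a real response would come back from the /v1/chat/completions endpoint.
response = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "Bubble sort repeatedly swaps adjacent elements."},
            "finish_reason": "stop",
        }
    ]
}

reply = response["choices"][0]["message"]["content"]
finish_reason = response["choices"][0]["finish_reason"]

if finish_reason == "length":
    print("Warning: output truncated by max_tokens or the context limit")
print(reply)
```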
With ChatGPTUnofficialProxyAPI, if you want to track the conversation you need to pass parentMessageId, like this:
const api = new ChatGPTAPI({ apiKey: process.env.OPENAI_API_KEY })
You can add streaming via the onProgress handler:
const res = await api.sendMessage('Write a 500 word essay on frogs.', { onProgress: (partial) => console.log(partial.text) })
python: https://github.com/labteral/chatgpt-python/blob/master/chatgpt/chatgpt.py
https://www.reddit.com/r/OpenAI/comments/10x67vc/stream_responses_from_openai_api_with_python_a/
Obtaining an accessToken: log in at https://chat.openai.com/chat, then fetch it from https://chat.openai.com/api/auth/session.
nginx configuration
The issue has been tracked down:
location /api
{
rewrite ^/api/?(.*)$ /$1 break;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header REMOTE-HOST $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Connection "";
proxy_http_version 1.1;
proxy_pass http://127.0.0.1:3002/;
}
This local configuration works as-is and produces the typewriter effect. The key point: if you are going through a proxied API, the proxy must be configured as well. nginx enables proxy_buffering by default with a 4k buffer, so the response accumulates inside nginx until 4k is reached or the connection closes before anything reaches the client. It is recommended to turn proxy_buffering off. A reference configuration:
location /openai/ {
default_type application/octet-stream;
proxy_buffering off;
chunked_transfer_encoding on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 120;
proxy_pass https://api.openai.com/;
proxy_set_header Host api.openai.com;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header REMOTE-HOST $remote_addr;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade; # requires a "map $http_upgrade $connection_upgrade" block in the http context
proxy_http_version 1.1;
}
Testing streaming output in Python:
import os
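A sketch of consuming streamed chunks. With the early-2023 openai Python library the iterator would come from openai.ChatCompletion.create(..., stream=True); here the chunks are hardcoded in the chat.completion.chunk shape shown in the captures below, so the sketch runs offline:

```python
def collect_stream(chunks):
    """Accumulate the content deltas of chat.completion.chunk objects into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            # For a typewriter effect, print(delta["content"], end="", flush=True) here.
            parts.append(delta["content"])
    return "".join(parts)

# Hardcoded chunks: first carries the role, the last carries finish_reason "stop".
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}, "index": 0, "finish_reason": None}]},
    {"choices": [{"delta": {"content": "Bubble"}, "index": 0, "finish_reason": None}]},
    {"choices": [{"delta": {"content": " sort"}, "index": 0, "finish_reason": None}]},
    {"choices": [{"delta": {}, "index": 0, "finish_reason": "stop"}]},
]
print(collect_stream(fake_chunks))  # → Bubble sort
```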
Request analysis
Without streaming
- Request flow
Request URL: https://nephengpt.zeabur.app/proxy/v1/chat/completions
Request Method: POST
Status Code: 200
Remote Address: 119.23.226.212:3008
Referrer Policy: strict-origin-when-cross-origin
- Response headers
access-control-allow-origin: http://localhost:5174
cache-control: no-cache, must-revalidate
content-encoding: gzip
content-type: application/json # not streaming
date: Sun, 12 Mar 2023 05:33:10 GMT
openai-model: gpt-3.5-turbo-0301
openai-organization: user-utn66hl1ou1ok1dzfhyndxj3
openai-processing-ms: 13562
openai-version: 2020-10-01
strict-transport-security: max-age=15724800; includeSubDomains
vary: Origin, Accept-Encoding
x-request-id: 3bae3759ea1365e84ad6d8edccb7d8a1
- Request headers
:authority: nephengpt.zeabur.app
:method: POST
:path: /proxy/v1/chat/completions
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: zh-CN,zh;q=0.9
authorization: Bearer
cache-control: no-cache
content-length: 88
content-type: application/json
origin: http://localhost:5174
pragma: no-cache
referer: http://localhost:5174/
sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"
sec-ch-ua-mobile: ?1
sec-ch-ua-platform: “Android”
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: cross-site
user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Mobile Safari/537.36
- Request body
{
"messages": [
{
"role": "user",
"content": "说一下冒泡排序"
}
],
"model": "gpt-3.5-turbo"
}
With streaming, e.g. https://freegpt.one/
- Request flow
Request URL: https://freegpt.one/backend-api/conversation
Request Method: POST
Status Code: 200
Remote Address: 119.23.226.212:3008
Referrer Policy: strict-origin-when-cross-origin
- Response headers
access-control-allow-origin: *
cache-control: no-cache
cf-cache-status: DYNAMIC
cf-ray: 7a69bd4d9900aac9-SYD
content-type: text/event-stream # format of the response
date: Sun, 12 Mar 2023 05:46:55 GMT
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
report-to: {"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v3?s=uqkV8Rbu6KwATAq5Enx2gEs62OtzHb1OzxOEA%2FPPDTmEOU8rB2mTJRsbo%2FhBhozu4QjYf4hB8PB95lI3j7wflQxjXd59ZFJdC3nZIJrV5EAvf2UHqpr8C%2BKLgJ%2Bc"}],"group":"cf-nel","max_age":604800}
server: cloudflare
- Request headers
:authority: freegpt.one
:method: POST
:path: /backend-api/conversation
:scheme: https
accept: text/event-stream # explicitly requests an event stream
accept-encoding: gzip, deflate, br
accept-language: zh-CN,zh;q=0.9
authorization: Bearer
cache-control: no-cache
content-length: 262
content-type: application/json
cookie: cf_clearance=SCO.6UKnQWds7f7f0di8FY7YoR3SvGuRh8GLNecfsYo-1678599991-0-250
origin: https://freegpt.one # the proxy site
pragma: no-cache
referer: https://freegpt.one/
sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"
sec-ch-ua-mobile: ?1
sec-ch-ua-platform: “Android”
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Mobile Safari/537.36
x-openai-assistant-app-id
- Request body
{
"action": "next",
"messages": [
{
"id": "60d068fc-54b8-43aa-a10c-59f6d009a95c",
"role": "user",
"content": {
"content_type": "text",
"parts": [
"python怎么流式输出response"
]
}
}
],
"parent_message_id": "283deb9b-ac7a-430f-95a9-92394ce9e149",
"model": "text-davinci-002-render"
}
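A text/event-stream body arrives as `data: <json>` lines and ends with `data: [DONE]`. A minimal parser sketch (the sample payload is made up, in the delta shape used by the chat API):

```python
import json

def parse_sse(raw: str):
    """Yield the JSON payload of each `data:` line, stopping at [DONE]."""
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)

# Made-up two-chunk stream followed by the terminator
sample = (
    'data: {"choices":[{"delta":{"content":"Hi"},"index":0}]}\n\n'
    'data: {"choices":[{"delta":{"content":"!"},"index":0}]}\n\n'
    "data: [DONE]\n\n"
)
text = "".join(e["choices"][0]["delta"].get("content", "") for e in parse_sse(sample))
print(text)  # → Hi!
```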
Another streaming approach, e.g. https://gpt.gcchen.cn/:
- Request flow
Request URL: https://gpt.gcchen.cn/api/chat-process
Request Method: POST
Status Code: 200
Remote Address: 1.12.228.197:443
Referrer Policy: strict-origin-when-cross-origin
- Response headers
access-control-allow-credentials: true
access-control-allow-headers: Content-Type
access-control-allow-headers: *
access-control-allow-methods: *
access-control-allow-methods: *
access-control-allow-origin: *
access-control-allow-origin: *
content-type: application/octet-stream
date: Sun, 12 Mar 2023 06:05:49 GMT
server: nginx
strict-transport-security: max-age=31536000
x-powered-by: Express
- Request headers
:authority: gpt.gcchen.cn
:method: POST
:path: /api/chat-process
:scheme: https
accept: application/json, text/plain, */*
accept-encoding: gzip, deflate, br
accept-language: zh-CN,zh;q=0.9
cache-control: no-cache
content-length: 38
content-type: application/json
cookie: Hm_lvt_837d979bc9b73183a508c3486c37a02c=1678601121; Hm_lpvt_837d979bc9b73183a508c3486c37a02c=1678601121
origin: https://gpt.gcchen.cn
pragma: no-cache
referer: https://gpt.gcchen.cn/
sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"
sec-ch-ua-mobile: ?1
sec-ch-ua-platform: “Android”
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Mobile Safari/537.36
- Request body
{
"prompt": "冒泡排序",
"options": {}
}
Second request body
{
"prompt": "它的时间复杂度是多少",
"options": {
"parentMessageId": "chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS"
}
}
Third request body
{
"prompt": "和插入排序比怎么样",
"options": {
"parentMessageId": "chatcmpl-6t9BNZ47EjAa9j2xIkY3ZbJdYKNNK" # carries the conversation history forward
}
}
- Response data
{"role":"assistant","id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","parentMessageId":"63e9fcea-8c58-4174-b8bf-73898f366cf1","text":"","detail":{"id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","object":"chat.completion.chunk","created":1678601149,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}}
{"role":"assistant","id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","parentMessageId":"63e9fcea-8c58-4174-b8bf-73898f366cf1","text":"冒","delta":"冒","detail":{"id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","object":"chat.completion.chunk","created":1678601149,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"content":"冒"},"index":0,"finish_reason":null}]}}
{"role":"assistant","id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","parentMessageId":"63e9fcea-8c58-4174-b8bf-73898f366cf1","text":"冒泡","delta":"泡","detail":{"id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","object":"chat.completion.chunk","created":1678601149,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"content":"泡"},"index":0,"finish_reason":null}]}}
…
"parentMessageId":"63e9fcea-8c58-4174-b8bf-73898f366cf1","text":"冒泡排序是一种简单的排序算法,它重复地走访过要排序的数列,每次比较相邻的两个元素,如果顺序错误就交换它们的位置。经过一轮的比较后,最大(或最小)的元素就被交换到了数列的末尾(或开头),然后再从头开始进行下一轮比较和交换,直到全部元素都有序排列为止。","delta":"。","detail":{"id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","object":"chat.completion.chunk","created":1678601149,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"content":"。"},"index":0,"finish_reason":null}]}}
{"role":"assistant","id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","parentMessageId":"63e9fcea-8c58-4174-b8bf-73898f366cf1","text":"冒泡排序是一种简单的排序算法,它重复地走访过要排序的数列,每次比较相邻的两个元素,如果顺序错误就交换它们的位置。经过一轮的比较后,最大(或最小)的元素就被交换到了数列的末尾(或开头),然后再从头开始进行下一轮比较和交换,直到全部元素都有序排列为止。","detail":{"id":"chatcmpl-6t981Jwm9KjjCvfqlS6uCP66ChYnS","object":"chat.completion.chunk","created":1678601149,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}} # at the end finish_reason is stop and the data line becomes [DONE]
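Unlike the SSE endpoint, this chat-process endpoint streams newline-delimited JSON over application/octet-stream, and each line carries the cumulative text so far, so the client only needs the latest complete line and then feeds the returned id back as parentMessageId. A sketch with made-up data:

```python
import json

def latest_state(body: str) -> dict:
    """Return the last complete JSON line of a chat-process style stream."""
    lines = [line for line in body.strip().splitlines() if line]
    return json.loads(lines[-1])

# Made-up stream: each line repeats the full text accumulated so far
body = (
    '{"id":"chatcmpl-x","parentMessageId":"p1","text":"Bub","delta":"Bub"}\n'
    '{"id":"chatcmpl-x","parentMessageId":"p1","text":"Bubble","delta":"ble"}\n'
)
final = latest_state(body)
print(final["text"])  # cumulative text to render
next_options = {"parentMessageId": final["id"]}  # send with the next prompt to keep context
```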