Double-counted tokens? #50

Open
dnakov opened this issue Oct 10, 2024 · 9 comments

Comments

@dnakov

dnakov commented Oct 10, 2024

OpenAI team,

First off, hats off to you for this awesome API and model.

Second, I'd like to bring up some observations because I'm not sure if they are intended or not.

Assuming the default modalities (text, audio) are selected, as the conversation goes on, for each response, all previous conversation items are counted as both input audio tokens and input text tokens.
Now, we can turn off the user input transcription (off by default), which would take care of the input text tokens coming from the user speech transcription, but we have no way of disabling assistant response transcription. So, we end up being charged for both text and audio tokens.
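
To make the accumulation concrete, here is a minimal sketch of the pattern described above. It assumes every earlier item is re-sent as input on each response. The function name and the fixed per-turn token sizes are made up for illustration and are not taken from the API.

```python
def input_tokens_per_turn(turns: int, user_tokens: int, assistant_tokens: int) -> list[int]:
    """Input tokens billed on each turn if the full history is re-sent every time."""
    history = 0
    billed = []
    for _ in range(turns):
        history += user_tokens        # the new user item joins the conversation
        billed.append(history)        # everything so far is counted as input
        history += assistant_tokens   # the reply becomes history for the next turn
    return billed


# Illustrative sizes only: ~20 tokens per question, ~180 per spoken answer.
per_turn = input_tokens_per_turn(turns=30, user_tokens=20, assistant_tokens=180)
print(per_turn[0], per_turn[9], per_turn[29])
print(sum(per_turn))
```

With these made-up sizes, the billed input grows linearly per turn (20, 1820, and 5820 tokens at turns 1, 10, and 30), so the 30-turn total of 87,600 input tokens grows quadratically with the number of turns.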

See below an example of a 30-turn interaction (with input_audio_transcription enabled) (and $10 down the drain 😅)
We paid $0.05 for just the first question, $0.22 for just the 10th, $0.42 for the 20th, $0.58 for the 30th, etc.
If I so much as sneezed after that, it'd have cost me over half a dollar (~2 min of audio, according to the advertised rate.)

My main question is:
Why do we have to pay for both assistant text and audio tokens by default? Does the model need both? (doesn't seem so)

Now, we technically have the ability to delete conversation items and create new ones, so there are definitely ways to deal with the cost. But, they're cumbersome and full of edge cases due to the realtime nature of it.

If you can't "just make it super cheap so we don't have to care", maybe there is some low hanging fruit?
Can we have some easier control over how much of the conversation we want to keep around?
Can we have a conversation.item.update event so that we don't have to recreate the entire conversation tree when we just want to remove the transcription?
Something else better I'm sure you've already cooked up?
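
Until something like the proposed conversation.item.update exists, the delete-and-recreate workaround mentioned above might look like the sketch below. It only builds the JSON client events and does not open a connection. It assumes the `conversation.item.delete` and `conversation.item.create` event shapes from the Realtime API beta reference. `replace_with_text`, the item IDs, and the transcript are hypothetical placeholders.

```python
import json


def replace_with_text(item_id, previous_item_id, role, transcript):
    """Build the two client events that swap an audio item for a text-only copy."""
    # Assumption: assistant text parts use "text", user text parts use "input_text".
    content_type = "text" if role == "assistant" else "input_text"
    delete_event = {"type": "conversation.item.delete", "item_id": item_id}
    create_event = {
        "type": "conversation.item.create",
        # Re-insert at the same position so the turn order is preserved.
        "previous_item_id": previous_item_id,
        "item": {
            "type": "message",
            "role": role,
            "content": [{"type": content_type, "text": transcript}],
        },
    }
    return [delete_event, create_event]


events = replace_with_text(
    item_id="item_002",            # hypothetical ID of the audio reply
    previous_item_id="item_001",   # hypothetical ID of the item before it
    role="assistant",
    transcript="I can't provide real-time weather updates.",
)
for event in events:
    print(json.dumps(event))       # each line would be sent over the WebSocket
```

The sketch ignores the edge cases mentioned above, such as a response that is still in flight while items are being swapped.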

Thank you!

Turn 1

User: (audio) What's the weather in New York?

Assistant: (audio) I don't have access to real-time data, including current weather updates. However, you can check the latest weather in New York by looking it up on a weather website or using a weather app on your phone.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 16 | 236 | 8 | 62 | $0.05 |
| Total | 16 | 236 | 8 | 62 | $0.05 |

Turn 2

User: (audio) What's the weather in San Francisco?

Assistant: (audio) I don't have access to real-time weather updates. For the current weather in San Francisco, please check a weather website or use a weather app on your phone.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 275 | 173 | 80 | 48 | $0.06 |
| Total | 291 | 409 | 88 | 110 | $0.11 |

Turn 3

User: (audio) What's the weather in London?

Assistant: (audio) I don't have access to real-time data, including current weather updates. To get the latest weather in London, please check a weather website or use a weather app on your phone.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 465 | 207 | 138 | 54 | $0.09 |
| Total | 756 | 616 | 226 | 164 | $0.20 |

Turn 4

User: (audio) What's the weather in Toronto?

Assistant: (audio) I can't provide current weather updates. For the latest weather in Toronto, please check a weather website or use a weather app on your phone.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 690 | 169 | 202 | 44 | $0.10 |
| Total | 1446 | 785 | 428 | 208 | $0.31 |

Turn 5

User: (audio) What's the weather in Istanbul?

Assistant: (audio) I don't have access to real-time information, so I can't provide the current weather in Istanbul. You can check the latest weather there by using a weather website or a weather app.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 877 | 213 | 256 | 54 | $0.13 |
| Total | 2323 | 998 | 684 | 262 | $0.44 |

Turn 6

User: (audio) What's the weather in New York?

Assistant: (audio) I can't provide real-time weather updates. To find out the current weather in New York, please check a weather website or use a weather app.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 1106 | 163 | 320 | 45 | $0.15 |
| Total | 3429 | 1161 | 1004 | 307 | $0.59 |

Turn 7

User: (audio) What's the weather in San Francisco?

Assistant: (audio) I can't provide real-time weather updates. To find out the current weather in San Francisco, please check a weather website or use a weather app.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 1292 | 172 | 375 | 45 | $0.17 |
| Total | 4721 | 1333 | 1379 | 352 | $0.75 |

Turn 8

User: (audio) What's the weather in London?

Assistant: (audio) I can't provide real-time weather updates. To find out the current weather in London, please check a weather website or use a weather app.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 1481 | 163 | 430 | 44 | $0.18 |
| Total | 6202 | 1496 | 1809 | 396 | $0.94 |

Turn 9

User: (audio) What's the weather in Toronto?

Assistant: (audio) I can't provide real-time weather updates. To find out the current weather in Toronto, please check a weather website or use a weather app.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 1662 | 166 | 484 | 44 | $0.20 |
| Total | 7864 | 1662 | 2293 | 440 | $1.14 |

Turn 10

User: (audio) What's the weather in Istanbul?

Assistant: (audio) I can't provide real-time weather updates. To find out the current weather in Istanbul, please check a weather website or use a weather app.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 1846 | 172 | 538 | 44 | $0.22 |
| Total | 9710 | 1834 | 2831 | 484 | $1.36 |

Turn 11

User: (audio) What's the weather in New York?

Assistant: (audio) I can't provide real-time weather updates. To find out the current weather in New York, please check a weather website or use a weather app.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 2034 | 164 | 592 | 45 | $0.24 |
| Total | 11744 | 1998 | 3423 | 529 | $1.60 |

Turn 12

User: (audio) What's the weather in San Francisco?

Assistant: (audio) For the most current weather in San Francisco, please check a reliable weather website or a weather app on your device. They will have the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 2221 | 189 | 647 | 48 | $0.26 |
| Total | 13965 | 2187 | 4070 | 577 | $1.87 |

Turn 13

User: (audio) What's the weather in London?

Assistant: (audio) To get the current weather in London, please check a reliable weather website or a weather app on your device for the most up-to-date information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 2427 | 177 | 705 | 45 | $0.28 |
| Total | 16392 | 2364 | 4775 | 622 | $2.15 |

Turn 14

User: (audio) What's the weather in Toronto?

Assistant: (audio) For the latest weather in Toronto, please check a weather website or a weather app on your phone for real-time updates. They will have the information you need.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 2622 | 195 | 760 | 50 | $0.31 |
| Total | 19014 | 2559 | 5535 | 672 | $2.45 |

Turn 15

User: (audio) What's the weather in Istanbul?

Assistant: (audio) For the current weather in Istanbul, please refer to a weather website or a weather app on your device. They will have the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 2835 | 168 | 820 | 44 | $0.32 |
| Total | 21849 | 2727 | 6355 | 716 | $2.78 |

Turn 16

User: (audio) What's the weather in New York?

Assistant: (audio) To get the current weather in New York, please check a reliable weather website or a weather app on your device. They will have up-to-date information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 3019 | 180 | 874 | 47 | $0.34 |
| Total | 24868 | 2907 | 7229 | 763 | $3.12 |

Turn 17

User: (audio) What's the weather in San Francisco?

Assistant: (audio) To get the current weather in San Francisco, please check a reliable weather website or a weather app on your device for the most up-to-date information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 3222 | 190 | 931 | 46 | $0.37 |
| Total | 28090 | 3097 | 8160 | 809 | $3.49 |

Turn 18

User: (audio) What's the weather in London?

Assistant: (audio) To find out the current weather in London, please check a reliable weather website or use a weather app on your device. They will have the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 3429 | 175 | 987 | 47 | $0.38 |
| Total | 31519 | 3272 | 9147 | 856 | $3.87 |

Turn 19

User: (audio) What's the weather in Toronto?

Assistant: (audio) For the latest weather in Toronto, please check a weather website or a weather app. They will have up-to-date information on the current conditions.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 3622 | 167 | 1044 | 45 | $0.40 |
| Total | 35141 | 3439 | 10191 | 901 | $4.27 |

Turn 20

User: (audio) What's the weather in Istanbul?

Assistant: (audio) To get the current weather in Istanbul, please check a reliable weather website or a weather app on your device for the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 3807 | 155 | 1099 | 42 | $0.42 |
| Total | 38948 | 3594 | 11290 | 943 | $4.69 |

Turn 21

User: (audio) What's the weather in New York?

Assistant: (audio) To get the current weather in New York, please check a reliable weather website or use a weather app on your device. They will have the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 3978 | 179 | 1151 | 47 | $0.44 |
| Total | 42926 | 3773 | 12441 | 990 | $5.13 |

Turn 22

User: (audio) What's the weather in San Francisco?

Assistant: (audio) To find out the current weather in San Francisco, please check a reliable weather website or a weather app on your device. They will have the most up-to-date information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 4180 | 188 | 1208 | 52 | $0.46 |
| Total | 47106 | 3961 | 13649 | 1042 | $5.59 |

Turn 23

User: (audio) What's the weather in London?

Assistant: (audio) To get the current weather in London, please check a reliable weather website or use a weather app on your device. They will have the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 4385 | 176 | 1270 | 48 | $0.48 |
| Total | 51491 | 4137 | 14919 | 1090 | $6.07 |

Turn 24

User: (audio) What's the weather in Toronto?

Assistant: (audio) To find out the current weather in Toronto, please check a reliable weather website or use a weather app on your device for the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 4579 | 164 | 1328 | 44 | $0.50 |
| Total | 56070 | 4301 | 16247 | 1134 | $6.57 |

Turn 25

User: (audio) What's the weather in Istanbul?

Assistant: (audio) To get the current weather in Istanbul, please check a reliable weather website or a weather app on your device for the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 4761 | 145 | 1382 | 42 | $0.51 |
| Total | 60831 | 4446 | 17629 | 1176 | $7.08 |

Turn 26

User: (audio) What's the weather in New York?

Assistant: (audio) To find out the current weather in New York, please check a reliable weather website or use a weather app on your device. They will provide the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 4922 | 178 | 1434 | 48 | $0.54 |
| Total | 65753 | 4624 | 19063 | 1224 | $7.62 |

Turn 27

User: (audio) What's the weather in San Francisco?

Assistant: (audio) For the latest weather in San Francisco, please check a weather website or a weather app on your device. They will have the most current information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 5123 | 180 | 1492 | 45 | $0.56 |
| Total | 70876 | 4804 | 20555 | 1269 | $8.18 |

Turn 28

User: (audio) What's the weather in London?

Assistant: (audio) To get the current weather in London, please check a reliable weather website or use a weather app on your device for the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 5320 | 142 | 1547 | 43 | $0.57 |
| Total | 76196 | 4946 | 22102 | 1312 | $8.75 |

Turn 29

User: (audio) What's the weather in Toronto?

Assistant: (audio) To find out the current weather in Toronto, please check a reliable weather website or use a weather app on your device. They will have the latest information.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 5480 | 176 | 1600 | 47 | $0.59 |
| Total | 81676 | 5122 | 23702 | 1359 | $9.34 |

Turn 30

User: (text) What was the second city I asked about?

Assistant: (audio) The second city you asked about was San Francisco.

| Tokens | In Audio | Out Audio | In Text | Out Text | Cost |
| --- | --- | --- | --- | --- | --- |
| Current | 5656 | 53 | 1662 | 20 | $0.58 |
| Total | 87332 | 5175 | 25364 | 1379 | $9.92 |

Total Cost: $9.92
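
For anyone who wants to reproduce the Cost column, the figures above are consistent with the sketch below. The per-million-token rates are an assumption (the Realtime API launch pricing from October 2024: $100 audio in, $200 audio out, $5 text in, $20 text out). They do not appear in this thread, and `cost_breakdown` is a hypothetical helper.

```python
# Assumed rates in USD per 1M tokens (Realtime API launch pricing, October 2024).
RATES = {"audio_in": 100.0, "audio_out": 200.0, "text_in": 5.0, "text_out": 20.0}


def cost_breakdown(audio_in, audio_out, text_in, text_out):
    """Return the USD cost per token category for one row of the tables."""
    counts = {"audio_in": audio_in, "audio_out": audio_out,
              "text_in": text_in, "text_out": text_out}
    return {name: counts[name] * RATES[name] / 1_000_000 for name in RATES}


turn_1 = cost_breakdown(16, 236, 8, 62)            # "Current" row of Turn 1
total = cost_breakdown(87332, 5175, 25364, 1379)   # "Total" row of Turn 30
print(round(sum(turn_1.values()), 2))   # 0.05
print(round(sum(total.values()), 2))    # 9.92
```

Under these assumed rates, input audio accounts for roughly $8.73 of the $9.92 total and input text for roughly $0.13.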

@kyleboddy

I also noted this and conversed with @dnakov over Twitter about it - realtime API costs are much higher than expected and his code saves on it significantly. His repo is here with the "optimizations" to reduce cost:

https://github.com/dnakov/realtime-api-cost

@FuyukiSakura

FuyukiSakura commented Oct 10, 2024

Bumping this. I did not record my findings.

However, I also noticed that the API cost is significantly higher than expected. I think this is not intended, but rather a bug in the realtime console or the SDK.

A 5-minute chat in the Playground costs only around $1, but the same chat with the realtime console code costs $2-4. I am using the same prompt in both versions.

For context, I am also using the relay.

@dnakov
Author

dnakov commented Oct 10, 2024

It's the same, playground or not.

@spectaclecs

Can I ask how you got the token costs from this API? I wanted to do this, but I could not find a server event that contains the cost.

@dnakov
Author

dnakov commented Oct 10, 2024

@spectaclecs take a look at PR #37
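
The contents of PR #37 are not shown in this thread. As a rough sketch of one way to get the counts, the snippet below assumes that each `response.done` server event carries a `response.usage` object with `input_token_details` and `output_token_details`, as in the Realtime API beta reference. The sample event is hand-built from the Turn 1 numbers in this issue, not captured output, and `usage_from_event` is a hypothetical helper.

```python
def usage_from_event(event):
    """Extract audio/text token counts from a response.done event, else None."""
    if event.get("type") != "response.done":
        return None
    usage = event.get("response", {}).get("usage") or {}
    details_in = usage.get("input_token_details", {})
    details_out = usage.get("output_token_details", {})
    return {
        "audio_in": details_in.get("audio_tokens", 0),
        "text_in": details_in.get("text_tokens", 0),
        "audio_out": details_out.get("audio_tokens", 0),
        "text_out": details_out.get("text_tokens", 0),
    }


# Hand-built sample using the Turn 1 figures from this issue.
sample = {
    "type": "response.done",
    "response": {
        "usage": {
            "input_token_details": {"audio_tokens": 16, "text_tokens": 8},
            "output_token_details": {"audio_tokens": 236, "text_tokens": 62},
        }
    },
}
print(usage_from_event(sample))
```

Under this assumption the event reports token counts only. A dollar figure would have to be computed client-side by multiplying the counts by the published rates.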

@mhamdan91

mhamdan91 commented Oct 12, 2024

#37 needs to be merged; it cost me ~$10 for a 5-minute conversation lol. Gotta stick to concurrent processing for now...

@Xoni30

Xoni30 commented Oct 13, 2024

Bumping this up!

@Stevenic

The increasing cost is just the conversation history building up in the thread. OpenAI doesn't know what your scenario is, so they don't know how much conversation history to track. You need at least 2-3 turns for features like co-referencing to work. Without them, questions like "how much is that in euros?" won't work because the model won't know what "that" refers to. Depending on your scenario, though, you might need 20 turns of conversation history. Let's say you're playing a game of 20 questions... The point is that OpenAI doesn't know your scenario, so they track as many turns as they can fit into the context window, which can be expensive.

A feature that lets you configure the number of turns to keep could be handy but you've already shared the code above for how to trim the conversation history yourself.
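
A client-side version of that "number of turns to keep" setting could be sketched as follows. It assumes the client records item IDs in order as they are created, that one turn is a user item plus an assistant item, and that the `conversation.item.delete` client event takes an `item_id`. `prune_events` and the IDs are hypothetical.

```python
def prune_events(item_ids, keep_turns, items_per_turn=2):
    """Build delete events for every item older than the last `keep_turns` turns."""
    keep = keep_turns * items_per_turn
    stale = item_ids[:-keep] if keep > 0 else list(item_ids)
    return [{"type": "conversation.item.delete", "item_id": i} for i in stale]


# Hypothetical IDs for four turns (user item, then assistant item).
ids = [f"item_{n:03d}" for n in range(1, 9)]
for event in prune_events(ids, keep_turns=3):
    print(event)   # each event would be sent to the server before the next response
```

Keeping three of the four turns here yields delete events for the first two items only. How many turns are safe to drop depends on the scenario, as noted above.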

@kyleboddy

I don't think that is correct @Stevenic - look at the video demo @dnakov has posted to Twitter showing that it can remember the history even with the optimized/cutdown version:

https://x.com/dnak0v/status/1842685544423182631
