Rate Limits and File Uploads in GitHub Models #149698
Unanswered
solitude-alive
asked this question in
Models
Replies: 1 comment 1 reply
-
GitHub imposes token limits per request (e.g., 8000 tokens) to manage server load, even though models like GPT-4o mini can handle larger contexts (131k tokens). To include a file in an API request, encode it in Base64 and include it in the request body. Ensure the file size complies with GitHub's limits. |
Beta Was this translation helpful? Give feedback.
1 reply
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Select Topic Area
Question
Body
Hi,
I'm currently using GitHub Models, but I have some confusion regarding rate limits. Taking
4o-mini
as an example, according to GitHub's rate limits documentation, it states that the tokens per request are 8000 in and 4000 out. However,4o-mini
clearly has a larger context length, as seen on the GitHub Marketplace page for 4o-mini, where the context is listed as 131k input and 4k output.I would like to understand why there is such a discrepancy, and whether this means I cannot request a query longer than 8k tokens.
Additionally, I would like to ask how to include a file in an API request.
Thank you for your help!
Beta Was this translation helpful? Give feedback.
All reactions