
Impact of 'num_ctx' and 'num_predict' parameters on answer quality. #722

Open
Snikch63200 opened this issue Nov 26, 2024 · 1 comment
Labels
question Further information is requested

Comments

@Snikch63200

Hello,

I currently use paperQA with Llama3.1:70b served by Ollama.

With the default LLM parameters, answer quality is often poor, especially when I increase answer.evidence_k and answer.answer_max_sources. With the default answer settings, the answer is just 'correct'. Theoretically, increasing these parameters should improve answer relevance instead...

I suspect it's a problem with the LLM parameters, but I'm not sure...

So I wonder what the exact impact of the 'num_ctx' and 'num_predict' parameters is on answer quality. Here is my current LLM configuration:

local_llm_config = dict(
    model_list=[
        dict(
            model_name="ollama/llama3.1:70b",
            litellm_params=dict(
                model="ollama/llama3.1:70b",
                api_type="ollama",
                api_base="http://192.168.1.15:11434",
                num_ctx=2048,     # Context window size (default: 2048)
                num_predict=128,  # Maximum number of tokens to predict when generating text (default: 128, -1 = infinite generation, -2 = fill context)
            ),
        ),
    ]
)

Best regards.

@dosubot dosubot bot added the question Further information is requested label Nov 26, 2024

dosubot bot commented Nov 26, 2024

I couldn't find specific information about the impact of the 'num_ctx' and 'num_predict' parameters on answer quality in the context of using Llama3.1 with paperQA. However, generally speaking:

  • The num_ctx parameter typically defines the context window size, which is the number of tokens the model can consider at once. A larger context window allows the model to take more information into account, potentially improving the relevance and coherence of the generated answers.

  • The num_predict parameter controls the maximum number of tokens the model will generate in response. A higher value might lead to more detailed answers, but it could also increase the risk of generating less relevant or more verbose responses.

Adjusting these parameters can help balance between the depth of context considered and the length of the generated text, which might influence the quality of the answers. Experimenting with these settings while monitoring the quality of the output could help you find an optimal configuration for your use case.
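As a rough, unverified sketch, the configuration posted in the question could be adjusted along these lines. The specific numbers (num_ctx=8192, num_predict=1024) are illustrative assumptions, not tested recommendations for Llama3.1:70b:

    # Sketch only: same structure as the config above, with a larger context
    # window and a higher generation limit. The values below are assumptions
    # chosen for illustration, not tuned settings.
    local_llm_config = dict(
        model_list=[
            dict(
                model_name="ollama/llama3.1:70b",
                litellm_params=dict(
                    model="ollama/llama3.1:70b",
                    api_type="ollama",
                    api_base="http://192.168.1.15:11434",
                    num_ctx=8192,      # larger context window so more evidence chunks fit into the prompt
                    num_predict=1024,  # allow longer answers; -1 = infinite generation, -2 = fill context
                ),
            ),
        ]
    )

A larger num_ctx only helps if the retrieved evidence actually fits in the prompt, so it is worth checking that the amount of context produced by answer.evidence_k does not exceed the window you configure.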

To continue talking to Dosu, mention @dosu.
