| Modifier and Type | Method | Description |
| --- | --- | --- |
| static LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.builder() | Create a new builder. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.copy(LlamaLlmInferenceRequest model) |  |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.frequencyPenalty(Double frequencyPenalty) | To reduce repetitiveness of generated tokens, this number penalizes new tokens based on their frequency in the generated text so far. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.isEcho(Boolean isEcho) | Whether to return the user prompt in the response. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.isStream(Boolean isStream) | Whether to stream back partial progress. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.logProbs(Integer logProbs) | Includes the logarithmic probabilities for the most likely output tokens and the chosen tokens. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.maxTokens(Integer maxTokens) | The maximum number of tokens that can be generated per output sequence. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.numGenerations(Integer numGenerations) | The number of generated texts that will be returned. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.presencePenalty(Double presencePenalty) | To reduce repetitiveness of generated tokens, this number penalizes new tokens based on whether they've appeared in the generated text so far. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.prompt(String prompt) | Represents the prompt to be completed. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.stop(List<String> stop) | List of strings that stop generation if they are produced in the response text. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.temperature(Double temperature) | A number that sets the randomness of the generated output. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.toBuilder() |  |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.topK(Integer topK) | An integer that restricts the model to the top k most likely tokens when generating output. |
| LlamaLlmInferenceRequest.Builder | LlamaLlmInferenceRequest.Builder.topP(Double topP) | If set to a probability 0.0 < p < 1.0, ensures that only the most likely tokens, with total probability mass of p, are considered for generation at each step. |
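
A minimal sketch of assembling a request with the builder methods summarized above. The package name (`com.oracle.bmc.generativeaiinference.model`) and the terminal `build()` call are assumptions based on typical OCI Java SDK conventions and do not appear in this summary; the sampling values shown are placeholders, not recommendations.

```java
import java.util.Arrays;

// Assumed package path; only the class and builder methods are documented above.
import com.oracle.bmc.generativeaiinference.model.LlamaLlmInferenceRequest;

public class LlamaRequestExample {
    public static void main(String[] args) {
        LlamaLlmInferenceRequest request = LlamaLlmInferenceRequest.builder()
                .prompt("Write a haiku about distributed systems.") // text to be completed
                .maxTokens(256)              // cap on tokens per output sequence
                .temperature(0.7)            // randomness of the generated output
                .topK(40)                    // sample only from the 40 most likely tokens
                .topP(0.9)                   // nucleus sampling: 0.0 < p < 1.0
                .frequencyPenalty(0.2)       // penalize tokens by their frequency so far
                .presencePenalty(0.1)        // penalize tokens that have already appeared
                .numGenerations(1)           // number of generated texts to return
                .isEcho(false)               // do not return the prompt in the response
                .isStream(false)             // full result rather than partial progress
                .stop(Arrays.asList("\n\n")) // strings that halt generation when produced
                .build();                    // assumed terminal call on the builder

        // toBuilder() yields a builder pre-populated from an existing request,
        // convenient for issuing a tweaked variant of the same call.
        LlamaLlmInferenceRequest hotter = request.toBuilder()
                .temperature(1.0)
                .build();
    }
}
```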