Overview
This module adds support for selected Ollama models.
Maven Coordinates
In addition to the core dependencies of the Helidon integration with LangChain4j, add the following dependency:

    <dependency>
        <groupId>io.helidon.integrations.langchain4j.providers</groupId>
        <artifactId>helidon-integrations-langchain4j-providers-ollama</artifactId>
    </dependency>

Components
OllamaChatModel
To automatically create an OllamaChatModel and add it to the service registry, add the following lines to application.yaml:

    langchain4j:
      providers:
        ollama:
          base-url: "http://localhost:11434"
      models:
        ollama-chat-model:
          provider: ollama
          model-name: "llama3.1"

If enabled is set to false, the configuration is ignored, and the component is not created.
Full list of configuration properties:
| Key | Type | Description |
|---|---|---|
| base-url | string | The base URL for the Ollama API. If not present, the default value supplied by LangChain4j is used. |
| enabled | boolean | If set to false, the component is not available even if configured. |
| format | string | The structure or style of the text produced by the model, such as plain text, JSON, or a custom format. |
| log-requests | boolean | Whether to log API requests. |
| log-responses | boolean | Whether to log API responses. |
| max-retries | int | The maximum number of retries for failed API requests. |
| model-name | string | The model name to use. |
| num-predict | int | The maximum number of tokens the model generates in its output. |
| repeat-penalty | double | The penalty applied to repeated tokens during text generation. Higher values discourage the model from generating the same token multiple times, promoting more varied and natural output. |
| seed | int | The seed for the random number generator used by the model. |
| stop | string[] | List of sequences at which the API stops generating further tokens. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| timeout | duration | The timeout for API requests, in the ISO-8601 duration format accepted by java.time.Duration.parse (for example, PT30S). |
| top-k | int | Limits the token pool to the k highest-probability tokens. |
| top-p | double | Nucleus sampling threshold: the model considers only the smallest set of most probable tokens whose cumulative probability reaches top-p. |
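The top-k and top-p rows describe two ways of narrowing the candidate tokens before one is sampled. The following is a minimal, self-contained sketch of the textbook filters, not the sampler that Ollama or LangChain4j actually use. The class name SamplingFilterSketch, its methods, and the toy probabilities are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration of textbook top-k / top-p filtering.
// This is NOT the sampler used by Ollama or LangChain4j.
final class SamplingFilterSketch {

    // Returns the k most probable tokens, most probable first.
    static List<String> topK(Map<String, Double> probs, int k) {
        return probs.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // Returns the smallest prefix of the ranked tokens whose cumulative
    // probability reaches p (nucleus sampling).
    static List<String> topP(Map<String, Double> probs, double p) {
        List<String> kept = new ArrayList<>();
        double mass = 0.0;
        for (String token : topK(probs, probs.size())) {
            kept.add(token);
            mass += probs.get(token);
            if (mass >= p) {
                break;
            }
        }
        return kept;
    }

    // Toy next-token distribution (made-up numbers that sum to 1).
    static Map<String, Double> toyDistribution() {
        Map<String, Double> probs = new LinkedHashMap<>();
        probs.put("cat", 0.5);
        probs.put("dog", 0.3);
        probs.put("bird", 0.15);
        probs.put("fish", 0.05);
        return probs;
    }

    public static void main(String[] args) {
        Map<String, Double> probs = toyDistribution();
        System.out.println(topK(probs, 2));    // the two most probable tokens
        System.out.println(topP(probs, 0.75)); // tokens covering at least 75% of the mass
    }
}
```

In this sketch a small top-k or a low top-p both shrink the candidate pool, which is why lowering either value tends to make output more predictable.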
OllamaEmbeddingModel
To automatically create an OllamaEmbeddingModel and add it to the service registry, add the following lines to application.yaml:

    langchain4j:
      providers:
        ollama:
          base-url: "http://localhost:11434"
      models:
        ollama-embedding-model:
          provider: ollama
          model-name: "nomic-embed-text"

If enabled is set to false, the configuration is ignored, and the component is not created.
Full list of configuration properties:
| Key | Type | Description |
|---|---|---|
| base-url | string | The base URL for the Ollama API. If not present, the default value supplied by LangChain4j is used. |
| enabled | boolean | If set to false, the component is not available even if configured. |
| log-requests | boolean | Whether to log API requests. |
| log-responses | boolean | Whether to log API responses. |
| max-retries | int | The maximum number of retries for failed API requests. |
| model-name | string | The model name to use. |
| timeout | duration | The timeout for API requests, in the ISO-8601 duration format accepted by java.time.Duration.parse (for example, PT30S). |
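A timeout value can be sanity-checked before it goes into application.yaml. The sketch below assumes the duration type is an ISO-8601 string parsed with java.time.Duration.parse; that is an assumption about how the value is read, and the class TimeoutFormatSketch and its parseTimeout helper are hypothetical names, not part of Helidon or LangChain4j.

```java
import java.time.Duration;

// Hypothetical helper: checks that a timeout string is a valid ISO-8601 duration.
// Assumes the configuration value is read with Duration.parse.
final class TimeoutFormatSketch {

    // Parses strings such as "PT30S" (30 seconds) or "PT1M30S" (1 minute 30 seconds).
    // Throws java.time.format.DateTimeParseException for anything else, e.g. "30s".
    static Duration parseTimeout(String value) {
        return Duration.parse(value);
    }

    public static void main(String[] args) {
        System.out.println(parseTimeout("PT30S").getSeconds());   // prints 30
        System.out.println(parseTimeout("PT1M30S").getSeconds()); // prints 90
    }
}
```

Under that assumption, a shorthand such as 30s is rejected, so the value in the configuration file should be written as PT30S.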
OllamaLanguageModel
To automatically create an OllamaLanguageModel and add it to the service registry, add the following lines to application.yaml:

    langchain4j:
      providers:
        ollama:
          base-url: "http://localhost:11434"
      models:
        ollama-language-model:
          provider: ollama
          model-name: "llama3.1"

If enabled is set to false, the configuration is ignored, and the component is not created.
Full list of configuration properties:
| Key | Type | Description |
|---|---|---|
| base-url | string | The base URL for the Ollama API. If not present, the default value supplied by LangChain4j is used. |
| enabled | boolean | If set to false, the component is not available even if configured. |
| format | string | The structure or style of the text produced by the model, such as plain text, JSON, or a custom format. |
| log-requests | boolean | Whether to log API requests. |
| log-responses | boolean | Whether to log API responses. |
| max-retries | int | The maximum number of retries for failed API requests. |
| model-name | string | The model name to use. |
| num-predict | int | The maximum number of tokens the model generates in its output. |
| repeat-penalty | double | The penalty applied to repeated tokens during text generation. Higher values discourage the model from generating the same token multiple times, promoting more varied and natural output. |
| seed | int | The seed for the random number generator used by the model. |
| stop | string[] | List of sequences at which the API stops generating further tokens. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| timeout | duration | The timeout for API requests, in the ISO-8601 duration format accepted by java.time.Duration.parse (for example, PT30S). |
| top-k | int | Limits the token pool to the k highest-probability tokens. |
| top-p | double | Nucleus sampling threshold: the model considers only the smallest set of most probable tokens whose cumulative probability reaches top-p. |
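The temperature and repeat-penalty rows both describe adjustments to the model's raw token scores (logits). As a rough sketch of the idea, and not a description of what Ollama does internally, the code below applies a commonly used repeat-penalty rule (divide positive logits, multiply negative ones) and a temperature-scaled softmax. The class LogitAdjustSketch, its methods, and the sample numbers are hypothetical; the penalty rule itself is an assumption chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of repeat penalty and temperature.
// The penalty rule is a common formulation, assumed here; it is not taken from Ollama.
final class LogitAdjustSketch {

    // Makes already-generated tokens less likely: positive logits are divided
    // by the penalty, negative logits are multiplied by it.
    static double[] applyRepeatPenalty(double[] logits, Set<Integer> seenTokenIds, double penalty) {
        double[] out = logits.clone();
        for (int id : seenTokenIds) {
            out[id] = out[id] > 0 ? out[id] / penalty : out[id] * penalty;
        }
        return out;
    }

    // Converts logits to probabilities. temperature must be greater than 0;
    // lower values sharpen the distribution, higher values flatten it.
    static double[] softmax(double[] logits, double temperature) {
        double max = Arrays.stream(logits).max().getAsDouble(); // subtract max for numerical stability
        double[] out = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            out[i] = Math.exp((logits[i] - max) / temperature);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, -1.0, 0.5};
        Set<Integer> seen = new HashSet<>(Arrays.asList(0, 1)); // tokens 0 and 1 were already generated
        System.out.println(Arrays.toString(applyRepeatPenalty(logits, seen, 2.0)));
        System.out.println(Arrays.toString(softmax(logits, 0.5)));
        System.out.println(Arrays.toString(softmax(logits, 2.0)));
    }
}
```

With this rule a penalty of 1.0 leaves the logits unchanged, and a lower temperature concentrates more probability on the highest-scoring token.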
OllamaStreamingChatModel
To automatically create an OllamaStreamingChatModel and add it to the service registry, add the following lines to application.yaml:

    langchain4j:
      providers:
        ollama:
          base-url: "http://localhost:11434"
      models:
        ollama-streaming-chat-model:
          provider: ollama
          model-name: "llama3.1"

If enabled is set to false, the configuration is ignored, and the component is not created.
Full list of configuration properties:
| Key | Type | Description |
|---|---|---|
| base-url | string | The base URL for the Ollama API. If not present, the default value supplied by LangChain4j is used. |
| enabled | boolean | If set to false, the component is not available even if configured. |
| format | string | The structure or style of the text produced by the model, such as plain text, JSON, or a custom format. |
| log-requests | boolean | Whether to log API requests. |
| log-responses | boolean | Whether to log API responses. |
| max-retries | int | The maximum number of retries for failed API requests. |
| model-name | string | The model name to use. |
| num-predict | int | The maximum number of tokens the model generates in its output. |
| repeat-penalty | double | The penalty applied to repeated tokens during text generation. Higher values discourage the model from generating the same token multiple times, promoting more varied and natural output. |
| seed | int | The seed for the random number generator used by the model. |
| stop | string[] | List of sequences at which the API stops generating further tokens. |
| temperature | double | Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic. |
| timeout | duration | The timeout for API requests, in the ISO-8601 duration format accepted by java.time.Duration.parse (for example, PT30S). |
| top-k | int | Limits the token pool to the k highest-probability tokens. |
| top-p | double | Nucleus sampling threshold: the model considers only the smallest set of most probable tokens whose cumulative probability reaches top-p. |
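The effect of the stop property can be pictured as cutting the generated text at the first place any configured sequence appears. The sketch below shows that idea on a plain string; it is only an illustration under that assumption, since the real cut-off happens on the Ollama side during generation. The class StopSequenceSketch and its truncateAtStop method are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of stop sequences applied to already-generated text.
// In practice generation is halted by the model server, not by client code like this.
final class StopSequenceSketch {

    // Returns the text up to (not including) the earliest occurrence of any stop sequence.
    // If no stop sequence occurs, the text is returned unchanged.
    static String truncateAtStop(String text, List<String> stops) {
        int cut = text.length();
        for (String stop : stops) {
            int idx = text.indexOf(stop);
            if (idx >= 0 && idx < cut) {
                cut = idx;
            }
        }
        return text.substring(0, cut);
    }

    public static void main(String[] args) {
        List<String> stops = Arrays.asList("User:", "\n");
        System.out.println(truncateAtStop("Hello\nUser: hi", stops)); // prints Hello
        System.out.println(truncateAtStop("No stop here", stops));    // prints No stop here
    }
}
```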