Jlama

Overview

This module adds support for selected Jlama models.

Maven Coordinates

In addition to the Helidon integration with LangChain4J core dependencies, you must add the following:

<dependency>
    <groupId>io.helidon.integrations.langchain4j.providers</groupId>
    <artifactId>helidon-integrations-langchain4j-providers-jlama</artifactId>
</dependency>

Copied

Components

JlamaChatModel

To automatically create and add JlamaChatModel to the service registry add the following lines to application.yaml:

langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-chat-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"

Copied

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

Key	Type	Description
`enabled`	boolean	If set to false, the component will not be available even if configured.
`model-name`	string	The model name to use.
`temperature`	double	Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic.
`working-quantized-type`	enum	Quantize the model at runtime. Default quantization is Q4.
`model-cache-path`	Path	Path to a directory where the model will be cached once downloaded.
`working-directory`	Path	Path to a directory where persistent ChatMemory can be stored on disk for a given model instance.
`auth-token`	string	Token to use when fetching private models from Hugging Face
`max-tokens`	integer	Maximum number of tokens to generate.
`thread-count`	integer	Number of threads to use.
`quantize-model-at-runtime`	boolean	Whether quantize the model at runtime.

JlamaEmbeddingModel

To automatically create and add JlamaEmbeddingModel to the service registry add the following lines to application.yaml:

langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-embedding-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"

Copied

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

Key	Type	Description
`enabled`	boolean	If set to false, the component will not be available even if configured.
`model-name`	string	The model name to use.
`model-cache-path`	Path	Path to a directory where the model will be cached once downloaded.
`working-directory`	Path	Path to a directory where persistent ChatMemory can be stored on disk for a given model instance.
`auth-token`	string	Token to use when fetching private models from Hugging Face
`thread-count`	integer	Number of threads to use.
`pooling-type`	enum	Method of embedding pooling.

JlamaLanguageModel

To automatically create and add JlamaLanguageModel to the service registry add the following lines to application.yaml:

langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-language-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"

Copied

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

Key	Type	Description
`enabled`	boolean	If set to false, the component will not be available even if configured.
`model-name`	string	The model name to use.
`temperature`	double	Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic.
`working-quantized-type`	enum	Quantize the model at runtime. Default quantization is Q4.
`model-cache-path`	Path	Path to a directory where the model will be cached once downloaded.
`working-directory`	Path	Path to a directory where persistent ChatMemory can be stored on disk for a given model instance.
`auth-token`	string	Token to use when fetching private models from Hugging Face
`max-tokens`	integer	Maximum number of tokens to generate.
`thread-count`	integer	Number of threads to use.
`quantize-model-at-runtime`	boolean	Whether quantize the model at runtime.

JlamaStreamingChatModel

To automatically create and add JlamaStreamingChatModel to the service registry add the following lines to application.yaml:

langchain4j:
  providers:
    jlama:
      temperature: 1.2

  models:
    jlama-streaming-chat-model:
      provider: jlama
      model-name: "tjake/Qwen2.5-0.5B-Instruct-JQ4"

Copied

If enabled is set to false, the configuration is ignored, and the component is not created.

Full list of configuration properties:

Key	Type	Description
`enabled`	boolean	If set to false, the component will not be available even if configured.
`model-name`	string	The model name to use.
`temperature`	double	Sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic.
`working-quantized-type`	enum	Quantize the model at runtime. Default quantization is Q4.
`model-cache-path`	Path	Path to a directory where the model will be cached once downloaded.
`working-directory`	Path	Path to a directory where persistent ChatMemory can be stored on disk for a given model instance.
`auth-token`	string	Token to use when fetching private models from Hugging Face
`max-tokens`	integer	Maximum number of tokens to generate.
`thread-count`	integer	Number of threads to use.
`quantize-model-at-runtime`	boolean	Whether quantize the model at runtime.

Overview

Maven Coordinates

Components

JlamaChatModel

JlamaEmbeddingModel

JlamaLanguageModel

JlamaStreamingChatModel

Additional Information