Connect to any OpenAI API compatible LLM
Nuclia allows you to connect to any LLM that exposes an OpenAI-compatible API, which has become a de facto standard in the industry.
Many self-hosted LLMs, open-source LLMs hosted by cloud providers, and commercial LLMs are compatible with the OpenAI API. This means you can use them with Nuclia without any modifications.
Configuration
You can modify your Knowledge Box configuration in three ways: through the API, through the Nuclia CLI / SDK, or through the Nuclia dashboard.
The Nuclia dashboard offers the most user-friendly way to modify the configuration of your Knowledge Box, and it is the one we will use in this example.
We will set up a connection to the Phi 4 Reasoning Plus model hosted by OpenRouter, which offers a wide range of open-source and commercial models behind an OpenAI-compatible API. More information about this specific model is available on its OpenRouter page; the API parameters are located under the API tab.
1. Open the AI Models page
   In the left sidebar under Advanced, click AI Models.
2. Select “OpenAI API Compatible Model”
   From the models list, choose OpenAI API Compatible Model.
3. Enable your custom key
   Toggle the option for using your own OpenAI API Compatible Key if it is not already enabled.
4. Fill in the configuration parameters
   - API Key:
     - Description: The API key for your LLM, the same key you would send as an authorization header when calling the API directly. You may leave this blank if the endpoint you are connecting to does not require an API key.
     - Example: We will set this to our OpenRouter API key.
   - API URL:
     - Description: The URL of the API endpoint for your LLM. It may be shared between multiple models.
     - Example: For OpenRouter, it is the same for all models: https://openrouter.ai/api/v1
   - Model:
     - Description: The name of the model you want to use. It must exactly match the name of the model in the API.
     - Example: For Phi 4 Reasoning Plus in the OpenRouter API, it is microsoft/phi-4-reasoning-plus:free.
   - Maximum supported input tokens:
     - Description: The maximum number of tokens the model can accept as input. Keep in mind that this covers the tokens used by the prompt, the query, and the context. Also note that some models report their context window as the total of input and output tokens, while others report input tokens only.
     - Example: For Phi 4 Reasoning Plus, the total context size is 32768 tokens. Since we want to leave room for the output, we will set the maximum supported input tokens to 32768 - 1024 = 31744.
   - Maximum supported output tokens:
     - Description: The maximum number of tokens the model can generate as output. Again, this value plus the maximum supported input tokens should not exceed the total context size supported by the model.
     - Example: For Phi 4 Reasoning Plus, the maximum output tokens is specified as 32768, but we already reserved 31744 for the input, so we will set this to 32768 - 31744 = 1024.
   - Model Features:
     - Description: This section contains toggles for features supported by the model. They vary from model to model, but the default values suit most use cases. The most relevant toggle is Image Support, which allows you to use images as input for the model.
     - Example: Image input is not supported by Phi 4 Reasoning Plus, so we will leave it disabled.
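The token-budget arithmetic above can be sketched as a small helper. This is an illustration only; the function name is ours, not part of Nuclia:

```python
def split_context(total_context: int, reserved_output: int) -> tuple[int, int]:
    """Split a model's total context window into input and output budgets.

    Useful when a model reports its context window as the total of input
    and output tokens, as Phi 4 Reasoning Plus does.
    """
    if reserved_output >= total_context:
        raise ValueError("reserved output must be smaller than the context window")
    max_input = total_context - reserved_output
    return max_input, reserved_output

# Phi 4 Reasoning Plus: 32768-token context window, 1024 tokens reserved for output.
max_input, max_output = split_context(32768, 1024)
print(max_input, max_output)  # 31744 1024
```

The resulting values are what you would enter as Maximum supported input tokens and Maximum supported output tokens.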
5. Save
   Click Save changes.
6. Test your model
   Run a sample query in Nuclia or via the API/CLI. Adjust your prompt templates and token settings as needed.
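You can also sanity-check the API key and model name directly against the OpenAI-compatible endpoint before (or after) wiring it into Nuclia. A minimal sketch, assuming the standard /chat/completions route and an OPENROUTER_API_KEY environment variable; the helper only builds the request, which you can then send with any HTTP client:

```python
import json
import os

API_URL = "https://openrouter.ai/api/v1"
MODEL = "microsoft/phi-4-reasoning-plus:free"

def build_chat_request(prompt: str, max_tokens: int = 1024):
    """Build an OpenAI-compatible chat completion request: (url, headers, payload)."""
    url = f"{API_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return url, headers, payload

url, headers, payload = build_chat_request("Reply with the single word: pong")
print(url)
print(json.dumps(payload, indent=2))
# Send with e.g. requests.post(url, headers=headers, json=payload) and check
# that the response contains choices[0].message.content.
```

If the direct call succeeds but queries through Nuclia fail, the problem is most likely in the configuration values entered above rather than in the endpoint itself.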