validate_provider_credential in your provider configuration file. At runtime, Dify calls the corresponding model layer’s validate_credentials method based on the model type and model name the user selects.
Integrate a Custom Model Plugin
Integrating a custom model takes four steps:
- Create a model provider file: Identify the model types your custom model will include.
- Create code files by model type: Create separate code files for each model type (e.g., llm or text_embedding). Keeping each model type in its own logical layer simplifies maintenance and future expansion.
- Develop the model invocation logic: Within each model-type module, create a Python file named for that model type (for example, llm.py). Define a class in the file that implements the model logic, conforming to the system’s model interface specifications.
- Debug the plugin: Write unit and integration tests for the new provider functionality, ensuring that all components work as intended.
1. Create the Model Provider File
In your plugin’s /provider directory, create an xinference.yaml file.
The Xinference family of models supports LLM, Text Embedding, and Rerank model types, so your xinference.yaml must include all three.
Example:
provider_credential_schema. Since Xinference supports text-generation, embeddings, and reranking models, you can configure it as follows:
model_name:
server_url) and model UID:
2. Develop the Model Code
Xinference supports llm, rerank, speech2text, and tts, so create a corresponding directory under /models for each type, each containing its feature code.
Below is an example for an llm type model. Create a file named llm.py, then define a class such as XinferenceAILargeLanguageModel that extends __base.large_language_model.LargeLanguageModel. The class must implement the following methods.
LLM Invocation
The core method for invoking the LLM, supporting both streaming and synchronous responses.
Python treats any function containing yield as a generator that returns Generator, so splitting the streaming and synchronous paths into separate functions keeps the return types clean.
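The split between streaming and synchronous handling can be sketched as follows. This is a minimal, self-contained illustration, not the SDK's actual interface: `invoke`, `_handle_stream`, and `_handle_sync` are hypothetical names, and plain strings stand in for the prompt and result types a real llm.py would use.

```python
from typing import Generator, List, Union


def _handle_stream(chunks: List[str]) -> Generator[str, None, None]:
    """Streaming path: contains `yield`, so calling it returns a generator."""
    for chunk in chunks:
        yield chunk


def _handle_sync(chunks: List[str]) -> str:
    """Synchronous path: no `yield`, so it returns a plain value."""
    return "".join(chunks)


def invoke(chunks: List[str], stream: bool = True) -> Union[str, Generator[str, None, None]]:
    """Dispatch to one of the two helpers; `invoke` itself never yields."""
    if stream:
        return _handle_stream(chunks)
    return _handle_sync(chunks)
```

With `stream=False` the caller receives the joined string directly. With `stream=True` it receives a generator to iterate over. If `invoke` contained a `yield` itself, even the non-streaming branch would return a generator.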
Pre-calculate Input Tokens
If your model doesn’t provide a token-counting interface, return 0.
Alternatively, call self._get_num_tokens_by_gpt2(text: str) from the AIModel base class, which uses a GPT-2 tokenizer. Remember this is an approximation and may not match your model exactly.
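The two token-counting options might look roughly like this. `SketchBase` is a hypothetical stand-in for the AIModel base class, and its `_get_num_tokens_by_gpt2` simply splits on whitespace so the example runs without dependencies; the real helper uses a GPT-2 tokenizer. The simplified `get_num_tokens` signature is also an assumption.

```python
from typing import List


class SketchBase:
    """Hypothetical stand-in for the AIModel base class."""

    def _get_num_tokens_by_gpt2(self, text: str) -> int:
        # Placeholder only: whitespace splitting is NOT GPT-2 tokenization.
        return len(text.split())


class NoCountingModel(SketchBase):
    def get_num_tokens(self, messages: List[str]) -> int:
        # The backend exposes no token-counting interface, so report 0.
        return 0


class ApproximateCountingModel(SketchBase):
    def get_num_tokens(self, messages: List[str]) -> int:
        # Fall back to the base-class approximation over the joined text.
        return self._get_num_tokens_by_gpt2(" ".join(messages))
```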
Validate Model Credentials
Similar to provider-level credential checks, but scoped to a single model.
Dynamic Model Parameters Schema
Unlike predefined models, no YAML file defines which parameters a model supports, so you must generate the parameter schema dynamically. For example, Xinference supports max_tokens, temperature, and top_p. Other providers (e.g., OpenLLM) may support parameters like top_k only for certain models, so the schema must adapt to each model’s capabilities.
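As a rough sketch of building the schema at runtime, the function below assembles a rule list from a capability flag. `ParameterRuleSketch` and `build_parameter_rules` are hypothetical names standing in for the SDK's own entities, and the numeric bounds are illustrative assumptions, not documented limits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParameterRuleSketch:
    """Hypothetical stand-in for the SDK's parameter rule entity."""
    name: str
    type: str
    min: Optional[float] = None
    max: Optional[float] = None


def build_parameter_rules(supports_top_k: bool) -> List[ParameterRuleSketch]:
    # Parameters the text names as supported by Xinference.
    # Bounds below are assumed for illustration only.
    rules = [
        ParameterRuleSketch("max_tokens", "int", min=1),
        ParameterRuleSketch("temperature", "float", min=0.0, max=2.0),
        ParameterRuleSketch("top_p", "float", min=0.0, max=1.0),
    ]
    # Add top_k only for models that actually support it.
    if supports_top_k:
        rules.append(ParameterRuleSketch("top_k", "int", min=1))
    return rules
```

Deriving the list from per-model capabilities keeps unsupported parameters out of the user-facing configuration.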
Error Mapping
When an error occurs during model invocation, map it to one of the runtime’s InvokeError types so Dify can handle different errors consistently:
- InvokeConnectionError
- InvokeServerUnavailableError
- InvokeRateLimitError
- InvokeAuthorizationError
- InvokeBadRequestError
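One way to express such a mapping is a dictionary from each unified error type to the raw exceptions it covers. In the sketch below the InvokeError classes are local stand-ins so the example runs on its own (a real plugin would use the runtime's types), the choice of built-in exceptions on the right-hand side is purely illustrative, and `to_invoke_error` is a hypothetical helper.

```python
from typing import Dict, List, Type


class InvokeError(Exception):
    """Stand-in base class; the real types come from the plugin runtime."""


class InvokeConnectionError(InvokeError): pass
class InvokeServerUnavailableError(InvokeError): pass
class InvokeRateLimitError(InvokeError): pass
class InvokeAuthorizationError(InvokeError): pass
class InvokeBadRequestError(InvokeError): pass


# Unified error type -> raw exceptions it should absorb (illustrative choices).
INVOKE_ERROR_MAPPING: Dict[Type[InvokeError], List[Type[Exception]]] = {
    InvokeConnectionError: [ConnectionError],
    InvokeServerUnavailableError: [TimeoutError],
    InvokeRateLimitError: [],
    InvokeAuthorizationError: [PermissionError],
    InvokeBadRequestError: [ValueError, KeyError],
}


def to_invoke_error(exc: Exception) -> InvokeError:
    """Translate a raw exception into the first matching unified error."""
    for invoke_error, raw_errors in INVOKE_ERROR_MAPPING.items():
        if isinstance(exc, tuple(raw_errors)):
            return invoke_error(str(exc))
    # Nothing matched: fall back to the generic base type.
    return InvokeError(str(exc))
```

In practice the right-hand lists would hold the exception classes raised by the provider's client library.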
3. Debug the Plugin
After development, test the plugin to make sure it runs correctly. For details, see Debug Plugin.
4. Publish the Plugin
To list the plugin on the Dify Marketplace, see Publish to Dify Marketplace.
Explore More
Quick Start:
Plugins Endpoint Docs:
- Manifest Structure
- Endpoint Definitions
- Reverse-Invocation of the Dify Service
- Tools
- Models