Finetuning Guide
The BFL Finetuning API lets you customize FLUX Pro and FLUX Ultra using 1-20 images of your own visual content and, optionally, text descriptions.
Getting Started: Step-by-Step Guide
Prepare Your Images
Create a local folder containing your training images:
- Supported formats: JPG, JPEG, PNG, and WebP
- Recommended: More than 5 images
High-quality datasets with clearly articulated subjects, objects, or styles significantly improve training results. Higher-resolution source images help, but inputs are capped at 1MP.
Add Text Descriptions (Optional)
Create text files with descriptions for your images:
- Text files should share the same name as their corresponding images
- Example: if your image is sample.jpg, create sample.txt
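As a sketch, caption files can also be generated programmatically. The helper below is illustrative only (write_captions and its arguments are our own, not part of the API); it simply writes one .txt file per image, sharing the image's name:

```python
from pathlib import Path

def write_captions(image_dir: str, captions: dict[str, str]) -> list[Path]:
    """Write a caption file next to each training image.

    captions maps image filenames (e.g. "sample.jpg") to caption text.
    """
    created = []
    for image_name, caption in captions.items():
        image_path = Path(image_dir) / image_name
        # The caption file shares the image's name, with a .txt extension.
        txt_path = image_path.with_suffix(".txt")
        txt_path.write_text(caption, encoding="utf-8")
        created.append(txt_path)
    return created
```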
Package Your Data
Compress your folder into a ZIP file containing all images and optional text descriptions.
Configure Training Parameters
Select appropriate hyperparameters based on your use case. See the Training Parameters section below for detailed configuration options.
Submit Training Task
Use the provided Python functions to submit your finetuning task to the BFL API.
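A minimal submission sketch is shown below. It assumes the /v1/finetune endpoint, the X-Key authentication header, a base64-encoded ZIP in file_data, and the parameter names from the Training Parameters section; verify these against the current API reference before relying on them:

```python
import base64
import os

def build_finetune_payload(zip_path: str, comment: str, trigger_word: str = "TOK",
                           mode: str = "general", iterations: int = 300,
                           finetune_type: str = "full", lora_rank: int = 32,
                           captioning: bool = True, priority: str = "quality") -> dict:
    """Build the request body; the ZIP is sent base64-encoded."""
    with open(zip_path, "rb") as f:
        file_data = base64.b64encode(f.read()).decode("utf-8")
    # Other training parameters (e.g. learning_rate) can be added the same way.
    return {
        "file_data": file_data,
        "finetune_comment": comment,
        "trigger_word": trigger_word,
        "mode": mode,
        "iterations": iterations,
        "finetune_type": finetune_type,
        "lora_rank": lora_rank,
        "captioning": captioning,
        "priority": priority,
    }

def submit_finetune(zip_path: str, comment: str, **kwargs) -> str:
    import requests  # assumed installed: pip install requests
    resp = requests.post(
        "https://api.us1.bfl.ai/v1/finetune",
        headers={"X-Key": os.environ["BFL_API_KEY"]},
        json=build_finetune_payload(zip_path, comment, **kwargs),
    )
    resp.raise_for_status()
    return resp.json()["finetune_id"]
```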
Monitor Progress
Check the status of your training job using the progress monitoring functions.
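Progress can be polled via the /v1/get_result endpoint. The sketch below assumes the job reports a "Pending" status until it finishes; the helper names are our own:

```python
import os
import time

def finetune_status(task_id: str) -> dict:
    """One status query against the polling endpoint."""
    import requests  # assumed installed: pip install requests
    resp = requests.get(
        "https://api.us1.bfl.ai/v1/get_result",
        headers={"X-Key": os.environ["BFL_API_KEY"]},
        params={"id": task_id},
    )
    resp.raise_for_status()
    return resp.json()

def is_terminal(status: str) -> bool:
    # "Pending" means the job is still running; anything else ends polling.
    return status != "Pending"

def wait_for_finetune(task_id: str, interval_s: float = 10.0) -> dict:
    while True:
        result = finetune_status(task_id)
        if is_terminal(result.get("status", "Pending")):
            return result
        time.sleep(interval_s)
```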
Run Inference
Once training is complete, use your custom model through the available finetuned endpoints.
Training Parameters
Required Parameters
mode
Determines the finetuning approach based on your concept.
Options: "character", "product", "style", "general"
In "general" mode, the entire image is captioned when captioning is True, without specific focus areas. No subject-specific improvements will be made.

finetune_comment
Descriptive note to identify your fine-tune, since names are UUIDs. Will be displayed in finetune_details.
Optional Parameters
iterations
Minimum: 100
Defines training duration. For fast exploration, 100-150 iterations can be enough. For more complex concepts, larger datasets, or extreme precision, more iterations than the default can help.

learning_rate
Default: 0.00001 if finetune_type is "full", 0.0001 if finetune_type is "lora"
Lower values can improve the result but might need more iterations to learn a concept. Higher values can allow you to train for fewer iterations at a potential loss in quality. For finetune_type "lora", values 10 times larger than for "full" are recommended.
priority
Options: "speed", "quality", "high_res_only"
The "speed" priority improves the speed of each training step.

captioning
Enables/disables automatic image captioning.

trigger_word
Unique word or phrase that will be used in the captions to reference the newly introduced concepts.

lora_rank
Choose between 32 and 16. A lora_rank of 16 can increase training efficiency and decrease loading times.

finetune_type
Choose between "full" for a full finetuning plus post-hoc extraction of the trained weights into a LoRA, or "lora" for a raw LoRA training.
Inference Endpoints
Available endpoints for your finetuned model:
/flux-pro-1.1-ultra-finetuned
/flux-pro-finetuned
/flux-pro-1.0-depth-finetuned
/flux-pro-1.0-canny-finetuned
/flux-pro-1.0-fill-finetuned
You can only run inference on a finetune in the region where you trained it. If you submitted the finetune via https://api.us1.bfl.ai/v1/finetune, query https://api.us1.bfl.ai/v1/flux-pro-1.1-ultra-finetuned; querying the same path in a different region will not find the finetune.
Additional Inference Parameters
In addition, the finetuned endpoints accept all input parameters of their non-finetuned sibling endpoints.

finetune_id
References your specific model. Find the finetune_id either in my_finetunes or in the return dict of your /finetune POST.
finetune_strength
Range: 0-2
Controls finetune influence. Increase this value if your target concept isn't showing up strongly enough. The optimal setting depends on your finetune and prompt.
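Putting finetune_id and finetune_strength together, a hedged inference sketch against /flux-pro-1.1-ultra-finetuned might look like the following (the helper names and default strength are our own):

```python
import os

def build_inference_payload(prompt: str, finetune_id: str,
                            finetune_strength: float = 1.1) -> dict:
    if not 0 <= finetune_strength <= 2:
        raise ValueError("finetune_strength must be in [0, 2]")
    # Plus any parameters the non-finetuned sibling endpoint accepts
    # (width, height, seed, ...).
    return {
        "prompt": prompt,
        "finetune_id": finetune_id,
        "finetune_strength": finetune_strength,
    }

def run_finetuned_inference(prompt: str, finetune_id: str, **kwargs) -> dict:
    import requests  # assumed installed: pip install requests
    resp = requests.post(
        "https://api.us1.bfl.ai/v1/flux-pro-1.1-ultra-finetuned",
        headers={"X-Key": os.environ["BFL_API_KEY"]},
        json=build_inference_payload(prompt, finetune_id, **kwargs),
    )
    resp.raise_for_status()
    return resp.json()  # contains a task id to poll via /v1/get_result
```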
Implementation Guide
Setup and Dependencies
Install Dependencies
Set API Key
Example Python Implementation
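A compact end-to-end sketch tying the steps together is shown below. Endpoint paths, field names, and the "Pending" status are assumptions based on the descriptions in this guide, and the helper names are our own:

```python
"""End-to-end sketch: encode, submit, poll, then generate."""
import base64
import os
import time

API = "https://api.us1.bfl.ai/v1"

def encode_file(path: str) -> str:
    """Base64-encode a file for the file_data field."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

def _request(method: str, path: str, **kwargs) -> dict:
    import requests  # assumed installed: pip install requests
    resp = requests.request(method, f"{API}{path}",
                            headers={"X-Key": os.environ["BFL_API_KEY"]}, **kwargs)
    resp.raise_for_status()
    return resp.json()

def finetune_and_generate(zip_path: str, prompt: str) -> dict:
    # 1. Submit the training job.
    job = _request("POST", "/finetune", json={
        "file_data": encode_file(zip_path),
        "finetune_comment": "my first finetune",
        "trigger_word": "mycstm",
        "mode": "general",
    })
    finetune_id = job["finetune_id"]
    # 2. Poll until training leaves the "Pending" state.
    while _request("GET", "/get_result", params={"id": finetune_id})["status"] == "Pending":
        time.sleep(10)
    # 3. Run inference with the finetuned model.
    return _request("POST", "/flux-pro-1.1-ultra-finetuned", json={
        "prompt": prompt,
        "finetune_id": finetune_id,
        "finetune_strength": 1.2,
    })

if __name__ == "__main__":
    print(finetune_and_generate("dataset.zip", "a photo of mycstm sunglasses"))
```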
Best Practices and Tips
1. Enhancing Concept Representation
- Try finetune_strength values >1 if the concept is not present
- Increase the finetune_strength if the concept is not present or the identity preservation is not strong enough
- Lower the finetune_strength if you are unable to generalize the concept into new settings or if the image has visible artifacts
2. Character Training
- Avoid multiple characters in single images
- Use manual captions when multiple characters are unavoidable
- Consider disabling auto-captioning in complex scenes or for complex concepts
3. Quality Considerations
- Use high-quality training images
- Adjust learning rate based on training stability
- Monitor training progress and adjust parameters as needed
4. Prompting
Change the trigger word to something more contextual than "TOK". For example, use "mycstm sunglasses" for a product finetune on sunglasses.
Key Strategies:
- Prepend the trigger word to your prompt: "mycstm sunglasses, a photo of mycstm sunglasses laying on grass"
- For character and product consistency, briefly describe the person/product: "mycstm man, a photograph of mycstm man, sitting on a park bench and smiling into the camera. mycstm man is a middle aged man with brown curly hair and a beard. He is wearing a cowboy hat."
- For styles, append "in the style of [trigger-word]": "a kangaroo wearing ski goggles holding up a drink, in the style of mycstm watercolourstyle"