API Endpoints Overview

Primary Global Endpoint

api.bfl.ai - Primary Endpoint
- Routes requests across all available clusters globally
- Provides automatic failover between clusters for enhanced uptime
- Intelligent load distribution prevents bottlenecks during high traffic periods
- Important: Always use the polling_url returned in responses when using this endpoint
- Suitable for: standard inference
- Not suitable for: finetuning operations and finetuned-model inference workloads (finetuning remains region-specific)
Regional Endpoints

api.eu.bfl.ai - European Multi-cluster Endpoint
- Multi-cluster routing limited to EU regions
- GDPR compliant
- Provides the same uptime and load balancing benefits within EU regions

api.us.bfl.ai - US Multi-cluster Endpoint
- Multi-cluster routing limited to US regions
- Provides the same uptime and load balancing benefits within US regions
Legacy Regional Endpoints

api.eu1.bfl.ai - EU Single-cluster Endpoint
- Required for finetuning operations in the EU region
- Single cluster, no automatic failover

api.us1.bfl.ai - US Single-cluster Endpoint
- Required for finetuning operations in the US region
- Single cluster, no automatic failover
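The endpoint rules above can be sketched as a small helper. This is illustrative only: the hostnames come from the tables above, while the function and dictionary names are our own and not part of any SDK.

```python
# Hypothetical helper for picking a base URL. Only the hostnames are from
# the documentation; everything else here is an illustrative sketch.
ENDPOINTS = {
    'global': 'https://api.bfl.ai',           # multi-cluster, automatic failover
    'eu': 'https://api.eu.bfl.ai',            # EU-only multi-cluster routing
    'us': 'https://api.us.bfl.ai',            # US-only multi-cluster routing
    'eu-finetune': 'https://api.eu1.bfl.ai',  # single cluster, required for EU finetuning
    'us-finetune': 'https://api.us1.bfl.ai',  # single cluster, required for US finetuning
}


def base_url(region: str = 'global', finetuning: bool = False) -> str:
    """Pick an endpoint; finetuning must use a single-cluster legacy host."""
    if finetuning:
        if region not in ('eu', 'us'):
            raise ValueError('Finetuning requires an explicit eu or us region')
        return ENDPOINTS[f'{region}-finetune']
    return ENDPOINTS.get(region, ENDPOINTS['global'])
```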
Key Benefits of New Endpoints
- Enhanced Reliability: reduced downtime through automatic cluster failover
- Better Performance: intelligent traffic distribution prevents overload during peak usage
- Seamless Experience: load balancing happens transparently on our end
Polling URL Usage

When using the primary global endpoint (api.bfl.ai) or the regional endpoints (api.eu.bfl.ai, api.us.bfl.ai), you must use the polling_url returned in the initial request response.

Webhook Users: If you're using webhooks to receive results, no changes are needed. The polling_url requirement only applies when implementing async polling to check request status.
Example Implementation
polling_example.py
import os
import time

import requests

# Submit request to the global endpoint
response = requests.post(
    'https://api.bfl.ai/v1/flux-pro-1.1',
    headers={
        'accept': 'application/json',
        'x-key': os.environ.get('BFL_API_KEY'),
        'Content-Type': 'application/json',
    },
    json={
        'prompt': 'A serene landscape with mountains',
        'aspect_ratio': '16:9',
    },
)
data = response.json()
request_id = data['id']
polling_url = data['polling_url']  # Use this URL for polling

# Poll using the returned polling_url
while True:
    time.sleep(0.5)
    result = requests.get(
        polling_url,
        headers={
            'accept': 'application/json',
            'x-key': os.environ.get('BFL_API_KEY'),
        },
        params={'id': request_id},
    ).json()
    if result['status'] == 'Ready':
        print(f"Image ready: {result['result']['sample']}")
        break
    elif result['status'] in ['Error', 'Failed']:
        print(f"Generation failed: {result}")
        break
Content Delivery and Storage Guidelines

Delivery URLs

Generated images are served from region-specific delivery URLs:
- EU: delivery-eu1.bfl.ai
- US: delivery-us1.bfl.ai

Important Delivery Considerations
- Not for Direct Serving: The result.sample URLs from delivery endpoints are not meant to be served directly to end users.
- No CORS Support: We do not enable CORS on delivery URLs, so they cannot be used in web browsers for cross-origin requests.
- 10-Minute Expiration: Generated images expire after 10 minutes and become inaccessible.
- Network Access: If your infrastructure uses firewalls or network restrictions, whitelist the delivery endpoints (delivery-eu1.bfl.ai, delivery-us1.bfl.ai) to allow downloading generated images.
Recommended Image Handling
Download and Re-serve Pattern:
download_and_serve.py
import os
from datetime import datetime
from typing import Any, Dict

import requests


def download_and_store_image(result_url: str, local_path: str) -> str:
    """Download an image from a BFL delivery URL and store it locally."""
    response = requests.get(result_url)
    response.raise_for_status()
    with open(local_path, 'wb') as f:
        f.write(response.content)
    return local_path


def handle_generation_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Process a generation result and store the image locally."""
    if result['status'] == 'Ready':
        sample_url = result['result']['sample']

        # Generate a unique filename
        timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
        filename = f'generated_image_{timestamp}.jpg'
        local_path = os.path.join('./images', filename)

        # Ensure the directory exists
        os.makedirs(os.path.dirname(local_path), exist_ok=True)

        # Download and store
        stored_path = download_and_store_image(sample_url, local_path)

        # Now serve from your own infrastructure
        return {
            'status': 'ready',
            'local_path': stored_path,
            'public_url': f'https://yourdomain.com/images/{filename}',
        }
    return result
Migration Checklist

Update API Endpoints
- Replace legacy endpoints with the appropriate new endpoints based on your needs
- Use api.bfl.ai for global load balancing
- Use api.eu.bfl.ai or api.us.bfl.ai for regional preferences

Implement Polling URL Handling
- Ensure your code extracts and uses the polling_url from API responses
- Update polling logic to use the provided polling URL instead of hardcoded endpoints

Update Finetuning Workflows
- Continue using the region-specific endpoints (api.eu1.bfl.ai, api.us1.bfl.ai) for finetuning
- Ensure finetuned-model inference uses the same region as training

Implement Proper Image Handling
- Set up download-and-re-serve infrastructure for generated images
- Plan for the 10-minute expiration window
- Consider a CDN or cloud storage for better serving performance
Best Practices
Error Handling
import time
from typing import Any, Dict

import requests


def robust_api_call(
    url: str,
    headers: Dict[str, str],
    json_data: Dict[str, Any],
    max_retries: int = 3,
) -> Dict[str, Any]:
    """Robust API call with retry logic and proper error handling."""
    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=json_data)

            if response.status_code == 429:
                # Rate limit exceeded: wait and retry with exponential backoff
                time.sleep(2 ** attempt)
                continue
            elif response.status_code == 402:
                # Insufficient credits
                raise Exception('Insufficient credits. Please add credits to your account.')
            elif response.status_code >= 400:
                # Other client/server errors
                response.raise_for_status()

            return response.json()
        except requests.exceptions.RequestException:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)

    raise Exception(f'Failed after {max_retries} attempts')
Rate Limiting
- Maximum 24 concurrent requests for most endpoints
- Maximum 6 concurrent requests for flux-kontext-max
- Implement exponential backoff for 429 responses
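One way to honor both rules client-side is a per-endpoint semaphore sized to the documented limit, plus a capped backoff schedule for 429s. A minimal sketch, assuming nothing beyond the limits stated above; the helper names and the base/cap values are illustrative choices, not part of the API:

```python
import threading

# Concurrency limits from the documentation; helper names are our own.
LIMITS = {'flux-kontext-max': 6}
DEFAULT_LIMIT = 24


def semaphore_for(endpoint: str) -> threading.BoundedSemaphore:
    """One semaphore per endpoint, sized to its documented concurrency limit."""
    return threading.BoundedSemaphore(LIMITS.get(endpoint, DEFAULT_LIMIT))


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff for 429s: 0.5s, 1s, 2s, ... capped at 30s."""
    return min(cap, base * (2 ** attempt))


# Usage sketch: acquire before submitting, release when the request finishes.
# sem = semaphore_for('flux-kontext-max')
# with sem:
#     submit_request(...)  # hypothetical request function
```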
Content Management
- Download images immediately upon generation completion
- Implement proper error handling for expired URLs
- Consider a queue system for high-volume applications
- Use appropriate storage solutions (CDN, cloud storage) for serving images to users
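For high-volume applications, the "download immediately" and "queue system" points can be combined: enqueue each ready result and let background workers pull it down well inside the 10-minute window. A minimal sketch using the standard library; `download` stands in for your own download-and-store function (such as the one shown earlier) and the worker count is an arbitrary illustrative choice:

```python
import queue
import threading


def start_download_workers(download, num_workers: int = 4) -> queue.Queue:
    """Return a queue; enqueued sample URLs are downloaded by background workers.

    Enqueue None once per worker to shut them down; call tasks.join() to wait.
    """
    tasks: queue.Queue = queue.Queue()

    def worker() -> None:
        while True:
            sample_url = tasks.get()
            if sample_url is None:  # sentinel: stop this worker
                tasks.task_done()
                break
            try:
                download(sample_url)  # must finish within the 10-minute window
            finally:
                tasks.task_done()

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()
    return tasks
```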