Custom Python Tools
Learn about custom Python tools and how to implement them effectively.
Custom Python Tool Functions
Forjinn supports custom Python tool functions that execute in isolated virtual environments with comprehensive package management, variable injection, and artifact handling.
Overview
Custom Python tools allow you to:
- Execute Python code in isolated environments
- Access workflow variables and context
- Generate artifacts (files, images, data)
- Install packages dynamically
- Handle complex data processing tasks
Architecture
Python Server
- FastAPI-based: High-performance execution environment
- Port: http://localhost:8000 (configurable via PYTHON_RUNTIME)
- Isolation: Workspace-specific virtual environments
- Resource Limits: CPU, memory, and file size constraints
Virtual Environments
- Workspace Isolation: Each workspace gets its own venv
- Package Management: Uses uv (preferred) or the standard venv module
- Automatic Creation: Environments created on-demand
- Persistent Storage: Environments persist across executions
Artifact Handling
- File Generation: Automatic detection of generated files
- URL Access: Files served via HTTP endpoints
- Supported Formats: Images, PDFs, CSV, JSON, HTML, etc.
- Workspace Isolation: Artifacts isolated per workspace
Variable System
Variable Types
Direct Parameters
# Tool input parameters (prefixed with $)
$param_name = "value_from_tool_input"
Workflow Variables
# Access workflow variables
$vars = {
    "api_key": "secret_key",
    "database_url": "connection_string",
    "user_setting": "value"
}
Flow Context
# Access flow execution context
$flow = {
    "input": "user_question",
    "chatHistory": [],
    "sessionId": "session_123",
    "workspaceId": "workspace_456"
}
Form Data
# Access form field data
$form = {
    "field_name": "field_value",
    "user_input": "form_data"
}
Variable Replacement
Variables are automatically replaced in your Python code using template syntax:
def process_data():
    # Template variables are replaced before execution
    api_key = "{{$vars.api_key}}"
    user_input = "{{$flow.input}}"
    form_data = "{{$form.field_name}}"
    current_time = "{{current_date_time}}"

    # Use the variables in your code
    return f"Processing {user_input} with key {api_key}"
Special Variables:
- {{question}} - Current user input
- {{chat_history}} - Conversation history
- {{current_date_time}} - Current timestamp
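For example, a minimal sketch of a tool that reads special variables alongside a workflow variable (the return message is illustrative; the template names match those documented above):
def main():
    # Special variables are substituted as plain strings before execution
    question = "{{question}}"
    timestamp = "{{current_date_time}}"

    # Workflow variables use the same template syntax
    user_setting = "{{$vars.user_setting}}"

    return f"Received '{question}' at {timestamp} (setting: {user_setting})"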
Tool Creation
Tool Definition
const pythonObj = {
  name: "my_python_tool",
  description: "Custom Python tool for data processing",
  schema: zodSchema, // Zod schema for input validation
  code: pythonCode, // Your Python code
  requirements: ['pandas', 'matplotlib', 'requests']
}

let pythonTool = new PythonDynamicTool(pythonObj)
pythonTool.setVariables(variables)
pythonTool.setFlowObject(flow)
Input Schema
Define tool inputs using Zod schemas:
const schema = z.object({
  data_source: z.string().describe("Data source URL or path"),
  output_format: z.enum(["json", "csv", "excel"]).describe("Output format"),
  filters: z.array(z.string()).optional().describe("Optional filters")
})
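Each schema field becomes a tool input parameter, which your Python code can reference with the $ prefix described in the Variable System section. A sketch of how the fields above might be consumed (the return message is illustrative):
def main():
    # Tool input parameters defined in the Zod schema, injected via template syntax
    data_source = "{{$data_source}}"
    output_format = "{{$output_format}}"

    return f"Reading {data_source} and writing results as {output_format}"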
Python Code Execution
Automatic Function Detection
The Python server automatically detects and executes functions:
Method 1: Explicit Result Assignment
def process_data():
    # Your processing logic
    return "Processing complete"

# Explicit assignment
result = process_data()
Method 2: Main Function Convention
def main():
    # Main processing logic
    return "Task completed successfully"

# Automatically detected and executed
Method 3: Execution Pattern Functions
def execute_task():
    return "Task executed"

def run_analysis():
    return "Analysis complete"

def process_request():
    return "Request processed"

def handle_data():
    return "Data handled"

# Functions with execution patterns are automatically detected
Artifact Generation
Image Artifacts
import matplotlib.pyplot as plt

def create_visualization():
    # Create a plot
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3, 4], [1, 4, 2, 3])
    ax.set_title('Sample Plot')

    # Save as artifact (automatically detected)
    save_image_artifact(fig, 'my_plot.png')

    return "Visualization created successfully"
Data Artifacts
import pandas as pd

def generate_report():
    # Create data
    df = pd.DataFrame({
        'x': [1, 2, 3, 4],
        'y': [1, 4, 2, 3]
    })

    # Save as CSV (automatically detected)
    df.to_csv('report.csv', index=False)

    # Save as Excel
    df.to_excel('report.xlsx', index=False)

    return "Report generated successfully"
Custom File Artifacts
def create_html_report():
    html_content = """
    <html>
    <head><title>Report</title></head>
    <body><h1>Analysis Results</h1></body>
    </html>
    """

    # Save HTML file
    with open('report.html', 'w') as f:
        f.write(html_content)

    return "HTML report created"
Package Management
Automatic Detection
Packages are automatically detected from import statements:
import pandas as pd # -> pandas
import matplotlib.pyplot as plt # -> matplotlib
import requests # -> requests
from PIL import Image # -> pillow
import cv2 # -> opencv-python
from sklearn import metrics # -> scikit-learn
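The detection logic is internal to the Python server, but conceptually it amounts to scanning the code for import statements and mapping module names to PyPI package names. A rough sketch of that idea (the IMPORT_TO_PACKAGE table and parse_requirements helper are illustrative, not the server's actual implementation):
import ast

# Hypothetical module-name -> PyPI-name mapping; only the documented examples shown
IMPORT_TO_PACKAGE = {
    "PIL": "pillow",
    "cv2": "opencv-python",
    "sklearn": "scikit-learn",
}

def parse_requirements(code: str) -> set[str]:
    """Collect top-level module names from import statements in the given code."""
    modules = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    # Translate module names to package names where they differ
    return {IMPORT_TO_PACKAGE.get(m, m) for m in modules}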
Manual Requirements
Specify requirements in tool configuration:
requirements = [
    'pandas>=1.5.0',
    'matplotlib>=3.6.0',
    'requests>=2.28.0',
    'beautifulsoup4',
    'lxml',
    'numpy',
    'scipy'
]
Common Packages
Pre-installed or easily available packages:
- Data Processing: pandas, numpy, scipy
- Visualization: matplotlib, seaborn, plotly
- Web: requests, beautifulsoup4, lxml
- Images: pillow, opencv-python
- ML: scikit-learn, tensorflow, torch
- AI: openai, anthropic, langchain
- Databases: pymongo, psycopg2-binary, mysql-connector-python
- Cloud: boto3, google-cloud-storage, azure-storage-blob
- Documents: pdfplumber, python-docx, openpyxl
API Endpoints
Tool Execution
POST /tools/execute-python
Content-Type: application/json
{
  "code": "def main(): return 'Hello World'",
  "variables": {
    "$vars": {"api_key": "secret"},
    "$flow": {"input": "user_question"}
  },
  "requirements": ["requests", "pandas"],
  "timeout": 300,
  "workspaceId": "workspace_123"
}
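As a sketch, the same request can be issued from Python with requests (assuming the default http://localhost:8000 runtime URL; the exact response shape depends on the server):
import requests

payload = {
    "code": "def main(): return 'Hello World'",
    "variables": {
        "$vars": {"api_key": "secret"},
        "$flow": {"input": "user_question"},
    },
    "requirements": ["requests", "pandas"],
    "timeout": 300,
    "workspaceId": "workspace_123",
}

response = requests.post("http://localhost:8000/tools/execute-python", json=payload, timeout=310)
response.raise_for_status()
print(response.json())  # execution result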
Package Installation
POST /tools/install-packages
Content-Type: application/json
{
  "packages": ["tensorflow", "torch", "transformers"],
  "workspace_id": "workspace_123"
}
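Heavy dependencies can be pre-installed the same way before running a tool, for example (again assuming the default runtime URL):
import requests

response = requests.post(
    "http://localhost:8000/tools/install-packages",
    json={"packages": ["tensorflow", "torch", "transformers"], "workspace_id": "workspace_123"},
)
response.raise_for_status()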
Artifact Access
GET /tools/artifacts/{workspace_id}/{filename}
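Generated files can then be fetched over HTTP. A short sketch, reusing the workspace ID and CSV filename from the examples in this page (the default runtime URL is assumed):
import requests

url = "http://localhost:8000/tools/artifacts/workspace_123/analysis_data.csv"
response = requests.get(url)
response.raise_for_status()

with open("analysis_data.csv", "wb") as f:
    f.write(response.content)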
Health Check
GET /tools/health
Security Features
Resource Limits
import resource

# CPU time limit (timeout is the configured execution timeout in seconds)
resource.setrlimit(resource.RLIMIT_CPU, (timeout + 2, timeout + 2))

# Memory limit (1GB default)
resource.setrlimit(resource.RLIMIT_AS, (1 << 30, 1 << 30))

# File size limit (100MB)
resource.setrlimit(resource.RLIMIT_FSIZE, (100 * 1024 * 1024, 100 * 1024 * 1024))
Workspace Isolation
- Separate virtual environments per workspace
- Isolated artifact directories
- Process-level isolation
- Secure file path validation
Environment Variables
- PY_TOOL_VENVS: Virtual environments directory
- PY_TOOL_ARTIFACTS: Artifacts storage directory
- PY_TOOL_MAX_MEM: Maximum memory limit
- PYTHON_RUNTIME: Python server URL
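How these variables are consumed is an implementation detail of the runtime, but the pattern looks roughly like this (the default values shown are assumptions based on the port and memory limit mentioned above, not confirmed configuration):
import os

# Assumed defaults, for illustration only
venvs_dir = os.environ.get("PY_TOOL_VENVS", "./venvs")
artifacts_dir = os.environ.get("PY_TOOL_ARTIFACTS", "./artifacts")
max_memory = int(os.environ.get("PY_TOOL_MAX_MEM", str(1 << 30)))  # 1GB, matching the resource limit above
python_runtime = os.environ.get("PYTHON_RUNTIME", "http://localhost:8000")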
Best Practices
- Function Design: Use clear function names with execution patterns
- Error Handling: Implement proper try/except blocks
- Resource Management: Be mindful of memory and CPU usage
- Artifact Naming: Use descriptive filenames for generated artifacts
- Package Management: Specify exact versions for reproducibility
- Variable Access: Use the variable system for configuration
- Testing: Test tools thoroughly before deployment
Troubleshooting
Common Issues
- Package installation fails: Check package name and version
- Function not executed: Ensure proper function naming patterns
- Variables not accessible: Verify variable injection syntax
- Artifacts not generated: Check file permissions and paths
- Timeout errors: Increase timeout or optimize code
Debug Steps
- Check Python server logs
- Verify variable injection
- Test package installation manually
- Review artifact generation
- Monitor resource usage
Example: Complete Data Analysis Tool
import pandas as pd
import matplotlib.pyplot as plt
import requests

def analyze_data():
    """
    Complete data analysis tool with API integration,
    data processing, and visualization.
    """
    # Access variables
    api_key = "{{$vars.api_key}}"
    data_source = "{{$data_source}}"
    user_query = "{{$flow.input}}"

    try:
        # Fetch data from API
        headers = {'Authorization': f'Bearer {api_key}'}
        response = requests.get(data_source, headers=headers)
        data = response.json()

        # Process data
        df = pd.DataFrame(data)

        # Generate visualization
        fig, ax = plt.subplots(figsize=(10, 6))
        df.plot(kind='bar', ax=ax)
        ax.set_title(f'Analysis for: {user_query}')

        # Save artifacts
        save_image_artifact(fig, 'analysis_chart.png')
        df.to_csv('analysis_data.csv', index=False)

        # Generate summary
        summary = {
            'total_records': len(df),
            'columns': list(df.columns),
            'summary_stats': df.describe().to_dict()
        }

        return f"Analysis complete! Processed {len(df)} records. Check artifacts for detailed results."

    except Exception as e:
        return f"Error during analysis: {str(e)}"

# This function will be automatically detected and executed
This comprehensive system enables powerful, flexible Python tool creation within Forjinn workflows while maintaining security and isolation.