Moderation

Learn about moderation and how to implement it effectively.

Last updated: 12/9/2025

Moderation & Content Safety

Content Moderation nodes and workflows in InnoSynth-Forjinn protect users, data, and organizations by automatically filtering, flagging, or blocking undesirable outputs, whether toxic, profane, unsafe, or non-compliant with business, school, or government policy.


Why Moderation Nodes?

  • Prevent abuse, hate speech, and offensive/unsafe content from reaching end users.
  • Enforce compliance (school, workplace, platform safety regulations).
  • Reduce legal risk and support trust-building for AI-powered flows and bots.

Supported Moderation Features

1. Built-in Moderation Node

  • Leverages an LLM or third-party models (e.g., the OpenAI Moderation API, Hugging Face zero-shot classifiers).
  • Flags, blocks, or reroutes content based on detected risk (sexual, hate, violence, self-harm, custom blocklists).
  • Runs as a filter on input, output, or both (see the sketch below).
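
Under the hood, a built-in check behaves roughly like the sketch below, written against the OpenAI Moderation API. The verdict shape and the moderate() helper are illustrative assumptions for this example, not the node's actual interface.

```python
# Minimal sketch of a moderation filter step calling the OpenAI
# Moderation API. The verdict dict is an illustrative shape, not
# the platform's actual node interface.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def moderate(text: str) -> dict:
    """Score one piece of text and return a simple verdict."""
    response = client.moderations.create(
        model="omni-moderation-latest",
        input=text,
    )
    result = response.results[0]
    # Collect the names of the categories the model flagged.
    flagged = [name for name, hit in result.categories.model_dump().items() if hit]
    return {"allowed": not result.flagged, "flagged_categories": flagged}
```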

2. Custom Moderation

  • Bring your own rules: regex, phrase lists, blocking scripts (sketched below).
  • Chain with external APIs (Perspective, Google, AWS Comprehend, etc.).
  • Combine with Sticky Notes to document the policy alongside the flow.
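
A minimal bring-your-own-rules check might look like the following sketch; every phrase and pattern in it is a placeholder for whatever your content policy specifies.

```python
# Sketch of a custom rules-based check: an exact-phrase blocklist plus
# regex patterns. All entries here are placeholders for your policy.
import re

PHRASE_BLOCKLIST = {"free money", "wire transfer now"}   # placeholder phrases
PATTERN_BLOCKLIST = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # SSN-like number
    re.compile(r"(?i)\bpassword\s*[:=]"),   # credential-leak pattern
]

def violates_policy(text: str) -> bool:
    lowered = text.lower()
    if any(phrase in lowered for phrase in PHRASE_BLOCKLIST):
        return True
    return any(pattern.search(text) for pattern in PATTERN_BLOCKLIST)
```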

3. Moderation in Action

  • Add directly before or after LLM/agent nodes.
  • Optionally connect a block/redirect path for rejected content (“Sorry, that question is not allowed.”).
  • Log all flagged events for audit/review (see the sketch below).
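
Here is one way the block path and audit log could fit together, reusing the moderate() helper sketched earlier; the refusal text and log format are examples, not platform defaults.

```python
# Sketch of a block/redirect path with audit logging. Reuses the
# moderate() helper from the earlier sketch.
import json
import logging
import time

audit_log = logging.getLogger("moderation.audit")

def guarded_reply(user_input: str, run_agent) -> str:
    verdict = moderate(user_input)
    if not verdict["allowed"]:
        # Redirect path: record the event, then return a canned refusal.
        audit_log.warning(json.dumps({
            "ts": time.time(),
            "action": "blocked",
            "categories": verdict["flagged_categories"],
        }))
        return "Sorry, that question is not allowed."
    return run_agent(user_input)  # safe: continue to the LLM/agent node
```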

Example Flow: Safe Q&A Chatbot

  1. User Input → Moderation Node (OpenAI Moderation, “strict”)
  2. If flagged, respond: “Please avoid inappropriate language.”
  3. If safe, pass to the LLM/agent for normal processing (see the end-to-end sketch below)
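
Tied together in code, the three steps might look like this sketch; the strict threshold value and both model names are illustrative choices, not defaults.

```python
# End-to-end sketch of the flow above: strict moderation on the user
# input, a canned reply when flagged, normal LLM processing otherwise.
from openai import OpenAI

client = OpenAI()
STRICT_THRESHOLD = 0.2  # "strict": block even low-confidence category hits

def safe_qa(question: str) -> str:
    mod = client.moderations.create(
        model="omni-moderation-latest",
        input=question,
    ).results[0]
    scores = mod.category_scores.model_dump().values()
    if mod.flagged or any(score > STRICT_THRESHOLD for score in scores):
        return "Please avoid inappropriate language."
    chat = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return chat.choices[0].message.content
```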

Configuration Fields

  • Provider: OpenAI, HuggingFace, RegEx, Custom, etc.
  • Categories: Which types to block or warn on (hate, adult, violence, medical, etc.)
  • Severity Threshold: Tune how aggressively to block, from everything flagged down to only high-confidence matches
  • On-Block Behavior: Message, re-prompt, escalate, or log only (see the example below)
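
As one concrete illustration, the fields above might come together like this; the key names are shorthand for the node's settings panel, not a literal configuration schema.

```python
# Illustrative node settings as a plain dict. The key names mirror the
# fields above; they are not a literal InnoSynth-Forjinn schema.
moderation_config = {
    "provider": "openai",          # or "huggingface", "regex", "custom"
    "categories": ["hate", "adult", "violence"],
    "severity_threshold": 0.8,     # lower = stricter (block more); higher = only high-confidence
    "on_block": "message",         # or "re-prompt", "escalate", "log-only"
    "block_message": "Sorry, that question is not allowed.",
}
```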

Troubleshooting

  • False positives/overblocking: Lower strictness or add whitelist patterns/checks
  • Missed risks: Try multiple providers, an ensemble of checks, or periodic retraining
  • Performance: Batch-moderate where possible for bulk operations (sketched below)
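
For the bulk-operations point, the OpenAI Moderation endpoint accepts a list of inputs in a single call, so batching can be as simple as this sketch.

```python
# Sketch of batched moderation for bulk operations: one API call scores
# many texts at once instead of one request per item.
from openai import OpenAI

client = OpenAI()

def moderate_batch(texts: list[str]) -> list[bool]:
    """Return one allowed (True) / blocked (False) verdict per text."""
    response = client.moderations.create(
        model="omni-moderation-latest",
        input=texts,  # the endpoint accepts an array of strings
    )
    # Results come back in the same order as the inputs.
    return [not result.flagged for result in response.results]
```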

Best Practices

  • Always include moderation in any externally facing flow
  • Audit logs and flagged items regularly
  • Document content policies for users and review team

Moderation safeguards AI for the real world: trust, compliance, and user safety are built into every conversation and automation by design.