Inworld’s commitment to safety

Inworld is committed to ensuring the safety of our community and the responsible use of AI. In this post, we outline the safety guidelines, character guardrails, reporting mechanisms, and ongoing monitoring designed to ensure everyone has a good time interacting with our characters.

Update 5/23/2024: Please check our developer documentation for the latest safety policies.

Inworld celebrates the power of imagination by supporting creatives of all kinds to develop compelling new narrative formats with interactive AI characters.

But building AI-driven creative technologies requires careful consideration. We believe that anyone building products and experiences powered by artificial intelligence must deeply respect all potential audiences and work diligently to ensure products don’t cause harm.

Given the power and reach of Inworld’s technology, we have thought a lot about how to support creators in a way that respects the safety of the audiences they serve. Most importantly, we are committed to ensuring the comfort and safety of all users who encounter Inworld characters.

Our approach to safety, outlined in the rest of this article, includes:

Clear restrictions and guidelines: on content creation hosted on our platform
Developer controls: for flexibility to create content appropriate for the target audience
Reporting and moderation: on all conversations and characters on Inworld platforms
Extensive safety system and integrated guardrails: built into all Inworld characters
Ongoing monitoring and improvements: to allow for continuous safety improvements

1. Inworld Safety Guidelines

We respect creators’ ability to express themselves. However, we believe that certain subjects and use cases present a particularly high risk of harm to users.

We prohibit the creation of characters or the sharing of content that:

Is illegal or unlawful

We prohibit characters or content that are libelous, defamatory, obscene, pornographic, sexually explicit, indecent, lewd, suggestive, offensive, inflammatory, threatening, abusive, or fraudulent.

Harasses others

We do not want our characters used to harm people in any way. We prohibit any characters or content deliberately designed to provoke or antagonize people, especially through trolling and bullying, or intended to harass, harm, hurt, scare, distress, embarrass, or upset people.

Engages in or promotes hate speech or hateful conduct

We do not allow any characters or content that are racist or discriminatory, including discrimination on the basis of someone’s race, religion, age, gender, gender identity, disability or sexuality.

Threatens, promotes, or glorifies violence or harm

We do not allow our characters to promote or threaten violence. This includes any characters designed to promote suicide or self-harm.

Impersonates any person or entity

We do not allow characters created to impersonate any person or entity, or to falsely state or otherwise misrepresent a user or their affiliation with any person or entity.

Contains unsolicited promotions

We do not allow unsolicited advertising or promotions. We also do not allow our characters to be used for political campaigning.

Contains private information of any third party

We do not allow characters that share personal information of any third party. That includes addresses, phone numbers, email addresses, credit card numbers, and any number or other detail from a personal identity document (e.g., driver’s license numbers, passport numbers).

Violates anyone’s intellectual property or other rights

We do not allow any characters or content that violate any intellectual property owner’s rights.

2. Developer controls: Inworld-hosted vs developer-hosted safety guidelines

At Inworld, we recognize that context and audience matter when determining what is appropriate. For all Inworld-hosted experiences (such as in our Studio or Arcade), we maintain strict restrictions due to the wide range of potential users.

However, on developer-hosted experiences, certain topics may be appropriate given that experience’s context and audience. For example, a character in a Halo video game should be able to talk about guns.

Therefore, we offer custom safety controls that allow developers to ensure their content is appropriate for their audiences. However, certain topics and intended uses have a significant propensity for harm and will not be allowed by Inworld under any circumstances.

Inworld-hosted

For Inworld-hosted characters, which are served via an Inworld-owned domain or experience, we default to strict safety controls and content restrictions to create a safe environment and community. This includes characters in the Inworld Studio or Arcade. We want everyone to enjoy talking with characters on our platform and to know what to expect from any character we host.

Developer-hosted

Inworld characters inhabit a range of virtual worlds and meet a wide variety of users. We work with qualified partners to create custom content filters and restrictions and to ensure the characters they create are appropriate for their audience.

For example, Entertainment Software Rating Board (ESRB) ratings help developers consider the right target audience for the topics they want to allow. This might involve discussion of combat in a first-person shooter or a wider variety of subjects in ESRB-rated Mature games like Cyberpunk.

For creators wanting even stronger safety controls, we support additional restrictive filters, such as those required for sensitive groups like children. Our default safety controls are already strict, but we can add further topics, intentions, or keywords to them.
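To make the layering concrete, here is a minimal sketch of how strict defaults, developer relaxations, extra restrictions, and a set of topics that can never be allowed might combine. It is an illustration only: the `SafetyConfig` class, its fields, and every topic label are hypothetical and are not Inworld's actual API or taxonomy.

```python
from dataclasses import dataclass

# Hypothetical topic labels for illustration; not Inworld's real taxonomy.
ALWAYS_BLOCKED = frozenset({"hate_speech", "self_harm_promotion"})
DEFAULT_BLOCKED = frozenset({"violence", "weapons", "profanity"})


@dataclass(frozen=True)
class SafetyConfig:
    """Hypothetical per-experience safety settings."""
    allowed_topics: frozenset = frozenset()          # defaults a developer relaxes
    extra_blocked_topics: frozenset = frozenset()    # stricter than the defaults
    extra_blocked_keywords: frozenset = frozenset()  # stricter than the defaults

    def blocked_topics(self) -> frozenset:
        # Relax the defaults first, then add the developer's own restrictions.
        relaxed = DEFAULT_BLOCKED - self.allowed_topics
        # ALWAYS_BLOCKED is unioned in last, so no setting can remove it.
        return relaxed | self.extra_blocked_topics | ALWAYS_BLOCKED


# A shooter relaxes combat topics; its attempt to allow hate speech is ignored.
shooter = SafetyConfig(allowed_topics=frozenset({"weapons", "violence", "hate_speech"}))

# An experience for children keeps every default and adds more restrictions.
kids = SafetyConfig(
    extra_blocked_topics=frozenset({"romance"}),
    extra_blocked_keywords=frozenset({"casino"}),
)
```

In this sketch, `shooter.blocked_topics()` no longer contains `"weapons"` but still contains `"hate_speech"`, while `kids.blocked_topics()` contains all defaults plus `"romance"`.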

3. Reporting and moderation

To ensure the comfort and safety of all users, we have also developed a number of reporting and moderation systems.

Character reporting

Users can click to report a character and provide details about violations of our safety policy.

Conversation reporting

Users can click on the flag icon on a comment made by the character. That comment is sent to Inworld for review.

Inworld manually reviews all reports. If a character or creator is in violation of Inworld’s content policy, we may require creators to remove or limit those characters, content, or interactions, or we might remove the content or characters ourselves, depending on the nature of the violation. We may also use reported information to improve our models.

4. Safety System and Guardrails

We have engineered a proprietary safety system that is integrated with every Inworld character to ensure that characters do not engage with inappropriate subjects or unintended uses. These guardrails prevent the creation of, or interaction with, characters that violate our Community Safety Guidelines.

Our characters will automatically detect any language that violates our policies. Depending on the context, characters are able to redirect the conversation or notify users of the violation.
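As a rough illustration of the detect-then-respond flow described above, the sketch below checks a user message and picks one of three outcomes. Everything in it is an assumption made for the example: the keyword matcher is a simple stand-in for a real detection model, and the rule "redirect on the first violation, notify on repeats" is one possible reading of "depending on the context", not Inworld's actual logic.

```python
import re
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"   # no violation detected
    REDIRECT = "redirect"   # character steers the conversation elsewhere
    NOTIFY = "notify"       # character tells the user a policy was violated


def detect_violation(message: str, blocked_keywords: set[str]) -> bool:
    """Stand-in detector: case-insensitive whole-word keyword match.

    Assumes blocked_keywords are already lowercase.
    """
    words = set(re.findall(r"[a-z']+", message.lower()))
    return not words.isdisjoint(blocked_keywords)


def choose_action(message: str, blocked_keywords: set[str], prior_violations: int) -> Action:
    """Pick a response using a hypothetical context rule."""
    if not detect_violation(message, blocked_keywords):
        return Action.CONTINUE
    # Assumed rule: redirect the first time, notify on repeated violations.
    return Action.REDIRECT if prior_violations == 0 else Action.NOTIFY
```

For example, `choose_action("Tell me about the casino", {"casino"}, 0)` returns `Action.REDIRECT`, and the same message with `prior_violations=1` returns `Action.NOTIFY`.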

Additionally, our safety system ensures that characters do not respond to or learn from unsafe responses and interactions.

5. Ongoing monitoring and safety improvements

We are constantly iterating on our platform’s safety guardrails in order to improve the safety of Inworld’s characters. In addition to the safety system integrated with every character to prevent unsafe interactions, our system flags conversations where a user attempts to violate our policies.

We review interactions where policies were violated in order to understand user intents. We use this feedback to continuously improve our guardrails so that Inworld characters can safely interact with all users.

Our commitment to you

Inworld is committed to empowering creators to bring their products and experiences to life with AI characters. In order to do this responsibly, we also maintain a deep commitment to ensuring the safety of our community and the responsible use of AI. It is our utmost concern that any character a user encounters does not engage in topics or uses that are inappropriate or harmful in any way.

The measures outlined above are just some of the steps we have taken to uphold our responsibility for our community’s safety. Through our guidelines, we intend to ensure that creators and developers building and integrating Inworld characters do not produce any content that is dangerous or unlawful, incites hate, or causes societal harm of any kind.

Our extensive, integrated safety system, reporting and moderation policies, and our ongoing monitoring and improvements also help us ensure safe character experiences for our community. The steps we’ve taken in our commitment to safe and responsible AI are part of our ongoing journey to help our partners confidently deliver incredible experiences with Inworld characters. We want all creators to be able to create characters with peace of mind and knowledge that their audiences will be respected.