Securing Data Confidentiality: Deploying Custom Content Filters with Ease Using LLM Gateway

Updated: 2024-01-02

Learn to implement custom content filters for ensuring data confidentiality with LLM Gateway. Enhance data privacy in language models without compromising efficiency. Secure your enterprise's language processing today.

Introduction

In this article, we explore how a Large Language Model (LLM) Gateway can be used to deploy custom content filters when data confidentiality requirements make cloud provider content filters unsuitable. LLMs have become widely used in business for code and text generation, customer service, and employee productivity. They also bring challenges, such as the risk of misuse and the need to screen out harmful or controversial language. Content filters are therefore not only an operational concern but also part of the data privacy conversation. Building on our previous article, Data Privacy in LLM Analytics: Maximizing Security with LLM Gateway, we start from the premise that data security remains a primary concern for businesses. This piece outlines a three-step process for implementing a custom content filter that balances filtering needs with data privacy considerations.

Understanding Content Filters in LLMs

Content filters are classifiers that run before a request reaches the LLM API and filter out certain types of requests. Cloud provider content filters are generally effective, and because the provider keeps them updated they are a low-risk option for many scenarios. In use cases with stringent privacy requirements, however, they may not be suitable: data sent to the provider may be reviewed by the provider's personnel and may not adhere to the same data residency standards. Customers with such requirements might therefore ask for these filters to be disabled. There are other scenarios as well. When running a local open-source language model, there might not be a readily available content filter. You may need filters that do not target sensitive or abusive language at all, but instead flag questions about confidential information, competitors, or other topics relevant in a corporate setting. Finally, you might want to run the same content filter regardless of which LLM cloud provider you use.

Implementing a Custom Content Filter using LLM Gateway

This is how we can implement a simple custom content filter in three steps:

  1. Integrating an LLM Gateway in Your System Architecture

  2. Selecting a Content Filter Classifier

  3. Deploying the Content Filter via the LLM Gateway

Integrating an LLM Gateway in Your System Architecture

Image: Graphics showing the Integration of an LLM Gateway in Modern System Architecture

The initial step in implementing a local content filter is to decouple your application from the LLM API, which is achieved by introducing an LLM Gateway. For this purpose, we utilize Gecholog.ai's LLM Gateway, noted for its high performance and compatibility with various cloud services and LLMs. This integration is straightforward, usually involving just an update to the API URL in your application. For more insights into the varied applications and advantages of an LLM Gateway, you can refer to our blog.
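Because the change is usually limited to the API URL, it can help to keep that URL in configuration rather than in code. The following is a minimal sketch under stated assumptions: the environment variable name LLM_BASE_URL and both addresses are hypothetical placeholders, not Gecholog.ai defaults.

```python
import os

# Hypothetical addresses, for illustration only; substitute your own.
DIRECT_LLM_URL = "https://llm-provider.example.com"
GATEWAY_URL = "http://gateway.example.internal"


def resolve_base_url(env=None):
    """Return the base URL the application should call.

    Reads the (assumed) LLM_BASE_URL setting and falls back to the
    direct provider URL when it is not set.
    """
    env = os.environ if env is None else env
    return env.get("LLM_BASE_URL", DIRECT_LLM_URL).rstrip("/")


def endpoint(path, env=None):
    """Join the configured base URL with an API path."""
    return f"{resolve_base_url(env)}/{path.lstrip('/')}"
```

With this arrangement, routing traffic through the gateway means pointing LLM_BASE_URL at the gateway address; the request-building code stays unchanged.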

Selecting a Content Filter Classifier

Once the gateway integration is complete, the next step is to choose an appropriate content filter classifier. Our objective is to implement a local content classifier, ensuring complete control over data confidentiality. As a consequence, relying on cloud-provider content filters isn't suitable for our needs. We have two primary options:

  • Use a Pre-trained Model: While there is no definitive gold standard for pre-trained content filter models, several notable ones exist, such as detoxify and toxic-BERT.

  • Develop Your Own Model: Alternatively, you can develop a custom model using robust natural language processing and transformer libraries from sources like Hugging Face or spaCy. This approach, though more effort-intensive, can be facilitated by using existing datasets, such as the Hate Speech Data Sets.

For this article, we have selected detoxify, which offers a straightforward Python interface for usage.

Deploying the Content Filter via the LLM Gateway

The final step is implementing the content filter itself. The core idea is straightforward: analyze the incoming text using the classifier, which returns a score for each parameter, such as toxicity, obscenity, threats, or insults. We compare each score against a threshold. If all scores are below the threshold, the request is approved; if any score is above it, the request is not. Approved requests are forwarded to the LLM API, while non-approved requests are rejected (filtered out).
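The threshold check can be sketched in a few lines. This is an illustration, not the processor code from our docs site: the function names are hypothetical, the classifier is passed in as a plain callable so the logic can be tried without downloading a model, and treating a score exactly equal to the threshold as a rejection is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]


def evaluate(
    text: str,
    classify: Callable[[str], Scores],
    threshold: float = 0.5,
) -> Tuple[bool, List[str]]:
    """Return (approved, flagged parameter names) for one request text.

    Assumption: a score equal to the threshold counts as flagged.
    """
    scores = classify(text)
    flagged = sorted(name for name, value in scores.items() if value >= threshold)
    return (len(flagged) == 0, flagged)


def stub_classifier(text: str) -> Scores:
    """Stand-in classifier with made-up scores, used only for the demo."""
    hit = "idiot" in text.lower()
    return {
        "toxicity": 0.9 if hit else 0.01,
        "insult": 0.8 if hit else 0.01,
        "threat": 0.02,
    }
```

With detoxify installed, `Detoxify('original').predict` should be usable as the `classify` argument, since for a single string it returns a dictionary of per-parameter scores.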

Image: Architecture Diagram of Content Filter Microservice Executed from LLM Gateway

Using Gecholog.ai, we set up a custom Python processor microservice that connects to the LLM Gateway. This processor runs synchronously on each request and evaluates the incoming text using the detoxify library. We have chosen the recommended initial threshold of 0.5 as a balanced starting point for content filtering. You can find the example code for our processor on our docs site. A simple test shows that we have successfully implemented our custom content filter:

Image: Example of LLM API Request Approved and Rejected by Custom Content Filter
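To make the processor's role more concrete, here is a hedged sketch of a synchronous handler. It is not the Gecholog.ai processor interface: the function names, the shape of the returned dictionary, and the 400 status code are hypothetical, and the request body is assumed to be an OpenAI-style chat payload with a "messages" list.

```python
import json
from typing import Callable, Dict


def extract_text(payload: dict) -> str:
    """Collect the plain-text message contents from a chat-style payload."""
    messages = payload.get("messages", [])
    parts = [m["content"] for m in messages if isinstance(m.get("content"), str)]
    return "\n".join(parts)


def handle_request(
    raw_body: str,
    classify: Callable[[str], Dict[str, float]],
    threshold: float = 0.5,
) -> dict:
    """Classify one request and decide whether it should be forwarded.

    The scores are returned in both cases so they can be logged.
    """
    payload = json.loads(raw_body)
    scores = classify(extract_text(payload))
    if any(value >= threshold for value in scores.values()):
        return {
            "forward": False,
            "status": 400,  # hypothetical choice of status code
            "error": "Request rejected by content filter",
            "scores": scores,
        }
    return {"forward": True, "scores": scores}
```

Returning the scores alongside the decision is a design choice that keeps the classification results available for later analysis of the traffic.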

Track the Content Filter Statistics

One significant advantage of using an LLM Gateway such as Gecholog.ai is the ability to track performance metrics for various aspects of LLM API traffic. (Read more in LLM API Traffic Management: Mastering Integration with LLM DevOps and LLM Gateway.) In our custom content filter scenario, we can use the data generated by the gateway to examine how normal traffic scores on each content filter parameter. We can track and visualize the average values of the parameters, the number of API calls filtered out, and the distribution of parameter values. This information can be used to tune the filter, for example by identifying better threshold values and thereby improving the end-user experience.

Image: Chart Showing the Distribution of Toxicity Classification from LLM API Traffic Generated from the LLM Gateway
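The statistics mentioned above (average value, number of filtered calls, and distribution) can be computed from a list of logged scores for one parameter. This is a minimal sketch that assumes the scores have already been exported from the gateway logs as floats between 0 and 1; the function names and the ten-bin histogram are illustrative choices.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Sequence


def summarize(scores: Sequence[float], threshold: float = 0.5, bins: int = 10) -> dict:
    """Average, filtered count, and equal-width histogram for one parameter."""
    if not scores:
        raise ValueError("scores must not be empty")
    # Map each score to a bin index; a score of exactly 1.0 goes in the last bin.
    counts = Counter(min(int(s * bins), bins - 1) for s in scores)
    return {
        "average": mean(scores),
        "filtered": sum(s >= threshold for s in scores),
        "histogram": [counts.get(i, 0) for i in range(bins)],
    }


def filtered_share(scores: Sequence[float], thresholds: List[float]) -> Dict[float, float]:
    """Share of calls that each candidate threshold would filter out."""
    return {t: sum(s >= t for s in scores) / len(scores) for t in thresholds}
```

Comparing the filtered share across several candidate thresholds is one simple way to see how sensitive the filter is before changing the value in production.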

Conclusion

We have demonstrated that enterprises needing to deploy their own content filter, whether for data confidentiality, privacy, or other reasons, can do so with modest effort by using a local classifier and integrating an LLM Gateway into the architecture. Following these steps gives organizations more control over data privacy and confidentiality in their LLM operations. Combining a custom content filter with the LLM Gateway also allows the classifier and thresholds to be tailored to the organization, balancing rigorous filtering with operational efficiency.



Do You Want to Experience Easy Integration?

Explore the potential of custom content filters for your LLM's data confidentiality needs. Sign up for our LLM Gateway and start crafting a solution that aligns with your privacy requirements today!