Centralized Protection Against AI Usage Risks with a Network-based AI Firewall
By: Brett Helm
Intro
Generative AI solutions, such as ChatGPT, Google Gemini/Bard,
and Microsoft Bing AI/Copilot, have exploded in popularity. The adoption of
ChatGPT has scaled so quickly that users frequently find that the system is at
capacity, and they must wait for access.
AI solutions can improve efficiency, reduce errors, and
improve an organization's operations. There is, however, a dark side: these
tools also create risk for the enterprise. Employees may use AI vendors
without authorization and without enterprise
agreements in place to protect confidential data. Guardrails are needed to
protect against the obvious risk of confidential data being leaked. AI
solutions pose additional risks to the enterprise, including inbound malware,
prompt injections, and copyright violations.
Last year, Samsung banned the use of AI tools on company devices and on
personal devices connected to company networks after a data breach. Data was
leaked to ChatGPT in three separate incidents. In one case, an employee asked
ChatGPT to generate notes from a recorded meeting. Another employee asked the
chatbot to check source code from a sensitive database for errors. A third
employee input code into the chatbot and requested ChatGPT to provide code
optimizations.
Generative AI and Large Language Models
AI solutions
are being adopted by companies, both large and small. These tools can be used
to automate a wide range of repetitive tasks and increase efficiency. Use cases
for AI include:
- Software developers are using these tools for code
generation, documentation, and debugging.
- Marketing teams are
using them to generate presentations and social media
content.
- Managers are
using them for data analytics and report generation.
- Manufacturing companies are using them for predictive maintenance.
- Customer support organizations are using them to route emails to
the right groups, and even to provide automated responses.
ChatGPT and
its analogs are AI-based chatbots, built on top of a Large Language Model
(LLM). Large Language Models are a type of neural network designed to process
and generate data in sequences. In other words, they process text-based input
and generate responses based on the input they receive.
LLMs can
write poetry, summarize meeting notes, generate a PowerPoint presentation from
a text description, or analyze software programs to discover errors.
One of the
main reasons that AI tools such as ChatGPT and Google Bard are so powerful
is that they were trained on a massive data set. These systems scrape information from the
Internet and use that data to train the LLM.
The
diversity and depth of information available on the Internet is larger than
most people realize. LLMs have seen billions of samples of writing on almost
every conceivable topic. Hence, we can think of an LLM as being trained to
produce text that could reasonably be expected to appear on the Internet. This
covers everything from song lyrics to legal precedents, and from software code
to textbooks and homework assignments. This breadth and depth of input material
enables the LLMs to generate sophisticated and complete responses on almost any
topic.
Data breaches
Because AI tools draw on gathered information to produce useful
output, users must never forget that an AI system may retain whatever it is
given. Even proprietary information, once in the
system, may be used for training and could surface in responses to others.
The ChatGPT data policy states that data input into the
chatbot will be used to train its models, unless users explicitly opt out.
This is a
serious concern for both the company's confidential information and for
personally identifiable information (PII) managed by the company. Should this
information be leaked via an AI tool, a company could find itself in
violation of legislation including GDPR and the California Consumer Privacy Act
(CCPA).
Unless one
exercises care, it is very easy to unknowingly leak data with ChatGPT or one of
the other AI tools. Employees are experimenting with the capabilities of the
tool and can easily cut and paste data into the web interface without realizing
the risks.
Copying and
pasting data into an AI web interface is not the only avenue by which
organizations can accidentally leak data. AI tools also provide API access,
allowing companies to build applications to automate workflows that leverage
the power of these AI tools. Automation
is a powerful tool to improve worker productivity, but it removes human
control. Any data input into the automation process could be leaked through the
APIs and added to the training data for the LLMs, and there is no one watching
the data to ensure confidential information is not fed to the chatbot.
Enterprise Agreements are not a Silver Bullet
OpenAI
offers an enterprise license for ChatGPT that, among other things, ensures
protection of a company's data. With an enterprise license, companies
retain ownership and control of their business data. The system won't use
business data or conversations to train its LLM.
Enterprise
agreements offer a solution to concerns over data breaches from using AI
solutions, but only to the extent that they are used. A company must ensure
that all access to AI solutions uses the company's enterprise account. If an
individual uses a personal account, there is still a high risk of data leakage.
Types of AI Firewalls
Despite
being a relatively new category in the market, there are already multiple types
of AI firewalls.
- AI Prompt Firewall: A browser-based plug-in that manages and
controls an individual's access to AI tools. These tools only monitor access to
AI tools through an AI prompt accessed by a browser. They don't provide
any monitoring or control over API-based access to AI tools and are specific to
a single browser. A plug-in must be installed on each device within the
enterprise, and a separate version is required for every browser used within the
enterprise. There are also challenges with federation, centralized policy
management, and centralized control.
- AI Powered Firewall: Tools using AI/ML to create a better Next
Generation Firewall (NGFW). This is really a different solution altogether, as
it is focused on using AI to create a better firewall. It is not focused solely
on enforcing appropriate use of AI tools.
- Network-based AI Firewall: Much like traditional network
firewalls, these tools sit between the corporate network and the external
internet and enforce corporate communication policies. A network-based AI
firewall can be deployed behind the existing firewall to provide additional
protection specific to the use of AI tools.
A
network-based AI firewall is the only solution that provides full protection
against the misuse of AI tools across the entire enterprise, including
prompt-based and API-based access to AI solutions.
Network-based AI Firewall
A
network-based AI firewall has visibility into all of an enterprise's network
traffic being sent to AI solution providers. With this broad purview, the
firewall provides the following capabilities:
- Discovery of AI usage
- Reporting
of AI usage
- Lane classification
- Guardrails
- Quarantining
An AI firewall enforces company
policies regarding usage of AI tools.
Discovery of AI usage
Network-based
AI firewalls have visibility into all AI requests. This allows centralized
discovery across the entire enterprise, regardless of the AI vendor, or method
of access. Browser-based plug-ins don't provide centralized data collection and
only have visibility into prompt-based access to AI tools.
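As a rough illustration of the discovery step, the sketch below classifies a destination hostname as belonging to an AI vendor. It assumes the firewall can observe the hostname of each outbound connection (for example, from a proxy log). The domain list and the function name `classify_destination` are illustrative assumptions, not a description of any product's implementation.

```python
from typing import Optional

# Illustrative mapping of domains to AI vendors (an assumption for this
# sketch; a real deployment would need a larger, regularly updated list).
AI_VENDOR_DOMAINS = {
    "openai.com": "OpenAI",
    "chatgpt.com": "OpenAI",
    "gemini.google.com": "Google",
    "copilot.microsoft.com": "Microsoft",
}


def classify_destination(hostname: str) -> Optional[str]:
    """Return the AI vendor for a hostname, or None if it is not a known AI endpoint."""
    host = hostname.lower().rstrip(".")
    for domain, vendor in AI_VENDOR_DOMAINS.items():
        # Match the domain itself or any subdomain of it, but not
        # look-alike names such as "notopenai.com".
        if host == domain or host.endswith("." + domain):
            return vendor
    return None
```

Because the check runs on network traffic rather than in a browser, the same function would apply to both prompt-based and API-based access.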
Reporting of AI usage
AI usage data
collected by network-based AI firewalls enables analytics and reporting on an
organization's AI usage. This can be used for:
- auditing AI usage vs. billing from
AI vendors
- monitoring for inappropriate AI
usage or AI usage that does not follow company policies
- developing metrics on the adoption
of AI across company departments and functions
- measuring department performance
improvements based on AI usage
Enterprise-wide
reporting can only be achieved using a centralized-federated AI discovery
solution.
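The kind of adoption metric described above could be derived from centrally collected records along the following lines. This is a minimal sketch: the `UsageRecord` fields and the `usage_by_department` helper are hypothetical names chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class UsageRecord:
    """One observed AI request (hypothetical record layout)."""
    department: str
    vendor: str
    access_method: str  # "browser" or "api"


def usage_by_department(records: Iterable[UsageRecord]) -> Counter:
    """Count AI requests per (department, vendor) pair."""
    return Counter((r.department, r.vendor) for r in records)


records = [
    UsageRecord("Engineering", "OpenAI", "api"),
    UsageRecord("Engineering", "OpenAI", "browser"),
    UsageRecord("Marketing", "Google", "browser"),
]
report = usage_by_department(records)
```

The resulting counts could then be compared against vendor invoices or broken down further by access method.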
Lane classification
Many
enterprises will adopt a specific AI solution based upon a specific use case,
or "lane". For example, an enterprise may choose one AI company for the
"software development assistant" use case. A separate AI company may be
approved for content generation, image generation, or voice generation.
Lanes could
be broadly designed, or narrowly designed, depending on the needs of the
enterprise. For example, a large bank
may contract with Microsoft for a software development assistant to assist
developers using a scripting language. The marketing department may then
contract with OpenAI for content creation. Alternatively, a technology company
may employ multiple lanes for software development groups using different
programming languages.
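One way to picture lanes is as a lookup from use case to approved vendor, with each department assigned one or more lanes. The sketch below mirrors the bank example; the dictionary layout and the `is_in_lane` function are assumptions made for illustration.

```python
# Hypothetical lane policy: each lane names the one vendor approved for it.
LANE_POLICY = {
    "software-development": "Microsoft",
    "content-creation": "OpenAI",
}

# Hypothetical assignment of departments to the lanes they may use.
DEPARTMENT_LANES = {
    "Engineering": {"software-development"},
    "Marketing": {"content-creation"},
}


def is_in_lane(department: str, vendor: str) -> bool:
    """Return True if the vendor is approved for any lane assigned to the department."""
    lanes = DEPARTMENT_LANES.get(department, set())
    return any(LANE_POLICY.get(lane) == vendor for lane in lanes)
```

A narrower design would simply add more lanes, for example one per programming language, without changing the lookup logic.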
Guardrails
Guardrails
enforce AI lanes, ensuring compliance with corporate policies on AI
usage. Security guardrails can protect against the loss of private
data and against non-approved AI access. Combined with quarantining, they also provide
security capabilities.
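A very simple guardrail against private data loss could scan outbound prompts for sensitive patterns before they leave the network. The sketch below uses a handful of regular expressions; the patterns and the `find_violations` name are illustrative assumptions, and production data-loss detection is typically far more sophisticated.

```python
import re
from typing import List

# Illustrative patterns only; these are assumptions for the sketch.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "confidential_marker": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


def find_violations(prompt: str) -> List[str]:
    """Return the sorted names of all sensitive patterns found in the prompt."""
    return sorted(
        name for name, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(prompt)
    )
```

An empty result means the prompt passed this guardrail; a non-empty result names the reasons it did not.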
Quarantining
A quarantining
capability allows organizations to implement fine-grained protection against
unsafe scenarios such as data leakage or malicious code in responses from an AI
company.
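Conceptually, quarantining means holding a flagged request or response for review instead of forwarding it. The following is a minimal sketch under that assumption; the `QuarantineStore` class and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QuarantineStore:
    """Holds flagged messages for later review (hypothetical design)."""
    held: List[dict] = field(default_factory=list)

    def process(self, message: dict, violations: List[str]) -> str:
        """Forward clean messages; hold flagged ones along with the reasons."""
        if violations:
            self.held.append({"message": message, "reasons": list(violations)})
            return "quarantined"
        return "forwarded"


store = QuarantineStore()
first = store.process({"body": "Summarize this press release"}, [])
second = store.process({"body": "SSN 123-45-6789"}, ["us_ssn"])
```

The same mechanism could apply in either direction: to outbound prompts that leak data and to inbound responses that contain suspicious code.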
The Solution
Companies
must set policies on the use of AI solutions. These policies must balance the
benefits of using this new technology with the risk.
Once a
company policy has been set, employees must be trained on the policy and the
acceptable use of AI solutions. Ideally, the company will invest in an
enterprise license to enable broad usage of AI while managing its risks.
Finally,
companies must enforce the policies they have created. A network-based AI
firewall can be used with a customer's NGFW to make fine-grained AI policy
decisions. An AI firewall could be installed inline behind a traditional
firewall to block requests that don't comply with company policies. The
solution can also be installed alongside a traditional firewall or network
router via a SPAN port or transparent traffic ingestion. With this
configuration, the solution acts as a visibility and discovery tool, providing
detailed reporting on usage of AI tools.
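The difference between the two deployment options can be summarized as a decision on what happens to a non-compliant request. The sketch below is an illustration only; `DeploymentMode` and `decide_action` are hypothetical names, and the monitor mode assumes the firewall sees only a passive copy of the traffic.

```python
from enum import Enum


class DeploymentMode(Enum):
    INLINE = "inline"    # sits in the traffic path, so it can block
    MONITOR = "monitor"  # sees a passive copy of traffic, so it can only report


def decide_action(mode: DeploymentMode, compliant: bool) -> str:
    """Return what the firewall does with a request in the given mode."""
    if compliant:
        return "allow"
    if mode is DeploymentMode.INLINE:
        return "block"
    # In monitor mode the request has already passed; record it for reporting.
    return "report"
```

An organization might start in monitor mode to understand its AI usage, then move inline once its policies are settled.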
Utilization
of a network-based AI firewall provides several benefits over browser-based
plugins:
- The solution is easy to
install and centrally managed
- The solution provides a
single point of data collection for AI usage audits and privacy compliance
enforcement
- No changes are required
to user endpoints
- The solution manages all
access to AI tools, including access from any browser or via APIs for
cloud-based connections or on-prem solutions
- The solution protects
non-browser-based applications
- The solution supports both cloud and
legacy on-prem environments
A
network-based AI firewall will ensure that all access to AI solutions uses
the company's enterprise account. Attempts to access AI solutions that don't
use the enterprise account are either blocked or redirected to the enterprise
account.
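The block-or-redirect behavior described above might look like the following. This sketch assumes the firewall can determine which organization account a request is made under; the `ENTERPRISE_ORG_ID` value and the `enforce_enterprise_account` function are hypothetical.

```python
from typing import Optional

# Hypothetical identifier for the company's enterprise account.
ENTERPRISE_ORG_ID = "org-example-enterprise"


def enforce_enterprise_account(org_id: Optional[str], redirect_supported: bool) -> str:
    """Allow enterprise-account requests; redirect or block everything else."""
    if org_id == ENTERPRISE_ORG_ID:
        return "allow"
    # Personal or unidentified accounts never pass through unchanged.
    return "redirect" if redirect_supported else "block"
```

Whether a request is redirected or blocked would depend on company policy and on what the AI vendor supports.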
Conclusion
AI solutions
are transforming how businesses operate. ChatGPT alone has been adopted by over
80% of Fortune 500
companies. Companies
are still learning how to use AI, and adoption will continue to grow with time.
New use cases will be discovered, and companies will refine and optimize their
use of this technology. But the use of AI solutions is not without risk.
By default,
information provided to these solutions may be used to train their LLMs, allowing them to
continue to learn and adapt. Unless care is taken, a company's confidential and
private data could become part of the model and surface in responses to anyone who
queries the system.
Companies
need to develop policies on the use of LLM-based systems to ensure their
private data remains secure. They must
evaluate the risk, set policies, educate employees, and ensure the proper
controls are in place to enforce the policies they set. An AI firewall provides
the protection required to enforce these policies.
The use of
an enterprise licensed AI solution is not a silver bullet. An enterprise
license will ensure that confidential information is not added to an LLM, but
only if it is consistently used.
Companies must implement controls to ensure employees use only the
enterprise accounts. An AI firewall allows enterprises to take matters into
their own hands and control risks associated with AI usage.
About the Author
Brett Helm is the Co-Founder and CEO
of Glasswing, a provider of the industry's first automated AI discovery and AI
firewall platform. This solution is available now. Previously, Brett held CEO
roles at DB Networks, Coradiant, and iPivot, Inc., as well as senior management
roles at Intel.