Aegis | Preston Bourne

The Problem

Microsoft presented my team and me with the following prompt.

How might we make regulated industries more willing to adopt Large Language Models?

My project advisor, @James Codella, PhD, pointed out that sectors like finance and healthcare are slow to adopt Large Language Models due to concerns around data privacy and security. A single LLM hallucination could spell disaster in these industries, even for internal use cases.

The Solution

We built Aegis, a two-pronged service that would sit between the person writing the prompt and the underlying model. The API would quickly categorize each prompt and return a risk assessment. The consumer of the API would then be able to make a decision based on the risk assessment in the JSON response.
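The flow described above can be sketched roughly as follows. This is a minimal illustration rather than the actual Aegis client: the `assess_prompt` and `call_model` callables, the `RISK_THRESHOLD` value, and the stand-in functions at the bottom are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical cutoff; a real consumer would choose its own policy.
RISK_THRESHOLD = 0.7


def guarded_completion(
    prompt: str,
    assess_prompt: Callable[[str], dict],
    call_model: Callable[[str], str],
    threshold: float = RISK_THRESHOLD,
) -> dict:
    """Ask for a risk assessment first, and only forward the prompt
    to the model when the score is below the threshold."""
    assessment = assess_prompt(prompt)
    risk_score = assessment["risk_score"]

    if risk_score >= threshold:
        # The consumer decides what "too risky" means; here we simply block.
        return {"allowed": False, "risk_score": risk_score, "completion": None}

    return {
        "allowed": True,
        "risk_score": risk_score,
        "completion": call_model(prompt),
    }


# Stand-ins for the risk API and the underlying model (both hypothetical).
def fake_assess(prompt: str) -> dict:
    return {"risk_score": 0.9 if "ssn" in prompt.lower() else 0.1}


def fake_model(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    print(guarded_completion("What is the capital of the moon?", fake_assess, fake_model))
    print(guarded_completion("My SSN is 123-45-6789", fake_assess, fake_model))
```

Passing the assessor and the model in as callables keeps the decision logic independent of any particular HTTP client or model provider, which is one way a consumer might keep its own policy separate from the risk service.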

Detailed API Responses

An example response from the Aegis API looks like this. It gives an overall risk score and a list of categories that the prompt may fall under, with fine-grained details for each category.

{
  "status": "success",
  "data": {
    "prompt": "What is the capital of the moon?",
    "analysis": {
      "risk_score": 0.5,
      "categories": [
        {
          "type": "hate_speech",
          "confidence": 0.1
        },
        {
          "type": "sexual_content",
          "confidence": 0.1
        },
        {
          "type": "self_harm",
          "confidence": 0.1
        },
        {
          "type": "personal_data",
          "confidence": 0.1
        },
        // ...
      ]
    }
  },
  "metadata": {
    "request_id": "rand_uuid_here",
    "timestamp": "2024-01-01T00:00:00Z",
    "service_version": "0.0.1"
  }
}
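As a rough sketch of how a consumer might read a response of this shape, the snippet below parses a trimmed, valid-JSON copy of the example and pulls out the overall score plus any categories at or above a minimum confidence. The `summarize` function and the `min_confidence` parameter are hypothetical names chosen for illustration, not part of the Aegis API.

```python
import json

# A trimmed, valid-JSON copy of the example response above.
RAW_RESPONSE = """
{
  "status": "success",
  "data": {
    "prompt": "What is the capital of the moon?",
    "analysis": {
      "risk_score": 0.5,
      "categories": [
        {"type": "hate_speech", "confidence": 0.1},
        {"type": "personal_data", "confidence": 0.1}
      ]
    }
  },
  "metadata": {"request_id": "rand_uuid_here"}
}
"""


def summarize(raw: str, min_confidence: float = 0.5) -> dict:
    """Extract the overall score and the categories whose confidence
    meets a (hypothetical) minimum."""
    body = json.loads(raw)
    if body.get("status") != "success":
        raise ValueError(f"unexpected status: {body.get('status')!r}")

    analysis = body["data"]["analysis"]
    flagged = [
        category["type"]
        for category in analysis["categories"]
        if category["confidence"] >= min_confidence
    ]
    return {
        "request_id": body["metadata"]["request_id"],
        "risk_score": analysis["risk_score"],
        "flagged": flagged,
    }


if __name__ == "__main__":
    print(summarize(RAW_RESPONSE))
    print(summarize(RAW_RESPONSE, min_confidence=0.1))
```

Keeping the `request_id` alongside the result is a small design choice that would let a consumer correlate its own decision logs with the service's records.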

A Powerful Dashboard for LLM Management

The Process

A big part of the course, and of Cornell Tech's ethos, is that these projects are evaluated primarily on their merit as a product rather than as engineering achievements. This put the team and me in a unique position to explore and research different approaches to the problem, iterate as needed, and make data-driven decisions.

Research

As any good business should, we started by doing research. We looked at existing solutions, identified a few gaps in the market, and tried to measure the demand for products of this sort.

The desk research came first. From it we were able to map out the standard industry practices for managing LLM safety.

With a better understanding of the problem space, we moved on to another phase of research, this time talking to experts in the field. By speaking with people who had experience working with regulated industries, alongside employees at AI safety companies, we were able to get a rough understanding of our user and customer personas.

User vs Customer

The user is the person who will be working with the Aegis software on a day-to-day basis. The customer is the organization that will be purchasing Aegis to manage its LLM usage.

User Profile:

Customer Profile:

Design

System Architecture

With a better understanding of our users and customers, we were able to design a system that would meet their needs.

User Journey

I was then able to map out the user's journey through the Aegis dashboard.

Design Iterations

I was then able to map out the information architecture of the dashboard. Here are some early iterations that were vetted with users, customers, and industry experts.

Conclusion & Learnings

Aegis represents a forward-thinking approach to LLM management, addressing the specific needs of regulated industries by offering a secure, compliant, and user-friendly solution. Through rigorous research, user-centered design, and iterative development, Aegis stands as a viable product concept ready to meet the challenges of LLM adoption in sensitive sectors.

This project was a great learning experience for me. I was able to wear many hats and implement my own designs. Collectively, the tech industry is still learning how best to design for LLM usage and for these new AI applications. It was exciting to be a part of that journey.