Closed Beta Preview: Prompt Validation for LLMs
As we get closer to our early fall beta release for large language model (LLM) support, we want to share some details about what lies ahead. If you are curious to learn more about our upcoming LLM functionality, read on.
ValidMind assists developers, data scientists, and risk & compliance stakeholders in identifying risks in various models, including AI and machine learning models. In the closed beta, our LLM testing and documentation generation features will be available for you to try alongside our existing products, such as the AI governance and risk management platform. More details can be found in our original announcement.
Closed beta functionality
Our development team has been hard at work putting the finishing touches on our LLM support, so here is a preview of one of the things you will be able to try out yourself: prompt validation for an LLM-based sentiment analysis model, including any bias that may be present in the model.
A great developer experience
As with all our features, we want to make it easy for you to try out our products. Think of this as code samples anyone can run with minimal fuss. To get there, we share our samples as notebooks that you can run in JupyterLab, Google Colab, or in your own local developer environment.
For example: To evaluate the prompts given to LLMs as part of managing the model risk lifecycle, a single call to the ValidMind Developer Framework will run the right test suite:
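To illustrate the pattern only, here is a minimal, self-contained sketch of what "one call runs the right test suite" could look like under the hood: a registry that maps a suite name to its tests, and a runner that looks the suite up and executes each test against a prompt. The names (`run_test_suite`, `register_suite`) and the toy checks are hypothetical stand-ins for illustration, not the actual ValidMind Developer Framework API:

```python
# Hypothetical sketch of a name-based test suite runner (NOT the ValidMind API).
# Pattern: register suites by name, then resolve and run a suite in one call.
from dataclasses import dataclass
from typing import Callable

@dataclass
class TestResult:
    name: str
    passed: bool

# Registry mapping a suite name to its list of test functions.
_SUITE_REGISTRY: dict[str, list[Callable[[str], TestResult]]] = {}

def register_suite(name: str, tests: list[Callable[[str], TestResult]]) -> None:
    _SUITE_REGISTRY[name] = tests

def run_test_suite(name: str, prompt: str) -> list[TestResult]:
    """Find the suite by name and run every test in it against the prompt."""
    return [test(prompt) for test in _SUITE_REGISTRY[name]]

# Toy prompt-validation checks (illustrative heuristics, not ValidMind's tests):
def bias_test(prompt: str) -> TestResult:
    loaded_words = {"always", "never", "obviously"}
    words = prompt.lower().split()
    return TestResult("Bias", not any(w in loaded_words for w in words))

def conciseness_test(prompt: str) -> TestResult:
    return TestResult("Conciseness", len(prompt.split()) <= 50)

register_suite("prompt_validation", [bias_test, conciseness_test])

results = run_test_suite(
    "prompt_validation",
    "Classify the sentiment of the review as positive or negative.",
)
for r in results:
    print(f"{r.name}: {'PASS' if r.passed else 'FAIL'}")
```

In this sketch, adding a custom test is just a matter of appending another function to the registered suite, which mirrors the extensibility described below.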
This line automatically finds the correct test suite based on the name, initializes the test suite, and runs it. It includes tests for a number of different aspects of prompt validation, including bias, clarity, conciseness, and more. Dozens of different tests are run to detect potential issues with the prompt, and developers can add their own tests too.
Even better in the ValidMind UI
You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is an even better way: view the prompt validation test results as part of your model documentation in the ValidMind UI:
Here, most tests pass, but the test for conciseness needs further attention, as it fails the threshold. This test is designed to evaluate the brevity and succinctness of prompts provided to a large language model (LLM).
This test matters because a concise prompt strikes a balance between offering clear instructions and eliminating redundant or unnecessary information, ensuring that the LLM receives relevant input without being overwhelmed. Clearly, if a prompt for an LLM fails a conciseness test, you'd want to know about it.
That’s all we have time for today. If this functionality interests you and you would like to try it out yourself, we invite you to join our beta waitlist!
Be part of the journey
Sign up to take the first step in exploring all that ValidMind has to offer. Coming to you live in the Fall of 2023!