OC3 registrations are now open! Join the premier event for confidential computing online or in Berlin on March 27.
LocalAI + Constellation
LocalAI is a popular open source OpenAI alternative, compatible with OpenAI API specifications. Notably, it supports different types of open source AI models like Llama and Mistral, without the need for a GPU. It can be used for developing new AI-enabled applications using the OpenAI programming libraries or extending virtually any app which already integrates with it.
To accommodate production-scale applications, LocalAI can be deployed using scale-out replicas orchestrated through Docker or Kubernetes.
Confidential computing is a technology that encrypts data in use, guarding against infrastructure threats in cloud applications. Together with encryption at rest and in transit it ensures data confidentiality, even against privileged individuals, and most importantly you can verify this remotely, for enhanced security assurance. For a deeper dive into confidential computing, read our whitepaper.
What is confidential AI? Simply put, it's the concept of employing confidential computing technology for verifiable protection of data throughout the AI lifecycle, including when the data and models are in use. Explore more in our blog post on "How confidential computing and AI fit together".
Constellation is a Kubernetes distribution that is designed for confidential computing. Any application that runs inside a Constellation cluster is runtime-encrypted and shielded from the infrastructure.
With Constellation, you can run LocalAI on the public cloud with the assurance that your inference data stays always encrypted and is inaccessible by the cloud provider or any attackers or third parties coming through the infrastructure.
This approach enables working with LocalAI or similar language model-based chatbot, ensuring data privacy and security, and facilitating the implementation of very large language models that would surpass the capacity of a typical local setup.
Leveraging confidential computing with LocalAI on Constellation enables running in scalable, cost-efficient cloud environments with minimal maintenance, ensuring data privacy, and empowering organizations with efficient and secure language model processing tools.
In the screencast on the right you can see us asking a question to an instance of LocalAI running at localai-testing.edgeless.systems. The model in use is ggml-gpt4all-j.
LocalAI can be installed inside Kubernetes with a Helm chart.
You will require:
Prerequisites and overview
In order to run LocalAI on Constellation, you will need:
The process is composed of three key steps:
For the sake of clarity, we have written the instructions below as someone using Azure with a GoDaddy registrar, however, this tutorial can be completed with any of the major cloud providers and a registrar of your choice. If you choose a different registrar, you will have to adapt the external-dns helm chart accordingly.
First, download and install the Constellation CLI.
Next, create the Constellation cluster. The process is described in detail in the Constellation docs.
constellation config generate azure constellation iam create azure --region=westus --resourceGroup=constellTest --servicePrincipal=spTest --update-config constellation create -y constellation apply export KUBECONFIG="$PWD/constellation-admin.conf"
You can now use the kubeconfig to query the cluster, e.g. with kubectl. The config ensures that the connection is confidential and terminates inside the correct cluster.
In the case of our example setup (Azure with GoDaddy) we've provided a Helm chart that installs and configures external-dns and ingress-nginx in the freshly created cluster.
To use the helm chart, you need to make a couple basic edits after cloning the repo:
With your credentials in place, you can go ahead and run the necessary helm commands.
Run:
./install --install-infra -ns localai --hostname localai.your.domain
Your confidential LocalAI setup is now in place. When the process has completed you can go to your.domain and start firing queries to your 100% encrypted LocalAI.
Dive deeper into the Constellation documentation or read about how Constellation is being used to protect journalists.