Continuum AI
Continuum protects user prompts from the AI provider and prevents model weights from being leaked.
Continuum is a framework for deploying LLMs and other AI models. It enables the creation of ChatGPT-style services in which both user prompts and model weights are shielded throughout.
With Continuum, the infrastructure and the service provider can never access user prompts and model weights in plaintext.
Continuum integrates with well-known inference servers such as NVIDIA Triton, vLLM, and Hugging Face TGI.
Continuum will be released as open source in H2/2024.
Protect your weights while deploying to untrusted environments.
Provide model weights as encrypted files. Only attested workloads gain access to decrypted weights.
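The flow described above can be sketched as attestation-gated key release: the decryption key for the weights is handed out only to a workload whose attested measurement matches an expected value. The sketch below is illustrative, not Continuum's actual implementation; the XOR keystream stands in for a real AEAD cipher such as AES-GCM, and the measurement values are made up.

```python
import hashlib
import hmac
import os

def keystream(key, n):
    # Derive a pseudo-random keystream from the key. Illustrative only;
    # a real deployment would use an authenticated cipher like AES-GCM.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(key, data):
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

decrypt = encrypt  # XOR is its own inverse

# Hypothetical expected measurement of the trusted inference workload.
EXPECTED_MEASUREMENT = hashlib.sha256(b"trusted-inference-workload").hexdigest()

def release_key(attestation_report, key):
    # The key-release service hands out the weight-decryption key only if
    # the workload's attested measurement matches the expected value.
    if hmac.compare_digest(attestation_report["measurement"], EXPECTED_MEASUREMENT):
        return key
    return None

# The model owner encrypts the weights before shipping them.
key = os.urandom(32)
weights = b"stand-in for the model weight file"
ciphertext = encrypt(key, weights)

# An attested workload gets the key; a tampered one does not.
good = {"measurement": hashlib.sha256(b"trusted-inference-workload").hexdigest()}
bad = {"measurement": hashlib.sha256(b"tampered-workload").hexdigest()}

assert release_key(bad, key) is None
assert decrypt(release_key(good, key), ciphertext) == weights
```

A real attestation report would additionally carry a signature chained to the hardware vendor's root certificate, which the key-release service verifies before comparing measurements.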
By deploying LLMs with Continuum, user requests and responses remain inaccessible to both the inference service provider and the underlying infrastructure.
With Continuum, you can offer best-in-class privacy guarantees to users of your platform.
The video shows how the user experience can be augmented with the ability to verify that a service is running on Continuum.
Continuum comes with different SDKs that allow you to integrate verification in a way that best fits your application.
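The SDK interfaces are not yet public; the sketch below uses hypothetical names to illustrate what client-side verification might look like: the client obtains an attestation report from the service and compares the attested workload measurement against a published reference value.

```python
import hashlib
import hmac

# Hypothetical reference measurement published by the service operator.
REFERENCE_MEASUREMENT = hashlib.sha256(b"continuum-workload-v1").hexdigest()

def verify_attestation(report):
    """Return True only if the attested workload matches the reference.

    A real verifier would additionally validate the report's signature
    against the hardware vendor's certificate chain; that step is
    elided here for brevity.
    """
    return hmac.compare_digest(report["measurement"], REFERENCE_MEASUREMENT)

# Report as it might be returned by the service during connection setup.
report_from_service = {
    "measurement": hashlib.sha256(b"continuum-workload-v1").hexdigest(),
}

assert verify_attestation(report_from_service)
assert not verify_attestation({"measurement": "deadbeef"})
```

An application would typically run this check once when establishing a session and refuse to send prompts if verification fails.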
Use TLS encryption
TLS only protects data in transit: the service provider still has plaintext access to prompts and weights on its VMs. Malicious or defective workloads can leak prompts and models, e.g., by sending them over the network.
Use disk encryption
When encrypting disks with, e.g., LUKS, the service provider holds the disk decryption key. Malicious or defective workloads can still leak model weights, e.g., by sending them over the network.
Run on premise
When running on premise, user prompts and weights remain accessible to the machine operator and the service operator.
Leave your email or send us your questions.