Kubernetes
If you want to use pgvecto.rs
in PostgreSQL which is running in Kubernetes, we provide a tutorial to help you. In this tutorial, we will use CloudNative-PG to deploy PostgreSQL cluster in Kubernetes.
Requirements
Name | Version |
---|---|
helm | >= 2.4.1 |
kubectl | >= 1.14 |
kubernetes | >= 1.24.0 |
Install CloudNative-PG
We use CloudNative-PG (cnpg) to deploy PostgreSQL cluster in Kubernetes. You can install it by helm.
$ sudo helm repo add cnpg https://cloudnative-pg.github.io/charts
$ sudo helm upgrade --install cnpg --namespace cnpg-system --create-namespace cnpg/cloudnative-pg
$ helm list --all-namespaces
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
cnpg cnpg-system 1 2024-01-02 08:17:11.661297545 +0000 UTC deployed cloudnative-pg-0.20.0 1.22.0
$ kubectl get pod -n cnpg-system
NAME READY STATUS RESTARTS AGE
cnpg-cloudnative-pg-67ff4f5f66-94m8d 1/1 Running 0 1m
Build PostgreSQL image with pgvecto.rs
We use the official CloudNative-PG docker image as the base image, and install pgvecto.rs
in it. And you can also find pgvecto.rs
version from releases.
# syntax=docker/dockerfile-upstream:master
ARG CNPG_TAG
FROM ghcr.io/cloudnative-pg/postgresql:$CNPG_TAG
ARG CNPG_TAG
ARG PGVECTORS_TAG
ARG TARGETARCH
# drop to root to install packages
USER root
# install pgvecto.rs
ADD https://github.com/tensorchord/pgvecto.rs/releases/download/$PGVECTORS_TAG/vectors-pg${CNPG_TAG%.*}_${PGVECTORS_TAG#"v"}_$TARGETARCH.deb ./pgvectors.deb
RUN apt install ./pgvectors.deb
USER postgres
$ docker build -t <your-postgresql-image-repo>:15 --build-arg CNPG_TAG=15 --build-arg PGVECTORS_TAG=v0.1.13 --build-arg TARGETARCH=amd64 .
Notice: PostgreSQL docker image tag must start with CNPG_TAG for cloudnative-pg to recognize the postgres version.
Create PostgreSQL cluster
We provide a sample yaml file to create a PostgreSQL cluster in Kubernetes. You can modify it according to your needs. You need pay attention to the following points:
- Set
shared_preload_libraries
to loadpgvecto.rs
shared library. You can also setpostInitApplicationSQL
ALTER SYSTEM SET shared_preload_libraries = "vectors.so";
to loadpgvecto.rs
shared library. - Execute
CREATE EXTENSION IF NOT EXISTS "vectors";
to createvectors
extension.
apiVersion: v1
kind: Secret
metadata:
name: tensorchord
type: kubernetes.io/basic-auth
data:
password: dGVuc29yY2hvcmQ= # tensorchord
username: dGVuc29yY2hvcmQ= # tensorchord
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: pgvectors
spec:
instances: 1
bootstrap:
initdb:
database: tensorchord
postInitApplicationSQL:
- CREATE EXTENSION IF NOT EXISTS "vectors";
owner: tensorchord
secret:
name: tensorchord
dataChecksums: true
encoding: 'UTF8'
storage:
size: 1Gi
imageName: <your-postgresql-image-repo>:<pg-tag>
postgresql:
shared_preload_libraries: # load pgvecto.rs shared library
- "vectors.so"
You can install cnpg
kubectl plugin to manage your PostgreSQL cluster. Now we can check the status of the cluster.
$ sudo kubectl get pod
pgvectors-1 1/1 Running 0 3m54s
$ sudo kubectl cnpg status pgvectors
$ kubectl cnpg status pgvectors
Cluster Summary
Name: pgvectors
Namespace: default
System ID: 7323107530735296538
PostgreSQL Image: docker.io/xieydd/pgvectors:15
Primary instance: pgvectors-1
Primary start time: 2024-01-12 07:17:34 +0000 UTC (uptime 17m33s)
Status: Cluster in healthy state
Instances: 1
Ready instances: 1
Current Write LSN: 0/4000000 (Timeline: 1 - WAL File: 000000010000000000000003)
Certificates Status
Certificate Name Expiration Date Days Left Until Expiration
---------------- --------------- --------------------------
pgvectors-ca 2024-04-11 07:12:00 +0000 UTC 89.98
pgvectors-replication 2024-04-11 07:12:00 +0000 UTC 89.98
pgvectors-server 2024-04-11 07:12:00 +0000 UTC 89.98
Continuous Backup status
Not configured
Physical backups
No running physical backups found
Streaming Replication status
Not configured
Unmanaged Replication Slot Status
No unmanaged replication slots found
Managed roles status
No roles managed
Tablespaces status
No managed tablespaces
Instances status
Name Database Size Current LSN Replication role Status QoS Manager Version Node
---- ------------- ----------- ---------------- ------ --- --------------- ----
pgvectors-1 29 MB 0/4000000 Primary OK BestEffort 1.22.0 kind-control-plane
Notice: This article just provides a minimal configuration, not include logical replication, streaming logical replication, backup and restore, etc. We will provide those in another article. You can also refer to CloudNative-PG for more
Cluster
configuration.
Connect to PostgreSQL cluster
You can use kubectl port-forward
to connect to PostgreSQL cluster.
$ kubectl port-forward services/pgvectors-rw 5432:5432
$ psql -h 127.0.0.1 -d tensorchord -U tensorchord
tensorchord=> \dx
Name | Version | Schema | Description
---------+---------+------------+----------------------------------------------------------------------------------------------
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
vectors | 0.1.13 | public | vectors: Vector database plugin for Postgres, written in Rust, specifically designed for LLM
The vectors
extension is installed successfully. You can try quick-start without installition.