Chore: setup postgres and vector in dockerfile #677

Open: wants to merge 1 commit into base: main
43 changes: 21 additions & 22 deletions Dockerfile
@@ -22,35 +22,33 @@ RUN --mount=type=cache,id=pnpm,target=/root/.local/share/pnpm/store \
UV_LINK_MODE=copy BIN_DIR=/bin make package-tools

FROM cgr.dev/chainguard/wolfi-base AS final

# Set up PostgreSQL and pgvector. Production deployments don't really need the standalone PostgreSQL.
RUN apk add postgresql build-base git postgresql-dev
RUN git clone --branch v0.8.0 https://github.com/pgvector/pgvector.git && \
Contributor

Darren mentioned this in passing when we discussed whether this was feasible; I should have passed it along sooner. He said you could just get it via FROM, like

FROM pgvector/pgvector:pg17

or similar. I think that would be a little cleaner and feel more official.

Contributor Author

Yep, I took this detour to avoid dropping the wolfi-base image we already use.

cd pgvector && \
make clean && \
make OPTFLAGS="" && \
make install && \
cd .. && \
rm -rf pgvector

RUN adduser -D postgres && mkdir -p /var/lib/postgresql/data && chown -R postgres:postgres /var/lib/postgresql
RUN su postgres -c "initdb --encoding=UTF8 -D /var/lib/postgresql/data"
RUN su - postgres -c "pg_ctl -D /var/lib/postgresql/data start" && \
su - postgres -c "psql --command \"CREATE USER otto8 WITH SUPERUSER;\"" && \
su - postgres -c "psql --command \"CREATE DATABASE otto8 OWNER otto8;\"" && \
su - postgres -c "psql -d otto8 --command \"CREATE EXTENSION vector;\"" || true && \
su - postgres -c "pg_ctl -D /var/lib/postgresql/data stop" || true

RUN apk add --no-cache git python-3.13 py3.13-pip openssh-server npm bash tini procps libreoffice
COPY --chmod=0755 /tools/package-chrome.sh /
RUN /package-chrome.sh && rm /package-chrome.sh
RUN sed -E 's/^#(PermitRootLogin)no/\1yes/' /etc/ssh/sshd_config -i
RUN ssh-keygen -A
RUN mkdir /run/sshd && /usr/sbin/sshd
COPY encryption.yaml /
COPY --chmod=0755 <<EOF /bin/run.sh
#!/bin/bash
set -e
if [ "\$OPENAI_API_KEY" = "" ]; then
echo OPENAI_API_KEY env is required to be set
exit 1
fi
mkdir -p /run/sshd
/usr/sbin/sshd -D &
mkdir -p /data/cache
# This is YAML
export OTTO8_SERVER_VERSIONS="$(cat <<VERSIONS
"github.com/otto8-ai/tools": "$(cd /otto8-tools && git rev-parse HEAD)"
"github.com/gptscript-ai/workspace-provider": "$(cd /otto8-tools/workspace-provider && git rev-parse HEAD)"
"github.com/gptscript-ai/datasets": "$(cd /otto8-tools/datasets && git rev-parse HEAD)"
"github.com/kubernetes-sigs/aws-encryption-provider": "$(cd /otto8-tools/aws-encryption-provider && git rev-parse HEAD)"
# double echo to remove trailing whitespace
"chrome": "$(echo $(/opt/google/chrome/chrome --version))"
VERSIONS
)"
exec tini -- otto8 server
EOF
COPY --chmod=0755 run.sh /bin/run.sh

COPY --link --from=tools /app/otto8-tools /otto8-tools
COPY --from=bin /app/bin/otto8 /bin/
@@ -69,4 +67,5 @@ ENV GOMEMLIMIT=1GiB
ENV TERM=vt100
WORKDIR /data
VOLUME /data
VOLUME /var/lib/postgresql/data
CMD ["run.sh"]
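
Note on the database bootstrap RUN step in the Dockerfile diff above: it chains its commands with `&&` and `|| true`. The shell gives `&&` and `||` equal precedence and evaluates them left to right, so a failure in any setup command (for example CREATE EXTENSION) is swallowed and the image still builds. The sketch below reproduces that shape with stand-in functions. All function names here are hypothetical, and `bootstrap_strict` is only one possible alternative, not something the PR contains.

```shell
#!/bin/sh
# Stand-ins for the real commands; create_extension fails on purpose.
start_db()         { echo "start"; }
create_extension() { echo "create extension (fails)"; return 1; }
stop_db()          { echo "stop"; }

# Same shape as the RUN step: A && B || true && C || true
# The shell groups it as: (((A && B) || true) && C) || true
# so the failure of B is masked and the function returns 0.
bootstrap_masked() {
  start_db && create_extension || true && stop_db || true
}

# Hypothetical alternative: record the setup status, always stop the
# server, then return the recorded status so a failed setup is visible.
bootstrap_strict() {
  status=0
  { start_db && create_extension; } || status=$?
  stop_db || true
  return "$status"
}
```

Both variants still run the stop step; only `bootstrap_strict` reports the setup failure to the caller, which in a Dockerfile would fail the build instead of producing an image without the extension.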
46 changes: 46 additions & 0 deletions run.sh
@@ -0,0 +1,46 @@
#!/bin/bash
set -e

check_postgres_active() {
for i in {1..30}; do
if pg_isready -q; then
echo "PostgreSQL is active and ready!"
return 0
fi
echo "Waiting for PostgreSQL to become active... ($i/30)"
sleep 2
done
echo "PostgreSQL did not become active in time."
exit 1
}

if [ "$OPENAI_API_KEY" = "" ]; then
echo OPENAI_API_KEY env is required to be set
exit 1
fi
mkdir -p /run/sshd
/usr/sbin/sshd -D &
mkdir -p /data/cache
# This is YAML
export OTTO8_SERVER_VERSIONS="$(cat <<VERSIONS
"github.com/otto8-ai/tools": "$(cd /otto8-tools && git rev-parse HEAD)"
"github.com/gptscript-ai/workspace-provider": "$(cd /otto8-tools/workspace-provider && git rev-parse HEAD)"
"github.com/gptscript-ai/datasets": "$(cd /otto8-tools/datasets && git rev-parse HEAD)"
"github.com/kubernetes-sigs/aws-encryption-provider": "$(cd /otto8-tools/aws-encryption-provider && git rev-parse HEAD)"
# double echo to remove trailing whitespace
"chrome": "$(echo $(/opt/google/chrome/chrome --version))"
VERSIONS
)"

if [ -z "$OTTO8_SERVER_DSN" ]; then
echo "OTTO8_SERVER_DSN is not set. Starting PostgreSQL process..."

# Start PostgreSQL in the background
echo "Starting PostgreSQL server..."
su - postgres -c "postgres -D /var/lib/postgresql/data" &

check_postgres_active
export OTTO8_SERVER_DSN="postgresql://otto8@localhost:5432/otto8"
fi

exec tini -- otto8 server
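
Note on the readiness loop in run.sh: `check_postgres_active` hard-codes the probe, the attempt count, and the delay. A minimal sketch of the same retry pattern as a reusable helper follows. The name `wait_for` and its argument order are assumptions made for illustration; they are not part of the PR. It returns a status instead of calling `exit`, so the caller decides what a timeout means.

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made.
# Returns 0 on success and 1 on timeout.
wait_for() {
  _wf_attempts=$1
  _wf_delay=$2
  shift 2
  _wf_i=1
  while [ "$_wf_i" -le "$_wf_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Progress goes to stderr so it does not mix with command output.
    echo "Waiting for '$*' to succeed... ($_wf_i/$_wf_attempts)" >&2
    sleep "$_wf_delay"
    _wf_i=$((_wf_i + 1))
  done
  return 1
}

# Possible use in run.sh (needs pg_isready on PATH, so it is left commented out):
#   wait_for 30 2 pg_isready -q || { echo "PostgreSQL did not become active in time."; exit 1; }
```

Passing the attempt count as an argument keeps the progress message and the loop bound in sync, which is the mismatch the original `($i/10)` message had against a 30-iteration loop.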