feat(key-provider): single-container image with systemd lifecycle #623

Open
Leechael wants to merge 1 commit into master from feat/key-provider-single-container
@Leechael (Collaborator)

Summary

  • Consolidate aesmd + gramine-sealing-key-provider into a single container, eliminating cross-container startup ordering issues
  • Add systemd service unit (dstack-key-provider.service) using docker run --rm instead of docker-compose
  • Move sgx_default_qcnl.conf to /etc/dstack-key-provider/, removing dependency on external disk mount (/opt/dstack)
  • Add GitHub Actions workflow to build and push image to ghcr.io/dstack-tee/dstack/gramine-sealing-key-provider

Background

After a reboot following a power outage, the Docker daemon starts before /opt/dstack (an external disk) is mounted. The old docker-compose deployment bind-mounted its config from that disk, so aesmd failed (exit 127), which caused key-provider to fail, which in turn took down the VMM.

Changes

| File | Description |
| --- | --- |
| key-provider-build/Dockerfile | Multi-stage build: stage 1 compiles key-provider, stage 2 is the runtime with aesmd + binary |
| key-provider-build/entrypoint.sh | Starts aesmd in the background, waits for its socket, then execs gramine-sgx for proper signal handling |
| key-provider-build/dstack-key-provider.service | systemd unit with Restart=on-failure, references the ghcr.io image |
| .github/workflows/key-provider-release.yml | Builds and pushes to ghcr.io on key-provider-v* tags |
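
For readers unfamiliar with the start/wait/exec pattern described for entrypoint.sh, here is a minimal sketch of how such a script could be structured. It is not the script from this PR: the aesmd binary path, the socket path, the `--no-daemon` flag, the 30-second timeout, and the Gramine application name are all assumptions, and `wait_for_path` and `main` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical sketch of a start/wait/exec entrypoint; not the PR's actual script.
# All defaults below are assumptions and can be overridden via the environment.
AESM_BIN="${AESM_BIN:-/opt/intel/sgx-aesm-service/aesm/aesm_service}"
AESM_SOCKET="${AESM_SOCKET:-/var/run/aesmd/aesm.socket}"
GRAMINE_APP="${GRAMINE_APP:-gramine-sealing-key-provider}"

# Poll once per second until the given path exists.
# $1 = path to wait for, $2 = maximum number of attempts (default 30).
wait_for_path() {
    path="$1"
    attempts="${2:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ -e "$path" ]; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}

main() {
    # Start aesmd in the background (assumed flag keeps it in the foreground of its own process).
    "$AESM_BIN" --no-daemon &

    # Do not start the enclave until the aesmd socket is present.
    if ! wait_for_path "$AESM_SOCKET" 30; then
        echo "aesmd socket did not appear at $AESM_SOCKET" >&2
        exit 1
    fi

    # exec replaces the shell, so gramine-sgx receives container signals directly.
    exec gramine-sgx "$GRAMINE_APP"
}
```

A real entrypoint would end with a call such as `main "$@"`. Because of the `exec`, the signal sent by `docker stop` reaches gramine-sgx itself rather than an intermediate shell, which is the "proper signal handling" the table refers to.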

Deployment

# Config file (one-time)
sudo mkdir -p /etc/dstack-key-provider
sudo cp sgx_default_qcnl.conf /etc/dstack-key-provider/

# Service (one-time)
sudo cp dstack-key-provider.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now dstack-key-provider.service
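
As an illustration of what the unit copied above might contain, the following sketch writes a candidate unit file to a path of your choosing. Only `docker run --rm`, `Restart=on-failure`, the ghcr.io image name, the /etc/dstack-key-provider/ config location, and port 3443 come from this PR; the container name, image tag, SGX device nodes, mount target, port binding, and `RestartSec` value are assumptions, and `write_unit` is a hypothetical helper.

```shell
# Hypothetical sketch: write a candidate dstack-key-provider.service to the path in $1.
# Values marked "assumed" are illustrative and do not come from the PR.
write_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=dstack key provider (aesmd + gramine-sealing-key-provider)
After=docker.service
Requires=docker.service

[Service]
# Assumed: remove a stale container left over from an unclean shutdown.
ExecStartPre=-/usr/bin/docker rm -f dstack-key-provider
# Assumed: device nodes, mount target, port binding, and image tag.
ExecStart=/usr/bin/docker run --rm --name dstack-key-provider \
    --device /dev/sgx_enclave --device /dev/sgx_provision \
    -v /etc/dstack-key-provider/sgx_default_qcnl.conf:/etc/sgx_default_qcnl.conf:ro \
    -p 127.0.0.1:3443:3443 \
    ghcr.io/dstack-tee/dstack/gramine-sealing-key-provider:latest
ExecStop=/usr/bin/docker stop dstack-key-provider
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}
```

For example, `write_unit ./dstack-key-provider.service` produces a file that can be reviewed before it is copied into /etc/systemd/system/. Since the config is read from /etc rather than /opt/dstack, the unit has no ordering dependency on the external disk mount.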

Release

git tag key-provider-v0.1.0
git push origin key-provider-v0.1.0
# GitHub Actions builds and pushes to ghcr.io automatically

Test plan

  • Build image locally: docker build -t test-key-provider key-provider-build/
  • Run on SGX machine and verify port 3443 is listening
  • Tag and push, verify GitHub Actions builds and pushes to ghcr.io
  • Deploy systemd service, verify systemctl status dstack-key-provider shows active
  • Reboot test: verify service auto-starts after reboot
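
The port and service checks in the list above could be scripted roughly as follows. This is a sketch only: `port_in_listing` is a hypothetical helper that scans `ss -ltn` style output on stdin, and it assumes the local address and port appear in the fourth column.

```shell
# Hypothetical helper: succeed if the `ss -ltn` style listing on stdin
# contains a listener whose local port equals $1.
port_in_listing() {
    awk -v port="$1" '
        NR > 1 {                      # skip the header row
            n = split($4, parts, ":") # local address:port, port is the last field
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}
```

On the SGX host this might be used as `ss -ltn | port_in_listing 3443 && systemctl is-active --quiet dstack-key-provider.service`, which exits non-zero if either the listener or the service is missing.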

…agement

Consolidate aesmd and gramine-sealing-key-provider into a single container
to eliminate cross-container startup ordering issues that caused cascading
failures after power outage reboots. Move config to /etc so it no longer
depends on external disk mounts.