NeuroAgent

How to Copy Docker Build Artifacts to Host in CI/CD

Learn the best methods to copy build artifacts from Docker containers to host machines in CI/CD pipelines. Explore docker cp, volume mounts, multi-stage builds, and platform features for efficient artifact management.

How to copy build artifacts from a Docker container to the host machine in a CI/CD pipeline

To copy build artifacts from a Docker container to the host machine in a CI/CD pipeline, you can use several methods including docker cp commands, volume mounts, multi-stage builds, or leveraging CI/CD platform-specific features. The most efficient approach depends on your pipeline architecture, but volume mounts and multi-stage builds are generally recommended for production environments as they provide better performance and cleaner artifact management than manual copy operations.

Understanding the Challenge in CI/CD Pipelines

In modern CI/CD pipelines, Docker containers are commonly used to provide consistent build environments. The catch is that build artifacts produced inside a container live in that container's filesystem and must be extracted to the host for deployment, testing, or further processing. This matters most in automated pipelines, where containers are ephemeral and anything left inside them is lost once they are removed.

The complexity increases when dealing with:

  • Large artifacts that slow down copy and upload steps
  • Multiple artifact types (binaries, reports, packages)
  • Pipeline scalability and parallel execution
  • Security considerations around artifact access
  • Cross-platform compatibility requirements

Understanding these constraints helps in selecting the most appropriate artifact extraction method for your specific use case. Each approach has its trade-offs in terms of performance, complexity, and maintainability.


Method 1: Using docker cp Command

The docker cp command provides a straightforward way to copy files from a container's filesystem to the host. It works on both running and stopped containers, which makes it useful for ad-hoc artifact extraction or when working with existing containerized processes.

Basic Implementation

bash
# Syntax: docker cp <container>:<source_path> <destination_path>
docker cp my-builder-container:/app/build/output.tar.gz ./artifacts/

CI/CD Integration Example

In a typical CI/CD pipeline like Jenkins or GitHub Actions, you would:

yaml
# GitHub Actions example
- name: Build application
  run: docker build -t my-builder .
  
- name: Run container and build
  run: docker run --name my-builder-container my-builder make build
  
- name: Copy artifacts
  run: docker cp my-builder-container:/app/dist ./dist
  
- name: Cleanup
  run: docker rm my-builder-container

Advantages and Limitations

Advantages:

  • Simple to implement and understand
  • No additional configuration required
  • Works with any container, regardless of its internal setup
  • Immediate access to artifacts after command execution

Limitations:

  • The container must still exist when you copy, so it cannot be started with --rm (it does not need to be running)
  • Adds overhead of container lifecycle management (naming and removal)
  • The copy is a separate step after the build, which adds time for very large artifacts
  • Manual cleanup required to avoid leaving stopped containers behind
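
Because docker cp reads a container's filesystem whether or not the container is running, one way to reduce the lifecycle bookkeeping is to create a container without starting it, copy the files out, and remove it in a trap. The sketch below assumes the artifacts already exist in the image (for example, produced by a RUN step during docker build); copy_from_image and its arguments are hypothetical names used only for illustration.

```bash
# copy_from_image IMAGE SRC_PATH DEST_DIR
# Copies SRC_PATH out of IMAGE without ever starting a container.
# Hypothetical helper; the image name and paths are placeholders.
copy_from_image() (
  image=$1
  src=$2
  dest=$3
  mkdir -p "$dest"
  # docker create prints the new container's ID; the container stays stopped.
  cid=$(docker create "$image") || exit 1
  # Remove the container when this subshell exits, even if the copy fails.
  trap 'docker rm -f "$cid" >/dev/null' EXIT
  docker cp "$cid:$src" "$dest"
)
```

A call such as copy_from_image my-builder /app/dist ./artifacts would leave no container behind, whether or not the copy succeeds.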

Method 2: Volume Mounts for Persistent Artifacts

Volume mounting provides a more efficient solution by creating a shared filesystem between the host and container. Artifacts written to the mounted volume are immediately available on the host without requiring explicit copy operations.

Implementation with Docker Run

bash
# Create artifact directory on host
mkdir -p ./build-artifacts

# Run container with volume mount
docker run --rm -v "$(pwd)/build-artifacts:/app/output" my-builder \
  sh -c "make build && cp -r /app/build/* /app/output/"

Dockerfile Integration

For better integration, modify your Dockerfile to work with mounted volumes:

dockerfile
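FROM alpine:latest

# alpine does not ship make; install it so `make build` can run
RUN apk add --no-cache make

# Create the output directory. A bind mount at this path replaces it at runtime,
# so the ownership of the host directory is what matters then.
RUN mkdir -p /app/output && chown -R 1000:1000 /app/output

# Set working directory
WORKDIR /app

# Copy build tools and source
COPY . .

# Build, then copy the results into the mounted output directory
CMD ["sh", "-c", "make build && cp -r /app/build/* /app/output/"]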

CI/CD Pipeline Integration

groovy
// Jenkins pipeline example
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                script {
                    // Create artifact directory
                    sh 'mkdir -p ./artifacts'

                    // Run container with volume mount.
                    // WORKSPACE is set by Jenkins; ${pwd} would be an unset shell variable here.
                    sh 'docker run --rm -v "$WORKSPACE/artifacts:/app/output" my-builder'
                }
            }
        }
    }
}

Advanced Volume Mounting Techniques

For more complex scenarios, consider:

bash
# Read-only source mount, read-write output mount
docker run --rm \
  -v $(pwd)/src:/app/src:ro \
  -v $(pwd)/artifacts:/app/output:rw \
  -v $(pwd)/cache:/app/cache \
  my-builder

Key Benefits:

  • Performance: Near real-time artifact availability
  • Simplicity: No separate copy step required
  • Scalability: Handles large artifacts efficiently
  • Flexibility: Works with various CI/CD platforms
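
One thing a bind mount does not give you is feedback: if the path the build writes to does not match the mount target, the host directory simply stays empty and no error is raised. A minimal sketch of a guard that fails the pipeline in that case; require_artifacts is a hypothetical helper name, not part of Docker or any CI platform.

```bash
# require_artifacts DIR
# Succeeds only if DIR exists and contains at least one regular file.
require_artifacts() {
  if [ ! -d "$1" ]; then
    echo "artifact directory missing: $1" >&2
    return 1
  fi
  if [ -z "$(find "$1" -type f | head -n 1)" ]; then
    echo "no artifacts found in $1" >&2
    return 1
  fi
}
```

Calling require_artifacts ./build-artifacts right after the docker run step turns a silently empty directory into a failed build.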

Method 3: Multi-Stage Builds for Efficient Artifact Management

Multi-stage builds separate build dependencies from the final image, so the final stage contains only the artifacts it needs. On their own they produce an image rather than files on the host, so in a CI/CD pipeline they are paired with an extraction step.

Multi-Stage Dockerfile Example

dockerfile
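# Build stage with all dependencies
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install devDependencies too; the build step usually needs them
RUN npm ci
COPY . .
RUN npm run build

# Export stage: holds only the build output, so
# `docker build --target artifacts -o <dir> .` writes just these files to the host
FROM scratch AS artifacts
COPY --from=builder /app/dist /

# Final stage with minimal dependencies
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
COPY --from=builder /app/nginx.conf /etc/nginx/nginx.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]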

Artifact Extraction Strategy

For CI/CD pipelines, you can extract artifacts from the intermediate stage:

bash
# Build and extract artifacts
docker build --target builder -t my-builder .
docker run --rm -v $(pwd)/artifacts:/output my-builder \
  sh -c "cp -r /app/dist /output/"

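# Or export with BuildKit, skipping the intermediate container.
# This assumes the Dockerfile defines a scratch-based `artifacts` stage that
# holds only the build output; exporting `builder` directly would write that
# stage's entire filesystem to ./artifacts.
DOCKER_BUILDKIT=1 docker build --target artifacts -o ./artifacts .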
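
The -o flag is shorthand for --output, and a bare directory argument selects the local exporter (type=local,dest=<dir>). The helper below is a sketch that spells this out; export_stage is a hypothetical name, and it assumes the target stage contains only the files you want on the host (for instance a FROM scratch stage that copies /app/dist from builder), since the local exporter writes the whole filesystem of the stage it is pointed at.

```bash
# export_stage STAGE DEST_DIR [CONTEXT]
# Builds STAGE with BuildKit and writes its filesystem to DEST_DIR on the host.
# Hypothetical helper; stage and directory names are placeholders.
export_stage() (
  stage=$1
  dest=$2
  context=${3:-.}
  DOCKER_BUILDKIT=1 docker build \
    --target "$stage" \
    --output "type=local,dest=$dest" \
    "$context"
)
```

For example, export_stage artifacts ./dist would leave the contents of that stage directly under ./dist.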

GitHub Actions Implementation

yaml
name: Build and Extract Artifacts
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
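    - uses: actions/checkout@v4

    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    - name: Build and extract artifacts
      run: |
        export DOCKER_BUILDKIT=1
        # Assumes a scratch-based `artifacts` stage that holds only the build output;
        # exporting `builder` would write that stage's entire filesystem to ./dist
        docker build --target artifacts -o ./dist .

    - name: Upload artifacts
      uses: actions/upload-artifact@v4
      with:
        name: build-artifacts
        path: ./dist/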

Advantages of Multi-Stage Approach:

  • Smaller final images: Reduced deployment size and attack surface
  • Clean separation: Build tools isolated from runtime dependencies
  • Better security: No build tools in production environment
  • Layer caching: Stages whose inputs have not changed are reused between builds

Method 4: CI/CD Platform Native Features

Modern CI/CD platforms offer native artifact management capabilities that integrate seamlessly with Docker workflows. These solutions often provide built-in artifact handling, versioning, and distribution.

GitHub Actions Artifacts

yaml
- name: Build Docker image
  run: docker build -t my-app .
  
- name: Run build container
  run: |
    docker run --rm -v ${{ github.workspace }}/artifacts:/output \
      my-app sh -c "make build && cp build/* /output/"
    
- name: Upload artifacts
  uses: actions/upload-artifact@v4
  with:
    name: build-output
    path: artifacts/
    retention-days: 30

Jenkins Pipeline with Artifacts

groovy
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                script {
                    // Run container with artifact extraction
                    sh '''
                        docker run --rm \
                          -v ${WORKSPACE}/artifacts:/output \
                          my-builder
                    '''
                }
            }
        }
        stage('Archive artifacts') {
            steps {
                archiveArtifacts artifacts: 'artifacts/**', fingerprint: true
            }
        }
    }
}

GitLab CI/CD Artifacts

yaml
stages:
  - build
  - deploy

build_job:
  stage: build
  script:
    - docker build -t my-builder .
    - docker run --rm -v ${CI_PROJECT_DIR}/artifacts:/output my-builder
  artifacts:
    paths:
      - artifacts/
    expire_in: 1 week

deploy_job:
  stage: deploy
  script:
    - echo "Deploying from artifacts"
  dependencies:
    - build_job

AWS CodeBuild Artifact Management

yaml
version: 0.2
phases:
  build:
    commands:
      - docker build -t my-builder .
      - docker run --rm -v $CODEBUILD_SRC_DIR/artifacts:/output my-builder
  post_build:
    commands:
      - aws s3 sync ./artifacts s3://my-artifacts-bucket/$CODEBUILD_BUILD_ID

Platform-Specific Benefits:

  • Integrated storage: Native artifact repositories and registries
  • Versioning: Automatic artifact versioning and history tracking
  • Security: Built-in access controls and security scanning
  • Distribution: Automated artifact distribution across environments

Best Practices and Considerations

Performance Optimization

Use BuildKit for faster builds:

bash
export DOCKER_BUILDKIT=1
# Assumes a scratch-based `artifacts` stage that holds only the build output
docker build --target artifacts -o ./dist .

Cache volumes for repeated builds:

bash
docker run --rm \
  -v $(pwd)/artifacts:/output \
  -v $(pwd)/cache:/app/cache \
  my-builder

Security Considerations

Minimize container privileges:

bash
# Run as non-root user
docker run --rm -u 1000:1000 -v $(pwd)/artifacts:/output my-builder
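
When a container runs as root instead, the files it writes to a bind mount end up root-owned on the host, and later non-root pipeline steps may be unable to modify or delete them. A quick check for that, as a sketch; list_foreign_files is a hypothetical helper name.

```bash
# list_foreign_files DIR
# Prints every path under DIR that is not owned by the current user.
# No output means later steps running as this user can manage all the files.
list_foreign_files() {
  find "$1" ! -user "$(id -u)" -print
}
```

Running list_foreign_files ./artifacts after the build shows whether the -u setting took effect.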

Mount inputs read-only when possible:

bash
docker run --rm \
  -v $(pwd)/src:/app/src:ro \
  -v $(pwd)/artifacts:/app/output:rw \
  my-builder

Scalability Patterns

Parallel artifact processing:

bash
# Run the build in the background (start more containers the same way),
# then wait so that a failed build fails the step
docker run --rm -v "$(pwd)/artifacts:/output" my-builder \
  sh -c "make build && cp -r /app/build/* /output/" &
wait $!
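
Backgrounding a build with & only helps if the pipeline later waits for it and notices failures. A sketch of that pattern for several targets, assuming each target can be built independently with make <target> inside a hypothetical my-builder image and that every target gets its own output directory:

```bash
# build_in_parallel TARGET...
# Starts one container per target, then waits for all of them.
# Exits non-zero if any build failed.
build_in_parallel() (
  pids=""
  for target in "$@"; do
    mkdir -p "artifacts/$target"
    docker run --rm \
      -v "$(pwd)/artifacts/$target:/output" \
      my-builder \
      sh -c "make $target && cp -r /app/build/. /output/" &
    pids="$pids $!"
  done
  status=0
  for pid in $pids; do
    # wait returns the exit status of that background job.
    wait "$pid" || status=1
  done
  exit "$status"
)
```

Separate output directories keep parallel containers from overwriting each other's files.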

Artifact cleanup automation:

bash
# Cleanup older artifacts
find ./artifacts -name "*.tar.gz" -mtime +7 -delete
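
A retention rule is easier to audit when the job logs what it deletes. A small sketch along those lines; prune_artifacts and its 7-day default are illustrative assumptions, and -delete is a find extension available in GNU and BSD find rather than POSIX.

```bash
# prune_artifacts DIR [DAYS]
# Deletes *.tar.gz files under DIR last modified more than DAYS days ago
# (default 7) and prints each deleted path.
prune_artifacts() {
  find "$1" -type f -name '*.tar.gz' -mtime +"${2:-7}" -print -delete
}
```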

Implementation Examples

Complete CI/CD Pipeline Example

Here’s a comprehensive GitHub Actions workflow demonstrating artifact management:

yaml
name: Build and Deploy

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      version: ${{ steps.version.outputs.version }}
    steps:
    - uses: actions/checkout@v4

    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    - name: Build application
      run: |
        export DOCKER_BUILDKIT=1
        # Assumes a scratch-based `artifacts` stage that holds only the build output;
        # exporting `builder` would write that stage's entire filesystem to ./dist
        docker build --target artifacts -o ./dist .

    - name: Extract version
      id: version
      run: echo "version=$(cat ./dist/version.txt)" >> "$GITHUB_OUTPUT"

    - name: Upload artifacts
      uses: actions/upload-artifact@v4
      with:
        name: build-artifacts-${{ steps.version.outputs.version }}
        path: ./dist/
        retention-days: 90

  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4

    - name: Download artifacts
      uses: actions/download-artifact@v4
      with:
        name: build-artifacts-${{ needs.build.outputs.version }}
        path: ./test-artifacts/

    - name: Run tests
      run: |
        # Artifact uploads do not preserve file permissions, so restore the executable bit
        chmod +x ./test-artifacts/test-runner
        ./test-artifacts/test-runner

  deploy:
    needs: [build, test]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
    - uses: actions/checkout@v4

    - name: Download artifacts
      uses: actions/download-artifact@v4
      with:
        name: build-artifacts-${{ needs.build.outputs.version }}
        path: ./deploy/

    - name: Deploy to production
      run: |
        # Artifact uploads do not preserve file permissions, so restore the executable bit
        chmod +x ./deploy/deploy.sh
        ./deploy/deploy.sh

Docker Compose for Local Development

yaml
version: '3.8'
services:
  builder:
    build:
      context: .
      target: builder
    volumes:
      - ./src:/app/src:ro
      - ./artifacts:/app/output
      - ./cache:/app/cache
    environment:
      - NODE_ENV=development
    # The builder stage of the multi-stage Dockerfile writes its output to /app/dist
    command: sh -c "npm run build && cp -r /app/dist/. /app/output/"

Advanced Artifact Management Script

bash
#!/bin/bash
# artifact-manager.sh

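set -euo pipefail

# Use an absolute path: bind mounts with -v are only portable with absolute host paths
ARTIFACT_DIR="$(pwd)/artifacts"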
CONTAINER_NAME="builder-container"
IMAGE_NAME="my-builder"

# Ensure artifact directory exists
mkdir -p "$ARTIFACT_DIR"

# Build the Docker image
echo "Building Docker image..."
docker build -t "$IMAGE_NAME" .

# Run container with proper mounts
echo "Running build container..."
docker run --rm \
  --name "$CONTAINER_NAME" \
  -u "$(id -u):$(id -g)" \
  -v "$(pwd)/src:/app/src:ro" \
  -v "$ARTIFACT_DIR:/app/output" \
  -v "$(pwd)/cache:/app/cache" \
  "$IMAGE_NAME" \
  sh -c "make build && cp -r /app/build/* /app/output/ && chmod -R 755 /app/output/*"

echo "Artifacts successfully extracted to $ARTIFACT_DIR"
ls -la "$ARTIFACT_DIR"

Conclusion

Copying build artifacts from Docker containers to host machines in CI/CD pipelines requires careful consideration of performance, security, and maintainability. The four main approaches (docker cp, volume mounts, multi-stage builds, and CI/CD platform features) each offer distinct advantages depending on your specific requirements.

Key takeaways:

  • Volume mounts provide the best balance of performance and simplicity for most CI/CD scenarios
  • Multi-stage builds offer superior efficiency and security for production environments
  • CI/CD platform features provide integrated artifact management with versioning and distribution
  • Security considerations should guide your choice of user permissions and filesystem access

For most modern CI/CD pipelines, combining volume mounts with multi-stage builds represents the optimal approach, providing both performance benefits and clean artifact management. Always consider your specific requirements around artifact size, security constraints, and pipeline scalability when implementing these patterns.
