Docker image builds and packs the environment from yaml files

Kerem Kayabay 2024-02-23 11:30:19 +03:00
commit bee280ff14
7 changed files with 171 additions and 0 deletions

11
.gitignore vendored Normal file

@ -0,0 +1,11 @@
# Compiled source
__pycache__
# Packages
*.gz
*.rar
*.tar
*.zip
# OS generated files
.DS_Store

61
README.md Normal file

@ -0,0 +1,61 @@
# Miniconda Env Builder Docker Image for amd64 Architecture
## Project Objective
The goal of this project is to provide a Docker image based on Rocky Linux 8.8 with Miniconda installed that targets the amd64 architecture, even when it is built on a machine with a different architecture, such as an Apple Silicon Mac. The image lets you build and test Python applications in a Conda environment that mirrors the target deployment environment.
## Prerequisites
Before you begin, ensure you have Docker installed on your machine. If you are using an Apple Silicon Mac, use a recent version of Docker Desktop, which supports multi-architecture builds out of the box.
## Building the Docker Image
To build the Docker image, follow these steps:
1. **Clone the Repository**
First, clone this repository to your local machine using Git:
```bash
git clone <repository-url>
cd <repository-directory>
```
Replace `<repository-url>` with the URL of the Git repository and `<repository-directory>` with the name of the directory into which the repository is cloned.
2. **Build the Docker Image**
Run the following command in the terminal from the root of the cloned repository:
```bash
docker build -f miniconda-rockylinux.dockerfile --platform=linux/amd64 -t miniconda-rockylinux:latest .
```
This command builds a Docker image named `miniconda-rockylinux` with the `latest` tag for the `linux/amd64` target platform. Cross-platform builds rely on Docker's `buildx` feature, so ensure it is enabled.
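After a cross-platform build it can be useful to confirm that the image really targets amd64. The following is a minimal sketch, not part of the repository: the helper name `check_image_arch` is hypothetical, and it assumes the image was tagged as in the build command above.

```bash
#!/bin/bash
# Hypothetical helper (not part of this repository): compare the architecture
# recorded in an image's metadata with the expected one (default: amd64).
check_image_arch() {
    local image="$1"
    local expected="${2:-amd64}"
    local actual
    # 'docker image inspect' reads the image metadata without starting a container
    actual=$(docker image inspect --format '{{.Architecture}}' "$image") || return 2
    if [ "$actual" != "$expected" ]; then
        echo "Image $image is $actual, expected $expected" >&2
        return 1
    fi
    echo "Image $image targets $expected"
}

# Usage (assumes the image built above exists locally):
#   check_image_arch miniconda-rockylinux:latest
```

The function returns a non-zero status on a mismatch, so it can be chained in a build script with `&&`.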
## Building and Packing a Conda Environment
This Docker image includes a utility script, `build_and_pack_env.sh`, that automates the process of creating a Conda environment from a YAML file, packing it using `conda-pack`, and preparing it for transfer to a cluster.
### Using the Script
1. **Prepare Your Environment YAML File**: Ensure you have a YAML file describing your Conda environment. This file should list all the packages and versions you want to include.
2. **Run the Docker Container with Volume Mount**: Run the Docker container, mounting the directory that contains your environment YAML file. Replace `<path-to-your-yaml-file>` with the absolute path of that directory (not the path of the file itself):
```bash
docker run -it -v <path-to-your-yaml-file>:/envs --workdir /envs miniconda-rockylinux:latest
```
3. **Execute the Script Inside the Container**: Once inside the container, run the `build_and_pack_env.sh` script with your YAML file as an argument. Replace `your_environment.yaml` with the name of your environment file:
```bash
build_and_pack_env.sh your_environment.yaml
```
The script will create the environment, pack it, and output a `.tar.gz` file that you can transfer to your cluster.
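The interactive steps above can also be collapsed into a single non-interactive invocation, because the image only sets a default command that can be overridden on the `docker run` command line. This is a sketch under assumptions: the helper name `pack_env_once` is hypothetical, and it assumes the image was tagged `miniconda-rockylinux:latest` as in the build step.

```bash
#!/bin/bash
# Hypothetical helper (not part of this repository): mount the directory of a
# YAML file at /envs and run build_and_pack_env.sh on it in one shot.
pack_env_once() {
    local yaml_path="$1"
    local host_dir yaml_name
    if [ ! -f "$yaml_path" ]; then
        echo "No such file: $yaml_path" >&2
        return 1
    fi
    # Docker needs an absolute host path for the bind mount
    host_dir=$(cd "$(dirname "$yaml_path")" && pwd) || return 1
    yaml_name=$(basename "$yaml_path")
    docker run --rm --platform=linux/amd64 \
        -v "${host_dir}:/envs" --workdir /envs \
        miniconda-rockylinux:latest \
        build_and_pack_env.sh "$yaml_name"
}

# Usage (assumes envs/ray.yaml exists relative to the current directory):
#   pack_env_once envs/ray.yaml
```

With `--rm` the container is removed after the script finishes; the packed archive remains in the mounted host directory.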
**Notes**
- The packed environment file is named after the YAML file and written to the container's working directory (`/envs`), so it appears on the host next to the original YAML file.
- Ensure the volume mount (`-v`) option correctly maps the local directory containing your YAML file to the `/envs` directory inside the container.
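Once the `.tar.gz` file has been transferred, it still has to be unpacked on the cluster. The sketch below follows the usual `conda-pack` workflow (extract, activate, run `conda-unpack`); the helper name `unpack_env` and the example paths are hypothetical and not part of this repository.

```bash
#!/bin/bash
# Hypothetical helper (not part of this repository): unpack a conda-pack
# archive into a target directory and fix up its path prefixes.
unpack_env() {
    local archive="$1"
    local target="$2"
    mkdir -p "$target" || return 1
    tar -xzf "$archive" -C "$target" || return 1
    # '.' is the portable spelling of 'source'; activation puts the
    # environment's bin directory on the PATH
    . "$target/bin/activate" || return 1
    # conda-unpack ships inside the packed environment and rewrites the
    # prefixes that still point at the build location
    conda-unpack
}

# Usage on the cluster (paths are assumptions for illustration):
#   unpack_env ray.tar.gz "$HOME/envs/ray"
```

Because the function activates the environment in the calling shell, the environment stays active after it returns when the function is called directly rather than in a subshell.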

27
build_and_pack_env.sh Normal file

@ -0,0 +1,27 @@
#!/bin/bash
# Check if an environment file was provided
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 environment.yml"
    exit 1
fi
source filename_extractor.sh
ENV_FILE="$1"
ENV_NAME=$(extract_filename "$ENV_FILE")
# Create the Conda environment (stop here if creation fails)
echo "Creating Conda environment: $ENV_NAME"
conda env create -f "$ENV_FILE" -n "$ENV_NAME" || exit 1
# Activate the environment
echo "Activating environment: $ENV_NAME"
source activate "$ENV_NAME"
# Pack the environment
echo "Packing environment: $ENV_NAME"
conda deactivate
# Report success only if packing actually succeeded
conda pack -n "$ENV_NAME" -o "${ENV_NAME}.tar.gz" || exit 1
echo "Environment $ENV_NAME packed successfully into ${ENV_NAME}.tar.gz"

7
envs/jupyterlab.yaml Normal file

@ -0,0 +1,7 @@
name: jupyterlab
channels:
  - defaults
dependencies:
  - python=3.10
  - pip
  - jupyterlab

23
envs/ray.yaml Normal file

@ -0,0 +1,23 @@
name: ray
channels:
  - defaults
dependencies:
  - python=3.10
  - pip
  - pip:
    - ray==2.8.0
    - "ray[default]==2.8.0"
    - dask==2022.10.1
    - torch
    - pydantic<2
    - six
    - torch
    - tqdm
    - pandas<2
    - scikit-learn
    - matplotlib
    - optuna
    - seaborn
    - tabulate
    - jupyterlab
    - autopep8

8
filename_extractor.sh Normal file

@ -0,0 +1,8 @@
#!/bin/bash
extract_filename() {
    local fullpath="$1"
    local filename="${fullpath##*/}" # Remove the path, retain the filename
    local name="${filename%%.*}" # Remove everything from the first dot onward (the extension)
    echo "$name"
}

34
miniconda-rockylinux.dockerfile Normal file

@ -0,0 +1,34 @@
FROM rockylinux:8.8
# Pin the Miniconda installer version and put Conda on the PATH
ENV MINICONDA_VERSION=py39_4.10.3
ENV PATH=/opt/conda/bin:$PATH
# Install necessary packages
RUN yum -y install wget bzip2 && \
    yum clean all
# Download and install Miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh -O /tmp/miniconda.sh && \
    /bin/bash /tmp/miniconda.sh -b -p /opt/conda && \
    rm /tmp/miniconda.sh
# Initialize Conda in bash config
RUN echo "source /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
    echo "conda activate base" >> ~/.bashrc
# Install conda-pack non-interactively (-y skips the confirmation prompt)
RUN conda install -y -c conda-forge conda-pack
# Copy the build scripts into the image
COPY build_and_pack_env.sh /usr/local/bin/build_and_pack_env.sh
COPY filename_extractor.sh /usr/local/bin/filename_extractor.sh
# Make the build script executable
RUN chmod +x /usr/local/bin/build_and_pack_env.sh
# Optionally, create a new Conda environment or install packages here
# For example, to create a new environment named 'myenv' with Python 3.8:
# RUN conda create --name myenv python=3.8
# Set the default command to run when starting the container
CMD [ "/bin/bash" ]