Configuration File
(!) This documentation is now outdated. The latest one is available here.
By default, Spotty looks for a spotty.yaml file in the root directory of the project. This file describes the parameters of a remote instance and the environment for the project. Here is a basic example of such a file:

    project:
      name: MyProjectName
      remoteDir: /workspace/project
      syncFilters:
        - exclude:
          - .git/*
          - .idea/*
          - '*/__pycache__/*'
    instance:
      region: us-east-2
      instanceType: p2.xlarge
      volumes:
        - name: MyVolume
          directory: /workspace
          size: 10
      docker:
        image: tensorflow/tensorflow:latest-gpu-py3
      ports: [6006, 8888]

The configuration file consists of 3 sections: project, instance and scripts.

The project section has the following parameters:

- name - the name of your project. It is used to create an S3 bucket and a CloudFormation stack to run an instance.
- remoteDir - the directory where your project is stored on the instance. It's usually a directory on the attached volume (see the "instance" section).
- syncFilters (optional) - filters to skip some directories or files during synchronization. By default, all project files are synced with the instance. Example:

      syncFilters:
        - exclude:
          - .idea/*
          - .git/*
          - data/*
        - include:
          - data/test/*
        - exclude:
          - data/test/config

  This skips the ".idea/", ".git/" and "data/" directories, except for the "data/test/" directory. All files from the "data/test/" directory are synced with the instance except the "data/test/config" file.

  You can read more about filters here: Use of Exclude and Include Filter.

The instance section has the following parameters:

- region - the AWS region where the instance runs (you can use the spotty spot-prices command to find the cheapest region).
- availabilityZone (optional) - the AWS availability zone where the instance runs. If the zone is not specified, it is chosen automatically.
- subnetId (optional) - the AWS subnet ID. If this parameter is set, the "availabilityZone" parameter should be set as well. If it's not specified, a default subnet is used.
- instanceType - the type of the instance to run. You can find more information about types of GPU instances here: Recommended GPU Instances.
- onDemandInstance (optional) - run an On-Demand instance instead of a Spot Instance. Available values: "true", "false" (the default value is "false").
- amiName (optional) - the name of the AMI with NVIDIA Docker (the default value is "SpottyAMI"). Use the spotty create-ami command to create it. This AMI is used to run your application inside the Docker container.
- maxPrice (optional) - the maximum price per hour that you are willing to pay for a Spot Instance. By default, it's the On-Demand price for the chosen instance type. Read more here: Spot Instances.
- rootVolumeSize (optional) - the size of the root volume in GB. The root volume is destroyed once the instance is terminated. Use attached volumes to store the data you need to keep (see the "volumes" parameter below).
- volumes (optional) - the list of volumes to attach to the instance:
  - name (optional) - the name of the volume. This parameter is optional only if the deletionPolicy parameter is set to "delete".

    When you start an instance, Spotty looks for a volume with this name. If the volume exists, it is attached to the instance. If not, Spotty looks for a snapshot with this name. If the snapshot exists, the volume is restored from it. If neither a snapshot nor a volume with this name exists, a new empty volume is created.
  - directory - the directory where the volume is mounted.
  - size (optional) - the size of the volume in GB. The size cannot be less than the size of the existing snapshot, but it can be increased.
  - deletionPolicy (optional) - what to do with the volume once the instance is terminated using the spotty stop command. Possible values: "create_snapshot" (the default), "update_snapshot", "retain" and "delete".

    For "create_snapshot" (the default), Spotty creates a new snapshot every time you stop an instance, and the old snapshot is renamed. AWS uses incremental snapshots, so each new snapshot keeps only the data that changed since the last snapshot (see: How Incremental Snapshots Work).

    For "update_snapshot", a new snapshot is created and the old one is deleted.

    For "retain", the volume is not deleted and no snapshot is taken.

    For "delete", the volume is deleted without creating a snapshot.

    Note: the deletion policy applies only to volumes that were created from scratch or from snapshots during the spotty start command. If the volume already existed and was just attached to the instance, it is retained after the instance is deleted, even if deletionPolicy is set to a different value.
- docker - Docker configuration:
  - image (optional) - the name of the Docker image that contains the environment for your project. For example, you could use the TensorFlow image for GPU (tensorflow/tensorflow:latest-gpu-py3). It already contains NumPy, SciPy, scikit-learn, pandas, Jupyter Notebook and TensorFlow itself. If you need to use your own image, you can specify the path to your Dockerfile in the file parameter (see below), or push your image to Docker Hub and use its name.
  - file (optional) - the relative path to your custom Dockerfile.

    Note: make sure that the build context for the Dockerfile doesn't contain gigabytes of training data or other heavy data (keep the Dockerfile in a separate directory or use a .dockerignore file). Otherwise, you would get an out-of-space error, because Docker copies the entire build context to the Docker daemon during the build. Read more here: "docker build" command.

    Example: if you use TensorFlow and need to download your dataset from S3, you could install the AWS CLI on top of the original TensorFlow image. Just create a Dockerfile in the docker/ directory of your project:

        FROM tensorflow/tensorflow:latest-gpu-py3

        RUN pip install --upgrade awscli

    Then set the file parameter to docker/Dockerfile.
  - workingDir (optional) - the working directory for your custom scripts (see the "scripts" section below).
  - dataRoot (optional) - the directory where Docker stores all downloaded and built images. You could cache images on your attached volume to avoid downloading them from the internet or building your custom image from scratch every time you start an instance.
  - commands (optional) - commands that are run once your container is started. For example, you could download your datasets from an S3 bucket to the project directory (see the "project" section):

        commands: |
          aws s3 sync s3://my-bucket/datasets/my-dataset /workspace/project/data

  - runtimeParameters (optional) - a list of additional parameters for the container runtime. For example:

        runtimeParameters: ['--privileged', '--shm-size', '2G']

- ports (optional) - the list of ports to open. For example:

      ports: [6006, 8888]

  This opens port 6006 for TensorBoard and port 8888 for Jupyter Notebook.
- localSshPort (optional) - if the local SSH port is specified, the commands spotty ssh, spotty run and spotty sync connect to the instance over SSH using the IP address 127.0.0.1 and the specified port. This is useful when the instance doesn't have a public IP address and SSH access is provided through a tunnel to a local port.

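To show how the optional instance parameters might combine, here is a sketch of an instance section that keeps a single snapshot of the volume and builds a custom image. The parameter names are the ones described in this section; every value (availability zone, price, sizes, the dataRoot path) is an assumption made for illustration, and the exact value format for maxPrice is not confirmed by this page.

```yaml
instance:
  region: us-east-2
  availabilityZone: us-east-2a     # assumed zone; omit it to have one chosen automatically
  instanceType: p2.xlarge
  maxPrice: 0.5                    # assumed hourly cap for the Spot Instance
  rootVolumeSize: 50               # GB; the root volume is destroyed on termination
  volumes:
    - name: MyVolume
      directory: /workspace
      size: 50                     # GB; cannot be less than an existing snapshot
      deletionPolicy: update_snapshot
  docker:
    file: docker/Dockerfile        # the custom Dockerfile from the "file" example
    workingDir: /workspace/project
    dataRoot: /workspace/docker    # assumed path on the attached volume, used to cache images
    commands: |
      aws s3 sync s3://my-bucket/datasets/my-dataset /workspace/project/data
    runtimeParameters: ['--shm-size', '2G']
  ports: [6006, 8888]
```

With deletionPolicy set to "update_snapshot", stopping the instance with spotty stop would create a new snapshot and delete the previous one instead of renaming and keeping it.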

The scripts section contains custom scripts that can be run using the spotty run <SCRIPT_NAME> command. The following example defines the scripts train, jupyter and tensorboard:

    project:
      ...
    instance:
      ...
    scripts:
      train: |
        PYTHONPATH=/workspace/project python /workspace/project/model/train.py --num-layers 3
      jupyter: |
        jupyter notebook --allow-root --notebook-dir=/workspace/project
      tensorboard: |
        tensorboard --logdir /workspace/outputs
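
Putting the three sections together, a complete spotty.yaml might look like the following minimal sketch. It only reuses values from the examples on this page and assumes the file format described here.

```yaml
project:
  name: MyProjectName
  remoteDir: /workspace/project
  syncFilters:
    - exclude:
      - .git/*
      - .idea/*
      - '*/__pycache__/*'
instance:
  region: us-east-2
  instanceType: p2.xlarge
  volumes:
    - name: MyVolume
      directory: /workspace
      size: 10
  docker:
    image: tensorflow/tensorflow:latest-gpu-py3
  ports: [6006, 8888]
scripts:
  train: |
    PYTHONPATH=/workspace/project python /workspace/project/model/train.py --num-layers 3
  tensorboard: |
    tensorboard --logdir /workspace/outputs
```

With such a file in the project root, spotty start would start the instance, and spotty run train would run the train script on it.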