
Configuration File

Oleg Polosin edited this page Apr 2, 2019 · 14 revisions
(!) This documentation is now outdated. The latest one is available here.

By default, Spotty looks for a spotty.yaml file in the root directory of the project. This file describes the parameters of the remote instance and the environment for the project. Here is a basic example of such a file:

  project:
    name: MyProjectName
    remoteDir: /workspace/project
    syncFilters:
      - exclude:
        - .git/*
        - .idea/*
        - '*/__pycache__/*'
  instance:
    region: us-east-2
    instanceType: p2.xlarge
    volumes:
      - name: MyVolume
        directory: /workspace
        size: 10
    docker:
      image: tensorflow/tensorflow:latest-gpu-py3
    ports: [6006, 8888]

Available Parameters

The configuration file consists of three sections: project, instance and scripts.

project section:

  • name - the name of your project. It will be used to create an S3 bucket and a CloudFormation stack to run an instance.

  • remoteDir - directory where your project will be stored on the instance. It's usually a directory on the attached volume (see "instance" section).

  • syncFilters (optional) - filters to skip some directories or files during synchronization. By default, all project files will be synced with the instance. Example:

      syncFilters:
        - exclude:
            - .idea/*
            - .git/*
            - data/*
        - include:
            - data/test/*
        - exclude:
            - data/test/config

    This skips the ".idea/", ".git/" and "data/" directories, except for the "data/test/" directory. All files in "data/test/" are synced with the instance except the "data/test/config" file.

    You can read more about filters here: Use of Exclude and Include Filter.

instance section:

  • region - the AWS region in which to run the instance (use the spotty spot-prices command to find the cheapest region).

  • availabilityZone (optional) - the AWS availability zone in which to run the instance. If the zone is not specified, it will be chosen automatically.

  • subnetId (optional) - AWS subnet ID. If this parameter is set, the "availabilityZone" parameter should be set as well. If it's not specified, a default subnet will be used.

  • instanceType - type of the instance to run. You can find more information about types of GPU instances here: Recommended GPU Instances.

  • onDemandInstance (optional) - run an On-Demand instance instead of a Spot instance. Available values: "true", "false" (default value is "false").

  • amiName (optional) - name of the AMI with NVIDIA Docker (default value is "SpottyAMI"). Use the spotty create-ami command to create it. This AMI will be used to run your application inside the Docker container.

  • maxPrice (optional) - the maximum price per hour that you are willing to pay for a Spot Instance. By default, it's the On-Demand price for the chosen instance type. Read more here: Spot Instances.

  • rootVolumeSize (optional) - size of the root volume in GB. The root volume will be destroyed once the instance is terminated. Use attached volumes to store the data you need to keep (see "volumes" parameter below).

  • volumes (optional) - the list of volumes to attach to the instance:

    • name (optional) - name of the volume. This parameter is optional only if the deletionPolicy parameter is set to "delete".

      When you start an instance, Spotty looks for a volume with this name. If the volume exists, it will be attached to the instance. If not, Spotty looks for a snapshot with this name. If the snapshot exists, the volume will be restored from it. If neither a volume nor a snapshot with this name exists, a new empty volume will be created.

    • directory - directory where the volume will be mounted.

    • size (optional) - size of the volume in GB. The size of the volume cannot be less than the size of the existing snapshot, but it can be increased.

    • deletionPolicy (optional) - what to do with the volume once the instance is terminated using the spotty stop command. Possible values: "create_snapshot" (default), "update_snapshot", "retain" and "delete".

      For "create_snapshot" (the default), Spotty creates a new snapshot every time you stop an instance, and the old snapshot is renamed. AWS uses incremental snapshots, so each new snapshot stores only the data that has changed since the previous snapshot (see: How Incremental Snapshots Work).

      For "update_snapshot", a new snapshot will be created and the old one will be deleted.

      For "retain", the volume will not be deleted and no snapshot will be taken.

      For "delete", the volume will be deleted without creating a snapshot.

      Note: The deletion policy applies only to volumes that were created from scratch or restored from snapshots by the spotty start command. If a volume already existed and was merely attached to the instance, it will be retained after the instance is deleted, regardless of the deletionPolicy value.

  • docker - Docker configuration:

    • image (optional) - the name of the Docker image that contains the environment for your project. For example, you could use the TensorFlow image for GPU (tensorflow/tensorflow:latest-gpu-py3). It already contains NumPy, SciPy, scikit-learn, pandas, Jupyter Notebook and TensorFlow itself. If you need to use your own image, you can specify the path to your Dockerfile in the file parameter (see below), or push your image to Docker Hub and use its name.

    • file (optional) - relative path to your custom Dockerfile.

      Note: make sure that the build context for the Dockerfile doesn't contain gigabytes of training data or some other heavy data (keep the Dockerfile in a separate directory or use the .dockerignore file). Otherwise, you would get an out-of-space error, because Docker copies the entire build context to the Docker daemon during the build. Read more here: "docker build" command.

      Example: if you use TensorFlow and need to download your dataset from S3, you could install the AWS CLI on top of the original TensorFlow image. Create a Dockerfile in the docker/ directory of your project:

      FROM tensorflow/tensorflow:latest-gpu-py3
      RUN pip install --upgrade awscli

      Then set the file parameter to the docker/Dockerfile value.

    • workingDir (optional) - working directory for your custom scripts (see "scripts" section below).

    • dataRoot (optional) - directory where Docker will store all downloaded and built images. You could cache images on your attached volume to avoid downloading them from the internet or building your custom image from scratch every time you start an instance.

    • commands (optional) - commands to run once your container has started. For example, you could download your datasets from an S3 bucket to the project directory (see "project" section):

      commands: |
        aws s3 sync s3://my-bucket/datasets/my-dataset /workspace/project/data

    • runtimeParameters (optional) - a list of additional parameters for the container runtime. For example:

      runtimeParameters: ['--privileged', '--shm-size', '2G']

  • ports (optional) - list of ports to open. For example:

    ports: [6006, 8888]

    It will open port 6006 for TensorBoard and port 8888 for Jupyter Notebook.

  • localSshPort (optional) - if a local SSH port is specified, the spotty ssh, spotty run and spotty sync commands connect to the instance over SSH using the specified local port. This is useful when the instance doesn't have a public IP address and SSH access is provided through a tunnel to a local port.
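
The optional instance parameters described above can be combined as in the following sketch. The parameter names come from the list above. All values (availability zone, subnet ID, price, sizes, directories, port) are illustrative assumptions, not defaults or recommendations, and the fragment has not been checked against a particular Spotty version.

```yaml
# Illustrative fragment: every value below is a placeholder.
instance:
  region: us-east-2
  availabilityZone: us-east-2a          # set because subnetId is set
  subnetId: subnet-0123456789abcdef0    # hypothetical subnet ID
  instanceType: p2.xlarge
  maxPrice: 0.5                         # assumed maximum price per hour
  rootVolumeSize: 50                    # GB; destroyed with the instance
  volumes:
    - name: MyVolume
      directory: /workspace
      size: 50                          # GB
      deletionPolicy: update_snapshot   # keep a single, latest snapshot
  docker:
    file: docker/Dockerfile             # custom image instead of "image"
    workingDir: /workspace/project
    dataRoot: /workspace/docker         # hypothetical directory on the attached volume
    runtimeParameters: ['--shm-size', '2G']
  ports: [6006, 8888]
  localSshPort: 2222                    # only when SSH goes through a local tunnel
```

Placing dataRoot under the directory of an attached volume follows the caching suggestion in the dataRoot description: the Docker images then survive instance termination together with the volume.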

scripts section (optional):

  • This section contains custom scripts which can be run using the spotty run <SCRIPT_NAME> command. The following example defines the scripts train, jupyter and tensorboard:

      scripts:
        train: |
          python /workspace/project/model/ --num-layers 3
        jupyter: |
          jupyter notebook --allow-root --notebook-dir=/workspace/project
        tensorboard: |
          tensorboard --logdir /workspace/outputs
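
Since workingDir (in the docker configuration) sets the working directory for custom scripts, a script should be able to use paths relative to it. The fragment below is a minimal sketch under that assumption. It shows only the relevant keys, reuses the /workspace/project directory from the examples above, and has not been verified against a particular Spotty version.

```yaml
# Partial configuration: the other required parameters are omitted.
instance:
  docker:
    workingDir: /workspace/project   # assumed starting directory for scripts
scripts:
  jupyter: |
    # "." is expected to resolve to workingDir, i.e. /workspace/project
    jupyter notebook --allow-root --notebook-dir=.
```

With this configuration, the script would be started with spotty run jupyter, as described above.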