This repository consists of two components:

- a Flask-based Python web server (`pydatalab`) that communicates with the database backend,
- a JavaScript+Vue web application for the user interface.

To run an instance, you will need to install the environments for each component.

First, from the desired folder, clone this repository from GitHub to your local machine with `git clone https://github.com/the-grey-group/datalab`.
Alternatively, if you do not wish to contribute to the code, you can simply download the current state as a .zip file from GitHub. Should you wish to just run/deploy the apps themselves, the easiest method is to use Docker (instructions below).
- Install `pipenv` on your machine.
  - Detailed instructions for installing `pipenv`, `pip` and Python itself can be found on the `pipenv` website.
  - We recommend you install `pipenv` from PyPI (with `pip install pipenv` or `pip install --user pipenv`) for the Python distribution of your choice (in a virtual environment or otherwise). This is distinct from the virtual environment that `pipenv` itself will create for the `pydatalab` package.
- Set up MongoDB.
  - Install the free MongoDB community edition (full instructions on the MongoDB website).
    - For Mac users, MongoDB is available via Homebrew.
    - Alternatively, you can run MongoDB via Docker using the config in this package with `docker-compose up mongo` (see further instructions below).
    - If you wish to view the database directly, MongoDB has several GUIs, e.g. MongoDB Compass or RoboMongo.
    - For persistence, you will need to set up MongoDB to run as a service on your computer (or run it manually each time you use the site).
  - In MongoDB, create a database called "datalabvue" (further instructions on the MongoDB website).
    - You can do this with the `mongo` shell (`echo "use datalabvue" | mongo`) or with Compass.
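As a sketch, the MongoDB steps above might look like the following on macOS (the Homebrew tap and formula names are assumptions; check the MongoDB website for the current instructions for your platform):

```shell
# Install and start MongoDB via Homebrew (tap/formula names may differ)
brew tap mongodb/brew
brew install mongodb-community
brew services start mongodb-community   # run MongoDB as a persistent service

# Create the database used by the server
echo "use datalabvue" | mongo
```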
- Install the `pydatalab` package.
  - Navigate to the `pydatalab` folder and run `pipenv install`.
    - This will create a `pipenv` environment for `pydatalab` and all of its dependencies that is registered within this folder only.
- Run the server from the `pydatalab` folder with `pipenv run python pydatalab/main.py`.

The server should now be accessible at http://localhost:5001. If the server is running, navigating to this URL will display "Hello, This is a server".
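Assuming `pipenv` and MongoDB are already set up, the server steps above condense to:

```shell
cd pydatalab
pipenv install                        # create the environment and install dependencies
pipenv run python pydatalab/main.py   # start the Flask server on port 5001
```

Visiting http://localhost:5001 should then show the "Hello, This is a server" message.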
Should you wish to contribute to/modify the Python code, you may wish to perform these extra steps:

- From within the `pydatalab` folder, run `pipenv install --dev` to pull the development dependencies (e.g., `pre-commit`, `pytest`).
- Run `pre-commit install` to begin using `pre-commit` to check all of your modifications when you run `git commit`.
  - The hooks that run on each commit can be found in the top-level `.pre-commit-config.yml` file.
- The tests on the Python code can be run by executing `py.test` from the `pydatalab/` folder.
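The development setup above can be sketched as:

```shell
cd pydatalab
pipenv install --dev            # pull development dependencies (pre-commit, pytest, ...)
pipenv run pre-commit install   # register the git hooks
pipenv run py.test              # run the test suite
```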
Additional notes:

- If the Flask server is running when the source code is changed, it will generally hot-reload without needing to manually restart the server.
- You may have to configure the `MONGO_URI` setting in `main.py` depending on your MongoDB setup. In the future, this will be accessible via a config file.
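For illustration, a default local setup would point `MONGO_URI` at MongoDB's standard port and the "datalabvue" database; treat this as a sketch, since the exact place the variable lives in `main.py` may differ:

```python
# Hypothetical sketch: connection string for a MongoDB instance running
# locally on the default port, selecting the "datalabvue" database.
MONGO_URI = "mongodb://localhost:27017/datalabvue"
```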
- If you do not already have it, install node.js and the Node Package Manager (`npm`). It is recommended not to install node using the official installer, since it is difficult to manage or uninstall, and permissions issues may arise. Instead, it is recommended to install and manage versions using the [node version manager (nvm)](https://github.com/nvm-sh/nvm#installing-and-updating): `nvm install --lts`. This will install the current LTS version of node along with `npm`.
- Once installed, use it to install the `yarn` package manager: `npm install --global yarn`. From this point on, the `npm` command is not needed - all package and script management for the webapp is handled using `yarn`.
- Navigate to the `webapp/` directory in your local copy of this repository and run `yarn install` (requires ~400 MB of disk space).
- Run the webapp from a development server with `yarn serve`.

Similar to the Flask development server, these steps will provide a development environment that serves the web app at `localhost:8080` (by default) and automatically reloads it as changes are made to the source code.
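The web app setup above amounts to:

```shell
npm install --global yarn   # one-time: install yarn itself
cd webapp
yarn install                # fetch dependencies (~400 MB of disk space)
yarn serve                  # dev server at localhost:8080 by default
```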
Various other development scripts are available through `yarn`:

- `yarn lint`: lint the JavaScript code using `eslint`, identifying issues and automatically fixing many of them. This linting process also runs automatically every time the development server reloads.
- `yarn test:unit`: run the unit/component tests using `jest`. These test individual functions or components.
- `yarn test:e2e`: run end-to-end tests using `cypress`. This will build and serve the app, and launch an instance of Chrome where the tests can be interactively viewed. The tests can also be run without the GUI using `yarn test:e2e --headless`. Note: currently, the tests make requests to the server running on `localhost:5001`.
- `yarn build`: compile an optimized, minified version of the app for production.
Docker uses virtualization to allow you to build "images" of your software that are transferable and deployable as "containers" across multiple systems.
These instructions assume that Docker is installed (a recent version that includes Compose V2 and BuildKit) and that the Docker daemon is running locally.
See the Docker website for instructions for your system.
Note that pulling and building the images can require significant disk space (~5 GB for a full setup), especially when multiple versions of images have been built (you can use `docker system df` to see how much space is being used).

Dockerfiles for the web app, server and database can be found in the `.docker` directory.
There are separate build targets for `production` and `development` (and corresponding docker-compose profiles `prod` and `dev`).
The production target will copy the state of the repository on build and use `gunicorn` and `serve` to serve the server and app respectively.
The development target mounts the repository in the running container and provides hot-reloading servers for both the backend and frontend.
- `docker compose --profile dev build` will pull each of the base Docker images (`mongo`, `node` and `python`) and build the corresponding apps in development mode on top of them (use `--profile prod` for production deployments).
- `docker compose --profile dev up` will launch a container for each component, such that the web app can be accessed at `localhost:8081`, the server at `localhost:5001` and the database at `localhost:27017`.
- Individual containers can be launched with `docker compose up <service name>` for the services `mongo`, `app`, `app_dev`, `api` and `api_dev`.
- `docker compose stop` will stop all running containers.
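A typical development session with Docker then looks like:

```shell
docker compose --profile dev build   # build all development images
docker compose --profile dev up      # app at :8081, server at :5001, mongo at :27017
docker compose up mongo              # or launch a single service by name
docker compose stop                  # stop all running containers
```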
This package allows you to attach files from remote filesystems to samples and other entries.
These filesystems can be configured in the config file with the `REMOTE_FILESYSTEMS` option.
In practice, these options should be set in a centralised deployment.

Currently, there are two mechanisms for accessing remote files:

- You can mount the filesystem locally and provide the path in your datalab config file. For example, Cambridge Chemistry users will have to (connect to the ChemNet VPN and) mount the Grey Group backup servers on their local machine, then define these folders in the config.
- Access over `ssh`: alternatively, you can set up passwordless `ssh` access to a machine (e.g., using `citadel` as a proxy jump), and paths on that remote machine can be configured as separate filesystems. The filesystem metadata will be synced periodically, and any attached files will be downloaded and stored locally (the cached copy is refreshed on access if it is more than 1 hour old).
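As a purely illustrative sketch of the two mechanisms (the field names and hostnames here are assumptions, not the package's actual schema; consult the real config documentation), a `REMOTE_FILESYSTEMS` value might look like:

```python
# Hypothetical sketch of the REMOTE_FILESYSTEMS option; the key names
# below are invented for illustration only.
REMOTE_FILESYSTEMS = [
    # a share mounted locally on this machine
    {"name": "group-backups", "path": "/mnt/grey-group-backups"},
    # a path on a remote machine reached over passwordless ssh
    {"name": "archive", "hostname": "ssh://archive.example.com", "path": "/data/archive"},
]
```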