Setup for AI agents for life sciences
Infrastructure and core software
- Cloud infrastructure: Amazon Web Services
- Computational research platform: Code Ocean
- Version control: GitHub
- AI Agent application requirements: Streamlit, Ollama, LangChain, and FAISS
Setup
Step 1: Introduction to our computational research platform with Code Ocean
- View a short overview video of the Code Ocean platform
- Review further information in the Code Ocean user guide if needed
Step 2: Enable simultaneous capsule collaboration with version control using git and GitHub
- Navigate to our GitHub repository
- Each team will have its own branch, [Team Name], created by the coaches: VPEHackathonAIAgentsCOTemplate/main -> VPEHackathonAIAgentsCOTemplate/[Team Name]
- Each team member will need to add their own personal access token in Code Ocean
- Follow GitHub instructions to generate a personal access token: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens
- Save your personal access token! You will also need it when working with Git
- Then on Code Ocean, click the account icon on the bottom left and go to credentials.
- Click ⊕ Add credential and choose GitHub, then add your GitHub username and the token you created.
- Each team member will create their own capsule in Code Ocean by cloning the template repository
- Click the New Capsule button in the top right corner.
- Select Copy from public Git.
- Paste the Git repository address (i.e., https://github.com/VirtualPatientEngine/VPEHackathonAIAgentsCOTemplate)
- Click clone
- The capsule will be cloned within a few seconds.
- Click on the
- Each team member will need to attach shared data assets to their own capsule in Code Ocean
- In the capsule view, in the data folder of the file tree, click ⚙️ manage
- Attach the data assets by clicking the plus sign (⊕)
- The data assets are "collections: cellxgene census metadata 2024-04-24" and "ollama_models_09_2024"
- Individual team members will contribute by syncing with their team's branch (see Step 4 below)
💡Tip
1. Use the command-line terminal in the VS Code editor for running git commands
2. See the quick reference of git commands if you forget one, and the full documentation if typing git --help is not sufficient
Step 3: Familiarization with the template Code Ocean AI Agents capsule
- README and overview of the repository
- The Streamlit application with three starter examples
# Test that the Streamlit app script runs
python /code/streamlit_app.py
# Run the Streamlit app
streamlit run /code/streamlit_app.py
# Stop the Streamlit app
[Ctrl] + [C]
- Extended examples needed for completing the Hackathon challenges
- Downloading datasets and models (Ollama example)
# Create the target directory if it doesn't exist
mkdir -p /scratch/.ollama
# Delete the existing symbolic link
rm /root/.ollama
# Create the new symbolic link (each write to /root/.ollama will be directed to /scratch/.ollama)
ln -s /scratch/.ollama /root/.ollama
# Copy the key
cp /data/.ollama/id_ed25519 /scratch/.ollama/id_ed25519
# Start the Ollama server from the scratch directory (it keeps running in this terminal)
cd /scratch
ollama serve
# In a second terminal, list and pull models
ollama list
ollama pull llama3.1
- Downloading datasets and models (Stark example)
# Create and activate a virtual environment (optional, since we are working in a Docker container)
python -m venv .venv
source .venv/bin/activate
# Install stark via pip
pip install stark-qa
# Start the Python interpreter and download to scratch
python
from stark_qa import load_skb
skb = load_skb("prime", download_processed=True, root="/scratch")
# Leave the Python interpreter, then deactivate the virtual environment when done
exit()
deactivate
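The symbolic-link steps in the Ollama example are not safe to repeat as written: rm /root/.ollama reports an error when the link is missing, and it would be easy to point rm at a real directory by mistake. The following is a minimal sketch of a repeatable version. The helper name redirect_dir is hypothetical (it is not part of Ollama or Code Ocean), and the choice to refuse to replace a real file or directory is an assumption about what is safest here.

```shell
# Sketch: make "link" a symbolic link to the directory "target".
# redirect_dir is a hypothetical helper, not an Ollama or Code Ocean command.
redirect_dir() {
  link="$1"     # path that programs write to, e.g. /root/.ollama
  target="$2"   # directory that should receive the data, e.g. /scratch/.ollama

  # Create the target directory if it doesn't exist
  mkdir -p "$target" || return 1

  if [ -L "$link" ]; then
    # An existing symbolic link: remove it so it can be recreated
    rm "$link" || return 1
  elif [ -e "$link" ]; then
    # A real file or directory: refuse instead of deleting data
    echo "refusing to replace $link: it is not a symbolic link" >&2
    return 1
  fi

  # Create the new symbolic link
  ln -s "$target" "$link"
}
```

For the Ollama example this would be called as redirect_dir /root/.ollama /scratch/.ollama before starting the server. Running it twice leaves the same link in place.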
💡Tip
1. Use the Scratch folder for downloading large data files
2. Use the VS Code editor to launch the Ollama server, interact with Streamlit, write code, etc.
3. If you use a virtual environment, be sure to add the virtual environment directory to .gitignore!
4. Ollama cheat sheet
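For the .gitignore tip, one possible way to add the virtual environment directory without creating duplicate entries is sketched below. The helper name ignore_once is hypothetical, and .venv/ is assumed to be the directory name because the Stark example creates the environment with python -m venv .venv.

```shell
# Sketch: append a pattern to a .gitignore file only if that exact line is missing.
# ignore_once is a hypothetical helper name.
ignore_once() {
  pattern="$1"               # line to add, e.g. .venv/
  file="${2:-.gitignore}"    # defaults to .gitignore in the current directory

  # Make sure the file exists
  touch "$file" || return 1

  # If the file does not end with a newline, add one so lines are not joined
  if [ -n "$(tail -c 1 "$file")" ]; then
    echo >> "$file"
  fi

  # -x matches the whole line, -F treats the pattern as a fixed string
  if ! grep -qxF -- "$pattern" "$file"; then
    printf '%s\n' "$pattern" >> "$file"
  fi
}
```

From the root of the capsule's repository this would be called as ignore_once .venv/ and can be repeated safely.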
Step 4: Launching, working in, and stopping the capsule
- Click the VS Code icon on the top right under Reproducible run to launch a cloud workstation on AWS. Please note that the first time you launch a capsule, it will take a few minutes to allocate the resources on AWS.
- In a new terminal, add the remote team branches
# List the current remotes, then add the VPE repository as the upstream remote, which contains your team's branch
git remote -v
git remote add upstream https://[user name]:[token]@github.com/VirtualPatientEngine/VPEHackathonAIAgentsCOTemplate.git
git fetch --all --prune
- Check that your team's branch is there, e.g., upstream/[Team Name]. This is the branch that your team will sync with
- Create your branch derived from your team's branch
# Check that your team's branch is there (-a also lists remote branches)
git branch -a -v
# Switch to your team's branch
git checkout [Team Name]
# Create a branch starting from your team's branch for your features (feat) and fixes (fix)
git checkout -b [feat or fix]/[name]
- Hack away 😀
- Commit your changes
# Stage your changes for the next commit
git add .
# Commit the staged changes
git commit -m "feat: my cool feature"
# Push your branch to origin
git push origin [feat or fix]/[name]
- Update your branch with your team's changes
# Fetch all changes from upstream branches
git fetch --all --prune
# Update the local team branch
git checkout [Team Name]
git pull upstream [Team Name]
# Merge changes from your local team branch into your branch
git checkout [feat or fix]/[name]
git merge [Team Name]
- Share your changes with your team's branch
# Ensure your local team branch is up to date
git checkout [Team Name]
git fetch --all --prune
git pull upstream [Team Name]
# Merge your changes (and resolve any conflicts)
git merge [feat or fix]/[name]
git push upstream [Team Name]
# Delete your old branch and begin a new one
git branch -D [feat or fix]/[name]
git checkout -b [feat or fix]/[name]
- When you are done, please shut down the capsule to save resources by clicking the red power button on the top left.
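Step 4 names working branches [feat or fix]/[name]. A small helper can build such a name and reject anything else before it reaches git checkout -b. This is only a sketch: make_branch_name is a hypothetical helper, and limiting the name to letters, digits, dots, underscores, and hyphens is an assumption made here for illustration, not a rule stated by the Hackathon.

```shell
# Sketch: build a branch name of the form feat/<name> or fix/<name>.
# make_branch_name is a hypothetical helper; the allowed characters are an assumption.
make_branch_name() {
  kind="$1"   # feat or fix
  name="$2"   # short description, e.g. my-cool-feature

  case "$kind" in
    feat|fix) ;;
    *)
      echo "branch type must be feat or fix, got: $kind" >&2
      return 1
      ;;
  esac

  case "$name" in
    ""|*[!A-Za-z0-9._-]*)
      echo "branch name must use only letters, digits, '.', '_' and '-'" >&2
      return 1
      ;;
  esac

  printf '%s/%s\n' "$kind" "$name"
}
```

A possible use is git checkout -b "$(make_branch_name feat my-cool-feature)", which creates the branch feat/my-cool-feature from the branch that is currently checked out.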