AIAgents4Pharma Developer Guide
This guide covers the complete development setup, tooling, and workflow for the AIAgents4Pharma project.
Table of Contents
- Quick Start
- Development Environment Setup
- Project Structure
- Development Tools
- Code Quality & Security
- Dependency Management
- Testing
- CI/CD Pipeline
- Docker & Deployment
- Security Best Practices
- Common Development Tasks
- Troubleshooting
Quick Start
Prerequisites
- Python 3.12+
- Git
- uv (modern Python package manager)
- libmagic (for file security validation):
  - macOS: brew install libmagic
  - Linux: sudo apt-get install libmagic1
  - Windows: Bundled with python-magic-bin
Installation
# 1. Clone the repository
git clone https://github.com/VirtualPatientEngine/AIAgents4Pharma
cd AIAgents4Pharma
# 2. Install dependencies (creates virtual environment automatically)
uv sync --extra dev --frozen
# 3. Set up pre-commit hooks (optional but recommended)
uv run pre-commit install
# 4. Set up API keys
export OPENAI_API_KEY=sk-...
export NVIDIA_API_KEY=nvapi-...
export ZOTERO_API_KEY=...
export ZOTERO_USER_ID=...
# 5. Test installation
uv run python -c "import aiagents4pharma; print('Installation successful!')"
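Before launching an agent, it can help to confirm that the keys exported in step 4 are actually visible to Python. The following is a minimal sketch, not part of the package: the helper name `missing_api_keys` is hypothetical, and treating all four variables as required is an assumption (a given agent may need only a subset).

```python
import os

# Variable names taken from step 4 above; which ones a given agent
# actually needs is an assumption of this sketch.
REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "NVIDIA_API_KEY",
    "ZOTERO_API_KEY",
    "ZOTERO_USER_ID",
)


def missing_api_keys(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


missing = missing_api_keys()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All API keys are set.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise in a unit test.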
Development Environment Setup
Modern Python Stack
This project uses a modern Python development stack:
- uv: Ultra-fast Python package manager and dependency resolver
- hatchling: Modern build backend (PEP 517, with PEP 621 metadata support)
- pyproject.toml: Single source of truth for project configuration
- uv.lock: Reproducible dependency resolution
Why uv over pip/conda?
- Often 10-100x faster than pip for dependency resolution and installation, according to uv's own benchmarks
- Automatic virtual environment management
- Built-in lock file support for reproducible builds
- Better dependency conflict resolution
- Native pyproject.toml support
Project Structure
AIAgents4Pharma/
├── aiagents4pharma/              # Main package
│   ├── talk2biomodels/           # Systems biology agent
│   ├── talk2knowledgegraphs/     # Knowledge graph agent
│   ├── talk2scholars/            # Scientific literature agent
│   ├── talk2cells/               # Single cell analysis agent
│   └── talk2aiagents4pharma/     # Meta-agent (orchestrator)
├── app/                          # Streamlit applications
├── docs/                         # Documentation
├── pyproject.toml                # Project configuration (dependencies, tools)
├── uv.lock                       # Lock file for reproducible builds
├── .pre-commit-config.yaml       # Pre-commit hooks configuration
└── release_version.txt           # Version file
Development Tools
Code Quality Tools
All tools are configured in pyproject.toml and run automatically via pre-commit:
Ruff - Fast Linting & Import Sorting
# Lint and auto-fix issues
uv run ruff check --fix .
# Check only (no fixes)
uv run ruff check .
# Format code (import sorting is handled by ruff check --fix when the isort "I" rules are enabled)
uv run ruff format .
Pylint - Comprehensive Static Analysis
# Analyze entire codebase (configuration in pyproject.toml)
uv run pylint aiagents4pharma/
# Analyze specific component
uv run pylint aiagents4pharma/talk2scholars/
# Generate JSON report for CI/CD
uv run pylint aiagents4pharma/ --output-format=json --reports=no > pylint-report.json
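A JSON report is easier to act on once it is summarized. The sketch below counts messages by type; it assumes the report is a JSON list of message objects with a "type" field, which is the shape pylint's JSON reporter produces. The sample data is made up for illustration, and `summarize_pylint` is a hypothetical helper name.

```python
import json
from collections import Counter


def summarize_pylint(report_text):
    """Count pylint messages by type (convention, warning, error, ...)."""
    messages = json.loads(report_text)
    return Counter(msg["type"] for msg in messages)


# Made-up sample in the shape of pylint's JSON output; real reports
# carry more fields per message.
sample = json.dumps([
    {"type": "warning", "symbol": "unused-import", "path": "a.py", "line": 1},
    {"type": "convention", "symbol": "invalid-name", "path": "a.py", "line": 3},
    {"type": "warning", "symbol": "unused-variable", "path": "b.py", "line": 7},
])

counts = summarize_pylint(sample)
for kind, total in sorted(counts.items()):
    print(f"{kind}: {total}")
```

To summarize the real report, read pylint-report.json from disk and pass its text to the same function.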
MyPy - Static Type Checking
# Type check the main package
uv run mypy aiagents4pharma/
# Type check everything
uv run mypy .
Security Tools
Bandit - Security Vulnerability Scanner
# Scan for security issues
uv run bandit -r aiagents4pharma/
# Generate detailed report
uv run bandit -r aiagents4pharma/ -f json -o security-report.json
Dependency Vulnerability Scanning
# Scan dependencies for known vulnerabilities
uv run pip-audit
# Alternative scanner
uv run safety check
# Scan with detailed output
uv run pip-audit --desc --format=json
Code Quality & Security
Pre-commit Hooks
Pre-commit runs automatically before every commit to ensure code quality:
# Install hooks (one-time setup)
uv run pre-commit install
# Run hooks manually on all files
uv run pre-commit run --all-files
# Run a specific hook (the hook must be enabled in .pre-commit-config.yaml)
uv run pre-commit run ruff
uv run pre-commit run mypy
What runs on each commit
- Ruff - Lints and fixes imports
- MyPy - Type checking (configured but currently disabled in pre-commit)
- Bandit - Security scanning
- pip-audit - Dependency vulnerability check
- General checks - Trailing whitespace, YAML validation, etc.
Bypassing Pre-commit (Emergency Only)
# Skip pre-commit hooks (not recommended)
git commit --no-verify -m "emergency fix"
Dependency Management
Adding Dependencies
# Add runtime dependency (quote the spec so the shell does not treat >= as a redirect)
uv add "numpy>=1.24.0"
# Add development dependency
uv add --group dev "pytest>=7.0.0"
# Add optional dependency group
uv add --optional ml "torch>=2.0.0"
Updating Dependencies
# Update all dependencies
uv sync --upgrade
# Update a specific package (re-resolve only that package, then sync)
uv lock --upgrade-package package_name
uv sync
# Update dev dependencies
uv sync --extra dev --upgrade
Lock File Management
# Generate/update lock file
uv lock
# Install from lock file (production)
uv sync --frozen
# Install with development tools
uv sync --extra dev --frozen
Dependency Groups
- Main: Core runtime dependencies
- Dev: Development tools (ruff, mypy, etc.)
- Optional: Feature-specific dependencies
Testing
Running Tests
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=aiagents4pharma
# Run specific test file
uv run pytest aiagents4pharma/talk2biomodels/tests/test_api.py
# Run integration tests only
uv run pytest -m integration
Test Categories
- Unit tests: Fast, isolated tests
- Integration tests: Cross-component tests (marked with @pytest.mark.integration)
CI/CD Pipeline
GitHub Actions Workflows
The project uses GitHub Actions for automated testing and deployment:
# .github/workflows/ci.yml (example)
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v1
      - run: uv sync --extra dev
      - run: uv run pytest
      - run: uv run pip-audit
Manual CI Commands
# Run the same checks as CI locally
uv run pytest # Tests
uv run pylint aiagents4pharma/ # Static analysis (config in pyproject.toml)
uv run pip-audit # Security scan
uv run safety check # Alternative security scan
uv run bandit -r aiagents4pharma/ # Security scan
uv run mypy aiagents4pharma/ # Type checking
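The commands above can be chained in a small script that runs each one and reports which failed. This is only a sketch: `run_checks` is a hypothetical helper, not something shipped with the project, and the `runner` parameter exists so the loop can be tried without uv installed.

```python
import subprocess

# Commands copied from the list above.
CHECKS = [
    ["uv", "run", "pytest"],
    ["uv", "run", "pylint", "aiagents4pharma/"],
    ["uv", "run", "pip-audit"],
    ["uv", "run", "safety", "check"],
    ["uv", "run", "bandit", "-r", "aiagents4pharma/"],
    ["uv", "run", "mypy", "aiagents4pharma/"],
]


def run_checks(checks, runner=None):
    """Run every check and return the commands that exited non-zero."""
    if runner is None:
        def runner(cmd):
            return subprocess.run(cmd, check=False).returncode
    failed = []
    for cmd in checks:
        print("$", " ".join(cmd))
        if runner(cmd) != 0:
            failed.append(cmd)
    return failed


# Dry run with a stand-in runner so the sketch executes without uv;
# here only the pylint step is pretended to fail.
failed = run_checks(CHECKS, runner=lambda cmd: 1 if "pylint" in cmd else 0)
print("failed:", [" ".join(cmd) for cmd in failed])
```

Calling `run_checks(CHECKS)` without a runner executes the real commands. Running every check rather than stopping at the first failure gives a complete picture in one pass.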
Automated Security Workflows
The project includes comprehensive automated security scanning:
# Weekly security audit (runs automatically)
.github/workflows/security_audit.yml # pip-audit + safety + bandit
# SonarCloud analysis (artifact-based, runs after tests)
.github/workflows/sonarcloud.yml # Modern CI/CD with artifact reuse
# Manual security audit
uv run pip-audit --desc
uv run safety check --json
uv run bandit -c pyproject.toml -r aiagents4pharma/
Docker & Deployment
Building Docker Images
Each agent has its own Dockerfile:
# Build specific agent
docker build -f aiagents4pharma/talk2scholars/Dockerfile -t talk2scholars .
# Build all with docker-compose
docker-compose build
Production Deployment
# Install production dependencies only (excludes dev tools)
uv sync --frozen
# Build production package
uv build
# Install built package
pip install dist/aiagents4pharma-*.whl
Security Best Practices
Regular Security Scans
# Weekly security scan (runs automatically in CI)
uv run pip-audit --desc
uv run safety check --json
uv run bandit -c pyproject.toml -r aiagents4pharma/
# Check for outdated packages with vulnerabilities
uv run pip-audit --desc --format=json
Streamlit File Upload Security
The project implements comprehensive file upload security:
# Use secure file upload wrapper
from app.frontend.utils.streamlit_utils import secure_file_upload

# Secure PDF upload with validation
pdf_file = secure_file_upload(
    "Upload PDF",
    allowed_types=["pdf"],
    help_text="Upload a research paper",
    max_size_mb=50,
    accept_multiple_files=False,
)

# Secure data upload with multiple types
data_files = secure_file_upload(
    "Upload Data",
    allowed_types=["spreadsheet", "text"],
    help_text="Upload CSV or Excel files",
    max_size_mb=25,
    accept_multiple_files=True,
)
Security Features
- File type validation - Only allowed extensions (prevents malware.exe → report.pdf)
- MIME type checking - Detects file masquerading attacks
- File size limits - Prevents DoS attacks (configurable 25-50MB)
- Content scanning - Blocks suspicious patterns and scripts
- Filename sanitization - Prevents directory traversal attacks
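The checks listed above can be illustrated with a small standard-library sketch. This is not the project's `secure_file_upload` implementation: the extension table, the function names, and the use of a PDF signature check as a stand-in for libmagic-based MIME detection are all assumptions made for illustration.

```python
import os
import re

# Hypothetical type table for illustration only.
ALLOWED_EXTENSIONS = {"pdf": {".pdf"}, "spreadsheet": {".csv", ".xlsx"}}
PDF_MAGIC = b"%PDF-"


def sanitize_filename(name):
    """Drop any directory part and replace unusual characters."""
    base = os.path.basename(name.replace("\\", "/"))
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    return cleaned or "upload"


def validate_upload(name, data, allowed_types, max_size_mb):
    """Return (ok, reason) for an uploaded file's name and raw bytes."""
    safe_name = sanitize_filename(name)
    extension = os.path.splitext(safe_name)[1].lower()
    allowed = set().union(*(ALLOWED_EXTENSIONS.get(t, set()) for t in allowed_types))
    if extension not in allowed:
        return False, f"extension {extension!r} not allowed"
    if len(data) > max_size_mb * 1024 * 1024:
        return False, "file too large"
    # Stand-in for MIME detection: check the PDF signature directly.
    if extension == ".pdf" and not data.startswith(PDF_MAGIC):
        return False, "content does not look like a PDF"
    return True, "ok"


print(validate_upload("report.pdf", b"%PDF-1.7 ...", ["pdf"], 50))
print(validate_upload("renamed.pdf", b"MZ\x90", ["pdf"], 50))
print(sanitize_filename("../../etc/passwd"))
```

The second call shows why an extension check alone is not enough: a renamed executable passes the extension test and is only rejected by the content check.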
Dependency Updates
- Dependabot automatically creates PRs for security updates (weekly)
- Pre-commit hooks catch vulnerabilities before commit
- CI pipeline blocks PRs with security issues
- Weekly security audits with SARIF uploads to GitHub Security
API Key Management
# Set environment variables (never commit these!)
export OPENAI_API_KEY=sk-...
export NVIDIA_API_KEY=nvapi-...
export ZOTERO_API_KEY=...
export ZOTERO_USER_ID=...
# Use .env file for local development (add to .gitignore!)
echo "OPENAI_API_KEY=sk-..." >> .env
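The guide does not say how the .env file is read at runtime. As a minimal sketch, assuming simple KEY=VALUE lines, the file could be parsed with the standard library as shown here; `parse_env_file` is a hypothetical helper, the values are placeholders, and a dedicated package such as python-dotenv handles more edge cases (quoting rules, multi-line values).

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


example = """
# local development secrets (placeholder values)
OPENAI_API_KEY=sk-placeholder
ZOTERO_USER_ID="12345"
"""
settings = parse_env_file(example)
print(sorted(settings))
```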
Common Development Tasks
Starting Development
# 1. Create or update the virtual environment and install dependencies
uv sync --extra dev --frozen
# 2. Run pre-commit setup
uv run pre-commit install
# 3. Start coding!
Before Committing
# 1. Run quality checks
uv run ruff check --fix .
uv run pylint aiagents4pharma/
uv run mypy aiagents4pharma/
# 2. Run tests
uv run pytest
# 3. Security scan
uv run pip-audit
# 4. Commit (pre-commit will run automatically)
git add .
git commit -m "your message"
Adding a New Agent
- Create new directory: aiagents4pharma/talk2newagent/
- Add dependencies to pyproject.toml
- Update package configuration
- Add tests and documentation
- Update Docker configuration
Troubleshooting
Common Issues
Dependency Conflicts
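# Remove the virtual environment and lock file, then re-resolve from scratch
# (this discards the versions pinned in uv.lock; run uv cache clean to clear uv's cache as well)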
rm -rf .venv uv.lock
uv sync --extra dev
Pre-commit Issues
# Reinstall hooks
uv run pre-commit uninstall
uv run pre-commit install
# Update hook versions
uv run pre-commit autoupdate
Import Errors
# Verify installation
uv run python -c "import aiagents4pharma; print('OK')"
# Check Python path
uv run python -c "import sys; print(sys.path)"
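When an import fails, it helps to know whether Python can locate the module at all. The sketch below uses importlib.util.find_spec, which returns None for a top-level module that cannot be found; `check_imports` is a hypothetical helper and the module names in the demo are only examples.

```python
import importlib.util


def check_imports(module_names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


# "json" ships with Python; the second name is deliberately bogus.
status = check_imports(["json", "no_such_module_for_demo"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

Running this through uv run python with "aiagents4pharma" in the list checks the project's own environment rather than the system interpreter.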
Type Checking Errors
# Install missing type stubs
uv add --group dev types-requests types-PyYAML
# Run with verbose output
uv run mypy --verbose aiagents4pharma/
Performance Issues
# Show verbose output while syncing dependencies
uv sync --extra dev --verbose
# Check lock file
uv lock --verbose
Additional Resources
Core Tools
- uv Documentation - Modern Python package manager
- Hatchling - Modern build backend
- Ruff Rules - Fast Python linter
- MyPy Configuration - Static type checking
- Pre-commit Hooks - Git hook framework
Security Tools
- Bandit - Security linter for Python
- pip-audit - Dependency vulnerability scanner
- Safety - Dependency vulnerability checker
- python-magic - File type detection
- Streamlit Security Guide - File upload security implementation
CI/CD & Quality
- SonarCloud Setup Guide - Complete SonarCloud integration guide
- SonarCloud - Code quality and security analysis
- GitHub Actions - CI/CD workflows
- Dependabot - Automated dependency updates
Contributing
- Fork the repository
- Create feature branch: git checkout -b feature/amazing-feature
- Set up development environment: uv sync --extra dev --frozen
- Install pre-commit: uv run pre-commit install
- Make changes and ensure all checks pass
- Commit with descriptive message
- Push to your fork and create Pull Request
All contributions are automatically scanned for:
- Code formatting and style (Ruff)
- Type safety (MyPy - configured, ready to enable)
- Security vulnerabilities (Bandit + pip-audit + Safety)
- Test coverage (pytest with coverage reporting)
- Code quality (SonarCloud analysis)
- Dependency security (Automated weekly scans)
Happy coding!