nvidia-python Pod¶
Standard OCI container - works with Docker, Podman, Kubernetes, Apptainer.
The nvidia-python pod provides a complete ML/AI development environment with PyTorch and CUDA support, managed by pixi for deterministic builds.
Overview¶
| Attribute | Value |
|---|---|
| Image | ghcr.io/atrawog/bazzite-ai-pod-nvidia-python:stable |
| Size | ~6GB |
| GPU | NVIDIA (CUDA 12.4) |
| Inherits | pod-nvidia |
| Foundation for | jupyter pod |
Quick Start¶
# With NVIDIA GPU
docker run -it --rm --gpus all -v $(pwd):/workspace \
ghcr.io/atrawog/bazzite-ai-pod-nvidia-python:stable
# CPU-only
docker run -it --rm -v $(pwd):/workspace \
ghcr.io/atrawog/bazzite-ai-pod-nvidia-python:stable
# AMD/Intel GPU
docker run -it --rm --device=/dev/dri -v $(pwd):/workspace \
ghcr.io/atrawog/bazzite-ai-pod-nvidia-python:stable
What's Included¶
ML/AI Stack¶
- PyTorch with CUDA 12.4 support
- torchvision - Computer vision models and transforms
- torchaudio - Audio processing
From nvidia Pod¶
- CUDA Toolkit 13.0
- cuDNN (Deep Neural Network library)
- TensorRT (inference optimization)
From base Pod¶
- Python 3.13, Node.js 23+, Go, Rust
- VS Code, Docker CLI, Podman
- kubectl, Helm, Claude Code
- Build tools (gcc, make, cmake, ninja)
Usage¶
Activate the ML Environment¶
The pod uses pixi for environment management:
# Activate the pixi environment
pixi shell --manifest-path /opt/pixi/pixi.toml
# Or run commands directly
pixi run --manifest-path /opt/pixi/pixi.toml python train.py
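To confirm you are actually running the pixi-managed interpreter rather than a system Python, a small sanity-check script helps (the file name `check_env.py` is just an example; the exact interpreter path depends on the image layout):

```python
# check_env.py - run via:
#   pixi run --manifest-path /opt/pixi/pixi.toml python check_env.py
import sys

# Under pixi this should point into the /opt/pixi environment,
# not the system interpreter.
print(sys.executable)
print(".".join(str(v) for v in sys.version_info[:3]))
```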
Verify GPU Access¶
import torch
# Check CUDA availability
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Device count: {torch.cuda.device_count()}")
print(f"Device name: {torch.cuda.get_device_name(0)}")
# Quick benchmark
x = torch.randn(1000, 1000, device='cuda')
y = torch.matmul(x, x)
print(f"Matrix multiplication on GPU: {y.shape}")
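Since the same image also runs CPU-only, it is worth writing scripts that pick the device at runtime instead of hard-coding `'cuda'`. A minimal device-agnostic sketch:

```python
import torch

# Select the best available device so the same script works in both
# the GPU and CPU-only variants of the pod.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(1000, 1000, device=device)
y = x @ x  # runs on whichever device was selected
print(f"matmul ran on: {y.device}")
```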
Training Example¶
import torch
import torch.nn as nn
import torch.optim as optim
# Define a simple model
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10)
).cuda()

# Training loop
optimizer = optim.Adam(model.parameters())
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    # Synthetic batch for illustration; replace with your DataLoader
    inputs = torch.randn(64, 784).cuda()
    targets = torch.randint(0, 10, (64,)).cuda()
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
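For inference after training, autograd should be disabled with `torch.no_grad()`. A minimal sketch, using the same 784-to-10 architecture but on CPU so it also runs in the CPU-only pod:

```python
import torch
import torch.nn as nn

# Same shape as the training example above, kept on CPU here.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()  # switch layers like dropout/batchnorm to eval mode

with torch.no_grad():  # no gradient tracking needed for inference
    batch = torch.randn(8, 784)
    probs = torch.softmax(model(batch), dim=1)
    preds = probs.argmax(dim=1)

print(preds.shape)  # one predicted class index per input row
```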
Environment Details¶
Pixi Project Location¶
/opt/pixi/
├── pixi.toml # Environment configuration
├── pixi.lock # Locked dependencies (deterministic)
└── .pixi/ # Installed packages
Environment Variables¶
| Variable | Value |
|---|---|
| NVIDIA_PYTHON_PROJECT | /opt/pixi |
| PATH | Includes /opt/pixi/bin, /usr/local/cuda/bin |
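Scripts can locate the pixi manifest through this variable instead of hard-coding the path; the fallback below simply mirrors the documented default:

```python
import os

# NVIDIA_PYTHON_PROJECT points at the pixi project root inside the pod;
# fall back to the documented default when the variable is unset.
project = os.environ.get("NVIDIA_PYTHON_PROJECT", "/opt/pixi")
manifest = os.path.join(project, "pixi.toml")
print(manifest)
```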
Common Tasks¶
Install Additional Packages¶
# Inside the pod, activate environment first
pixi shell --manifest-path /opt/pixi/pixi.toml
# Install with pip (inside pixi environment)
pip install transformers datasets accelerate
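To confirm the installs landed in the active environment, the standard-library `importlib.metadata` module can report each package's version (the package names here are the ones from the pip command above):

```python
from importlib import metadata

# Report the installed version of each package, or flag it as missing.
for pkg in ("transformers", "datasets", "accelerate"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")
```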
Run Jupyter Notebook¶
For interactive notebook development, use the jupyter pod instead, which includes JupyterLab pre-configured.
Export Trained Models¶
# Save PyTorch model
torch.save(model.state_dict(), '/workspace/model.pth')
# Export to ONNX for TensorRT optimization
dummy_input = torch.randn(1, 784, device='cuda')  # example input matching the model's input shape
torch.onnx.export(model, dummy_input, '/workspace/model.onnx')
Workspace¶
Your current directory is mounted at /workspace:
# On host
cd ~/projects/my-ml-project
# Docker/Podman
docker run -it --rm --gpus all -v $(pwd):/workspace \
ghcr.io/atrawog/bazzite-ai-pod-nvidia-python:stable
# Inside pod - your files are here
ls /workspace/
Troubleshooting¶
CUDA Not Available¶
- Ensure an NVIDIA GPU is present: run nvidia-smi on the host
- For Docker: install the NVIDIA Container Toolkit
- For Bazzite AI OS: run ujust setup-gpu-pods (one-time)
Out of Memory¶
# Clear GPU memory
torch.cuda.empty_cache()
# Use gradient checkpointing for large models
from torch.utils.checkpoint import checkpoint
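Gradient checkpointing trades compute for memory: activations inside the checkpointed segment are recomputed during the backward pass instead of being stored. A minimal CPU sketch of the pattern (the two-segment split is illustrative, not a recommendation for where to checkpoint):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Split the model so the first segment's activations can be recomputed
# during backward instead of kept in memory.
seg1 = nn.Sequential(nn.Linear(784, 256), nn.ReLU())
seg2 = nn.Linear(256, 10)

x = torch.randn(64, 784, requires_grad=True)
h = checkpoint(seg1, x, use_reentrant=False)  # activations not stored
out = seg2(h)
out.sum().backward()  # seg1 is re-run here to rebuild activations
print(x.grad.shape)
```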
Pixi Environment Issues¶
See Also¶
- jupyter pod - For interactive notebook development
- Deployment Guide - All deployment methods
- Pod Architecture - How pods relate to each other