**Computer Vision**
- YOLOv8 / YOLO11 / YOLO26 / RT-DETR
- Detect, Segment, Classify, Pose, OBB
- Train, Validate, Export (ONNX/TensorRT/CoreML)
- Zero-copy GPU inference with `TensorBufferPool`

**Interactive Segmentation**
- SAM 2 — point/box prompts + video tracking
- SAM 3 — natural language ("find all cars")
- CLIP-powered semantic understanding

**Body & Face**
- MediaPipe — hand tracking (21 pts)
- Face mesh (478 pts), pose estimation (33 pts)

**Image Processing**
- OpenCV — blur, edges, contours, morphology
- Color conversion, thresholding, resize

**Large Language Models**
- HuggingFace model download & caching
- Chat inference with `ChatMessage` API
- LoRA/QLoRA fine-tuning + real-time callbacks
- Async training, adapter merge, quantization

**Production-Grade**
- Full PyTorch — CPU / Apple MPS / NVIDIA CUDA / Multi-GPU
- Single JVM process — no Python server needed
- Thread-safe engine with `ReadWriteLock`
- Auto-download Python, deps, and model weights

### Traditional Java ML vs jpy-ml
| Traditional Java ML | jpy-ml |
|---|---|
| Wrap REST calls to a Python server | ML runs **in-process** via JNI — zero network latency |
| Manually install Python + pip + torch | **Auto-downloads** Python, all deps, and model weights |
| Parse untyped JSON from model APIs | **Strongly typed** results: `DetectionResult`, `PoseResult`, ... |
| Deploy 2 services (Java app + Python API) | **Single JVM process** — simpler ops, lower cost |
| Often limited to ONNX Runtime, commonly CPU-only | **Full PyTorch** — CPU, Apple MPS, NVIDIA CUDA, Multi-GPU |
| Only inference | **Inference + Training + Validation + Export + LLM Fine-tuning** — full lifecycle |
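
The "strongly typed results" row can be illustrated with a minimal sketch in plain Java (16 or newer). The `Detection` record and both helper methods are hypothetical stand-ins written for this example; they are not the library's `DetectionResult` API, whose exact shape is not shown here. The point is only the contrast between casting values out of a `Map` and reading compiler-checked fields.

```java
import java.util.List;
import java.util.Map;

public class TypedResultSketch {
    // Hypothetical stand-in for a typed detection; not the actual jpy-ml class.
    record Detection(String label, double confidence) {}

    // Untyped style: each field access needs a cast that can fail at runtime.
    static double confidenceFromMap(Map<String, Object> raw) {
        return ((Number) raw.get("confidence")).doubleValue();
    }

    // Typed style: field names and types are checked by the compiler.
    static List<Detection> above(List<Detection> detections, double threshold) {
        return detections.stream()
                .filter(d -> d.confidence() >= threshold)
                .toList();
    }

    public static void main(String[] args) {
        Map<String, Object> raw = Map.of("label", "car", "confidence", 0.91);
        System.out.println(confidenceFromMap(raw));

        List<Detection> typed = List.of(
                new Detection("car", 0.91),
                new Detection("dog", 0.42));
        System.out.println(above(typed, 0.5));
    }
}
```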
---
## Features
### Core Framework
- **Embedded Python Runtime** — full CPython embedded in JVM via Jep (JNI), auto-managed lifecycle
- **Zero-Config Environment** — auto-downloads Python (production) or uses local venv (dev)
- **Thread-Safe Engine** — singleton PythonEngine with ReadWriteLock, safe for concurrent use
- **Type-Safe Java APIs** — strongly typed configs, results, and callbacks — no `Map` casting in user code
- **Transparent Python Bridge** — `PythonEngine` for arbitrary Python/NumPy when you need it
- **SLF4J Logging** — proper logging framework integration (Logback)
- **Exception Hierarchy** — `JpyMlException` base class with typed exceptions
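
As a rough illustration of the thread-safe singleton idea above, the following sketch guards shared state with a `ReentrantReadWriteLock` from the JDK. `EngineSketch`, its model registry, and the choice of which operations take the read lock versus the write lock are assumptions made for this example; the source does not describe how `PythonEngine` partitions its operations internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical class for illustration only; not the jpy-ml PythonEngine.
public final class EngineSketch {
    // Initialization-on-demand holder: the JVM creates the instance lazily
    // and safely on first access, without explicit synchronization.
    private static final class Holder {
        static final EngineSketch INSTANCE = new EngineSketch();
    }

    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> loadedModels = new HashMap<>();

    private EngineSketch() {}

    public static EngineSketch getInstance() {
        return Holder.INSTANCE;
    }

    // Any number of threads may hold the read lock at the same time.
    public String lookup(String name) {
        lock.readLock().lock();
        try {
            return loadedModels.get(name);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it blocks readers and other writers.
    public void register(String name, String path) {
        lock.writeLock().lock();
        try {
            loadedModels.put(name, path);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        EngineSketch engine = EngineSketch.getInstance();
        engine.register("detector", "weights/detector.pt");
        System.out.println(engine.lookup("detector"));
    }
}
```

Acquiring the lock before the `try` and releasing it in `finally` keeps the lock from leaking if the guarded code throws.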
### Computer Vision (Ultralytics YOLO)
- **Unified Model API** — single `Model` class for all architectures and tasks
- **6 Model Families** — YOLOv8, YOLO11, YOLO26, RT-DETR, SAM, plus ONNX Runtime inference
- **5 Task Types** — Detect, Segment, Classify, Pose Estimation, OBB
- **Full Lifecycle** — predict, train, validate, export (ONNX/TensorRT/CoreML/TFLite/...)
- **Rich Result Types** — BoundingBox, Mask, Keypoint, RotatedBoundingBox with filter/query helpers
- **Device Abstraction** — CPU / MPS (Apple Silicon) / CUDA GPU / Multi-GPU via `Device` class
- **Epoch Callbacks** — real-time training progress with per-epoch loss/fitness metrics
- **Per-Class Validation** — mAP50, mAP50-95, precision, recall broken down by class
- **Image Annotation** — draw results on images via PIL, supports all task types
- **Zero-Copy Bridge** — `TensorBufferPool` + `RawDetectionResult` for high-performance inference
- **GPU Memory Management** — `warmup()`, `unload()`, `reload(device)` APIs
- **Direct Image Input** — `predict(byte[])`, `predict(BufferedImage)` — no temp files needed
- **Async API** — `predictAsync()` returning `CompletableFuture