
Reinforcement Fine Tuning Expert Needed for GRPO
Upwork
Remote
•7 hours ago
•No application
About
I need help running reinforcement fine-tuning with GRPO. The reward function and dataset are ready, but I need help configuring the run for maximum throughput on the target machine (4x or 8x H200). You should also be an expert in reinforcement learning, able to review the results and adjust the training setup, including the hyperparameters and the reward function.