Reinforcement Fine Tuning Expert Needed for GRPO

Upwork

Remote

•

7 hours ago

•

No application

About

I need help running reinforcement fine-tuning with GRPO. The reward function and dataset are ready, but I need help configuring the training setup for maximum throughput on the machine used (4x or 8x H200). You also need to be an expert in reinforcement learning, able to review the results and adjust the training setup, including the hyperparameters and the reward function.
