Document Management Software Development for UK Chemical Industry

Upwork

Remote

1 day ago

No application

About

We are building a backend service that extracts structured data from messy Safety Data Sheets (SDS), typically PDF documents, using LLMs like GPT-4 and Python. SDSs follow a standard format in theory, but in reality they're highly inconsistent between manufacturers.

Our goal is to create a robust backend system that:
- Accepts SDS PDFs
- Extracts unstructured text
- Sends it to an LLM (currently GPT-4) with a carefully crafted prompt
- Returns structured JSON based on a defined SDS schema (which we'll provide)

We have already built a front-end prototype (in Replit), collected real-world SDS documents, and have working starter code in Flask using pdfplumber and the OpenAI API.

We're now looking for a developer to:
- Finalise and productionise the backend
- Integrate with our existing Replit front-end
- Package and deploy the service (e.g. Hugging Face Spaces, Render, or Replit)

The scope is well-defined, and this is a focused MVP build with real users in mind. If it goes well, there will be opportunities for additional work (e.g. OCR fallback, automated SDS auditing, or fine-tuning a custom model).
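
For applicants, the pipeline described in this posting (PDF in, text out, prompt to an LLM, validated JSON back) could be sketched roughly as follows. This is an illustration under stated assumptions, not the client's starter code: the field names in REQUIRED_FIELDS are hypothetical placeholders for the schema the client will provide, and call_llm is an assumed injection point for whatever function wraps the OpenAI API call.

```python
import json
from typing import Callable

# Hypothetical placeholder fields; the real SDS schema is supplied by the client.
REQUIRED_FIELDS = ("product_name", "supplier", "hazard_statements")


def extract_text(pdf_path: str) -> str:
    """Concatenate the text of every page of a PDF using pdfplumber."""
    import pdfplumber  # imported lazily so the other helpers work without it

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None for pages without a text layer
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def build_prompt(sds_text: str) -> str:
    """Build an extraction prompt that names the expected JSON keys."""
    fields = ", ".join(REQUIRED_FIELDS)
    return (
        "Extract the following fields from the Safety Data Sheet below and "
        f"reply with a single JSON object containing these keys: {fields}.\n\n"
        f"SDS TEXT:\n{sds_text}"
    )


def parse_llm_reply(reply: str) -> dict:
    """Parse the model reply and check that every required key is present."""
    data = json.loads(reply)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


def extract_sds(sds_text: str, call_llm: Callable[[str], str]) -> dict:
    """Run prompt -> LLM -> validated JSON; call_llm is an assumed wrapper."""
    return parse_llm_reply(call_llm(build_prompt(sds_text)))
```

In a Flask route, this would be called as extract_sds(extract_text(path), call_llm), with call_llm wrapping the actual model request. Passing the LLM call in as a function keeps the validation logic testable without network access or an API key.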