AI Backend Engineer
Fundo.one | Genoa (16151), Liguria, Italy
Full Time | Listed on 2026-01-12
Job specializations: IT/Tech - AI Engineer, Machine Learning/ML Engineer

Job Description
Fundo.one is an AI‑powered financing‑access platform built to transform how SMEs secure the capital they need to grow. With a uniquely compelling product, an exceptional founding team, a vast addressable market, and strong backing, we’re positioned for something big.
Join us in Genoa at a moment when the AI revolution is redefining how the world builds, learns, and creates. We’re a fast‑moving startup driven by high standards, deep curiosity, and a culture that brings out the best in each of us.
Here, you won’t just write code: you’ll craft end-to-end agentic AI products, take genuine ownership, and experience the excitement of building something bold from the ground up. If you thrive where excellence, commitment, and ambitious thinking intersect, you’ll feel right at home with us.
We’re building a production LLM-driven product (with an MVP already proven) that helps organizations identify, prioritize, and write competitive grant applications. You’ll help design, implement, and operate the backend systems that power the LLM, retrieval, data pipelines, and evaluation — turning research into a reliable, secure, and scalable product.
Your role
Design, build, and operate the backend systems that serve the Grant Optimizer LLM: model hosting, prompt orchestration, RAG pipelines, embeddings store, and inference APIs.
Productionize fine-tuning and continual learning workflows (supervised fine-tuning, LoRA/QLoRA, and RLHF, where applicable) and automate dataset curation based on user interactions and labeled outcomes.
Implement retrieval (vector database, chunking, metadata, MMR) and document ingestion pipelines for large collections of grant calls, program rules, and applicant documents.
Build robust evaluation pipelines (including automated metrics, human-in-the-loop feedback, and A/B testing) to measure relevance, factuality, and success in improving grant quality/award probability.
Ensure privacy, compliance, and data governance for sensitive grant documents (PII redaction, encryption, access controls).
Optimize cost/performance for inference (batching, quantization, multi-GPU orchestration, autoscaling).
Collaborate with product, research, and frontend teams to translate user workflows (scoring, ranking, drafting, suggestions) into reliable APIs and event processing.
Create monitoring, alerting, and observability for model performance, data drift, latency, and error budgets.
Produce clear docs and runbooks, and onboard engineers to the LLM backend.
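The role above mentions prompt orchestration and RAG pipelines with provenance. As a flavor of that work, here is a minimal sketch of assembling a grounded prompt from retrieved chunks; the `Chunk` type, `build_grounded_prompt` helper, and the sample documents are illustrative assumptions, not Fundo.one's actual code.

```python
# Illustrative RAG prompt-assembly step: retrieved chunks carry source
# metadata so the model's answer can cite provenance. All names here are
# hypothetical examples, not the employer's real pipeline.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str    # retrieved passage
    source: str  # provenance, e.g. a grant-call document ID
    score: float # retrieval similarity score

def build_grounded_prompt(question: str, chunks: list[Chunk]) -> str:
    """Format retrieved chunks into a numbered, citable context block."""
    ranked = sorted(chunks, key=lambda c: c.score, reverse=True)
    context = "\n\n".join(
        f"[{i + 1}] (source: {c.source})\n{c.text}"
        for i, c in enumerate(ranked)
    )
    return (
        "Answer using only the numbered context below and cite sources like [1].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

# Toy usage with invented grant documents:
prompt = build_grounded_prompt(
    "What is the maximum grant amount?",
    [Chunk("Eligible SMEs must be EU-registered.", "rules-v2.pdf", 0.84),
     Chunk("Grants are capped at EUR 200k.", "call-2026-07.pdf", 0.91)],
)
```

The highest-scoring chunk becomes context item [1], so the model's citations map back to concrete source documents.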
Responsibilities
Implement inference services (fast REST/gRPC endpoints) for generation and ranking.
Build and maintain ETL pipelines for ingesting grant notices, historical bids, scoring outcomes, and user documents.
Manage embeddings pipeline: text chunking, embedding generation, index creation and maintenance (FAISS / Pinecone / Weaviate / etc.).
Automate SFT/fine-tuning and evaluation workflows (training infra + dataset versioning).
Apply retrieval-augmented generation techniques and ensure timely, accurate retrieval with provenance.
Implement rate-limiting, request validation, caching layers, and cost controls for external LLM providers and self-hosted models.
Run experiments to optimize prompts, system messages, and chain-of-thought strategies for grant drafting tasks.
Lead security reviews and implement access controls, secure token handling, and audit logging.
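The retrieval work above mentions MMR (maximal marginal relevance), which re-ranks candidates to balance query relevance against redundancy with already-selected results. A minimal sketch over toy embeddings; the `mmr` helper and the vectors are illustrative, not the product's actual retrieval stack.

```python
# Illustrative MMR re-ranking over toy 2-D embeddings. Real pipelines
# would use vector-store candidates; this sketch shows only the scoring.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mmr(query, docs, k=2, lam=0.7):
    """Greedily pick k doc indices maximizing lam*relevance - (1-lam)*redundancy."""
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, docs[i])
            redundancy = max((cosine(docs[i], docs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = [1.0, 0.0]
docs = [[0.8, 0.6],    # relevant
        [0.8, 0.6],    # exact duplicate of doc 0
        [0.6, -0.8]]   # less relevant but diverse
picked = mmr(query, docs, k=2, lam=0.7)
```

Plain top-k by similarity would return the duplicate pair (0, 1); MMR instead picks doc 0 and the diverse doc 2, which is the point of the technique for deduplicating grant-call passages.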
Technical Must-have
Strong Python proficiency and experience building backend services (Flask/FastAPI, async frameworks).
3+ years building ML systems in production; proven experience with LLMs in production (Hugging Face, OpenAI, Anthropic, or self-hosted).
Practical experience with retrieval systems and vector search (FAISS, Milvus, Pinecone, Weaviate, etc.).
Experience with model fine-tuning workflows (LoRA/QLoRA/SFT), dataset management, and training infra.
Cloud experience: deploying and operating services on AWS / GCP / Azure (Kubernetes, ECS, IAM, S3, Cloud SQL, etc.).
LLM tooling: Hugging Face Transformers, PEFT/LoRA, OpenAI API (or other hosted providers), LangChain/LlamaIndex.
Familiar with model optimization techniques (quantization, batching, sharding) and cost/perf…
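The responsibilities above call for rate-limiting and cost controls when calling external LLM providers. One common approach is a token bucket; this `TokenBucket` class and its parameters are a generic sketch under that assumption, not a named library API.

```python
# Minimal token-bucket rate limiter for outbound LLM provider calls:
# sustained `rate` requests/sec with bursts up to `capacity`. Illustrative
# only; production systems would also track per-tenant cost budgets.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.clock = clock        # injectable for deterministic tests
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; otherwise reject the request."""
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# Deterministic demo with a fake clock:
t = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2.0, clock=lambda: t[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]  # third call exhausts the bucket
t[0] = 1.0              # one simulated second later: one token refilled
refilled = bucket.allow()
```

Injecting the clock keeps the limiter testable; in production the default `time.monotonic` is used and rejected calls can be queued or surfaced as 429s.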