MLOps Engineer

2 weeks ago

Lisbon, Lisboa, Portugal TransPerfect Full-time

Job description

We are looking to hire an MLOps Engineer with strong expertise in machine learning, speech and language processing, and multimodal systems. This role is essential to driving our product roadmap forward, particularly in deploying, testing, evaluating, and monitoring our core machine learning systems and developing next-generation speech technologies.

The ideal candidate will be capable of working independently while effectively collaborating with cross-functional teams. In addition to deep technical knowledge, we are looking for someone who is curious, experimental, and communicative.

Key Responsibilities

Essential

Design and maintain CI/CD pipelines for automated model training, testing, and deployment.
Build container orchestration solutions (Docker, Kubernetes) for model serving at scale.
Implement deployment strategies (blue-green, canary, A/B testing) for safe model rollouts.
Develop Infrastructure as Code (Terraform, CloudFormation) for reproducible ML environments.
Optimize model serving infrastructure for latency, throughput, and cost efficiency.
Manage model versioning, registry, and artifact storage systems.
Build real-time monitoring dashboards for model performance, latency, and resource utilization.
Implement automated alerting systems for model degradation and anomaly detection.
Design feature drift detection and data quality monitoring for production traffic.
Track business metrics and ROI analysis for model deployments.
Build specialized inference pipelines for speech-to-text and text-to-speech models.
Optimize speech model performance for real-time and batch processing scenarios.
Design evaluation frameworks specific to speech quality metrics (WER, latency, naturalness).
Handle multi-modal data pipelines combining audio, text, and metadata.
Create feedback loops to capture user interactions and model effectiveness.
Create automated retraining pipelines based on performance degradation signals.
Develop business metrics and ROI analysis for model deployments.
Implement experiment tracking systems (MLflow, Weights & Biases) for reproducibility.
Design hyperparameter optimization frameworks for efficient model tuning.
Conduct statistical analysis of training dynamics and convergence patterns.
Create automated model selection pipelines based on multiple evaluation criteria.
Develop cost-benefit analyses for different training configurations and architectures.
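
For illustration only, the word error rate (WER) named among the speech quality metrics above is commonly computed as the word-level edit distance between a reference transcript and a model hypothesis, divided by the number of reference words. The sketch below assumes whitespace tokenization and no text normalization; the function name `word_error_rate` is hypothetical and is not taken from any library mentioned in this posting.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: words are split on whitespace, with no casing or
    punctuation normalization applied beforehand.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 / 6.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A production evaluation framework would typically normalize text first (casing, punctuation, numerals) and report latency and naturalness alongside WER.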

Additional Responsibilities

Implement automated evaluation pipelines that scale across multiple models and benchmarks.
Design comprehensive test suites with statistical significance testing for model comparisons.
Develop fairness metrics and bias detection systems for speech models across demographics.
Perform statistical analysis of training datasets to identify quality issues and coverage gaps.
Create interactive dashboards and visualization tools for model performance analysis.
Build A/B testing frameworks for comparing model versions in production.
Build and maintain ETL pipelines using SQL, Azure, GCP, and AWS technologies.
Design data ingestion systems for massive-scale speech and text corpora.
Implement data validation frameworks and automated quality checks.
Create sampling strategies for balanced and representative training datasets.
Develop data preprocessing and cleaning pipelines for audio and text.
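
As a rough illustration of the statistical significance testing for model comparisons mentioned above, one common option is a paired bootstrap over per-example scores. The sketch below assumes both models were scored on the same examples and that higher scores are better; `paired_bootstrap` and its parameters are hypothetical names, and other tests (for example a permutation test) may suit a given setup better.

```python
import random
from statistics import mean


def paired_bootstrap(scores_a, scores_b, n_resamples=2000, seed=0):
    """Compare two models scored on the same examples.

    Returns (observed_mean_difference, fraction_not_better), where
    fraction_not_better is the share of bootstrap resamples in which
    model A's mean score does not exceed model B's.
    Assumption: higher scores are better.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and the same length")

    rng = random.Random(seed)  # fixed seed for reproducibility
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = mean(diffs)
    n = len(diffs)

    not_better = 0
    for _ in range(n_resamples):
        # Resample per-example differences with replacement.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if mean(sample) <= 0:
            not_better += 1
    return observed, not_better / n_resamples


if __name__ == "__main__":
    model_a = [0.91, 0.85, 0.78, 0.88, 0.80]  # made-up per-example scores
    model_b = [0.89, 0.86, 0.75, 0.84, 0.79]
    diff, frac = paired_bootstrap(model_a, model_b)
    print(f"mean difference: {diff:.3f}, fraction not better: {frac:.3f}")
```

A small fraction suggests model A's advantage is unlikely to be a resampling artifact. The scores in the usage example are invented purely for demonstration.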

Job requirements

Required Skills, Experience and Qualifications:

Programming & Software Engineering:

Python (Expert Level): Advanced proficiency in scientific computing stack (NumPy, Pandas, SciPy, Scikit-learn).
Version Control: Git workflows, collaborative development, and code review processes.
Software Engineering Practices: Testing frameworks, CI/CD pipelines, and production-quality code development.

Machine Learning and Language Model Expertise:

Traditional Machine Learning and Deep Learning Knowledge: Proficiency in classical ML algorithms (Naive Bayes, SVM, Random Forest, etc.) and Deep Learning architectures.
Understanding of Transformer Architecture: Attention mechanisms, positional encoding, and scaling laws.
Training Pipeline Knowledge: Data preprocessing for large corpora, tokenization strategies, and distributed training concepts.
Evaluation Frameworks: Experience with standard NLP benchmarks (GLUE, SuperGLUE, etc.) and custom evaluation design.
Fine-tuning Techniques: Understanding of PEFT methods, instruction tuning, and alignment techniques.
Model Deployment: Knowledge of model optimization, quantization, and serving infrastructure for large models.

Additional Skills, Experience and Qualifications:

Machine Learning & Deep Learning:

Framework Proficiency: Scikit-learn, XGBoost, PyTorch (preferred) or TensorFlow for model implementation and experimentation.
MLOps Expertise: Model versioning, experiment tracking, model monitoring (MLflow, Weights & Biases), data monitoring, observability and validation (Great Expectations, Prometheus, Grafana), and automated ML pipelines (GitHub CI/CD, Jenkins, CircleCI, GitLab, etc.).
Statistical Modeling: Hypothesis testing, experimental design, causal inference, and Bayesian statistics.
Model Evaluation: Cross-validation strategies, bias-variance analysis, and performance metric design.
Feature Engineering: Advanced techniques for text, time-series, and multimodal data.

Data Engineering & Infrastructure:

Speech Processing Libraries: Librosa, Torchaudio, SpeechBrain, Kaldi, ESPnet
Feature Stores and Data Versioning: Feast, Tecton, DVC
Big Data Technologies: Spark (PySpark), Hadoop ecosystem, and distributed computing frameworks (DDP, TP, FSDP).
Cloud Platforms: AWS (SageMaker, Bedrock, S3, EMR), GCP (Vertex AI, BigQuery), or Azure ML.
Database Systems: NoSQL databases (MongoDB, Elasticsearch), graph databases (Neo4j), and vector databases (Pinecone, Milvus, ChromaDB, FAISS, etc.).
Data Pipeline Tools: Airflow, Prefect, or similar orchestration frameworks.
Containerization: Docker, Kubernetes for scalable model deployment
Model Serving Frameworks: TorchServe, TensorFlow Serving, Triton
Infrastructure as Code Tools: Terraform, CloudFormation

Collaboration & Adaptability:

Strong communication skills are a must
Self-reliant but knows when to ask for help
Comfortable working in an environment where conventional development practices may not always apply: PBIs (Product Backlog Items) may not be highly detailed, and experimentation will be necessary
Ability to identify what's important in completing a task or partial task and explain/justify their approach
Can effectively communicate ideas and strategies
Proactive and takes initiative rather than waiting for PBIs to be assigned when circumstances call for it
Strong interest in AI and its possibilities; a genuine passion for certain areas can provide that extra spark
Curious and open to experimenting with technologies or languages outside their comfort zone

Mindset & Work Approach:

Takes ownership when things don't go as planned
Capable of working from high-level explanations and general guidance on implementations and final outcomes
Continuous, clear communication is crucial; detailed step-by-step instructions won't always be available
Self-starter, self-motivated, and proactive in problem-solving
Enjoys exploring and testing different approaches, even in unfamiliar programming languages

Hybrid

Lisbon, Lisboa, Portugal

Tech Full-time, Permanent
