Senior & Staff Data Engineer (Databricks)

4 days ago


Porto, Porto, Portugal · HumanIT Digital Consulting · Full-time


ABOUT THE OPPORTUNITY

Join a leading technology company as a Senior or Staff Data Engineer specializing in Azure Databricks and build high-performance, resilient, and scalable data systems that deliver impact for users and businesses across the retail sector worldwide.

You'll be working for a software engineering company where the whole team owns the project together in a collaborative, politics-free environment. The culture reflects a lean, self-management attitude that encourages taking risks, making decisions, working collaboratively, and enhancing communication across all levels. Freedom and responsibility go hand in hand as you navigate an Agile, Lean environment focused on continuous learning and delivering impactful data solutions.

Your role centers on architecting and implementing sophisticated data pipelines and platforms using Azure Databricks, Azure Data Factory, and modern data engineering tools. You'll leverage your deep expertise in Python and SQL to build ETL/ELT processes, work with advanced data modeling methodologies, and contribute to the data platform that powers business intelligence, analytics, and predictive modeling capabilities for retail operations at scale.

Critical Requirements: This is a senior to staff-level position requiring proven Data Engineering experience in fast-paced environments with mandatory expertise in Azure Databricks, Azure Data Factory, Python, SQL, ETL/ELT processes, Terraform, and the Azure cloud platform. You must have a strong understanding of data modeling methodologies (Kimball, Inmon, Data Vault), experience with CI/CD tools, and advanced English (C1 level) for effective communication. You must be located in Portugal.

PROJECT & CONTEXT

You'll be building and maintaining data engineering infrastructure for retail operations, working with Azure cloud services to deliver scalable data platforms that support business intelligence, analytics, predictive modeling, and data-driven decision-making across the organization. Your work directly impacts how the retail business understands customers, optimizes operations, and drives strategic initiatives through data insights.

Azure Databricks is your primary platform - you'll apply extensive hands-on experience designing and implementing data processing workflows, building streaming and batch data pipelines, optimizing Spark jobs for performance and cost efficiency, and leveraging Databricks features including Delta Lake, MLflow, and collaborative notebooks. Your expertise with Databricks enables you to build robust, scalable data solutions that process massive volumes of retail transaction data, customer information, and operational metrics.

Azure Data Factory is essential for orchestration - you'll design and implement complex ETL/ELT workflows using ADF, create and manage pipeline activities, configure triggers and scheduling, integrate with various data sources and destinations, and ensure reliable data movement across the platform. Understanding how to combine ADF with Databricks for comprehensive data processing enables you to build end-to-end data solutions.

Your deep expertise in Python and experience organizing Python-based projects means you'll write clean, maintainable data engineering code, develop reusable libraries and frameworks, implement data quality checks and validation logic, and follow software engineering best practices for data pipelines. You'll be an expert developer using SQL and SQL-like query languages, writing complex analytical queries, optimizing query performance, and working with large-scale data transformations.
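As a rough illustration of the data quality checks this kind of work involves, here is a minimal sketch in plain Python (the field names and rules are hypothetical, not taken from the posting; real pipelines would implement this in PySpark or a dedicated framework):

```python
def validate_rows(rows, required_fields, min_amount=0.0):
    """Split records into valid and rejected lists based on simple quality rules."""
    valid, rejected = [], []
    for row in rows:
        # Reject records missing any required field
        if any(row.get(f) in (None, "") for f in required_fields):
            rejected.append((row, "missing_field"))
        # Reject records whose amount falls below the allowed minimum
        elif row.get("amount", 0) < min_amount:
            rejected.append((row, "amount_below_minimum"))
        else:
            valid.append(row)
    return valid, rejected
```

Factoring checks into small, reusable functions like this is what makes validation logic easy to unit-test and share across pipelines.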

Data modeling methodologies form the foundation of your architectural decisions - you'll apply a strong understanding of Kimball dimensional modeling for data warehouses, Inmon enterprise data warehouse approaches, and Data Vault 2.0 for enterprise data architecture. Your ability to select and apply the appropriate methodology based on business requirements, scalability needs, and organizational context ensures data platforms are architected for long-term success.
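To make the Kimball idea concrete, here is a toy sketch of deriving a dimension and a fact table with surrogate keys from raw transactions (a simplified illustration with hypothetical names; real warehouses do this in SQL or dbt over proper tables):

```python
def build_star_schema(transactions):
    """Derive a product dimension and a sales fact table (Kimball-style star schema)."""
    dim_product, fact_sales = {}, []
    for t in transactions:
        # Assign a surrogate key the first time each product is seen
        key = dim_product.setdefault(t["product"], len(dim_product) + 1)
        # Fact rows reference the dimension via the surrogate key
        fact_sales.append({"product_key": key, "amount": t["amount"]})
    return dim_product, fact_sales
```

The point of the pattern: descriptive attributes live once in the dimension, while the fact table stays narrow and references them by key.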

ETL and ELT processes are your core competency - you'll draw on extensive experience designing extraction logic for diverse source systems, implementing transformation logic that cleanses and enriches data, loading data into target systems efficiently, and choosing between ETL and ELT patterns based on requirements. Understanding modern ELT approaches that leverage cloud compute power enables you to build efficient, cost-effective data pipelines.
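The extract/transform/load separation can be sketched as three small functions (an illustrative in-memory toy with hypothetical fields; a real pipeline would read from source systems and write to a warehouse):

```python
from typing import Iterable


def extract(source: Iterable[dict]) -> list[dict]:
    # In practice this reads from a source system (database, API, files)
    return list(source)


def transform(records: list[dict]) -> list[dict]:
    # Cleanse (normalize names) and enrich (compute a derived total)
    return [
        {**r, "name": r["name"].strip().title(), "total": r["qty"] * r["price"]}
        for r in records
    ]


def load(records: list[dict], target: list) -> None:
    # Append to an in-memory "target"; a real pipeline writes to storage
    target.extend(records)
```

In an ELT variant, `load` would run before `transform`, and the transformation would be pushed down into the warehouse to exploit cloud compute.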

Azure Cloud platform expertise is fundamental - you'll work extensively with Azure services including Azure Storage (Blob, Data Lake Gen2), Azure SQL Database, Azure Synapse Analytics, Azure Key Vault, Azure Monitor, and other platform capabilities. Understanding Azure architecture, networking, security, and cost optimization ensures you build cloud-native data solutions that are secure, performant, and cost-efficient.

Infrastructure as Code with Terraform enables repeatable, version-controlled infrastructure deployment - you'll define Azure resources declaratively, manage infrastructure state, implement CI/CD for infrastructure changes, and ensure consistency across environments. Your IaC expertise brings software engineering rigor to infrastructure management.

CI/CD automation is essential for data pipeline reliability - you'll implement automated testing, build continuous integration pipelines for data code, establish deployment automation using tools like GitLab, and ensure data projects follow DevOps best practices. Understanding how to apply CI/CD principles to data engineering ensures quality, speed, and reliability in data platform evolution.

Working in a fast-paced retail environment requires balancing multiple priorities, adapting to changing business requirements, delivering iteratively, and maintaining communication with business stakeholders, analysts, and leadership. Your ability to tailor communication for different audiences and ensure effective collaboration across technical and business teams is essential.

Core Tech Stack: Azure Databricks, Python, SQL, Azure Data Factory, Terraform, Azure (Blob Storage, Data Lake Gen2, Synapse), CI/CD tools (GitLab)

Data Engineering Focus: ETL/ELT pipelines, data modeling, data warehousing, streaming data, batch processing, data quality

Methodologies: Kimball dimensional modeling, Inmon enterprise data warehouse, Data Vault 2.0

Domain: Retail operations, customer analytics, business intelligence, predictive modeling

Culture: Agile, Lean, collaborative, self-managed, risk-taking encouraged, continuous learning

Scale: Enterprise data volumes, retail transaction processing, multi-source data integration

WHAT WE'RE LOOKING FOR (Required)

Data Engineering Experience: Proven experience as a Data Engineer in fast-paced environments with a track record of delivering scalable data solutions - this is the core requirement

Azure Databricks Expertise: MANDATORY - Extensive hands-on experience with Azure Databricks including Spark processing, Delta Lake, pipeline development, and performance optimization

Azure Data Factory: MANDATORY - Extensive experience with Azure Data Factory for orchestrating ETL/ELT workflows, pipeline creation, and data integration

Python Mastery: MANDATORY - Deep expertise in Python with experience organizing Python-based data engineering projects, writing production-quality code, and developing reusable frameworks

SQL Expertise: MANDATORY - Expert-level proficiency using SQL and SQL-like query languages for complex data transformations, analytical queries, and performance optimization

ETL/ELT Experience: MANDATORY - Extensive experience with both ETL and ELT processes including designing extractions, implementing transformations, and loading data efficiently

Azure Cloud Platform: MANDATORY - Experience with Microsoft Azure cloud services including storage, compute, networking, security, and data services (AWS experience may be considered, but Azure is strongly preferred)

Data Modeling Methodologies: MANDATORY - Strong understanding of different data modeling methodologies including Kimball dimensional modeling, Inmon enterprise data warehouse, and Data Vault approaches

Terraform: MANDATORY - Extensive experience with Terraform for Infrastructure as Code, defining cloud resources, and managing infrastructure deployments

CI/CD Tools: MANDATORY - Extensive experience with CI/CD tools and practices for automating data pipeline testing and deployment

Cloud Services Understanding: Experience with cloud services architecture, deployment patterns, security best practices, and cost optimization strategies

Data Quality: Understanding of data quality principles, validation techniques, and ensuring data integrity throughout pipelines

Performance Optimization: Ability to optimize data processing performance, tune Spark jobs, and manage resource utilization efficiently

Analytical Thinking: Strong analytical skills for understanding business requirements, designing data solutions, and solving complex data challenges

Communication Excellence: MANDATORY - Excellent communication skills for collaborating with developers, analysts, business stakeholders, and leadership, tailoring communication for different audiences

English Proficiency: C1 level (Advanced) in English for technical communication, documentation, stakeholder engagement, and team collaboration - this is mandatory

Work Authorization: Must be located in Portugal with eligibility for full remote work

NICE TO HAVE (Preferred)

Automated Testing Frameworks: Experience defining and crafting automated unit and integration testing frameworks specifically for data projects, ensuring data pipeline reliability

Business Intelligence Experience: Previous experience as Business Intelligence professional or Business Analyst, understanding BI requirements and analytics use cases

Visualization Tools: Experience with business intelligence and visualization tools including Tableau, Spotfire, Power BI, or similar platforms for creating insights from data

Predictive Modeling: Expertise in predictive modeling, machine learning pipeline integration, and deploying models within data platforms

Data Governance: Experience with data governance practices, metadata management, data lineage, data catalogs, and compliance requirements

Snowflake: Experience with Snowflake data warehouse platform as alternative or complement to Azure Synapse

GitLab CI/CD: Specific experience with GitLab for CI/CD automation including pipeline configuration and deployment workflows

CloudFormation: Experience with AWS CloudFormation as additional Infrastructure as Code tool beyond Terraform

AWS Cloud: Experience with AWS cloud services as alternative or complement to Azure platform knowledge

Observability Integration: Experience integrating data platforms into observability tools for monitoring, alerting, and operational visibility

Medallion Architecture: Understanding of medallion architecture (bronze/silver/gold layers) for data lake organization

Streaming Data: Advanced experience with real-time streaming data using technologies like Kafka, Event Hubs, or Spark Streaming

Data Lake Architecture: Deep expertise in data lake design patterns, partitioning strategies, and data organization

Apache Spark Advanced: Advanced Spark knowledge including performance tuning, memory management, and distributed computing optimization

Delta Lake Advanced: Deep expertise in Delta Lake features including time travel, ACID transactions, schema evolution, and merge operations

PySpark: Advanced PySpark programming for distributed data processing and complex transformations

Scala: Knowledge of Scala programming language for Spark development

R Programming: Experience with R for statistical analysis and data science workflows

Container Technologies: Understanding of Docker and Kubernetes for containerized data applications

DataOps Practices: Experience implementing DataOps methodologies for data platform development and operations

Data Cataloging: Familiarity with data catalog tools like Azure Purview, Alation, or Collibra

Master Data Management: Understanding of MDM principles and implementation

API Development: Experience building data APIs for exposing data services to applications

Event-Driven Architecture: Knowledge of event-driven patterns for data processing and integration

Security & Compliance: Deep understanding of data security, encryption, privacy regulations (GDPR), and compliance requirements in retail

Cost Optimization: Advanced skills in cloud cost optimization, resource management, and efficient architecture design

Agile Data Practices: Experience working in Agile teams applying Agile methodologies to data engineering

Stakeholder Management: Strong stakeholder management skills for understanding requirements and delivering data solutions aligned with business needs

Data Strategy: Experience contributing to data strategy, architecture decisions, and platform roadmap planning

Documentation: Excellent technical documentation skills for data models, pipeline logic, and architecture decisions

Mentoring: Experience mentoring junior data engineers and contributing to team skill development

Location: Portugal (100% Remote)



