Staff Site Reliability Engineer
1 week ago
Snyk is the leader in secure AI software development, helping millions of developers develop fast and stay secure as AI transforms how software is built. Our AI-native Developer Security Platform integrates seamlessly into development and security workflows, making it easy to find, fix, and prevent vulnerabilities — from code and dependencies to containers and cloud.
Our mission is to empower every developer to innovate securely in the AI era — boosting productivity while reducing business risk. We're not your average security company - we build Snyk on One Team, Care Deeply, Customer Centric, and Forward Thinking.
It's how we stay driven, supportive, and always one step ahead as AI reshapes our world.
Why this role?
We are seeking a skilled and proactive Staff Site Reliability Engineer (SRE) to join our team and support the growth of the Snyk API & Web by building scalable, reliable, and secure cloud infrastructure. You will be responsible for ensuring the performance and uptime of our systems while adopting DevOps best practices and leveraging modern tools.
What You'll Do
- Ensuring high availability, scalability, and disaster recovery across all systems.
- Leading architectural discussions and making strategic decisions related to scalability, security, and availability.
- Driving continuous improvement of our infrastructure, deployment, and monitoring processes.
- Collaborating with development and operations teams to improve deployment processes and infrastructure resiliency.
- Acting as a subject-matter expert for the SRE team and cross-functional engineering groups.
- Mentoring and supporting other engineers, helping to grow team skills and practices.
- Leading root cause analysis processes and post-incident reviews to ensure learning and resilience improvements.
- Advocating for reliability, observability, and automation across the organisation.
What You Bring
- Experience with AWS (open to other cloud providers)
- Deep understanding of Kubernetes architecture and day-to-day cluster management, including experience managing complex Kubernetes environments
- Experience with security services and Internet infrastructure providers, e.g. Cloudflare
- Proficiency in alerting and monitoring tools
- Proficiency with Infrastructure as Code tools (Terraform, Kustomize and Helm)
- Experience with CI/CD pipelines and GitOps practices, using ArgoCD or similar tools
- Strong scripting and automation skills in Bash and/or Python.
- Solid knowledge of networking principles
- A proactive mindset with the ability to work in a fast-paced environment
It'd Be Awesome If You Also…
- Have familiarity with incident management practices (on-call, runbooks, postmortem, disaster recovery).
- Understand Zero Trust security models and security best practices in cloud environments.
- Have exposure to Service Mesh (Istio, Linkerd) and container networking.
- Have experience with cost optimisation and cloud spend monitoring.
- Have knowledge of managing permission models in distributed systems.
We care deeply about the warm, inclusive environment we've created and we value diversity, and we welcome applications from those typically underrepresented in tech. If you like the sound of this role but are not totally sure whether you're the right person, do apply anyway.
About Snyk
Snyk is committed to creating an inclusive and engaging environment where our employees can thrive as we rally behind our common mission to make the digital world a safer place. From Snyk employee resource groups, to global benefits that help our employees prioritize their health, wellness, financial security, and a work/life blend, we aim to support our employees along their entire journeys here at Snyk.
Benefits & Programs
- Prioritize health, wellness, financial security, and life balance with programs tailored to your location and role.
- Flexible working hours, work-from-home allowances, in-office perks, and time off for learning and self-development
- Generous vacation and wellness time off, country-specific holidays, and 100% paid parental leave for all caregivers
- Health benefits, employee assistance plans, and annual wellness allowance
- Country-specific life insurance, disability benefits, and retirement/pension programs, plus mobile phone and education allowances
Location: Lisboa, Portugal. Employment type: Full-time.