Site Reliability Engineer

3 weeks ago


Warsaw, Poland Balyasny Asset Management L.P. Full-time

Location: Warsaw

Posted: 33 Days Ago

Code: REQ5796

We are looking for a Site Reliability Engineer who can cultivate our SRE philosophy, processes, and technologies from the ground up.

This role entails driving standards and fostering adoption across our technology teams, whilst closely partnering with our DevOps and Cloud teams.

With a hands-on approach, you'll work across both cloud and on-premises hosting platforms, ensuring the reliability and scalability of our trading systems and production environments. This is a chance to play a pivotal role in transforming our operational capabilities and enhancing performance across a wide array of environments and platforms.

As a Site Reliability Engineer at BAM, you will:

·Develop and promote our SRE philosophy, establishing best practices and processes that will be instrumental in scaling our infrastructure.

·Create and maintain thorough documentation for SRE processes, systems design, and incident post-mortems to foster a culture of learning and improvement.

·Promote SRE principles, particularly those applicable to messaging systems like Kafka, among technology teams, aiding in knowledge sharing and mentoring initiatives.

·Implement end-to-end observability and monitoring solutions using Prometheus, Grafana, Loki, and AWS CloudWatch, ensuring high visibility into application performance and infrastructure health.

·Utilize and build standards around Sentry for application monitoring and error tracking to proactively identify and address reliability issues.

·Review and define standards for application reliability requirements within our Kubernetes environment, ensuring application configuration is optimized for performance, cost and reliability.

·Develop automation and tooling to improve efficiency and reliability of deployment pipelines, system health checks, and recovery procedures.

·Collaborate with development teams to enhance service stability, scalability, and fault tolerance through SRE best practices like blameless post-mortems and service level objectives (SLOs).

·Design and automate processes to improve deployment, monitoring, and recovery operations for Kafka and other messaging systems.

Core Tech Stack:

·Languages: Python, Java, NodeJS, C#, Shell

·Public cloud: AWS

·CI/CD: Octopus, Jenkins

·Configuration Management: Puppet, Ansible

·Infrastructure as Code: Terraform, CloudFormation

·Application Management: Kubernetes, Docker, Helm

·OS: Linux and Windows

·Observability: Prometheus, Amazon CloudWatch, Sentry, Grafana, Loki

To be considered a good cultural fit, you must be:

·An ambitious self-starter

·Hungry to learn

·Driven towards success

·A very strong and efficient communicator

·Able to multi-task and excel in a fast-paced trading environment

·A problem solver; able to develop quick and sound solutions to complex problems

To be considered a good fit, you must have:

·5+ years of experience in SRE or similar roles within complex, distributed systems environments.

·A Bachelor’s degree in engineering, computer science, information systems, or equivalent experience

·Proficiency with key SRE technologies such as Prometheus, Grafana, Loki, AWS CloudWatch, and Sentry.

·Extensive knowledge of container orchestration using Kubernetes and containerization with Docker.

·Hands-on experience with both cloud (AWS preferred) and on-premises hosting platforms.

·Proven ability to script in languages like Python, Bash, or Go, to automate routine tasks and deployment pipelines.

·Strong understanding of CI/CD principles, agile methodologies, and DevOps culture.

·Excellent troubleshooting and problem-solving skills, with a systematic approach to handling unexpected situations.

·High level of initiative, passion for reliability engineering, detail orientation, and follow-through capabilities.

·Exceptional interpersonal and communication skills, with the ability to explain complex technical concepts to a diverse audience.

·Analytical skills – Ability to troubleshoot and logically assess problems and determine solutions

·Detailed documentation skills – ability to represent ideas, requirements, reference architecture and problems in clear, concise, and business-friendly documents

Bonus points for:

·Experience in a high throughput/low latency environment

·Experience with successful SRE team build outs

·Experience with security patterns and distributed authentication

·Experience managing high-pressure incident response

·Experience with Chaos Engineering technologies

·Contributions to open source libraries, projects, or communities

·Any AWS, Azure, or GCP resource specializations or certifications

·Any Kubernetes resource specializations or certifications

Don’t have all the skills listed above? Have extra skills you think are important that we haven’t thought of? Please let us know by applying and telling us a bit more about yourself and why you think you’re qualified.


  • Site Reliability Engineer

    4 weeks ago


    Warsaw, Poland SIX Full-time

    What You Will Do: actively participate in day-to-day operational activities such as monitoring and incident response; contribute to continuous improvements such as automation, monitoring, etc.; contribute to compliance-driven activities, both internal and external; participate in index platform projects and initiatives; participate in on-call rotation and...

  • Site Reliability Engineer

    1 month ago


    Warsaw, Poland SIX Full-time

    What You Will Do: actively participate in day-to-day operational activities such as monitoring and incident response; contribute to continuous improvements such as automation, monitoring, etc.; contribute to compliance-driven activities, both internal and external; participate in index platform projects and initiatives; participate in on-call rotation and...

  • Site Reliability Engineer

    1 month ago


    Warsaw, Poland Cognizant Full-time

    Working model: hybrid, Warsaw | 3/4 days a week from the office What we do: As Top Employer, we are dedicated to helping the world's leading companies build stronger businesses — helping them go from doing digital to being digital. Cognizant Poland offices are located in Gdansk, Wroclaw, and Kraków. With the capacity to support various clients, we offer...

  • Site Reliability Engineer

    4 weeks ago


    Warsaw, Poland Cognizant Full-time

    Working model: hybrid, Warsaw | 3/4 days a week from the office What we do: As Top Employer, we are dedicated to helping the world's leading companies build stronger businesses — helping them go from doing digital to being digital. Cognizant Poland offices are located in Gdansk, Wroclaw, and Kraków. With the capacity to support various clients, we offer...

  • Site Reliability Engineer

    2 months ago


    Warsaw, Poland IT Performance Full-time

    We are looking for candidates for the position of Site Reliability Engineer. The role is with an international company in the finance sector. Responsibilities: Defining the reliability of digital products, technology services, and infrastructure. Minimizing the risk and impact of failures by developing operational improvements such as predictive monitoring,...

  • Site Reliability Engineer

    4 weeks ago


    Warsaw, Poland IT Performance Full-time

    We are looking for candidates for the position of Site Reliability Engineer. The role is with an international company in the finance sector. Responsibilities: Defining the reliability of digital products, technology services, and infrastructure. Minimizing the risk and impact of failures by developing operational improvements such as predictive monitoring,...


  • Warsaw, Poland Sii Full-time

    We are looking for an experienced Site Reliability Engineer to work in a professional software development hub, responsible for delivering software solutions used by our client’s external private and business customers across the globe. Our customer is a global retail company with over 16,000 stores in 26 countries, serving more than 6 million customers a...

  • Site Reliability Engineer

    1 month ago


    Warsaw, Poland Connectis Full-time

    Together with our Partner, we are looking for someone for the position of SRE (Site Reliability Engineer). Our Partner focuses on developing web and mobile applications related to electrification in the broad sense, covering electric cars and charging infrastructure. It also offers contactless payment systems at stations, programs...


  • Warsaw, Poland Connectis_ Full-time

    Technologies expected: Java, SQL, NoSQL, Spring Framework, Microsoft Azure, AWS. About the project: Together with our Partner, we are looking for someone for the position of SRE (Site Reliability Engineer). Our Partner focuses on developing web and mobile applications related to electrification in the broad sense, covering electric cars and...

  • Site Reliability Engineer

    4 weeks ago


    Warsaw, Poland Connectis Full-time

    Together with our Partner, we are looking for someone for the position of SRE (Site Reliability Engineer). Our Partner focuses on developing web and mobile applications related to electrification in the broad sense, covering electric cars and charging infrastructure. It also offers contactless payment systems at stations, programs...


  • Warsaw, Poland Box Inc. Full-time

    Site Reliability Engineer III *Our compensation structure is the base salary and equity in the form of restricted stock units. What is Box? Box is the world's leading Content Cloud. We are trusted by more than 115K organizations around the world today, including nearly 70% of the Fortune 500 and leaders across deeply regulated industries (such as AstraZeneca,...

  • Site Reliability Engineer

    2 weeks ago


    Warsaw, Poland WIPRO IT SERVICES POLAND Sp. z o.o. Full-time

    Technologies expected: Hadoop, Kafka, Python, Kubernetes, Linux, Docker. Responsibilities: Carry out SRE duties for Big Data on various open-source platforms such as Hadoop, Kafka, Spark, and HBASE. Develop software systems and automated solutions for operational aspects in an organization. Code efficiently based on the requirements and verify, deploy software...

  • Site Reliability Engineer

    2 weeks ago


    Warsaw, Poland Connectis_ Full-time

    Required: Java, SQL, NoSQL, Spring Framework, Microsoft Azure, AWS. About the project: Together with our Partner, we are looking for someone for the position of SRE (Site Reliability Engineer). Our Partner focuses on developing web and mobile applications related to electrification in the broad sense, covering electric cars and infrastructure...


  • Warsaw, Poland CGI Information Systems and Management Consultants (Polska) Sp. z o.o. Full-time

    Technologies expected: Terraform, Terragrunt, AWS, Jenkins, Kubernetes, GitLab, Bash, Python. About the project: We offer an opportunity to work on projects for a client in the financial industry. This role may require communication and interaction with a wider international team. Responsibilities: Designing, building, and maintaining reliable software...


  • Warsaw, Poland Groupe SII Full-time

    We are looking for an experienced Site Reliability Engineer to work in a professional software development hub, responsible for delivering software solutions used by our client’s external private and business customers across the globe. Our customer is a global retail company with over 16,000 stores in 26 countries, serving more than 6 million customers...


  • Warsaw, Poland Sii Polska Full-time

    We are looking for an experienced Site Reliability Engineer to work in a professional software development hub, responsible for delivering software solutions used by our client’s external private and business customers across the globe. Our customer is a global retail company with over 16,000 stores in 26 countries, serving more than 6 million customers a...

  • Site Reliability Engineer

    1 month ago


    Warsaw, Poland PAYBACK Full-time

    Site Reliability Engineer (DevOps), IT & Product Development, Tech&Data, Full-time. Role Description: You represent the interface between software development and the application support teams to develop our platforms in terms of reliability, stability and performance. ...

  • Site Reliability Engineer

    4 weeks ago


    Warsaw, Poland PAYBACK Full-time

    Site Reliability Engineer (DevOps), IT & Product Development, Tech&Data, Full-time. Role Description: You represent the interface between software development and the application support teams to develop our platforms in terms of reliability, stability and performance. ...


  • Warsaw, Poland Connectis_ Full-time

    We are currently looking for an experienced Site Reliability Engineer for the fuel and energy industry. You will join a team working on the Next Generation Retail platform, whose goal is to improve reliability, observability, performance, maintainability, and quality. Your role will involve monitoring the environment...


  • Warsaw, Poland COGNIZANT Full-time

    Expected: Java, Kubernetes, Linux, Docker, Python. Operating system: Windows, Linux. About the project: We are looking for talented, curious, and energetic Senior Site Reliability Engineers who embrace solving complex challenges on a global scale. As a Site Reliability Engineer, you will be an integral part of a cross-functional team inventing, designing,...