Site Reliability Engineer
3 weeks ago
Location: Poland only, fully remote
Job Type: B2B, full time

Overview
Hard Rock Digital is a team focused on becoming the best online sportsbook, casino, and social gaming company in the world. We care about each customer's interaction, experience, behaviour, and insight and strive to ensure we're always acting authentically. Rooted in the kindred spirits of the Seminole Tribe of Florida, the new Hard Rock Digital taps a brand known all over the world as the leader in gaming, entertainment, and hospitality. We're taking that foundation of success and bringing it to the digital space.

What's the position?
We are looking for a skilled Site Reliability Engineer (SRE) to maintain and improve the reliability, scalability, and performance of our Java-based application. You will be responsible for managing and monitoring the applications and infrastructure, using the Grafana stack (Grafana, Loki, Prometheus) to ensure a high level of observability, and implementing robust monitoring, alerting, and logging solutions.

Key Responsibilities:

Application Reliability & Performance: Ensure the availability, reliability, and performance of a high-traffic Java-based application in a distributed environment. Troubleshoot and resolve complex issues in production and non-production environments. Participate in both pre- and post-deployment performance testing and monitoring efforts to improve application performance. Optimize Java application performance, ensuring efficient resource utilization and scaling.

Monitoring & Observability: Deploy and manage the Grafana stack (Grafana, Prometheus, Loki) to provide real-time monitoring, logging, and alerting. Implement and refine observability strategies to enhance application and infrastructure visibility. Create and maintain dashboards, alerts, and logs for comprehensive monitoring of system health and performance.

Incident Management & Root Cause Analysis: Support the operations team's incident response efforts, participate in post-mortems, and identify root causes of issues to prevent recurrence. Document and share lessons learned from incidents, contributing to a culture of continuous improvement.

Collaboration & Cross-functional Support: Work closely with developers, architects, and other engineers to design and implement solutions that improve application reliability. Collaborate closely with DevOps and NOC teams to support the application platform. Communicate SRE practices and principles to technical and non-technical stakeholders. Provide feedback and insights on application performance, potential improvements, and observability metrics.

Requirements
What are we looking for? The ideal candidate will have:
Degree in computer science or a related field, or equivalent work experience
2-3 years in SRE, DevOps, or similar infrastructure roles
Experience managing large-scale, high-availability production systems
Track record of incident response and post-mortem processes
Experience with capacity planning and performance optimization
1 year of hands-on experience managing production Kubernetes clusters
Deep understanding of k8s architecture, networking, storage, and security
Experience with cluster scaling (Karpenter), upgrades, and multi-cluster management
Proficiency with kubectl, Helm, and Kubernetes operators
Container orchestration and troubleshooting knowledge
Expertise with the Grafana stack for dashboards, alerting, and visualization
Hands-on experience with Grafana Alloy for telemetry data collection
Proficiency in PromQL
Experience with Loki for log aggregation and analysis
Experience building comprehensive monitoring and alerting strategies
Hands-on experience managing Java-based applications in large-scale, distributed environments, with a focus on JVM tuning and application optimization
Cloud platform expertise (AWS, GCP, or Azure)
Familiarity with infrastructure as code (IaC) tools like Terraform/Terragrunt or Ansible
ArgoCD proficiency for GitOps workflows and continuous deployment
Scripting abilities in Bash, Python, or Go
Experience with CI/CD pipelines and automation tools
Configuration management and deployment automation
Strong troubleshooting skills, with a proactive approach to diagnosing and resolving performance bottlenecks
Proven experience in on-call rotations, incident response, and root cause analysis
Strong communication skills (both written and verbal), positive attitude, and ability to receive constructive feedback
-
Senior Site Reliability Engineer
3 days ago
Gdansk, Poland EPAM Systems Full time
We are currently seeking an experienced Senior Site Reliability Engineer (SRE) to join our team. In this critical role, you will collaborate closely with software developers and operations teams to ensure the high reliability, scalability, and efficiency of our systems. You will also focus strongly on meeting and exceeding customer expectations. Your...
-
Site Reliability Engineer Architect SRE
2 weeks ago
Gdansk, Poland EPAM Systems Full time
We are seeking an experienced and accomplished Site Reliability Engineer/Architect (SRE) to join our dynamic, fast-paced team. In this pivotal leadership role, you will be entrusted with architecting and implementing advanced SRE practices to ensure the reliability, scalability, and efficiency of our Generative AI (GenAI) enablement platform for enterprise...
-
Solution Architect – Site Reliability SRE
3 days ago
Gdansk, Poland ERGO Technology & Services Full time
About Us: ERGO Technology & Services S.A. (ET&S S.A.) was established in January 2021 following the integration of ERGO Digital IT and Atena into one entity, leveraging both companies' strengths and best practices. As a part of ERGO Technology & Services Management AG, the technology holding of ERGO Group AG, we support millions of internal and external...
-
Reliability Engineer
3 days ago
Gdansk, Poland RedStone Full time
Step into RedStone, the fastest-growing blockchain startup, as a Backend Developer. The salary for this role (B2B contract): 20 000 - 35 000 PLN VAT. About Us: RedStone is a fast-growing Polish blockchain startup revolutionizing oracle infrastructure. With our 45-person team, including over 50% senior engineers, we deliver scalable, secure, and...
-
DevOps Engineer
3 days ago
Gdansk, Poland Code and Pepper Full time
At Code & Pepper we support clients in industries where there is no room for half-measures: FinTech, HealthTech, and AI. We help them scale teams, develop platforms, and deliver new features when they are really needed. Our must-haves: Several years of experience working as a DevOps or Site Reliability Engineer, preferably in the FinTech industry or...
-
QA/QC Site Engineer
1 day ago
Gdansk Metropolitan Area, Poland PWMG Full time
QA/QC Site Engineer. Position Overview: The QA/QC Site Engineer is responsible for ensuring that all construction, installation, testing, and commissioning activities related to the combined-cycle power plant are executed in accordance with the project's quality standards, applicable codes, engineering specifications, and contractual requirements. The role...
-
Senior DevOps Engineer
1 week ago
Gdansk, Poland KUBO Full time
For our client, a global technology company building and operating large-scale data and analytics platforms, we are looking for a Senior DevOps Engineer. You'll join an international team supporting cloud-based solutions used across multiple business areas. Your focus will be on improving reliability, automation, and deployment processes within a...
-
Senior C++ Software Engineer
3 weeks ago
Gdansk, Poland Kubo Full time
We are collaborating with a global aviation technology company to help them find a skilled Senior Software Engineer (C++) who will join their growing engineering team in Poland. The role focuses on developing advanced software solutions used daily by flight crews and ground operations teams worldwide. Take a look at the details below, and if you think...
-
Senior Python Engineer
5 days ago
Gdansk, Poland Ciklum Full time
Ciklum is looking for a Senior Python Engineer to join our team full-time in Poland. We are a custom product engineering company that supports both multinational organizations and scaling startups to solve their most complex business challenges. With a global team of over 4,000 highly skilled developers, consultants, analysts and product owners, we engineer...
-
Senior JavaScript Engineer
4 weeks ago
Gdansk, Poland Ciklum Full time
Salary range: B2B - 35-40 E/h VAT. Ciklum is looking for a Senior JavaScript Engineer to join our team full-time in Poland. We are a custom product engineering company that supports both multinational organizations and scaling startups to solve their most complex business challenges. With a global team of over 4,000 highly skilled developers, consultants,...