Senior Site Reliability Engineer
19 hours ago
Description
OCI Incident Response is the first line of defense for maintaining the high availability of Oracle's cloud. We make customer-impacting events shorter, less frequent, and less impactful by providing large-scale incident management. We are front-and-center in driving down event duration by utilizing our operational experience, knowledge of best practices, and ability to develop tools to automate incident management.

We are looking for a Senior Site Reliability Engineer to join our OCI team. This role is part of a globally distributed team responsible for detecting, triaging, and mitigating OCI service-impacting events as quickly as possible. You will be a part of one of these regional teams and be responsible for minimizing the downtime of OCI services. You will achieve this through delivering excellent major incident management and by operating systems with high scalability, performance, and security that prevent incidents from occurring.

Oracle's Cloud is state-of-the-art and constantly evolving. When it experiences issues, your team will respond within minutes to ensure customer impact is mitigated. This experience will expose you to the inner workings of OCI's systems and organizations. You will interact with and influence leaders from across the Oracle business and will drive broad cross-organization programs meant to iteratively improve OCI-wide service availability. We are an agile team with significant impact. If you want to be a part of a fast-moving team breaking new ground, we would like to speak with you.

Career Level - IC3

Responsibilities
* Solve complex problems related to infrastructure cloud services and automate common tasks to enable continuous availability with minimal human overhead.
* Command and coordinate SMEs and Service leaders to restore service as quickly as possible during Major Incidents while keeping accurate and timely data on the progress of such incidents.
* Utilize a deep understanding of cloud computing design patterns and their dependencies to mitigate complex Major Incidents.
* Apply a methodical approach to troubleshooting large, complex, interconnected systems used in Incident Detection & Orchestration.
* Document pertinent information relating to Incidents that aids process improvement, identifies deviations, and enables the creation of an Incident Knowledge Base.
* Monitor and evaluate high-level service and infrastructure dashboards and take action to address identified anomalies.
* Identify opportunities and take ownership for automation and/or continuous improvement of Incident Management process steps and best practices.
* Define and document the technical architecture of large-scale distributed systems.
* Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
* Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance.
* Partner with development teams in defining operational requirements for product roadmaps.
* Articulate technical characteristics of services and technology areas and guide Development Teams to engineer and add premier capabilities to the Oracle Cloud service portfolio.
* Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).

Minimum Qualifications
* Bachelor's degree or higher in Computer Science or relevant work experience.
* 5+ years of experience in Site Reliability Engineering, DevOps, or System Engineering.
* Must have public cloud operations experience (e.g., AWS, Azure, GCP, OCI).
* Extensive experience with Major Incident Management in a cloud-based environment.
* Clear understanding of automation and orchestration principles.
* Experience working in at least one modern object-oriented programming language.
* Experience with professional software engineering standard methodologies such as Agile project management, coding standards, code reviews, source control management, build processes, testing, and operations.
* Familiarity with infrastructure automation tools such as Chef, Ansible, Jenkins, and Terraform.
* Strong expertise with several of the following technologies: Infrastructure-as-a-Service, CI/CD systems, Docker, RESTful APIs, log analysis tools, debugging tools.

Preferred Qualifications
* Strong leadership, project planning, communication, and execution skills.
* Strong analytic and problem-solving skills.
* Proven track record of leading high blast-radius Major Incidents in cloud-based platforms.
* Ability to handle multiple competing priorities in a fast-paced environment.
* Ability to communicate clearly with technical and non-technical stakeholders at all levels.
* Confidence to drive and manage large conference calls.
* Experience with distributed service-oriented architectures.
-
Senior Site Reliability Engineer
19 hours ago
Romania GoDaddy Full time €30,000 - €60,000 per year
Location Details: Remote, Romania. GoDaddy offers diverse work arrangements: full-time office, hybrid (remote and in-office), and fully remote options for each team. This role is remote, allowing you to work from the comfort of your home. There may be occasional visits to a GoDaddy office to attend team events or meetings.
Join our team
Do you feel ready to...
-
Site Reliability Engineer
18 hours ago
Romania Vodafone Full time €30,000 - €60,000 per year
Your day to day: We're looking for a Site Reliability Engineer (SRE) to join our team and help us build resilient, scalable, efficient systems, ensuring our infrastructure and applications run smoothly while driving automation and reliability practices forward. With these activities you will have a great impact on our business: Collaborate closely with...
-
Site Reliability Engineer
18 hours ago
Schwarz Global Services Hub Romania Jobs & Karriere bei Schwarz Corporate Solutions Full time €60,000 - €120,000 per year
Schwarz IT takes care of the entire digital infrastructure and all software solutions of the companies of Schwarz Group. As a result, it is responsible for the selection, provision, operation and continuing development of IT infrastructure, IT platforms and business applications. In order to provide IT solutions that optimally support the departments'...
-
Senior Software Engineer
18 hours ago
Romania Third-Party Job Posts Full time €80,000 - €120,000 per year
What Makes Us Unique
At Cloudbeds, we're not just building software, we're transforming hospitality. Our intelligently designed platform powers properties across 150 countries, processing billions in bookings annually. From independent properties to hotel groups, we help hoteliers transform operations and uplevel their commercial strategy through a unified...
-
Service Reliability Engineer
19 hours ago
Romania Bitdefender Full time €30,000 - €60,000 per year
Bitdefender is a cybersecurity leader delivering best-in-class threat prevention, detection, and response solutions worldwide. Guardian over millions of consumer, enterprise, and government environments, Bitdefender is one of the industry's most trusted experts for eliminating threats, protecting privacy, digital identity and data, and enabling...
-
Senior SysOps Engineer
19 hours ago
Romania 3Pillar Full time 30,000 - 60,000 per year
Join Our Mission at 3Pillar: Elevate Your Impact
As a Senior SysOps Engineer, you are the cornerstone of operational stability, driving forward the reliability and performance of our core IT infrastructure. Your expertise in System Operations practices will ensure the seamless availability, security, and sustained operation of our groundbreaking projects,...
-
Senior Platform Engineer
18 hours ago
Romania Qodea Full time 80,000 - 120,000 per year
Work where work matters. Elevate your career at Qodea, where innovation isn't just a buzzword, it's in our DNA. We are a global technology group built for what's next, offering high calibre professionals the platform for high stakes work, the kind of work that defines an entire career. When you join us, you're not just taking on projects, you're solving...
-
Senior Software Engineer
1 week ago
Romania Third-Party Job Posts Full time €30,000 - €60,000 per year
Location: Remote Europe
How You'll Make an Impact: Together we're on a mission to power every property in the world and to do that, we need to find the best talent in the world. That's why we're on the search for a superstar Senior Software Engineer, to help us reinvent the world of hospitality tech and travel. As a Senior Software Engineer, you will help...
-
Senior DevOps Engineer
18 hours ago
Romania 3Pillar Full time 80,000 - 160,000 per year
Join Our Mission at 3Pillar: Elevate Your Impact
As a Senior DevOps Engineer, you are the cornerstone of operational excellence, driving forward innovations that redefine industries. Your expertise in DevOps practices will ensure the seamless integration and continuous delivery of our groundbreaking projects, from transforming urban living to pioneering...
-
Site Selection Specialist
19 hours ago
Romania Indero Full time 15,000 - 30,000 per year
The Site Selection Specialist is a key collaborator to project teams as the main point of contact for site selection activities (including site identification and feasibility).
This role is perfect for you if:
* You have experience working with clinical research sites;
* You want to work in a collaborative environment;
* You want to have an impact in a fast-growing...