The Major Incident Management (MIM) team seeks to be the premier provider of incident detection, prevention, and response for Oracle’s critical services by avoiding unplanned downtime and restoring services quickly during an outage.
In support of Service Restoration, the Incident Commander will initiate and lead Major Incident Management calls and gather technical resources who can remediate the issue.
The Incident Commander will also provide business impact updates during incidents to IT leadership as required (in verbal and written form).
Another key component of the MIM Incident Commander function is to collect relevant incident-related statistics and publish operational health metrics to IT senior leadership and others on a regular basis.
This includes, but is not limited to, incident duration, root cause analysis, and monitoring of follow-on preventive / corrective actions.
Who are you?
Passionate about cloud, customer-focused, experienced in incident management and problem management, and thriving in a dynamic team culture.
A technologist at heart, curious about how things work and how things break, and likely to be someone who enjoys finding a better way to do things using automation.
Able to build, maintain and leverage key relationships with internal stakeholders and service leaders to drive increased engagement and accountability for your work.
In love with technology and how to apply it. Maybe you have set up your own environment in the cloud or have spent time developing apps or games that you share with others.
Strong communicator who is passionate about the customer’s experience.
Motivated to be resourceful, innovative and entrepreneurial.
Driven to learn about cloud infrastructure and its inter-dependencies.
Humble and committed to always improving.
Key Responsibilities include, but are not limited to:
Provides leadership and drives key decisions, including business decisions, during major incident resolution bridge calls.
Owns Major Incident response from start to finish. Works with other team members and partners with resolving teams to drive the resolution of high-severity outages impacting IT infrastructure by researching recent changes, monitoring information, and other related data.
Assesses the business impact of a Major Incident and identifies key team members or teams that should participate in the restoration activities.
Clearly documents the troubleshooting steps taken during Major Incidents in chronological order and publishes status communications.
Records the participants and detailed actions taken during Major Incidents.
Communicates effectively to Service Owners, Oracle IT and Executive Management.
Leads post-incident reviews, and writes and publishes corrective action and preventive action (CAPA) investigations as requested for incidents.
Collaborates with Business Relationship Management to effectively communicate IT Events and IT Incidents to key lines of business.
Creates, documents, and executes event-response procedures to prevent service impact and restore service quickly during an outage.
Works across appropriate teams to create and maintain documentation.
Monitors and evaluates high-level service and infrastructure dashboards and takes action to address identified anomalies.
Collates and analyses incident-based data for team metrics and KPIs.
Identifies opportunities and takes ownership for automation and / or continuous improvement of Incident Management process steps and best practices.
Meets monthly with partner teams and service owners as the MIM Liaison for one or more service offerings in order to build a strong mutual relationship with those teams and to provide the highest level of support for our most critical services.
Assists in managing business continuity and recovery of the company's information systems.
Assists in maintaining the overall effectiveness of technology systems residing in Oracle’s IT organization, ensuring high levels of customer satisfaction and availability, 24x7.
Assists in maintaining a framework of policies to ensure that standardized methods and best practices are utilized.
Participates in IT strategy planning, understanding potential impact to business operations from proposed change and project activities.
Contributes to MIM Continual Service Improvement by providing constructive feedback and innovative ideas on processes, documentation, and tooling.
Promote Organizational Culture and Values:
Put Customers First: When faced with a choice between what is easy for us and what is good for customers, customers win every time.
Act Now, Iterate: We move quickly but deliberately, and we iterate toward better solutions.
Nail the Basics: We recognize that the path to advanced solutions always runs through the basics.
Expect and Embrace Change: We embrace change as an opportunity for growth and greater success.
Take Risks, Remain Calm: We recognize that learning from our failures is part of our path to success.
Don't Be a Jerk: We treat each other with dignity. We seek understanding by listening before we speak.
Own Without Ego: We take responsibility for the state of our team, our products, and ourselves. We never say, "That’s not my job."
Earn Trust, Give Trust: We build trust by communicating openly and transparently. We give trust easily, and we recognize that trusting each other is essential to our success.
Take Pride in Your Work: We identify work that needs to be done to achieve our team goals. We take responsibility for either changing our work or changing ourselves when we don’t find pride in our work.
Proven hands-on experience with technology systems, including cloud infrastructure, network, server, storage, client or application, datacenter operations and system administration.
Prefer experience in managing and tuning systems and / or applications, with the ability to review and validate system test output.
ITIL experience (Incident, Problem, Change, Event).
Prefer experience in a lead or manager role.
Must possess strong analytical and problem-solving skills, with a proven track record of executing calmly against tough deadlines.
Must demonstrate an ability to establish relationships and build rapport in order to influence colleagues at all levels, uncover business or technical issues, and facilitate their resolution.
Must demonstrate the ability to lead an audience, regardless of their organizational role.
Comfortable with team dynamics and openly seeks and shares information across teams and departments, coordinating and combining competencies for the best overall result.
Excels in all facets of verbal and written communication; leads and inspires the team via open communication, effective listening, and strong collaborative networks; has excellent negotiation skills, is an expert coordinator, and has an ability to orchestrate change through influence.
Candidate must be fluent in the English language, including strong verbal and well-constructed written skills.
Identifies bottlenecks and pain points and directs resources to address the challenges in a directed, methodical, cost-effective, and data-driven manner; leverages analytical experience to build a road map to meet the needs of the department and the employer.
Works effectively in the face of stress, ambiguity, difficult situations, and shifting priorities; understands the need to shift focus and priorities as required and successfully leads others through periods of change.
Considers and implements creative and innovative approaches to tackle new issues / challenges in an aggressive manner; encourages the team to take advantage of self-directed learning opportunities.
Possesses a genuine desire to provide superior customer service.
Deep understanding of DevOps and Cloud concepts and how to apply Site Reliability Engineering ideas to make service offerings more scalable, reliable, and efficient.
Able to work unsupervised, independently and within a global team.
Experienced user of a trouble ticketing system (Jira, Remedy or similar).
Ability to manage multiple tasks in a fast-paced, ever-changing environment.
Ability to think strategically and tactically and work in both reactive (incident response) and proactive engagement models.
Flexibility to work within a Follow the Sun global shift rotation, covering local day-time hours, including holidays and weekends, on a rotational basis.
Ability to be on-call as part of an on-call rotation shared across all team members.
Education / Certifications:
Bachelor's and / or Master's degree in Computer Science
ITIL v3 Foundations
ITIL Service Operations certification preferred
Detailed Description and Job Requirements
As a member of this fast-paced, leading-edge database / applications company, work with the team to deliver real-time 24x7 enterprise-wide technical support for internal and / or external customers.
This includes, but is not limited to: user support of business applications, troubleshooting of technical problems, and acting as a liaison between customers and resolving groups.
As a member of the Help Desk, solve specific, complex technical problems to provide and apply real-time solutions in the areas of:
Email problems and functionality questions
Network printer problems (stopping / starting queue, usage)
Data Communication / Networking troubleshooting
Remote network dial-in access (PPP and Serial)
PC configuration and network configuration
Oracle Base Image laptop support
Innovate and document new methods and procedures as needed.
Verify procedures are being followed and notify the proper resource if they are not in compliance. Lead / mentor in a team environment.
Assist in providing information and support to team members.
Job duties are varied and complex, needing independent judgment. May have project lead role. Prefer five years of related experience in a medium to large network distributed and computing environment and a BS in Computer Science or related field.
As part of Oracle's employment process, candidates will be required to successfully complete a pre-employment screening process.
This will involve identity and employment verification, professional references, education verification, and verification of professional qualifications and memberships.