
Enterprise Operations Center (EOC) Incident Manager

IntelliBridge
United States, Virginia, Ashburn
Nov 22, 2024

Title: EOC Incident Manager

Clearance: Full CBP BI required prior to start

Location: Ashburn, VA

Overview:

We are seeking an Enterprise Operations Center (EOC) Incident Manager to oversee incident management processes in a high-stakes 24x7x365 operations environment. The successful candidate will lead the resolution of major incidents impacting enterprise systems or government agencies, ensuring rapid service restoration and minimal downtime. This role requires a strong background in monitoring, troubleshooting, and escalation practices, as well as experience with ITIL frameworks, incident management tools, and performance monitoring technologies.

Clearance:

  • CBP Public Trust Background Investigation

Position Overview:

The Incident Manager will have experience managing incidents in a Network Operations Center or equivalent 24x7x365 operations center supporting the resolution of Major Incidents for an enterprise or Government agency. This position supports evenings and weekends as needed.

Key duties include:

  • Perform all functional duties independently.
  • Possess and apply expertise on multiple complex work assignments. Assignments may be broad in nature, requiring originality and innovation in determining how to accomplish tasks.
  • Operate with appreciable latitude in developing methodology and presenting solutions to problems.
  • Contribute to deliverables and performance metrics where applicable.
  • Monitor and support incident management in production, development, and test environments in all data centers used by the client.
  • Provide a central point for coordination of incidents that arise in all environments. Establish and orchestrate bridge calls with emphasis on restoring service to users as quickly as possible, facilitate and troubleshoot toward resolution of incidents, and manage incidents to completion.
  • Coordinate, escalate, and/or resolve operational system/application/network events that have the potential of negatively impacting system and application availability to the user community.
  • Define and document metrics to judge the efficiency and effectiveness of the incident management process. Examples: Mean Time to Repair, Mean Time Between Failures, Repeat Incidents.
  • Create, update and maintain Standard Operating Procedures, Technical User Guides, Troubleshooting Guides, and Customer Contact Database. Conduct quarterly reviews of all documents.
  • Populate Knowledge Management Database with known troubleshooting procedures. Develop "lessons learned" on all escalated incidents.
  • Escalate incidents in accordance with established escalation procedures.
  • Report on the previous business day's Enterprise Operations Center call volume and SLAs for incorporation into the CIO Morning Meeting report slides. Content may change as the Government reporting requirements change over time. Due daily by 7:30 a.m.
  • Report monthly on outstanding tickets that depend on third-party action. The report should include the ticket, the item awaiting action, the third party, the duration, and, if known, the estimated resolution time.
  • Proactively identify opportunities for process and/or documentation improvement.
  • Support the development of monthly Enterprise Operations Center reporting for SLAs and KPIs.

Required Qualifications:

  • Must be available to support 1st shift (0600-1640) or 3rd shift (2300-0730) on one of these schedules: Tues - Sat (five 8-hour shifts), Wed - Sat (four 10-hour shifts), or Fri - Mon (four 10-hour shifts)
  • 10+ years of experience and both a BS and an MS degree.
    1. The Bachelor of Science (BS) can be substituted with an additional 4 years of related experience.
    2. The Master of Science (MS) can be substituted with an additional 2 years of related experience.
  • 3+ years of strong experience with Fault and Performance monitoring and reporting tools such as IBM Netcool Omnibus, AppDynamics, HP Operations Manager
  • 3+ years of experience working with incident management tools such as BMC Remedy
  • 3+ years of engineering experience within a large-scale, complex Manager of Manager (MoM) type monitoring environment
  • 3+ years of exposure to Service Management/ITIL framework and concepts (incident, problem, change management, RCA)
  • 2+ years of demonstrated troubleshooting skills; highly skilled in the implementation, integration, testing, and support of distributed applications
  • Excellent communication skills: experience working with technical and functional resources; experience presenting information to client / senior leadership
  • Excellent problem-solving skills: proven ability to resolve issues and explain complex problems
  • US Citizenship

About Us:

IntelliBridge delivers IT strategy, cloud, cybersecurity, application, data and analytics, enterprise IT, intelligence analysis, and mission operation support services to accelerate technical performance and efficiency for Defense, Civilian, and National Security & Federal Law Enforcement clients.
