Platform Reliability Engineer 6117

Company: CognitekGroup, LLC
Location: Philadelphia, PA
Status: Direct
Salary: Market Sr
Close Date: Direct

Job Description:

We have a challenging opportunity for a Platform Reliability Engineer in Wayne, PA. In this critical role you will ensure the “-ilities” (Availability, Reliability, Scalability, Usability, etc.) of private and public cloud platforms in both test and production environments by maintaining, upgrading, and patching cloud platforms; managing communications and coordinating change events with development and support teams; identifying and resolving reliability issues; and implementing long-term mitigation strategies, ideally through automation. You will respond to production incidents and availability needs, apply upgrades, and use strategic thinking to automate repetitive manual tasks. You will work with real-time monitoring and diagnostic data, analyze trends, and plan for future infrastructure needs. You will also document platform post-mortems and train and mentor staff on reliability practices, processes, and technologies.


Provides senior-level Tier 3 technical infrastructure support services for issues escalated from the Support Center and other Technical Services groups. Ensures reliable operation of production environments.
Diagnoses and troubleshoots availability interruptions and other production issues.
Plans and coordinates enterprise-wide infrastructure projects with other IT and client teams.
Communicates with teams to keep them apprised of status and issues. Contacts vendors to resolve technical issues.
Tests, installs, and migrates software, patches, upgrades, applications, and/or hardware.
Develops technical standards. Tests and evaluates IT vendor products.
Writes documentation, including project plans, installation procedures, and troubleshooting tips. Creates diagrams, including technical topology.
Maintains, monitors, and tunes production system and application performance. Debugs source code and performance problems and/or provides debugging assistance to developers.
Identifies opportunities to improve system and application performance (e.g., automating manual system tasks).
Trains and mentors staff. Resolves complex issues escalated from less experienced staff.
Adds, updates, and closes IT Problem Management database records. Researches and resolves complex issues, and reviews related technology records to mitigate impact on assigned system.
Reviews numerous IT knowledge repositories to update technical knowledge.
Learns and understands client area business functions and requirements. Has the ability to determine the appropriate technical tool to address the client’s business needs.
Thoroughly understands and complies with IT policies and procedures, especially those for quality and productivity standards that enable the team to meet established client service levels.
Thoroughly understands and complies with Information Security policies and procedures, and verifies deliverables meet Information Security and VSA requirements.
Participates in special projects and performs other duties as assigned.


Minimum of 3 years of overall technical engineering experience
Bachelor’s Degree preferred or equivalent technical experience
Deep understanding of, and practical experience with, upgrading and maintaining distributed orchestration systems (Cloud Foundry, OpenShift, etc.)
Experience maintaining and monitoring distributed systems deployed in private and public clouds. Understanding of monitoring/telemetry solutions (Splunk, ELK, Nagios, etc.), including data ingestion and analysis.
Deep knowledge of Linux systems and cloud platforms/providers
Strong oral and written communication skills
Passion for problem solving and strategic thinking and a desire to own and execute
Advanced understanding and application of at least one scripting language (Shell, PHP, Python, etc.)
Development experience (Java, Shell, Python, etc.) a plus
A flexible schedule: some activities you’ll be performing may require off-hours or weekend support.

Our client is an equal opportunity employer and is committed to providing a drug-free workplace.

Contact Information:
Job Code: 6117

Jim Jennings 636-484-6869
