At Coalition Inc. (Permanent), in Multiple Locations
Expires at: 2022-09-18
Remote policy: Full remote
Founded in 2017, Coalition is on a mission to solve cyber risk and create a safer digital economy where everyone can thrive.
Digital risk is now a part of every business and it’s no longer solely the domain of technical teams. That’s why we combined comprehensive insurance with proactive cybersecurity tools to help organizations stay resilient to digital risks like cyber attacks, funds transfer fraud and much more.
Our team works collaboratively across North America and Europe to prevent security failures and provide both technical and financial help when incidents do occur.
Today, Coalition is the world’s largest commercial insurtech, serving over 130,000 customers, including many small businesses that rely on Coalition to help them chart a path forward in the new digital world.
As of September 2021, Coalition has raised $520 million from leading global technology investors as well as highly regarded institutional investors, including Index Ventures, Ribbit Capital, Valor Ventures, Durable Capital, T. Rowe Price Advisors, and Whale Rock Capital, valuing the company at more than $3.5 billion.
Coalition has experienced tremendous growth by helping organizations of all sizes solve real-world problems and by remaining true to our founding values of character, humility, responsibility, authenticity and diversity.
That’s why we are proud to be named one of Inc’s Best Places to Work in 2021.
About The Role
We are looking for a Senior Site Reliability Engineer (Remote) who has the experience, ability, and mental fortitude to instrument and monitor the breadth of our full platform stack (hosts, applications, and performance).
In this role you will work closely with our engineering and information security teams to enhance the automated system provisioning and deployment subsystems within codified infrastructure.
You will work with developers to create more robust and scalable services independent of cloud implementations. You will help isolate, trap, and respond to inevitable system failures, and develop strategies for continuous monitoring and analysis to reduce both downtime and required manual intervention.
You will participate in the on-call rotation to maintain platform SLAs.
Our core platform is written mostly in Python. We prefer to use the right tool for the job and make pragmatic decisions about how to scale and decouple systems as we continue to grow.
We’re looking for someone who can navigate cloud environments (across multiple providers) and bare metal, with many moving pieces and systems, and help the team understand how they fit into the broader puzzle.
Ensure performance, responsiveness, scalability and automation - help us iterate faster and run smoothly
Collaborate with team members to ensure scalable and automated services
Review work done by other engineers
Research, learn and improve a large-scale scanning and data processing platform
Skills and Qualifications
Be a part of a remote team
At least 6 years of experience
Distributed Systems Architectures
Cloud Providers (AWS, Google Cloud, Azure, DigitalOcean, etc.)
NoSQL databases, Message Queues & Streaming platforms
Good knowledge of Linux Systems (We don't use Windows Servers)
Source Control Systems, e.g. Git
Infrastructure as code (Terraform)
Configuration Management Tools, e.g. Ansible
Building CI/CD pipelines
Strong written and verbal communication skills in English
Docker / ECS
Fargate / Nomad
Benefits & Perks