At CrowdStrike we operate a massive cloud platform that protects our customers from a variety of bad actors: cyber criminals, hacktivists, and state-sponsored attackers.
We process tens of billions of events a day and we store and use petabytes of data. We’re looking for an engineer who is passionate about site reliability and is excited about joining us to ensure our service runs 24/7.
You will ...
Be responsible for all operational aspects of our platform: availability, latency, throughput, monitoring, issue response (analysis, remediation, deployment), and capacity planning with respect to latency and throughput.
Build tooling to help monitor and analyze the platform.
Work in a team of highly motivated engineers.
Use your passion for technology to ensure our platform operates flawlessly 24/7.
Obsess about learning, and champion the newest technologies & tricks with others, raising the technical IQ of the team.
Get up to speed on new technology quickly; we don’t expect you to already know all the technology we use.
Have broad exposure to our entire architecture and become one of our experts in overall process flow.
Be a great code reader and debugger; you will have to dive into large code bases, identify issues, and remediate them.
Have an intrinsic drive to make things better.
Have a bias towards small development projects, with the occasional larger project.
Use and give back to the open source community.
Key Qualifications: You have
Degree in Computer Science (or commensurate experience in data structures / algorithms / distributed systems).
Experience as a sustaining engineer or SRE for a cloud-based product.
Good understanding of distributed systems and scalability challenges; sharding, partitioning, and scaling horizontally are second nature to you.
A thorough understanding of engineering best practices from appropriate testing paradigms to effective peer code reviews and resilient architecture.
The ability to thrive in a fast-paced, test-driven, collaborative, and iterative programming environment.
Good understanding of multi-threading, concurrency, and parallel processing technologies.
The skills to meet your commitments on time and produce high-quality software that is unit tested, code reviewed, and checked in regularly for continuous integration.
Team player skills; we embrace collaborating as a team as much as possible.
Bonus points awarded for
Contributions to the open source community (GitHub, Stack Overflow, blogging).
Existing exposure to Go, Kafka, AWS, Cassandra, Elasticsearch, Scala, Hadoop, or Spark.
Prior experience in the cybersecurity or intelligence fields.
Market leader in compensation
Comprehensive health benefits
Training budget (certifications, conferences)
Working with the latest technologies
Flexible work hours and remote-friendly environment
Stocked fridges, coffee, soda, and lots of treats
Inclusive culture focused on people, customers and innovation
Regular team activities, including happy hours and community service events
Bring your experience in site reliability and distributed systems to CrowdStrike, where you will support and build a platform that scales to millions of events per second and tens of terabytes per day.
If you want to work at unmatched scale while helping to shut down cyber criminals and international espionage, you've come to the right place.