We are seeking a high-caliber Senior DevOps Engineer to lead the optimization and management of our high-concurrency trading infrastructure. You will be a key bridge between Development and Operations, ensuring the low-latency, high-throughput performance of our matching engines and wallet services. This role demands a strong focus on automation, scalability, and "Infrastructure as Code" (IaC) in a multi-cloud environment.
Key Responsibilities:
Trading Core Business Support
- Tune core components, including Matching Engines, Wallet Services, and Market Data feeds, for low latency and high throughput. Advise on architectural stability and scalability during the design phase.
Cloud-Native & Automation
- Manage multi-cloud resources (AWS, GCP, Azure, Alibaba Cloud) using IaC tools (Terraform, Ansible, Pulumi) to avoid vendor lock-in. Oversee Kubernetes (K8s) clusters and optimize microservice elasticity using Helm and Operators.
CI/CD Pipeline Design
- Build and maintain GitOps workflows (GitLab CI/ArgoCD) to achieve rapid deployment and near-instant rollback capabilities.
Full-Stack Observability
- Integrate Prometheus, Grafana, ELK, and OpenTelemetry to build a comprehensive monitoring system that detects transaction delays and fund anomalies in real time.
DevSecOps & Security
- Integrate security scanning (SonarQube/Snyk) into pipelines. Maintain WAF, firewall, and IDS/IPS strategies in production to meet external audit requirements.
Incident Management
- Participate in and manage the 24/7 on-call rotation. Act as Incident Commander during production issues and drive Root Cause Analysis (RCA) to prevent recurrence.
Job Requirements:
Professional Experience
- Substantial experience in IT infrastructure, with a significant focus on DevOps or Site Reliability Engineering (SRE).
- Background in crypto exchanges, high-frequency trading (HFT), or high-concurrency Fintech platforms is highly preferred.
Technical Skills
- Expert-level Linux system administration, including kernel-level performance troubleshooting, and proficiency in Shell/Python scripting.
- Hands-on experience with CI/CD toolchains (GitLab CI, ArgoCD) and IaC (Terraform, Pulumi). Deep expertise in the Kubernetes ecosystem (Helm, Kustomize, Operators).
- Familiarity with DevSecOps practices and cloud security best practices (SonarQube, Snyk).
Soft Skills
- Strong ownership and the ability to remain calm and decisive under extreme market volatility.
- Excellent cross-departmental coordination skills to bridge the gap between Development and Operations.
- Highly self-driven and adaptable to a professional remote or hybrid collaboration model.