Infrastructure Engineer (LATAM time zone) at Chainstack
📍 Antigua and Barbuda, Argentina, Bahamas, Barbados, Belize, Bolivia, Brazil, Canada, Chile, Colombia, Costa Rica, Cuba, Dominica, Dominican Republic, Ecuador, El Salvador, Grenada, Guatemala, Guyana, Haiti, Honduras, Jamaica, Luxembourg, Mexico, Nicaragua, Panama, Paraguay, Peru, Saint Lucia, Saint Vincent and the Grenadines, Suriname, Trinidad and Tobago, United States, Uruguay, Venezuela
📅 8h ago
Remote
python, go, aws, azure, gcp, kubernetes, terraform, ansible, ai
Are you a tech-savvy job seeker looking for an exciting opportunity to work with cutting-edge Web3 infrastructure? Look no further than Chainstack!
About us
Chainstack powers global blockchain applications across fintech, DeFi, wallets, custodians, analytics, and everything in between. Teams cut time-to-market, cost, and risk with one platform for nodes in the cloud, offered through preferred service providers, or self-hosted. Standardized node management puts reliability on autopilot and keeps performance predictable, giving operators control and transparency while developers get a consistent way to build and scale.
About the role
We are on the hunt for a dedicated Infrastructure Engineer who will play a pivotal role in our technology stack, ensuring that all user-oriented services and other Chainstack production systems operate seamlessly. The primary duties of the Infrastructure Engineer revolve around incident management and the creation of top-tier technical solutions, with the chance to expand their experience across a broad range of cloud providers. This role is a crucial part of Chainstack: we need an individual who can lead our pursuit of near-flawless reliability and service quality across all our offerings.
Responsibilities
Maintain and continuously improve the reliability and scalability of our services, with a daily 2.5-hour on-call and 1 paid weekend on-call per month
Develop and manage complex hybrid infrastructure (primarily multi-cloud Kubernetes clusters) with an infrastructure-as-code approach
Manage incidents and participate in on-call rotation
Improve monitoring and alerting of our platform
Actively look for opportunities to improve the availability and performance of the system by applying what we learn from monitoring and observation
Actively work on solutions to reduce workload by automating repetitive processes
Plan, design, and execute solutions to reach specific goals agreed upon within the team
Identify system parts that do not scale, provide immediate workaround measures, and drive long-term resolution
Improve documentation all around, explaining the why, not stopping with the what
Know a domain well and radiate that knowledge to other team members
Requirements
Experience in operating mission-critical services and being responsible for reliability/uptime/SLA