Infrastructure Engineering Manager
Build the infrastructure to help Flexport scale
We are in the process of transforming our monolithic Ruby service into a polyglot microservice system. We are rapidly growing our engineering organization across multiple time zones and continents. We are planning to expand our infrastructure across multiple regions over the next few years. The product infrastructure team plays a critical role in this journey. As a product infrastructure engineer, you will work in a highly collaborative and fast-changing environment, and you will focus on improving every aspect of developer experience, service infrastructure, and performance of our services.
- Develop tools to improve developer experience and productivity.
- Contribute to dev infrastructure such as CI/CD pipeline.
- Design and build performance tools for teams to leverage.
- Develop innovative tools and automation to minimize manual work.
- Monitor site availability and reliability on a daily basis.
- Investigate and triage production issues and bottlenecks, identify root causes, and implement (or help teams implement) solutions that prevent future incidents.
- Design and test disaster recovery strategies.
- Ensure infrastructure standards are being followed.
You should have
- Understanding of AWS or another cloud platform, and of Docker
- Extensive experience with logging, Application Performance Management, and other monitoring tools
- Experience working with complex, enterprise-level architectures
- Proficiency in at least one programming or scripting language (e.g. Java, Ruby, Python)
- Preferably 6+ years of relevant work experience in Linux environments
- Preferably 2+ years of team management experience
- Team player with the ability to collaborate effectively across organizations
- Grit and a strong ownership mentality
- DevOps and on-call mindset
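- Build automation tools to streamline other teams' development workflows and improve their efficiency
- Maintain and optimize the CI/CD pipeline to make automated deployments faster and more stable
- Maintain system stability, respond to incidents, and define disaster recovery strategies
- Improve baseline monitoring and alerting for each service (automated)
- Improve testing tools and frameworks (integration testing framework, load testing framework)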
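- Service and on-call mindset
- Familiarity with Kubernetes and similar resource management and deployment tools
- Understanding of Docker (container) technology
- Proficiency in one scripting language (Bash, Ruby, Python)
- Familiarity with Linux environments and the ability to use related tools to diagnose production issues
- 6+ years of experience in infrastructure, SRE, or DevOps
- Familiarity with common metrics collection systems and distributed tracing frameworks
- Experience working within complex, enterprise-level infrastructure architectures
- Familiarity with Spring or Ruby on Rails
- Familiarity with Terraform or similar infrastructure-as-code automation tools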
Flexport believes global trade can move the human race forward. Our mission is to make global trade easier for everyone. To achieve this, we’re building the “Operating System for Global Trade”: a combination of modern, internet-era technology and data analytics; logistics infrastructure; and supply chain expertise.
Flexport connects ~10k clients and suppliers across >100 countries, including established global brands like Georgia-Pacific as well as emerging innovators like Sonos. Founded in 2013, we've raised >$1.3B from SoftBank, Founders Fund, GV, First Round Capital and YC. We’re excited to start seriously scaling up after our recent $1B investment from SoftBank’s Vision Fund early this year.
Worried about not having any freight forwarding experience?
- Don’t be! We’re building the first Operating System for Global Trade. That’s why it’s incredibly important for us to bring people from diverse backgrounds and experiences together with our industry veterans to help move the freight forwarding industry forward.
- What’s freight forwarding and why does it matter? Freight forwarding is the coordination and shipment of goods from one place to another and it’s what makes global trade possible. Flexport is on a mission to make global trade easier for everyone because we believe it can help connect the world and break down economic barriers.
- We know this industry is complex. That’s why we invest in education starting on day one with Flexport Academy, a one-week intensive onboarding program designed specifically to set every new Flexport employee up for success.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.