Operations/Site-Reliability Engineer

bexio AG

Rapperswil-Jona, Switzerland | Posted: 6 months ago

This job is expired and may no longer be accepting applications.

Key job duties

  • Work together with the external hosting partner to maintain sufficient system capacity, including auto-scaling as required

  • Identify upcoming system requirements to forecast future capacity needs

  • Define the right trade-offs and systems architecture to meet bexio’s needs, and translate them into operational requirements for the software engineering team

  • Identify repetitive tasks and continuously work on automation and simplification, to reduce operational workload

  • Continuously oversee the reliability of bexio and troubleshoot production issues

  • Implement additional monitoring systems where necessary, to proactively identify faults as they arise

Requirements


  • At least four years of experience in both systems and software engineering in a cloud/SaaS environment

  • Experience of working in collaborative environments with internal and external stakeholders

  • Understanding of fault tolerance and cascading failures, and the ability to define monitoring and alerting needs

  • Knowledge of capacity management and planning

  • Ability to determine performance requirements and potential bottlenecks

  • Ability to write scripts and code in order to reduce toil and simplify work

  • Familiarity with managing and diagnosing a modern medium-scale web services architecture based on LAMP and similar technologies, familiarity with cloud services, and competence in scripting and in at least one of Python, Java or PHP

  • Confident in English; German is beneficial


  • We offer an innovative and inspiring working environment

  • A friendly team that cares about you and your work

  • Plenty of opportunities to learn new things 

  • Flexible working hours and conditions

  • Competitive salary and generous benefits package 

This job was sourced from StackOverflow Jobs.