[Hiring] Senior Site Reliability Engineer @Articul8

About Us

Articul8 AI is at the forefront of Generative AI innovation, delivering cutting-edge SaaS products that transform how businesses operate. Our platform empowers organizations to harness the power of artificial intelligence in a reliable, scalable, and secure environment.

Position Overview

We’re looking for an experienced Site Reliability Engineer (SRE) to join our team and help ensure the reliability, performance, and scalability of our GenAI SaaS platform. As an SRE, you’ll bridge the gap between development and operations, implementing automation and best practices to maintain our service reliability objectives while supporting rapid innovation.

Key Responsibilities

  • Architect and maintain scalable, highly available infrastructure for our GenAI platform.

  • Design and implement robust monitoring, alerting, and observability solutions to proactively ensure system health and performance.

  • Automate deployment, scaling, and management of our cloud-native infrastructure, reducing toil and improving efficiency.

  • Define, measure, and improve Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to deliver outstanding service quality.

  • Participate in on-call rotations and provide rapid response to production incidents, minimizing downtime and user impact.

  • Collaborate closely with development teams to build reliable, scalable, and efficient systems for complex AI workloads.

  • Lead incident response efforts, conduct thorough post-mortems, and champion continuous improvement initiatives.

  • Optimize infrastructure for performance, scalability, and cost-effectiveness, particularly for high-demand AI workloads.

  • Implement and enforce security best practices across all systems and environments.

  • Create and maintain comprehensive documentation, including runbooks and knowledge base articles, to foster a culture of shared knowledge.

Qualifications

Required

  • Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent practical experience

  • 5+ years of experience in DevOps, SRE, or similar roles

  • Strong experience with cloud platforms (AWS, GCP, or Azure)

  • Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.)

  • Hands-on experience with infrastructure as code tools (Terraform, CloudFormation, etc.)

  • Solid background in containerization technologies (Docker, Kubernetes)

  • Proven experience with monitoring and observability tools (Prometheus, Grafana, ELK stack, etc.)

  • Strong understanding of CI/CD pipelines and automation

  • Exceptional troubleshooting and problem-solving skills, with the ability to debug complex systems

Preferred

  • Experience supporting AI/ML systems in production

  • Knowledge of GPU infrastructure management and optimization

  • Familiarity with distributed systems and high-performance computing

  • Experience with database systems (SQL and NoSQL)

  • Certifications in cloud platforms (AWS, GCP, Azure)

  • Experience with chaos engineering and resilience testing

  • Knowledge of security best practices and compliance requirements

Ready to shape the future of resilient software systems? Apply now and help drive the reliability of tomorrow’s AI at Articul8 AI!


