SanJoseCARecruiter Since 2001
the smart solution for San Jose jobs

Senior Software Engineer - Infrastructure

Company: Tbwa Chiat/Day Inc
Location: Redwood City
Posted on: January 26, 2025

Job Description:

Hybrid / San Francisco, CA or Redwood City, CA

We're on a mission to democratize AI by building the definitive AI data development platform. The AI landscape has changed dramatically between 2016, when Snorkel started as a research project in the Stanford AI Lab, and the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world's largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!

As a Senior Software Engineer on the Infrastructure team, you'll accelerate the Snorkel AI team and our customers by improving our developer platform and services for user and data management across the stack. You'll work closely with other engineers, researchers, and product management to align on the highest-leverage improvements for CI/CD, cloud infrastructure, deployment, security, authentication/authorization, and more.

Main Responsibilities

  • Design, build, and maintain services and deployment for Snorkel's enterprise platforms
  • Design, build, and improve observability and alerting for Snorkel's enterprise platforms
  • Contribute to Snorkel's in-house deployment management software, which handles installation and upgrades of deployments for Snorkel's enterprise customers
  • Build and maintain Snorkel's production and staging infrastructure; own our k8s and cloud strategy
  • Work closely with engineering teams to define test strategies and build the infrastructure to execute them
  • Deploy and optimize CI/CD pipelines across multiple environments and continuously improve development and deployment best practices
  • Collaborate with enterprise customers to understand product use cases, translate them into engineering specifications, and deliver high-quality solutions
  • Participate in on-call rotations, post-incident reviews, and other operational duties to ensure service delivery quality

Minimum Qualifications

  • Bachelor's degree in Computer Science or related field, or equivalent demonstrated experience
  • Strong development and debugging skills in Python
  • 5+ years of software development experience in distributed systems and cloud-native applications
  • Strong experience with cloud platforms and infrastructure as code (Terraform, CloudFormation, Helm)
  • Practical experience with Docker containerization and clustering (Kubernetes/EKS/GKE)
  • Proficiency in code and system health, diagnosis, resolution, and software test engineering
  • Strong communication and coding skills
  • Regularly follow software engineering best practices and hold a high bar for the team by leading design, code, and test plan reviews

Preferred Skills

  • Extremely well versed in building and managing cloud infrastructure for enterprise platforms on AWS, GCP, or Azure, using services such as EC2, EKS, and VPC
  • Experience with one or more build tools such as Bazel, Gradle, or Make; extra points for hands-on experience building and managing large codebases with these tools
  • Designed and implemented developer-friendly APIs or tools to boost developer productivity
  • Familiarity with deployment, monitoring, and maintenance of large-scale enterprise software products
  • Familiarity with developing and releasing infrastructure software for SaaS and on-prem platforms
  • [Nice to have]: Hands-on experience setting up and operating Kubernetes clusters at scale
  • [Nice to have]: Experience with large-scale distributed computing systems for ML training or serving, e.g., Ray, Spark, or TensorFlow
  • [Nice to have]: Hands-on experience creating and maintaining metrics and dashboards on observability platforms such as New Relic, DataDog, Chronosphere, or similar tools
  • [Nice to have]: Experience building services and infrastructure for machine learning and AI systems
  • [Nice to have]: Experience in cloud networking, security, and service meshes such as Istio

Be Your Best At Snorkel

Snorkel AI is on a mission to make machine learning practical for everyone, and it starts with building a team that welcomes, represents, and gives opportunity to all. We work at the frontier of AI and software engineering, and believe that underrepresented communities need to play a part in shaping the future of these fields. At Snorkel AI, we actively work to create an environment that values end-to-end ownership, diverse forms of impact, and opportunities for personal growth.

Snorkelers are supported by an amazing team and an amazing set of benefits. For full-time employees, we offer comprehensive medical, dental, and vision plans for Snorkelers and their families, plus a yearly wellness stipend. Our 401k program lets Snorkelers plan for their future, and our parental leave program lets new parents take up to 20 weeks of paid time off. Learn more about these and other benefits, such as our workstation setup allowance, on our Careers page.

Keywords: Tbwa Chiat/Day Inc, San Jose, Senior Software Engineer - Infrastructure, IT / Software / Systems, Redwood City, California