Site Reliability Engineer
Trillium Professional is now seeking Site Reliability Engineers in Santa Clara!
Pay rate is $80 - $90/hour.
Responsibilities:
-Fleet monitoring & recovery of assets in our private cloud environment that houses several compute servers with NVIDIA GPUs.
-Focus specifically on building and stabilizing our virtualization infrastructure (ESXi, KVM and Hyper-V).
-Deploy and maintain a large farm of machines using the latest Configuration Management & Infrastructure Automation tools (Chef, Ansible, Terraform).
-Participate in on-call & rotational L1 support for round-the-clock monitoring and remediation of infrastructure issues (PagerDuty).
-Analyze and debug operating system, networking, configuration and performance problems.
-Assist in the roll-out and deployment of infrastructure configurations to support the latest hardware and technologies.
-Contribute to the development of monitoring systems that provide a fast, reliable, real-time pulse of the various infrastructure subsystems (Zabbix, BigPanda, Grafana).
Requirements:
-Bachelor’s or Master’s Degree in Computer Science or Software Engineering, or equivalent experience.
-Strong system and platform debugging skills.
-Virtualization experience (key match if available): vSphere, Hyper-V, KVM, XenServer.
-Familiar with configuration management tools (Chef preferred, Ansible).
-Experience working in large-scale enterprise production systems.
-5+ years of professional experience required.
-Ability to debug and analyze system issues and code in order to triage, root-cause and resolve issues in the infrastructure. Work closely with the platform engineering team to understand hardware setups.
-Familiar with maintenance and setup of Linux and Windows hosts.
-Scripting experience with Python or Go. Unix shell proficiency.
-Experience with version control systems like Perforce and Git.
Preferred:
-Familiar with private cloud setups (VMware, Dell, Apple)
-Scripting (Bash, Python, Go)
-Experience with VM and hardware virtualization technologies like VMware, KVM, Hyper-V, Docker and Kubernetes.
-Background with automating bare metal and VM provisioning.
-Experience with supporting GPUs, embedded device development, driver development and CUDA/TensorRT applications.
-Development experience in Chef, Ansible and infrastructure orchestration.
Trillium has been recruiting and placing clerical and office professionals for over 30 years. From Fortune 100 companies to small businesses, our philosophy remains the same: to achieve excellence by providing quality employees with an uncompromising level of service. We believe in honesty, integrity, and a simple philosophy of providing value to our customers and our employees. We strive to be unsurpassed in the recruitment and placement of professionals. Trillium is an Equal Opportunity Employer.
By applying to this job, I agree to receive electronic communications including SMS text and email regarding future opportunities, referral bonus incentives, and other promotions from Trillium. You may opt out at any time from future communications by responding STOP to any electronic communication.
You may view our full privacy policy at https://trilliumstaffing.com/jobs/privacy/.
Trillium offers a comprehensive benefit package that includes the ability to participate in health insurance and retirement plans, paid holidays, state required leave, and vacation days, if applicable. Trillium’s offerings are dependent on the state in which the assignment is located, length of time worked, and may change depending on assignment. Benefit packages for direct hire placements vary based on the client company.
Contact us if you have any questions.
Our intention is to fill job vacancies as quickly as possible with qualified candidates. We are always accepting applications. If a time-sensitive job has an application deadline, it is noted in the job description. Click on "Apply" to begin the application process.