Distributed Systems Engineer Job at Magic, Remote

  • Magic
  • Remote

Job Description

Magic’s mission is to build safe AGI that accelerates humanity’s progress on the world’s most important problems. We believe the most promising path to safe AGI lies in automating research and code generation to improve models and solve alignment more reliably than humans can alone. Our approach combines frontier-scale pre-training, domain-specific RL, ultra-long context, and inference-time compute to achieve this goal.

About the role:

As a distributed systems engineer, you will build the data and coordination systems that enable ultra-long context inference and training on Magic’s GPU clusters. 

What you might work on: 

  • High-performance storage and caching systems to support long-context inference and training

  • Hacking on the internals of deep learning frameworks in the distributed setting

  • Automating fault detection and recovery systems to enable highly available training

  • Troubleshooting complex issues across GPUs, network, storage, OS, and cloud environments

What we’re looking for: 

  • Deep knowledge of distributed systems design and public cloud platforms

  • Experience designing and operating highly available, high-throughput data systems

  • Experience with the internals of distributed DBMS, batch and stream processing systems, and/or distributed file systems

  • Exceptional problem-solving skills up and down the stack

Magic strives to be the place where high-potential individuals can do their best work. We value quick learning and grit just as much as skill and experience.

Our culture:

  • Integrity. Words and actions should be aligned

  • Hands-on. At Magic, everyone is building 

  • Teamwork. We move as one team, not N individuals

  • Focus. Safely deploy AGI. Everything else is noise

  • Quality. Magic should feel like magic

Compensation, benefits and perks (US):

  • Annual salary range: $100K - $550K

  • Equity is a significant part of total compensation, in addition to salary

  • 401(k) plan with 6% salary matching

  • Generous health, dental and vision insurance for you and your dependents

  • Unlimited paid time off

  • Visa sponsorship and relocation stipend to bring you to SF, if possible

  • A small, fast-paced, highly focused team

Job Tags

Remote job, Relocation
