
Staff Software Engineer - Infinia IO Path

DataDirect Networks
United States
Apr 14, 2025

Job Locations: US-Remote
Job ID: 2025-5267
Name Linked: Remote: US
Country: United States
City: Remote
Worker Type: Regular Full-Time Employee



Overview

This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, government, academia, research, and manufacturing.

"DDN's A3I solutions are transforming the landscape of AI infrastructure." - IDC

"The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments" - Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence.

Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project. This is a chance to make a significant impact at a company that is shaping the future of AI and data management.

Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage.



Job Description

We are looking for a Staff Software Engineer to join the Infinia IO Path Team, the core engine driving performance, scalability, and reliability across DDN's next-generation Data Intelligence Platform. You'll work alongside top engineers and report directly to a Sr. Engineering Manager focused on erasure coding, SPDK integration, and distributed concurrency control - shaping the future of extreme-scale I/O in AI and data-intensive environments.

In this role, you will serve as a technical authority, responsible for architecting and optimizing the I/O stack, influencing design decisions, mentoring other engineers, and delivering robust, high-performance solutions that power mission-critical workloads. This is a hands-on, deep systems engineering role - ideal for someone who thrives in complex, high-throughput, low-latency environments.

Key Responsibilities
Core Engineering & Architecture
  • Design, develop, and optimize the Infinia I/O path to achieve extreme performance, low latency, and high concurrency under real-world workloads.
  • Implement and refine erasure coding algorithms, data durability techniques, and storage efficiency strategies across large clusters.
  • Integrate and optimize SPDK (Storage Performance Development Kit) into the I/O layer to leverage direct hardware access and minimize kernel overhead.
  • Work on distributed locking mechanisms and concurrency models to ensure consistency, fault tolerance, and throughput at scale.
Performance & Scalability
  • Analyze and improve I/O subsystems under stress - including memory usage, buffer management, caching strategies, and scheduling mechanisms.
  • Drive performance tuning, data path tracing, and throughput profiling across multi-node, multi-device environments.
  • Participate in the design of asynchronous, event-driven architectures that support AI pipelines, high-speed data ingestion, and real-time analytics.
Technical Leadership
  • Serve as a go-to expert on the IO Path team - leading deep technical discussions, code reviews, design validation, and issue triage.
  • Work closely with the Sr. Manager of Engineering to shape the technical roadmap, evaluate new technologies, and ensure architectural alignment.
  • Mentor junior engineers and contribute to a culture of quality, rigor, and engineering excellence.
Cross-Team Collaboration
  • Partner with QA, Storage, Networking, and Performance teams to validate functionality and meet reliability goals.
  • Collaborate with Field and Support teams to debug and resolve customer issues, feeding real-world insights back into design improvements.
  • Influence test coverage strategies and CI/CD pipelines to ensure scalable and reliable software delivery.
Required Qualifications
  • 10+ years of experience building large-scale systems software in C/C++.
  • Deep understanding of I/O path optimization, storage scheduling, memory management, and data layout strategies.
  • Proven experience implementing erasure coding, data redundancy, and recovery algorithms in a distributed context.
  • Hands-on experience integrating with or optimizing SPDK or similar user-space storage frameworks.
  • Strong knowledge of distributed concurrency, locking mechanisms, and fault-tolerant design principles.
  • Familiarity with performance profiling tools, memory allocators, and benchmarking methodologies.
  • Excellent debugging skills, with the ability to diagnose complex race conditions, performance regressions, and hardware/software interactions.
Preferred Qualifications
  • Experience working in high-performance computing (HPC), AI data infrastructure, or large-scale storage platforms.
  • Knowledge of NVMe-oF, RDMA, and other high-speed I/O protocols.
  • Familiarity with cluster schedulers, resource orchestration, or containerized I/O workloads.
  • Exposure to distributed file systems, object storage layers, or hybrid data models.
  • Background in implementing monitoring hooks, observability layers, or self-healing I/O subsystems.

This position requires participation in an on-call rotation to provide after-hours support as needed.

Success Metrics - First 30 Days
Technical Integration
  • Ramp up on Infinia's architecture, codebase, and tooling for I/O path development.
  • Review and contribute to at least two active development efforts (e.g., erasure coding optimization, SPDK integration).
  • Deliver an initial technical assessment identifying 2-3 areas for optimization or redesign.
Team Contribution
  • Collaborate with peers in design reviews and participate in team-wide architecture sessions.
  • Begin mentoring a junior engineer or collaborating pairwise on deep dives and bug triage.
Strategic Input
  • Work with your manager to co-author a technical proposal or spike around a key roadmap feature (e.g., lockless concurrency or multi-threaded ingest).
  • Define clear performance baselines for one subsystem and set stretch goals for throughput or latency.
Success Metrics - Beyond 30 Days
  • Measurable performance improvements in I/O throughput, latency, or concurrency under production workloads.
  • Key architectural contributions adopted into the long-term roadmap.
  • Resolution of critical bugs or performance regressions identified through customer incidents or internal testing.
  • Recognition as a core contributor and thought partner to IO Path leadership and platform stakeholders.


DDN

Join our dynamic and driven team, where engineering excellence is at the heart of everything we do. We seek individuals who love to challenge themselves and are fueled by curiosity. Here, you'll have the opportunity to work across various areas of the company, thanks to our flat organizational structure that encourages hands-on involvement and direct contributions to our mission. Leadership is earned by those who take initiative and consistently deliver outstanding results, both in their work ethic and deliverables, making strong prioritization skills essential. Additionally, we value strong communication skills in all our engineers and researchers, as they are crucial for the success of our teams and the company as a whole.

Interview Process: After submitting your application, one of our recruiters will review your resume. If your application passes this stage, you will be invited to a 30-minute interview during which a member of our team will ask some basic questions. If you clear the interview, you will enter the main process, which can consist of up to four interviews in total:

  • Coding assessment: Often in a language of your choice.
  • Systems design: Translate high-level requirements into a scalable, fault-tolerant service (depending on role).
  • Real-time problem-solving: Demonstrate practical skills in a live problem-solving session.
  • Meet and greet with the wider team.

Our goal is to finish the main process in 2-3 weeks at most.

DataDirect Networks (DDN) is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity, gender expression, transgender, sex stereotyping, sexual orientation, national origin, disability, protected Veteran Status, or any other characteristic protected by applicable federal, state, or local law.

