Your work days are brighter here.
We’re obsessed with making hard work pay off, for our people, our customers, and the world around us. As a Fortune 500 company and a leading AI platform for managing people, money, and agents, we’re shaping the future of work so teams can reach their potential and focus on what matters most. The minute you join, you’ll feel it. Not just in the products we build, but in how we show up for each other. Our culture is rooted in integrity, empathy, and shared enthusiasm. We’re in this together, tackling big challenges with bold ideas and genuine care. We look for curious minds and courageous collaborators who bring sun-drenched optimism and drive. Whether you’re building smarter solutions, supporting customers, or creating a space where everyone belongs, you’ll do meaningful work with Workmates who’ve got your back. In return, we’ll give you the trust to take risks, the tools to grow, the skills to develop, and the support of a company invested in you for the long haul. So, if you want to inspire a brighter work day for everyone, including yourself, you’ve found a match in Workday, and we hope to be a match for you too.
About the Team
The AI Model Serving team is the engine behind every production Workday agent and machine learning use case. We own the services that power all production AI workloads, serving as the gateway to vendor-hosted LLMs on GCP and AWS Bedrock and operating the model deployment platform where Workday hosts and scales its models.
About the Role
As a Principal Software Development Engineer on the AI Model Serving team, you will be a technical leader who helps shape the vision and direction of the platform alongside the engineering manager. You will play a central role in making critical design decisions, driving outcomes across the team, and setting a positive and inclusive team culture. In addition to the model serving platform, the team also owns the production model registry at Workday, and you will help guide its evolution and ensure it meets the needs of ML teams across the organization.
Your work will directly impact Workday's ability to serve AI at scale — from traditional ML models to the latest large language models powering Workday's agents.
Key Responsibilities:
Help set the product vision for the AI Model Serving platform in partnership with the engineering manager, bringing a product-oriented mindset to infrastructure decisions.
Lead the team technically by making critical design decisions that drive performance, reliability, and scalability across the platform.
Design, implement, and maintain large-scale systems that enable moving ML models to production.
Write design documents to build consensus for new system components and enhancements to existing components.
Evaluate and adopt new technologies made available within Workday and across the broader industry.
Troubleshoot, improve, and scale continuous integration software pipelines.
Develop relationships with software engineers, machine learning engineers, and data scientists on partner teams.
Respond to alerts and debug production issues to maintain platform health and reliability.
Review pull requests and enforce consistency, performance, readability, and security across code bases.
Develop documentation to share knowledge with other engineers.
About You
Basic Qualifications
8+ years of related work experience in software development, with a focus on building and operating large-scale distributed systems.
Bachelor's degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience).
Other Qualifications
Software Development and Distributed Systems: Deep experience designing, building, and scaling production-grade distributed systems. You understand the full software development lifecycle — from coding standards and testing to code reviews, source control management, and deployment — and you can apply that knowledge to complex, high-throughput platforms.
Product Thinking and Design: You bring a product-oriented perspective to platform engineering. You can identify what matters most for internal and external users of the platform, translate those insights into technical direction, and make design decisions that balance usability, performance, and long-term maintainability.
Python: Deep proficiency in Python, with extensive experience writing production-level code and building systems in Python-based frameworks.
LLMs and Traditional ML Models: Familiarity with both large language models and traditional machine learning models, including how they are served, scaled, and monitored in production environments. You understand the operational differences and can design platform abstractions that serve both effectively.
Ray Serve: Experience with Ray and Ray Serve for distributed model serving at scale. You understand how to operate, tune, and scale Ray Serve clusters in production.
Prometheus: Experience with Prometheus for monitoring and observability of distributed systems. You can design and maintain monitoring strategies that provide clear insight into system health, performance, and cost.
Excellent written and verbal communication skills, including the ability to write clear design documents, articulate complex technical ideas, and build consensus across teams.
A collaborative approach to engineering, with experience mentoring other engineers and fostering an inclusive team environment.
Workday Pay Transparency Statement
The annualized base salary ranges for the primary location and any additional locations are listed below. Workday pay ranges vary based on work location. As part of the total compensation package, this role may be eligible for the Workday Bonus Plan or a role-specific commission/bonus, as well as annual refresh stock grants. Recruiters can share more detail during the hiring process. Each candidate’s compensation offer will be based on multiple factors including, but not limited to, geography, experience, skills, job duties, and business need. For more information regarding Workday’s comprehensive benefits, please click here.
Primary Location: CAN.ON.Toronto
Primary CAN Base Pay Range: $168,000 - $252,000 CAD
Additional CAN Location(s) Base Pay Range: $168,000 - $252,000 CAD
Our Approach to Flexible Work
With Flex Work, we’re combining the best of both worlds: in-person time and remote. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. We know that flexibility can take shape in many ways, so rather than requiring a set number of in-office days each week, we simply spend at least half (50%) of our time each quarter in the office or in the field with our customers, prospects, and partners (depending on role). This means you'll have the freedom to create a flexible schedule that caters to your business, team, and personal needs, while being intentional about making the most of time spent together. Those in our remote "home office" roles also have the opportunity to come together in our offices for important moments that matter.
Pursuant to applicable Fair Chance law, Workday will consider for employment qualified applicants with arrest and conviction records.
Workday is an Equal Opportunity Employer including individuals with disabilities and protected veterans.
At Workday, we are committed to providing an accessible and inclusive hiring experience where all candidates can fully demonstrate their skills. If you require assistance or an accommodation at any point, please email accommodations@workday.com.
Are you being referred to one of our roles? If so, ask your connection at Workday about our Employee Referral process!
At Workday, we value our candidates’ privacy and data security. Workday will never ask candidates to apply to jobs through websites that are not Workday Careers.
Please be aware of sites that may ask you to enter your data in connection with a job posting that appears to be from Workday but is not.
In addition, Workday will never ask candidates to pay a recruiting fee, or pay for consulting or coaching services, in order to apply for a job at Workday.