Democratizing Data Science: A Federated Approach to Supporting AI and ML
Eighteen months ago, I began leading an IT team charged with better equipping data scientists to build processes driven by artificial intelligence (AI) and machine learning (ML) at Dell. As we gathered feedback, I was prepared to identify resource gaps and create the tools needed to fill them.
What I wasn’t prepared for was the fact that some 1,800 data scientists and analysts were already immersed in AI and ML projects across the company. AI and ML were not pursuits of the future at Dell. They were already happening, and they were happening outside the realm of IT.
I quickly realized IT didn’t need to shape data science and analytics at Dell. It needed to improve the resources available to the people already driving that effort, so they could get the job done more easily and efficiently. In a fast-moving world where weeks and months really matter, they needed better compute, storage, software, and data access, plus a more efficient path to production, while continuing to pursue their already abundant data science and analytics work.
My real job was to democratize data science, not to dictate its direction.
Cloud-like Data Science Tools
We spent four months as a team interviewing data scientists, engineers, and analysts from an array of business units to learn about their experience, objectives, and needs.
They said it was difficult to get the right compute and storage resources to do the work. It was also difficult to keep up with the latest software, because their tools are largely open source and constantly evolving. Then there was the arduous task of finding the data and bringing it into a workspace in a timely manner; that alone could take two to three months. And finally, once they created a model, it was difficult to deploy it to production.
Based on feedback, the team set out to build an internal cloud-like platform to provide data science practitioners with self-service access to the AI and ML tools and resources they required.
Using Dell’s product development model, a user-centric, simplified, and streamlined approach to quickly designing, developing, iterating on, and delivering new capabilities, we built proprietary software on top of Dell’s private cloud infrastructure. The new system provisions everything data scientists need for each step of their work, including software, compute, and storage environments.
It features four components:
- A workspace, backed by on-demand, elastic storage and compute resources, for building and training AI and ML models into a prototype.
- AI and ML Ops standards that offer a path to deploy completed prototypes into production, so they can integrate with apps and processes to add value continuously.
- Automated access to data sets in a secure and privacy-compliant manner.
- A knowledge base that helps practitioners be more efficient throughout the data science process.
The biggest risk from the start was getting users to adopt our tools. Initially, we had 10+ different tools that were not fully meeting user needs. We started small and enlisted two data scientists to test our platform. We grew to 25 users within the first couple of weeks. Because we built the capabilities with data scientists’ participation, they became our advocates, and we grew past 500 users by the end of our first year. Given our current velocity, we aim to have 1,500+ total users in the next 12 months.
Centralized vs. Federated Data Science
One major challenge IT faced in tackling data scientists’ requirements was recognizing the need to shift away from a traditional centralized approach. It was important that data science stakeholders be able to continue their efforts independently.
Data science, after all, is being pursued across many business units to address very specific business needs. For example, in sales finance, we have a team that is just focused on using AI to set and manage quota guidance.
We realized that, in this case, centralizing the knowledge of the many businesses where data science was taking place didn’t make sense. IT was never going to know how these things need to be built to address such specific requirements. We had to acknowledge that innovation could happen only close to the business problem and the business customer, and that the people closest to them had to be involved in writing application code.
On the other hand, there were clear ways that centralization was important. For instance, if you want to win at AI, you have to have access to large compute and storage pools. Our 40+ data science teams, however, only had access to their individual, small computers to pursue AI. Everyone was paying for compute, but they were still ending up with inadequate compute resources. With a centralized cloud environment, they could access powerful compute, storage, and other resources to pursue solutions more efficiently and at lower cost.
What we needed to do was centralize the infrastructure and the data security but maintain a decentralized process for actual data science projects.
We generally see data science take root in organizations in two ways: a central approach in which a single organization oversees all AI, or a federated model in which IT provides the platform and the business data, and data scientists leverage them to pursue AI independently. At Dell, we already had a flourishing federated operating model that we are now supporting and enabling to grow.
We built our Enterprise Data Science platform using Dell technology, including Dell PowerEdge 740 servers, Dell ECS for data object storage, Dell ScaleIO for block storage and Tanzu Kubernetes Grid for container capabilities.
One of the key early decisions we made was to enable container-based workloads so that users could leverage many modern software engineering and DevOps capabilities. Being able to put a data science model in a container with Tanzu Kubernetes lets IT give data scientists the flexibility to choose their own tools.
Data scientists focus on building their models and processes within a container, and, as IT, we focus on making sure the models are supplied with the right compute resources and on delivering automated compliance from a security and privacy perspective.
This approach enables us to use the underlying hardware on-demand, effectively reducing compute and storage requirements by 70%, while at the same time giving each data scientist a larger compute pool to tap into.
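A rough sketch can show why a shared pool may need less total hardware than teams that each provision for themselves. The demand figures and function names below are invented for illustration and are not Dell measurements. The idea is that siloed teams must each cover their own peak, while a shared pool only has to cover the peak of their combined demand.

```python
from typing import Dict, List

# Hypothetical hourly CPU-core demand for three teams (illustrative numbers only).
demand: Dict[str, List[int]] = {
    "sales_finance": [4, 32, 6, 4],
    "supply_chain": [30, 5, 4, 6],
    "services": [5, 4, 28, 6],
}

def siloed_capacity(demand: Dict[str, List[int]]) -> int:
    """Cores needed when every team provisions for its own peak."""
    return sum(max(hours) for hours in demand.values())

def pooled_capacity(demand: Dict[str, List[int]]) -> int:
    """Cores needed when one shared pool covers the combined peak."""
    combined = [sum(hour) for hour in zip(*demand.values())]
    return max(combined)

siloed = siloed_capacity(demand)
pooled = pooled_capacity(demand)
print(f"siloed={siloed} cores, pooled={pooled} cores")
```

The size of the saving depends entirely on how little the teams’ peaks overlap; this toy example is not meant to reproduce the 70% figure reported above.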
From the start, the move to a federated model that empowers data scientists meant we had to rethink and modernize our policies, from security and privacy to software application policies and networking practices. Previously, for example, you could only deploy into production if you were an IT team member. We adapted that process by providing automated guardrails that let business-hosted data science teams build algorithms that can integrate with production environments in a way that complies with our policies.
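As a minimal sketch of what an automated guardrail could look like, the example below checks a deployment request against a few policy rules before it is allowed into production. Everything here is an assumption made for illustration: the `DeploymentRequest` fields, the registry name, the data set names, and the rules themselves are hypothetical and do not describe Dell’s actual policies or platform.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DeploymentRequest:
    """Hypothetical description of a model container submitted for production."""
    team: str
    image: str
    runs_as_root: bool
    data_sets: List[str] = field(default_factory=list)

# Assumed policy inputs; a real platform would load these from its own systems.
APPROVED_REGISTRY = "registry.example.internal/"
DATA_SETS_CLEARED_FOR = {"sales_finance": {"quota_history", "bookings"}}

def check_guardrails(req: DeploymentRequest) -> List[str]:
    """Return a list of policy violations; an empty list means the request passes."""
    violations = []
    if not req.image.startswith(APPROVED_REGISTRY):
        violations.append("image must come from the approved registry")
    if req.runs_as_root:
        violations.append("container must not run as root")
    cleared = DATA_SETS_CLEARED_FOR.get(req.team, set())
    for name in req.data_sets:
        if name not in cleared:
            violations.append(f"team is not cleared for data set '{name}'")
    return violations

ok = DeploymentRequest(
    "sales_finance", "registry.example.internal/quota-model:1.2", False, ["quota_history"]
)
bad = DeploymentRequest(
    "sales_finance", "docker.io/someone/quota-model:latest", True, ["hr_records"]
)
print(check_guardrails(ok))   # no violations expected
print(check_guardrails(bad))  # one violation per broken rule
```

Because the checks run automatically, a business-hosted team can deploy without an IT team member in the loop, while the policy itself stays centrally defined.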
Our data science users now tell IT we are moving in the right direction, with a centralized AI and ML Ops practice that gives them the freedom to move at the speed of our customers and business. We are continuing to refine our democratization of data science.
View the on-demand session “Utilizing AI, Data and Analytics to Drive Business Outcomes,” recorded during Dell Technologies World, May 5-6, 2021. We can help you be ready for whatever comes next.