Generative artificial intelligence is at a pivotal moment. Enterprises want to know how to take advantage of massive amounts of data while keeping budgets in line with today’s economic pressures. Generative AI chatbots have become relatively easy to deploy, but they sometimes return false “hallucinations” or expose private data. The best of both worlds may come from more specialized conversational AI securely trained on an organization’s data.
Dell Technologies World 2023 brought this topic to Las Vegas this week. Throughout the first day of the conference, CEO Michael Dell and fellow executives drilled down into what AI could do for enterprises beyond ChatGPT.
“Enterprises are going to be able to train far simpler AI models on specific, confidential data less expensively and securely, driving breakthroughs in productivity and efficiency,” Michael Dell said.
Dell’s new Project Helix is a wide-reaching service that will assist organizations in running generative AI. Project Helix will be available as a public product for the first time in June 2023.
Offering custom vocabulary for purpose-built use cases
Enterprises are racing to deploy generative AI for domain-specific use cases, said Varun Chhabra, Dell Technologies senior vice president of product marketing, infrastructure solutions group and telecom. Dell’s solution, Project Helix, is a full-stack, on-premises offering in which companies train and guide their own proprietary AI.
For example, a company might deploy a large language model to read all of the knowledge articles on its website and answer a user’s questions based on a summary of those articles, said Forrester analyst Rowan Curran.
The AI would “not try to answer the question from knowledge ‘inside’ the model (ChatGPT answers from ‘inside’ the model),” Curran wrote in an email to TechRepublic.
It wouldn’t draw from the entire internet. Instead, the AI would be drawing from the proprietary content in the knowledge articles. This would allow it to more directly address the needs of one specific company and its customers.
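To make the idea concrete, here is a minimal sketch of answering from a company's own articles rather than from knowledge "inside" a model. Everything in it is an assumption for illustration: the sample articles, the function names and the simple word-overlap scoring are hypothetical, and this is not how Project Helix or any vendor product is implemented. A production system would likely use embeddings and a real language model; the sketch only shows retrieval and the construction of a prompt restricted to the retrieved text.

```python
import re
from collections import Counter

# Hypothetical knowledge articles; a real deployment would load the company's own content.
ARTICLES = {
    "reset-password": "To reset your password, open Settings, choose Security, and select Reset Password.",
    "return-policy": "Products can be returned within 30 days of delivery with the original receipt.",
    "shipping-times": "Standard shipping takes five business days. Express shipping takes two business days.",
}

# Small, illustrative stopword list so common words do not count as matches.
STOPWORDS = {"a", "an", "the", "of", "to", "is", "what", "how", "do", "i",
             "my", "can", "be", "and", "with", "your"}


def tokenize(text):
    """Lowercase the text and return its word tokens, minus stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def score(question, article):
    """Count the word occurrences shared by the question and the article."""
    shared = Counter(tokenize(question)) & Counter(tokenize(article))
    return sum(shared.values())


def retrieve(question, articles, top_k=1):
    """Return ids of the best-matching articles, skipping those with no overlap."""
    ranked = sorted(articles, key=lambda k: score(question, articles[k]), reverse=True)
    return [k for k in ranked if score(question, articles[k]) > 0][:top_k]


def build_prompt(question, articles, top_k=1):
    """Build a prompt that tells a model to answer only from retrieved articles.

    Returns None when nothing relevant is found, so the caller can decline
    to answer instead of letting a model guess.
    """
    ids = retrieve(question, articles, top_k)
    if not ids:
        return None
    context = "\n".join(f"[{i}] {articles[i]}" for i in ids)
    return (
        "Answer using only the articles below. "
        "If they do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How do I reset my password?", ARTICLES))
```

The design choice worth noting is the None return: when no article matches, the caller can decline rather than fall back on whatever the model happens to "know."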
“Dell’s strategy here is really a hardware and software and services strategy allowing businesses to build models more effectively,” said Brent Ellis, senior analyst at Forrester. “Providing a streamlined, validated platform for model creation and training will be a growing market in the future as businesses look to create AI models that focus on the specific problems they need to solve.”
However, enterprises run into stumbling blocks when trying to tailor AI to their specific needs.
“Not surprisingly, there’s a lot of specific needs that are coming up,” Chhabra said at the Dell conference. “Things like the outcomes have to be trusted. It’s very different from a general purpose model that maybe anybody can go and access. There could be all kinds of answers that need to be guard-railed or questions that need to be watched out for.”
Hallucinations and incorrect assertions can be common. For use cases involving proprietary information or anonymized customer behavior, privacy and security are paramount.
Enterprise customers may also choose custom, on-premises AI because of privacy and security concerns, said Kari Ann Briski, vice president of AI software product management at NVIDIA.
In addition, compute cycle and inferencing costs tend to be higher in the cloud.
“Once you have that training model and you’ve customized and conditioned it to your brand voice and your data, running unoptimized inference to save on compute cycles is another area that’s of concern to a lot of customers,” said Briski.
Different enterprises have different needs from generative AI, from those using open-source models to those that can build models from scratch or want to figure out how to run a model in production. People are asking, “What’s the right mix of infrastructure for training versus infrastructure for inference, and how do you optimize that? How do you run it for production?” Briski said.
Dell characterizes Project Helix as a way to enable safe, secure, personalized generative AI no matter how a potential customer answers those questions.
“As we move forward in this technology, we are seeing more and more work to make the models as small and efficient as possible while still reaching similar levels of performance to larger models, and this is done by directing fine-tuning and distillation towards specific tasks,” said Curran.
Changing DevOps — one bot at a time
Where does on-premises AI like this fit within operations? Anywhere from code generation to unit testing, said Ellis. Focused AI models are particularly good at these tasks. Some developers may use AI like TuringBots to do everything from planning to deploying code.
At NVIDIA, development teams have been adopting a term called LLMOps instead of machine learning ops, Briski said.
“You’re not coding to it; you’re asking human questions,” she said.
In turn, reinforcement learning from human feedback provided by subject matter experts helps the AI understand whether it’s responding to prompts correctly. This is part of how NVIDIA uses its NeMo framework, a tool for building and deploying generative AI.
“The way the developers are now engaging with this model is going to be completely different in terms of how you maintain it and update it,” Briski said.
Behind the scenes with NVIDIA hardware
The hardware behind Project Helix includes NVIDIA H100 Tensor Core GPUs and NVIDIA networking, plus Dell servers. Briski pointed out that form follows function.
“For every generation of our new hardware architecture, our software has to be ready day one,” she said. “We also think about the most important workloads before we even tape out the chip.
“… For example, for H100, it’s the Transformer engine. NVIDIA Transformers are a really important workload for ourselves and for the world, so we put the Transformer engine into the H100.”
Dell and NVIDIA together developed the PowerEdge XE9680 and the rest of the PowerEdge family of servers specifically for complex, emerging AI and high-performance computing workloads. The companies had to make sure the servers could perform at scale as well as handle high-bandwidth processing, Chhabra said.
NVIDIA has come a long way since the company trained a vision-based AI on the Volta GPU in 2017, Briski pointed out. Now, NVIDIA uses hundreds of nodes and thousands of GPUs to run its data center infrastructure systems.
NVIDIA is also using large language model AI in its hardware design.
“One thing (NVIDIA CEO) Jensen (Huang) has challenged NVIDIA to do six or seven years ago when deep learning emerged is every team must adopt deep learning,” Briski said. “He’s doing the exact same thing for large language models. The semiconductor team is using large language models; our marketing team is using large language models; we have the API build for access internally.”
This ties back to the concept of security and privacy guardrails. An NVIDIA employee can ask the human resources AI whether they can get HR benefits to support adopting a child, for example, but not whether other employees have adopted a child.
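A guardrail of this kind can be pictured as a check that runs before the model ever sees the question. The sketch below is purely illustrative and assumes a hypothetical request format; the class, field and function names are invented here and do not describe NVIDIA's internal tooling. It allows general policy questions and questions about the requester's own records, and blocks questions about anyone else.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HRRequest:
    """A hypothetical, already-classified question sent to an HR assistant."""
    requester_id: str                  # employee asking the question
    topic: str                         # e.g. "adoption_benefits"
    subject_id: Optional[str] = None   # employee the question is about; None for general policy


def is_allowed(request):
    """Allow general policy questions and questions about the requester only."""
    if request.subject_id is None:
        return True
    return request.subject_id == request.requester_id


def handle(request):
    """Return a placeholder response; a real system would call a model when allowed."""
    if not is_allowed(request):
        return "Sorry, I can't share information about other employees."
    return f"Looking up policy information on: {request.topic}"


if __name__ == "__main__":
    # General policy question: allowed.
    print(handle(HRRequest("emp-001", "adoption_benefits")))
    # Question about a different employee: blocked.
    print(handle(HRRequest("emp-001", "adoption_benefits", subject_id="emp-002")))
```

In practice, working out who a free-text question is about is the hard part; this sketch assumes that classification has already happened upstream.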
Should your business use custom generative AI?
If your business is considering whether to use generative AI, you should consider whether it has the need and the capacity to change or optimize that AI at scale. In addition, you should consider your security needs. Briski cautions against using public LLMs that are black boxes when it comes to finding out where they get their data.
In particular, it’s important to be able to prove whether the dataset that went into that foundational model can be used commercially.
Along with Dell’s Project Helix, Microsoft’s Copilot projects and IBM’s watsonx tools show the breadth of options available when it comes to purpose-built AI models, Ellis said. Hugging Face, Google, Meta AI and Databricks offer open source LLMs, while Amazon, Anthropic, Cohere and OpenAI provide AI services. Facebook and OpenAI may offer their own on-premises options one day, and many other vendors are lining up to try to join this buzzy field.
“General models are exposed to greater datasets and have the capability to make connections that more limited datasets in purpose-built models do not have access to,” Ellis said. “However, as we are seeing in the market, general models can make erroneous predictions and ‘hallucinate.’
“Purpose-built models help limit that hallucination, but even more important is the tuning that happens after a model is created.”
Overall, whether an organization should use a general-purpose model or train its own depends on what it wants the model to do.
Disclaimer: Dell paid for my airfare, accommodations and some meals for the Dell Technologies World event held May 22-25 in Las Vegas.