John Palazza - Vice President of Global Sales @ CentML

2025/3/10
Machine Learning Street Talk (MLST)

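John Palazza: I believe enterprises should take a top-down approach to implementing their AI strategy. Vision from senior leadership, and the priority it gives to AI, creates a culture in which machine learning can reach every part of the business. The enterprises that adopt AI successfully tend to start at the top, form a unified strategy, and then roll it out step by step.

Moving from AI innovation to production and scale requires a change of mindset. In the innovation phase you can be flexible with resources, but in production and at scale you have to pay attention to infrastructure efficiency and cost control, so that excessive compute consumption does not turn into a financial problem. For example, using H100 GPUs during testing is fine, but rolling out to 45,000 users means thinking about cost.

CentML is committed to helping enterprises optimize their AI infrastructure, lower the barrier to entry, and improve the utilization of large language models. Our focus is on giving customers the best compute utilization throughout their AI journey: the right compute, for the right workload, at the right time.

We find that many enterprises have a GPU utilization problem, often reaching only 30-40% utilization. Improving GPU utilization is key to addressing the shortage of AI compute. CentML's goal is to help enterprises get the most out of the compute they already have, run AI workloads efficiently in any environment (cloud or on-premises), and support all types of GPUs.

Enterprise AI adoption is fragmented and needs more collaboration and unification. For example, multiple teams may each run their own sentiment analysis without coordinating. We need to look at AI adoption from a more unified perspective rather than letting each team go its own way.

Open-source large language models will become the mainstream trend in enterprise AI. As open-source models such as Llama improve, more and more enterprises are paying attention to them and using them, not only because they are easy to use but also because they give enterprises better control over their own data and environment.

AI agents will greatly increase the efficiency and business impact of AI in the enterprise. Agents can help with a wide range of problems, such as troubleshooting, data center management, and healthcare, and they will become the focus of future enterprise AI applications.

CentML provides a complete LLM operations platform, including C-Serve, C-Train, and C-Cluster, designed to simplify the management of AI workloads so that users do not need to worry much about the underlying details. Our platform runs in a variety of environments and optimizes workloads, which significantly reduces AI compute costs.

Our partnerships with companies such as NVIDIA and Deloitte help us expand our market and understand industry trends in depth. We are not worried about cloud providers building similar products, because we have established partnerships with these vendors and are focused on improving customer satisfaction.

Improving AI efficiency lowers costs and frees up resources, which in turn drives further innovation and development.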

Chapters
This chapter explores the challenges of AI ownership within organizations and proposes a top-down approach for successful AI adoption. It emphasizes the importance of a unified strategy for efficient AI implementation and highlights the contrast between centralized and decentralized models.
  • A top-down approach to AI adoption is more effective.
  • Centralized AI strategies improve efficiency.
  • Successful companies integrate AI across all business facets.

Shownotes

John Palazza from CentML joins us in this sponsored interview to discuss the critical importance of infrastructure optimization in the age of Large Language Models and Generative AI. We explore how enterprises can transition from the innovation phase to production and scale, highlighting the significance of efficient GPU utilization and cost management. The conversation covers the open-source versus proprietary model debate, the rise of AI agents, and the need for platform independence to avoid vendor lock-in, as well as emerging trends in AI infrastructure and the pivotal role of strategic partnerships.

SPONSOR MESSAGES:


CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!

https://centml.ai/pricing/

Tufa AI Labs is a brand new research lab in Zurich, started by Benjamin Crouzier and focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich.

Go to https://tufalabs.ai/


TRANSCRIPT:

https://www.dropbox.com/scl/fi/dnjsygrgdgq5ng5fdlfjg/JOHNPALAZZA.pdf?rlkey=hl9wyydi9mj077rbg5acdmo3a&dl=0

John Palazza:

Vice President of Global Sales @ CentML

https://www.linkedin.com/in/john-p-b34655/

TOC:

  1. Enterprise AI Organization and Strategy

    [00:00:00] 1.1 Organizational Structure and ML Ownership

    [00:02:59] 1.2 Infrastructure Efficiency and GPU Utilization

    [00:07:59] 1.3 Platform Centralization vs Team Autonomy

    [00:11:32] 1.4 Enterprise AI Adoption Strategy and Leadership

  2. MLOps Infrastructure and Resource Management

    [00:15:08] 2.1 Technology Evolution and Enterprise Integration

    [00:19:10] 2.2 Enterprise MLOps Platform Development

    [00:22:15] 2.3 AI Interface Evolution and Agent-Based Solutions

    [00:25:47] 2.4 CentML's Infrastructure Solutions

    [00:30:00] 2.5 Workload Abstraction and Resource Allocation

  3. LLM Infrastructure Optimization and Independence

    [00:33:10] 3.1 GPU Optimization and Cost Efficiency

    [00:36:47] 3.2 AI Efficiency and Innovation Challenges

    [00:41:40] 3.3 Cloud Provider Strategy and Infrastructure Control

    [00:46:52] 3.4 Platform Independence and Vendor Lock-in

    [00:50:53] 3.5 Technical Innovation and Growth Strategy

REFS:

[00:01:25] Apple Acquires GraphLab, Apple Inc.

https://techcrunch.com/2016/08/05/apple-acquires-turi-a-machine-learning-company/

[00:03:50] Bain Tech Report 2024, Bain & Company

https://www.bain.com/insights/topics/technology-report/

[00:04:50] PaaS vs IaaS Efficiency, Gartner

https://www.gartner.com/en/newsroom/press-releases/2024-11-19-gartner-forecasts-worldwide-public-cloud-end-user-spending-to-total-723-billion-dollars-in-2025

[00:14:55] Fashion Quote, Oscar Wilde

https://www.amazon.com/Complete-Works-Oscar-Wilde-Collins/dp/0007144369

[00:15:30] PointCast Network, PointCast Inc.

https://en.wikipedia.org/wiki/Push_technology

[00:18:05] AI Bain Report, Bain & Company

https://www.bain.com/insights/how-generative-ai-changes-the-game-in-tech-services-tech-report-2024/

[00:20:40] Uber Michelangelo, Uber Engineering Team

https://www.uber.com/en-SE/blog/michelangelo-machine-learning-platform/

[00:20:50] Algorithmia Acquisition, DataRobot

https://www.datarobot.com/newsroom/press/datarobot-is-acquiring-algorithmia-enhancing-leading-mlops-architecture-for-the-enterprise/

[00:22:55] Fine Tuning vs RAG, Heydar Soudani, Evangelos Kanoulas & Faegheh Hasibi

https://arxiv.org/html/2403.01432v2

[00:24:40] LLM Agent Survey, Lei Wang et al.

https://arxiv.org/abs/2308.11432

[00:26:30] CentML CServe, CentML

https://docs.centml.ai/apps/llm

[00:29:15] CentML Snowflake, Snowflake

https://www.snowflake.com/en/engineering-blog/optimize-llms-with-llama-snowflake-ai-stack/

[00:30:15] NVIDIA H100 GPU, NVIDIA

https://www.nvidia.com/en-us/data-center/h100/

[00:33:25] CentML's 60% savings, CentML

https://centml.ai/platform/

<trunc, see pdf>