John Palazza - Vice President of Global Sales @ CentML ( sponsored)

2025/3/10
Machine Learning Street Talk (MLST)

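John Palazza: I believe that for an enterprise to use AI effectively, it needs a unified, top-down approach. Senior leadership has to set the vision and take ownership in order to create a culture where machine learning permeates every part of the business.

As they move from innovation to production to scale, enterprises need to shed the loose habits of the innovation phase. Paying attention to GPU utilization and cost control is critical, so that a shortage of compute does not end in bankruptcy.

CentML is committed to helping enterprises optimize their infrastructure so they can launch and use machine learning and generative AI more effectively, and to solving the challenges that infrastructure creates. The company was founded out of the need to use compute resources efficiently, particularly for large language models and AI.

Enterprises are starting to value the practicality and efficiency of compute resources rather than chasing scale for its own sake. They need to move from theoretical building to practical building to get the best financial and technical use out of machine learning.

Much existing compute is underutilized, and improving GPU utilization and efficiency is a key step toward addressing the compute shortage. CentML's goal is to help enterprises use the compute they already have as efficiently as possible to meet growing demand.

Although enterprises lean toward platformization, individuals still need to take part in decisions, which is common in practice. A lack of team collaboration and centralization within enterprises leads to wasted resources and inefficiency. Large enterprises need a top-down approach, with senior leadership setting the vision, to apply machine learning effectively across the business.

We talk to every level of the enterprise organization to understand its strategy for generative AI and large language models. Enterprise demand for large language models and generative AI keeps growing, but adoption paths and stages vary from one company to the next.

Many CTOs have not yet paid enough attention to compute utilization and efficiency, even though cost control matters at every step from innovation to production to scale.

Improving efficiency frees up resources, which in turn drives innovation and growth. I think the best opportunities in the near future will come from open-weight models. Enterprises prefer open-weight models because they are more flexible and easier to scale. As the Llama models improve, more and more enterprises are paying attention to open-weight models.

Enterprises should not settle for "good enough" solutions and should instead pursue more groundbreaking innovation. Partnerships with NVIDIA, Deloitte and others give CentML market insight and technical support.

CentML is not worried about cloud providers building similar products, because it has established good partnerships with them. CentML works with cloud providers to improve customer satisfaction and to offer customers more efficient solutions.

CentML currently has no plans to build its own cloud platform and is focused on working with existing cloud providers. CentML aims to give customers flexibility and freedom so they can easily switch models and infrastructure.

CentML is excited about the many startups in the AI space and sees it as a vibrant field full of opportunity.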

John Palazza from CentML joins us in this sponsored interview to discuss the critical importance of infrastructure optimization in the age of Large Language Models and Generative AI. We explore how enterprises can transition from the innovation phase to production and scale, highlighting the significance of efficient GPU utilization and cost management. The conversation covers the open-source versus proprietary model debate, the rise of AI agents, and the need for platform independence to avoid vendor lock-in, as well as emerging trends in AI infrastructure and the pivotal role of strategic partnerships.

SPONSOR MESSAGES:


CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!

https://centml.ai/pricing/

Tufa AI Labs is a brand new research lab in Zurich, started by Benjamin Crouzier and focused on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. They also run events in Zurich.

Go to https://tufalabs.ai/


TRANSCRIPT:

https://www.dropbox.com/scl/fi/dnjsygrgdgq5ng5fdlfjg/JOHNPALAZZA.pdf?rlkey=hl9wyydi9mj077rbg5acdmo3a&dl=0

John Palazza:

Vice President of Global Sales @ CentML

https://www.linkedin.com/in/john-p-b34655/

TOC:

1. Enterprise AI Organization and Strategy

[00:00:00] 1.1 Organizational Structure and ML Ownership

[00:02:59] 1.2 Infrastructure Efficiency and GPU Utilization

[00:07:59] 1.3 Platform Centralization vs Team Autonomy

[00:11:32] 1.4 Enterprise AI Adoption Strategy and Leadership

2. MLOps Infrastructure and Resource Management

[00:15:08] 2.1 Technology Evolution and Enterprise Integration

[00:19:10] 2.2 Enterprise MLOps Platform Development

[00:22:15] 2.3 AI Interface Evolution and Agent-Based Solutions

[00:25:47] 2.4 CentML's Infrastructure Solutions

[00:30:00] 2.5 Workload Abstraction and Resource Allocation

3. LLM Infrastructure Optimization and Independence

[00:33:10] 3.1 GPU Optimization and Cost Efficiency

[00:36:47] 3.2 AI Efficiency and Innovation Challenges

[00:41:40] 3.3 Cloud Provider Strategy and Infrastructure Control

[00:46:52] 3.4 Platform Independence and Vendor Lock-in

[00:50:53] 3.5 Technical Innovation and Growth Strategy

REFS:

[00:01:25] Apple Acquires GraphLab, Apple Inc.

https://techcrunch.com/2016/08/05/apple-acquires-turi-a-machine-learning-company/

[00:03:50] Bain Tech Report 2024, Bain & Company

https://www.bain.com/insights/topics/technology-report/

[00:04:50] PaaS vs IaaS Efficiency, Gartner

https://www.gartner.com/en/newsroom/press-releases/2024-11-19-gartner-forecasts-worldwide-public-cloud-end-user-spending-to-total-723-billion-dollars-in-2025

[00:14:55] Fashion Quote, Oscar Wilde

https://www.amazon.com/Complete-Works-Oscar-Wilde-Collins/dp/0007144369

[00:15:30] PointCast Network, PointCast Inc.

https://en.wikipedia.org/wiki/Push_technology

[00:18:05] AI Bain Report, Bain & Company

https://www.bain.com/insights/how-generative-ai-changes-the-game-in-tech-services-tech-report-2024/

[00:20:40] Uber Michelangelo, Uber Engineering Team

https://www.uber.com/en-SE/blog/michelangelo-machine-learning-platform/

[00:20:50] Algorithmia Acquisition, DataRobot

https://www.datarobot.com/newsroom/press/datarobot-is-acquiring-algorithmia-enhancing-leading-mlops-architecture-for-the-enterprise/

[00:22:55] Fine Tuning vs RAG, Heydar Soudani, Evangelos Kanoulas & Faegheh Hasibi

https://arxiv.org/html/2403.01432v2

[00:24:40] LLM Agent Survey, Lei Wang et al.

https://arxiv.org/abs/2308.11432

[00:26:30] CentML CServe, CentML

https://docs.centml.ai/apps/llm

[00:29:15] CentML Snowflake, Snowflake

https://www.snowflake.com/en/engineering-blog/optimize-llms-with-llama-snowflake-ai-stack/

[00:30:15] NVIDIA H100 GPU, NVIDIA

https://www.nvidia.com/en-us/data-center/h100/

[00:33:25] CentML's 60% savings, CentML

https://centml.ai/platform/