It's Pre:Invent time! Excited to announce that HUGS is now supported and available on Amazon Web Services (AWS) Inferentia2! HUGS, or Hugging Face Generative AI Services, are optimized, zero-configuration inference microservices designed to simplify and accelerate the development of AI applications with open models, built on open-source technologies. 👀 Think NVIDIA NIMs, but on AWS Neuron accelerators. No compilation or configuration headaches: take your open model and run it painlessly on AWS accelerators within minutes. 😍 Starting today, HUGS on AWS Neuron supports Meta Llama, Nous Research Hermes, Mistral, and Mixtral models, with more to come! 🚀 Want to get started? Check out our guide on deploying Llama 3.1 on AWS EKS with AWS Inferentia2 instances: https://lnkd.in/e24dYp_z
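For a feel of what "zero-configuration" means in practice: HUGS containers expose an OpenAI-compatible chat-completions API, so once a model is deployed you can talk to it with a plain HTTP POST. Here is a minimal sketch using only the Python standard library; the endpoint URL and model name are placeholders for illustration, not a real deployment — substitute the address of your own service (e.g. the EKS service from the linked guide).

```python
import json
import urllib.request

def build_chat_request(endpoint, model, user_message):
    """Build an OpenAI-style chat-completions request for a HUGS endpoint.

    `endpoint` is a placeholder -- point it at your own deployment.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{endpoint}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# No network call is made until urllib.request.urlopen(req) is invoked.
req = build_chat_request(
    "http://localhost:8080",  # hypothetical placeholder endpoint
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "What is AWS Inferentia2?",
)
print(req.full_url)
```

Because the wire format is OpenAI-compatible, existing client code usually works against a HUGS deployment with only the base URL changed.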
That is HUGE!! Congrats to the entire team on making HUGS available on Inferentia & Trainium!
Awesome product!
Awesome work Philipp Schmid! Do you plan to include multimodal models in the near future?