On April 18th, just last week, Meta unveiled Llama 3, lauded as "the most capable openly available LLM to date". This remarkable achievement by the Meta team comes hot on the heels of Llama 2's release last July. Notably, the new models surpass their predecessors and competing offerings from other providers in terms of performance. Meta's commitment to open sourcing these powerful models while prioritizing model safety and responsible usage is commendable. (https://lnkd.in/ghv3RJRz)

Yesterday, Microsoft introduced Phi-3 Mini, a member of the Phi-3 family of models, promoted as "the most capable and cost-effective small language models available". According to Microsoft, the key to achieving high performance in such a small package lies in the quality of the training data. The model has been open sourced and is now available on Ollama and Hugging Face. (https://lnkd.in/gKgSmrBn)

These compact yet powerful models serve as invaluable tools for creators eager to bring their visions to life or simply experiment with cutting-edge technology, particularly for those with limited access to resources. Shortly after the release of Llama 3, reports began surfacing of enthusiasts running the 8B model on a Raspberry Pi 5 equipped with just 8GB of RAM, a modest $80 single-board computer. (https://lnkd.in/gbeQPhQ2) While the performance understandably reflects the limitations of such a platform, the fact that the model runs and generates an output is nothing short of magical. Similarly, the Phi-3 Mini model can run on ubiquitous devices such as smartphones.

Reflecting on my own journey, I recall the pivotal role that early access to a low-cost computer played in shaping my passion for coding. The gift of an 8-bit Commodore 64 home computer ignited my curiosity at a young age. Later, with the guidance of an outstanding Computer Science teaching staff at my high school IISS Marconi-Hack Bari, I embarked on ambitious projects like a networked multiplayer version of Battleship written in Turbo Pascal and 8086 Assembly, complete with custom sprites and lo-res graphics. These formative experiences paved the way for my career, eventually leading me to various engineering and leadership roles. It all began with a simple, low-cost, modestly capable computer used for experimentation: a gateway to discovering my life's passion.

I encourage educators and parents alike to consider setting up similar low-cost experimentation environments. By introducing younger generations to open-source AI technologies, we can nurture their creativity and help them uncover their true passions. It's not just a weekend project; it's an investment in their future and the future of innovation and discovery.

#education #youth #AI #LLM #future #innovation
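For anyone who wants to set up that kind of low-cost experiment, here is a minimal sketch. It assumes Ollama is installed locally and a small model has already been pulled (for example with `ollama pull phi3` or `ollama pull llama3`); the endpoint is Ollama's default local REST API, and the prompt and timeout are just placeholders.

# Ask a locally served small model a question via Ollama's default local API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",   # Ollama's default local endpoint
    json={
        "model": "phi3",                      # swap for "llama3" if that is the model you pulled
        "prompt": "Explain how a rainbow forms, in two sentences.",
        "stream": False,                      # return one JSON object instead of a token stream
    },
    timeout=300,                              # small boards like a Raspberry Pi can be slow
)
print(resp.json()["response"])

A dozen lines like these are enough for a first classroom or weekend session: change the prompt, swap the model, and watch how the answers (and the speed) change on modest hardware.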
Giuseppe M.’s Post
More Relevant Posts
-
Imagine someone glancing at you and instantly knowing your name, address, and phone number. Sounds like science fiction? It's happening right now.

Two Harvard students, AnhPhu Nguyen and Caine Ardayfio, just turned Meta's smart glasses into something both amazing and unsettling. They created I-XRAY, a system that, with a simple look, can identify you and pull up your personal info in real time. They combined Meta's Ray-Ban smart glasses with face search engines and large language models. In their demo, they didn't just recognize classmates; they accessed their personal details on the spot. What started as a side project quickly became a spotlight on privacy and consent in our tech-driven world.

Now, you might be thinking, "Doesn't Meta have safeguards?" Sure, there's a tiny light on the glasses to indicate recording. But let's be honest: in a crowded street or under bright sunlight, who's really noticing that? And while Meta's policy advises against harmful use, these students showed how easily the tech can be repurposed.

This isn't about one experiment. It's a glimpse into a future where the line between public and private blurs even more. Nguyen and Ardayfio didn't do this to scare us. They wanted to highlight what's possible, and it's a lot.

So, where do we go from here? It's time for serious conversations about how we regulate and use these technologies. Because without robust privacy protections, we might be stepping into an era where nothing is truly private. Are we ready for that? I'd love to hear your thoughts.

#AiNewsOfTheWeek #KnowledgeNest #HamptonRoads
-
Monday musings 🌟

As a marketing leader, I see innovation as a driving force, but this move from Meta seems like a step in the wrong direction... As AI continues to leap and bound, brands need to be super careful that it's being used ethically; privacy invasions lead to all sorts of problems.

Instead, how about Meta:
- Give users a choice: Let people opt out of being identified 🕵️♂️
- Focus on safety: Use the glasses for good, like alerting people to dangers or helping lost kids 🤔
- Be upfront: Tell people exactly what the glasses can do and how to use them responsibly 📖

I think we can all agree that innovation should be about improving lives, not making people feel unsafe...!! Interested to hear your thoughts..

https://lnkd.in/e5AxKqsR

#Meta #SmartGlasses #Privacy #Innovation #Marketing #Technology #SocialMedia
-
🚀 If you're planning to train your own LLM and want to learn from Meta's experience, here is my summary:

Meta's journey in training LLaMA 3.1 offers invaluable insights into the challenges and solutions involved in large-scale model training. During a 54-day pre-training snapshot, their team encountered several key challenges.

The Challenges Meta Faced

1️⃣ Frequent Interruptions: A total of 466 job interruptions, with:
✅ 47 planned (e.g., firmware upgrades, dataset updates).
✅ 419 unexpected interruptions, primarily hardware-related.

2️⃣ Hardware Failures Dominate:
✅ 78% of unexpected interruptions were due to confirmed or suspected hardware issues, including GPU failures, silent data corruption, and host maintenance events.
✅ 58.7% of all unexpected interruptions were specifically tied to GPU issues, making them the most significant bottleneck.

3️⃣ The Scale of the Problem: Large-scale LLM training, with thousands of GPUs running at full capacity, makes hardware failures inevitable. These failures often require extensive diagnostics and recovery.

Key Takeaways for Your Own LLM Training

✅ Invest in Proactive Monitoring: Meta's experience highlights the need for robust monitoring systems to detect hardware faults (like GPU overheating or memory corruption) early.
✅ Expect and Plan for Failures: Interruptions are unavoidable when training at scale, but strategies like redundancy, fault isolation, and automated recovery can reduce downtime (see the sketch after this post).
✅ Focus on Infrastructure Resilience: Large-scale training isn't just about designing great models; it's about building an infrastructure that can handle the stress of continuous operation for weeks or months.

Meta's experience with LLaMA highlights the complexity of large-scale AI projects. For anyone looking to build or fine-tune their own LLMs, these lessons are essential to navigating the challenges of infrastructure and scalability.

#AI #MachineLearning #LLMTraining #GPUs #InfrastructureResilience #MetaAI #LLaMA #meta
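To make the "expect and plan for failures" takeaway concrete, here is a minimal checkpoint-and-resume sketch. It is not Meta's recovery tooling (that is not public in this summary); it assumes PyTorch, and a toy linear model stands in for a real network. The checkpoint filename, step count, and checkpoint interval are placeholders.

# A sketch of "plan for failures": checkpoint often, resume after an interruption.
import os
import torch

CKPT = "toy_checkpoint.pt"   # placeholder path
TOTAL_STEPS = 2_000

# A toy stand-in for a real LLM; the checkpoint/resume pattern is what matters here.
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
start_step = 0

# Resume from the latest checkpoint if a previous run was interrupted.
if os.path.exists(CKPT):
    state = torch.load(CKPT, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    start_step = state["step"] + 1

for step in range(start_step, TOTAL_STEPS):
    x, y = torch.randn(8, 16), torch.randn(8, 1)     # synthetic batch; real runs also restore data-loader and RNG state
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    if step % 500 == 0:  # checkpoint often enough that a failure loses little work
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "step": step}, CKPT)

At Meta's scale this pattern is wrapped in automated failure detection, node replacement, and job restart, but the core idea is the same: assume the run will be interrupted and make resuming cheap.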
-
🚨 OMG fam! Meta just dropped Llama 3.1 and it's a whole MOOD! 🦙✨ Buckle up, cause this is HOT:

• It's giving "most capable LLM" vibes and it's FREE?! We stan an (almost) open-source queen 👑
• This bad boy took MONTHS to train on 16K Nvidia H100 GPUs. That's like, a bazillion TikTok dances worth of processing power 💃
• We're talking 405B parameters and a 128K token context length. Translation: it's THICC 🍑
• Rumor has it, it's dunking on OpenAI's GPT-4 in the benchmarks. But y'know, benchmarks can be sus. Real tea comes from the users 🍵👀

The open-source situation? It's complicated™️:

• Use it for your side hustle, but if you're TikTok famous (700M+ active users), you gotta slide into Meta's DMs for a license 📝
• Training data? That's on the DL. But the code? It's out there, living its best life. We're talking 300 lines of Python goodness 🐍✨
• Model weights are open too! DIY AI, anyone? 🛠️
• Bye-bye GPT-4 API fees, hello emptying your wallet for cloud GPUs instead! 💸

Wanna know more? Hit up Meta's blog: https://lnkd.in/gFbRFGfm

Drop a 🦙 if you're ready to llama-nate your coding life! #Llama3 #AIRevolution #MetaAI #OpenSourceVibes #GenZTech #LLMsOnFleek
-
Well. I found a chart in the Meta Llama 3.1 paper that changed my napkin math estimates. So, this is an updated post. I deleted the incorrect one.

I like trying to make sense of very dense whitepapers and put them into layman's terms. It helps me make sense of the craziness.

As you may have heard, Meta just announced their Llama 3.1 foundational AI model. This is the 405B one I'm referencing. It's open source and anyone can use it, which is pretty cool. Want to run a version at home? You can if you have the horsepower. Just download it with Ollama and have fun.

The training is the heavy lift, which took the following. Using table 4 on page 10: the Meta team was training with BF16, not FP8. The best performance they found was 43% of peak, or 430 TFLOPS per GPU. It went down to 380 TFLOPS for certain portions, but let's just go with 430 for simplicity.

Total training took 3.8 x 10^25 FLOPs, or Floating Point Operations. That is 38 trillion trillion FLOPs. Billion quadrillion? Million quintillion? Sigh. It's a big number. In fact it's 38,000,000,000,000,000,000,000,000 FLOPs.

If you used a 3k superpod, or 3072 GPUs, it would take 332 days to complete. 3k GPUs would cost you roughly $100-$150 million in just hardware. It would consume enough power to run roughly 2,000 average US homes. A small town really. 3.5 MW. Good thing they had multiple 16k+ GPU pods to play with.

All of this for an open source, free to use, AI model. I'm still trying to wrap my head around what it truly takes to train these models. I just find it absolutely amazing, and I build the infrastructure for a living. It's just fun to try and look at these numbers with a different perspective.
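The napkin math above can be reproduced in a few lines. The numbers below come straight from the post (total FLOPs, per-GPU throughput, GPU count), not re-derived from the paper, and the "superpod" size is the same hypothetical 3072 GPUs.

# Reproducing the post's napkin math.
total_flops = 3.8e25          # total training compute, ~38 trillion trillion FLOPs
tflops_per_gpu = 430e12       # ~43% of peak BF16 throughput on an H100
num_gpus = 3072               # the hypothetical 3k-GPU "superpod"

seconds = total_flops / (num_gpus * tflops_per_gpu)
print(f"{seconds / 86400:.0f} days")   # ~333 days, in line with the ~332-day estimate above

The small difference from 332 days is just rounding; sustained throughput dipping toward 380 TFLOPS would stretch the estimate further, which is exactly why Meta ran on multiple 16k+ GPU pods instead.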
-
For those not aware, Meta just dropped their latest foundational model, Llama 3, in both 8B and 70B formats. It is not a Mixture of Experts, which blows me away considering it is a 70B model and, based on currently available information, it is outperforming Mixtral 8x22B. This gives us an open-source model with phenomenal pre-finetuned performance. With one glaring shortcoming... 8k context. 8k in today's world feels... obsolete. But I can see some decent uses currently. And Meta says they will be releasing longer context versions as well as a whopping 400B multi-modal model in the near future.

400B... Not even an M2 Ultra Mac Studio with 192GB of RAM will be able to run that in 4-bit quantization (see the quick math below). Sounds like it's time to get out that 1-bit quantization and get cracking so that we can run it on commodity hardware!

Anywho, new foundational models only help drive the community forward, and I expect there will be some 32k context finetunes coming out in the next few weeks, at which point this model may be a decent coding model.
Introducing Meta Llama 3: The most capable openly available LLM to date
ai.meta.com
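The quick math behind the "won't fit on 192GB" claim, as a sketch. It counts weights only (no KV cache, activations, or runtime overhead, which make the real footprint larger) and uses 405B parameters, the size the eventual large model shipped at; the post rounds it to "400b".

# Approximate memory footprint of the weights at different quantization levels.
params = 405e9  # parameter count (weights only)

for name, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4), ("1-bit", 1)]:
    gb = params * bits / 8 / 1e9
    print(f"{name:>5}: ~{gb:.0f} GB for the weights alone")
# 4-bit -> ~203 GB, already over a 192 GB M2 Ultra before any cache or overhead
# 1-bit -> ~51 GB, which is why aggressive quantization is so tempting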
-
TODAY'S TECH NEWS 🗞️

Here's some recent tech news:

👩💻 Google Gemini AI: Google's Gemini AI is under scrutiny for generating "diverse" images when asked to create specific images. Google says it will fix it.
👩💻 WhatsApp: WhatsApp has announced four new text formatting options for all Android, iOS, Web, and Mac desktop app users.
👩💻 Reliance's 'Hanooman': The Mukesh Ambani-backed ChatGPT rival is scheduled to launch in March.
👩💻 Rivian: Rivian is laying off 10 percent of its workforce as EV woes deepen.
👩💻 Meta: Meta is testing out Facebook cross-posts to Threads. Meta is only rolling out the test in its Facebook iOS app, and not in the EU.
👩💻 Ansys: Ansys beat Wall Street estimates for fourth-quarter revenue and profit on Wednesday, driven by growing demand for its engineering software solutions.

#technews #techupdates #googlegeminiai #whatsapp #meta #cs #computersciencestudents #engineers #softwareengineering #learner #students
-
👓 🚓 Does Meta offer an SDK for the Ray-Ban Smart Glasses?

Last Sunday, as Sundays usually go, I found myself in a bit of a boredom spiral 😅. And, naturally, in my boredom, I came up with yet another great idea for an app (you know, the type you start but never quite finish). My idea? Build an app that uses smart glasses like Meta's Ray-Bans to capture anything I highlight on my phone (texts, screenshots, whatever), process it with AI, and save it for later. Sure, there are other ways to do this without fancy glasses, but it was a Sunday project idea 😌.

So, my first question was, "Does Meta offer an SDK for these Ray-Ban Smart Glasses?" A few searches later, I found out: nope. And, honestly, that makes sense. For privacy reasons, Meta keeps it pretty locked down. No SDK means limited development options, which might be a good thing…until it's not. 🤔

Then I came across a wild story. Two Harvard students got creative with these glasses by analyzing Instagram Story streams (yes, a feature the glasses do support). They were able to extract some very personal information from students as they walked around campus! Crazy, right? 😬 You can check out the story here: https://lnkd.in/ecpS-Q2m

This combo of AI and IoT tech can be seriously powerful…and seriously scary. It's becoming harder to protect privacy when data capture is this seamless. 🔍

P.S.: Wouldn't it make sense if these glasses had protocols like drones do? Drones can be blocked from working in certain areas or under certain conditions. Why not apply similar standards here?

#AIandPrivacy
How 2 Students Used The Meta Ray-Bans To Access Personal Information
social-www.forbes.com
-
WHAT IF Series - (3) - What if I actually had good relations with Meta and I was allowed to normally collaborate on projects with them?

I was so excited about SAM, not only for its wide applicability, but because they were the first to identify the ambiguity in what defines an object. I wanted to study SAM, SEEM, HQ-SAM and others through the lens of Gestalt principles, to understand what defines an object in these foundation models. For example, in the image below, why did it go for the pavement and not the two cats? Is it centre bias? If you select the two cats on the far left, it will be able to segment them.

Also, for all the methods that extend SAM to video: what if you had sliders to control what defines an object according to the Gestalt principles? One of them is the "common fate" principle, that things which move together form an object. A moving human is then not necessarily one object in the video; hands/legs vs. torso could be grouped differently. This question matters because you could control the ambiguity behaviour that's appearing in these models :D and it's beyond granularity.

I actually tried so many times to get an internship position, wrote to researchers at Meta to work with them, wrote to professors not even at Meta but collaborating with them. I was never identified as good enough for Meta :). Honestly, they are not the only ones; it was always Google, Meta, Amazon that I was never good enough for. I am not really afraid to declare these things, but I am quite passionate about Meta research though.

Talking about failures, did you know that our beloved Computer Vision community decided to host ECCV 2022 in Tel Aviv, Israel? Let me spell out the acronym ECCV: European Conference on Computer Vision :D It has nothing to do with the area there. But yes, they are super powerful in our community to be able to do that. I of course boycotted the conference (as author and reviewer). You know, it resulted in my paper getting rejected three times now, even while new works evaluating on my proposed benchmark were able to get published in ICCV23. Referring to this paper: https://lnkd.in/eBRN6vyN. It also resulted in my other paper getting rejected three times too: https://lnkd.in/dVRtWdMi.

But it unfortunately also resulted in ensuring that I never get recognized as an outstanding reviewer even when I am. In ICCV23 the PCs literally declared me an outstanding reviewer but told me it was a mistake, that they didn't include me and couldn't fix it (emails here: https://lnkd.in/d4nvHjR8). I am also obviously not an area chair, in case you are wondering from his email :D. It was only because of the Ombuds that they identified me as an outstanding reviewer, but I am now declared a Witch for the rest of my life :D. Oh, it also resulted in one of my journal papers at TPAMI sitting for more than a year and a half without getting even a first review. That's how they slowly punish you.

#CVCommunityWitch !
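On the ambiguity point raised above: this is essentially what SAM's multimask output exposes, where a single point prompt returns several candidate masks at different granularities, each with its own score. A minimal sketch, assuming Meta's segment_anything package, a downloaded ViT-H checkpoint, and a placeholder image path and click location:

# One ambiguous point prompt -> several candidate masks, each with a confidence score.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")  # downloaded checkpoint
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("cats_on_pavement.jpg"), cv2.COLOR_BGR2RGB)  # placeholder image
predictor.set_image(image)

masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 350]]),  # placeholder click near the cats
    point_labels=np.array([1]),           # 1 = foreground point
    multimask_output=True,                # ask for multiple interpretations of the click
)
for i, (m, s) in enumerate(zip(masks, scores)):
    print(f"mask {i}: score={s:.2f}, area={int(m.sum())} px")  # which "object" did it pick?

The Gestalt-slider idea would go a step further than this: instead of just ranking the candidate masks by score, you would bias which grouping (pavement vs. cats, torso vs. limbs) wins, which is not something the released API lets you control directly.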
-
Why did Meta Open Source a Million-Dollar AI Model?

When Meta announced last month that they were open sourcing their flagship Llama 3 large language model (LLM), it was a game-changer! Investing millions and then giving it away for free? What's behind this bold move? Here's the strategy:

- Control & Security: Just like Linux gave developers unprecedented control and security over their operating systems compared to the proprietary Unix, Meta's Llama 3.1 allows developers to tailor and secure their AI projects to their needs.
- Cost Efficiency: Remember how Linux revolutionized the cost structure of operating systems, offering a free alternative to costly Unix systems? Llama 3.1 does the same for AI, cutting inference costs to about 50% of closed models like GPT-4 and making advanced AI more affordable.
- Avoiding Vendor Lock-In: Linux freed users from the constraints of Unix's closed ecosystems, just as Meta's open-source model helps organizations sidestep vendor lock-in issues, such as price hikes and support discontinuations.

By championing open-source AI, Meta is fostering a more innovative and competitive tech landscape, much like Linux did for operating systems. This approach not only drives down costs but also enhances collaboration and accessibility. The future of open-source LLMs is as exciting as the rise of Linux!

#AI #OpenSource #Innovation #TechStrategy #MachineLearning #Meta #Llama3 #Linux #Unix