AI Devices Take Center Stage

Lots of stumbles, but potential is there

Innovative Voice Assistant Outpaces Competitors with Speed and Practicality

The Rabbit R1, a cutting-edge AI device, offers superior knowledge access through voice commands, making it ideal for hands-free interactions. The device has a mobile SIM card and is powered by Perplexity, allowing it to access the internet anywhere. It uses natural language processing to provide fast, conversational responses to queries. Its practical applications include asking questions while driving, learning about holidays and Pokemon, and recording and summarizing meetings using its AI capabilities, while its conversational memory and rapid responses set it apart from other AI devices. Priced at $199 with no subscription fee, the device has been highly popular: the company sold out its initial 10,000 units and an additional 100,000 devices within a week of launch.

Pinecone’s serverless vector database helps you deliver remarkable GenAI applications faster at up to 50x lower cost. Learn more about Pinecone serverless here.

  • AI Startup Hugging Face Valued at $4.5 Billion After Raising Funding from Google, Nvidia - Hugging Face Inc., an AI startup, has reached a valuation of $4.5 billion following a successful funding round that raised $235 million with contributions from major tech companies, including Google, Amazon, Nvidia, and Salesforce. The company, founded in 2016, is known for its platform allowing users to share and access over 500,000 AI models and 250,000 datasets. CEO Clement Delangue plans to use the funding to expand the team and further develop proprietary AI tools, like IDEFICS, which generates text and images from prompts. Hugging Face aims to bolster its growth by capitalizing on the competitive hiring landscape in AI and has 10,000 paying customers for its advanced features.

  • AI Search Startup Perplexity Valued at $1 Billion in Funding Round - Perplexity AI, a startup that developed an AI-powered search engine, has recently raised $63 million in funding, pushing its valuation to over $1 billion, with notable investors like Jeff Bezos and Nvidia Corp. participating. The company was founded less than two years ago, is recognized for its AI chatbot that accurately summarizes search results, and has processed nearly 75 million user queries in the US this year alone. It's expanding its business model by launching a new enterprise version of its chatbot and targeting larger markets through partnerships with major carriers like SoftBank and Deutsche Telekom. Despite being smaller than giants like OpenAI and Google, Perplexity focuses on high accuracy and frequent updates to stay competitive in the rapidly evolving AI landscape.

  • Six-Month-Old AI Coding Startup Valued at $2 Billion by Founders Fund - Cognition, a startup launched six months ago, has quickly garnered $175 million in funding led by Founders Fund, valuing it at $2 billion, despite a shaky product debut of its AI coding assistant, Devin. The product, which aims to automate coding tasks, was criticized for inconsistencies during its launch, though it has received significant attention and investment. This funding is part of a broader trend of high valuations for AI startups with minimal revenue in a competitive market that includes tech giants and other startups. Cognition, founded by Scott Wu, Walden Yan, and Steven Hao, is based in New York and San Francisco and seeks to innovate in software development using AI.

  • Nvidia acquires AI workload management startup Run:ai for $700M, sources say - Nvidia is purchasing Run:ai, a company specializing in AI hardware infrastructure management, for around $700 million, according to TechCrunch sources, though initial reports suggested figures up to $1 billion. Run:ai, originating from Tel Aviv, has been working with Nvidia since 2020 to facilitate the efficient use of AI infrastructure. Post-acquisition, Run:ai's products will be a part of Nvidia's DGX Cloud AI platform, enhancing support for generative AI across data centers. Run:ai, co-founded by Geller, Dar, and Feder, aimed to optimize AI processing by distributing model fragments across various hardware setups. Preceding this deal, Run:ai had attracted significant venture capital funding and built a substantial Fortune 500 customer base. This acquisition is one of Nvidia's largest since buying Mellanox and underscores the increasing demand for advanced AI deployment management against rising compute costs and infrastructure complexities.

  • Amazon wants to host companies' custom generative AI models - AWS has announced a new feature in its enterprise-focused generative AI suite, Bedrock, called Custom Model Import, which is currently in preview. This feature allows enterprises to import their own generative AI models into Bedrock, providing fully managed APIs and integrating with the existing infrastructure and tools. The service aims to alleviate infrastructure challenges, a common barrier to AI deployment for companies. AWS distinguishes Bedrock's customization options, like Guardrails and Model Evaluation, from similar services offered by competitors, emphasizing that it allows for extensive testing and bias mitigation across multiple models.

  • Generative A.I. Arrives in the Gene Editing World of CRISPR - Profluent, a startup from Berkeley, has developed a new A.I. technology capable of generating blueprints for intricate biological systems aimed at editing DNA, potentially revolutionizing medical treatments by enhancing precision and speed. Leveraging methodologies akin to those used by the pioneering ChatGPT, Profluent's system analyzes vast biological data to create gene editors based on the CRISPR mechanisms, which have already shown promise in treating genetic disorders. This advance is expected to be detailed in an upcoming presentation at the American Society of Gene and Cell Therapy's annual meeting.

  • Tesla could start selling Optimus robots by the end of next year, Musk says - Tesla Inc. is planning to begin selling its humanoid robot, Optimus, by the end of next year, according to CEO Elon Musk. The robot is expected to perform real tasks inside Tesla factories by the end of the year and be available for customers outside the company by the end of 2025. Musk believes that Optimus will eventually represent most of Tesla’s revenue and overall value, costing less than half of a car, around $25,000. The robot is expected to take over repetitive and dangerous tasks, primarily in manufacturing and service jobs. Musk sees the robotics program as a way to reassure investors and reposition Tesla as an AI or robotics company rather than just a carmaker.

  • Meta and Open - Meta, under Mark Zuckerberg's direction, is leveraging an "open" approach—contrasting Apple's closed system—across various frontiers of technology. Zuckerberg found Apple's Vision Pro underwhelming compared to Meta's Quest, emphasizing his commitment to an open model reminiscent of the PC era. In line with this philosophy, Meta has declared an open ecosystem for its Horizon OS, inviting hardware collaborations and promoting a unified storefront for applications. The differentiation strategy aligns with Meta's broader aim to dominate user time and attention—a scarce digital commodity—through an array of open-source initiatives, like Llama AI models and hardware endeavors, nurturing a metaverse conducive to Meta's horizontal services without gatekeeping constraints.

  • Olympic organizers unveil strategy for using artificial intelligence in sports - The International Olympic Committee (IOC) has laid out its plan to integrate artificial intelligence (AI) into various aspects of sports in preparation for the Paris Olympics, which are set to begin in under 100 days. The IOC President, Thomas Bach, emphasized the importance of utilizing AI responsibly to enhance the identification of promising athletes, customize training, and improve fairness in judging. AI will also play a role in shielding athletes from online harassment and augmenting the broadcasting experience. Despite controversies surrounding AI's use in security for the Paris games, the IOC trusts French authorities on security matters. Additionally, the IOC is collaborating with Intel to discover new athletic talent using AI, while maintaining a cautious stance on the potential for AI to limit athletes' career choices.

  • Moderna and OpenAI Collaborate To Advance mRNA Medicine - Moderna, a leader in mRNA medicine, and OpenAI are collaborating to advance mRNA medicine using OpenAI's generative AI tools. Moderna, a digital-first company, has a strong data foundation and a culture of learning, making it well-positioned to integrate AI into its operations and capitalize on next-generation AI innovation. The collaboration began in early 2023, with Moderna launching its own instance of ChatGPT, called mChat, which is now embedded across various business functions, including legal, research, manufacturing, and commercial. These AI tools serve as assistants to Moderna's employees, augmenting their roles and improving workflow efficiency. Moderna's Chief Information Officer, Brad Miller, emphasized the importance of AI in transforming the company's workforce and improving productivity, stating that if the company keeps adding people without leveraging technology, the value of each employee will be diminished. The collaboration between Moderna and OpenAI is expected to bring a new generation of medicines to patients in need.

  • Gen Z workers pick genAI over managers for career advice - The article discusses a new study by Intoo and Workplace Intelligence that reveals Gen Z workers prefer getting career advice from AI tools like ChatGPT over their managers. The study found that 47% of Gen Z employees believe they receive better advice from AI than their managers, while 55% said they would use social media for career advice. The lack of support from employers could lead to increased attrition and diminished engagement from Gen Z employees. To retain Gen Z talent, companies should emphasize mentorship and learning in the workplace. The article also mentions that Gen Z values salary less than every other generation and is motivated by job security, good global citizenship, and diversity.


Vultr is empowering the next generation of generative AI startups with access to the latest NVIDIA GPUs.

Try it yourself when you visit and use promo code "BERMAN300" for $300 off your first 30 days.

Awesome Research Papers

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions - Current Large Language Models (LLMs) are vulnerable to manipulations where attackers can use harmful prompts to alter the model's behavior. The root issue is that these models do not distinguish between prompts from trusted developers and untrusted sources. The proposed solution is to establish a hierarchy of instructions, allowing the model to prioritize and discern between them. A new data generation technique is introduced to train LLMs, like GPT-3.5, to adhere to this hierarchy, effectively ignoring commands from unauthorized entities. This method notably enhances the model's resistance against various attacks, even those unencountered during its training phase, without significantly compromising its existing functions.
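The hierarchy can be pictured as a simple privilege ordering over message sources. This is a minimal illustrative sketch, not the paper's method — the paper trains the model itself to internalize the ordering, and the role names and numeric levels below are our own choices:

```python
# Privilege levels for instruction sources, highest first. The paper's
# contribution is training data that teaches the model this ordering;
# here we only make the ordering explicit. Names/levels are illustrative.
PRIVILEGE = {"system": 3, "user": 2, "model_output": 1, "tool_output": 0}

def should_follow(instruction_source: str, protected_by: str) -> bool:
    """An instruction is honored only if its source is at least as
    privileged as the level whose behavior it tries to override."""
    return PRIVILEGE[instruction_source] >= PRIVILEGE[protected_by]

# A prompt-injection attempt inside retrieved web content ("tool_output")
# telling the model to ignore its system prompt is treated as data:
print(should_follow("tool_output", "system"))  # False: injection ignored
print(should_follow("user", "model_output"))   # True: user outranks outputs
```

The point of training (rather than hard-coding) this rule is that the model can still *read* low-privilege text as data while declining to *obey* instructions embedded in it.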

SnapKV: LLM Knows What You are Looking for Before Generation - SnapKV presents a novel approach to optimize the efficiency of Large Language Models (LLMs) through an advanced Key-Value (KV) cache compression technique. It targets the issue of ballooning KV cache sizes as input lengths increase, which strains both memory and time resources. By recognizing stable attention patterns in LLMs, SnapKV selectively compresses KV caches without the need for additional fine-tuning, delivering a 3.6x generation speed increase and an 8.2x improvement in memory efficiency with slight accuracy trade-offs. Notably, SnapKV can handle inputs of up to 380K tokens on standard hardware, maintaining robustness across multiple datasets, which underscores its potential for real-world applications.
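The core selection step — using attention from the tail of the prompt to vote for which earlier KV positions to keep — can be sketched in a few lines. This is a simplified NumPy illustration, not the paper's implementation; the window size, pooling kernel, and budget below are placeholder values:

```python
import numpy as np

def snapkv_select(attn, window=8, keep=64):
    """Pick which prompt positions to keep in the KV cache.

    attn: [num_heads, q_len, kv_len] attention weights from prefill.
    The last `window` query positions act as an "observation window":
    positions they attend to strongly are assumed to matter during
    generation, so only those (plus the window itself) are retained.
    """
    # Aggregate attention from the observation window onto earlier positions.
    obs = attn[:, -window:, :-window]            # [heads, window, kv_len - window]
    votes = obs.sum(axis=1)                      # [heads, kv_len - window]
    # Light pooling so clusters of important tokens are kept together.
    kernel = np.ones(5) / 5
    pooled = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), 1, votes)
    # Keep the top-`keep` positions per head, plus the observation window.
    top = np.argsort(pooled, axis=1)[:, -keep:]
    kv_len = attn.shape[2]
    window_idx = np.arange(kv_len - window, kv_len)
    return [np.sort(np.concatenate([t, window_idx])) for t in top]
```

Because the kept set is chosen once at prefill time from already-computed attention, no fine-tuning is needed — which is what makes the approach drop-in.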

Dynamic Typography: Bringing Words to Life - The paper discusses "Dynamic Typography," an automated text animation technique designed to enhance static text by deforming letters and adding motion, thereby evoking emotions and conveying meanings more effectively. This advanced method utilizes vector graphics and an optimization-based framework that incorporates neural displacement fields. The framework shapes text per user prompts, ensuring that animations remain semantically coherent and legible. The technique's generalizability and superiority are affirmed through both qualitative and quantitative evaluations against baseline methods. The framework guarantees text readability throughout the animation process, and the code for this method is publicly available.

DoRA: Weight-Decomposed Low-Rank Adaptation - The paper introduces DoRA, a method for efficient fine-tuning of large models that builds on low-rank adaptation (LoRA). The authors decompose the pre-trained weight into magnitude and direction components and fine-tune the direction component with LoRA. This approach retains LoRA's efficiency while achieving a learning capacity closer to full fine-tuning. The authors validate the method across a variety of tasks, including NLP and vision-language tasks, and demonstrate its effectiveness in terms of parameter efficiency and fine-tuning cost. The paper also notes that the adapted weight can be merged with the pre-trained weight before inference, introducing no additional latency.
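In code, the decomposition amounts to a trainable magnitude times a normalized direction, with LoRA factors nudging the direction. The NumPy sketch below is our reading of the summary, not the paper's implementation; shapes and names are chosen for illustration:

```python
import numpy as np

def dora_forward(x, W0, m, A, B):
    """DoRA-style adapted linear layer (illustrative sketch).

    W0: frozen pre-trained weight, shape [out, in]
    m:  trainable per-column magnitude, shape [in]
        (initialized to the column norms of W0, so training starts at W0)
    A, B: LoRA factors; B @ A (shape [out, in]) updates the direction.
    """
    V = W0 + B @ A                                       # low-rank direction update
    col_norm = np.linalg.norm(V, axis=0, keepdims=True)  # [1, in]
    W = m[None, :] * (V / col_norm)                      # magnitude * unit direction
    return x @ W.T
```

Because `W` depends only on frozen and trained tensors, it can be computed once after training and merged into a single weight matrix — which is why inference incurs no extra latency.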

BLINK: Multimodal Large Language Models Can See but Not Perceive - Blink is a newly introduced benchmark for evaluating the core visual perception capabilities of multimodal large language models. It transforms 14 traditional computer vision tasks into multiple-choice questions using single or multiple images coupled with visual prompts. The tasks, covering areas like depth estimation and forensics detection, are designed to be simple for humans but challenging for models. In testing, humans average 95.70% accuracy, while even advanced models like GPT-4V and Gemini achieve only 51.26% and 45.72%, respectively. The benchmark underscores the gap between human visual perception and current models' abilities, offering insight into potential improvements.

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework - OpenELM, an open language model with a novel layer-wise scaling strategy, has been introduced to enhance the reproducibility, transparency, and accuracy in large language model research. Different from prior models that offer limited access to their framework, OpenELM provides a comprehensive package including the full training and evaluation framework, training logs, checkpoints, pre-training configurations, and conversion code for Apple devices. With this release, built on publicly available datasets, OpenELM achieves a 2.36% accuracy improvement over OLMo while being more parameter-efficient.
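Layer-wise scaling replaces one uniform transformer width with widths that vary across depth. The sketch below illustrates the general idea only — the linear schedule, parameter names, and multiplier range are our assumptions, not OpenELM's exact recipe:

```python
def layerwise_ffn_widths(n_layers, d_model, mult_min=0.5, mult_max=4.0):
    """Linearly interpolate the FFN width multiplier across layers, so
    early layers are narrower and later layers wider, instead of using
    one fixed multiplier everywhere. (Illustrative parameters only.)"""
    widths = []
    for i in range(n_layers):
        t = i / max(n_layers - 1, 1)            # 0.0 at first layer, 1.0 at last
        mult = mult_min + t * (mult_max - mult_min)
        widths.append(int(round(mult * d_model)))
    return widths

print(layerwise_ffn_widths(4, 512))  # [256, 853, 1451, 2048]
```

Allocating parameters non-uniformly like this is how a model can match a larger uniform-width baseline with a smaller total parameter budget.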

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone - This publication introduces "phi-3-mini," a compact 3.8 billion parameter language model trained on a vast 3.3 trillion token dataset, achieving high-performance levels, competitive with larger models such as Mixtral 8x7B and GPT-3.5. Notable for its deployment capabilities on a smartphone, this model's key innovation is its training data—a scaled-up, filtered web and synthetic dataset based on its predecessor, phi-2. Furthermore, the model emphasizes robustness, safety, and chat functionality alignment. Additional insights are provided on two larger versions, "phi-3-small" and "phi-3-medium" (7B and 14B parameters), which exhibit even better performance on standardized benchmarks.

China’s SenseTime Unveils New AI Model SenseNova 5.0 - SenseNova 5.0, a generative AI model, was released during an event in Shanghai. The new model boasts significant enhancements in linguistic and creative abilities, aiming to rival OpenAI’s ChatGPT, and marks a push among Chinese firms to compete in generative AI technology development, supported by Beijing. SenseTime's development of SenseNova highlights China's ongoing interest in AI as a driver for technological revolution despite facing challenges in computing infrastructure and research innovation.

Snowflake Arctic: The Best LLM for Enterprise AI - Snowflake has introduced Arctic, a highly efficient and open enterprise-focused large language model that excels at tasks like SQL generation and coding. Arctic represents a leap in Mixture-of-Experts model scale and demonstrates top-tier performance compared to other open-source models. The high training efficiency of Arctic allows for more affordable custom model training for Snowflake customers and the AI community. Arctic is positioned as the best open-source model for off-the-shelf enterprise use cases.

DREAM: Distributed RAG Experimentation Framework - DREAM (Distributed RAG Experimentation Framework) is a blueprint showcasing the orchestration of distributed retrieval-augmented generation (RAG) experiments using open-source technologies on a Kubernetes platform. It employs tools like Ray for compute distribution, LlamaIndex for processing unstructured data, Ragas for synthetic data generation and LLM evaluations, MLflow for experiment tracking, and MinIO for object storage. The framework aims to facilitate the comparison of different RAG configurations to determine the best fit for a specific use case. It walks through preparing unstructured data, generating a "golden dataset," experimenting and evaluating in a distributed manner, and tracking these experiments with visual tools. Future considerations include enhancing the distributed nature of the framework and potentially developing a no-code/low-code workflow using Argo. References to more detailed descriptions and code are provided for readers interested in implementation.

Awesome New Launches

Adobe Introduces Firefly Image 3 - Adobe has announced the release of Firefly Image 3 Foundation Model, a new generative AI model for creative exploration and ideation. This model aims to bring significant advancements in quality and control, including higher-quality image generations, better understanding of prompts, new levels of detail and variety, and improvements in fast creative expression and ideation. Firefly Image 3 delivers photorealistic quality with better lighting, positioning, attention to detail, advancements in text display, and more. The new model is available in the Firefly web application and will be integrated into Adobe Photoshop, Adobe Express, Adobe Illustrator, Adobe Substance 3D, and Adobe InDesign. Firefly has been used to generate over 7 billion images worldwide since its initial debut in March 2023 and has transformed image editing, template creation, vector design, and 3D texturing and staging for the better.

Eric Schmidt-backed Augment, a GitHub Copilot rival, launches out of stealth with $252M - Augment, a new AI-powered coding platform developed by former Microsoft software developer Igor Ostrovsky, has entered the market, securing $252 million in funding with a valuation close to $977 million. Augment, backed by prominent investors like Eric Schmidt, aims to enhance software quality and developer productivity through AI. Despite a crowded market featuring offerings from major tech companies and various startups, Augment's approach remains secretive with details on user experience and AI models under wraps. The company plans a software-as-a-service model, with product details to be released later. Augment, currently in early access with "hundreds" of developers, continues to refine its service, asserting superiority in understanding programmer intent and protecting intellectual property, with ambitions to grow its 50-strong team by the year's end.


Careerist’s Manual QA course can be completed in 15 weeks, with personalized guidance from experienced coaches. Take the first step towards a successful tech career today by following this link or with promo code MATTHEW BERMAN to receive a $600 discount on the course PLUS a money-back guarantee.
