OpenAI API Crushing, China Access Blocked

OpenAI Making Moves This Week

A Week of Non-Stop Innovation: New Models Emerge and Tools Launch to Bridge the Gap

OpenAI's sales of AI models have surpassed Microsoft's, with OpenAI generating over $1 billion in annualized revenue. Despite Microsoft's alliance with OpenAI and extensive sales resources, OpenAI's direct sales approach, including early access to new models, has proven effective. Both companies share revenue and maintain a competitive but cooperative relationship. OpenAI's ability to quickly adapt and offer new AI capabilities has given it an edge, challenging Microsoft's traditionally dominant position in enterprise AI services.


Vultr is empowering the next generation of generative AI startups with access to the latest NVIDIA GPUs.

Try it yourself when you visit and use promo code "BERMAN300" for $300 off your first 30 days.

  • Zuckerberg disses closed-source AI competitors as trying to 'create God' - In a recent interview, Meta CEO Mark Zuckerberg criticized closed-source AI efforts, likening them to attempts to "create God." He emphasized the importance of open-source AI for broader access and innovation. Zuckerberg highlighted Meta's AI Studio, which allows creators to build custom AI avatars, as a counterpoint to more restrictive approaches. He also discussed Meta's progress with smart glasses and the future of AI interfaces, suggesting that while new technologies will emerge, they won't completely replace smartphones.

  • OpenAI reportedly plans to block access in China. Chinese AI companies may fill the void - OpenAI is set to restrict API access in China, prompting Chinese tech companies to seize the opportunity to attract OpenAI's users. The decision comes in the wake of OpenAI cutting off certain "covert influence operations," including a Chinese network that had misused its AI models. Amidst concerns from the U.S. government about AI cybersecurity risks, Chinese firms like Baidu, Alibaba Cloud, and Zhipu AI are now offering incentives such as free migration services and additional tokens to encourage users to switch to their platforms, advertising capabilities comparable to OpenAI's technologies.

  • Stability AI lands a lifeline from Sean Parker, Greycroft - Stability AI, the company behind Stable Diffusion, has secured new funding from investors including Sean Parker, who also joins as executive board chairman. The investment comes amid financial struggles, with the company facing unpaid cloud bills and a possible reduced valuation. Former CEO Emad Mostaque’s mismanagement led to financial difficulties, but new investors have committed $80 million, forgiven $100 million in debt, and released Stability from $300 million in future obligations. Going forward, Stability plans to focus on managed image, video, and audio pipelines, custom enterprise models, and APIs for consumer apps, while remaining committed to open-source principles.

  • YouTube tries convincing record labels to license music for AI song generator - YouTube is negotiating with major music labels to obtain song licenses for AI tools that emulate popular musicians' styles. These discussions involve upfront payments to labels like Sony, Warner, and Universal, with the goal of legally training AI song generators for an upcoming launch. There is resistance from artists concerned about the potential impact on the value of their work, and initial tests of a generative AI tool drew few participating artists. While YouTube refines its AI music product, the industry is witnessing lawsuits against AI startups for copyright infringement. Record companies are cautiously exploring opportunities to monetize AI-generated songs amid technological disruption; YouTube's plans, the scale of potential deals, and artist participation all remain in flux at this nascent stage of AI music generation.

  • AI Chatbots and Tools Are Coming to Gmail, Google Drive, and Firefox - Google is integrating AI side panels into Gmail, Docs, Sheets, Slides, and Google Drive using Gemini, enhancing writing assistance, summarization, and content creation. Mozilla is allowing users to incorporate AI chatbots like ChatGPT and Google Gemini into Firefox's sidebar for information summarization and language simplification. These AI tools are optional and aim to improve user experience without replacing existing tools, reflecting the growing trend of integrating AI into daily digital interactions. Privacy and security remain a priority for both companies.

  • I Paid $365.63 to Replace 404 Media With AI - Prototype.Press is a digital news platform specializing in technology content, featuring a range of articles across categories such as tech, science, and AI. Boasting a user-friendly design for smooth navigation, the site claims to produce content effortlessly, publishing a high volume of articles daily, like the 53 pieces reported on June 17. However, it's alleged that the site also engages in republishing uncredited work from other sources, exemplified by a story also found on 404 Media. Remarkably, the site's operation is powered by an autonomous system based on ChatGPT, set up for a modest investment with assistance from a freelancer.

  • Researchers Prove Rabbit AI Breach By Sending Email to Us as Admin - A group called Rabbitude exposed critical security flaws in Rabbit R1 AI devices, revealing hardcoded API keys that allowed them to access all responses ever given by the devices, manipulate responses, and send emails from internal Rabbit addresses. Despite being aware of the breach for a month, Rabbit had not rotated the keys until Rabbitude made the issue public. Rabbit has since started investigating and rotated the keys, causing temporary service disruptions.

  • Firefox starts letting you use AI chatbots in the sidebar - Mozilla is experimenting with incorporating AI chatbots into its Nightly build of Firefox. Users can choose from ChatGPT, Google Gemini, HuggingChat, or Le Chat Mistral to be added to the browser's sidebar. They can highlight text and interact with their chosen chatbot to summarize content, simplify language, or engage in a memory and knowledge test. This feature is optional and not a part of Firefox's core functions. To access it, users need the experimental Nightly version, enable AI Chatbot Integration in settings, and add the chatbot to the toolbar. Mozilla acknowledges the AI models are imperfect and will improve the experience before advancing it to more stable versions of Firefox.

  • How AI Revolutionized Protein Science, but Didn’t End It - In December 2020, a significant milestone in protein science was achieved when Google DeepMind's AI tool, AlphaFold2, demonstrated over 90% accuracy in predicting 3D protein structures, far outperforming other methods. This groundbreaking innovation effectively solved the long-standing protein folding problem: predicting the three-dimensional shape of a protein solely from its one-dimensional amino acid sequence. While AlphaFold2 is a powerful prediction tool, it does not fully explain the protein folding process but rather provides accurate structural predictions. Its release accelerated biological research and inspired a surge in AI-based scientific endeavors, including new algorithms for designing novel proteins. Nonetheless, challenges remain, such as predicting how proteins interact within the complex environment of a cell and accounting for dynamic structural changes. Despite its limitations, AlphaFold2 has significantly advanced the field, raising questions about the future direction of protein science and artificial intelligence’s role in scientific understanding.

  • Generative AI Can’t Cite Its Sources - OpenAI has entered into licensing agreements with major media outlets like The Wall Street Journal, Business Insider, and The Atlantic, allowing its AI models to include partner articles in responses. In an evolving digital landscape, OpenAI pledges to enhance readership by crediting and driving traffic back to these media sites. Despite OpenAI’s assurances, current AI tools, including ChatGPT, struggle to consistently provide proper citations or functional links to original sources. Investigations reveal that AI models often misattribute or incorrectly synthesize information and may replace genuine citations with summaries or secondary sources. The ability of AI to reliably recognize and cite information remains a critical area of research, with significant improvements required. The future of AI-driven attribution and the preservation of journalistic integrity is uncertain, as developers and media companies navigate potential solutions within this new collaborative framework.

  • AI Dataset Licensing Companies Form Trade Group - Seven companies specializing in content-licensing for AI training, including Rightsify and vAIsual, have formed the Dataset Providers Alliance (DPA). The DPA aims to advocate for ethical data sourcing, protecting intellectual property rights, and promoting legislation like the NO FAKES Act to prevent unauthorized digital replicas. The group will also push for transparency in training data requirements, similar to the EU's AI Act. They plan to publish a white paper in July outlining their positions.

Awesome Research Papers

  • Efficient data generation for source-grounded information-seeking dialogs: A use case for meeting transcripts - The paper details the creation and evaluation of the Meeting Information Seeking Dialogs dataset (MISeD), a resource aimed at improving interaction with meeting recordings through conversational AI models. It uses large language models (LLMs) for semi-automatic dialog generation, augmenting the labor-intensive Wizard-of-Oz (WOZ) method traditionally used for creating dialog datasets. MISeD allows users to query and engage with transcript content efficiently through an agent that answers queries with supporting evidence from the meeting record. Annotated over transcripts from various meeting domains, MISeD consists of 443 dialogs and assists in fine-tuning AI models, resulting in better response and attribution quality. The dataset, now open-source, is presented as a significant step forward in the automated generation of dialog datasets and a useful tool for meeting-exploration research.

  • Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers - The paper introduces SPARSEK Attention, a novel sparse attention mechanism that addresses the high computational and memory costs of traditional autoregressive Transformers when dealing with long sequences. By implementing a scoring network and a differentiable top-k mask, SPARSEK selectively processes a constant number of key-value (KV) pairs per query, achieving linear time complexity and a constant memory footprint. Experimental results suggest that SPARSEK Attention surpasses existing sparse attention methods in efficiency and can be integrated into pre-trained Large Language Models with minimal adjustment, enhancing their capability to handle long-range dependencies without compromising performance.
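The core idea can be sketched in a few lines. This is a toy single-query version, not the paper's method: SPARSEK uses a learned scoring network and a differentiable top-k mask, whereas this sketch scores keys with raw query-key dot products and a hard top-k.

```python
import numpy as np

def sparsek_attention(q, K, V, k):
    """Toy sparse attention: each query attends to only its top-k keys."""
    scores = K @ q                        # stand-in for the learned scoring network
    topk = np.argsort(scores)[-k:]        # hard top-k (the paper uses a differentiable mask)
    sub = scores[topk] / np.sqrt(q.shape[0])
    w = np.exp(sub - sub.max())           # softmax over only the selected keys
    w /= w.sum()
    return w @ V[topk]                    # cost is O(k) per query, not O(sequence length)
```

Because only `k` key-value pairs are touched per query, the per-step cost stays constant as the sequence grows, which is the source of the linear overall complexity.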

  • Meta Announces LLM Compiler for Code Optimization and Compilation - Meta has introduced the Meta LLM Compiler, a suite of models derived from Meta Code Llama that includes enhanced code optimization and compiler functionalities. These models are capable of emulating a compiler, predicting optimal passes for reducing code size, and disassembling code, with the ability to be fine-tuned for new tasks. Achieving state-of-the-art results in code size optimization and disassembly, these models are expected to aid compiler experts by identifying optimization opportunities. Meta is releasing the 7B and 13B models under a permissive license for both research and commercial purposes.

  • Finding GPT-4's Mistakes with GPT-4 - OpenAI has developed CriticGPT, a model based on GPT-4, to critique ChatGPT's responses and help human trainers spot errors. Critiques written with CriticGPT's assistance are preferred over unassisted human critiques about 60% of the time. CriticGPT aids in providing more comprehensive critiques, reducing hallucinated bugs, and enhancing the quality of reinforcement learning from human feedback (RLHF). The model shows promise for better alignment of advanced AI systems, although challenges with complex tasks and dispersed errors remain.

  • Gemma 2 is now available to researchers and developers - Gemma 2 is the latest addition to the Gemma family of AI models, presenting breakthroughs in both size and performance. Offered in 9B and 27B parameter sizes, it provides top-tier performance in its class, rivaling models twice its size, and can run efficiently on a single NVIDIA H100 Tensor Core GPU. It is designed for open access and broad framework compatibility, making it flexible for various workflows. Gemma 2 emphasizes responsible AI development, with safety processes to mitigate biases and risks. Users can experiment with Gemma 2's capabilities via Google AI Studio or access the model weights on platforms like Kaggle and Hugging Face.

  • ESM3: Simulating 500 million years of evolution with a language model - Introducing ESM3, a cutting-edge AI model designed for biological applications. Built by EvolutionaryScale, ESM3 is capable of understanding and predicting the sequence, structure, and function of proteins, simulating evolutionary processes, and generating new proteins with specific traits. This model, trained on a diverse set of proteins, has demonstrated the ability to create a green fluorescent protein (esmGFP) not found in nature, akin to simulating 500 million years of evolution. ESM3 is part of a responsible framework committed to open science; the company will release model weights and code for public use. ESM3, which operates at a massive scale of computational power and parameters, could revolutionize the fields of medicine, research, and clean energy by enabling precise programming of biological entities. The model's development aims to be transparent and aligned with safety and societal benefits, continuing the tradition of responsible scientific innovation.

  • 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities - The paper presents a model trained on diverse multimodal datasets, capable of performing numerous tasks across various input types. This model enhances the out-of-the-box capabilities of multimodal models by integrating semantic and geometric modalities, neural network feature maps, and new modalities such as image metadata and color palettes. The research highlights the potential for fine-grained multimodal generation, maintaining performance across three times more tasks/modalities compared to existing models. The model and its training code are open-sourced for community use.

  • SUTRA by Two AI - A language model supporting a broad range of widely-spoken global languages such as English, Spanish, and Mandarin, as well as languages from the Indian subcontinent, East Asia, the Middle East, and Europe. SUTRA's dual-transformer LLM architecture and multilingual tokenizer power cost-efficient AI solutions in 50+ languages with conversation, search, and visual capabilities. The model is based on the paper "SUTRA: Scalable Multilingual Language Model Architecture."

Awesome New Launches

Collaborate with Claude on Projects - Anthropic has introduced a new feature that allows Pro and Team users to consolidate their interactions with Claude into Projects, offering an enhanced collaborative AI experience. Each Project includes a 200K-token context window for storing documents and knowledge, enabling AI assistance tailored to specific organizational knowledge and tasks. Users can also create custom instructions for context-specific responses. The platform now provides enhanced tools for content creation, such as Artifacts, to aid in generating various types of content. Claude Team members can share highlights from their interactions for collective skill and idea development. Privacy protocols ensure user data within Projects is not used for model training without consent.

Introducing llama-agents: A Powerful Framework for Building Production Multi-Agent AI Systems - LlamaIndex, Data Framework for LLM Applications - Llama-agents is an open-source framework designed to facilitate the creation, iteration, and deployment of multi-agent AI systems. It features a distributed architecture with each agent operating as a separate microservice, a standardized API for agent communication, and flexible agent orchestration. Additionally, it promises ease of deployment, scalability, and built-in tools for performance monitoring.
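The underlying pattern, agents as independent services exchanging messages through a control plane that routes work between them, can be sketched generically. The handler names and message shape below are hypothetical and this is not the llama-agents API:

```python
from collections import deque

# Each handler stands in for an agent microservice; in a real deployment
# these would be separate processes communicating over a message queue.
def researcher(msg):
    return {"to": "writer", "task": f"notes on {msg['task']}"}

def writer(msg):
    return {"to": None, "task": f"draft from {msg['task']}"}  # None = final answer

AGENTS = {"researcher": researcher, "writer": writer}

def orchestrate(initial):
    """Minimal control plane: route messages between agents until done."""
    pending = deque([("researcher", initial)])
    while pending:
        name, msg = pending.popleft()
        out = AGENTS[name](msg)
        if out["to"] is None:
            return out["task"]
        pending.append((out["to"], out))
```

The value of making each agent a service behind a standard message interface is that agents can be scaled, swapped, or monitored independently, which is the design llama-agents advertises.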

Figma announces big redesign with AI - Figma unveiled a series of updates at its Config conference: a major UI redesign termed "UI3," new generative AI tools, and built-in slideshow functionality, known as Figma Slides. The interface overhaul, aimed at setting a foundation for the next decade, includes a refreshed toolbar, rounded corners, and 200 new icons, prioritizing user work and accessibility. Initiated as a limited beta, the redesign is Figma's third significant makeover since its beta launch. The AI tools, also in limited beta, promise to simplify design processes, offering quick project starts and tasks like "AI-enhanced" asset search and auto-text generation, moving beyond generic placeholders.

Gemini 1.5 Pro 2M context window, code execution capabilities, and Gemma 2 are available - Google has announced developer-facing enhancements to Gemini 1.5 Pro: a 2 million token context window, context caching to reduce the cost of repeated input, and a code execution feature for tasks involving mathematics or data reasoning. The code execution capability enables the model to generate and run Python code iteratively until it reaches a final output. Gemma 2, an open model, has also been made available for experimentation in Google AI Studio.

Meta starts testing user-created AI chatbots on Instagram - Meta CEO Mark Zuckerberg announced that the company is testing AI characters created by users through Meta AI Studio on Instagram, starting in the U.S. These AI chatbots, developed in collaboration with creators, will be clearly labeled and initially available in messaging. The tests, involving about 50 creators, aim to explore various use cases, such as creators engaging with fans and businesses interacting with customers. Meta plans to refine these AI avatars and gradually expand access, aiming for a full launch by August.

110 new languages are coming to Google Translate - Google Translate has expanded its offerings with a diverse mix of global and indigenous languages, adding languages spoken by more than 614 million people, about 8% of the global population. The rollout features major languages and those with significant revitalization movements, with a notable inclusion of African languages such as Fon, Kongo, Luo, and Wolof. Highlights include Cantonese, a highly requested language; Manx, a revived Celtic language; NKo, a West African script-based language; Shahmukhi Punjabi; Tamazight from North Africa, supporting multiple scripts; and Tok Pisin, an English-based creole. The language selection process prioritizes commonly used varieties and dialects, aided by technological advancements such as PaLM 2 for efficient learning of closely related languages.

Meet Sohu, the fastest AI chip of all time, by Etched - Sohu, the first specialized chip (ASIC) for transformer models, outperforms GPUs by handling over 500,000 tokens per second with Llama 70B, making an 8xSohu server equivalent to 160 H100 GPUs. Its specialization means it cannot run other AI models like CNNs or LSTMs, but this focus yields significant performance gains crucial for modern AI products such as ChatGPT and Claude. As GPUs face limitations in compute density improvements, the shift towards custom chips like Sohu is driven by the need for enhanced performance and cost efficiency in training and inference at scale.

Meta x Lamini - A tutorial from Lamini and Meta shows how to tune Llama 3 for building LLM applications with enhanced SQL query generation. Using Lamini Memory Tuning, the walkthrough reaches 95% accuracy while reducing hallucinations, detailing the steps involved: creating a dataset, evaluating the model, tuning on the data, and making iterative improvements. Real-world use cases show significant time savings and workload reduction for business users. The process is highly iterative, involving multiple filtering passes, dataset revisions, error analyses, and numerous tuning cycles, so building the capability to iterate quickly matters as much as any single run.
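The evaluation step in a pipeline like this is typically execution-based: run the generated SQL and the reference SQL against the same database and compare the results rather than the query text. A minimal sketch with a made-up schema, not Lamini's code:

```python
import sqlite3

def sql_matches(conn, generated_sql, gold_sql):
    """Execution-based check: do the two queries return the same rows?"""
    try:
        got = conn.execute(generated_sql).fetchall()
    except sqlite3.Error:
        return False                      # invalid generated SQL counts as a miss
    return got == conn.execute(gold_sql).fetchall()

# Toy reference database standing in for the real business schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

accuracy = sql_matches(conn,
                       "SELECT COUNT(*) FROM orders WHERE amount > 10",
                       "SELECT COUNT(*) FROM orders WHERE amount > 10")
```

Comparing result sets instead of query strings means semantically equivalent but differently-written queries still count as correct, which is why execution accuracy is the usual metric for text-to-SQL tuning.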

Friend Wearable AI (Based Hardware) | Your Personal AI Mentor and Memo - An AI wearable device billed as the number one open-source option in its category. The device integrates artificial intelligence into a wearable format, and its open-source nature gives users and developers the ability to modify and enhance the software, opening customization and development opportunities across a broad range of use cases.

Smashing, from Goodreads' co-founder, curates the best of the web using AI and human recommendations - Otis Chandler, co-founder of Goodreads, is launching Smashing, a new content recommendation app that seeks to help users discover high-quality online content including news, podcasts, and social media posts. Smashing is entering an invite-only beta phase, aiming to combat issues like the fragmented media landscape, media layoffs, and AI-generated news summaries that threaten publisher traffic. Smashing encourages users to engage with full articles on publishers' sites by offering AI-powered recommendations and a community-voting system to unearth the internet's hidden gems. Users customize their content feed by submitting URLs, voting on submissions, and interacting with AI-summarized content. The platform has secured $3.4 million in seed funding, and its team includes tech veterans with a track record of successful ventures.

ElevenLabs Reader App - ElevenLabs has launched the ElevenLabs Reader App, enabling users to listen to articles, PDFs, ePubs, or any text using high-quality AI voices. The app is currently available for iOS users in the US, UK, and Canada, with plans for a global launch once multilingual support is added. Users have praised its enunciation, tone, and fluidity. The app aims to provide an enhanced text-to-speech experience for users on the go.

AliveCor Launches Kardia 12L: A First-of-Its-Kind AI-Powered 12-Lead ECG System - AliveCor announced the FDA clearance of its AI technology, KAI™ 12L, and the Kardia™ 12L ECG System—the first of its kind to detect cardiac conditions, including heart attacks, using a reduced leadset. KAI 12L utilizes deep neural networks trained on over 1.75 million ECGs for identifying 35 cardiac determinations. The Kardia 12L ECG System is a handheld, 12-lead ECG device that is significantly smaller and more portable than traditional machines, requiring only a single cable to function. Its convenience and simplicity are intended to enhance clinical efficiency, making it suitable for a broad range of healthcare environments.

