Thick AI Drama Continues

Execs Leave, Ex-board spills tea, and more

OpenAI drama continues as accusations of lying, withholding information, and creating a toxic work environment surface

In response to claims made by former OpenAI board members Helen Toner and Tasha McCauley, the company's new board members have published an article refuting their statements. The new board commissioned an external review led by law firm WilmerHale, which found no AI safety concerns necessitating CEO Sam Altman's replacement in November and rejected the prior board's decision to oust him. The article highlights OpenAI's commitment to AI safety, regulation, and governance, emphasizing ongoing discussions with government officials, voluntary commitments to the White House, and Altman's role on the US Department of Homeland Security's Artificial Intelligence Safety and Security Board. Additionally, the company has formed a new Safety Committee and adopted improved governance structures to ensure responsible AI development. The article concludes by reaffirming OpenAI's dedication to building industry-leading models in terms of both capabilities and safety, recognizing their role in stewarding transformative technologies for the global good.

  • OpenAI CEO Cements Control as He Secures Apple Deal - After a brief ouster last year, Sam Altman has strengthened his position as OpenAI's CEO, eliminating internal opposition and steering the company towards restructuring its nonprofit status. Altman has secured a significant deal with Apple to integrate OpenAI's conversational AI into Apple products, potentially generating billions in revenue. This partnership aims to enhance OpenAI's industry standing and possibly disrupt Google's dominance with Apple. Additionally, Altman is pursuing ventures outside OpenAI, including AI server-chip factories and an AI-powered personal device, further consolidating his influence.

  • PwC agrees deal to become OpenAI's first reseller and largest enterprise user - PwC has entered into a partnership with OpenAI, becoming its first resale partner and its largest enterprise user. The deal allows PwC's US and UK branches to provide their employees and clients with ChatGPT Enterprise, focusing on advanced AI tools such as the newly unveiled GPT-4o model. PwC aims to integrate these AI capabilities to enhance its audit, tax, and consulting offerings. Over 100,000 PwC employees are expected to access these tools, although exact usage figures were not disclosed. Financial terms of the agreement remain private. The collaboration is part of PwC's strategic initiative to invest $1 billion into AI over three years, aiming to leverage custom GPTs for tasks like reviewing tax returns and generating business materials, while also aiding clients in adopting generative AI technologies.

  • Apple's Plan to Protect Privacy With AI: Putting Cloud Data in a Black Box - Apple is set to unveil a new initiative at its upcoming developer conference that aims to integrate AI into its products while preserving user privacy. The approach, known internally as Apple Chips in Data Centers (ACDC), involves processing data in a virtual black box using custom chips designed for secure computing. This method, similar to confidential computing, ensures that data remains private even during processing, preventing access by Apple employees. The initiative could enable more advanced AI features for devices like iPhones and future wearables, addressing privacy concerns while leveraging cloud computing capabilities.

  • Google Researchers Say AI Now Leading Disinformation Vector (and Are Severely Undercounting the Problem) - Google researchers have found that AI-generated disinformation, particularly image-based, has become the leading vector for spreading false information online. The study, which analyzed data from fact-checking sites like Snopes and Politifact, indicates that AI-generated disinformation surged dramatically in late 2023 with the advent of accessible AI image generators. Despite this, the researchers warn that the problem is likely underreported due to the selective nature of fact-checking and resource limitations, especially with non-English disinformation often going unchecked. Additionally, the study highlights the shift from image-based to video-based disinformation, which may include AI-generated content.

  • AI models have favorite numbers, because they think they're people - Recent experiments by engineers at Gramener reveal that AI models exhibit human-like biases when selecting random numbers, showing preferences for specific numbers. For instance, OpenAI's GPT-3.5 Turbo frequently chooses 47, Anthropic's Claude 3 Haiku favors 42, and Google's Gemini often picks 72. These models avoid low and high extremes, and rarely select multiples of 5 or repeating digits, mimicking common human biases. This behavior stems from their training data, where certain numbers appear more frequently in contexts requiring randomness. The findings highlight the anthropomorphic tendencies in AI responses, rooted in patterns from human-generated data.

  • AI helps Klarna cut marketing agency spend by 25% and run more campaigns - In Q1 2024, Klarna reduced its sales and marketing spend by 11%, attributing $10 million in annual savings to AI implementation, which is 37% of total cost reductions. These savings were achieved by cutting external agency expenses by 25%, resulting in $4 million in savings, and by slashing image production costs by $6 million, despite an increase in campaigns and image volume. Utilizing AI tools like Midjourney, DALL-E, Firefly, Topaz Gigapixel, and Photoroom, Klarna reduced image production time from 6 weeks to 1 week and saved $1.5 million in Q1 alone. AI has enhanced efficiency and creativity, permitting weekly content updates and the generation of over 1,000 images in three months. Furthermore, AI has facilitated cost-effective generation of event-specific marketing images, reducing reliance on stock photos. Copywriting has also been streamlined, with AI producing 80% of all marketing copy. Klarna collaborates with OpenAI and has developed over 300 GPTs for internal use, with a significant portion of AI-driven projects situated within the marketing department.

  • OpenAI Inks Licensing Deals to Bring Vox Media, The Atlantic Content to ChatGPT - OpenAI has signed agreements with Vox Media and The Atlantic to license their content for use with ChatGPT, including collaboration on product development. These partnerships come amid other contracts with media entities, such as News Corp and Reddit, contrasting with lawsuits from others like the New York Times. The Atlantic recently criticized such deals in an article, voicing concern over media companies compromising their intellectual property and credibility. Financial terms were not disclosed, but the partnerships involve content licensing for ChatGPT and developmental collaboration, aiming to enhance audience engagement and product innovation.

  • We had cellphones, then feature phones, then smartphones. Now, 'IntelliPhones' are coming - The evolution of mobile devices from early cellphones to today's smartphones could witness another leap with the advent of "IntelliPhones." Analysts at Bank of America Securities project a significant shift, driven by advances in AI, which could render present smartphones obsolete. Key potential features of IntelliPhones include context-aware assistance, proactive suggestions, object and scene recognition, real-time translation, health monitoring, AI-driven content creation, adaptive music haptics for those with hearing impairments, and improved voice recognition for users with speech-affecting conditions. However, the concept faces skepticism due to current AI hype and coincides with industry initiatives to unveil new AI features, underscoring a potential "once in a decade" major upgrade cycle for consumer smartphones.

  • Arm unveils new AI-optimised chip designs for smartphones - Arm has unveiled new AI-optimized chip designs, particularly notable for smartphones, including both CPU and GPU offerings with enhanced performance and efficiency. The flagship CPU, Cortex-X925, offers a 36% improvement in speed over its predecessor, along with a 25% AI performance boost. Software tools like Kleidi libraries aid AI and computer vision integration, and a strategic shift offers production-ready blueprints, in collaboration with Samsung and TSMC, to accelerate market delivery. Moreover, there's an ongoing initiative to support more native Windows applications, including Spotify, Chrome, and Audacity, on Arm processors.
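
The number-preference biases described in the Gramener item above (a favorite number, few multiples of 5, few repeated digits) are easy to measure once a model's picks have been collected. Here is a minimal sketch, assuming the picks are integers from 1 to 100 already gathered into a list. The sample list and the function name are made up for illustration and are not data from the experiments.

```python
from collections import Counter


def summarize_picks(picks):
    """Summarize a list of integer picks assumed to lie in 1..100."""
    counts = Counter(picks)
    total = len(picks)
    favorite, favorite_count = counts.most_common(1)[0]
    # Multiples of 5 make up 20% of 1..100 under a uniform draw.
    multiples_of_5 = sum(1 for p in picks if p % 5 == 0)
    # Two-digit numbers with repeated digits (11, 22, ..., 99) are the
    # two-digit multiples of 11; they make up 9% of 1..100.
    repeated_digits = sum(1 for p in picks if 10 <= p <= 99 and p % 11 == 0)
    return {
        "favorite": favorite,
        "favorite_share": favorite_count / total,
        "multiple_of_5_share": multiples_of_5 / total,
        "repeated_digit_share": repeated_digits / total,
    }


# Hypothetical sample, invented for this sketch only.
sample_picks = [47, 47, 42, 73, 47, 37, 57, 47, 42, 64]
summary = summarize_picks(sample_picks)
print(summary)
```

Comparing the shares against the uniform baselines noted in the comments gives a rough sense of how far a model's picks drift from true randomness.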

Awesome Research Papers

  • Divergent Creativity in Humans and Large Language Models - The study compares the creative capabilities of state-of-the-art Large Language Models (LLMs) and humans using the Divergent Association Task (DAT) and other creative writing tasks. It finds that LLMs like GPT-4 can surpass human performance in certain creative tasks, such as generating semantically diverse word lists and creative writing. The research highlights that LLMs exhibit human-like creativity patterns and that their performance can be enhanced through prompt engineering and hyperparameter tuning. This study opens new avenues for understanding and developing artificial creativity while also providing insights into the distinctive elements of human inventive thought processes.

  • What We Learned from a Year of Building with LLMs (Part I) - This comprehensive guide on building products with Large Language Models (LLMs) is divided into tactical, operational, and strategic insights. Written by a diverse group of experts, the content aims to share hard-earned lessons from real-world LLM applications. The first tactical section covers practical aspects such as prompting techniques, the structuring of inputs/outputs, and retrieval-augmented generation (RAG) to improve model performance. The authors provide tips on formulating simple, effective prompts and highlight the significance of designing deterministic workflows. Furthermore, the guide anticipates upcoming operational and strategic guidance, aiming to arm practitioners with knowledge beyond machine learning expertise.

  • SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering - The paper introduces SWE-agent, an autonomous system designed to tackle software engineering tasks by using a language model to interface with a computer. It emphasizes the importance of an Agent-Computer Interface (ACI) for enhancing an agent’s ability to manage code and navigate complex repositories effectively. The SWE-agent outperforms prior models on SWE-bench, solving 12.5% of tasks, a significant improvement over the 3.8% success rate of past methods employing retrieval-augmented generation. The research also delves into how different ACI designs affect agent performance, offering insights into what constitutes a successful interface in the realm of coding and software development.
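
For readers curious how a task like the Divergent Association Task can be scored: it is commonly described as the average semantic distance between the words in a list, scaled to roughly 0 to 100. The sketch below illustrates that idea with tiny hand-made vectors. Real scoring relies on pretrained word embeddings, so the toy vectors, the function names, and the exact scaling here are assumptions for illustration, not the paper's implementation.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(words, embeddings):
    """Average pairwise cosine distance between words, times 100."""
    pairs = list(combinations(words, 2))
    total = sum(cosine_distance(embeddings[a], embeddings[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)


# Toy 3-dimensional "embeddings", invented for this sketch only.
toy_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.9, 0.1, 0.0],
    "galaxy": [0.0, 1.0, 0.0],
    "justice": [0.0, 0.0, 1.0],
}

related = dat_style_score(["cat", "dog"], toy_embeddings)
diverse = dat_style_score(["cat", "galaxy", "justice"], toy_embeddings)
print(f"related words: {related:.2f}, diverse words: {diverse:.2f}")
```

A list of unrelated words scores higher than a list of near-synonyms, which is the property the task uses as a proxy for divergent thinking.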

Codestral: Hello, World! - Codestral is a new 22B parameter generative AI model focused on code generation, fluent in over 80 programming languages, designed to streamline software development. It outperforms other models with a larger context window and excels in a variety of benchmarks, including Python, SQL, and additional languages across various coding tasks. Codestral is accessible through its dedicated API endpoint, which is initially free during an 8-week beta period. It's available on popular IDEs through community partner integrations and can be used in research and testing under a non-production license.

AstroPT: Scaling Large Observation Models for Astronomy - AstroPT is an autoregressive transformer specifically designed for astronomical datasets. The model was trained on 8.6 million galaxy images from the DESI Legacy Survey DR8. Researchers scaled the model from 1 million to 2.1 billion parameters, observing a performance scaling law similar to that found in text-based models. Performance on tasks improved with model size, reaching a saturation point. The team encourages open-source collaboration and development of a Large Observation Model for the observational sciences. They released AstroPT's source code, weights, and dataset under the MIT license, inviting further collaborative research and development.
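
A scaling law of the kind mentioned for AstroPT is usually checked by fitting a straight line in log-log space. The sketch below assumes a simple power law, loss = a * N^(-b), where N is the parameter count. The data points are synthetic, generated from known constants so the fit can be verified, and are not AstroPT results. The saturation the summary mentions is not modeled here.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(size).

    Returns the pair (a, b).
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic points generated from loss = 5 * N^(-0.1); illustrative only.
sizes = [1e6, 1e7, 1e8, 1e9, 2.1e9]
losses = [5.0 * n ** -0.1 for n in sizes]

a, b = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, b = {b:.3f}")
```

On real measurements the points would not sit exactly on the line, and a flattening at large N would show up as residuals that grow with model size.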

Awesome New Launches

Anthropic’s AI now lets you create bots to work for you - Anthropic is introducing a new "tool use" functionality for its AI chatbot Claude, which integrates with external APIs to enable bespoke assistant creation, like email management or shoe shopping bots. This feature also extends to visual data analysis, such as offering customized interior design advice by processing room images. Accessible via Anthropic’s Messages API, Amazon Bedrock, and Google Vertex AI, the service's cost is determined by the amount of text processed. During beta testing with select customers, the most cost-effective option was the Haiku tier. Anthropic's expanded use of AI for creating personalized solutions denotes a broader industry trend toward highly versatile AI agents, as seen with Google’s and OpenAI’s latest efforts.
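
To make "tool use" more concrete: the developer describes a tool with a name, a description, and a JSON Schema for its inputs, the model replies with a request to call it, and the application runs the tool and sends the result back. The sketch below mirrors that flow locally without any network call. The `get_shoe_price` tool, its price table, and the example request are hypothetical, and the block shapes should be checked against Anthropic's current API reference before use.

```python
import json

# Tool definition: name, description, and a JSON Schema for the inputs.
# get_shoe_price is a hypothetical tool invented for this sketch.
SHOE_PRICE_TOOL = {
    "name": "get_shoe_price",
    "description": "Look up the price of a shoe model in US dollars.",
    "input_schema": {
        "type": "object",
        "properties": {
            "model": {"type": "string", "description": "Shoe model name"},
        },
        "required": ["model"],
    },
}

# Made-up catalog standing in for a real external API.
PRICES = {"trail runner": 120.0, "canvas sneaker": 55.0}


def get_shoe_price(model):
    """Return the catalog price for a shoe model, or None if unknown."""
    return {"model": model, "price_usd": PRICES.get(model.lower())}


HANDLERS = {"get_shoe_price": get_shoe_price}


def handle_tool_use(block):
    """Run the requested tool and wrap its output as a tool_result block."""
    handler = HANDLERS[block["name"]]
    result = handler(**block["input"])
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],
        "content": json.dumps(result),
    }


# Example of the kind of request a model might emit (hypothetical values).
example_block = {
    "type": "tool_use",
    "id": "toolu_example",
    "name": "get_shoe_price",
    "input": {"model": "Trail Runner"},
}

tool_result = handle_tool_use(example_block)
print(tool_result)
```

In a real integration, the tool definition would be passed along with the conversation, and the `tool_result` block would be returned to the model in the next user message.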

Announcing Sonic: A Low-Latency Voice Model for Lifelike Speech - Cartesia aims to create pervasive real-time AI intelligence using state space models (SSMs), starting with Sonic, a low-latency voice model. The company envisions AI that processes vast contexts efficiently, functions on all devices, and handles multiple modalities in real time. Traditional models are perceived as limited, whereas SSMs, like S4 and Mamba, enable more efficient AI that streams information continuously. Cartesia focuses on making AI faster, more affordable, and broadly accessible. Progress has been made with state space model architectures, reducing latency and improving quality in audio generation. Sonic exhibits these advancements with lifelike speech generation capabilities. Cartesia is expanding beyond audio to multimodal experiences and is currently hiring to advance this mission, with future releases planned to be open-source.

Introducing the Property Graph Index: A Powerful New Way to Build Knowledge Graphs with LLMs - LlamaIndex introduced the Property Graph Index to enhance the capability of knowledge graphs with greater expressiveness and flexibility. The traditional triple-based structure is expanded to support node and relationship properties, text as vector embeddings, and hybrid search. Users can now categorize nodes, store metadata, and perform complex queries with the Cypher language. The Property Graph Index offers methods like schema-guided, implicit, and free-form extraction to construct knowledge graphs. It also allows for various querying techniques, including keyword/synonym retrieval, vector similarity, and custom traversal.
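
The difference between a plain triple store and a property graph is easiest to see in a toy model. The sketch below is not LlamaIndex code. It is a hypothetical in-memory structure showing nodes with labels and properties, relationships with their own properties, and a simple label-filtered traversal.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    label: str  # category of the node, e.g. ORGANIZATION
    properties: dict = field(default_factory=dict)


@dataclass
class Relation:
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


class TinyPropertyGraph:
    """Hypothetical toy graph; a real index adds embeddings and storage."""

    def __init__(self):
        self.nodes = {}
        self.relations = []

    def add_node(self, node):
        self.nodes[node.name] = node

    def add_relation(self, relation):
        self.relations.append(relation)

    def nodes_with_label(self, label):
        return [n for n in self.nodes.values() if n.label == label]

    def neighbors(self, name, relation_label=None):
        """Follow outgoing relations, optionally filtered by label."""
        return [
            self.nodes[r.target]
            for r in self.relations
            if r.source == name
            and (relation_label is None or r.label == relation_label)
        ]


graph = TinyPropertyGraph()
graph.add_node(Node("LlamaIndex", "ORGANIZATION"))
graph.add_node(Node("Property Graph Index", "PRODUCT", {"kind": "index"}))
graph.add_relation(
    Relation("LlamaIndex", "INTRODUCED", "Property Graph Index",
             {"source": "newsletter summary"})
)

# Roughly what this Cypher pattern expresses:
# MATCH (o:ORGANIZATION)-[:INTRODUCED]->(p) RETURN p
introduced = graph.neighbors("LlamaIndex", "INTRODUCED")
print([n.name for n in introduced])
```

A triple would keep only ("LlamaIndex", "INTRODUCED", "Property Graph Index"). The labels and property dictionaries are what the extended structure adds.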

Perplexity's New 'Page' Feature Writes Fully Sourced Reports and Articles for You - Perplexity has introduced a new AI feature named Page that allows users to automatically generate draft reports complete with citations, videos, and images. Accessed via a 'Page' button under the Library section, users can tailor content to audience levels, from beginners to experts. The AI sources content in real time from billions of web pages. The tool is designed to work similarly to Microsoft Word or Google Docs, with functionalities such as adding sections and multimedia, and the capability to remove sources along with the associated content. This innovation comes amid a wave of advancements from AI firms like OpenAI and tech giants such as Google and Microsoft, all augmenting their services with new AI features. Perplexity, currently amidst raising startup funds, has reached a valuation of $3 billion, showing rapid growth amidst a thriving AI industry.

Cool New Tools

Tools on Hugging Chat - HuggingChat has launched the beta release of "Tools on HuggingChat," expanding the functionality of its default model, Cohere Command R+. These tools, accessed via ZeroGPU spaces, enable various functionalities such as web searching, URL content fetching, document parsing, image generation and editing, and performing calculations. The tools are chosen for their speed and simplicity, ensuring a smooth user experience. Future enhancements include support for more models, multi-step tool use, and the ability to add and reference user-provided tools and files.

Scale's SEAL Leaderboards - Scale has introduced the SEAL Leaderboards, a tool for comparing the safety and performance of large language models (LLMs) that can't be manipulated by those being evaluated. This ranking system is continuously updated and relies on private datasets and verified domain experts in areas like coding, instruction following, math, and multilinguality. In addressing common industry issues such as data contamination, inconsistent reporting, unverified evaluator expertise, and inadequate tooling, Scale aims to enhance the evaluation quality, transparency, and standardization of AI models. Alongside the leaderboards, Scale offers an Evaluation Platform to aid in AI model analysis and improvement. This initiative aspires to improve AI transparency and accelerate responsible development, inviting feedback and participation from AI communities.

YouTube Music will let you search by humming into your Android phone - The Android YouTube Music app is introducing a new feature that allows users to identify songs by humming, whistling, singing, or playing a recording. When searching, users can tap a waveform icon to enable the app's listening function. Testing reveals the feature is fast and can accurately identify many songs from a hum or whistle, though it's not infallible. This song identification tool, similar to Shazam, has been spotted in the iOS version but hasn't been widely released yet. Overall, the feature is quick and efficient, albeit with occasional amusing inaccuracies.

Opera adds Google’s Gemini to its browsers - Opera's Aria AI extension, initially launched last year to assist users with tasks such as answering queries and writing code, has been enhanced through integration with Google's Gemini AI models. This update allows Aria to access the latest information with high performance. Gemini AI includes a range of models from the compact Gemini Nano to the advanced Gemini Ultra. Now, Opera users, including those on Opera GX, can experience Aria's improved capabilities, including reading responses aloud in a natural, conversational manner, using Google's text-to-audio technology. This follows a previous collaboration where Opera implemented Google's Imagen 2 model for in-browser image generation.

