The world of generative AI has been rapidly evolving, with major tech companies and startups continuously pushing the boundaries of what's possible. In April 2024, we've seen a flurry of exciting developments that are poised to reshape the AI landscape. From Google's new Vertex AI Agent Builder empowering partners to build advanced AI assistants, to the launch of Read AI's AI-powered productivity tools, this article will explore the top 10 most exciting news stories in the generative AI space this past month.
Vertex AI
In a move to empower its partners in delivering advanced AI capabilities to customers, Google has introduced the Vertex AI Agent Builder. This new tool is designed to simplify the process of building and deploying generative AI agents that can tackle a wide range of business challenges. The Vertex AI Agent Builder brings together Google's foundation models, the power of Google Search, and various developer tools into a unified platform. This allows partners to create AI agents that are grounded in Google's vast knowledge and can be tailored to specific customer needs.
The agent paradigm shifts the interaction from asking a question to giving the system a goal. The agent then breaks that goal down: it reasons, plans, generates a first draft of the answer, reflects on that draft, and improves it through an iterative set of steps, either critiquing its own work or asking the user for feedback. At the end of the process, it helps the user accomplish the goal.
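The goal-driven loop described above can be sketched in a few lines. Note this is a generic illustration of the plan-draft-reflect-revise pattern, not the Vertex AI Agent Builder API; all function names here are hypothetical stand-ins for calls to a foundation model.

```python
# Minimal, illustrative goal-driven agent loop. The plan/draft/reflect/
# revise functions below are toy stand-ins for model calls.

def run_agent(goal, max_iterations=3):
    steps = plan(goal)                   # break the goal into sub-tasks
    draft = generate_draft(goal, steps)  # first attempt at the answer
    for _ in range(max_iterations):
        critique = reflect(goal, draft)  # agent critiques its own output
        if critique is None:             # nothing left to improve
            break
        draft = revise(draft, critique)  # incorporate the critique
    return draft

# Toy stand-ins so the sketch runs end to end.
def plan(goal):
    return [f"research: {goal}", f"summarize: {goal}"]

def generate_draft(goal, steps):
    return f"Draft addressing '{goal}' via {len(steps)} steps"

def reflect(goal, draft):
    # Pretend the first draft always needs one round of improvement.
    return "add detail" if "revised" not in draft else None

def revise(draft, critique):
    return draft + " (revised: " + critique + ")"

print(run_agent("compare cloud AI pricing options"))
```

In a real agent, each stand-in would be a model invocation (possibly with tool calls), and the reflection step could also pause to request human feedback, as described above.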
In addition to the Vertex AI Agent Builder, Google has also introduced new enhancements to help organizations build data-driven AI agents, including integrating the Gemini language model into popular products like BigQuery, Databases, and Looker. These advancements from Google aim to empower partners and customers to create AI assistants that can truly understand and assist in achieving their goals, marking a significant step forward in the evolution of AI-powered solutions.
Google’s Code Assist
In a move to bolster its enterprise-focused AI capabilities, Google has unveiled Code Assist, its latest offering to compete with GitHub's popular Copilot service. Announced at the company's Cloud conference, Code Assist is a rebranded and significantly upgraded version of Google's previous Duet AI code-completion tool, drawing on the power of Google's Gemini 1.5 Pro language model.

One of the key differentiators of Code Assist is its 1 million-token context window, the largest in the industry, according to Google. This expanded context allows the tool to better understand and reason over large codebases, enabling "AI-assisted code transformations that were not possible before," as explained by Brad Calder, Google's VP and GM for its cloud platform and technical infrastructure.

Code Assist will be available through plug-ins for popular development environments like VS Code and JetBrains, making it easily accessible to enterprise developers. This direct competition with GitHub's Copilot Enterprise service underscores Google's ambition to become a dominant player in AI-powered code assistance. As the battle for developer mindshare intensifies, this latest move from Google is sure to shake up the landscape of AI-driven coding tools.
Read AI
Artificial intelligence startup Read AI has announced a significant new funding round and an expansion of its product capabilities, positioning the company for growth in the rapidly evolving field of AI-powered productivity tools. The company has raised $21 million in funding from Goodwater Capital and Madrona Venture Group, underscoring investor confidence in Read AI's potential. This new capital will help the startup, which currently has 20 employees, to more than double its workforce in the near future.
Alongside the funding news, Read AI has introduced a new feature called "Readouts" that leverages the company's AI agents to automatically summarize email conversations and messaging threads across popular platforms like Gmail, Outlook, Teams, and Slack. This functionality is designed to help users quickly digest the key points of discussions, even when they span multiple communication channels. Read AI's competitors include built-in tools from platform companies such as Microsoft, Google, and Zoom; Read is aiming to differentiate itself in part by working across platforms.

The expansion of Read AI's feature set comes as demand for AI-driven productivity solutions continues to grow, with users increasingly seeking tools that can help them navigate the deluge of digital communication. By providing AI-generated recaps of email and messaging, Read AI is positioning itself as a valuable ally in streamlining workflows and boosting efficiency for its customers. With the new funding and product enhancements, the company is poised to capitalize on the AI revolution sweeping through the productivity software landscape.
xAI
Elon Musk's artificial intelligence venture, xAI Corp., is reportedly in talks to raise a substantial amount of funding, potentially up to $4 billion. This proposed funding round could see the company's valuation soar to $18 billion, according to a Bloomberg report. The company, which was founded by Musk, is currently in discussions with potential investors, and a 20-page pitch deck is being circulated among Silicon Valley's venture capitalists. The pitch highlights Musk's successful track record at his other ventures, Tesla and SpaceX, as well as the potential for xAI to leverage high-quality data from Musk's social network, X, to train its AI models.
Access to quality data is crucial for developing large language models, which power AI chatbots. This is the area where xAI aims to compete with well-established players like ChatGPT-parent OpenAI. The ambitious funding round underscores Musk's aspirations to make xAI a major player in the rapidly evolving artificial intelligence landscape. With Musk's proven entrepreneurial prowess and the potential to leverage data from X, the company is positioning itself as a formidable challenger in the AI race. The success of this funding round will be closely watched by the tech industry, as it could further solidify Musk's standing as a visionary in the world of AI and potentially disrupt the existing dynamics in the AI market.
Cohere Unveils Command R+
In a move to bolster its offerings for the global enterprise community, AI research company Cohere has introduced Command R+, its latest and most advanced large language model (LLM) designed to excel at real-world business use cases. Command R+, which is now available first on Microsoft Azure, builds upon the success of Cohere's previous R-series models. The new LLM features a 128k-token context window and is optimized for several key capabilities that are crucial for enterprise-grade AI adoption, including:
- Advanced Retrieval Augmented Generation (RAG) with in-line citations to reduce hallucinations.
- Multilingual coverage in 10 key languages to support global business operations.
- Sophisticated tool use to automate complex business processes.
"Enterprises are clearly looking for highly accurate and efficient AI models like Cohere's latest Command R+ to move into production," said Miranda Nash, group vice president at Oracle.

Cohere's partnership with Microsoft Azure underscores the company's commitment to driving enterprise AI adoption. The integration of Command R+ with Azure AI services will enable businesses to leverage the model's capabilities while adhering to strict standards of security and compliance. With its enhanced performance, scalability, and enterprise-focused features, Command R+ positions Cohere as a formidable player in the race to provide businesses with state-of-the-art AI tools that can tackle real-world challenges and drive productivity across industries.
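The retrieval-augmented-generation-with-citations pattern that Command R+ is optimized for can be illustrated with a toy example. This is a generic sketch using a naive keyword retriever, not the Cohere API; the document contents and IDs are invented for illustration.

```python
# Toy RAG-with-citations sketch: retrieve supporting snippets for a
# query, then ground the answer in them with in-line source tags.

DOCS = [
    {"id": "doc1", "text": "Command R+ has a 128k-token context window."},
    {"id": "doc2", "text": "Command R+ covers 10 key business languages."},
]

def retrieve(query, docs):
    """Return documents sharing at least one keyword with the query."""
    terms = set(query.lower().split())
    return [d for d in docs if terms & set(d["text"].lower().split())]

def answer_with_citations(query, docs):
    """Ground the answer in retrieved snippets, citing each source."""
    hits = retrieve(query, docs)
    if not hits:
        return "No supporting documents found."
    # In a real system an LLM would synthesize the answer; here we just
    # stitch the snippets together with their citation tags.
    return " ".join(f'{d["text"]} [{d["id"]}]' for d in hits)

print(answer_with_citations("What is the context window?", DOCS))
```

In a production RAG system the keyword match would be replaced by embedding-based retrieval and the stitched snippets by a model-generated answer, but the citation-tagging step, tying each claim back to a source document to reduce hallucinations, is the same idea.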
Stable Audio
In a significant leap forward for AI-powered music creation, Stability AI has announced the release of Stable Audio 2.0, a major update to its AI-based audio generation platform. The new model introduces several key advancements. Notably, Stable Audio 2.0 can now produce high-quality, full-length musical tracks up to three minutes in duration, complete with coherent structures and 44.1kHz stereo sound. This represents a major step forward from the platform's previous version, which was limited to shorter audio samples. Expanding beyond text-to-audio generation, Stable Audio 2.0 also introduces audio-to-audio capabilities, allowing users to transform uploaded audio samples through natural language prompts. This feature gives artists and musicians greater flexibility and control over the creative process, enabling them to customize and manipulate sounds to fit their specific needs.
To ensure ethical and responsible development, Stable Audio 2.0 was exclusively trained on a licensed dataset from the AudioSparx music library, honoring opt-out requests from creators and ensuring fair compensation. Additionally, the platform employs advanced content recognition technology to prevent copyright infringement.
With these advancements, Stable Audio 2.0 positions itself as a powerful tool for artists, musicians, and creators, empowering them to explore new frontiers in AI-driven music production and sound design. As the technology continues to evolve, the implications for the future of audio creation remain both exciting and transformative.
Humane Ai Pin
Humane, the hardware startup founded by former Apple executives Bethany Bongiorno and Imran Chaudhri, has unveiled its first product, the Ai Pin. Priced at $699, the device marks the company's foray into the burgeoning market of standalone AI-powered devices. The Ai Pin is Humane's vision for a voice-based, always-connected device that can help users break away from constant smartphone usage. Powered by a combination of large language models (LLMs) from various providers, the Ai Pin aims to offer a more seamless and hands-free AI experience than traditional smartphone-based interactions.
The device's $699 price tag includes the Ai Pin, an extra battery, and an AI charging case. Humane is also offering various accessories, ranging from $29 to $49, as well as a $24/month subscription service. As the generative AI landscape continues to evolve, Humane's Ai Pin represents the company's attempt to carve out a niche in the emerging market of AI-centric hardware. With its focus on voice-based interactions and hands-free functionality, the Ai Pin aims to provide users with a more natural and seamless way to harness the power of advanced AI technologies. The release of the Ai Pin comes at a time when numerous companies are exploring the potential of generative AI in various form factors, from handheld devices to novel operating systems. Humane's entry into this space will be closely watched as it seeks to differentiate itself in a rapidly growing market.
Hailo
AI chip manufacturer Hailo has announced a significant $120 million funding extension, along with the debut of its latest product, the Hailo-10 on-device generative AI accelerator. The new funding, which adds to Hailo's previous $136 million Series C round, underscores the growing demand for specialized hardware to power the next generation of AI-driven applications. The Hailo-10 is designed to bring the power of large language models (LLMs) and synthetic content generation to edge devices, without relying on cloud connectivity. The chip boasts up to 40 tera-operations per second of performance, outpacing competing integrated neural processing units, which allows it to efficiently run large language models and image generation models, such as Llama 2 and Stable Diffusion, directly on edge devices.
By bringing generative AI processing to the edge, Hailo aims to address challenges around privacy, latency, connectivity, and sustainability that are often associated with cloud-based AI services. The company believes that on-device generative AI will empower users to leverage chatbots, code assistants, and content generation tools with greater flexibility and immediacy.
Opera
In a bold move to embrace the future of AI-powered browsing, Opera has announced that its upcoming Opera One browser will include experimental support for over 150 local large language models (LLMs). This breakthrough integration marks the first time users of a major browser will have easy access to a wide range of locally stored AI models, including Llama from Meta, Vicuna, Gemma from Google, and Mixtral from Mistral AI, among others. The key advantage of using local LLMs is that users' data remains on their own devices, allowing them to leverage the power of generative AI without the need to transmit information to a remote server. This approach addresses growing concerns around privacy and data security, which have become increasingly important as AI-driven features proliferate across the digital landscape.
The new local AI functionality will be available first in the developer stream of Opera One, giving early adopters the opportunity to test and provide feedback on the feature. Users will be able to select the specific LLM they want to use for their AI-powered interactions, and the selected model will be downloaded to their machine for local processing. By embracing the power of local AI models, Opera is positioning itself as a pioneer in the browser space, offering users greater control and privacy over their AI-driven experiences. This move could inspire other major browser providers to follow suit, shaping the future of how people interact with and leverage AI capabilities within their everyday web browsing activities.
OpenAI
In a significant advancement for its large language model technology, OpenAI has announced the release of a new version of GPT-4 Turbo that integrates powerful generative AI vision features. The updated GPT-4 Turbo model marks a milestone for OpenAI, as it is the first time the company has incorporated visual understanding and generation capabilities into its premier language model. This integration allows the model to comprehend and respond to both text and image inputs through a single API call, streamlining the development of applications that leverage the power of multimodal AI.
GPT-4 Turbo with Vision offers developers a streamlined way to build apps that can handle both text and images with one API call, the company stated. The idea is to simplify developer workflows and enable the creation of more efficient applications.
The addition of vision capabilities to the GPT-4 Turbo model enables a wide range of new use cases, from visual question answering and image captioning to multimodal content creation. Developers can now build applications that can understand and generate text based on visual inputs, opening up new possibilities for AI-powered automation, productivity tools, and creative applications.
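The single-call, text-plus-image pattern described above looks roughly like the following request body in the Chat Completions style. The model name and image URL are placeholders, and this sketch only constructs the payload; actually sending it would require an HTTP client and an API key.

```python
import json

# Sketch of one multimodal request: the text prompt and the image
# reference travel together in a single message's content array.
payload = {
    "model": "gpt-4-turbo",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What is shown in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
}

print(json.dumps(payload, indent=2))
```

Because both modalities ride in one request, the application no longer needs to orchestrate separate vision and language endpoints and merge their results, which is the workflow simplification the release emphasizes.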