NeuralByte's weekly AI rundown - 11th February
Sam Altman wants $7 trillion to transform the chip industry. Bard is now Gemini. And more robots!
Greetings, fellow AI enthusiasts!
This week we have major news from Google: the company rebranded Bard as Gemini and launched Gemini Advanced, its competitor to GPT-4-powered ChatGPT. Sam Altman, the CEO of OpenAI, wants to build AI chip infrastructure and is seeking up to $7 trillion. And there's plenty more interesting news, so let's dive in!
Dear subscribers,
Thanks for reading my newsletter and supporting my work. I have more AI content to share with you soon. Everything is free for now, but if you like my work, please consider becoming a paid subscriber. This will help me create more and better content for you.
Now, let's dive into the AI rundown to keep you in the loop on the latest happenings:
✨ Google rebrands Bard as Gemini and launches new AI experiences
💵 Sam Altman’s $7 Trillion Plan to Power AI with New Chips
🤖 1X Studio: The Future of Humanoid Robotics
🏷️ Meta to Label AI-Generated Images on Its Social Media Platforms
📰 Microsoft launches collaborations with news organizations to create the future with AI
🕵️ OpenAI developing two AI agents to automate entire work processes
📜 Students Use AI to Read Ancient Scroll Burned by Vesuvius
🤖 ALOHA 2: A Better Way to Control Robots with Your Hands
🧠 Stanford Researchers Develop a Wearable Device That Lets Humans Control Robots with Their Minds
🕶️ Brilliant Labs Launches Frame, Open-Source AR Glasses with an AI Assistant
🎮 Roblox Breaks Down Language Barriers with AI Chat Translations
And more!
Google rebrands Bard as Gemini and launches new AI experiences
Google has announced that its conversational AI platform Bard will now be called Gemini, reflecting its most advanced family of models. The company also introduced Gemini Advanced, a premium service that gives access to Ultra 1.0, its largest and most capable generative AI model. Additionally, Google launched a mobile app for Gemini, allowing users to chat with the AI on the go and integrate it with other Google products.
The details:
Gemini Advanced with Ultra 1.0 can handle more complex tasks such as coding, logical reasoning, creative projects, and personalized tutoring.
It is available as part of the Google One AI Premium Plan for $19.99/month, with a two-month free trial. The plan also includes 2TB of storage and other benefits.
The Gemini mobile app is compatible with Android and iOS devices and lets users chat with the AI in over 40 languages. The app also integrates with Gmail, Maps, and YouTube, enabling users to leverage the AI for various purposes.
Gemini can also generate images from text descriptions, powered by the updated Imagen 2 model. Generated images carry digitally identifiable watermarks, and filters are applied to avoid inappropriate content.
Why it’s important:
Gemini represents Google’s ambition to democratize AI and make it accessible to everyone. By rebranding Bard as Gemini, Google signals its commitment to developing and improving its AI models and capabilities. Gemini Advanced and the mobile app offer new ways for users to collaborate with the AI, and image generation is a novel feature that can spark creativity and inspiration.
Sam Altman’s $7 Trillion Plan to Power AI with New Chips
Sam Altman, the CEO of OpenAI, is seeking to raise up to $7 trillion to build new chip factories that would provide more computing power for artificial intelligence. He has met with officials from the United Arab Emirates to pitch his partnership idea, which would involve investors, chip makers, and power providers. The cost of his plan is staggering: it exceeds the GDP of many countries and the inflation-adjusted cost of World War II to the United States. However, Altman believes that more chip production is essential for the future of AI, which he sees as the ‘holy grail’ of energy.
1X Studio: The Future of Humanoid Robotics
1X Studio is a company that develops and trains humanoid robots using neural networks and embodied learning. Their goal is to create general-purpose androids that can perform various tasks in human environments, such as healthcare, education, entertainment, and more.
The company uses a novel approach to robot autonomy: learning motor behaviors end-to-end from vision using neural networks. This means its robots do not rely on pre-programmed rules or scripts, but instead learn from their own experiences and feedback. Neural networks also let the company leverage the massive amount of data it collects from its robots and environments.
The details:
1X Studio has two types of humanoid robots: EVE and NEO. EVE is a female android that stands 1.6 meters tall and weighs 35 kilograms; NEO is a male android that stands 1.65 meters tall and weighs 30 kilograms. Both robots have 34 degrees of freedom and can lift up to 70 kilograms.
They use a combination of simulation and real-world testing to train their robots: proprietary software called 1X Vision lets them create realistic and diverse simulations, while 1X Cloud stores and analyzes the data from their robots and simulations.
1X Studio has several partnerships and collaborations with other organizations and institutions, such as Sunnaas Hospital, where they deploy EVE to assist patients and staff; Google Cloud, where they use their edge computing and AI tools to enhance their robots’ capabilities; and Lifeboat Foundation, where they share their vision and research on humanoid robotics.
Why it’s important:
Humanoid robotics is a field that has the potential to transform civilization and improve human lives. By creating androids that can learn from their own experiences and adapt to different situations, 1X Studio is pushing the boundaries of what is possible with robotics. Their robots could provide valuable services and solutions in various domains, such as healthcare, education, entertainment, and more. They could also help us understand ourselves better, as they are modeled after the human form and behavior.
Meta to Label AI-Generated Images on Its Social Media Platforms
Meta, the company formerly known as Facebook, announced that it will start labeling images created using artificial intelligence (AI) on its social media apps, including Facebook, Instagram, and Threads. The company said it is working with industry partners on common technical standards for identifying AI content, including video and audio. The labels will be applied in all languages supported by each app in the coming months.
The move comes as AI-generated content becomes more prevalent and realistic, raising concerns about misinformation, manipulation and ethics. Meta said that it wants to provide transparency and help people know when the content they are seeing has been created using AI. The company already labels photorealistic images created using its own Meta AI feature, which allows users to generate pictures with simple text prompts, as "Imagined with AI".
The details:
Meta will label images from Google, OpenAI, Microsoft, Adobe, Midjourney, and Shutterstock as they implement their plans for adding metadata to images created by their tools.
Meta will use visible markers as well as invisible watermarks and metadata embedded within image files to signal that images are AI-generated.
Meta is collaborating with other companies and organizations, such as the Adobe-led Content Authenticity Initiative and the Partnership on AI, to develop common standards for identifying AI content.
Meta’s president of global affairs, Nick Clegg, said that the company will learn from the feedback and experiences of its users and adjust its approach accordingly.
Meta’s announcement follows an executive order by U.S. President Joe Biden in October that called for digital watermarking and labeling of AI-generated content.
Why it’s important:
AI-generated content has the potential to unleash creativity and innovation, but also poses challenges and risks for society. By labeling AI-generated images, Meta aims to increase awareness and accountability among its users and creators, and to foster trust and responsibility in its platforms. Labeling AI-generated content is also a step towards addressing the ethical and legal implications of using and sharing such content, such as consent, attribution, ownership and liability. As AI technology advances and becomes more accessible, labeling AI-generated content will become more important and necessary for ensuring the integrity and authenticity of online information.
Microsoft launches collaborations with news organizations to create the future with AI
Microsoft announced several partnerships with news outlets to adopt generative AI, a technology that can create original content from data and text. The company said that in a year when billions of people will vote in democratic elections worldwide, journalism is critical to creating healthy information ecosystems, and it is Microsoft’s mission to ensure that newsrooms can innovate to serve readers this year and in the future.
The details:
Microsoft is working with Semafor, a platform that uses AI to assist journalists in their research, source discovery, translation, and more. Semafor Signals, a tool that helps journalists provide a diverse array of credible local, national, and global sources to their audience, is powered by Microsoft’s AI services.
Microsoft is also collaborating with the Craig Newmark Graduate School of Journalism at CUNY to offer a tuition-free program for experienced journalists to explore ways to incorporate generative AI into their work and newsrooms.
Microsoft is supporting the Online News Association’s AI in Journalism Initiative, which offers opportunities for journalists and newsroom leaders to navigate the evolving AI ecosystem.
Microsoft is partnering with the GroundTruth Project, which sends local journalists into newsrooms around the world through its Report for America and Report for the World programs. The AI in Local News initiative will add an AI track of work for its corps members, with the goal of helping make reporting and newsrooms more efficient and sustainable for the future.
Microsoft is backing Nota, a startup that puts high-quality AI tools into newsrooms to help improve newsroom operations. Nota will soon release a new tool called PROOF, an assistive recommendation widget that will give real-time tips to journalists and editors about how to better reach audiences with their content.
Why it’s important:
Generative AI is a game-changer for journalism, as it can augment our ability to think, reason, learn, and express ourselves. It can help journalists uncover insights amid complex data and processes, speed up their ability to express what they learn, and stimulate creative expression. It can also help newsrooms create efficient business practices and build sustainable models for the future. However, generative AI also poses ethical and social challenges, such as ensuring accuracy, transparency, accountability, and diversity. Therefore, it is vital that news organizations work with technology companies like Microsoft to adopt generative AI responsibly and effectively, and to address both the promises and perils that lie ahead.
OpenAI developing two AI agents to automate entire work processes
OpenAI, a leading research organization in artificial intelligence, is reportedly working on two types of AI agents that can automate complex tasks for users. One type of agent can take over a user’s device and perform tasks such as data transfer, form filling, or expense reporting. The other type of agent can perform web-based tasks such as data collection, travel planning, or ticket booking. These agents are part of OpenAI’s vision to create a universal, personalized, and intelligent assistant that can combine the capabilities of different GPT models and perform actions for users, rather than just responding to them.
Students Use AI to Read Ancient Scroll Burned by Vesuvius
A team of three students has won $700,000 for using artificial intelligence to decipher the first passages of a 2,000-year-old scroll that was charred by the eruption of Mount Vesuvius in 79 C.E. The scroll is one of hundreds of papyri known as the Herculaneum papyri, which were found in a villa that may have belonged to Julius Caesar’s father-in-law. The texts are considered a treasure trove of ancient knowledge, but they are too fragile to unroll by hand.
The students, Youssef Nader, Julian Schilliger, and Luke Farritor, used a combination of computer vision, machine learning, and hard work to identify over 2,000 Greek letters on the scroll, which appears to be a philosophical treatise on pleasure. They participated in the Vesuvius Challenge, a contest launched last year by researchers from the University of Kentucky and two entrepreneurs, who offered more than $1 million in prize money for reaching a series of milestones.
The details:
The Herculaneum papyri were discovered in the 18th century, but attempts to read them proved futile: Unrolling them by hand only caused them to fall apart.
The scrolls were written in carbon-based ink, which is nearly invisible in conventional CT scans because it has roughly the same density as the carbonized papyrus it sits on. The researchers used a technique called “virtual unwrapping” to create 3D models of the scrolls without opening them.
The students used a neural network to analyze the scans and detect the ink traces on the papyrus, then a second neural network to recognize the letters and words on the scroll (a simplified sketch of the ink-detection idea follows this list).
The students decoded four passages, each of which contained at least 140 characters. They also provided a translation and a commentary for each passage.
The scroll seems to belong to the Epicurean school of philosophy, which advocated for a life of moderation and pleasure. The author of the scroll may have been Philodemus, a prominent Epicurean philosopher and poet.
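To make the ink-detection step concrete, here is a deliberately simplified sketch rather than the winning teams’ actual code: small patches of scan data are fed to a binary “ink vs. no ink” classifier. The patch size, input depth, and architecture below are illustrative assumptions.

```python
# Deliberately simplified sketch of the ink-detection idea (not the
# Vesuvius Challenge winners' code): treat each small patch of scan
# data as a binary classification problem, "ink" vs. "no ink".
import torch
import torch.nn as nn

class InkDetector(nn.Module):
    def __init__(self, depth: int = 16):
        # `depth` = number of scan layers stacked as input channels,
        # since the ink sits on a 3D surface inside the scanned volume
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(depth, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 1),  # one logit: is the patch center inked?
        )

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        return self.net(patch)

# Synthetic demo: a batch of 8 patches, 16 scan layers deep, 32x32 pixels
model = InkDetector()
patches = torch.randn(8, 16, 32, 32)
ink_prob = torch.sigmoid(model(patches))
print(ink_prob.shape)  # torch.Size([8, 1])
```

Trained on patches where ink locations are known, such a classifier can then be run across a whole virtually unwrapped surface to paint the letters back in.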
Why it’s important:
The students’ achievement is a historic breakthrough for the field of classical studies, as it opens the possibility of reading the rest of the Herculaneum papyri, which could reveal new insights into the ancient world. The scrolls are the only intact library known from the classical period, and they contain works by famous authors such as Epicurus, Lucretius, and Virgil. The students’ method could also be applied to other damaged or unreadable manuscripts, such as the Dead Sea Scrolls or the Voynich Manuscript. The use of AI to decipher ancient texts demonstrates the power and potential of this technology for advancing human knowledge.
ALOHA 2: A Better Way to Control Robots with Your Hands
Researchers from Google DeepMind have developed ALOHA 2, an improved version of their low-cost hardware for bimanual teleoperation. ALOHA 2 allows users to control robots with their hands, using grippers that are more powerful, ergonomic, and durable. The hardware is also compatible with MuJoCo, a physics simulator that can emulate complex tasks. ALOHA 2 is open source and can enable large-scale data collection for robot learning.
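For readers who haven’t used MuJoCo, here is a minimal sketch of its Python bindings: it loads a trivial hand-written scene (not the ALOHA 2 model itself) and steps the physics forward.

```python
# Minimal MuJoCo sketch: load a tiny scene and simulate a falling box.
import mujoco  # pip install mujoco

XML = """
<mujoco>
  <worldbody>
    <body name="box" pos="0 0 1">
      <freejoint/>
      <geom type="box" size=".1 .1 .1"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)
for _ in range(1000):            # ~2 seconds at the default 2 ms timestep
    mujoco.mj_step(model, data)
print(data.body("box").xpos)     # the box has fallen under gravity
```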
Stanford Researchers Develop a Wearable Device That Lets Humans Control Robots with Their Minds
Imagine being able to command a robot to perform a range of everyday tasks, such as cleaning, cooking, or playing games, just by thinking about them. This is the vision of Stanford researchers who have developed a wearable device that can read brain waves and translate them into robotic actions. The device, called NOIR (Neural Signal Operated Intelligent Robots), is a non-invasive electronic cap that measures the brain’s electrical activity (EEG) and uses machine learning to identify which objects the wearer is paying attention to and how they want to interact with them. The researchers demonstrated NOIR’s capabilities by having human participants control robots to move objects, play tic-tac-toe, pet a robot dog, and even cook a simple meal using only their brain signals.
NOIR is based on two neuroscience techniques: steady-state visually evoked potential (SSVEP) and motor imagery. SSVEP is a method of detecting which object the wearer is focusing on by attaching a flickering mask to each object on a computer screen and measuring the corresponding brain responses. Motor imagery is a method of detecting the intended action by asking the wearer to imagine performing the action and analyzing the brain signals associated with it. By combining these two techniques, NOIR can infer the wearer’s intention and send commands to the robot accordingly.
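To make the SSVEP half of this concrete, here is a toy sketch, not NOIR’s actual pipeline: each selectable object flickers at a known frequency, and the attended object is the one whose flicker frequency carries the most power in the EEG spectrum. The sampling rate, object names, and frequencies below are illustrative assumptions.

```python
# Toy illustration of SSVEP decoding (not NOIR's real pipeline):
# the object the user attends to produces the strongest oscillation
# at its own flicker frequency in the recorded EEG.
import numpy as np

FS = 250.0  # sampling rate in Hz (typical for EEG caps)
CANDIDATES = {"mug": 7.5, "sponge": 8.57, "knife": 10.0}  # object -> flicker Hz

def decode_attended_object(eeg: np.ndarray) -> str:
    """Return the candidate whose flicker frequency has the most power."""
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / FS)
    power = np.abs(np.fft.rfft(eeg)) ** 2
    scores = {}
    for obj, f in CANDIDATES.items():
        band = (freqs > f - 0.2) & (freqs < f + 0.2)  # narrow band around f
        scores[obj] = power[band].sum()
    return max(scores, key=scores.get)

# Synthetic demo: 4 seconds of noise plus an 8.57 Hz response ("sponge")
t = np.arange(0, 4, 1.0 / FS)
eeg = 0.5 * np.sin(2 * np.pi * 8.57 * t) + np.random.randn(len(t))
print(decode_attended_object(eeg))  # almost always prints "sponge"
```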
The details:
NOIR is a wearable device that allows humans to control robots with their brain waves, without any invasive surgery or training.
It uses machine learning to identify what objects the wearer is paying attention to and how they want to interact with them, based on two neuroscience techniques: SSVEP and motor imagery.
NOIR can enable humans to direct robots to perform a range of everyday tasks, such as moving objects, cleaning countertops, playing tic-tac-toe, petting a robot dog, and cooking a simple meal.
NOIR was developed by researchers at Stanford University’s Vision and Learning Lab, led by Professor Silvio Savarese, and co-authored by postdoctoral researchers Ruohan Zhang and Zhiyuan Li and graduate students Yufei Ye and Hang Zhao.
Why it’s important:
NOIR is an innovative and accessible way of using brain-computer interfaces to enhance human-robot interaction. It has the potential to improve the quality of life for people who need assistance with daily activities, such as the elderly or the disabled, by allowing them to communicate their needs and preferences to robots more naturally and intuitively. It also opens up new possibilities for entertainment, education, and creativity, by enabling humans to explore and manipulate the physical world with their minds. NOIR is a step towards achieving a seamless and harmonious collaboration between humans and robots, where both can benefit from each other’s strengths and abilities.
Brilliant Labs Launches Frame, Open-Source AR Glasses with an AI Assistant
Brilliant Labs, a Singapore-based startup, has introduced Frame, a new pair of augmented reality glasses that look like normal eyeglasses. Frame is powered by an AI assistant called Noa, which can perform tasks such as visual search, real-time translation, text summarization, and more.
Frame weighs only 39 grams and has a high-resolution display with 3,000 nits of brightness. It features thick round frames made of nylon plastic, inspired by iconic figures like John Lennon, Steve Jobs, and Gandhi. Frame comes in three color variants: Smokey Black, Matte Cool Gray, and H2O (see-through).
Frame’s AI assistant, Noa, uses cutting-edge AI models such as Perplexity AI, Stable Diffusion by Stability AI, and GPT-4 by OpenAI, plus the speech recognition prowess of Whisper. Noa can process visual information, generate images, translate languages, and much more, all projected onto the lenses of Frame.
The details:
Frame requires a companion mobile app to work, but Brilliant Labs plans to embed lightweight ML models directly into Frame in the future.
Its integrated camera and sensors enable real-time visual analysis for tasks like checking product prices in stores or identifying details about homes when house hunting.
Frame does not store any files on device or allow sharing to social media, unlike other AR devices such as Meta’s Smart Glasses or Snap’s Spectacles.
It is fully open source: all the design files, code, and documentation are available on GitHub, so anyone can modify the code and run what they want on their device.
Frame has attracted the attention and investment of prominent names in the tech industry, such as Brendan Iribe, co-founder of Oculus; Adam Cheyer, co-founder of Siri; and John Hanke, CEO of Niantic Labs and creator of Pokémon GO.
Why it’s important:
Frame is a groundbreaking product that combines the power of AI and AR in a sleek and accessible device. Frame aims to enhance rather than intrude upon our lives, by providing useful and personalized information and experiences. Frame also empowers users to customize and hack their device, opening up new possibilities for innovation and creativity. Frame is not just a product, but a new paradigm of daily living.
Roblox Breaks Down Language Barriers with AI Chat Translations
Roblox, a popular online platform for immersive 3D experiences, has introduced a new feature that lets users communicate seamlessly with each other across languages. Using a custom multilingual translation model, Roblox automatically translates text chat messages between any combination of the 16 languages it supports, in real time and with low latency. This enhances the social side of Roblox and lets users connect with more people around the world. The translation model builds on the latest advances in natural language processing and machine learning and is optimized for the specific language and slang used on the platform. Roblox plans to further improve the accuracy and scope of its translations, and even explore voice chat translation in the future.
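Roblox’s production model is custom and proprietary, but the general “one network, any language pair” idea can be sketched with a public many-to-many model from Hugging Face; the checkpoint and example below are illustrative, not what Roblox runs.

```python
# Sketch of many-to-many chat translation with a single model,
# using the public facebook/m2m100_418M checkpoint as a stand-in.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

def translate(text: str, src: str, tgt: str) -> str:
    tokenizer.src_lang = src  # tell the tokenizer the source language
    encoded = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(tgt),  # force the target language
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

print(translate("gg, nice build!", src="en", tgt="es"))
```

A production chat system would add the latency optimizations and slang-specific fine-tuning Roblox describes, but the single-model, direct-between-pairs shape is the same.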
Quick news
The Midjourney Alpha website is now open to users with more than 1,000 generations (link)
Build your own personal Assistant in Hugging Face Chat (link)
Microsoft aired a Super Bowl ad, “Copilot for everyone” (link)
AI-generated Spheres for Apple Vision Pro from Outtakes (link)
Google’s Lookout App has introduced an innovative feature named ‘Image Question and Answer’ (link)
Apple released an open-source AI model for image editing (link)
A new paper introduces MedSAM, a universal medical image segmentation model (link)
For daily news from the AI and tech world, follow me on social media.
Be better with AI
In this section, we provide comprehensive tutorials, practical tips, ingenious tricks, and insightful strategies for effectively using a diverse range of AI tools.
How to run a Copilot-style coding assistant completely free on your computer:
Download and install LM Studio from lmstudio.ai
Find and download the OpenHermes-2.5-Mistral-7B model
Add the CodeGPT extension to VS Code
Change the provider to LM Studio and start LM Studio's local server
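If you want to talk to the local model outside of VS Code, LM Studio’s local server speaks the OpenAI chat API. Here is a minimal Python sketch, assuming the server is running on its default port (1234) with the model loaded; the model name is a placeholder, since LM Studio routes requests to whatever model is currently loaded.

```python
# Minimal sketch: query a model served locally by LM Studio.
# Assumes the local server is running (default: http://localhost:1234/v1)
# with OpenHermes-2.5-Mistral-7B loaded.
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's OpenAI-compatible endpoint
    api_key="lm-studio",                  # any non-empty string; the local server ignores it
)

response = client.chat.completions.create(
    model="local-model",  # placeholder; the loaded model answers regardless
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```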
Tools
📅 Gravity - Your data never leaves the machine. (link)
🎥 Kuasar Video - An AI-powered video marketing tool (link)
✉️ Letterly - Convert your thoughts into words without using the keyboard (link)
🗨️ Coze - Coze is a platform for developing advanced AI chatbots (link)
📼 HeyGen GPT - Turn any text into videos (link)
We hope you enjoy this newsletter!
Please feel free to share it with your friends and colleagues and follow me on socials.