NeuralByte's weekly AI rundown - 23rd December
Bill Gates, the Emergence of Visual and Audio Generative Models, and More - Explore the Latest Trends in Technology!
Greetings fellow AI enthusiasts!
Christmas is coming, but there was also a lot going on in the AI world. Join us in celebrating the convergence of technology and holiday spirit.
In this week’s AI news, amazing things are happening in AI art, video creation, music generation, 3D animation, human video synthesis, video editing, AI chatbots, 3D avatars, and AI marketing. You don’t want to miss the predictions from Bill Gates or the stunning creations from Midjourney V6, VideoPoet, Microsoft Copilot, Align Your Gaussians, DreaMoving, and Fairy, plus Apple’s latest papers. These are some of the most exciting and innovative developments in AI right now. Subscribe now and stay updated with the latest and greatest in AI!
👓 Bill Gates predicts how AI will transform healthcare, education, and work in 2023
🖼️ Midjourney V6: The latest AI art generator can create realistic text and images
📹 Google Introduces VideoPoet, a Powerful AI Tool for Video Creation
🤖 Microsoft Copilot can now create songs with Suno integration
🧠 Align Your Gaussians: Nvidia’s AI System Generating 3D Animations From Text
💃 Alibaba’s DreaMoving: A Video Generation Framework based on Diffusion Models
⚙️ Fairy: A Revolutionary AI Tool for Video Editing and Synthesis
🍎 Apple’s AI chatbot and 3D avatar papers hint at future features
🧑‍💼 AI Marketing Jobs Are Still Rare Despite the Hype
And more!
Bill Gates predicts how AI will transform healthcare, education, and work in 2023
Bill Gates, the Microsoft cofounder and philanthropist, has shared his insights on how artificial intelligence (AI) will shape the world in the near future. In a blog post, he highlighted the potential impact of AI on healthcare, education, and work, and praised the technology as a “revolutionary” force for innovation.
Gates said that AI development was about to “supercharge” the innovation pipeline and bring new solutions to some of the world’s biggest challenges. He also suggested that AI adoption was imminent in high-income countries and would soon follow in low- and middle-income countries.
The details:
In healthcare, Gates mentioned several ambitious projects that use AI to tackle problems such as antibiotic resistance, high-risk pregnancies, and HIV risk assessment. He said these projects were setting the stage for a “massive technology boom later this decade.”
In education, Gates described some of the “mind-blowing” AI tools that are being piloted today, such as Khanmigo and MATHia, which can personalize learning and engage students. He said he was excited about the possibility of localizing these tools to different cultural contexts.
In work, Gates said that AI was becoming more of a “copilot” than a replacement for human workers. He said that AI could help people perform tasks faster, better, and more creatively, but also acknowledged that old habits are hard to break at work.
Why it’s important:
AI is one of the most powerful and disruptive technologies of our time. It has the potential to improve lives, solve problems, and create new opportunities. However, it also poses ethical, social, and economic challenges that need to be addressed. As Gates wrote, “We need to make sure that AI serves humanity, and not the other way around.”
Midjourney V6: The latest AI art generator can create realistic text and images
Midjourney, the popular AI art generator, has released version 6, which brings significant improvements in image quality and text generation. Users can now create realistic and detailed images by typing in text descriptions, and even have the model generate legible text within images. The new version also requires users to relearn how to prompt the model, as the old tricks no longer work. Midjourney V6 is still in alpha testing and will undergo further changes before the full release.
The details:
You can enable V6 by adding the --v 6 parameter to your prompt. For example,
/imagine prompt: a dragon --v 6
Midjourney V6 produces drastically improved, more realistic, and highly detailed images, and it can render legible text within images, something that had eluded Midjourney since its release in 2022.
Midjourney V6 uses an entirely new prompting method, which requires users to be explicit about what they want and to add --style raw for more photographic results. The old prompting tricks, such as including camera names, film stock, and resolution, no longer work.
Midjourney V6 is still an alpha test, which means it is missing some features from V5.2, such as pan and zoom, and will change frequently and without notice. The founder and leader of the Midjourney project, David Holz, said V6 is not the final step, but a progression of something profound.
Midjourney V6 has received positive feedback from some power users, who have posted incredibly vivid and richly detailed results on social media. The model can generate realistic skin, lighting, and reflection details, as well as text in various fonts and styles.
Why it’s important:
Midjourney V6 is a major update for the AI art community, as it shows the potential and creativity of AI image generation. Midjourney is considered by many to be the preeminent and highest quality AI art generator, and it continues to innovate and challenge its competitors. Midjourney V6 also opens up new possibilities for users to create and express themselves with AI, as they can now generate realistic text and images with simple prompts. However, Midjourney V6 also poses some ethical and legal challenges, as it may infringe on the rights and works of other artists and creators.
Google Introduces VideoPoet, a Powerful AI Tool for Video Creation
Researchers from Google have introduced VideoPoet, a large language model that can generate videos from text, images, audio, and other inputs. The model can also perform tasks such as video stylization, inpainting, outpainting, and audio synthesis.
VideoPoet is based on the idea of using language models for video generation, which has been shown to be effective for various modalities such as language, code, and audio. The model uses multiple tokenizers to convert video, image, audio, and text into discrete tokens, which are then fed into an autoregressive language model. The model can then generate tokens for the desired output modality, which can be decoded back into a viewable representation.
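As a rough illustration of the token-stream idea described above, the toy Python sketch below maps each modality into its own range of a shared vocabulary and then extends the sequence one token at a time. Everything in it is hypothetical: the vocabulary layout, the function names, and the stand-in `toy_next_token` rule are assumptions made for illustration, not VideoPoet's actual tokenizers or model.

```python
from typing import Callable, Dict, List

# Assumed layout: each modality owns a disjoint slice of one shared vocabulary.
SLICE_SIZE = 1000
VOCAB_OFFSETS: Dict[str, int] = {"text": 0, "image": 1000, "video": 2000, "audio": 3000}


def to_global_ids(modality: str, local_ids: List[int]) -> List[int]:
    """Shift a tokenizer's local ids into its modality's slice."""
    offset = VOCAB_OFFSETS[modality]
    if any(not 0 <= i < SLICE_SIZE for i in local_ids):
        raise ValueError("local id outside the modality's slice")
    return [offset + i for i in local_ids]


def modality_of(global_id: int) -> str:
    """Recover which modality a shared-vocabulary id belongs to."""
    for name, offset in VOCAB_OFFSETS.items():
        if offset <= global_id < offset + SLICE_SIZE:
            return name
    raise ValueError(f"unknown token id: {global_id}")


def generate(prefix: List[int], next_token: Callable[[List[int]], int], steps: int) -> List[int]:
    """Autoregressive loop: each new token is predicted from everything before it."""
    sequence = list(prefix)
    for _ in range(steps):
        sequence.append(next_token(sequence))
    return sequence[len(prefix):]


def toy_next_token(sequence: List[int]) -> int:
    """Stand-in for a trained model: always emits a video token."""
    return VOCAB_OFFSETS["video"] + len(sequence) % SLICE_SIZE


# Conditioning prefix: two text tokens followed by one image token.
prefix = to_global_ids("text", [5, 7]) + to_global_ids("image", [3])
video_tokens = generate(prefix, toy_next_token, steps=3)
print(prefix, video_tokens, [modality_of(t) for t in video_tokens])
```

In a real system, the generated video tokens would then go through the video tokenizer's decoder to become viewable frames.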
The details:
VideoPoet can handle a wide variety of video generation tasks, such as text-to-video, image-to-video, video-to-audio, video stylization, video inpainting, and video outpainting.
VideoPoet can generate videos with coherent large motions and interesting styles, guided by text prompts or other inputs.
VideoPoet can also generate audio from video clips without any text guidance, enabling video and audio synthesis from a single model.
VideoPoet can produce videos in portrait orientation to suit short-form content, and can also control camera movements by appending text suffixes to the prompts.
VideoPoet outperforms existing video generation models in terms of text fidelity and motion interestingness, according to user preference ratings.
Why it’s important:
VideoPoet demonstrates the potential of language models for video generation, a challenging and creative domain that requires multimodal understanding and generation. The model showcases the ability to produce high-quality videos with diverse and dynamic content, as well as to edit and manipulate existing videos with fine-grained control. VideoPoet also opens up new possibilities for “any-to-any” generation, where any modality can be converted to any other modality using a single model.
Microsoft Copilot can now create songs with Suno integration
Microsoft Copilot, the AI-powered chatbot from Microsoft, has a new feature that allows users to create songs with the help of Suno, a generative AI music app. Users can simply enter prompts like “Create a pop song about adventures with your family” and have Suno generate complete songs, including lyrics, instrumentals, and singing voices.
The details:
Suno is a generative AI music app that can produce songs from a single sentence prompt, using advanced AI models to compose original music in various genres and styles.
Microsoft Copilot users can access the Suno integration by launching Microsoft Edge, opening the Copilot website, logging in with their Microsoft account, and enabling the Suno plugin.
The partnership between Microsoft and Suno aims to make music creation more accessible and fun for everyone, regardless of their musical skills or background.
Why it’s important:
AI-driven music creation is a growing field that has the potential to revolutionize the music industry and empower artists and enthusiasts alike. By integrating Suno with Microsoft Copilot, Microsoft is demonstrating its commitment to innovation and creativity, as well as providing a new way for users to express themselves and enjoy music.
Align Your Gaussians: Nvidia’s AI System Generating 3D Animations From Text
A team of researchers from Nvidia, the University of Toronto, and MIT has developed a novel AI system that can create 3D animations from text descriptions. The system, called Align Your Gaussians (AYG), combines different AI models to produce animations with vivid motion, realistic textures, and geometric consistency.
The details:
AYG represents 3D shapes as collections of 3D Gaussian functions and models their motion using deformation fields that define how the Gaussians move over time.
It leverages the Stable Diffusion text-to-image model, a text-to-video model, and a multi-view 3D model to generate animations from textual input such as "a horse galloping across a meadow".
AYG can also generalize to some new concepts that were not seen during training, and it can extend and link animations over longer time scales than existing text-to-video models.
AYG also allows multiple animated objects to be combined in a single scene, such as dogs around a campfire.
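To make the "Gaussians plus deformation field" representation from the first bullet more concrete, here is a minimal Python sketch. It is an assumption-laden toy: the `Gaussian3D` fields, the `positions_at` helper, and the hand-written `drift_along_x` field are all invented for illustration and do not reproduce AYG's actual parameterization or its deformation model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Gaussian3D:
    mean: Vec3      # center of the Gaussian at rest (t = 0)
    scale: float    # isotropic extent, simplified for this sketch
    opacity: float


# A deformation field maps (rest position, time) to a displacement vector.
DeformationField = Callable[[Vec3, float], Vec3]


def drift_along_x(position: Vec3, t: float) -> Vec3:
    """Hand-written stand-in field: every point drifts along +x at 0.5 units per second."""
    return (0.5 * t, 0.0, 0.0)


def positions_at(gaussians: List[Gaussian3D], field: DeformationField, t: float) -> List[Vec3]:
    """Move each Gaussian's center by the field's displacement at time t."""
    moved = []
    for g in gaussians:
        dx, dy, dz = field(g.mean, t)
        x, y, z = g.mean
        moved.append((x + dx, y + dy, z + dz))
    return moved


# A "shape" made of two Gaussians, evaluated two seconds into the animation.
shape = [Gaussian3D((0.0, 0.0, 0.0), 0.1, 1.0), Gaussian3D((1.0, 2.0, 3.0), 0.2, 0.8)]
frame = positions_at(shape, drift_along_x, t=2.0)
print(frame)
```

Evaluating `positions_at` at a series of time steps yields one set of Gaussian centers per frame, which is the sense in which a single static shape plus a time-dependent field describes an animation.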
Why it’s important:
AYG is a breakthrough in the field of text-to-animation, as it can generate 3D animations that are both realistic and consistent from different angles. AYG could have potential applications in creative tools and the generation of synthetic data, which is often used for training AI models in domains such as autonomous driving. AYG demonstrates the power of combining different AI models to achieve novel and complex tasks.
Alibaba’s DreaMoving: A Human Video Generation Framework Based on Diffusion Models
Researchers at Alibaba have developed a new framework for generating realistic human videos from target identity and posture sequences. The framework, called DreaMoving, uses a diffusion-based model to produce high-quality and high-fidelity videos with various controls.
The details:
DreaMoving consists of two components: a Video ControlNet and a Content Guider. The Video ControlNet controls the motion and appearance of the generated video, while the Content Guider preserves the identity and style of the target person.
DreaMoving can generate videos of the target person dancing anywhere, driven by the posture sequences. The posture sequences can be extracted from existing videos or manually edited by the user.
DreaMoving can also control the video appearance based on text prompts and reference images. For example, the user can specify the background, clothing, hair, and facial expression of the target person.
DreaMoving demonstrates superior performance and generalization ability compared to existing methods. It can generate videos with diverse identities, motions, and styles, as well as handle unseen domains and challenging scenarios.
Why it’s important:
DreaMoving is a novel and powerful framework for controllable video generation. It enables users to create customized human videos with ease and flexibility. It also opens up new possibilities for applications such as video editing, animation, entertainment, and education.
Fairy: A Revolutionary AI Tool for Video Editing and Synthesis
Fairy is a new AI technology that offers unparalleled speed and quality in video-to-video synthesis. It is based on a minimalist adaptation of image-editing diffusion models, enhanced with a cross-frame attention mechanism and a data augmentation strategy. Fairy can generate high-fidelity and temporally coherent videos from simple text instructions, opening up new possibilities for digital content creation.
The details:
Fairy uses key anchor frames to extract and propagate diffusion features across video frames, ensuring a smooth and consistent video output.
Fairy can process all frames from the source video without downsampling or frame interpolation, preserving the original aspect and quality of the video.
Fairy can generate a 27-second video in just over 71 seconds using 6 A100 GPUs, demonstrating its remarkable efficiency and speed in video generation.
Fairy is backed by a large-scale user study involving 1000 video-instruction samples, the largest of its kind in the video-to-video generation literature. The study shows that Fairy outperforms existing methods in quality and functionality.
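For a sense of scale, the speed figures quoted above can be turned into per-second costs. The short Python calculation below uses only the numbers from the text and treats "just over 71 seconds" as exactly 71, so the results are approximate.

```python
video_seconds = 27   # length of the generated video, from the text
wall_seconds = 71    # "just over 71 seconds", treated here as 71
num_gpus = 6         # A100 GPUs used

# Wall-clock time spent per second of output video.
wall_per_video_second = wall_seconds / video_seconds

# Total GPU time (summed over all GPUs) per second of output video.
gpu_seconds_per_video_second = wall_seconds * num_gpus / video_seconds

print(f"{wall_per_video_second:.2f} s of wall-clock time per second of video")
print(f"{gpu_seconds_per_video_second:.2f} GPU-seconds per second of video")
```

In other words, each second of output takes roughly 2.6 seconds of wall-clock time, or about 16 GPU-seconds once all six GPUs are counted.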
Why it’s important:
Fairy represents a monumental stride in the realm of video editing and synthesis. Its fast, parallelized, and instruction-guided approach makes it a pioneering tool in the industry, capable of transforming how professionals and amateurs alike create and manipulate video content. The future of video editing looks bright, and it’s illuminated by the innovative brilliance of Fairy.
Apple’s AI chatbot and 3D avatar papers hint at future features
Apple is working on some exciting technologies that could enhance its products and services in the near future. One is an AI chatbot that can run on iPhones with limited memory, using a clever method to minimize data transfer from flash storage to RAM. Another is a way to generate realistic 3D avatars from standard video footage, which could enable applications like virtual clothes fitting or Vision Pro. These are some of the findings from two recent research papers published by Apple’s AI team, which show the company’s ambition and innovation in the field of artificial intelligence.
AI Marketing Jobs Are Still Rare Despite the Hype
Artificial intelligence has been a hot topic in the marketing industry, with many predictions and innovations based on this technology. However, most big brands have not yet created specific roles or teams to oversee their AI strategy, leaving the responsibility to existing marketers or external agencies. Some exceptions include Coca-Cola, which has an AI director, and Unilever, which has an AI hub. Experts say that AI marketing jobs will become more common in the future, as the technology matures and the demand for AI skills increases.
Axel Springer and OpenAI partner to boost journalism with AI
Axel Springer, a leading media and technology company, has announced a global partnership with OpenAI, a research organization dedicated to creating artificial intelligence (AI) that can benefit humanity. The partnership aims to strengthen independent journalism in the age of AI by enriching users’ experience with ChatGPT, a conversational AI system powered by OpenAI’s technology. ChatGPT users will receive summaries of selected global news content from Axel Springer’s media brands, including POLITICO, BUSINESS INSIDER, and BILD. The partnership also supports Axel Springer’s existing AI-driven ventures and values the publisher’s role in contributing to OpenAI’s products. This marks a significant step in both companies’ commitment to leverage AI for enhancing content experiences and creating new financial opportunities that support a sustainable future for journalism.
Quick news
Google to curb election queries with AI in 2024 (link)
Anthropic vs OpenAI: The $5 billion race to build the next frontier of AI (link)
The Dictionary.com Word of the Year is hallucinate. (link)
AI machine cannot be called an inventor, rules UK court (link)
A Christmas menu dreamed up by a robot (link)
Optimizing Food Use With Machine Learning Generated Recipes (link)
Be better with AI
In this section, we will provide you with comprehensive tutorials, practical tips, ingenious tricks, and insightful strategies for effectively employing a diverse range of AI tools.
Cool marble statue prompt for DALL-E 3
Use this exact prompt, replacing [PROMPT] with your subject:
A photorealistic image of an ultra-detailed sculpture of a [PROMPT] made of shining white creamy marble. The sculpture should display smooth and reflective marble surface, emphasizing its luster and artistic craftsmanship. The design is elegant, highlighting the beauty and depth of marble. The lighting in the image should enhance the sculpture's contours and textures, creating a visually stunning and mesmerizing effect
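If you want to reuse the prompt for many subjects, a small helper can fill in the placeholder for you. The Python sketch below is just one way to do it; the `build_prompt` function and the example subject are made up for illustration, and the resulting text still has to be pasted into DALL-E 3 yourself.

```python
PLACEHOLDER = "[PROMPT]"

# The prompt from above, with the placeholder left in place.
TEMPLATE = (
    "A photorealistic image of an ultra-detailed sculpture of a [PROMPT] made of "
    "shining white creamy marble. The sculpture should display smooth and "
    "reflective marble surface, emphasizing its luster and artistic craftsmanship. "
    "The design is elegant, highlighting the beauty and depth of marble. "
    "The lighting in the image should enhance the sculpture's contours and "
    "textures, creating a visually stunning and mesmerizing effect"
)


def build_prompt(subject: str) -> str:
    """Return the marble-statue prompt with the placeholder replaced by the subject."""
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    return TEMPLATE.replace(PLACEHOLDER, subject)


prompt = build_prompt("sleeping fox")
print(prompt)
```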
Tools
📑 Shader App - Effortless video concept and storyboarding (link)
🔗 Rex.fit - Meet your AI nutrition and fitness coach. (link)
⚒️ ChatGPT unwrapped - Review your year with ChatGPT (link)
🧑‍🦰 Wysper.ai - Converting audio into written content with one click. (link)
📱 Findly.ai - Supercharge Your Google Analytics 4 Growth (link)
🧠 Mental - Train your mind to achieve peak performance (link)
We hope you enjoy this newsletter!
Please feel free to share it with your friends and colleagues.
Thank you for reading!