Multimodal generative AI

Contributor(s):

Material type: Text

Publication details: Singapore: Springer, 2025

Description: xxii, 382 p.

ISBN: 9789819623549

Subject(s):

DDC classification: 006.31 SIN

Summary: This book stands at the forefront of AI research, offering a comprehensive examination of multimodal generative technologies. Readers are taken on a journey through the evolution of generative models, from early neural networks to contemporary marvels like GANs and VAEs, and their transformative application in synthesizing realistic images and videos. In parallel, the text delves into the intricacies of language models, with a particular focus on revolutionary transformer-based designs. A core highlight of this work is its detailed discourse on integrating visual and textual models, laying out state-of-the-art techniques for creating cohesive, multimodal AI systems. “Multimodal Generative AI” is more than a mere academic text; it’s a visionary piece that speculates on the future of AI, weaving through case studies in autonomous systems, content creation, and human-computer interaction. The book also fosters a dialogue on responsible innovation in this dynamic field. Tailored for postgraduates, researchers, and professionals, this book is a must-read for anyone invested in the future of AI. It empowers its readers with the knowledge to harness the potential of multimodal systems in solving complex problems, merging visual understanding with linguistic prowess. This book can be used as a reference for postgraduates and researchers in related areas. (https://link.springer.com/book/10.1007/978-981-96-2355-6)


Holdings
Item type	Current library	Collection	Call number	Copy number	Status	Date due	Barcode
Book	Indian Institute of Management LRC General Stacks	IT & Decisions Sciences	006.31 SIN	1	Available		009104


Table of contents:
Front Matter
Pages i-xxii
Introduction to Multimodal Generative AI
R. Brindha, R. K. Pongiannan, A. Bharath, V. K. S. M. Sanjeevi
Pages 1-36
ChatGPT and BERT: Comparative Analysis of Various Natural Language Processing Applications
Saranya M, Amutha B
Pages 37-61
Large Language Model on Multi-Modal Data
Avi Aneja, Anuradha Dhull, Akansha Singh, Krishna Kant Singh
Pages 63-78
Adaptive Learning Technologies: Navigating the Road from Hype to Reality
S. Valai Ganesh, M. Gomathy Nayagam, V. Suresh, S. Rajakarunakaran, B. Bensujin
Pages 79-113
Generative Artificial Intelligence in Visual Content: A Review of the Influence on Consumer Perception and Perspective
Akanksha Singh, Gulshan Kumar, Akashdeep Dhariwal
Pages 115-132
Text-to-Image Synthesis: Techniques and Applications
Akansha Singh, Krishna Kant Singh
Pages 133-155
Image-to-Text Generation: Bridging Visual and Linguistic Worlds
Akansha Singh, Krishna Kant Singh
Pages 157-175
Sustainability in the Metaverse: Challenges, Implications, and Potential Solutions
Poornima Jirli, Anuja Shukla
Pages 177-200
Transcendent Artificial Intelligence in Education
Yashwant A. Waykar, Sucheta S. Yambal
Pages 201-232
ChatGPT in Academia and Research: A Comprehensive Review of Integrating AI in Higher Education
Aashka Thakkar, Andinet Asmelash Fentaw, Habtamu Ditta Hirpo
Pages 233-251
Exploring Multi-modal Hate Speech Detection Using Machine Learning and Deep Learning Models
Shefali Khera, Anuradha, Akansha Singh, Krishna Kant Singh
Pages 253-270
Multi-modal Generative AI for People with Disabilities
N. R. Raji, C. L. Biji, V. Vineetha
Pages 271-296
Single Modality to Multi-modality: The Evolutionary Trajectory of Artificial Intelligence in Integrating Diverse Data Streams for Enhanced Cognitive Capabilities
Hardeep Kaur, C. Kishor Kumar Reddy, D. Manoj Kumar Reddy, Marlia Mohad Hanafiah
Pages 297-322
Interfacing Multi-modal AI with IoT: Unlocking New Frontiers
S. Delsi Robinsha, B. Amutha
Pages 323-346
Enhancing Safety and Reliability in Vanets for Autonomous Vehicles by M-XAI (Multi-modal Explainable-AI)
Umesh Gupta, Ayushman Pranav, Ankit Dubey, Rajesh Kumar Modi, Akansha Singh
Pages 347-371
Future Directions in Multimodal Generative AI
Akansha Singh, Krishna Kant Singh
Pages 373-382

