The Future of Deep Learning: From Texts to Images and Hierarchies

October 9, 2023

Yann LeCun, Chief AI Scientist at Meta AI Research and Silver Professor at the Courant Institute of Mathematical Sciences at New York University, is one of the foremost minds and pioneers in the field of deep learning. Last Friday, at the Bavarian Academy of Sciences in Munich, he gave a fascinating insight into the future of this technology.

His inspiring talk traced the evolution from early deep learning to today's large language models (LLMs), such as ChatGPT, and highlighted the challenges and opportunities that await us in the coming years.

LeCun reduces the limitations of today's LLMs, such as ChatGPT, to two main challenges. First, there is not enough text data to train ever-larger LLMs, and text conveys only limited "world knowledge" in the sense of physics. Second, LLMs are based on autoregressive prediction of the next word, where each predicted word is fed back as input for the next prediction, and they have no planning component. LeCun therefore sees today's LLMs as an impressive breakthrough, but one that is structurally limited.
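The word-by-word mechanism can be illustrated with a minimal sketch. The table `BIGRAMS` and the function `generate` are hypothetical and exist only for this illustration; a real LLM conditions on the whole preceding context with a neural network rather than on a lookup of the last word. The point is only that each step picks the next word and moves on, with no plan for the sentence as a whole.

```python
# Toy next-word table with made-up probabilities (illustration only).
BIGRAMS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 1.0},
}


def generate(start, max_words=5):
    """Greedy word-by-word generation: each output becomes the next input."""
    words = [start]
    for _ in range(max_words - 1):
        candidates = BIGRAMS.get(words[-1])
        if not candidates:
            break  # no known continuation for the last word
        # Pick the single most likely next word; nothing looks further ahead.
        words.append(max(candidates, key=candidates.get))
    return words


print(" ".join(generate("the")))  # prints "the cat sat down"
```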

Extension of LLMs to image data

LeCun proposes training large networks, similar to LLMs, on image data: regions of an image are removed, and the network learns to fill them back in.

This approach addresses the problem of limited text in two ways. First, there is far more image data than text. Second, images can be generated simply by observing the real world, whereas text has to be written by humans. As with LLMs, Yann LeCun expects the emergence of a "foundation model", i.e., a model that builds up emergent world knowledge from images and can easily be fine-tuned for specific applications.

For example, Meta AI has trained such a model, "DINOv2", and successfully fine-tuned it with comparatively little data to determine tree heights from satellite imagery for environmental monitoring, demonstrating the idea of a foundation model for image data.
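The training setup described here can be sketched as follows. This is a simplified illustration under our own assumptions, not the method presented in the talk: the helper names are hypothetical, the "image" is a tiny grid of numbers, and `predict_missing` is a trivial placeholder (the mean of the visible pixels) standing in for a neural network.

```python
def mask_region(image, top, left, size):
    """Hide a square region; return the masked copy and the hidden values."""
    masked = [row[:] for row in image]
    targets = {}
    for r in range(top, top + size):
        for c in range(left, left + size):
            targets[(r, c)] = image[r][c]
            masked[r][c] = None  # None marks a removed pixel
    return masked, targets


def predict_missing(masked):
    """Placeholder predictor: fill every hole with the mean visible value."""
    visible = [v for row in masked for v in row if v is not None]
    mean = sum(visible) / len(visible)
    return {
        (r, c): mean
        for r, row in enumerate(masked)
        for c, v in enumerate(row)
        if v is None
    }


def reconstruction_loss(predictions, targets):
    """Mean squared error, measured on the removed region only."""
    return sum((predictions[k] - targets[k]) ** 2 for k in targets) / len(targets)


# A 4x4 toy "image" whose pixel value is row index plus column index.
image = [[float(r + c) for c in range(4)] for r in range(4)]
masked, targets = mask_region(image, top=1, left=1, size=2)
loss = reconstruction_loss(predict_missing(masked), targets)
print(loss)
```

In actual training, the loss on the removed region would be used to update the network's parameters, so no human-written labels are needed: the image itself supplies the target.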
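Why little data can suffice may be easier to see in a small sketch. The code below is not Meta's pipeline; all names and numbers are made up for illustration. It assumes a pretrained backbone that stays frozen (here a trivial stand-in function) and fits only a small linear head on three labeled samples.

```python
def frozen_backbone(pixels):
    """Stand-in for a pretrained encoder; its parameters are never updated."""
    return sum(pixels) / len(pixels)  # a single scalar feature


def fit_head(features, labels):
    """Fit y = w * feature + b by closed-form least squares."""
    n = len(features)
    mean_f = sum(features) / n
    mean_y = sum(labels) / n
    variance = sum((f - mean_f) ** 2 for f in features)
    w = sum((f - mean_f) * (y - mean_y) for f, y in zip(features, labels)) / variance
    return w, mean_y - w * mean_f


# Three hypothetical image patches with made-up tree heights in meters.
patches = [[0.1, 0.3], [0.4, 0.6], [0.7, 0.9]]
heights = [4.0, 10.0, 16.0]

features = [frozen_backbone(p) for p in patches]
w, b = fit_head(features, heights)

# Predict the height for a new patch using the frozen features and the head.
predicted = w * frozen_backbone([0.5, 0.7]) + b
print(round(predicted, 2))
```

Only the two head parameters `w` and `b` are learned here, which is why a handful of labeled examples is enough in this toy setting; the general knowledge is assumed to sit in the frozen backbone.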

Hierarchical networks for planning

LeCun proposes a new hierarchical architecture (H-JEPA) that starts with a rough plan and then refines it progressively. This is intended to overcome the limitation that LLMs think only from one word to the next. The idea is still in its early stages but promises to be an exciting new direction for AI development.
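The coarse-to-fine idea can be illustrated with a toy example. This is not H-JEPA itself, only a sketch of planning on two levels under our own assumptions: positions are integers on a line, the goal lies ahead of the start, and the function names are hypothetical.

```python
def coarse_plan(start, goal, stride):
    """High level: choose waypoints `stride` apart, ending exactly at the goal."""
    waypoints = list(range(start + stride, goal, stride))
    return waypoints + [goal]


def refine(position, waypoint):
    """Low level: expand one coarse segment into unit steps."""
    return list(range(position + 1, waypoint + 1))


def hierarchical_plan(start, goal, stride):
    """Plan roughly first, then refine each segment in turn."""
    path, position = [], start
    for waypoint in coarse_plan(start, goal, stride):
        path.extend(refine(position, waypoint))
        position = waypoint
    return path


print(coarse_plan(0, 10, 4))        # prints [4, 8, 10]
print(hierarchical_plan(0, 10, 4))  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The rough plan fixes where to go before any single step is chosen, which is the opposite of generating one step at a time and hoping the result adds up to a sensible whole.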

Our conclusion

The history of neural networks is a history of structural innovations, from CNN and RNN to LSTM and Transformer. LeCun's proposals could be the next evolutionary step - but there are other promising developments as well. While we are eager to see how his ideas evolve, one thing is certain: the integration of LLMs and image data is an important milestone.

Image-based "foundation models" could revolutionize the automatic understanding and processing of image data in business processes. However, the hardware requirements for "foundation" models are generally significantly higher than for specialized models.

With our new NVIDIA H100 GPUs, we at CIB are ideally positioned to track and evaluate such developments in our research department and make the most of them in our products.

Let's CIB!


CIB Group

Experts in digitalization
