"ChatGPT is the sensation of the moment given all the tasks it performs, but do we know what is behind this artifact?"

Written by Josep Carreras, Data Innovation Lead at SDG Group


If we search the Internet for the well-known conversational application ChatGPT, we get about 744,000,000 results. Of all the recent news about this chatbot built on GPT, the most notable is that Microsoft has incorporated OpenAI's artificial intelligence into its Bing search engine, challenging the two-decade dominance of Google's engine.

ChatGPT is undoubtedly the sensation of the moment because of the wide range of tasks it performs and because it imitates human language in a very realistic way. However, do we know what is behind this "artifact"? The answer lies in NLP, or Natural Language Processing.

This field has undergone an exponential transformation in recent years: from manual feature extraction and machine learning used to identify patterns, to powerful generative models. Recurrent neural networks and long short-term memory (LSTM) units enabled unprecedented handling of text sequences. Transformers introduced self-attention, later adopted by Google's BERT to build bidirectional language models pretrained on large text corpora.
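To make the self-attention mechanism mentioned above concrete, here is a minimal NumPy sketch of scaled dot-product self-attention over a toy sequence. The projection matrices and embeddings are random placeholders, not weights from any real model:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # project tokens into query/key/value spaces
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # pairwise token affinities, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row is a distribution
    return weights @ V                          # each output mixes information from all tokens

# Toy sequence: 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (3, 4): one contextualized vector per token
```

Because every token attends to every other token in a single step, the model captures long-range dependencies without the sequential bottleneck of recurrent networks.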

"NLP has gone from machine learning to identifying patterns to powerful generative models."

Finally, OpenAI developed GPT, culminating in GPT-4, which leads language generation and understanding in 2023. To compete, Google recently launched PaLM 2, offering functionality very similar to GPT-4's and even exceeding its capabilities in some respects, according to some reports.

Language is the main way in which we communicate, meaning a huge amount of this kind of information is generated every day. NLP – the area of artificial intelligence that deals with the analysis of human language, both spoken and written – allows us to structure it, interpret it, exploit it and incorporate it into our productive processes. What's more, we can classify the actions that we can carry out with NLP into three broad blocks.

First, word analysis identifies terms with certain characteristics within a text and categorizes them or establishes relationships between them. Second, the analysis of texts and sets of texts assigns each a category from a set that may be predefined or discovered as part of the process; this is where we find the classification of documents by typology, subject, topic, sentiment, and so on. Finally, text generation creates a response, usually from an equally textual input.
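To make the first block tangible, here is a toy rule-based sketch of word analysis: finding terms with certain surface characteristics and categorizing them. Real systems use statistical or neural taggers rather than regular expressions, and the categories and patterns here are invented for illustration:

```python
import re

def extract_terms(text):
    """Toy word analysis: categorize terms by simple surface characteristics."""
    return {
        "years": re.findall(r"\b(?:19|20)\d{2}\b", text),      # four-digit years
        "acronyms": re.findall(r"\b[A-Z]{2,}\b", text),         # runs of capitals
        "capitalized": re.findall(r"\b[A-Z][a-z]+\b", text),    # capitalized words
    }

terms = extract_terms("OpenAI released GPT in 2018; Google answered with BERT in 2019.")
print(terms["years"])     # ['2018', '2019']
print(terms["acronyms"])  # ['GPT', 'BERT']
```

Even this crude version hints at the goal: turning raw text into structured, categorized terms that downstream processes can exploit.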

"The high algorithmic specialization of the solutions provided by NLP includes notions of computational linguistics and techniques that are not frequently applied in other areas."

Additionally, there are three intrinsic characteristics that separate NLP solutions from any other advanced analytics initiative. One is the high algorithmic specialization of NLP solutions. The characteristics and complexities of language have driven the emergence of new algorithms and models that incorporate notions of computational linguistics, as well as ML techniques rarely applied in other areas, such as Hidden Markov Models or Conditional Random Fields.
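The Hidden Markov Models mentioned above can be illustrated with a minimal Viterbi decoder for part-of-speech tagging. The tiny probability tables below are invented for illustration, not estimated from any real corpus:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Find the most likely hidden state sequence for a sequence of observations."""
    # V[t][s] = (best probability of reaching state s at step t, previous state)
    V = [{s: (start_p[s] * emit_p[s].get(obs[0], 0.0), None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p][0] * trans_p[p][s] * emit_p[s].get(obs[t], 0.0), p)
                for p in states
            )
            V[t][s] = (prob, prev)
    # Backtrack from the most probable final state
    path = [max(V[-1], key=lambda s: V[-1][s][0])]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, V[t][path[0]][1])
    return path

# Invented toy model: two POS tags emitting a handful of words
states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.6, "VERB": 0.4}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.8, "VERB": 0.2}}
emit_p = {"NOUN": {"dogs": 0.5, "cats": 0.5}, "VERB": {"chase": 0.8, "sleep": 0.2}}

print(viterbi(["dogs", "chase", "cats"], states, start_p, trans_p, emit_p))
# ['NOUN', 'VERB', 'NOUN']
```

The model treats the tags as hidden states and the words as observations, which is exactly the framing that made HMMs a workhorse of pre-neural sequence labeling.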

Second, traditional Machine Learning problems work with data of a very diverse nature. For example, the data used to predict the evolution of electricity demand is completely different from the data that can be used to recommend when and how a salesperson should approach a potential customer. This means that each problem requires unique data and models. Language, although it still varies, is much more constant. This gives rise to the so-called foundational models: large models trained on huge sets of texts, capable of capturing and modeling the structures and features of language. The ability to exploit and adapt these models is one of the fundamental factors that define NLP today.

Finally, there is the marked technological nature of NLP-based solutions: foundational models, once generated, can be transferred to different tasks. Because of their scale, this poses unique technological challenges and involves considerations beyond the technical field, such as the associated carbon footprint.

"NLP must be conceived beyond the methodologies of the discipline itself: it must be combined with other areas of AI, the architecture of 'Machine Learning' systems, and data processing."

Even when the aforementioned methodologies are used to apply existing models, both the storage and exploitation of unstructured data and the productionization of deep-learning-based models pose specific challenges for the Machine Learning engineer that must be considered from the initial design of the platforms. In this sense, there is a wide range of managed services and SaaS offerings for NLP, which force (or encourage) adopting a technological approach to find the right design for each solution.

A good conclusion from all of this is that solutions based on NLP can rarely be viewed solely through the prism of the techniques and methodologies specific to this discipline. In most cases, a combination of knowledge of NLP, other areas of AI, data processing, and Machine Learning system architecture is necessary.

The approach adopted in projects in this area is holistic, combining the data science approach, the design of ML systems, business-level understanding of the particularities of different sectors, and the interaction between NLP and the other areas of AI.


Translated from the original article published in Metadata.