How can SMEs make the most of LLMs?


Businesses of all sizes have felt the remarkable impact AI has recently had on the way we work.

From corporations to SMEs, organizations are becoming faster, more agile and more resilient as we outsource administrative and repetitive tasks to our AI colleagues.

One of the latest AI trends is the arrival of Large Language Models (LLMs) in the public domain: machine learning models trained on huge amounts of data to recognize the structures and patterns of natural language. They are proficient in Natural Language Processing (NLP), which allows us to explore huge data sets through everyday questions or commands.

As a result, LLMs have become the most common way of making AI accessible — to take the most famous example, an LLM is what lets ChatGPT answer your questions. However, this intelligence has an inherent disadvantage: it lives in a kind of time capsule.

LLMs are trained intensively, with millions upon millions of data points fired at them in a constant feedback loop to teach each model how to understand specific data points or patterns. But “operationalizing” an LLM – taking it out of the training loop and putting it online as part of your infrastructure – prevents it from learning anything new. Even some of the early versions of ChatGPT will politely explain their own knowledge cut-off if you ask a question about very recent events.

This means you need to be sure the LLM can rely on the systems it will query and the data available to it. And while a corporate giant may have the funding and technology stack to make this possible, that is a bold assumption for an SME.

Move it or lose it

In the past, we tended to think of data as static. When a layperson downloads a file onto their PC, the file is not “there” until it appears in their documents folder, even though millions of individual bytes of data are quietly assembling into something far more complex.

With this mindset, you can understand why companies have often chosen to collect as much data as possible and only then determine what they have actually collected. Convention would have us dumping data into a giant data warehouse or lake, spending ages cleaning and preparing that data, and then digging up various pieces for analysis — a method commonly known as batch processing.

That’s about as efficient as it sounds. Tackling an entire data set duplicates work, obscures insights, and imposes enormous demands on hardware and power consumption — all while delaying important business decisions. For SMEs looking for ways to make up for limited resources and staff, this method undermines the agility and speed that should be their natural advantage.

Because information previously didn’t need to be consumed or even captured in real time, this was never a problem. But considering how many new companies’ end-customer value propositions are built on real-time data (think of hailing a taxi with Uber or a similar application, and imagine not seeing the “live” map with your driver’s location), real-time capability is now a must-have, not a nice-to-have.

Fortunately, LLMs don’t just work on a batch basis. You can interact with data in different ways — and some of those ways don’t require the data to stay still.

Ask and you shall receive

Just as disruptive SMEs seek to topple older and more established companies, data streaming is replacing batch processing.

Data streaming platforms use real-time data “pipelines” to collect, store and use data continuously, as it is created. The processing, storage and analysis that batch processing keeps you waiting for can now happen instantly.
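The difference is easy to see in miniature. The sketch below — a simplified illustration, not any particular platform’s API — contrasts a batch computation, which must wait for the whole dataset, with a streaming one that keeps an always-current answer as each event arrives:

```python
# Batch: wait for the entire dataset, then compute once at the end.
def batch_average(values):
    return sum(values) / len(values)

# Streaming: update the result incrementally as each event arrives.
class RunningAverage:
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, value: float) -> float:
        self.count += 1
        self.total += value
        return self.total / self.count  # always current, no waiting

stream = [120.0, 80.0, 100.0]          # e.g. order values arriving over time
avg = RunningAverage()
latest = [avg.update(v) for v in stream]

# batch_average(stream) and latest[-1] agree on the final answer,
# but the streaming version had an up-to-date answer after every event.
```

Both arrive at the same number; the difference is that the streaming version never has a moment where the answer is stale.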

In streaming, this is achieved through so-called event-driven principles, where essentially every change in a data set is treated as an “event” in its own right. Each event can trigger further processing, creating a constant cascade of new information. Instead of having to retrieve data (usually stored in a table somewhere in a database), data sources “publish” their data in real time, and anyone who wants that data simply “subscribes” to it.
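The publish/subscribe pattern at the heart of this can be sketched in a few lines of plain Python. This is a toy in-memory broker for illustration only (real platforms such as Apache Kafka add persistence, partitioning and fault tolerance); the class and topic names are invented for the example:

```python
from collections import defaultdict
from typing import Callable

class MiniBroker:
    """A toy in-memory event broker illustrating publish/subscribe."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        # A consumer registers interest in a topic once...
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event: dict) -> None:
        # ...and every new event on that topic is pushed to it immediately,
        # with no need to poll a database table for changes.
        for callback in self._subscribers[topic]:
            callback(event)

broker = MiniBroker()
received = []
broker.subscribe("orders", received.append)

broker.publish("orders", {"id": 1, "item": "coffee"})
broker.publish("orders", {"id": 2, "item": "tea"})
# "received" now holds both events, delivered the moment they happened
```

Note the inversion: the consumer never asks “is there new data yet?” — the data comes to it.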

All of this can free LLMs from the strict divide between training and operations. If actions can be performed on each data point as it arrives, the LLM can even keep training itself, using the correctness of its actions to continually refine the underlying algorithms that define its purpose.

This means that the LLM can draw on a constantly updated and curated data set, while constantly improving the mechanisms that deliver and contextualize this data. Data isn’t at risk of redundancy or left in a forgotten silo – all you have to do is ask for it!
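The idea of refining a model per event, rather than in a separate offline training run, is known as online learning. A minimal sketch — a deliberately tiny one-parameter model, not how a production LLM is updated — shows the principle of nudging a model after every event it acts on:

```python
# Online learning in miniature: each event carries an input x and the
# observed outcome y, and the model weight is adjusted after every
# event rather than in a separate offline training loop.
def online_update(w: float, x: float, y: float, lr: float = 0.1) -> float:
    error = w * x - y            # how wrong the current model was
    return w - lr * error * x    # one small gradient step toward correctness

w = 0.0                          # the model starts knowing nothing
events = [(1.0, 2.0), (2.0, 4.0), (1.0, 2.0), (3.0, 6.0)]  # here, y = 2x
for x, y in events:
    w = online_update(w, x, y)
# w drifts toward 2.0 as evidence streams in, without ever stopping to retrain
```

Each event both serves the business (the model acted on it) and improves the model — the “constant cascade” doing double duty.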

Cut from the SME cloth

So what does this mean for small and medium-sized businesses?

On the one hand, it releases the proverbial handbrake. The sheer speed at which LLMs can deliver information across a stream-driven infrastructure lets decision makers move the business forward at the pace they want, without batch processing keeping them in second gear. The agility that enables SMEs to outmaneuver larger players is once again in abundance.

These decisions are also made with less doubt and more relevant context than before. Because LLMs understand natural language, accessing specific insights becomes so easy that data streaming can inspire real enthusiasm for business transparency across the board.

Not only is output faster and more accurate, but SMEs can also free themselves from outdated technology. Data streaming can run entirely on-premises, entirely in the cloud, or a mix of both. The high-performance hardware often required for batch processing is simply no longer necessary when an LLM can get you the same result in record time. Additionally, several providers offer fully managed (turnkey) solutions that require no capital investment from SMEs.

For SMEs to get the most out of LLMs, they need to think about the way they handle company data. When a business is willing to treat data as a continuous flow of information, it is in a much better position to maximize the potential of data in motion to help it evolve.


Carlos Roman

Carlos is a passionate leader with over 25 years of experience launching cloud, software and hardware teams in emerging and mature markets. He specializes in helping emerging companies go beyond their ambitions. He joined Confluent in 2021 after being a key part of Oracle’s cloud sales teams for more than two decades.
