Real-time means time is short for old data query methods

1 February 2023

Despite the economic conditions prevalent across much of the world, the recent holiday season saw a huge amount of online retail activity, with post-Black Friday sales reaching $454 billion by some estimates. (The Black Friday event itself didn’t stir as much excitement as expected; one theory is that consumers are increasingly skeptical of the savings retailers claim to offer.) Beyond the headline statistics, market analysts highlighted a few specific pointers to potential emerging trends, such as the high proportion of sales instigated by social media recommendations.

That increased reliance on others’ opinions, plus the move towards decentralized data structures (witness Twitter’s implosion and the rise of distributed social platforms like Matrix and Mastodon), gives a couple of clues to any business that relies heavily on data analysis for growth or operational success. And few organizations today don’t rely on their ability to gather, parse, query, and act on data. First, data is going to become more and more widely distributed. Second, the decisions consumers and service users make will be increasingly shaped by messages from trusted sources, such as friends, colleagues, and respected commentators.

Many data sources and analysis platforms don’t reflect these emerging trends. A single (although highly impactful) case in point is the stalwart Google Analytics. Quite apart from the privacy concerns expressed by internet users and legislatures around the world about big tech, GA’s metrics are at least 24 hours old by the time they’re presented to users. As 2023 progresses, organizations will have to look further afield for data on internet citizens wary of being tracked by big business, and also seek out data that’s much closer to real-time than is “traditionally” available.

Maybe ten years ago, “up-to-date data” could have been defined as information a day old; by that measure, Google Analytics and Adobe Experience Cloud’s offerings were indeed up-to-date. Back then, too, data analytics was slow, with queries against data lakes taking hours to surface results, and even then only assuming the query strings were on-point and the data had been well sanitized. So, have our definitions of real-time changed so much?

Real-time is still a fuzzy term: real-time data in an autonomous vehicle weighing a ton and traveling in an urban area at 30 mph means something different from real-time data gathered from traffic flow control cameras mounted on street corners. For the latter, a few seconds is real-time-enough; for the former, it’s a more critical definition!

But to circle back to the holiday period we’ve all now come out of: how many retailers would have been able to change prices on items recommended by a high-profile influencer in mid-December? Or, to give another example, could IoT devices in smart buildings (temperature sensors, etc.) have helped with environmental control over the same period (very mild in Europe, very cold in the US)?

We can safely assume, therefore, that reacting to real-time data (whatever its definition) is increasingly important. But as any data professional well knows, doing so is difficult and expensive. The expertise and resources required to gather, normalize, and query data come with a price that rises steeply the closer to real-time the results need to be. Increasingly, however, that’s less true: Apache Kafka and similar data-streaming technologies make real-time metrics viable, though prices (especially for skilled personnel) remain high. But even that reality is beginning to change, thanks to the fluttering wings of a certain Tinybird.
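To make that concrete, here’s a minimal sketch of the kind of streaming consumption Kafka enables, using the confluent-kafka Python client. The broker address, topic name, and event fields are all hypothetical placeholders, and a production pipeline would add schemas, windowing, and proper error handling:

```python
import json
from confluent_kafka import Consumer

# Hypothetical broker and consumer group; substitute your own cluster details.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "realtime-metrics-demo",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["page_views"])  # hypothetical topic of website events

views_per_page: dict[str, int] = {}

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue  # no new events within the timeout
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        # Each message is assumed to be a JSON event like {"page": "/checkout"}.
        event = json.loads(msg.value())
        page = event.get("page", "unknown")
        views_per_page[page] = views_per_page.get(page, 0) + 1
        print(f"{page}: {views_per_page[page]} views so far")
finally:
    consumer.close()
```

Even a toy like this hints at where the cost goes: someone has to run the brokers, define the topics, and keep consumers like this one healthy around the clock.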

Tinybird makes it simple for data teams and developers to build applications that serve millisecond responses to complex analytical queries. In fact, creating an API for data queries is almost trivial (and you’ll note the author of this article is not a developer). First, you ingest data at scale into Tinybird. It’s built on top of ClickHouse, so you know it can handle the largest workloads you can imagine throwing at it. Then you write queries to shape and transform your data in the standard nomenclature of choice – SQL – and results are returned in milliseconds from data that’s as up-to-date as its collection methods allow. Finally, you publish your data as low-latency, high-concurrency APIs, which you can consume from any application you want to build, whether that’s an internal dashboard or your customer-facing website or app. That means Tinybird can play a pivotal part in any organization’s IT stack, giving the business the ability to use data in a meaningful way as soon as it’s available.
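As an illustration of how little code that workflow demands, here’s a minimal Python sketch of the two ends of the pipeline: pushing events into Tinybird’s Events API and reading from a published pipe endpoint. The data source name (sales_events), pipe name (top_products), and token are hypothetical placeholders, and the SQL that shapes the data would live in the pipe itself inside Tinybird:

```python
import json
import urllib.request

TB_HOST = "https://api.tinybird.co"
TB_TOKEN = "YOUR_TINYBIRD_TOKEN"  # placeholder: use your workspace token

def ingest_event(event: dict) -> None:
    """Send one JSON event to a hypothetical 'sales_events' data source
    via Tinybird's Events API."""
    req = urllib.request.Request(
        f"{TB_HOST}/v0/events?name=sales_events",
        data=json.dumps(event).encode(),
        headers={"Authorization": f"Bearer {TB_TOKEN}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()  # the Events API acknowledges accepted events

def query_top_products() -> list:
    """Call a published pipe endpoint ('top_products' is illustrative);
    the pipe is plain SQL defined in Tinybird and served as a JSON API."""
    url = f"{TB_HOST}/v0/pipes/top_products.json?token={TB_TOKEN}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["data"]

if __name__ == "__main__":
    ingest_event({"timestamp": "2023-01-31T12:00:00Z",
                  "product_id": "sku-123", "price": 19.99})
    for row in query_top_products():
        print(row)
```

The point is less the specifics than the shape: ingest on one side, a ready-made HTTP endpoint on the other, with nothing for the consuming application to manage in between.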

Data warehouses and lakes will continue to provide value to any organization, especially for ML processing or for the business intelligence that can change a company’s quarterly strategic goals. But the Tinybird platform opens up the real-time dimension without a king’s ransom paid in data professionals’ salaries and high-performance computing.

In conclusion, there are times in any organization’s year when data activity quickens and slows. Responding correctly – whether with a price change or a heating element springing to life – can be the difference between success and going under. To find out more about how Tinybird opens up all the possibilities of real-time data, head over to the website, or go straight to the developer documentation.