Easy Guide to Integrating Kafka: Practical Solutions for Managing Blob Data

· 12 min read

Kafka and ReductStore example: sensor data processed and labeled by AI, stored in ReductStore, with metadata relayed to Kafka

In this tutorial, we will walk through a straightforward setup for integrating Kafka with ReductStore for handling unstructured data streams from edge devices. We'll cover the basics of setting up Kafka and ReductStore using Docker, creating Kafka topics in Python, and managing blob data and metadata.

If you are new to Kafka and ReductStore, here's a quick summary of the technology:

  • Apache Kafka is a distributed streaming platform for sharing data between applications and services in real time.
  • ReductStore is a time-series database for blob data, optimized for edge computing. It complements Kafka by storing files larger than 1 MB, which is Kafka's default maximum message size.

In our example, we will deploy a simple architecture with a single instance of Kafka and ReductStore running on a local machine. We will demonstrate how to create Kafka topics, write data to ReductStore, and forward metadata to Kafka.
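To make the split between blob and metadata concrete, here is a minimal sketch using only Python's standard library. It builds the small JSON message that would be forwarded to Kafka once a large blob has been stored elsewhere. The function name, the field names, and the rounded size limit are assumptions made for illustration; they are not taken from the tutorial's code or from either project's API.

```python
import json

# Assumption: a round 1 MB figure standing in for Kafka's default message limit.
KAFKA_DEFAULT_MAX_BYTES = 1_000_000


def build_metadata_message(bucket: str, entry: str, timestamp_us: int,
                           blob: bytes, labels: dict) -> bytes:
    """Describe a stored blob in a small JSON message (hypothetical schema)."""
    message = {
        "bucket": bucket,
        "entry": entry,
        "timestamp_us": timestamp_us,
        "size_bytes": len(blob),  # only the size travels, not the blob itself
        "labels": labels,
    }
    encoded = json.dumps(message, sort_keys=True).encode("utf-8")
    if len(encoded) > KAFKA_DEFAULT_MAX_BYTES:
        raise ValueError("metadata message exceeds the assumed Kafka size limit")
    return encoded


# A 5 MiB placeholder payload, far too large for a default Kafka message.
blob = bytes(5 * 1024 * 1024)
msg = build_metadata_message(
    "sensors", "camera-1", 1_700_000_000_000_000, blob, {"label": "ok"}
)
```

In a real pipeline, the blob would be written to ReductStore first, and the returned bytes would then be handed to a Kafka producer.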

You can also follow along by cloning the GitHub repository containing all the code snippets and Docker Compose files used in this tutorial within the reduct_to_kafka demo.

How To Implement Data Streaming In PyTorch From A Remote Database

· 9 min read

PyTorch training diagram: training loop with data streaming from a remote device

When training a model, we aim to process data in batches, shuffle data at each epoch to avoid overfitting, and leverage Python's multiprocessing for data fetching through multiple workers.

We want multiple workers because GPUs can handle large amounts of data concurrently, so the bottleneck often lies in the time-consuming task of loading this data into the system.

Moreover, the challenge is even trickier when there is simply too much data to store the whole dataset on disk and we need to stream data from a remote database such as ReductStore.
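As a rough illustration of these three ideas (batching, per-epoch shuffling, and parallel fetching), here is a minimal standard-library sketch. In PyTorch itself, `torch.utils.data.DataLoader` covers this with its `batch_size`, `shuffle`, and `num_workers` arguments. The sketch below swaps in a thread pool for worker processes to stay self-contained, and `fetch_record` is a hypothetical stand-in for a request to a remote database.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fetch_record(index: int) -> bytes:
    """Hypothetical stand-in for fetching one record from a remote database."""
    return f"record-{index}".encode("utf-8")


def iter_batches(num_records: int, batch_size: int, num_workers: int, seed: int):
    """Yield one epoch of batches in shuffled order, fetched concurrently."""
    indices = list(range(num_records))
    # Using a different seed for each epoch gives a different visiting order.
    random.Random(seed).shuffle(indices)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for start in range(0, num_records, batch_size):
            batch_indices = indices[start:start + batch_size]
            # pool.map() returns results in the same order as batch_indices.
            yield list(pool.map(fetch_record, batch_indices))


# Two epochs over a toy dataset of 10 records, 4 records per batch.
epoch_0 = list(iter_batches(num_records=10, batch_size=4, num_workers=2, seed=0))
epoch_1 = list(iter_batches(num_records=10, batch_size=4, num_workers=2, seed=1))
```

Each epoch visits every record exactly once, and the last batch is smaller when the dataset size is not a multiple of the batch size.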

In this blog post, we will go through a full example and set up a data stream to PyTorch from a playground dataset on a remote database.

Let's dig in!

How to Keep a History of MQTT Data With Node.js

· 6 min read

The MQTT protocol is widely used in IoT applications because of its simplicity and ability to connect different data sources to applications using a publish/subscribe model. While many MQTT brokers support persistent sessions and can store message history while an MQTT client is unavailable, there may be cases where data needs to be stored for a longer period. In such cases, a time series database is the recommended choice. There are many options available, but if you need to store unstructured data such as images, sensor data, or Protobuf messages, consider using ReductStore. It is a time series database specifically designed for storing large amounts of blob data and optimized for IoT and edge computing.

ReductStore provides client SDKs for many programming languages to integrate it into your infrastructure. In this example, we will use the client SDK for JavaScript.

Let’s make a simple MQTT application to see how it works.

Exploring Open-Source Alternatives to Landing AI for Robust MLOps

· 7 min read

Photo by Luke Southern on Unsplash

In the thriving world of IoT, integrating MLOps for Edge AI is important for creating intelligent, autonomous devices that are not only efficient but also trustworthy and manageable.

MLOps—or Machine Learning Operations—is a multidisciplinary field that mixes machine learning, data engineering, and DevOps to streamline the lifecycle of AI models.

In this field, important factors to consider are:

  • explainability, ensuring that decisions made by AI are interpretable by humans;

  • orchestration, which involves managing the various components of machine learning in production at scale; and

  • reproducibility, guaranteeing consistent results across different environments or experiments.

From Lab to Live: Implementing Open-Source AI Models for Real-Time Unsupervised Anomaly Detection in Images

· 8 min read

Photo by Randy Fath on Unsplash

The journey of taking an open-source artificial intelligence (AI) model from a laboratory setting to real-world implementation can seem daunting. However, with the right understanding and approach, this transition becomes a manageable task.

This blog post aims to serve as a compass on this technical adventure. We'll demystify key concepts, and delve into practical steps for implementing anomaly detection models effectively in real-time scenarios.

Let's dive in and see how open-source models can be implemented in production, bridging the gap between research and practical applications.

ReductStore v1.7.0 has been released with provisioning and batch writing

· 2 min read

We are pleased to announce the release of the latest minor version of ReductStore, 1.7.0. ReductStore is a time series database designed for storing and managing large amounts of blob data.

To download the latest released version, please visit our Download Page.

What's new in 1.7.0?

ReductStore v1.7.0 introduces two new features that make it easier to provision resources and write data in batches, which can improve your performance and efficiency when using ReductStore for edge computing and AI applications.

ReductStore vs. MinIO & InfluxDB on LTE Network: Who Really Wins the Speed Race?

· 6 min read

Benchmarks don't lie. Let's put the systems to the ultimate test.

Diagram: ReductStore vs. MinIO & InfluxDB benchmark on Edge Device HX401

If you are deeply immersed in the engineering world of Edge Computing, Computer Vision, or IoT, read on to understand why a time series database for blob data is needed and where it stands out.

Enter our contest: First, we have ReductStore—a time series database for blob data—specifically designed for edge devices.

Its counterpart? The duo of MinIO and InfluxDB, each optimized for their niche in blob storage and time-series data respectively.

When directly compared, which system takes the lead in performance?

Let's roll up our sleeves and deep-dive into this benchmarking analysis to separate fact from fiction.

ReductStore 1.6.0 has been released with new license and client SDK for Rust

· 3 min read

We are pleased to announce the release of the latest minor version of ReductStore, 1.6.0. ReductStore is a time series database designed for storing and managing large amounts of blob data.

To download the latest released version, please visit our Download Page.

What is new in 1.6.0?

Business Source License (BUSL-1.1)

We have updated the ReductStore license to the Business Source License (BUSL-1.1). This license permits free usage of the database for development, research, and testing purposes. Furthermore, it can be used in a production environment for free, provided that the Aggregate Financial Capacity of the company is less than $2,000,000 for the previous year. For additional information, please refer to here.

We believe that the new license strikes a good balance between freedom and revenue generation. This balance is necessary to maintain and improve our technology, and to bring benefits to its users.

How to Choose the Right MQTT Data Storage for Your Next Project

· 14 min read

Choosing the right database can be overwhelming. Trust me, I know.

Photo by Jan Antonin Kolar on Unsplash

Since joining the ReductStore project, I've been exploring alternative solutions to better understand how the project fits into the current ecosystem. I found all kinds of databases, from the most popular ones to the most obscure ones.

To give you some context, we will look at solutions to store data from IoT devices (e.g. sensors, cameras, etc.) that commonly use MQTT to communicate with each other.

MQTT stands for "Message Queuing Telemetry Transport" and is a lightweight messaging protocol designed to be efficient, reliable, and scalable, making it ideal for collecting and transmitting data from sensors in real time.

Why is this important when choosing a database?

Well, MQTT is format-agnostic, but it works in a specific way. We should therefore be aware of its architecture, how it works, and its limitations to make the right choice. That is what this article is about: we will try to cut through the fog and explore some key factors to consider when selecting the right option.

Let's get started!

ReductStore v1.5.0 has been released

· 2 min read

Hello everyone,

I'm happy to announce that the next minor version of ReductStore has been released. Over the last month, we have worked hard to improve the user experience when querying data from the database. In this release, we deliver two important features:

  • Batching multiple records into an HTTP response for read operations
  • Reading only meta information about a record without its body

Let me show you how it works in detail and how you can use it.