Low-Latency Machine Learning Feature Store with GridGain and Feast
What are Features and Feature Stores?
Features are individual measurable properties used as input for machine learning models. Feature engineering and extraction transform raw data into a form that best represents the underlying problem to a predictive model. For example, in a CGM (Continuous Glucose Monitoring) system, features might include the current glucose level, time of day, day of week, and historical glucose levels.
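The CGM example can be sketched in a few lines of Python. This is a minimal illustration rather than code from the demo project: the function name `extract_features`, the feature names, and the choice of three lagged readings are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def extract_features(readings, lags=3):
    """Build one feature row from time-ordered (timestamp, glucose_mg_dl) pairs.

    The newest reading supplies the current value and the time features;
    the `lags` readings before it supply the historical values.
    """
    if len(readings) < lags + 1:
        raise ValueError("need at least lags + 1 readings")
    ts, current = readings[-1]
    row = {
        "glucose": current,
        "hour_of_day": ts.hour,       # 0-23
        "day_of_week": ts.weekday(),  # Monday = 0
    }
    # glucose_lag_1 is the reading just before the current one.
    for k in range(1, lags + 1):
        row[f"glucose_lag_{k}"] = readings[-1 - k][1]
    return row

# Hypothetical sample data: one reading every 5 minutes, starting on a Monday.
start = datetime(2025, 1, 6, 8, 0)
readings = [(start + timedelta(minutes=5 * i), 100 + 2 * i) for i in range(4)]
features = extract_features(readings)
print(features)
```

Each key in the resulting dictionary corresponds to one feature that a model could consume.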
Feature serving is the process of making these engineered features available for both training and inference.
Feast is a platform that combines feature management capabilities with storage backend integrations. It serves as a centralized repository for storing, managing, and serving features to machine learning models.
Online vs Offline Feature Stores
Feature stores typically operate in two modes:
- Offline Feature Store:
  - Stores historical feature values used for model training
  - Optimized for batch processing
  - Usually implemented using data warehouses or data lakes
- Online Feature Store:
  - Serves features in real time for model inference
  - Requires extremely low latency (milliseconds)
  - Usually implemented using in-memory databases
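The difference between the two modes can be shown with a toy, in-process model. The classes below are hypothetical stand-ins written for this illustration; they are not the Feast or GridGain APIs. The offline store keeps every historical row, the online store keeps only the newest row per key, and a `materialize` step copies data from one to the other, loosely mirroring what Feast calls materialization.

```python
from datetime import datetime

class OfflineStore:
    """Append-only history, scanned in batch to build training sets."""
    def __init__(self):
        self.rows = []

    def append(self, subject_id, event_time, features):
        self.rows.append((subject_id, event_time, dict(features)))

    def history(self, subject_id):
        return [r for r in self.rows if r[0] == subject_id]

class OnlineStore:
    """Latest feature values per entity key, held in memory for point lookups."""
    def __init__(self):
        self.latest = {}

    def write(self, subject_id, event_time, features):
        current = self.latest.get(subject_id)
        # Keep only the newest row per key; ignore late-arriving older rows.
        if current is None or event_time >= current[0]:
            self.latest[subject_id] = (event_time, dict(features))

    def read(self, subject_id):
        entry = self.latest.get(subject_id)
        return None if entry is None else entry[1]

def materialize(offline, online):
    """Copy the offline history into the online store (newest row wins)."""
    for subject_id, event_time, features in offline.rows:
        online.write(subject_id, event_time, features)

offline, online = OfflineStore(), OnlineStore()
offline.append("subject-1", datetime(2025, 1, 6, 8, 0), {"glucose": 100})
offline.append("subject-1", datetime(2025, 1, 6, 8, 5), {"glucose": 104})
offline.append("subject-2", datetime(2025, 1, 6, 8, 0), {"glucose": 91})
materialize(offline, online)

print(len(offline.history("subject-1")))  # all historical rows, for training
print(online.read("subject-1"))           # only the latest row, for inference
```

Training reads the full history, while inference needs a single key lookup, which is why the two stores are usually backed by different systems.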
Why GridGain for Online Feature Store?
GridGain serves as an ideal online feature store due to its distributed in-memory architecture:
- Low-Latency Access:
  - In-memory data storage provides sub-millisecond access times
  - Critical for real-time feature serving in production environments
  - Enables immediate feature updates and retrieval
- Horizontal Scalability:
  - Distributed architecture allows seamless scaling across clusters
  - Handles growing feature sets and increasing request volumes
  - Maintains performance as demand increases
- High Availability:
  - Built-in data replication ensures fault tolerance
  - Automatic failover capabilities
  - No single point of failure
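To make the replication and failover idea concrete, the sketch below assigns each key a primary node and one backup node, then shows the backup taking over when the primary is lost. It uses rendezvous hashing as an assumed placement technique for illustration only; it is not GridGain's implementation, and the node names and the `owners` helper are hypothetical.

```python
import hashlib

def _score(node, key):
    """Deterministic pseudo-random score for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def owners(key, nodes, backups=1):
    """Return [primary, backup, ...] for a key using rendezvous hashing."""
    ranked = sorted(nodes, key=lambda n: _score(n, key), reverse=True)
    return ranked[: backups + 1]

nodes = ["node-a", "node-b", "node-c"]
before = owners("subject-1", nodes)

# Simulate losing the primary: rank the key again over the surviving nodes.
survivors = [n for n in nodes if n != before[0]]
after = owners("subject-1", survivors)

print(before, after)
```

Because the ranking of the surviving nodes does not change, the former backup becomes the new primary, so a copy of the key's data is already in place when the failure occurs.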
The CGM Prediction System: A Real Example
The demonstration project implements a sophisticated CGM (Continuous Glucose Monitoring) prediction system that leverages GridGain’s capabilities as a Feast online store. This system showcases how GridGain can serve as a high-performance backend for real-time feature serving in healthcare applications.
Glucose Prediction Model
The system includes a pre-trained model (glucose_prediction_model-v1) that predicts future glucose levels based on historical data and current readings. Key aspects of the model include:
- Personalized Predictions: The model takes into account individual subject data to provide personalized glucose predictions
- Time-Based Features: Incorporates temporal features like day of the week and time of day to capture daily and weekly patterns
- Historical Context: Uses historical glucose readings to understand trends and patterns
- Real-Time Updates: Continuously updates predictions as new CGM readings arrive
Under the Hood
- Data Organization
  - Raw CGM readings are fetched by the Kafka producer and stored in Kafka for real-time processing
  - Historical data is maintained in the offline store (Parquet or Snowflake)
  - Real-time features are served from the GridGain online store
  - Custom feature transformations compute time-based aggregations
- Feature Engineering
  - Time-based feature extraction (hour of day, day of week)
  - Event-based feature generation from CGM data streams
- Performance
  - Sub-millisecond feature retrieval times
  - Real-time feature updates from streaming data
- Model Integration
  - Seamless integration with the pre-trained glucose prediction model
  - Real-time feature serving for immediate predictions
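The flow described above (stream in readings, derive time-based and aggregated features, update the online store, then serve features to the model) can be sketched without any external services. Everything here is an assumption made for illustration: a plain loop stands in for the Kafka consumer, a dictionary stands in for the GridGain online store, the 30-minute window and feature names are invented, and `predict` is a naive placeholder rather than the demo's pre-trained model.

```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # assumed aggregation window

online_store = {}  # stand-in for the online store: subject_id -> feature row
windows = {}       # subject_id -> deque of recent (timestamp, glucose)

def on_reading(subject_id, ts, glucose):
    """Handle one streamed CGM reading and refresh that subject's features."""
    window = windows.setdefault(subject_id, deque())
    window.append((ts, glucose))
    # Evict readings that have fallen out of the time window.
    while window and ts - window[0][0] > WINDOW:
        window.popleft()
    values = [g for _, g in window]
    online_store[subject_id] = {
        "glucose": glucose,
        "hour_of_day": ts.hour,
        "day_of_week": ts.weekday(),
        "glucose_mean_30m": sum(values) / len(values),
    }

def predict(features):
    """Placeholder for the pre-trained model: naive persistence forecast."""
    return features["glucose"]

# Simulated stream: one reading every 5 minutes for a single subject.
start = datetime(2025, 1, 6, 8, 0)
for i, g in enumerate([100, 110, 120, 130, 140, 150, 160, 170]):
    on_reading("subject-1", start + timedelta(minutes=5 * i), g)

row = online_store["subject-1"]   # point lookup at inference time
prediction = predict(row)
print(row, prediction)
```

In the real system, the dictionary lookup would be replaced by a call to the online store through Feast, and the feature update would be driven by messages arriving from Kafka.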
This implementation demonstrates how GridGain’s distributed computing capabilities can be leveraged to build a scalable, high-performance feature store that provides accurate and timely features for glucose prediction while maintaining low latency and high availability.
Demonstration Project
A fully functional demo project is available on GitHub in the Feast demo repository. It includes working sample code and step-by-step instructions for setting up and running the project.