Package org.apache.ignite.tensorflow Description
TensorFlow integration that allows starting and maintaining a TensorFlow cluster on top of the Apache Ignite cluster
infrastructure. The TensorFlow cluster is built for the specified cache in the following way:
- A TensorFlow cluster maintainer is created to maintain the cluster associated with the specified cache, so
that this service keeps working reliably even if a node fails. This is achieved using the Ignite Service Grid.
- The TensorFlow cluster maintainer finds out that the cluster is not started and begins the starting procedure.
- The TensorFlow cluster resolver builds a cluster specification based on the specified cache so that every
TensorFlow task is associated with a partition of the cache and is expected to be started on the node where the
partition is kept.
- Based on the built cluster specification, a set of tasks is sent to the nodes in order to start TensorFlow
servers on the nodes defined in the specification.
- When this set of tasks completes successfully, the identifiers of the started processes are returned and saved
for future use.
- The starting procedure is completed. If a server fails, the cluster will be torn down and then started
again by the maintainer.
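
The procedure above can be illustrated with a minimal, self-contained sketch. It does not use the Ignite API: the class ClusterSpecSketch, the ServerLauncher interface, the host names and the per-host port numbering are all hypothetical and exist only to show the idea of one task per cache partition and of restarting the whole cluster when any server fails.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative sketch only; none of these names come from the Ignite API. */
final class ClusterSpecSketch {
    /** Hypothetical abstraction over starting and stopping TensorFlow server processes. */
    interface ServerLauncher {
        long start(String address);   // returns the identifier of the started process
        void stop(long pid);
        boolean isAlive(long pid);
    }

    /**
     * Builds one task per cache partition, placed on the host that keeps the partition.
     * Ports are handed out per host starting at basePort (an assumption of this sketch).
     */
    static SortedMap<Integer, String> buildSpec(Map<Integer, String> partitionToHost, int basePort) {
        Map<String, Integer> lastPort = new HashMap<>();
        SortedMap<Integer, String> spec = new TreeMap<>();
        // Iterate in partition order so that port assignment is deterministic.
        for (Map.Entry<Integer, String> e : new TreeMap<>(partitionToHost).entrySet()) {
            int port = lastPort.merge(e.getValue(), basePort, (old, unused) -> old + 1);
            spec.put(e.getKey(), e.getValue() + ":" + port);
        }
        return spec;
    }

    /**
     * One maintenance step: if the cluster is not started or any server is down,
     * stop everything that was started and start all servers again.
     * Returns the saved process identifiers, keyed by partition.
     */
    static Map<Integer, Long> maintain(Map<Integer, String> spec, Map<Integer, Long> pids,
                                       ServerLauncher launcher) {
        boolean healthy = !pids.isEmpty() && pids.values().stream().allMatch(launcher::isAlive);
        if (healthy)
            return pids;
        pids.values().forEach(launcher::stop);
        Map<Integer, Long> started = new TreeMap<>();
        spec.forEach((partition, address) -> started.put(partition, launcher.start(address)));
        return started;
    }

    public static void main(String[] args) {
        // Hypothetical layout: partitions 0 and 2 are kept on node-a, partition 1 on node-b.
        Map<Integer, String> partitionToHost = Map.of(0, "node-a", 1, "node-b", 2, "node-a");
        System.out.println(buildSpec(partitionToHost, 2222));
    }
}
```

In this sketch the two partitions kept on node-a receive ports 2222 and 2223, while node-b starts again from 2222. Restarting every server on a single failure mirrors the last step of the list; a real maintainer would run this check as a service rather than as a single call.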