About Federated Learning

Last updated 2 years ago

This section gives the background needed to understand the business objects behind the Federated Learning functionality. The next section covers the REST API itself in detail.

Federated Project Terminology

SymetryML projects can easily be merged. Imagine you have two projects: (a) a project p1 that processed dataset d1 and (b) a project p2 that processed dataset d2. You can merge p2 into p1, and the resulting p1 project will be the same as if p1 had processed both d1 and d2. A SymetryML Federated Project leverages this capability. A federation consists of n Symetry projects that each process their own private data and share their results at a given interval. The figure 'Example of 3 nodes federation' illustrates this.
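The merge property described above can be illustrated with a toy summary. This is only a sketch under stated assumptions: the PSR is proprietary and its contents are not described here, so the `RunningStats` class below (count, sum and sum of squares of a single feature) is a hypothetical stand-in chosen because it is mergeable in the same sense.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical mergeable summary of one feature (not the actual PSR)."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def update(self, values):
        # Fold raw values into the summary; the rows themselves are not kept.
        for v in values:
            self.n += 1
            self.total += v
            self.total_sq += v * v
        return self

    def merge(self, other):
        # Merging adds the summaries; no raw data is exchanged.
        return RunningStats(self.n + other.n,
                            self.total + other.total,
                            self.total_sq + other.total_sq)

    def mean(self):
        return self.total / self.n

    def variance(self):
        # Population variance computed from the summary values only.
        return self.total_sq / self.n - self.mean() ** 2


d1 = [5.1, 4.9, 4.7, 4.6]
d2 = [7.0, 6.4, 6.9]

p1 = RunningStats().update(d1)       # project p1 processed d1
p2 = RunningStats().update(d2)       # project p2 processed d2
merged = p1.merge(p2)                # p2 merged into p1
combined = RunningStats().update(d1 + d2)

# The merged summary matches one built from d1 and d2 together.
print(math.isclose(merged.mean(), combined.mean()))
print(math.isclose(merged.variance(), combined.variance()))
```

In this sketch, each peer would only ever need to send its summary, which is the idea behind sharing project results rather than data.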

In order to fully understand the federated learning REST API one needs to understand a few concepts / terminology.

Federation Terminology

Term
Definition

peer or node

A node is a member of a federation. It is essentially a Federated Symetry Project.

federated project

A Federated Symetry Project contains two Symetry projects: one local project and one federated project. The federated project is rebuilt from time to time according to the Federation Schedule defined by the federation admin.

local project

A Federated Symetry Project contains two projects: one local project and one federated project. The local project is responsible for processing data that is local to this project.

Federation

A federation is a set of nodes that communicate and share Symetry project information.

Federation Info

Information that describes a federation.

Federation Admin

The user who creates a federation automatically becomes the federation admin.

Federation Contract

Federation secret key

An AES secret key that is used to encrypt communication between peers/nodes of a federation.

Federation Schedule

Peers in a federation will send updates to other peers according to a schedule. This schedule is defined by the federation admin when a federation is created. Example of schedule:

- m30 : synchronize every 30 mins

- h3 : synchronize every 3 hours

- d7 : synchronize every 7 days

scheduled synchronization message

A periodic message sent by a peer to other peers in a federation. The period is defined by the federation schedule.

AWS Backed Federation

In the AWS implementation, under the hood, the federation service uses many AWS services:

  • Each federation node has an AWS SQS queue to receive messages

  • A Federation has an AWS SNS topic that allows fanout messages to be sent to multiple SQS queues.

  • Nodes in the federation use messages to the SNS topic to communicate with other nodes

  • SNS messages are lightweight and contain pointers to Amazon S3 files that are used to temporarily store message content.

  • AWS STS credentials are used to allow other users to access a user’s file on S3.

  • The following figure illustrates this:

NATS Based Federation

NATS-based federations use the NATS 'connective technology' to create a federation. For more details on NATS, please consult www.nats.io. Under the hood, SymetryML uses NATS to send messages, including synchronization messages, between all the peers in the federation.
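The Federation Schedule strings listed in the terminology above (m30, h3, d7) follow a simple unit-plus-number pattern. The sketch below shows one way such a string could be turned into an interval. The function name `parse_schedule` is hypothetical, and only the three unit letters that appear in the examples (m, h, d) are assumed to exist.

```python
import re
from datetime import timedelta

# Unit letters taken from the documented examples: m30, h3, d7.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_schedule(spec: str) -> timedelta:
    """Convert a schedule string such as 'm30' into a timedelta."""
    match = re.fullmatch(r"([mhd])(\d+)", spec.strip())
    if match is None:
        raise ValueError(f"unsupported schedule: {spec!r}")
    unit, amount = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


for spec in ("m30", "h3", "d7"):
    print(spec, "->", parse_schedule(spec))
```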

Federated Project Uses Cases

Create a Federation

The user who creates a federation will become the administrator of it.

Join a federation

In order to join a federation one must:

PSR Contracts

As explained previously, SymetryML’s speed in machine learning and real-time capabilities rely on its proprietary statistical representation - the PSR. Once constructed from a new dataset, the PSR allows for two main functionalities:

  • supervised and unsupervised machine learning and

  • various exploration APIs.

It turns out that some of these exploration APIs can be used to enforce the quality of the data that peers participating in a federation contribute. Data is never shared directly; only knowledge of this data is shared via the PSR. But this knowledge is sufficient to enforce rules like "enforce that at least 40% of the rows with positive cancer are female" or "enforce that at least 500 examples of fraud are part of this dataset".

This enforcement is done via what we call a 'PSR Contract'. A PSR Contract is a list of rules to be enforced on a PSR for it to be validated. These rules are effectively Boolean predicates that evaluate to true or false; for a PSR Contract to be validated, all its rules need to evaluate to true.

PSR Contract Rules

Federation PSR Contracts are defined with the following Backus-Naur notation. The table that follows describes the individual functions that can be used in a PSR Contract.

PSR Contract Backus-Naur Notation

RULE_PREDICATE ::= [PredicateRule] [BIN_OP PredicateRule]*

BIN_OP ::= AND | OR

PredicateRule ::= Uni_Rule(FEATURE1) | Bi_Rule(FEATURE1, FEATURE2)

Uni_Rule ::= COUNT | MEAN | STDDEV | VARIANCE | STDDEV_UNBIASED | VARIANCE_UNBIASED

Bi_Rule ::= Bi_Exploration_Fct | Conditional_Fct | Complement_Fct | Compose_Fct

Bi_Exploration_Fct ::= COVAR | LINCORR

Conditional_Fct ::= COND_STDDEV | COND_VARIANCE | COND_STDDEV_UNBIASED | COND_VARIANCE_UNBIASED

Complement_Fct ::= COMPL_COND_STDDEV | COMPL_COND_VARIANCE | COMPL_COND_STDDEV_UNBIASED | COMPL_COND_VARIANCE_UNBIASED

Compose_Fct ::= PCT_OF_TRUE | PCT_OF_FALSE | NUM_OCCURENCE_WHEN_TRUE | NUM_OCCURENCE_WHEN_FALSE | MEAN_WHEN_TRUE | MEAN_WHEN_FALSE

Functions That Can Be Used in a PSR Contract

  • F1 / F2 means 'Feature 1 type' and 'Feature 2 type'

  • C means continuous type

  • B means binary type

  • - means the rule takes no second feature

Rule                          F1    F2    Business Rule Interpretation
COUNT                         C|B   -     How many times a feature was seen
MEAN                          C|B   -     The mean value of a feature
STDDEV                        C|B   -     The standard deviation of a feature
VARIANCE                      C|B   -     The variance of a feature
STDDEV_UNBIASED               C|B   -     The unbiased standard deviation of a feature
VARIANCE_UNBIASED             C|B   -     The unbiased variance of a feature
COVAR                         C|B   C|B   The covariance of 2 features
LINCORR                       C|B   C|B   The linear correlation of 2 features
COND_STDDEV                   C|B   B     Stddev of Feature 1 when Feature 2 is '1' or true
COND_VARIANCE                 C|B   B     Variance of Feature 1 when Feature 2 is '1' or true
COND_STDDEV_UNBIASED          C|B   B     Unbiased stddev of Feature 1 when Feature 2 is '1' or true
COND_VARIANCE_UNBIASED        C|B   B     Unbiased variance of Feature 1 when Feature 2 is '1' or true
COMPL_COND_STDDEV             C|B   B     Stddev of Feature 1 when Feature 2 is '0' or false
COMPL_COND_VARIANCE           C|B   B     Variance of Feature 1 when Feature 2 is '0' or false
COMPL_COND_STDDEV_UNBIASED    C|B   B     Unbiased stddev of Feature 1 when Feature 2 is '0' or false
COMPL_COND_VARIANCE_UNBIASED  C|B   B     Unbiased variance of Feature 1 when Feature 2 is '0' or false
PCT_OF_TRUE                   B     B     Percentage of occurrences where Feature 1 is '1' or true and Feature 2 is '1' or true
PCT_OF_FALSE                  B     B     Percentage of occurrences where Feature 1 is '1' or true and Feature 2 is '0' or false
NUM_OCCURENCE_WHEN_TRUE       B     B     Number of occurrences where Feature 1 is '1' or true and Feature 2 is '1' or true
NUM_OCCURENCE_WHEN_FALSE      B     B     Number of occurrences where Feature 1 is '1' or true and Feature 2 is '0' or false
MEAN_WHEN_TRUE                C     B     Mean of Feature 1 when Feature 2 is '1' or true
MEAN_WHEN_FALSE               C     B     Mean of Feature 1 when Feature 2 is '0' or false
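To make the conditional rules above concrete, the sketch below computes the quantities that MEAN_WHEN_TRUE, MEAN_WHEN_FALSE, COND_VARIANCE and COMPL_COND_VARIANCE describe, using raw columns. This is an illustration only: SymetryML derives these values from the PSR rather than from raw rows, the helper names are hypothetical, and the use of the population form for the plain variance is an assumption.

```python
import statistics


def mean_when(feature1, feature2, flag):
    """Mean of feature1 over the rows where the binary feature2 equals flag."""
    selected = [x for x, b in zip(feature1, feature2) if b == flag]
    return statistics.fmean(selected)


def variance_when(feature1, feature2, flag):
    """Population variance of feature1 where feature2 equals flag (assumed form)."""
    selected = [x for x, b in zip(feature1, feature2) if b == flag]
    return statistics.pvariance(selected)


amount = [10.0, 20.0, 30.0, 40.0]   # continuous feature (F1 = C)
fraud = [1, 0, 1, 0]                # binary feature (F2 = B)

print("MEAN_WHEN_TRUE     ", mean_when(amount, fraud, 1))
print("MEAN_WHEN_FALSE    ", mean_when(amount, fraud, 0))
print("COND_VARIANCE      ", variance_when(amount, fraud, 1))
print("COMPL_COND_VARIANCE", variance_when(amount, fraud, 0))
```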

Examples of PSR Contract

Here is a small example with the Iris data set. For a PSR Contract to be valid, all of its rules must evaluate to TRUE.

COUNT(sepal_length) >= 150
MEAN(sepal_width) > 3.0
STDDEV(petal_length) > 1.75
VARIANCE(sepal_width) > 0.1
VARIANCE(petal_width) > 0.5
COUNT(Iris_setosa) > 100
COUNT(Iris_versicolor) > 100
COUNT(Iris_virginica) > 100
# BI rules
COVAR(petal_length, petal_width) > 1.28
LINCORR(petal_width, petal_length) > 0.25
COND_COUNT(sepal_length, Iris_versicolor) >= 50
COND_MEAN(sepal_length, Iris_versicolor) >= 50
COND_STDDEV(sepal_length, Iris_versicolor) >= 50
COND_VARIANCE(sepal_length, Iris_versicolor) >= 50

COUNT(sepal_length) >= 150 AND MEAN(sepal_width) > 3.0
STDDEV(petal_length) > 1.75
VARIANCE(sepal_width) > 0.1 OR VARIANCE(petal_width) > 0.5
COUNT(Iris_setosa) > 100 AND (COUNT(Iris_versicolor) > 100 OR COUNT(Iris_virginica) > 100)
# BI rules
COVAR(petal_length, petal_width) > 1.28 OR LINCORR(petal_width, petal_length) > 0.25
COND_STDDEV(sepal_length, Iris_versicolor) >= 50 OR COND_VARIANCE(sepal_length, Iris_versicolor) >= 50
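The "all rules must evaluate to TRUE" semantics can be sketched with a small evaluator for single-predicate univariate rules such as those in the first listing. Everything here is an assumption made for illustration: the function names are hypothetical, the statistics are computed from raw columns instead of a PSR, the plain STDDEV and VARIANCE are taken to be the population forms, and AND / OR combinations and bivariate rules are not handled.

```python
import operator
import re
import statistics

# Univariate functions from the Uni_Rule production, mapped to stdlib helpers.
FUNCTIONS = {
    "COUNT": len,
    "MEAN": statistics.fmean,
    "STDDEV": statistics.pstdev,              # assumed population form
    "VARIANCE": statistics.pvariance,         # assumed population form
    "STDDEV_UNBIASED": statistics.stdev,      # sample form
    "VARIANCE_UNBIASED": statistics.variance, # sample form
}
COMPARATORS = {">=": operator.ge, "<=": operator.le,
               ">": operator.gt, "<": operator.lt}
RULE = re.compile(r"\s*(\w+)\((\w+)\)\s*(>=|<=|>|<)\s*([-+]?\d+(?:\.\d+)?)\s*")


def evaluate_rule(rule, data):
    """Evaluate one rule such as 'MEAN(sepal_width) > 3.0' against columns."""
    match = RULE.fullmatch(rule)
    if match is None:
        raise ValueError(f"cannot parse rule: {rule!r}")
    func, feature, op, threshold = match.groups()
    value = FUNCTIONS[func](data[feature])
    return COMPARATORS[op](value, float(threshold))


def validate_contract(rules, data):
    """A contract is valid only if every rule evaluates to True."""
    return all(evaluate_rule(rule, data) for rule in rules)


data = {
    "sepal_length": [5.1, 4.9, 6.3, 5.8],
    "sepal_width": [3.5, 3.0, 3.3, 2.7],
}
print(validate_contract(["COUNT(sepal_length) >= 4",
                         "MEAN(sepal_width) > 3.0"], data))
print(validate_contract(["COUNT(sepal_length) >= 150",
                         "MEAN(sepal_width) > 3.0"], data))
```

With only four rows, the second contract fails on its COUNT rule, so the contract as a whole is not validated even though the MEAN rule holds.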

PSR Contract Failure Action

Action Type
Action Choice

fed_psr_contract_snd_fail_action

  • fed_psr_contract_snd_fail_action_block - default

  • fed_psr_contract_snd_fail_action_allow

fed_psr_contract_rcv_fail_action

  • fed_psr_contract_rcv_fail_action_block - default

  • fed_psr_contract_rcv_fail_action_allow
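The sketch below shows how the two failure-action settings listed above might be consulted. The key and value names come from the table, and block is the documented default. The rest is assumed for illustration: that the settings live in a flat key/value map, that "allow" lets the PSR through despite a failed contract while "block" stops it, and the helper name `should_proceed`.

```python
SND_KEY = "fed_psr_contract_snd_fail_action"
RCV_KEY = "fed_psr_contract_rcv_fail_action"

# The table marks the *_block choice as the default for both action types.
DEFAULTS = {
    SND_KEY: SND_KEY + "_block",
    RCV_KEY: RCV_KEY + "_block",
}


def should_proceed(direction, contract_passed, key_map):
    """Decide whether a PSR is sent ('snd') or accepted ('rcv').

    Assumed semantics: a passing contract always proceeds; a failing one
    proceeds only when the configured action is the *_allow choice.
    """
    key = {"snd": SND_KEY, "rcv": RCV_KEY}[direction]
    action = key_map.get(key, DEFAULTS[key])
    return contract_passed or action.endswith("_allow")


# Hypothetical key map: send failures are allowed, receive failures use the default.
key_map = {SND_KEY: SND_KEY + "_allow"}
print(should_proceed("snd", False, key_map))
print(should_proceed("rcv", False, key_map))
```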

Peer Exploration

Another functionality enabled by SymetryML PSR technology is 'peer exploration'. It allows SymetryML's whole suite of exploration APIs to be used against the PSR of a peer. This makes it possible to perform various univariate and bivariate comparisons between different peer PSRs without ever seeing the raw data of those peers.

Secure Multi-Party Computation Mode

The PSR makes it possible to share certain summary features of the data without ever sharing the data itself. However, for the PSR not to be invertible (that is, not to allow reconstruction of the original data from the PSR), it needs to have processed a minimum number of rows. This minimum threshold depends on the number of attributes:

Minimum number of rows = Number of Attributes + 5

If this minimum is not met on a given peer at the time of syncing, the peer will not share its current PSR with the other nodes in the federation. The same logic applies to incremental synchronization: the delta of each sync (the amount of new data in a PSR since the last synchronization) must follow this rule for the synchronization to be allowed.

This can be a limitation for federations in which each peer does not have a lot of data. To circumvent this limitation, it is possible to use secure multi-party computation when peers share their PSRs. The protocol will only complete if the resulting PSR is not invertible.
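The row threshold above is simple arithmetic. As a minimal sketch, with hypothetical function names and assuming a sync is allowed when the row count is at least the minimum:

```python
def minimum_rows(num_attributes: int) -> int:
    """Minimum number of rows = Number of Attributes + 5."""
    return num_attributes + 5


def can_share(rows_in_update: int, num_attributes: int) -> bool:
    """True when a PSR (or the delta since the last sync) meets the minimum."""
    return rows_in_update >= minimum_rows(num_attributes)


# A project with 20 attributes needs at least 25 rows per synchronization.
print(minimum_rows(20))
print(can_share(24, 20))
print(can_share(25, 20))
```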

Federated Learning with SMPC can be enabled by adding the key-value pair fed_use_smpc=true inside the Federation Key Map Value when an administrator creates a federation. Please consult the sections mentioned in Federated Learning: Creating Federation for details.

Federated Project REST API at a Glance

Besides creating and joining a federation via REST endpoints, other operations are available. The following table lists all available REST endpoints for federated learning.

Federated Project Actions
Definition

Create a new federation

This REST endpoint is used to create a new federation. The user performing this operation becomes the owner of the federation.

Get Federation Info

This is a map of properties for a federation.

Get Encrypted Federation Info

Returns the federation information encrypted with a password. This is needed in order to share federation information with other peers that the federation admin wants to invite to join the federation. The response contains a token that can only be used once.

Join an existing federation

This REST endpoint allows a peer to join an existing federation.

Start Pulsing

This endpoint instructs your federated project to start pulsing; that is, the project periodically polls for messages from other nodes in the federation and sends its scheduled synchronization message.

Stop Pulsing

Stops synchronizing with the federation.

Get Error Log

Returns the error log for this project. Since many messages between nodes are exchanged asynchronously, this allows the user to see whether an error occurred while communicating with the other peers in the federation.

Get Sync Log

Returns a log of when this federated project was updated.

Get AWS Info

For AWS-based federations, returns information about the AWS SNS topic, SNS subscriptions and SQS queues. This is for troubleshooting purposes.

The following functionality of normal SymetryML projects is available in a Federated Project:

  • Exploration API

  • Modeling API

  • Prediction API

  • Data Source API

  • WebSocket API

Limitation of Federated Project

  • Features hashing is not available

By default, Random Forest models are disabled. They can be enabled by changing the SymetryML server configuration. For details, please see the rtlm.option.sml.fed.strict.mode key in the Installation guide configuration section.

If your project has more than 2000 attributes, you should be careful about how frequently you sync your projects. Please consult the relevant section for more information.

Federation Contract: A set of Boolean rules used to enforce the quality of an individual peer's PSR. Please see the PSR Contract section for details.

Create a Federation: A SymetryML Federation can use either Amazon services (AWS) or NATS in the backend to transmit the various messages that support its functionality. Peers can authenticate to the NATS network using either a user/password combination or a token. Please consult https://docs.nats.io/developing-with-nats/security for more details.

Join a federation: the steps for joining are:

  • Make sure that your clock is correctly synced using an NTP service or something similar. If a computer's clock in a federation is not correctly synced, it will have problems receiving messages from other nodes, because the service ignores many messages when the time a message was sent and the internal clock of the receiving computer disagree. Such errors can be seen using the Get Error Log REST endpoint.

  • Receive the one-time encrypted federation info along with the password to decrypt the message. This can be done over email, Skype or any other means that allows transferring some base64 encrypted text. The federation administrator can get this encrypted federation info using the Get Encrypted Federation Info REST endpoint.

  • Invoke the REST endpoint to join the federation (FedJoin) with the encrypted message and the password received from the federation admin. This message must also be encrypted using the user secret key.

  • Upon a successful result from the join request, one can start syncing with other nodes in the federation. This is done by invoking the Start Pulse REST endpoint.

Examples of PSR Contract: The second listing in the examples, which starts with COUNT(sepal_length) >= 150 AND MEAN(sepal_width) > 3.0, uses multiple predicates on each line, which is permitted per the Backus-Naur Notation.

PSR Contract Failure Action: A PSR Contract can be evaluated by each peer at two times: first, when sharing its own PSR with other peers in a federation, and second, when receiving another peer's PSR. It is possible to control what SymetryML does when a validation failure occurs at each of these times. Each peer specifies this when creating or joining a federation by setting the fed_psr_contract_snd_fail_action and fed_psr_contract_rcv_fail_action parameters inside the Federation Key Map Value. Please consult the sections mentioned in Federated Learning: Creating Federation for details.

Peer Exploration: If a particular peer wishes to block peer exploration, this functionality can be disabled or enabled.

Figure: Example of 3 nodes federation
Figure: Federated SML AWS Integration
Figure: NATS based SML Federation