
Data Sources


Last updated 2 years ago

Creating New Data Sources

To create a new data source:

  1. Open the Data Sources accordion and click the Add Data button. A New Data Source wizard appears.

  2. Select the preferred Data Source Type. The wizard forms change depending on the type selected, as shown in the following figures.

  3. Ensure your data source settings are valid.

  4. Click Finish to add the data source.

Google Cloud Storage (GCS)

Oracle Cloud Infrastructure (OCI)

For Data Source Type, select OCI. Enter the OCI Access Key, OCI Secret Key, OCI Namespace, and OCI Region, and then navigate to the data file from which you want to create the data source.
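The Namespace and Region fields map naturally onto the endpoint of Oracle's Amazon S3 Compatibility API. The sketch below shows how such an endpoint is formed from those two values. It is an illustration only: the function name is hypothetical, the sample values are placeholders, and whether SymetryML builds the endpoint this way internally is an assumption.

```python
def oci_s3_compat_endpoint(namespace: str, region: str) -> str:
    """Build the OCI S3-compatible endpoint URL for a namespace and region.

    Hypothetical helper for illustration; not part of SymetryML.
    """
    namespace = namespace.strip()
    region = region.strip().lower()
    if not namespace or not region:
        raise ValueError("namespace and region are both required")
    # Documented OCI pattern:
    # https://<namespace>.compat.objectstorage.<region>.oraclecloud.com
    return f"https://{namespace}.compat.objectstorage.{region}.oraclecloud.com"


if __name__ == "__main__":
    # Placeholder values, not a real tenancy.
    print(oci_s3_compat_endpoint("examplenamespace", "us-ashburn-1"))
```

Checking that this endpoint is reachable with your Access Key and Secret Key before opening the wizard can help separate credential problems from typos in the namespace or region.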

Azure Blob Storage (AZB)

For Data Source Type, select Azure Blob Storage.

After selecting Azure, choose your authentication method: SAS Token, Shared Token, or Authentication String.
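If the Authentication String option corresponds to a standard Azure Storage connection string (an assumption; this page does not define the format), the value is a semicolon-separated list of key=value pairs. The following minimal sketch parses such a string so it can be sanity-checked before being pasted into the wizard. The function name and sample values are hypothetical.

```python
from typing import Dict


def parse_connection_string(conn_str: str) -> Dict[str, str]:
    """Split an Azure-style connection string into its key/value fields."""
    fields: Dict[str, str] = {}
    for part in conn_str.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: base64 account keys may end in '='.
        key, sep, value = part.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed segment: {part!r}")
        fields[key] = value
    return fields


if __name__ == "__main__":
    # Placeholder account name and key, for illustration only.
    sample = (
        "DefaultEndpointsProtocol=https;AccountName=exampleaccount;"
        "AccountKey=abc123==;EndpointSuffix=core.windows.net"
    )
    for name, value in parse_connection_string(sample).items():
        print(name, "=", value)
```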

Spark Enabled Data Sources

Certain data sources can leverage Apache Spark for distributed learning and encoder creation. The following data sources support Spark:

  • AWS S3

  • Azure Blob Storage

  • Oracle Object Storage

  • Google Cloud Storage

To enable Spark for one of the supported data sources, select the Enable Spark checkbox, choose the preferred version of Spark, and supply the master URL.

Additional Spark-specific configurations can be supplied by clicking the Spark Options button and entering the corresponding key-value configuration pairs.
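As an illustration of the two kinds of input described above, the sketch below checks a master URL against the common Spark forms (spark://host:port, k8s://..., yarn, local, local[N], local[*]) and merges it with a set of key-value options. The function name and sample values are hypothetical, the pattern deliberately covers only those common forms, and how SymetryML passes these values to Spark is not described on this page.

```python
import re
from typing import Dict

# Common master URL forms only; less common variants are not covered.
_MASTER_PATTERN = re.compile(
    r"^(spark://\S+|k8s://\S+|yarn|local(\[(\d+|\*)\])?)$"
)


def validate_spark_settings(master_url: str, options: Dict[str, str]) -> Dict[str, str]:
    """Return one flat config dict, or raise ValueError on obviously bad input."""
    if not _MASTER_PATTERN.match(master_url):
        raise ValueError(f"unrecognized master URL: {master_url!r}")
    for key in options:
        # Spark configuration property names start with 'spark.'.
        if not key.startswith("spark."):
            raise ValueError(f"not a Spark property name: {key!r}")
    # 'spark.master' is the standard property that holds the master URL.
    return {"spark.master": master_url, **options}


if __name__ == "__main__":
    settings = validate_spark_settings(
        "spark://spark-master.example.com:7077",  # placeholder host
        {"spark.executor.memory": "4g", "spark.executor.cores": "2"},
    )
    for name, value in settings.items():
        print(f"{name}={value}")
```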

Uploading Data Source

Files on the client's machine can be uploaded to an SFTP server via the Upload Wizard. To do so, select a destination data source and then select a local file from your computer.

Viewing Data Sources

To view your newly created data source:

  1. Double-click the data source node under the Data Sources accordion.

  2. Inspect your newly added data.

Deleting Data Sources

To delete a data source:

  1. Right-click the Data Source node, and then click Delete.

Editing Data Sources

To edit a data source:

  1. Right-click the Data Source node, and then click Edit.

  2. Update the fields as appropriate, and then click Next to continue.

  3. After validating your data source, click Finish to commit the changes.

For Data Source Type, select Google Cloud Storage. Enter the GCS Project name (optional), GCS Access Key, and GCS Secret Key, and then navigate to the data file from which you want to create the data source. For more information on Google access and secret keys, see the Cloud Storage HMAC guide.

Please consult the Amazon S3 Compatibility API guide for details on how to configure your Oracle OCI account for use with SymetryML.

For a full list of Spark-specific configuration parameters, see the official Spark documentation page.
Figures: Data Sources: Select Add data button; S3 Data Source; SFTP Data Source; Local Data Source; Redshift Data Source; JDBC Data Source; HTTP Data Source; Verify New Data Source; GCS: Enter account information; OCI: Enter account information; Azure: Select Authentication Method; Spark Required Inputs; Spark Optional Inputs; Upload File; Data Source Preview; Data Source Delete; Data Source Edit.