Open the Data Sources accordion and click the Add Data button. A New Data Source wizard appears.
Data Sources: Select Add data button
Select the preferred Data Source Type. The wizard form changes depending on the type selected, as shown in the following figures.
S3 Data Source
SFTP Data Source
Local Data Source
Redshift Data Source
JDBC Data Source
HTTP Data Source
Ensure your data source settings are valid.
Click Finish to add the data source.
Verify New Data Source
Google Cloud Storage (GCS)
For Data Source Type, select Google Cloud Storage. Enter the GCS Project name (optional), GCS Access Key, and GCS Secret Key, and then navigate to the data file you wish to use for the data source. For more information on Google Access and Secret keys, see the Cloud Storage HMAC guide.
GCS: Enter account information
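GCS HMAC credentials can be created ahead of time with the gsutil CLI; a minimal sketch, assuming you have a service account (the email address below is a placeholder, substitute your own):

```shell
# Sketch only: create an HMAC key pair (Access Key + Secret) for a service account.
# The service-account email is a placeholder.
gsutil hmac create sa-name@my-project.iam.gserviceaccount.com

# Optionally list the project's existing HMAC keys to confirm creation.
gsutil hmac list
```

The command prints the Access Key and Secret, which you can paste into the wizard fields above.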
Oracle Cloud Infrastructure (OCI)
Consult the Amazon S3 Compatibility API guide for details on how to configure your Oracle OCI account for use with SymetryML.
For Data Source Type, select OCI. Enter the OCI Access Key, OCI Secret Key, OCI Namespace, and OCI Region, and then navigate to the data file you wish to use for the data source.
OCI: Enter account information
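The Access/Secret pair used by OCI's S3 Compatibility API is a Customer Secret Key, and the namespace belongs to your tenancy; both can be obtained with the OCI CLI. A sketch, where the user OCID and display name are placeholders:

```shell
# Sketch only: create a Customer Secret Key (the S3-compatible Access/Secret pair).
# The --user-id OCID is a placeholder; substitute your own user's OCID.
oci iam customer-secret-key create \
    --user-id ocid1.user.oc1..exampleuniqueID \
    --display-name "symetryml-datasource"

# Look up the Object Storage namespace for the tenancy.
oci os ns get
```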
Azure Blob Storage (AZB)
For Data Source Type, select Azure Blob Storage.
After selecting Azure, choose your authentication method: SAS Token, Shared Token, or Authentication String.
Azure: Select Authentication Method
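If you use SAS Token authentication, a token can be generated ahead of time with the Azure CLI; a minimal sketch, with placeholder account and container names and an illustrative expiry date:

```shell
# Sketch only: generate a read/list SAS token for a blob container.
# Account and container names are placeholders; requires an Azure AD login.
az storage container generate-sas \
    --account-name mystorageaccount \
    --name mycontainer \
    --permissions rl \
    --expiry 2026-01-01T00:00:00Z \
    --auth-mode login --as-user \
    --output tsv
```

The printed token string is what you supply to the wizard when SAS Token is selected.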
Spark Enabled Data Sources
Certain data sources can leverage Apache Spark for distributed learning and encoder creation. The data sources that support Spark are:
AWS S3
Azure Blob Storage
Oracle Object Storage
Google Cloud Storage
To enable Spark for a supported data source, select the Enable Spark checkbox, choose the preferred version of Spark, and supply the master URL (for example, spark://host:7077).
Spark Required Inputs
Additional Spark-specific configurations can be supplied by clicking the Spark Options button and entering the corresponding key-value configuration pairs.
Spark Optional Inputs
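For example, key-value pairs such as the following could be entered through the Spark Options dialog (the values are illustrative, not recommendations):

```
spark.executor.memory           4g
spark.executor.cores            2
spark.driver.memory             2g
spark.sql.shuffle.partitions    200
```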
For a full list of Spark-specific configuration parameters, see the official Spark documentation.
Uploading Data Source
Files local to the client's machine can be uploaded to an SFTP server via the Upload Wizard: select a destination data source, and then select a local file from the user's computer.
Upload File
Viewing Data Sources
To view your newly created data source:
Double-click the data source node under the Data Sources accordion.
Inspect your newly added data.
Data Source Preview
Deleting Data Sources
To delete a data source:
Right-click the Data Source node, and then click Delete.
Data Source Delete
Editing Data Sources
To edit a data source:
Right-click the Data Source node, and then click Edit.
Update the fields as appropriate, and then click Next to continue.
After validating your data source, click Finish to commit the changes.