Operations Guide | PUBLIC
Document Version: 2108 – 2021-11-12
Setup Guide for Data Engineering
© 2021 SAP SE or an SAP affiliate company. All rights reserved.
Content
1 Vision and Business Scenarios
2 Architecture and Component Overview
3 Setup Information
3.1 Prerequisites
3.2 Manage Data Agents
3.3 Install the Agent on the Virtual Machine
3.4 Manage Connections
3.5 Manage Data Replicas
1 Vision and Business Scenarios
Data Engineering (DE) provides data-acquisition capabilities in SAP Digital Manufacturing Cloud, enabling customers to source data from their on-premise systems such as SAP ME, SAP MII, SAP ECC, and SAP S/4HANA, as well as other legacy or third-party systems.
This data is then available for analysis using the SAP Digital Manufacturing Cloud for Insights component, covering various manufacturing use cases.
To achieve this goal, data from on-premise systems must first be replicated (copied) to the cloud. Replicating or sourcing this data is one of the goals of Data Engineering.
Using Data Engineering, you can import data from multiple sources:
● Own On-Premise systems or software in Manufacturing Execution (SAP ME)
● Own On-Premise systems or software in SAP MII
● Own On-Premise systems or software in SAP ECC or S/4HANA
Once data is available on the Cloud, Data Engineering allows you to consume analytical reports at a global scale (bringing in data from multiple plant sources), as well as analyze data at a plant and shift level.
Using SAP Digital Manufacturing Cloud, you (typically in the role of plant supervisor) can define a digital twin of the plant layout and line setup to configure a visual representation of the factory floor.
You can gain real-time insights on execution activities carried out in SAP Digital Manufacturing Cloud for Execution (both in the Cloud as well as on Edge). The solution supports big-data scenarios (both data acquisition as well as analysis) to help define the right intelligent applications that could be deployed, both in Cloud and Edge.
You can leverage SAP technology offerings such as SAP Analytics Cloud (SAC) without large-scale integration needs or custom developments.
For more information on the functional aspects and usability, see Data Engineering - An Overview, under the Manufacturing Insights Configuration tab, in Application Help for Insights.
2 Architecture and Component Overview
This diagram represents the architecture overview.
3 Setup Information
This section provides an overview of the required steps to set up Data Engineering for SAP Digital Manufacturing Cloud.
The setup includes the following steps:
● Manage Data Agents: Create and manage the data agents (source).
● Use Shell Script: Run the shell script to complete the installation of the data agent.
● Manage Connections: Create and manage connections between the source and target databases.
● Manage Data Models: Replicate source data in the target or destination database.
3.1 Prerequisites
Before proceeding with the setup, ensure that the following prerequisites are in place:
● A clean Linux VM (no previous docker installation) with 32 GB RAM
● The OS for the VM must be Ubuntu 18.04, Ubuntu 20.04, or RHEL 7.9. No other operating systems are supported for Data Engineering.
● 100 GB hard drive space and sudo permissions on the server
● The VM must be close to the source of the data, preferably in the same network
● Your SAP Digital Manufacturing Cloud tenant is hosted in an Azure data center (EU20/US20)
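The VM-related prerequisites above can be checked before the installation is attempted. The following is a minimal sketch, not part of the delivered tooling: the function names are hypothetical, the RAM and disk figures are treated as minimums (an assumption), and the ID and VERSION_ID values are assumptions about how the supported distributions identify themselves in an os-release style file.

```python
# Values from the prerequisites list; treating RAM and disk as minimums is an assumption.
SUPPORTED_OS = {("ubuntu", "18.04"), ("ubuntu", "20.04"), ("rhel", "7.9")}
MIN_RAM_GB = 32
MIN_DISK_GB = 100


def parse_os_release(text):
    """Parse KEY=VALUE lines (os-release style) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip().strip('"')
    return fields


def check_prerequisites(os_release_text, ram_gb, disk_gb, docker_installed):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    info = parse_os_release(os_release_text)
    os_id = info.get("ID", "").lower()
    os_version = info.get("VERSION_ID", "")
    if (os_id, os_version) not in SUPPORTED_OS:
        problems.append(f"unsupported OS: {os_id} {os_version}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM is {ram_gb} GB, expected {MIN_RAM_GB} GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"disk space is {disk_gb} GB, expected {MIN_DISK_GB} GB")
    if docker_installed:
        problems.append("docker is already installed; a clean VM is required")
    return problems
```

The caller supplies the measured values (for example, the text of the os-release file and the RAM size), so the check itself stays free of system calls and can be run anywhere.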
Raise an incident on the MFG-DM-DMI component with the subject Request to onboard Data Engineering. In the description, mention the SAP BTP sub-account ID and the Digital Manufacturing Cloud application URL of the tenant to be onboarded.
Proceed with the next steps once this ticket has been processed.
3.2 Manage Data Agents
You can create, delete, check the status, download the script, and copy the connection string of data agents.
Perform the following steps:
1. From the SAP Fiori launchpad, open the Manufacturing Insights Configuration tab.
2. Choose Manage Data Agents to open the app. You can see a list of existing data agents.
3. Choose Create to create a new data agent and enter the relevant information.
4. Additional entry information for selected fields:
Information
Field Name        Description
Name              Enter a data agent name. The name can be at most 20 characters long, must be alphanumeric without spaces, and cannot start with a number.
Description       Enter a brief description of the data agent.
Proxy Protocol    Choose between http and https.
Proxy Host        Enter the name of a proxy host (optional; use only when your network has a proxy setup).
Proxy Port        Enter the proxy port number.
Proxy User        Enter a proxy user name.
Proxy Password    Enter a proxy password.
5. Choose Save. The system associates the newly created data agent with a connection string.
Note
If you did not copy the connection string from the dialog, you can view it by selecting a data agent and choosing the Show Connection String button. You can also copy the connection string from there.
6. Download the script using the Download Script button.
7. Install the agent on the Virtual Machine.
1. Copy the connection string and provide it as input to the shell script to complete the installation of the agent. For more details on running the shell script on the Linux VM you created, see Install the Agent on the Virtual Machine.
Note
○ After creating the data agent, you must deploy it to the IoT hub (a note on the IoT hub is provided below). To do this, execute the shell script on the Linux VM.
○ You can delete a data agent only if no open connections exist.
8. Once you have created a data agent, you can directly create a new connection, instead of exiting the Manage Data Agents app and navigating to the Manage Connections app. Choose Create Connection to directly create a connection from the Manage Data Agents app.
9. Separately, you can also download the script using the Download Script button.
10. Once the script has run successfully, the status shows as Connected.
Other statuses include:
● Awaiting deployment: The script has not run yet.
● Deployed with errors: An error occurred while the script was running, for example, because the IoT hub was down.
Note
A note on the IoT hub: IoT refers to the Internet of Things. The IoT hub facilitates communication and data transfer over the internet between cloud systems and the data agent installed on the on-premise system.
Devices spread across various locations can connect through this common hub. For example, when a cloud system requests data, the interaction with the docker container (in the on-premise system) happens through the IoT hub.
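The naming rule for the Name field (at most 20 characters, alphanumeric without spaces, not starting with a number) can be checked before the form is submitted. The sketch below is illustrative only: the function name is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits, which the guide does not state explicitly.

```python
import re

# One leading letter followed by up to 19 letters or digits (20 characters in total).
# Restricting the name to ASCII letters and digits is an assumption.
AGENT_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]{0,19}")


def is_valid_agent_name(name):
    """Return True if the name satisfies the data agent naming rule."""
    return AGENT_NAME_PATTERN.fullmatch(name) is not None
```

For example, a name such as Plant01Agent passes the check, whereas 1agent (starts with a number) and my agent (contains a space) do not.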
3.3 Install the Agent on the Virtual Machine
You must install and configure the data agent to point to the device created in Cloud.
This topic describes the steps to execute the script, monitor progress and possible errors, perform troubleshooting, and provides some additional information.
Prerequisites
For more information on prerequisites, refer to the Prerequisites topic.
Steps
Perform the following steps to execute the script:
1. Place the script downloaded from the application anywhere in the Linux VM and execute the locate 'agentSetup.sh' command to get the location of the scripts.
2. Change the directory to the location where the script is placed.
3. Verify whether the scripts have Execute permissions by using the ls -l | grep agentSetup.sh command. If not, run the sudo chmod +x agentSetup.sh command to set the Execute permission.
4. Execute the script by using the sudo ./agentSetup.sh command.
5. The system asks you to enter a connection string. Enter the connection string of the agent (as copied from step 6 in the Manage Data Agents [page 5] topic), when prompted to do so.
Result: The script prints the major progress milestones on the screen, along with the detailed progress in the log files. The log entries are written into the dmclog.log file.
6. On successful execution of the script, a message is displayed on the screen. If you run into an error, refer to the second point in the Troubleshooting section.
7. Run the sudo docker container list command and verify that the list contains the four services named results-buffer, edgeHub, data-agent, and edgeAgent. After a minute, rerun the command and confirm that the four services have the status: Up.
8. If the services listed in step 7 do not have the status Up, contact your SAP Digital Manufacturing Cloud Support Team. Typically, agent installation takes 1-5 minutes, depending on the network speed and proxy (if enabled).
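The verification in steps 7 and 8 can be automated by inspecting the container listing. The sketch below is one possible approach, not the behavior of the setup script: it assumes the listing has been captured as one tab-separated "name, status" line per container (for example, by passing a format string such as '{{.Names}}\t{{.Status}}' to the docker --format option), and the function names are hypothetical.

```python
# The four services named in step 7.
REQUIRED_SERVICES = ("results-buffer", "edgeHub", "data-agent", "edgeAgent")


def service_states(listing):
    """Map container name to status text from tab-separated 'name<TAB>status' lines."""
    states = {}
    for line in listing.splitlines():
        if "\t" not in line:
            continue  # skip blank or unexpected lines
        name, status = line.split("\t", 1)
        states[name.strip()] = status.strip()
    return states


def services_not_up(listing):
    """Return the required services that are missing or whose status does not start with 'Up'."""
    states = service_states(listing)
    return [name for name in REQUIRED_SERVICES
            if not states.get(name, "").startswith("Up")]
```

An empty result means that all four services report the status Up. A non-empty result after the one-minute wait described in step 7 corresponds to the situation in step 8.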
Troubleshooting
Problem: Screen messages are pausing indefinitely.
Solution: Verify the contents of the dmclog.log file to see if all steps during execution were carried out correctly. Create a ticket with the contents of the log file if the issue persists.
Problem: If packages of different versions already exist on the installation, the script does not execute.
Solution: You can see the exact package version mismatch(es) among the four packages. An exhaustive list of packages and their corresponding versions is provided, as shown in the table below.
Comment: See the table below to understand what packages (and corresponding versions) need to be installed.
Packages and Versions
Package Name    Version
Moby-cli        3.0.11+azure-2
Moby-engine     3.0.11+azure-2
IotEdge         1.0.9.1-1
Libiothsm       1.0.9.1-1
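The version conflict described in the troubleshooting entry can be detected in advance by comparing the installed package versions against the table above. This is a minimal sketch under stated assumptions: the function name is hypothetical, package names are compared in lowercase (the exact names used by the package manager may differ from the table), and a package that is not installed at all is not treated as a conflict.

```python
# Versions from the table above; lowercase names are an assumption about
# how the package manager reports them.
REQUIRED_PACKAGES = {
    "moby-cli": "3.0.11+azure-2",
    "moby-engine": "3.0.11+azure-2",
    "iotedge": "1.0.9.1-1",
    "libiothsm": "1.0.9.1-1",
}


def version_mismatches(installed):
    """Compare installed versions with the required ones.

    installed maps package name to version string. The result maps each
    conflicting package to an (installed, required) tuple. Packages that
    are absent are skipped, since only existing versions can conflict.
    """
    normalized = {name.lower(): version for name, version in installed.items()}
    mismatches = {}
    for name, required in REQUIRED_PACKAGES.items():
        found = normalized.get(name)
        if found is not None and found != required:
            mismatches[name] = (found, required)
    return mismatches
```

An empty result indicates that no conflicting versions of the four packages were found.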
3.4 Manage Connections
After the setup of the data agent, you need to create a connection to the source database. For this, perform the following steps:
1. From the Manufacturing Insights Configuration tab, choose Manage Connections. You can see a list of all existing connections.
2. Choose Create to create a new connection and enter the relevant information.
3. Additional entry information for selected fields:
Information
Field Name          Description
Connection Name     The name of the connection.
Data Agents         Displays connected or disconnected data agents.
System Type         Select a system type (such as SAP MII OEE, SAP ME WIP, SAP ME ODS, and so on).
System ID           Enter a valid system ID. This entry is mandatory (maximum 10 characters, alphanumeric, no spaces).
Connection Type     Database (default)
Database Type       Select the required database type (such as HANA, Oracle, or SQL Server).
Database Service    Select between Service ID and Service Name (only for Oracle database).
Service ID          Enter a valid Service ID, for example, PAUME.
Schema              Enter a valid schema. A schema is mandatory for a HANA database and optional for an Oracle database. For an Oracle database, if no schema is provided, the default schema associated with the user is used.
Database Name       Enter a valid database name (only for SQL Server database).
Hostname            Enter the hostname or IP address of the database you are connecting to.
Port                Enter a port number.
Note
The data agent you created appears only if it is in either the Connected or the Disconnected state. A data agent awaiting deployment does not appear in the list of agents.
4. After entering the details, choose Test to test your connection.
5. Once the system displays the message: Test Successful, choose Save. If you run into any errors, verify the database details you entered in the previous step.
6. Using the Suspend and Resume buttons, you can suspend or resume one or more connections that are in the Connected or Disconnected state. Once a connection is suspended, all activated data models associated with it are also suspended, and a warning is issued. Likewise, when a connection is resumed, the connection and its associated (suspended) data models resume.
Note
If you suspend more than one connection and one of them is already in the Suspended state, that connection remains unchanged. The same applies when you resume connections.
7. Once you have created a connection, you can directly create a new data model, instead of exiting the Manage Connections app and navigating to the Manage Data Models app. Choose Create Data Model to directly create a data model from the Manage Connections app.
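The field rules from the table in step 3 can be validated before the connection is tested. The sketch below is illustrative and covers only the rules stated in the table: the dictionary keys and the function name are hypothetical, and requiring a database name for SQL Server is an assumption, since the table only says that the field applies to SQL Server.

```python
import re

# System ID: mandatory, at most 10 alphanumeric characters, no spaces.
SYSTEM_ID_PATTERN = re.compile(r"[A-Za-z0-9]{1,10}")


def validate_connection(fields):
    """Return a list of rule violations for the given connection fields."""
    errors = []
    if not SYSTEM_ID_PATTERN.fullmatch(fields.get("system_id", "")):
        errors.append("System ID must be 1-10 alphanumeric characters without spaces")
    database_type = fields.get("database_type", "")
    if database_type == "HANA" and not fields.get("schema"):
        errors.append("Schema is mandatory for a HANA database")
    # For Oracle the schema is optional: the user's default schema is used.
    if database_type == "SQL Server" and not fields.get("database_name"):
        # Assumption: the guide lists this field for SQL Server without
        # stating whether it is mandatory.
        errors.append("Database Name is expected for a SQL Server database")
    return errors
```

An Oracle connection without a schema passes this check, which mirrors the rule that the default schema associated with the user is used in that case.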
Note
For non-schema owners (users other than the default users associated with the schema), granting privileges is highly recommended. Without them, the data models fail when the initial load is triggered (applicable only to HANA and Oracle databases, not to SQL Server). You can use the following privileges:
Privileges
Oracle                HANA
create any table      create any
create any trigger    select
select any table      insert
insert any table      drop
drop any table        trigger
drop any trigger
For SQL Server connections, you must use the following permissions on the provided database (that is - the database name used while creating the connection):
Permissions
Database Permissions (SQL Server)
insert
select
delete
alter any schema
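The privilege and permission lists above can be kept as a lookup and compared against what a database user has actually been granted. This is a sketch only: the function name is hypothetical, the lists are transcribed from the tables above, and how the granted privileges are obtained from each database is outside its scope.

```python
# Transcribed from the privilege and permission tables above.
REQUIRED_PRIVILEGES = {
    "Oracle": {
        "create any table", "create any trigger", "select any table",
        "insert any table", "drop any table", "drop any trigger",
    },
    "HANA": {"create any", "select", "insert", "drop", "trigger"},
    "SQL Server": {"insert", "select", "delete", "alter any schema"},
}


def missing_privileges(database_type, granted):
    """Return the required privileges that are not in the granted list, sorted by name."""
    required = REQUIRED_PRIVILEGES[database_type]
    have = {privilege.strip().lower() for privilege in granted}
    return sorted(required - have)
```

The comparison is case-insensitive, so privileges reported in uppercase by the database are matched as well.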
3.5 Manage Data Replicas
The data agent and connection must be up and running to configure the Manage Data Replicas app.
To replicate data, you need to create a data replica. This data replica contains information on the database to get data from (source), the database to where the data must be copied (target), and the data to be replicated or copied.
To create a data replica, perform the following steps:
1. From the Manufacturing Insights Configuration tab, choose the Manage Data Replicas app. You can see a list of available data replicas.
2. Choose Create to create a new data replica. You can create either a single data replica or multiple data replicas at once (see the note after this step). Enter the Data Replica Name, Description, and System Type.
Note
○ The Create button is a dropdown, with two options: New Data Replica and Bulk Data Replicas. Using the New Data Replica option, you can create a new data replica. Using the Bulk Data Replicas option, you can create multiple data replicas, at once. The need to individually create multiple data replicas is therefore eliminated.
○ The System Type must be the same as the one selected while creating a connection. If you select a different System Type, the connection cannot be established.
3. Choose Next Step.
4. Select a connection (in the Select Connection field box) to connect to the database. On connecting, enter three characters of the table name in the search field. Based on the characters entered, the system populates tables from the source system that match the search criteria.
5. Select the required source table. The system copies data from this source table. The top 10 data records are displayed.
Note
The list of source tables is restricted to the list of tables used in SAP Digital Manufacturing Cloud for Insights. For more information, refer to the Data Replication topic in the Integration Guide (except for PQM).
6. Choose Next Step to navigate to the Select Target page. The target table, to which the data is replicated, is also shown with its top 10 records. This target table is selected automatically, based on the chosen source table.
7. Choose Next Step.
8. Configure the Filters and Replication Schedule (optional). On the left pane, under Select Columns, select a filter. In the Value field, enter a value corresponding to the selected filter. Use filter expressions such as Include, Exclude, equal to, greater than, less than, and so on, to complete the expression. The section below shows the filtered data, based on the selected filter. Choose Update to apply the filter. The system displays the top 10 data records (with the filter applied).
Note
Recommendation for Filters on Quality tenants (users)/Test and Demo tenants (partners): Apply suitable filters to limit the data volumes being replicated, as these are smaller sized tenants.
9. Navigate to the Replication Schedule tab to set a schedule for data replication, either in minutes, hours, or days. This step is optional.
Note
If you do not set a schedule, the system displays the following message during activation: No schedule is provided. Data will load only once (initial load). Choose Yes. If one minute is selected as the replication schedule, the delta load executes once every minute.
10. Once you have selected the target, you can choose either Save or Activate. If you choose Save, the data replica is saved but the data is not loaded. Choose Activate to save and activate the data replica. Data loading begins only after you have chosen Activate. You can always save a data replica and activate it later, as required. Activation results in the creation of the target tables (if they do not already exist), triggers, and a change log table. The change log table is used by the agent to determine delta changes once the initial load is completed. You can verify the following changes in your schema:
○ Namespace of the objects created
○ Trigger definition
○ Shadow table/change log table
The deactivation step (if performed) is expected to remove the trigger definition, as well as the shadow/change-log table.
11. On activation, the target table gets created and the system displays a confirmation message, stating that the data replica is activated. Once the data is copied, the Initial Load Status displays as: N of N rows are copied. Here, N represents the number of rows.
Note
The number of rows copied may not match exactly, because the source database might change during the initial load. It takes approximately 1-2 minutes for successful replication. On successful replication, the row turns green.
12. If you do not want to wait for the normal delta load to bring data, you can always trigger an 'on-demand' delta load, using the Trigger Now button.
Note
If the Run ID shows In progress in any one of the three columns (Delta Load Triggered, Upload to Cloud, Insert to Database), then the Trigger Now button is disabled. It is enabled only if the status for the three columns shows either Completed or Failed.
13. You can terminate an ongoing run (for example, if the run is taking abnormally long to complete) using the Terminate Current Run button. This button not only terminates the existing run but also starts a new one: the new run is scheduled immediately, although it may take a short while (for example, one minute) to be reflected.
Note
If the Run ID shows In progress in any one of the three columns (Delta Load Triggered, Upload to Cloud, Insert to Database), then the Terminate Current Run button is enabled. Otherwise, it remains disabled.
14. You can also select any data replica to copy or delete it. You can edit the activated data replica to include additional filters or any changes that you make. You can suspend or resume data replicas, using the Suspend and Resume buttons.
Note
You cannot suspend certain data replicas, such as those with the status New, Deletion Started, or Deletion Failed, and replicas whose initial load is running. The Suspend feature works only for activated replicas. Once suspended, you cannot activate or delete a data replica. You can edit and save a suspended data replica. If you choose Resume, then all the data replicas that are in the suspended state are resumed. All operations of the replica are allowed once you perform the Resume operation.
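The button rules described in the notes for steps 12 to 14 reduce to a few simple conditions on the run and replica state. The sketch below restates them in code for illustration; it is not the application's implementation. The dictionary keys, function names, and parameters are hypothetical, and the set of statuses that block Suspend contains only the examples named in the note, which may not be exhaustive.

```python
# The three run columns named in the notes (keys are illustrative).
RUN_COLUMNS = ("delta_load_triggered", "upload_to_cloud", "insert_to_database")

# Statuses named in the note as not suspendable; the note says "such as",
# so this set may be incomplete.
NON_SUSPENDABLE_STATUSES = {"New", "Deletion Started", "Deletion Failed"}


def trigger_now_enabled(run):
    """Trigger Now is enabled only if all three columns show Completed or Failed."""
    return all(run[column] in ("Completed", "Failed") for column in RUN_COLUMNS)


def terminate_enabled(run):
    """Terminate Current Run is enabled if any column shows In progress."""
    return any(run[column] == "In progress" for column in RUN_COLUMNS)


def can_suspend(status, activated, initial_load_running):
    """Suspend works only for activated replicas that are not in a blocked state."""
    return (activated
            and not initial_load_running
            and status not in NON_SUSPENDABLE_STATUSES)
```

With these rules, a run that is still in progress in any column allows only Terminate Current Run, while a run that has completed or failed in all three columns allows only Trigger Now.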