What, How, Why: Sitecore Data Exchange Framework

I have been experimenting with Sitecore 9 and DEF 2.0.1 over the past couple of weeks. I’ve gotta say, it has been a long time since I worked on an integration tool, and it has been an interesting journey reading the different documentation and articles out there on DEF.

In this series, I will walk you through the journey I had to understand the different aspects of the ETL process and how to identify and work with the different components of the Sitecore Data Exchange Framework. Also, I will include a number of blogs that I wrote on how to play around and customize the Sitecore Provider to tailor to your different needs.

Disclaimer: it is highly likely that you will know most of what is written here if you have worked with DEF before, but I was hoping to share a tip or two on how to work with it. Also, I hope this will help Sitecorians who are new to DEF understand and implement this framework.

Happy Sitecoring!

Sitecore Data Exchange Framework: The Why

One of the important aspects of developing an intelligent framework is data integration. It combines data from different sources and processes it in a way that enables seamless integration between different systems, including Sitecore!

ETL (Extract, Transform, Load) is the common approach to integrating data: data is extracted from a source system, transformed into a format that is compatible with the destination, and loaded into the destination system. Sitecore Data Exchange Framework plays its important role in the transformation stage, where it maps attributes coming from different systems. Data cleansing happens during extraction, and only the significant data is loaded into the target system, which keeps the integration between different providers seamless.
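To make the three ETL stages concrete, here is a minimal Python sketch that has nothing to do with the DEF API itself. The sample CSV, the field names, and the `extract`/`transform`/`load` functions are all assumptions I made up for illustration.

```python
import csv
import io

# Hypothetical source data; a real integration would read a file or call an API.
SOURCE_CSV = "name,email\n Jane Doe ,JANE@EXAMPLE.COM\nJohn Roe,john@example.com\n"


def extract(text):
    """Extract: read the raw rows from the source system."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cleanse the values and rename fields to match the target."""
    return [
        {"Full Name": row["name"].strip(), "Email": row["email"].strip().lower()}
        for row in rows
    ]


def load(records, target):
    """Load: write the transformed records into the target system (a dict here)."""
    for record in records:
        target[record["Email"]] = record
    return target


target_system = load(transform(extract(SOURCE_CSV)), {})
```

Keying the target by email is just a convenient way to show that loading the same record twice would overwrite it rather than duplicate it.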

So, Why Sitecore Data Exchange Framework?

1- It is supported by Sitecore! Yup! nothing is better than having the Experience Platform itself supporting a framework that is highly needed these days.

2– It provides a consistent model for reading, writing, and mapping data between integrated systems using the ETL process.

3- It combines past experiences into a single framework, which allows for better enhancements and maturity while maintaining the flexibility of existing integration approaches.

4- It is easy to use and extend (highly abstracted) and supports a number of different 3rd party systems seamlessly. You can always build your own provider to integrate with a system that is not supported yet, and only minimal Sitecore expertise is needed to develop one.

5- It helps you automate a number of repetitive tasks through an easy Sitecore interface, leaving you to focus on customizing the framework based on your business needs rather than on how to get the integration to work.

6– Data Exchange Framework pipelines are configured in Sitecore, but they do not have to run within Sitecore. It is possible to run a pipeline from a standalone .NET application, which can help minimize the dependency on Sitecore in production environments at runtime.

7– Do you have something else in mind to share? Please comment here or share on Twitter via #WhenToSitecore

My takeaways on the framework:

1– You’ll need to go back and forth between version 2.0.1 and version 1.4.1 documentation to get access to all needed information.

2– It works best for straightforward scenarios (I might be wrong though!). If you have lots of custom business logic, you will need to decide whether to go with DEF and implement custom logic or go with your existing solution if you have any.

3– More providers are needed to cover other 3rd party systems.

Happy Sitecoring!

Sitecore Data Exchange Framework: The How

In the previous blog, we talked about what Sitecore Data Exchange Framework is. In this one, we will dive a bit deeper into how this framework works to deliver results. You can always refer to the documentation for further information.

The most important step when installing Sitecore Data Exchange Framework is to choose the correct version for your Sitecore instance. Read the Sitecore installation guide and don’t depend on your experience installing Sitecore modules. Remember what happened when you installed Sitecore 9 and didn’t follow the installation guide 🧐

There are several areas to look at when you start working with the DEF:

  • DEF: The Framework – it consists of a number of components that are configured to setup the DEF.
  • DEF: The Providers – it consists of a number of components that are used to configure the Endpoints and drive the process of reading/writing Object Data from Source/Target system.
  • DEF: Remote SDK – it consists of a number of components that help you run DEF pipelines and synchronization outside Sitecore.

DEF Framework Components:

  • Tenants: represents an organization object to group all related configurations in a single silo and inherits from Data Exchange tenant. Create a new Tenant whenever you import data from a different system.

  • Endpoints: represent data sources and inherit from the Base Endpoint template. For example, a CSV import has two Endpoints: a File Provider as the source system and the Sitecore Provider as the target system.

  • Data Access: represents Objects used to read/write data from both Endpoints. There are different types of Data Accessors to help you represent and map different types of data:
    • Value Accessor: used to configure value mapping components that are used to read and write data objects. You can consider it a property that has a getter (Value Reader) and a setter (Value Writer). It inherits from the Base Value Accessor template. There are different types of Value Accessors depending on the provider used in the Data Source. If the Sitecore Provider is used, then these Value Accessors will include the path to the Sitecore Item Fields used to write Data Objects. If the File Provider is used, then these Value Accessors will include the position of a Data Object in the Source file. I’ll show you in subsequent blogs how to use your own custom Reader/Writer to set the values for these Accessors.
    • Value Reader: is a getter used to read data from Source Data Objects. There are different types of Value Readers that you need to be familiar with before building your own.
    • Value Writer: is a setter used to write data to a Target Data Object. It inherits from Base Value Writer.
    • Value Accessor Set: is a group of Value Accessors. It represents all the attributes you need to map from one Data Source to the other.
  • Data Mapping: represents a set of components used to map Data objects between source and target data sources. These components are:
    • Value Mapping: this is a base template that is used to create all Value Mapping values. It maps Data Object from a Source Value Accessor into a Destination Value Accessor.
    • Value Mapping Set: represents a collection of Value Mappings and is used as a base template. This is similar to the Value Accessors Set.
    • Apply Mapping Rules: represents a condition that needs to be met before applying the value mapping. This is used when you wish to only write data to your Target system when a certain condition is met.
    • Mapping Applied Actions: represents a business logic you wish to execute if your imported Data has not changed due to some missing information but still wish to perform a certain action nevertheless.
    • Mapping Applied Action Rules: represents a condition that needs to be met before applying the Mapping Applied Actions.
  • Pipeline Steps: represents a unit of work that is performed when a pipeline runs. This is where all the work will be done when you start your migration process after configuring everything discussed above. You should configure Pipeline Steps in the correct order for the synchronization process to execute correctly. Pipeline Steps implement Plugins that allow passing parameters at runtime. A typical example is a set of Pipelines that read data from a CSV file and write it into Sitecore.
  • Pipeline Batches: represents the synchronization process used to configure and initiate the migration task. It provides you with reporting information about the schedule of the batch and logging information.
  • Filter Expressions: represents filters applied to limit the data read by the system. This is best used when you
  • Queues: represents objects used to store read data so that it can be processed sometime in the future. Queued data is picked up for processing based either on its status (Queue Entry Status) or by a Queue Processor.
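To show how Value Accessors, Value Mappings, and Apply Mapping Rules fit together, here is a small conceptual sketch in Python. It is not the DEF API: the class names, `column_accessor`, `field_accessor`, and `apply_mapping_set` are hypothetical names I picked to mirror the components listed above.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ValueAccessor:
    """A 'property' on a data object: a reader (getter) plus a writer (setter)."""
    reader: Callable[[Any], Any]
    writer: Callable[[Any, Any], None]


def column_accessor(index):
    """File-style accessor: addresses a value by its position in a row (a list)."""
    def write(row, value):
        row[index] = value
    return ValueAccessor(reader=lambda row: row[index], writer=write)


def field_accessor(name):
    """Item-style accessor: addresses a value by field name on a dict."""
    def write(item, value):
        item[name] = value
    return ValueAccessor(reader=lambda item: item.get(name), writer=write)


@dataclass
class ValueMapping:
    """Pairs a source accessor with a target accessor."""
    source: ValueAccessor
    target: ValueAccessor


def apply_mapping_set(mappings, source_obj, target_obj, should_apply=lambda s: True):
    """Copy every mapped value from source to target, but only if the rule passes."""
    if not should_apply(source_obj):
        return False
    for mapping in mappings:
        mapping.target.writer(target_obj, mapping.source.reader(source_obj))
    return True


mapping_set = [
    ValueMapping(column_accessor(0), field_accessor("Title")),
    ValueMapping(column_accessor(1), field_accessor("Email")),
]
item = {}
applied = apply_mapping_set(mapping_set, ["Jane Doe", "jane@example.com"], item)
```

The `should_apply` argument plays the part of an Apply Mapping Rule: when it returns False, nothing is written to the target.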

DEF Providers Components:

Providers are used at the Source or Target systems to determine requirements to take into consideration when reading and writing your data.

A number of DEF providers can be found here. In addition, Sitecore walks you through the process of creating a provider for Data Exchange Framework in its documentation.

To see how these different providers are implemented, you can also take a quick peek by decompiling any of the provider DLLs, such as “Examples.DataExchange.Providers.FileSystem.dll”.

You can find the documentation for the Dynamics CRM provider components here and for the Salesforce provider components here.

DEF Remote SDK

One of the significant features of the DEF is that you can run Pipelines, Pipeline Batches, and other DEF components outside the Sitecore server, which helps reduce the dependency on Sitecore, especially in production environments. Also, you can enhance the performance of your server by moving the I/O operations to a separate server.

However, this option comes with some limitations, because you will need to remove any dependency on APIs that are only available within Sitecore. This includes:

  • Running Sitecore pipelines (Data Exchange Framework provides its own pipeline implementation that can be used instead).
  • Reading anything from Sitecore configuration files.
  • Using any type defined in Sitecore assemblies (Sitecore.Services.Core.dll is the one exception).

You can find the full implementation of how to do this here.

Demo

While experimenting with a number of DEF migration scenarios, I managed to come out with the following blogs, hope you find them useful:

A number of different providers and examples from different Sitecore Hackathons and community work can be found below:

Next, we will conclude this conversation by exploring Why DEF!

Happy Sitecoring!

Sitecore Data Exchange Framework: Create Sitecore Items from a Branch Template

Yup! that was the spirit when I found out that Sitecore Data Exchange Framework doesn’t support Branch Templates out of the box! I told myself I can do this like I did in the past blogs but I was like….

I kept going in circles and circles trying to get the damn thing to work! However, I finally managed to get it to work and imported my data using a branch template.

1- Use the same “Advanced Sitecore Item Pipeline Step” we created in the previous blog.

2- Use the “ResolveAdvancedSitecoreItemStepConverter”  from before as well.

3- Update the ResolveAdvancedSitecoreItemStepProcessor to include references to the new ItemModelRepository.

There are two main places you need to look at when you implement this: CreateNewObject and FindExistingObject. This is where most of the work to create new objects and update existing ones is done. You will need to make sure that the latter method is working so that you do not duplicate Sitecore items that were already imported.
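The idea behind those two methods can be sketched in a few lines of Python. This is not the processor code; `find_existing_object`, `create_new_object`, `resolve_object`, the `ExternalId` field, and the list-based repository are hypothetical stand-ins that only show why the lookup has to run before the create.

```python
def find_existing_object(repository, identifier_field, identifier_value):
    """Return the stored item whose identifier field matches, or None."""
    for item in repository:
        if item.get(identifier_field) == identifier_value:
            return item
    return None


def create_new_object(repository, identifier_field, identifier_value):
    """Create an item, remember it in the repository, and return it."""
    item = {identifier_field: identifier_value}
    repository.append(item)
    return item


def resolve_object(repository, identifier_field, identifier_value):
    """Reuse the existing item when there is one; otherwise create it."""
    existing = find_existing_object(repository, identifier_field, identifier_value)
    if existing is not None:
        return existing, False
    return create_new_object(repository, identifier_field, identifier_value), True


repo = []
first, created_first = resolve_object(repo, "ExternalId", "42")
second, created_second = resolve_object(repo, "ExternalId", "42")
```

Running the import twice with the same identifier returns the same item the second time, which is exactly the behavior you want to verify before importing for real.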

3.1 – AdvancedPipelineStepExtensions

4- Update the ItemModelRepository as the Add method used supports adding templates only.

In order to do this, navigate to C:\inetpub\wwwroot\xxxxx\App_Config\Sitecore\DataExchange\Sitecore.DataExchange.Local.config

Update the <itemModelRepository> tag with a custom class for the InProcItemModelRepository class.

<itemModelRepository type="Sitecore.Sandbox.Libraries.DEF.Repositories.AdvancedInProcItemModelRepository, Sitecore.Sandbox.Libraries">
  <databaseName>master</databaseName>
</itemModelRepository>

5- This is the part where you need to understand how this works and implement the right interfaces and classes.

5.1 – Create AdvancedInProcItemModelRepository.

I created a new class that inherits InProcItemModelRepository. You probably should create a method that determines whether this is a branch template or not but.. oh well ¯\_(ツ)_/¯
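For what it’s worth, the check I skipped could look roughly like the sketch below. It is plain Python over a made-up in-memory dictionary, not Sitecore code: the IDs, the `kind` flag, `is_branch`, and `add_item` are all assumptions for illustration.

```python
# Hypothetical definitions keyed by ID; the IDs and the "kind" flag are made up.
DEFINITIONS = {
    "tpl-article": {"kind": "template"},
    "branch-article": {"kind": "branch", "children": ["Content", "Media"]},
}


def is_branch(definition_id):
    """True when the ID points at a branch rather than a plain template."""
    definition = DEFINITIONS.get(definition_id)
    return definition is not None and definition["kind"] == "branch"


def add_item(name, definition_id):
    """Create an item; only branches bring their predefined children along."""
    if definition_id not in DEFINITIONS:
        raise KeyError(f"Unknown template or branch: {definition_id}")
    children = DEFINITIONS[definition_id]["children"] if is_branch(definition_id) else []
    return {"name": name, "created_from": definition_id, "children": list(children)}


page = add_item("News", "tpl-article")
section = add_item("Blog", "branch-article")
```

With a check like this, the same Add call can serve both plain templates and branch templates instead of assuming one or the other.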

5.2 – Create AdvancedHandlerProvider.

Every time I tried to get the previous class to work, I always got an error that an Interface doesn’t exist! I found out that the HandlerProvider reads items from a Dictionary Item and my new custom class is not among them!

You can find this in Sitecore.Services.Infrastructure.dll.
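The failure mode is easier to see in a stripped-down model. The Python sketch below is not the Sitecore.Services code; `HandlerProvider`, the command classes, and the handler functions are hypothetical names that imitate a provider resolving handlers from a dictionary.

```python
class CreateItemCommand:
    def __init__(self, name):
        self.name = name


class CreateBranchItemCommand:
    def __init__(self, name, branch_id):
        self.name = name
        self.branch_id = branch_id


def handle_create_item(command):
    return f"created {command.name} from a template"


def handle_create_branch_item(command):
    return f"created {command.name} from branch {command.branch_id}"


class HandlerProvider:
    def __init__(self):
        # Only command types present in this dictionary can be resolved.
        self._handlers = {CreateItemCommand: handle_create_item}

    def register(self, command_type, handler):
        self._handlers[command_type] = handler

    def resolve(self, command):
        handler = self._handlers.get(type(command))
        if handler is None:
            raise LookupError(f"No handler registered for {type(command).__name__}")
        return handler


provider = HandlerProvider()
command = CreateBranchItemCommand("Home", "branch-1")
```

Calling `provider.resolve(command)` at this point raises a `LookupError`, which mirrors the error I kept hitting. After `provider.register(CreateBranchItemCommand, handle_create_branch_item)`, the same call returns the new handler.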

5.3 – Create CreateAdvancedItemCommand.

We need to create the Command that is used to pass all the ItemModel information to the CreateBranch method.

5.4 – Create CreateAdvancedItemHandler.

The CreateAdvancedItemHandler is used to handle the actual request and create items based on branch template.

5.5 – Create AdvancedItemRepository.

Finally! implementing ItemRepository to add items based on branch templates.

and that’s it!

Happy Sitecoring!

Sitecore Data Exchange Framework: Create Item Names using Multiple Columns

One of the things that caught my attention when working with the Sitecore provider is that you can only choose one Value Accessor to assign to your Sitecore item name. So I said what the heck, let’s make it a Treelist selection 😄

Similar to the previous post, we’ll use the Sitecore Data Exchange Framework File System Provider as the basis for this blog and assume that you have already installed it on your Sitecore instance.

1- Create a new Pipeline Step inside your Pipeline process. I’m calling my Pipeline Step “Resolve Advanced Sitecore Item Pipeline Step”.

This is the place where we assign the Item Name Value Accessor to each created item. In order to change the Droplink field into a Treelist field:

  • Create a new template inherited from “Base Resolve Object From Sitecore Pipeline Step”.
  • Change the “ItemNameValueAccessor” field into a Treelist field.
  • Template Path: /sitecore/templates/Data Exchange/Providers/Sitecore/Pipeline Steps/Resolve Advanced Sitecore Item Pipeline Step

2- Create a new Sitecore Converter class to handle passing the list of Value Accessors selected in the Treelist to your Value Accessor Reader.

The newly added “ConvertReferencesToModel” method has a new signature to accept IEnumerable<IValueAccessor>. Make sure to implement the correct interfaces for this to work.

3- Create a new Pipeline Step Processor class that reads the list of Value Accessors and uses it to build the item name.

Line 168 is where you read the IEnumerable<IValueAccessor> and assign it to your item name. Make sure to change the ResolveAdvancedSitecoreItemSettings where it is needed.

4- From the class defined above for the Processor, we need to create a new ResolveSitecoreItemSettings class in order to resolve the new ItemNameValueAccessor property. I’m calling it “ResolveAdvancedSitecoreItemSettings”.
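The core of the change, building one item name out of several Value Accessors, can be sketched like this. It is a language-neutral illustration in Python rather than the Processor itself; `build_item_name`, the reader lambdas, and the clean-up rule for invalid characters are assumptions.

```python
import re


def build_item_name(row, readers, separator=" "):
    """Read each configured value, skip blanks, and join the rest into one name."""
    parts = []
    for read in readers:
        value = read(row)
        if value is not None and str(value).strip():
            parts.append(str(value).strip())
    name = separator.join(parts)
    # Assumption: drop anything that is not a word character, whitespace, or hyphen.
    name = re.sub(r"[^\w\s-]", "", name)
    return re.sub(r"\s+", " ", name).strip()


# Hypothetical readers, one per selected Value Accessor.
readers = [lambda row: row.get("FirstName"), lambda row: row.get("LastName")]
name = build_item_name({"FirstName": "Jane", "LastName": "O'Neil "}, readers)
```

Blank or missing columns are skipped, so a row with only a first name still produces a usable item name.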

That’s it! This is all that is needed to create your custom Pipeline Step to handle multiple Value Accessors for the Sitecore item name.

Happy Sitecoring!

Sitecore Data Exchange Framework: Write Different Custom Data

I’ve seen a couple of questions on StackExchange regarding how to map some types from your CSV file into Sitecore fields. This is a quick demo on how to implement a number of different readers to help you write your data into Sitecore fields using Sitecore Data Exchange Framework. I’m happy that I reached the same conclusion as the correct answer in StackExchange, again understanding the framework is the answer to many scenarios you may encounter.

The nice thing about DEF is that once you understand how the whole thing works, you can roll up your sleeves and start implementing whatever your business needs. The scenarios below are built using the Sitecore Data Exchange Framework File System Provider.

Scenario 1: I want to be able to migrate Yes/No data into a Sitecore Checkbox Field.

The DEF doesn’t support mapping such data into Sitecore fields out of the box. In order to fix this, you need to understand at which point the DEF will write the Data Object into Sitecore field. There are two parts to make this happen: Read Data Objects from your Source Endpoint and write the Data Object into your Target Endpoint.

Solution 1: Build a Field Value Transformer For Write – in your Value Accessor Set.

a. In your Target Provider Value Accessors, select the Value Accessor whose written data you want to transform. Assign a custom Value Reader from the Value Readers folder, which in this case references the String to Checkbox Transformer.

b. Create a custom Value Reader in your Target Provider Value Readers. This should be built using the Value Reader template under the following path:

/sitecore/templates/Data Exchange/Providers/Sitecore/Data Access/Value Readers/String to Checkbox

c. You will need to build the Value Reader in order to read and transform your Data Object.

d. You will need to build the ItemModelConvertor that will write the transformed data from the Value Reader you just created into your Value Accessor.
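The transformation itself is tiny. Here is a sketch in Python of what the String to Checkbox reader has to do, assuming the target field stores "1" for checked and an empty string for unchecked (worth confirming on your own instance). The `TRUTHY` set and `string_to_checkbox` are hypothetical names.

```python
# Assumption: these source strings should count as "checked".
TRUTHY = {"yes", "y", "true", "1"}


def string_to_checkbox(value):
    """Return "1" for a truthy string; anything else becomes the unchecked value."""
    if value is None:
        return ""
    return "1" if str(value).strip().lower() in TRUTHY else ""
```

For example, `string_to_checkbox(" Yes ")` gives `"1"`, while `"No"`, an empty cell, or a missing value all give `""`.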

Scenario 2: I want to be able to migrate data from a space/comma separated column value into a Sitecore Treelist Field.

The DEF doesn’t support mapping such data into Sitecore fields out of the box. In order to fix this, you will need to take into consideration reading data from your Source Provider’s Value Accessor that contains the space separated value.

Solution 2: Build a Field Value Transformer For Write – in your Value Accessor Set.

Similar to the solution above, you will need to repeat the same steps and implement a new Reader and Convertor. Also, you will need to assign the columns you want to read in your Value Mapping Set. In this example, we will assign Spoken Languages from the CSV file into a Sitecore Treelist field.

c. You will need to build the Value Reader in order to read and transform your Data Object.

d. You will need to build the ItemModelConvertor that will write the transformed data from the Value Reader you just created into your Value Accessor.
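As a rough idea of what the reader does in this scenario, the Python sketch below splits a space/comma separated value and turns it into a pipe-separated list of IDs, which is how multi-value list fields store their raw value. The `LANGUAGE_IDS` lookup and its GUIDs are made up; real IDs would come from your own content tree.

```python
import re

# Hypothetical lookup from language name to item ID; the GUIDs are placeholders.
LANGUAGE_IDS = {
    "english": "{11111111-1111-1111-1111-111111111111}",
    "french": "{22222222-2222-2222-2222-222222222222}",
}


def delimited_to_treelist(value, lookup):
    """Split on commas/whitespace and return the matching IDs joined with '|'."""
    names = [name for name in re.split(r"[,\s]+", value.strip()) if name]
    ids = []
    for name in names:
        item_id = lookup.get(name.lower())
        # Unknown names are skipped and duplicates are written only once.
        if item_id and item_id not in ids:
            ids.append(item_id)
    return "|".join(ids)


raw_value = delimited_to_treelist("English, French  Klingon", LANGUAGE_IDS)
```

Note that splitting on whitespace breaks names that contain spaces; if your data has those, split on commas only.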

Scenario 3: I want to be able to migrate data from different columns into a Sitecore Treelist Field.

The DEF doesn’t support mapping such data into Sitecore fields out of the box. In order to fix this, you will need to take into consideration reading data from multiple Value Accessors of your Source Provider.

Solution 3: Build a Field Value Transformer For Write – in your Value Accessor Set.

Similar to the solution above, you will need to repeat the same steps and implement a new Reader and Convertor. Also, you will need to assign the columns you want to read in your Value Mapping Set. In this example, we will assign Spoken Languages from a number of different columns in the CSV file into a Sitecore Treelist field.

c. You will need to build the Value Reader in order to read and transform your Data Object.

d. You will need to build the ItemModelConvertor that will write the transformed data from the Value Reader you just created into your Value Accessor.

e. Use the Value Mapping Set to assign the Source Value Accessors into the Treelist Target Value Accessor.
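Here the difference is that the values come from several Source Value Accessors instead of one. A minimal Python sketch, with hypothetical column names, placeholder GUIDs, and a made-up `columns_to_treelist` helper:

```python
# Hypothetical lookup from language name to item ID; the GUIDs are placeholders.
LANGUAGE_LOOKUP = {
    "english": "{11111111-1111-1111-1111-111111111111}",
    "french": "{22222222-2222-2222-2222-222222222222}",
}


def columns_to_treelist(row, column_readers, lookup):
    """Read one value per source column and return the matching IDs joined with '|'."""
    ids = []
    for read in column_readers:
        raw = read(row)
        if not raw:
            continue  # empty column: nothing to add
        item_id = lookup.get(raw.strip().lower())
        if item_id and item_id not in ids:
            ids.append(item_id)
    return "|".join(ids)


# One reader per source column; the column names are assumptions.
column_readers = [
    lambda row: row.get("Language 1"),
    lambda row: row.get("Language 2"),
    lambda row: row.get("Language 3"),
]
source_row = {"Language 1": "English", "Language 2": " french ", "Language 3": ""}
treelist_value = columns_to_treelist(source_row, column_readers, LANGUAGE_LOOKUP)
```

The order of the readers decides the order of the IDs in the field, so list the columns in the order you want them to appear.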

Scenario 4: I want to be able to migrate lookup data into a Sitecore Droplink Field.

The DEF doesn’t support mapping such data into Sitecore fields out of the box. In order to fix this, you should ideally return the lookup data dynamically. However, in this case I was too lazy and will only go with the easy option :P.

Solution 4: Build a Field Value Transformer For Write – in your Value Accessor Set.

Similar to the solution above, you will need to repeat the same steps and implement a new Reader and Convertor. In this specific case, we’ll try a simple example of returning a Droplink lookup for gender selection.

c. You will need to build the Value Reader in order to read and transform your Data Object.

d. You will need to build the ItemModelConvertor that will write the transformed data from the Value Reader you just created into your Value Accessor.
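The easy option boils down to a hard-coded lookup table. A sketch in Python, where `GENDER_IDS`, its placeholder GUIDs, and `gender_to_droplink` are all hypothetical:

```python
# Hard-coded lookup (the "easy option"); the GUIDs are placeholders, not real item IDs.
GENDER_IDS = {
    "male": "{AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA}",
    "female": "{BBBBBBBB-BBBB-BBBB-BBBB-BBBBBBBBBBBB}",
}


def gender_to_droplink(value, lookup=GENDER_IDS, default=""):
    """Return the ID of the lookup item for the given text, or the default."""
    if value is None:
        return default
    return lookup.get(str(value).strip().lower(), default)
```

A dynamic version would build the same dictionary at runtime from the children of the lookup folder instead of hard-coding it.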

That’s it for me! Feel free to let me know if there is a better way to do this, and if you wish to experiment with other scenarios, I’ll be happy to help!

Happy Sitecoring!

Sitecore Data Exchange Framework: The What

According to Sitecore,

Sitecore Data Exchange Framework is designed to facilitate the transfer of data between systems. It allows you to define the logic needed to read data from a source system, transform that data into a format that is compatible with a target system, and write the transformed data into a target system. Developers can build connectors that allow 3rd party systems to serve as source and target systems.

Those who started working with Sitecore a long time ago can relate to all the hours spent creating your own integration and migration tools or modules and adding them to the marketplace for the greater good. I recall the first time I wrote my own code to migrate from Excel sheets and SharePoint into Sitecore… good days!

Nevertheless, you can notice that Sitecore is trying to bring together a lot of the different practices we have developed over the years and align them with Sitecore’s main objective of having one connected platform that integrates seamlessly with robust 3rd party ecosystems.

To build more on the What question, let’s take a couple of User Stories you might receive that can help you understand what DEF is:

  • As a user, I want to be able to integrate my system with Sitecore.
  • As a user, I want to be able to migrate CSV-based data into Sitecore.
  • As a user, I want to be able to migrate CRM contacts into Sitecore.
  • As a user, I want to be able to migrate ……….. into Sitecore.
  • As a user, I want to be able to migrate content to Sitecore without the help of a developer.
  • As a user, I want to be able to change the format of migrated content with less development changes.
  • ………

So far, we have established that this framework is built to integrate two systems (source, target) and allow for a two-way exchange of data with either of these systems on the other end. To conceptualize this even further, let’s try to define the entity used to represent each of these systems.

  • As a user, I want to be able to use X as my source / target system.

In DEF, we call these entities Endpoints. An Endpoint simply is your data source. Once you define your source and target Endpoints, you will need to look into the process of reading/writing the migrated data in the Endpoints. This is called a Provider. A Provider is used at either Endpoint to outline the main requirements to take into consideration when reading and writing your data.

  • As a user, I want to change the path of the file I’m reading data from.
  • As a user, I want to be able to map each column in a file into a Sitecore item field.

Now that we have defined our data sources and provided the requirements for handling the data at each of them, we’ll need to define the procedure for mapping data from the source to the destination. This data is modeled as Objects, and the DEF provides components to read and write these Objects. We’ll talk more about this in the How section.

  • As a user, I want to be able to map an image URL into a Sitecore Image.
  • As a user, I want to map two columns in my CSV file into Sitecore item name.

Finally, we need to define the protocol used to communicate between these two systems, or in other words, we need to define the steps used to do the actual work. This is done using Pipelines and Pipeline Batches. The DEF allows you to define this entire protocol inside Sitecore, and you can use both Sitecore and non-Sitecore code to implement it.

  • As a user, I want to be able to update existing content if it is already imported.
  • As a user, I want to be able to schedule data migration on production servers using Sitecore.
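The Pipeline and Pipeline Batch idea described above can be modeled in a few lines. This is a conceptual Python sketch, not the DEF implementation: `PipelineContext`, the three step functions, `run_pipeline`, and `run_batch` are hypothetical names, and the sample rows are made up.

```python
class PipelineContext:
    """Shared state that every step in a pipeline can read and write."""
    def __init__(self):
        self.data = {}
        self.log = []
        self.aborted = False


def read_rows_step(context):
    # Stand-in for reading from the source Endpoint.
    context.data["rows"] = [["Jane", "jane@example.com"], ["", "no-name@example.com"]]
    context.log.append("read 2 rows")


def filter_rows_step(context):
    # Stand-in for a filter: drop rows without a name.
    context.data["rows"] = [row for row in context.data["rows"] if row[0]]
    context.log.append(f"kept {len(context.data['rows'])} rows")


def write_items_step(context):
    # Stand-in for writing to the target Endpoint.
    context.data["items"] = [{"Name": n, "Email": e} for n, e in context.data["rows"]]
    context.log.append(f"wrote {len(context.data['items'])} items")


def run_pipeline(steps, context):
    """Run the steps in their configured order; stop early if a step aborts."""
    for step in steps:
        if context.aborted:
            break
        step(context)
    return context


def run_batch(pipelines):
    """A batch runs each pipeline in order, each with a fresh context."""
    return [run_pipeline(steps, PipelineContext()) for steps in pipelines]


results = run_batch([[read_rows_step, filter_rows_step, write_items_step]])
```

Because each step only talks to the context, the order of the steps matters: swap the filter and write steps and the unnamed row ends up in the target.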

In a nutshell, the Sitecore DEF is a layer leveraging the ETL processes.

This blog by @NeilKillen is a great explanation of the DEF. Also this one by @ianjohngraham is yet another great explanation of the DEF.

In the next post, we will talk about how to use the DEF and provide a number of different custom implementations tailored to your needs.

Happy Sitecoring!