C#, .Net, Sql Server, Salesforce, Dynamics CRM etc,
Charity and Not-for-Profit sector

Sunday, January 25, 2015

Automating Salesforce Data Imports with Powershell

If you're a programmer and you're doing lots of Salesforce Data Imports, at some point you'll get fed up with messing around with Apex Data Loader and CSV files and look for better options. You might be doing an incremental migration and so want to re-run and evolve the migration as the target system evolves. Or you might just want to get the migration automated for scheduling reasons.

There are various 'integration' systems that could help with this: JitterBit, Pentaho, TalenD, and so on. But if you're fond of writing code you might want to work with something a bit more low-level where you can write specific code to get the exact transformations you want without learning the vagaries of a new integration package.

Salesforce has a Bulk Data API so you might be tempted to look at that. But it's quite heavy duty: if you try to use it you'll soon have to deal with batches and async callbacks, and you'll essentially end up re-creating quite a lot of the code of the Apex Data Loader. The Bulk Data API is really meant for people writing generalised Integration Tools like those mentioned above, rather than people doing specific migrations into a specific system.

But the Apex Data Loader itself can be automated, so one way to combine the precise control of 'code' with importing into Salesforce is to automate the Apex Data Loader with Powershell.

The first part of this post deals with configuring the Data Loader for Automation, then the second part deals with the Powershell angle.

Setting Up Automation with the Data Loader

When you automate the Data Loader lots of files are needed and generated, so it helps to have a good folder structure. Here's what I used:

  • Data is where the source data (CSV) files will end up
  • DataLoaderConfig is where the two main config files for the data loader will be stored
  • Logs is where the Data Loader will output its logs to
  • Maps is where the mapping files for each CSV/object will be stored
  • The root folder (ExampleMigration in this case) will hold the Powershell scripts
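Laid out on disk it looks something like this (the script names are just examples):

C:\ExampleMigration\
    RunMigration.ps1
    CommonFunctions.ps1
    Data\
    DataLoaderConfig\
        config.properties
        process-conf.xml
    Logs\
    Maps\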

Data Loader Config files

The Data Loader needs two files. The simpler of the two is config.properties, and it just contains the login info and default settings that you want to use:

# Data Loader Config
# defaults for all runs

# login settings:
# password encrypted by C:\Program Files\\Data Loader\bin\encrypt.bat 
# run encrypt -e passwordandtoken 
sfdc.endpoint=https://test.salesforce.com
sfdc.username=someuser@example.org
sfdc.password=<encrypted password string goes here>

# other settings
process.useEuropeanDates=true
process.lastRunOutput=false

# for UTF8 files:
# dataAccess.readUTF8=true
The endpoint and username settings are as with all uses of the Salesforce API - use https://test.salesforce.com for sandboxes and https://login.salesforce.com for live. The password needs to be encrypted by a command line tool that comes with the Data Loader. This can be done as follows:
  • On the server go to Start/Run and type cmd to get the dos prompt
  • type "cd C:\Program Files\\Data Loader\bin"
  • type "encrypt -e (password)(securitytoken)"
    where password and securitytoken are concatenated together, e.g.
    "encrypt -e tomato2FGTN3RE26yP0GYdfXD3A4Ln"
  • The command will output an encrypted password string.
  • To get that from the DOS window, right click, do "Select All" and then "Copy", then paste the result into Notepad
  • In Notepad you will then have the text of the whole DOS session - extract the encrypted password string and paste it into the file as above
 Most of the other settings are fairly obvious, except perhaps:
  •  process.useEuropeanDates will read dates from the CSV in dd/MM/yyyy format instead of USA MM/dd/yyyy
  •  process.lastRunOutput = false prevents the Data Loader from creating temporary files that it uses to track last run dates etc. 
  • dataAccess.readUTF8 (commented out above) can be used if you want to have UTF8 unicode chars in your CSV file. See also notes about generating UTF8 CSV files below.
The other config file is process-conf.xml and it is used to keep info about the actual 'jobs' that you want to automate, each job being given a distinct name. It is based on the Java Spring framework and looks like this:


<!-- NB, endpoint and login details set in config.properties -->
<beans>
   <bean id="myJobName" class="com.salesforce.dataloader.process.ProcessRunner" singleton="false">
        <property name="name" value="myJobName"/>
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.entity" value="MyCustomObject__c"/>
                <entry key="sfdc.externalIdField" value="Something_Id__c"/>
                <entry key="process.operation" value="upsert"/>
                <entry key="process.mappingFile" value="C:\ExampleMigration\Maps\map_mycustomobject__c.sdl"/>
                <entry key="process.outputError"   value="C:\ExampleMigration\Logs\error_myJobName.csv"/>
                <entry key="process.outputSuccess" value="C:\ExampleMigration\Logs\success_myJobName.csv"/>
                <entry key="dataAccess.name" value="C:\ExampleMigration\Data\mycustomobject__c.csv" />
                <entry key="dataAccess.type" value="csvRead" />
            </map>
        </property>
   </bean>
   <bean id="myOtherJobName" class="com.salesforce.dataloader.process.ProcessRunner" singleton="false">
        <property name="name" value="myOtherJobName"/>
        <property name="configOverrideMap">
            <map>
                <entry key="sfdc.entity" value="MyOtherCustomObject__c"/>
                <entry key="sfdc.externalIdField" value="SomethingElse_Id__c"/>
                <entry key="process.operation" value="upsert"/>
                <entry key="process.mappingFile" value="C:\ExampleMigration\Maps\map_myothercustomobject__c.sdl"/>
                <entry key="process.outputError"   value="C:\ExampleMigration\Logs\error_myOtherJobName.csv"/>
                <entry key="process.outputSuccess" value="C:\ExampleMigration\Logs\success_myOtherJobName.csv"/>
                <entry key="dataAccess.name" value="C:\ExampleMigration\Data\myothercustomobject__c.csv" />
                <entry key="dataAccess.type" value="csvRead" />
            </map>
        </property>
   </bean>
</beans>

Both jobs in this example are upserts. Note that each job's name has to be given twice - once as the bean id and once in the 'name' property. Quite a lot of the file is boilerplate; it's only the settings within the <map> tag that are actually important.

  • sfdc.entity: the entity you are importing into
  • sfdc.externalIdField: if you are doing an upsert, the name of the external id field to use
  • process.operation: upsert, insert, update or extract
  • process.mappingFile: the location of an .sdl file (see below) that specifies the exact mapping from CSV fields to Salesforce fields
  • dataAccess.name: location of the source CSV file
  • dataAccess.type: csvRead when reading a file for import, csvWrite if exporting
  • process.outputError and process.outputSuccess: location of the two CSV files that Data Loader creates to document successes and failures
Note that if you are trying to export data instead of import, the <map> section looks more like this:

        <entry key="sfdc.entity" value="user"/>
        <entry key="process.operation" value="extract"/>
 <entry key="sfdc.extractionSOQL" value="Select id, username, name, email, profileID, IsActive From user" />
        <entry key="" value="C:\ExampleMigration\Data\extract_user.csv" /> 
        <entry key="dataAccess.type" value="csvWrite" /> 

Mapping Files

For each CSV file you import, you'll need a mapping file.

You can generate mapping files by running the actual Data Loader and creating a mapping and then saving it as a file. However the actual format is very simple so you can also code them up yourself if you want.

example mapping (.sdl) file

#Mapping values
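# (the column names on the left and field names on the right are illustrative - substitute your own)
Name=Name
Notes=Description__c
ContactId=Contact__r:Contact_External_Id__c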

Basically you put the CSV column names on the left and the Salesforce field names on the right.

If you are setting a relationship using an external ID on the other object, then the format to use on the Salesforce side is RelationshipName__r:External_Id_Field__c (for a custom lookup field, the relationship name is the field name with __c swapped for __r). See the ContactId line in the example above, which sets the Contact__c lookup via an external ID on Contact.

If there is something wrong with your mapping file, you will get error messages but probably not very intelligible ones. So watch out for trailing spaces or commas, and note that the Salesforce field names (on the right of the equals sign) are case sensitive.

Running the Data Loader from Powershell

So if you have everything set up as above (you've sorted out your folder structure, done the two config files, encrypted your password, created a mapping file, put your CSV file in the right place, set up some jobs in process-config.xml and given them names) you can now run the Data Loader from powershell.

If you haven't used Powershell before: Powershell usually comes installed on any recent Windows machine, but initially it is (usually) set to only run signed scripts as a security measure. So before running any scripts of your own you need to relax that restriction a bit by telling it that it's OK to run local scripts that aren't signed. This can be done by:
  • Starting "Windows PowerShell ISE"
  • executing this command: "Set-ExecutionPolicy RemoteSigned"
The Data Loader itself can be run from the CMD command line by using the process.bat script that comes with it.  This is well documented elsewhere. To run the Data Loader from Powershell, we need to execute that same process.bat script. This can be done as follows:

$dataloader = "C:\Program Files\\Data Loader\bin\process.bat"
$dlconfig = "C:\ExampleMigration\DataLoaderConfig"
set-location "C:\Program Files\\Data Loader\bin\"

& $dataloader $dlconfig myJobName

Line by line, what this does is:
  • put the location of process.bat into the $dataloader variable
  • put the folder with the config files (see above) into the $dlconfig variable
  • set the working directory to the data loader bin folder
  • the ampersand on the fourth line tells powershell to execute the following command. The command that follows is process.bat and two command line arguments - the config location and the name of the job to run
Note that 'myJobName' must be properly defined in process-conf.xml (see above)
Also note that on 64-bit systems you'll need to change 'Program Files' to 'Program Files (x86)' in the file paths.

If you put the above into a script in PowerShell ISE and run it, you should see a stream of output from the Data Loader as it runs (it is very verbose in its log output). To see how the import went, go and look at the success and error .csv log files that get created.

So why bother running the Data Loader from Powershell when all we're doing is running an old batch file anyway? Well the point is, now we are in the Powershell environment we can also do a load of other things, like automatically checking for import errors, and pulling SQL data into CSV files.

Powershell: Checking the error log output

After each Data Loader run, success and error csv files are created. There'll be a log entry in the powershell output telling you how many rows failed, but there's so much other output that it gets a bit drowned out. Here's a Powershell function that checks the error.csv file to see if there's anything in it, and if there is, prints a red warning in the powershell output:

# function to check the CSV files generated by data loader for errors
function CheckErrorFile {

    Param ([string]$perrorfilepath)

    $csvlines = (Get-Content $perrorfilepath | Measure-Object -Line).Lines
    # header is always 1 row so we look for > 1 rows
    if ($csvlines -gt 1)
    {
        $errorCount = $csvlines - 1
        $errorMessage = [string]::Format("** errors ** {0} failed rows in {1}", $errorCount, $perrorfilepath)
        # this will print a red error message but not stop execution
        Write-Error $errorMessage
    }
}

To use it, add it to the top of your Powershell script and then call it after each run (or at the end of a series of runs) like this:

$dataloader = "C:\Program Files\\Data Loader\bin\process.bat"
$dlconfig = "C:\ExampleMigration\DataLoaderConfig"
set-location "C:\Program Files\\Data Loader\bin\"
& $dataloader $dlconfig myJobName

$dlLogFolder = "C:\ExampleMigration\Logs\" 
CheckErrorFile ("{0}error_myJobName.csv" -f $dlLogFolder)
Note that you have to tell it exactly where to look for the CSV error file, because the location of it is determined by what you previously set up in process-conf.xml.

Powershell: Defining functions in a separate file

In Powershell you can define a function then call it in the same file but that quickly gets annoying. You'll probably want to share functions between multiple scripts, etc.

A quick way to share functions in Powershell is to define all your commonly-used functions in one powershell file, let's call it 'CommonFunctions.ps1'. You can then 'include' that file in other files by using the Powershell 'dot include' syntax:

. C:\ExampleMigration\CommonFunctions.ps1

# we can then call functions in this script ...

Note that powershell files that are 'dot included' in this way can also add variables into your script as well as functions. One use for that is to define connection strings in one script and then 'dot include' that script elsewhere so you don't have to keep defining the same variables.
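For example, CommonFunctions.ps1 might hold the shared connection string and log folder alongside the functions (the values here are obviously just placeholders):

# C:\ExampleMigration\CommonFunctions.ps1
# variables defined here become visible to any script that dot-includes this file
$connectionString = "Data Source=SomeSqlServer;Initial Catalog=SomeDatabase;User Id=SqlUsername;Password=SqlPassword;"
$dlLogFolder = "C:\ExampleMigration\Logs\"

# the SqlToCsv and CheckErrorFile functions described in this post can live here too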

Powershell: SQL to CSV to Data Loader

CSV files are a necessary evil but it's likely that the real data you're after is in SQL and the CSV is just a stepping stone. It's possible to get the Apex Data Loader to connect directly to a database, but frankly setting that up looks like a bit of a Java/JDBC/XML nightmare.

With Powershell it's pretty easy to go from SQL Server to CSV. This example uses ADO.NET to fetch the data, so it could probably be adapted easily for databases other than MS SQL.

# function to execute query and write results to CSV
function SqlToCsv {

    Param ([string]$pconnectionString, [string]$pquery, [string]$pextractFolder, [string]$pextractFile)

    $extractPath = [string]::Format("{0}{1}", $pextractFolder, $pextractFile)

    $connection = New-Object System.Data.SqlClient.SqlConnection
    $connection.ConnectionString = $pconnectionString

    $command = New-Object System.Data.SqlClient.SqlCommand
    $command.CommandText = $pquery
    $command.Connection = $connection

    $SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
    $SqlAdapter.SelectCommand = $command
    $DataSet = New-Object System.Data.DataSet

    # run the query and fill the dataset (the adapter opens and closes the connection for us)
    [void]$SqlAdapter.Fill($DataSet)

    # dump the data to a csv
    $DataSet.Tables[0] | Export-Csv $extractPath -NoTypeInformation
}

You call it like this:
$connectionString = "Data Source=SomeSqlServer;Initial Catalog=SomeDatabase;User Id=SqlUsername;Password=SqlPassword;"
$extractFolder = "C:\ExampleMigration\Data\"

SqlToCsv $connectionString  "select * from some_table "   $extractFolder  "mycustomobject__c.csv"

You'll want to set the $extractFolder and CSV file name to match the file that the Data Loader is looking for as defined in the process-conf.xml (see above).

Note that the CSV files produced are pretty robust (e.g. all strings are enclosed in speechmarks, speechmarks within the strings are properly escaped, etc).

If you're dealing with unicode data then you'll want to create a UTF8 CSV file. This can be done by modifying the last line of the above SqlToCsv function, like so:

    $DataSet.Tables[0] | Export-Csv $extractPath -NoTypeInformation -Encoding "UTF8"

In this scenario you'll also want to set "dataAccess.readUTF8=true" in the Data Loader config.properties file - see above.

Everything so far, again

So here's a full example tying all the Powershell stuff together.

It's assuming:
  • you've set up your config files and mapping files as described above.
  • you've defined a file called "C:\ExampleMigration\CommonFunctions.ps1" that contains the SqlToCsv and CheckErrorFile functions above
# include our functions 
. C:\ExampleMigration\CommonFunctions.ps1

# set up locations/variables for data loader
$dataloader = "C:\Program Files\\Data Loader\bin\process.bat"
$dlconfig = "C:\ExampleMigration\DataLoaderConfig"
$dlLogFolder = "C:\ExampleMigration\Logs\" 
set-location "C:\Program Files\\Data Loader\bin\"

# set up locations/connection strings for SQL to CSV
$connectionString = "Data Source=SomeSqlServer;Initial Catalog=SomeDatabase;User Id=SqlUsername;Password=SqlPassword;"
$extractFolder = "C:\ExampleMigration\Data\"

# get the SQL data into CSV
SqlToCsv $connectionString  "select * from some_table "   $extractFolder  "mycustomobject__c.csv"

# execute the data loader
& $dataloader $dlconfig myJobName

# check for errors in the error.csv file
CheckErrorFile ("{0}error_myJobName.csv" -f $dlLogFolder) 


It's not particularly pretty, and yes, there are a lot of folder locations that need to be defined, but if you go down this route the advantages are:
  • Free (assuming you have a windows box to hand)
  • Reasonably efficient because the Apex Data Loader can use the Bulk Data API
  • Very fine control over the mappings and external IDs used to import data
  • You can do all your transformations using SQL instead of fiddly visual box things in some integration package
  • Powershell will let you do a multitude of other things if needed
  • Everything is either a config file, a SQL file or a powershell file and so can be checked in to source control


Thursday, May 29, 2014

Remote Desktop on High DPI screens

Scott Hanselman wrote a nice blog post back in January about some of the issues you might face running Windows on a High DPI screen like that of a Surface Pro or Lenovo Yoga. I'm kind of mystified that he didn't mention Remote Desktop though, because that's been the number one problem for me on High DPI screens.

That said, if you remote into very recent Server OS's like Windows Server 2012 R2, then apparently Remote Desktop will sort out the DPI scaling automatically. Perhaps Scott hadn't noticed the Remote Desktop issue because he only remotes into Server 2012 R2. Certainly if I was Scott, I wouldn't remote into anything less than that.

But, in practice, I regularly have to remote into Server 2008 machines and yes even Server 2003. If you do that from a high DPI screen, the remote desktop is rendered at regular pixel size, which makes everything tiny.

It's hard to show screenshots of High DPI screens that correctly convey the pixel size, so I've photoshopped them onto a picture of a Surface and its pen, to give you an idea of scale:

Suffice to say, it's hard to get anything done when everything is that size. There are an annoying number of not-quite-right solutions to this:
  • The Windows RDP client has 'Smart Scaling' - that will let you scale a desktop down to display a large amount of desktop in a smaller box ... but it won't let you scale up to display a smaller-res desktop in a larger box.
  • The Metro RDP client has a Zoom feature - but it won't let you set the screen resolution of the remote machine (so you can't, for example, connect at 1280x800 and scale up)
Turns out the solution is to install an alternative RDP client with a similar sounding name of Remote Desktop Connection Manager 2.2

This is a souped-up version of the Windows RDP client, apparently intended for people switching between many different servers. It displays a tree of servers in a panel on the left of the screen, and displays the remote machine's desktop on the rest of the screen. With the right settings, you can get it to display a sane sized desktop on a high DPI screen:

As you can see, the remote machine now appears within the Remote Desktop Connection Manager client area, but it is rendered at a much more usable scale.

To get it to display correctly, I needed to set the 'Remote Desktop Settings' after adding a new server, and select the 'Same as client area' option:

Thanks to Falafel Software's blog post for pointing me in the direction of this solution.

Edit (July 2015): This is a Windows 8 solution - it won't work on Windows 7. Also, there is now a newer version of Remote Desktop Connection Manager (2.7) that doesn't fix scaling as described here. So you need to get the 2.2 version (linked above) in order for this to work.

Wednesday, May 14, 2014

Salesforce User License Transitions

In Salesforce, User License types are tied to Profiles, so to change the User License type of a User, you need to change their Profile. However, only certain types of transitions are allowed:

Core Licenses:

aka "Salesforce Licences". You can transition users between these six core licenses, but you can't transition from these six to any of the others. That is, you can't convert to Chatter or Community/Portal licenses.
  • Salesforce
  • Salesforce Platform
  • Force.com - App Subscription
  • Force.com - Free
  • Partner App Subscription
  • Only

Chatter Licenses:

  • Chatter Free
    This license can be changed to any of the six Core Licences listed above - but then you cannot change back
  • Chatter External
    This license can be changed to Chatter Free or any of the six Core Licences listed above - but then you cannot change back

Customer Community Licenses:

You can transition users between these four Customer Community type licences, but you can't transition from these four to any of the others. That is, you can't convert to Core Licences or Chatter, etc.
  • Customer Community
  • Customer Community Login
  • High Volume Customer Portal
  • Authenticated Website

Partner Community Licenses:

You can transition users between these four Partner Community type licences, but you can't transition from these four to any of the others. That is, you can't convert to Core Licences or Chatter, etc
  • Partner Community
  • Partner Community Login
  • Gold Partner
  • Silver Partner

In summary: You can transition from Chatter to Core, but apart from that you cannot move between Core, Customer Community and Partner Community licenses.

To find out more details about the various User License types, see Salesforce Help - User Licenses Overview. But make sure you've got a decent hat on, because you're going to be scratching your head. A lot.

edit: Along similar lines, I just found this great blog post by Marty Chang - Allowed User License Changes: Upgrades, Downgrades, Switches which uses a large matrix to show all the allowable license transitions.


Friday, February 28, 2014

Copying data to Salesforce Sandboxes using TalenD

A common problem with Salesforce Developer Sandboxes is that they are blank. Really you're going to want some data in there, so there are various strategies for copying data from your live instance to the Sandbox.

There are some paid-for solutions - SFXOrgData, Salesforce Partial Data Sandboxes - but if you've got a decent ETL tool you can build your own.

There are a bunch of free ETL tools for Salesforce:
  • JitterBit Data Loader is good for quick ad-hoc tasks but the free version makes it difficult to manage specific ETL projects or share projects with other users
  • Pentaho Community Edition - an open source edition of the enterprise version
  • Apatar was a free open source Salesforce ETL which still works, but development seems to have stopped around 2011
  • TalenD Open Studio is an open source ETL tool
For the task of copying data from live to a Sandbox, either Pentaho or TalenD Open Studio could be used, depending on preference. Here's a good comparison of the differences between them. For the purposes of this blog post, I'm going to concentrate on TalenD.

Introducing TalenD

TalenD is an open source ETL tool based on the Eclipse IDE. It has a multitude of features and connectors, which makes it very useful.

And, not to beat about the bush, it also makes it very big and potentially confusing. It's one of those really really big Java programs that has that ... well, "Java" sort of look to it. It has good Salesforce support but it's not built specifically for Salesforce, which means there are some idiosyncrasies in the way it handles custom objects, for example.

But, on the plus side, it works, it's pretty fast with data, it's properly open-source and it has a fairly big community of users.

You can download it here. The installer is 429 MB. Yes, that's right, nearly half a gig. Half a gigabyte of Java desktop. I blame Eclipse. Anyway, let's run through getting it working with Salesforce ...

(NB: On my machine (64-bit Windows 8), the TalenD installer did not create any shortcuts to start TalenD. There were a bunch of different EXEs in the "C:\Program Files (x86)\TalenD\TOS_DI-Win32-r111943-V5.4.1" folder. The "TOS_DI-win32-x86.exe" worked for me.)

Our Mission

In this example we want to transfer a subset of the Contacts from our Live site to our Sandbox. It's a subset because the Developer sandboxes have a limited amount of space.

We also have a picklist/lookup table called 'Animal' (that's a custom object, obviously) - we want to transfer all of those Animal records.

In our example, each Contact can have one or many 'Pet' records. We want to transfer all of the Pet records that belong to our subset of Contacts.

Note that Pet has a foreign key to both Contact and Animal - handling foreign keys when moving data to Salesforce Sandboxes is a bit of a challenge, so that's why our example has two of them.

Note that we are not transferring Accounts, just Contacts, to make this example easier. In my particular case, the underlying Salesforce instance has the Salesforce Non-Profit Starter Pack (NPSP) which uses some tricks to make the Contact -> Account relationship less important (it automatically generates an Account for each Contact for you). In a regular Salesforce instance, not setting the Account for a Contact is allowed but will lead to it being a 'Private Contact' only visible to its owner. For this example, that'll have to do. If you need to transfer Accounts as well in your scenario, it should be fairly clear how to extend this example to allow that.

Here are the object definitions and example data used for the Animal__c and Pet__c custom objects, if you want to see exactly what this example will be working with:
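In outline (ignoring the standard system fields) the two custom objects look like this:
  • Animal__c: Name, plus the 'Sandbox Migration Id' field described below
  • Pet__c: Name, Pet Ref (autonumber), Contact__c (lookup to Contact), Animal__c (lookup to Animal__c), plus the 'Sandbox Migration Id' field described below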

The Contact object in this example is the usual Contact object. One custom field has been added - Sandbox_Migration_Id__c, the significance of which will be explained below. For this example we will only be transferring contacts with a LastName matching 'Testperson' (as shown in the Pet list above).

Creating a Project and Job

When TalenD starts, it asks you to create an initial Project. Press the Advanced button and you'll get a few more options, like this:

Note that TalenD uses a master 'Workspace' area - this is your chance to change it to something else. Then press Create ... and give your Project a name. I went for "CopyToSandbox". Then select it and press Open...

You might get asked here to sign up to the TalendForge community. I'm sure it's lovely, but let's press the Skip button for now...

You then get a TalenD welcome screen which gives you the chance to catch up on the latest ... er ... TalenD news. Select the "Create a new Job ..." option, or just press "Start Now" and right click on 'job designs' to create a new Job.

Let's call the Job "CopyToSandboxJob". You should then end up with something looking like this:

On the left you have a Job called "CopyToSandboxJob 0.1" under "Job Designs". On the right you have a bazillion different components to use (we're going to use exactly 4 of them). In the middle is the blank slate where we're going to wire up our job. At the bottom is a multi-tab Properties box type area where most of the fiddling around will happen.

Connections and Meta-data

First we need to define connections for our Source and Target Salesforce instances:

  • From the  Palette on the right, go to Cloud -> Salesforce and drag a tSalesforceConnection component onto the central canvas ('designer')
  • Double click on the component and then the 'Component' tab at the bottom of the screen will show its properties:
  •  You just need to set three things here:
    • Username - in double speechmarks
    • Password - in double speechmarks (sometimes, you may need to append your security token after the password)
    • Salesforce WebService URL: for live instances the default value will be correct. For sandboxes change the first 'login.' part of the URL to 'test.' as above.
  • The speechmarks are pretty important. TalenD treats settings a bit like code, so in the component properties, generally anything not in speechmarks is interpreted as a variable name. There are some exceptions to this rule ...
  • We also need to give this component a friendly name so click on the View side-tab:

  • Label format will default to __UNIQUE_NAME__ which tells TalenD to generate a name.
  • But you can replace Label format with your own text and the component will then be given that name on the canvas. In the screenshot above we've called it 'SalesforceSource'.
  • So now we've got a Source connection. Add another tSalesforceConnection and repeat the process to create a 'SalesforceTarget'.
Next we need to set up a Repository of Meta-data - a description of the Salesforce objects that we are going to be dealing with. TalenD is keen on meta-data, and likes to have schemas centrally defined rather than each component creating its own - although it can do that as well. Getting the meta-data right is especially important with Salesforce Custom Objects.

In the Repository tree on the left of the screen, expand Metadata and then right-click on Salesforce and select Create Salesforce Connection:

 The Meta-data repository cannot re-use the Source and Target connections we created inside the Job, unfortunately, so it needs its own connection:
  • In page 1 of the connection wizard, set Name to MetadataConnection and press Next (no speechmarks needed in this wizard)
  • On page 2, set Web service url, User name and password. Probably best to connect to the same Salesforce instance as the SalesforceTarget connection defined above. As before, if it's a sandbox, replace the 'login.' part of the Web service url with 'test.'
  • You can then press the 'Check Login' button to test the connection. If it's OK, press Finish.
Now right-click on the 'MetadataConnection' node you just created and select 'Retrieve Salesforce Modules'

You are then presented with a big list of the objects in the Salesforce Instance that you connected to. You need to tick the box for the objects you want to work with (in this case, Contact, Animal__c and Pet__c). As you tick each box, the Creation Status column will say 'Pending' and then 'Success' as it downloads the metadata for each table.

Some hints: Custom Objects will be at the bottom of the list. You can use the Name filter box to filter the list, however sometimes this seems to cause the 'Finish' button to disable itself, meaning you cannot complete the dialog. So you're probably better off just scrolling through the list, ticking the objects you need.

When you've ticked all the objects you need, press Finish. The metadata section will then look something like this:

Digression: Foreign keys, Ids and External Ids

We're going to copy Salesforce data from Source (e.g. live instance) to Target (e.g. a developer sandbox). However, we can't copy it perfectly; there are some read-only fields that we cannot set on the Target:
  • Salesforce will generate new 'Id' values in the Target instance. Source Ids will not match Target Ids.
  • Autonumber fields (such as Pet Ref in our example) will be regenerated in the Target and so probably won't match the Source
  • Other system fields such as Created By/Last Modified By will be generated at the Target.
The Id change is the most serious issue. Remember that we are copying Contact, Animal__c and Pet__c, and that Pet__c has foreign keys to both Contact and Animal__c. In the Source database, those relationships are represented by Id values, but in the Target database all the Id values are going to be different.

We can get around this by making use of External Ids. In our example (and you should consider doing this with whatever object you are using) we have added an External Id to all three objects:
  • Sandbox Migration Id (Sandbox_Migration_Id__c) text length 18, External ID
In the source data, this field is blank. But in the target data we will set this field to the original Id of the source row. This will enable us to do two things:
  • Import the data using upserts, so that the job can be re-run without creating duplicate data in the target
  • Relate the lookup values in Pet__c to Contact and Animal__c, by using the External IDs on Contact and Animal__c.
Note that strictly speaking you only need the 'Sandbox Migration Id' fields in the target instance. But it would be a pain to manually add them to a sandbox every time you create a new one. For that reason I'd recommend adding these fields to the live instance. You won't actually use them in the live instance, but every time you generate a Sandbox the fields will be there ready to be used.
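To make that concrete: a hypothetical Animal__c row with Id a0B4E000000XyZ1 in the source org gets a brand new Id when it is upserted into the target, but its Sandbox_Migration_Id__c gets set to a0B4E000000XyZ1. When the Pet__c rows are loaded, each Pet's Animal__c lookup is resolved by matching the source Animal Id carried in the Pet data against Sandbox_Migration_Id__c in the target - and likewise for the Contact lookup.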

External Ids can be a bit mind-bending but it should be clearer as we work through the example:

Building a Subjob to transfer data

Here we're going to use a tSalesforceInput component, a tSalesforceOutput component and a tMap component to transfer the Animal__c data from Source to Target:

Drag a tSalesforceInput component from the Palette on the right over to the canvas. "tSalesforceInput" here means 'input data into the Talend job from Salesforce'.

There are lots of different settings on the tSalesforceInput component. Here's what we need to set:
  • Check the "Use an existing connection" checkbox. The login fields will disappear and be replaced by a picklist that lets you choose from the tSalesforceConnection objects already on the canvas. We want to choose the 'SalesforceSource' one.
  • Leave 'Query Mode' as 'Query'
  • The 'Module' field lists the standard Salesforce objects. Scroll right to the bottom of the list and choose 'Custom Object'. A new textbox 'Custom Object Name' will appear, in which we want to type "Animal__c" (with speechmarks). 
  • Now the tSalesforceInput component knows what object we want to input. But it also needs to know the schema:
  • Change the 'schema' picklist from 'Built-in' to 'Repository'. A textbox and an ellipsis button will appear to the right of the picklist. Press the ellipsis button and you will get a dialog called "Repository Content". Expand the Salesforce node and drill down into the tree until you get to the Animal__c table icon:

  • Next we want to clear the Query Condition (we want to transfer all the Animal__c rows). So set Query Condition to blank.
  • Your settings should now look like this:

  • Finally we'd better give this component a different name. Switch to the view tab of the settings and replace Label format "__UNIQUE__NAME__" with something like "InputAnimal"
Next we need to configure the tSalesforceOutput component:

Drag a tSalesforceOutput onto the canvas (it's output as in 'output from the talend job into Salesforce'). We need to set its properties like so:
  • As with the input component, check the 'Use an existing connection' checkbox and select a connection, this time the SalesforceTarget one
  • Change 'Action' to 'Upsert'. We're going to use Upserts everywhere, so that the job can be re-run as much as we want during testing. An 'Upsert Key Column' picklist will appear, but it will be empty - we'll come back to that.
  • In the Module picklist, as with the Input component, scroll to the bottom and select 'Custom object'. Then type "Animal__c" (with speechmarks) in the box that appears.
  • For Schema choose "Repository" and then (as with the Input component) press the ellipsis button and choose the Animal__c table from the Repository Content dialog (see screenshot above).
  • Switch to the View tab and set Label format to something like "OutputAnimal"
  • Upsert Key Column will have to remain blank for now, so the properties should look like this:

We need to adjust the schema for this output component - by default it will try to set all fields of the object in Salesforce, but some of them (like 'Id') are read only, which will cause an error. So:
  • Click on the 'edit schema' ellipsis button
  • A dialog will appear with three options: View schema, Change to built-in property, Update repository connection. Leave it at "View schema" and press OK
  • You will then see a 'Schema of OutputAnimal' dialog listing all the fields. Each one has a 'Use' tickbox. We want to untick the read-only/system fields, like so:

  • Then press OK.
  • Our Output component is now done. Mostly. Except for Upsert Key Column...
Now we are going to add a  tMap component which will join the input and output:
  • In the palette on the right, go to the 'Processing' folder and drag a tMap component onto the canvas. Try and put it between the Input and Output components. Its default name will become 'tMap_1'
  • Right-click on the 'InputAnimal' component and start dragging ... you should see a line coming from the component to your mouse cursor. Drag your cursor to the 'tMap_1' component and then release the button. InputAnimal should now be connected to tMap_1:
  • If you can't get the right-click-and-drag to work, the other option is to right click on InputAnimal and select "Row -> Main" from the top of the right-click menu. You will then be in 'drag a line' mode and can go and click on tMap_1 to join the two components together.
  • Now we right click on tMap_1 and drag the connecting line to the 'OutputAnimal' component. This time a dialog will appear; TalenD wants us to name this new connection.
  • Type something like 'MapAnimalOut' (you will notice the unhelpful 'input is invalid' warning will disappear once you type something) and press OK.
  • You will now get another dialog - "Do you want to get the schema of the target component?". Press Yes. What this means is, the tMap output will use the schema of the OutputAnimal component that you connected it to.
We now have Input, tMap and Output connected like this:

The light blue box that has appeared around the connected components is called a Subjob. You can have many Subjobs within a Job. The little red X above the OutputAnimal component is there because we still haven't filled in the 'Upsert Key Column'. We will get round to it.

Now double click on the 'tMap_1' component and a big mapping dialog will appear. Here's how it looks before we fill anything in:

You can do a lot with this tMap component. The input is on the left - it will have picked up the schema from the 'InputAnimal' component. The output on the right has fewer fields because we adjusted the 'OutputAnimal' schema to remove the read-only/system fields. More details of the schemas are shown at the bottom. This dialog will let us map input fields to output fields, but it can also do lots of other things, like formulas, multiple outputs and so on.

We want to set the mapping as follows:
  • Press the 'Auto-map!' button at the top-right. This will map fields with matching names together. You'll see a bunch of yellow mapping lines appear going from left to right.
  • The default guess of Auto-map is right for most fields, but not for Sandbox_Migration_Id__c (see 'Digression: Foreign keys, Ids and External Ids', above). We want to map Id from the left to Sandbox_Migration_Id__c on the right. Unfortunately, clicking on the yellow lines doesn't give you a way of deleting them. To remove a mapping we must edit the 'Expression' column in the output grid on the right hand side.
  • Click on the cell that is in the 'Expression' column for the 'Sandbox_Migration_Id__c' field in the MapAnimalOut grid. The cell will become editable - remove the text that is there (row1.Sandbox_Migration_Id__c) so that the cell is blank. The yellow line leading to that field will then disappear.
  • Now drag the 'Id' field from the input grid on the left to that same cell on the right (that is now blank). You should then get a yellow arrow from 'Id' on the left to 'Sandbox_Migration_Id__c' on the right.
  • The other way to achieve this is to type 'row1.Id' into the blank cell.
  • Your mapping should now look like this:

  • In which case press OK to close the mapping dialog.
Now that the output component 'OutputAnimal' is all connected up, we can set the Upsert Key Column that we had to skip earlier:
  • Double click on 'OutputAnimal'
  • Set 'Upsert Key Column' to 'Sandbox_Migration_Id__c'
  • Everything else on that component should be OK because we did it earlier.
This is probably a good time to save the project - press the Save icon in the TalenD menu, if you haven't been doing that out of habit anyway.

First run - transferring data for a single object

We're now ready to run the thing, and what that should achieve is copying all Animal__c data from Source to Target, mapping the Source Ids into the Target 'Sandbox_Migration_Id__c' field along the way.

To run the project, switch to the 'Run Job' tab at the bottom of the screen and press the big Run button.

Before it runs, TalenD will show a 'Building Job' progress bar. What it's actually doing here is taking the job and compiling it into a stand-alone Java program that you could potentially run separately. TalenD is based partially upon the Eclipse Java IDE, and so it is able to build programs. We're not really interested in that ability right now - we just want to run it. But it's quite impressive.

The job will then run. If it all goes well, you'll see row counts and data transfer rates appear on the connecting arrows, blue before running and green when finished. You'll also get a few messages in the Run tab about statistics:

If anything goes wrong, you'll get red error warnings on the canvas, and detailed error messages in the run tab. If it does fail, check the following:
  • Are your Salesforce credentials in SalesforceSource and SalesforceTarget correct?
  • Did you edit the OutputAnimal schema to remove the read-only/system fields? (see above)
  • Did you set 'Upsert Key Column' on the OutputAnimal schema? (see above)

Now, more of the same, slightly different

After much step-by-step work we now have one table being transferred. Now we want to add the other two for our example - Contact and Pet__c. This will largely mean repeating the steps above, but with some differences.

To transfer Contact, drag a tSalesforceInput component onto the canvas and set its properties as follows:

That is, mostly the same as 'InputAnimal', but:
  • For 'Module' we can select 'Contact' from the picklist because it's a standard Salesforce object rather than a custom one
  • In our example, we want to filter the Contacts from the source data, so here we have a Query Condition of "LastName = 'Testperson'" (with speechmarks)
  • Contact can have a lot of fields so you may want to use the 'Edit Schema' button to untick 'Use' for some of the columns so that TalenD does not read them from Salesforce. That makes the mapping a bit easier to handle. It also reduces the number of relationship/lookup fields you might have to deal with. Here are the columns I left ticked on the Contact input schema for this example:
Then drag a tSalesforceOutput component on and set its properties like this:

As with Animal, we'll have to come back and set Upsert Key Column later.

We need to press the 'Edit schema' button and untick 'use' for any read only/system columns. And for any other columns we're not interested in. Here's the fields I'm using in this example:

To keep this example simple, we're not including any Lookup/Parent-Child fields. Specifically, we're not trying to fill in the Account for the Contact.

We are including the Sandbox_Migration_Id__c field in the Contact Output schema - this is important for the upserts and Pet__c to Contact relationship to work.

As before, add a tMap to the canvas, link the Input to it, and then link the tMap to the output. Then set the mapping as follows:

That is - auto-map everything, except for Id in the Input which should map to Sandbox_Migration_Id__c in the output.

The final thing to get Contact working is to go back to the OutputContact component and set the 'Upsert Key Column' to 'Sandbox_Migration_Id__c'.

Second run - transferring data for two objects

If you run the job again, you should see something like this:

Note that we now have two Subjobs in our TalenD job. TalenD will run them in whatever order it likes (usually, the one added first runs first). That's OK so far, because Animal__c and Contact are not directly related. However in the next stage we're adding a third Subjob and we will need to tell TalenD how the Subjobs depend on each other.

Adding a third object with lookups to the other two

Now we want to add the Pet__c object. Pet__c has lookup fields to both Contact and Animal__c, and so there will be a few extra settings this time.

The tSalesforceInput component for Pet__c is much like the one for Animal__c:

The one difference is that we want to filter Pet__c, using the same Contact filter. So the Query Condition "Contact__r.LastName='Testperson'" (with speechmarks) achieves this by filtering on the parent Contact's fields.

The tSalesforceOutput component for Pet__c is also much like the one for Animal__c:

As ever, we need to come back and set 'Upsert Key Column' in a minute. We also need to do some things on the Advanced Settings tab to get the relationships working. The output schema for Pet__c must be edited to skip all system/read-only fields, like so:

The tMap component for Pet has the usual mapping - everything automapped except for Id to Sandbox_Migration_Id__c:

The complicated bit, with Pet__c, is getting the relationships to Contact and Animal__c to work. To do this we have to go to the Advanced Settings tab of the OutputPet component, and set the 'Relationship mapping for upsert' grid like this:

The 'Relationship mapping for upsert' grid is pretty confusing to fill in: the first column is a picklist, but the other three columns are just text fields where you have to type the field names (hence the speechmarks in those columns).

What these two rows in the grid are saying is:
  • The Pet__c data has a Contact__c column. Follow the "Contact__r" relationship from Pet__c to Contact and then match the value to the "Contact.Sandbox_Migration_Id__c" column.
  • The Pet__c data has an Animal__c column. Follow the "Animal__r" relationship from Pet__c to Animal__c and then match the value to the "Animal__c.Sandbox_Migration_Id__c" column.
Two hints for this grid:
  • Any fieldnames you type in need to be wrapped in speechmarks (except the first column which is a picklist)
  • Relationships names from custom objects to other objects generally end in '__r'. However relationships between standard objects do not have the '__r' suffix.

Setting the Subjob order

One more thing we need to do is tell TalenD to only run the Pet Subjob after the other two Subjobs:
  • Right click on the InputAnimal component and select 'Trigger -> On Subjob OK' from the menu.
  • You will now have a line going to your mouse pointer. Click on the InputContact component.
  • TalenD now knows to only run the Contact SubJob if the Animal Subjob has finished with no errors.
  • Right click on the InputContact component and select 'Trigger -> On Subjob OK' from the menu.
  • Click on the InputPet component to link the line to it.
  • TalenD now knows to only run the Pet Subjob if the Contact Subjob has finished with no errors.

Third run - transferring data for three related objects

Now that all three objects are set up, we can run the job again. If it all works it should look like this:


Copying a sample of Live data into Developer sandboxes so that you have something to work with is a common problem in Salesforce. There are various tools for helping with this, including several free ETL tools. TalenD is a whopping big Java ETL that can do the job. When you transfer data, some values get freshly generated, including object Ids. This means care has to be taken with relationships between objects to make those relationships transfer properly. The approach taken above with 'Sandbox_Migration_Id__c' fields is one way of tackling this.

Building an ETL package like this to populate a Sandbox takes a bit of effort, but once you have done it you can re-use it again and again whenever you refresh a sandbox. Just changing the credentials in the 'SalesforceTarget' component is all you need to do to point the package at a different sandbox.


Friday, September 20, 2013

Salesforce and Dynamics CRM pricing for Charities

Currently I'm working on two separate projects for two different charities - one of them implementing Dynamics CRM, the other implementing Salesforce.

As I've said before, the two systems are surprisingly similar in philosophy and features, but there are some differences. Probably the most important one to discuss is price. Both Salesforce and Dynamics CRM offer steep discounts for Charities and Not-for-Profits. This graph of total cost per month versus number of users sums it up:

Broadly this shows that Salesforce is actually cheaper (or free) for small numbers of users, although when you get to more than about 16 or 17 users Dynamics CRM Online works out cheaper. The estimated charity price for the imminent new release of Dynamics CRM Online is also shown, but this has not been confirmed yet (see below).

This graph hides a lot of the nuances though, so here's a more thorough discussion.

Salesforce pricing for Charities

Salesforce is very clear about its Charity pricing - qualifying charities get 10 licenses for free, and any subsequent licenses have an 80% discount.

So applying that discount to the UK prices we get a basic user cost of:

Salesforce Enterprise Edition Charity price: £17 per user* per month (+VAT)
* = first 10 users free
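So, for example, a charity with 25 Enterprise Edition users would pay for 15 of them: 15 × £17 = £255 (+VAT) per month.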

Note that there are other cheaper editions of Salesforce (Professional, Group etc) which may be adequate for small organisations or specific situations. But everyone I've encountered has been using Enterprise Edition because it includes Workflow and the Web Services API (among lots of other things).

Within a given Edition of Salesforce (say, Enterprise Edition) you are allowed to mix different user license types. For example 'Salesforce Platform' licenses are cheaper but do not have access to some of the built-in objects and features. In practice though, to take advantage of this, the Salesforce system you are building will need to be very carefully designed by someone that both understands all the difference license types and understands the requirements of what you are trying to build. That is no mean feat.

The amount of storage space Salesforce gives you is related to the number of user licenses. For Enterprise edition, it works like this:
  • Minimum for all Enterprise customers:
    • 1 GB of data storage
    • 12.5 GB of file storage
  • When there are more than 50 licenses:
    • 20 MB of extra data storage for each user license (after first 50)
    • 612 MB of extra file storage for each user license (after first 50)
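So, for example, an Enterprise org with 60 user licenses gets 1 GB + (10 × 20 MB) = 1.2 GB of data storage, and 12.5 GB + (10 × 612 MB) ≈ 18.5 GB of file storage.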
You can purchase extra space, although prices for this are not published and are generally described as 'expensive'.

Salesforce will apply the same 80% charity discount to other items and services as well, such as customer portal logins. However they are very unlikely to offer any further discount on top of that.

Dynamics CRM pricing for Charities

Firstly, I should point out that Dynamics CRM can be hosted in three different ways:
  • Host it Yourself ("On Premise") - you buy the server and user licenses, install it on your own hardware. This is unlikely to be simple or cheap once you take account of availability, security, updates, backups and support, and is unlikely to make sense even for very large charities. So I'll leave it out of this article.
  • Hosted by a Third Party - wherein someone other than Microsoft hosts it for you, e.g. people like Logica, Fujitsu or Accenture. This is worth considering, but pricing depends entirely on the third-party so I won't include it in this discussion.
  • Dynamics CRM Online - the cloud version, hosted by Microsoft. Going forward, this is where Microsoft expects to sell most licenses, and it is directly comparable with Salesforce own cloud hosting model. 
For Dynamics CRM Online 2011 then, the current listed commercial price is £28.70 per user per month. The official price for qualifying charities is rather less easy to find but this Microsoft page lists it as $9.99 per user per month, which then equates to about £6.75 per user per month in the UK (as confirmed by Preact in their FAQ). So that's about a 77% discount.

Dynamics CRM Online 2011 Charity Price: £6.75 per user per month
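For the 25-user example above that's 25 × £6.75 ≈ £169 per month, against £255 on Salesforce - and if you set n × £6.75 against (n − 10) × £17 you can see why the lines on the graph cross at around 16 or 17 users.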

But ... over the next few months CRM 2013 is being rolled out, and with it comes a price change. Dynamics CRM Online 2013 will come with three different pricing tiers:
  • Professional – commercial price $65 per user/month
  • Basic – commercial price $30/user/month
  • Essential – commercial price $15/user/month
See this PowerObjects blog for a discussion of what those prices get you. However the expectation seems to be that most people will go for Professional, and Professional is the only one that covers the feature set that Crm 2011 customers currently have. It seems that a minimum of 5 licenses might also be required.

Charity prices have not been revealed yet, but it looks like a price increase is on the cards. If we apply the same proportional discount as before, we get the following guesstimate:

Dynamics CRM Online 2013 Guesstimated Charity Price: £9.00 per user per month (for Professional)

I'll update this as more information becomes available.

Currently the Dynamics CRM approach to storage space is very simple: You get 5GB to start with, and the charity price for an extra GB is about £6.50 per month. With the 2013 release, they're changing it a bit, throwing in an extra 2.5 GB for free for every 20 professional licenses purchased, up to 50GB.

Other Factors

The above covers the basics, but there are other factors to take into account.


For example, "Portals" - whereby the CRM system provides a web interface for your customers to log into - have different license prices. Salesforce has a number of different portal options and licensing models - see here for an overview and this blog post for a diagram of the options. Dynamics CRM has traditionally only offered portal licenses for the non-Online versions, but it seems the new 'Essential' license for Crm Online 2013 is intended for portal users.

Dev and Test Environments

It's nearly always vital to have a development or test environment for your system so that you can develop customisations away from the live environment. Salesforce calls these Sandboxes and for Enterprise edition you generally have to purchase them separately (see here for discussion of prices).

Since Jan 2013 Dynamics CRM Online has had the option for multiple 'instances', although they have been quite expensive - about £350 per month with apparently no charity discount. Additional instances do not require additional user licenses but do take up space in the existing storage space quota.  This situation is set to improve with Crm Online 2013 - it will be possible to buy 'non-production instances' that are cheaper than regular instances, and one 'non-production instance' will be thrown in for free if you buy 25 or more Professional licenses.


For small numbers of Charity users, Salesforce works out cheaper, for more than about 16 users, Dynamics CRM might work out cheaper. There are price changes coming with Dynamics CRM 2013 that are not yet known, but it will still be cheaper than Salesforce per-user. CRM 2013 also brings a much-improved user interface. There are other factors to take account of when estimating the price of a system - storage, different license types, dev and test environments.

But which system is better? This is not an easy question to answer. Salesforce is more expensive, and one could argue that it has a broader and more mature set of features. But is it far enough ahead to justify the extra money? This comes down to which features matter to you. Both systems are large, complex systems with many many features, and although there is a lot of overlap, there are still lots of differences. In short, to choose between them you need to do an accurate, detailed analysis of the features each can offer against the features you want, and decide where you want to make compromises.


Wednesday, August 21, 2013

A closer look at the Dynamics CRM Process Control

The usefulness of Checklists

If you use standard 'flowchart' notation to describe how to make a cup of tea, you'll get something very linear, like this:

A lot of space in a standard flowchart is taken up describing what order things should happen in. But actually, for some of the steps, the order is not that important. We certainly have to wait for the kettle to boil before pouring the water into the mug, but the order in which we add milk and sugar doesn't matter. If we re-draw the process using a notation that allows for certain steps to happen in any order, we get something like this:

If you look at real world business processes, you'll probably find the same thing: although there are certain steps that do have to be ordered, a lot of them can be split into 'stages' and then tackled in any order. In many cases a series of checklists is a much better model for business processes than a classic flowchart.

The CRM Process Control

Someone at Microsoft must have noticed this too, because the new CRM Process Control is all about modelling processes as a series of checklists.

The Process Control is the centrepiece of a major overhaul to the Dynamics CRM UI that I wrote about previously. It was originally released in December 2012 update (aka 'Polaris') and is being further enhanced in the upcoming Dynamics CRM 2013 release (aka 'Orion'). It looks like this:

... and its key features are:
  • Each tab in the Process Control is a 'Stage'. Within a stage is a number of Steps
  • As you click on each Step, you are asked to do something. Generally there are three types of Step:
    • Fill in a field - a panel will appear with the appropriate control (textbox, dropdown, datepicker or whatever) to let you fill it in
    • Tick off an item - more basic, just clicking on the Step marks it as 'done'
    • Special Steps - there are some steps that do quite specific things, e.g. the 'Similar Cases' step on the Case process reveals a large panel that lets you search through other cases to help find a solution to the current one.
  • A process can span more than one entity - in the current version of CRM (Polaris), the 'sales' process starts on Lead and then moves the user to Opportunity on the second stage.

Different Types of Step

Field Steps

Many (but not all) steps correspond to a field on an entity. They render differently depending on the type of field. Here's an animation of the main field types and what their Steps look like:

(click on the animation to see the full-size version)

Note that for Lookup fields, initially the Process Control suggests a few matches based on the text entered, but there is also an option to 'Look Up More Records' which brings up a more comprehensive search panel.

In the 'Process Control Customization Tool', to add a field-related step, you simply add the step and select the field:

The convention seems to be that if a field appears in a Process Flow, you probably don't put it on the main form as well. The idea presumably is to simplify forms and only show the user fields relevant to the current stage. You can of course still put the field on the main form if you want. However (in Polaris at least) updating the field in one place will not automatically update it in the other. For example if 'Estimated Budget' appears on the Process Control and on the main form, if you fill it in on the form you'll have to refresh the screen before it also shows on the Process Control, and vice-versa.

Special Steps

Not all Steps correspond directly to a field. In the 'Polaris' version of the Customization Tool, certain entities give you a special drop-down at the top of the tool for selecting special steps:

The special steps available in Polaris are "Find Customer", "Find Case", "Similar Cases" and "Resolve Case".

The "Find Customer" and "Find Case" steps allow the user to perform a search, but they also have a 'New' button that displays a Quick Create panel that allows the user to create a new record directly:

The "Similar Cases" step is very specific - it brings up a large panel that allows the user to search cases and also displays posts and activities for those cases:

The existence of such a specific step suggests that (maybe) it will be possible to define arbitrary custom steps for the Process Control that can display arbitrary panels (perhaps defined via FormXml). I haven't seen any announcements about this, so perhaps it won't be possible for a while yet. But fingers crossed.

The "Resolve Case" step does not have any special UI behaviour but it does set the state of the Case to Resolved. Again, this suggests that it might be possible in future to tie arbitrary bits of processing to Steps. But again, I'm not aware of any announcements so far.

'Mark Complete' Steps

Several of the built in processes in Polaris have 'tick off' steps that appear as 'mark complete'. When you click on them, they change to 'complete':

This is done via an innovative use (or 'hack') of the TwoOptions field type. The Two Options are defined as 'mark complete' and 'complete', with 'mark complete' set as the default:

This makes it look nice and tidy on the Process Control - you get a nice bit of text saying 'mark complete' - but looking at the data model as a whole I'm not sure it's a very good idea. Taken away from the context of the Process Control, the 'mark complete' option could be very confusing. For example, imagine a Report or Advanced Find that is displaying a list of Opportunities. The 'Send Thank You Note' column would be full of 'mark complete' and 'complete' labels, and it won't be entirely clear that 'mark complete' actually means 'this has not been done yet'.

Improvements in Orion

The upcoming Orion release (Dynamics CRM 2013) is going to bring a number of changes and improvements to Process Flows and the Process Control.

(This screenshot taken from Reuben Krippner's recent presentation at WPC2013)

  • There are some slight style changes, such as the Process Control being collapsible, and the current stage being highlighted by a little blue flag (see screenshot above)
  • It will be possible to mark certain steps as mandatory - they must be completed before moving on to the next stage
  • It will be possible to define process flows for any entity including custom entities
  • It will be possible to have multiple process flows for an entity. For any given record, the user can then choose which process flow to use (via a 'Select Business Process Flow' dialog)
  • Process Flows will be able to span multiple entities - this was already shown in Polaris with the Sales process flow that spanned Lead and Opportunity. However in Orion this will be expanded and presumably apply to custom entities as well


So, the concept of a series of checklists is a good one, and in the new Dynamics CRM UI, Microsoft have found a way to implement it in a simple but flexible way.

As a developer, I'm left wondering about the following:
  • Will it be possible to define arbitrary UI panels that relate to Steps? (like the 'Find Similar Cases' step)
  • Will it be possible to define arbitrary bits of processing for Steps? (like the 'Resolve Case' step)
  • What 'state' is associated with a Process Flow? Most of it seems to be based on the fields in the underlying record and whether they are filled in or not, which suggests that Process Flows in themselves are largely stateless. But does CRM keep track of which 'stage' is current, and if so, where does it track that?
  • If there is 'state' relating to Process Flows, will it be possible to manipulate that state via the API - for example set which Stage a record is on, or set which Process Flow a record is connected to?
We should know more over the next three or four months as Orion is rolled out.