How CRM Data Updates Lead to Data Corruption

23.04.2012
In theory, it's best to clear all your database transactions as soon as you can, keeping the tables up to date so that you don't have to worry about data integrity or time-smear problems. It's not always that simple.

In the real world of cloud systems, you may have dozens of loosely coupled databases. The updates typically happen in a few seconds, but under certain conditions, such as weekend system refreshes or quarter-end reporting, transactions may be held up for an hour or even longer. That's the way it's supposed to work.

So in loosely coupled cloud systems, the first thing to do is make sure that any transactions have time stamps on all end-point systems, and make sure that your business logic understands what to do (or how to reconcile things) in case you have a "new" update that is actually several hours late.
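The timestamp-and-reconcile logic above can be sketched in a few lines. This is a minimal illustration, not a real CRM API: the function names, the one-hour staleness window, and the dictionary "store" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Update:
    record_id: str
    field: str
    value: str
    source_ts: datetime  # stamped by the originating end-point system

# Illustrative assumption: anything more than an hour late gets human review.
STALENESS_WINDOW = timedelta(hours=1)

def apply_update(store: dict, update: Update, now: datetime) -> str:
    """Apply an update only if it is newer than what we already hold;
    route a "new" update that is actually hours late to reconciliation."""
    if now - update.source_ts > STALENESS_WINDOW:
        return "queued-for-reconciliation"
    current = store.get(update.record_id)
    if current and current["ts"] >= update.source_ts:
        return "ignored-stale"  # we already hold newer data for this record
    store[update.record_id] = {"field": update.field,
                               "value": update.value,
                               "ts": update.source_ts}
    return "applied"
```

The key design point is that "latest to arrive" and "latest to occur" are different things; the source timestamp, not the arrival time, decides the winner.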

OK, that's fine for the transactional tables (such as deals, payments, or transfers). But what about the core of the CRM system, which typically doesn't have much accounting content?

CRM systems are different from other enterprise software. The standards of data quality are different, both because of the data input sources (your customers or your sales reps) and the frequency of data update (some individual records may be hit several times in a day).

The highest-risk tables in a CRM are the Opportunity (particularly if its stage moves to "closed-won") and the Account (which is typically at the top of the information pyramid). So updates to these tables should be very carefully controlled, both with access control systems ("a sales rep may not modify these fields when this record is in a particular state") and with validation rules or coded mechanisms that limit the conditions under which data may be changed.
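A state-based field lock of the kind described above, which a platform like Salesforce would express declaratively as a validation rule, can be sketched like this. The field names, the "Closed Won" stage label, and the admin bypass are illustrative assumptions.

```python
# Fields a non-admin may not change once the Opportunity is closed-won
# (illustrative set, not a real schema).
LOCKED_WHEN_CLOSED = {"Amount", "CloseDate", "AccountId"}

def validate_change(stage: str, changed_fields: set, user_is_admin: bool) -> list:
    """Return the list of locked fields an attempted edit would violate;
    an empty list means the edit is allowed."""
    if stage != "Closed Won" or user_is_admin:
        return []
    return sorted(LOCKED_WHEN_CLOSED & changed_fields)
```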

These controls need to be counter-balanced, however, with the need to respond to external systems that need to clear their transactions. For example, complex sales channels may have a need to create a "new" account that in fact already exists in your CRM system. The external system may show the account with a different name (formal legal entity versus "street name"), a different city, or other key information that is different from what you already have.

The same issue applies to an Opportunity that you might (or might not) already have in the system. At the time that the transaction comes in, there may be no way to know which version of the data is better, and sometimes both versions might need to be maintained for later reconciliation.

So our recommendation is that you purposefully create a duplicate record, with pointers to the record you believe is already there. When the record is created, an alert should be sent to your sales operations or accounting department for subsequent evaluation and data reconciliation. Typically, the new record will be made a child of your existing record in a master-detail relationship, so that the outside system can continue to update the "copy" of the data that it likes, and your accounting system can get the roll-up data it needs to balance the books.
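The intentional-duplicate pattern is simple to state in code. This is a hedged sketch: the function name, the `ParentId` / `IntentionalDupe` field names, and the alert list stand in for whatever your platform actually provides.

```python
import itertools

_ids = itertools.count(1)  # stand-in for real record-ID generation

def create_child_dupe(accounts: dict, master_id: str,
                      incoming: dict, alerts: list) -> str:
    """Create a child Account the external system can keep updating,
    pointed at the master record we believe already exists, and alert
    sales ops to reconcile the pair later."""
    dupe_id = f"dupe-{next(_ids)}"
    accounts[dupe_id] = {**incoming,
                         "ParentId": master_id,     # master-detail pointer
                         "IntentionalDupe": True}   # flags it for cleanup
    alerts.append(f"Reconcile {dupe_id} against {master_id}")
    return dupe_id
```

The external system writes to the child undisturbed; roll-up summaries on the master give accounting the consolidated numbers.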

This is a fairly straightforward decision when the Account name is nearly identical. It's a bit more of a judgment call when the new Account is ESPN and the existing Account is ABC Networks or even Disney. They're all part of the same corporation, but do they need to be a single account, or an account hierarchy? It all depends on your business rules.

OK, but what about other CRM tables? It's not at all unusual to have noisy data updates coming in for both Leads and Contacts, and that noise takes several forms.

Given these issues, simply updating your existing records can be an act of data corruption. The faulty update can cause receipts and other official communications to go to the wrong place, or can interrupt the sales cycle. Neither of these symptoms will be particularly endearing to the CRM user community.

So the intentional creation of duplicate records is again not a bad strategy - as long as you have (and live up to) an SLA around the reconciliation of data updates.

When merging records that have been intentionally created as dupes, you'll need to use some subtlety and guile. Conventional merges collapse records together using static rules (such as "most recently updated" or "best data quality"). Since the merges are typically done at the record level - rather than cell-by-cell - the losing record might have some values in it that get wiped out by the winning record.

With the intentionally created dupes, it may be appropriate for bits of the losing record to survive (such as a new phone number or additional email address). If you're going to be using standard merge logic, then, you'll need to copy these valuable parts of the losing record into "spare fields" before the merge. We typically do this with a long-text field, concatenating the extra bits of data using an "XML-lite" style (e.g., "OtherPhone:800-555-1212, AssistantEmail:joan@didion.com"). However, there are situations where a more explicit "extras" field works better.

The underlying issue: make sure that all updates are increasing the total information value of the CRM database, rather than blindly making sure it's "up to date with the latest values." Because sometimes, the latest ain't the greatest.

David Taber is the author of the new Prentice Hall book, "", and is the CEO of SalesLogistix, a certified Salesforce.com consultancy focused on business process improvement through use of CRM systems. SalesLogistix clients are in North America, Europe, Israel, and India, and David has over 25 years' experience in high tech, including 10 years at the VP level or above.

Follow everything from CIO.com on Twitter @CIOonline.