As with the SOAP API Upsert, working with the Salesfore Bulk API v1 is a little different to the other operations. BulkAPIv1Update, BulkAPIv1Delete and BulkAPIv1Harddelete work from the provided Id, with regards what records to operate against. Insert merely creates new records and passes the new Id back.
Whereas BulkAPIv1Upsert, as the name suggests, works primarily off a provided External Id for the given object being upserted against.
If a match is found in that Salesforce object for the value being passed in, then the Operation will update the provided fields in the table payload to the matched record.
If a match is not found, the Upsert operation will insert a new record, using the provided fields in the table payload. All load operations require the Error column, whether success or failure, to guide and inform you.
External Id
Upsert will only work against an External data type field. This is a special setting for a field in Salesforce:
For this example, the field "External_Id__c" has been added to the Account object in our Demo Org, note the special indicator "(External ID)". If your intended External Id field does not have this, it is likely not actually setup as an External Id, no matter what the field name is.
SQL Sales will inform if it is not truly an External Id, this is demonstrated in the following examples.
Create a load table Example (WAIT)
(Refer to previous sections for details on SERIAL and PARALLEL options as well as WAIT and BACK methods).
This example creates a load table called "Account_BulkAPIv1_Wait_Serial_Upsert".
drop table if exists Account_BulkAPIv1_Wait_Serial_Upsert
select top 33
convert(nchar(18),null) as Id
,convert(nvarchar(255),null) as Error
,Name
,'UPDATE_TEST_' + Id as AccountNumber
,External_Id__c
,AccountNumber as AccountNumber_Orig
into Account_BulkAPIv1_Wait_Serial_Upsert
from Account
where AccountNumber not like '0%'
and External_Id__c is not null
order by createddate desc
insert Account_BulkAPIv1_Wait_Serial_Upsert
(Id
,Error
,Name
,AccountNumber
,External_Id__c
,AccountNumber_Orig)
select top 33
convert(nchar(18),null) --Id
,convert(nvarchar(255),null) --Error
,Name
,'REJECT_TEST_' + Id --AccountNumber
,null --External_Id__c
,null --AccountNumber_Orig
from Account
where AccountNumber not like '0%'
and External_Id__c is not null
order by createddate desc
insert Account_BulkAPIv1_Wait_Serial_Upsert
(Id
,Error
,Name
,AccountNumber
,External_Id__c
,AccountNumber_Orig)
select top 33
convert(nchar(18),null) --Id
,convert(nvarchar(255),null) --Error
,Name
,'INSERT_TEST_' + Id --AccountNumber
,'INSERT_TEST_' + Id --AExternal_Id__c
,null --AccountNumber_Orig
from Account
where AccountNumber not like '0%'
and External_Id__c is not null
order by createddate desc
Check the Payload
Note, those records with a matched (to Salesforce) value in External_Id__c will result in an update, those with a new value but no match to Salesforce will result in an Insert, whereas those with no value in External_Id__c will be rejected (as External_Id__c, being the specified External Id field for the BulkAPIv1Upsert operation, es expected to be populated).
Running the example (incorrect)
exec ss_Loader 'BulkAPIv1Upsert', 'DEMO', 'Account_BulkAPIv1_Wait_Serial_Upsert','WAIT:SERIAL'
SQL-SALES BulkAPIv1Upsert run date: 2023-12-09 ------------------
20:47:17: Using Env|Schema: DEMO|dbo
20:47:17: Provided Upsert Operation value not in the correct format, for a hypothetical field called "External_Id__c"
20:47:17: set (in Salesforce) as an External Id (this is a field setting)
20:47:17: the expected input value for the standard web services API is: Upsert:XId=External_Id__c
20:47:17: or BulkAPIv1Upsert:XId=External_Id__c for the bulk API v1
20:47:17: or BulkAPIv2Upsert:XId=External_Id__c for the bulk API v2
20:47:17: with a defined batchsize (for example 100 or 1000 respectively) these would be:
20:47:17: Upsert(100):XId=External_Id__c | BulkAPIv1Upsert(10000):XId=External_Id__c | BulkAPIv2Upsert(2000):XId=External_Id__c
20:47:17: Note, the provided External Id is validated directly against Salesforce Account
20:47:17: prior to the run commencing to check it is actually defined as an External Id.
-----------------------------------------------------------------
Note, the format provided in the example is incorrect, observe the supporting help text
Running the example (correct)
exec ss_Loader 'BulkAPIv1Upsert:XId=External_Id__c', 'DEMO', 'Account_BulkAPIv1_Wait_Serial_Upsert','WAIT:SERIAL'
SQL-SALES BulkAPIv1Upsert:XId=External_Id__c run date: 2023-12-09
20:47:51: Using Env|Schema: DEMO|dbo
20:47:54: Starting Loader for Account batchsize 10000
20:47:54: SSId added to Account_BulkAPIv1_Wait_Serial_Upsert
20:47:57: Connection method BULK & BULK API
20:47:57: Bulk API method WAIT:SERIAL
20:47:57: Columns checked against Salesforce metadata
20:47:57: Starting load for Account_BulkAPIv1_Wait_Serial_Upsert
20:48:11: JobId: 7508d00000TtTzbAAF
20:48:12: Excluded: AccountNumber_Orig is not available on object Account
20:48:12: Load complete: Success:66 Failure:33
-----------------------------------------------------------------
Note the Job Id for your submission is returned in the output for your reference, see also the log table ss_BulkAPILog.
_Return helper table
The batch to which a given row has been allocated is logged in the helper table created in the run that is your load table + “_Return”, using the above example it is:
Account_BulkAPIv1_Wait_Serial_Upsert_Return
For the load table itself as this was run with WAIT, the Error column is provided, paired with the originally submitted SSId.
ss_BulkAPILog table
The Job Id is also preserved in the ss_BulkAPILog table, written on each submission.
Checking the load table
The success or failure errors for each row will be automatically written back to the Error column as with the SOAP API Update operation.
Further uses
Refer to the worked examples shown for the other BulkAPIv1 operations, they would equally apply to BulkAPIv1Upsert, ensure you take note of the guidance regarding the use of the XId switch.