What is the Force.com Bulk API?
- An asynchronous API to work with high volumes of data
- Use the Bulk API when loading more than 50K records, or for time-sensitive loads that can take advantage of very large batches
- Improves throughput when loading large data sets into Salesforce, because batches are processed in parallel
- Improves the stability of high-volume data loads and makes them easier to monitor and control
How does the Bulk API work?
Processing data typically consists of the following steps.
- Create a new job that specifies the object and action.
- Send data to the server in a number of batches.
- Once all data has been submitted, close the job. Once closed, no more batches can be sent as part of the job.
- Check the status of all batches at a reasonable interval. Each status check returns the state of each batch.
- When all batches have either completed or failed, retrieve the result for each batch.
- Match the result sets with the original data set to determine which records failed and succeeded, and take appropriate action.
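The last two steps can be sketched in Python. This is a minimal illustration, not the official client: the helper names are hypothetical, the terminal batch states (`Completed`, `Failed`, `NotProcessed`) are assumed from the Bulk API batch state names, and the result CSV is assumed to have `Id`, `Success`, `Created`, and `Error` columns with rows in the same order as the submitted records.

```python
import csv
import io

# Batch states after which a batch will not change again (assumed names).
TERMINAL_STATES = {"Completed", "Failed", "NotProcessed"}


def all_batches_finished(states):
    """Return True once there is at least one batch and all are terminal."""
    states = list(states)
    return bool(states) and all(state in TERMINAL_STATES for state in states)


def match_results(submitted_rows, result_csv):
    """Pair each submitted row with its result row, relying on identical order."""
    results = list(csv.DictReader(io.StringIO(result_csv)))
    if len(results) != len(submitted_rows):
        raise ValueError("result count does not match submitted row count")
    succeeded, failed = [], []
    for row, result in zip(submitted_rows, results):
        if result["Success"].lower() == "true":
            succeeded.append((row, result["Id"]))      # keep the new record Id
        else:
            failed.append((row, result["Error"]))      # keep the error message
    return succeeded, failed
```

A caller would poll until `all_batches_finished` returns True, then pass each batch's original rows and its result CSV to `match_results` and retry or report the failed rows.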
Avoid Lock Contention
- Lock contention occurs when one process tries to acquire a lock held by another process. If the lock is not released in a timely manner, a lock timeout (UNABLE_TO_LOCK_ROW) can occur. Parallel processing loads data faster, but it can sometimes cause lock contention on records.
- Operations that may cause lock contention include creating new users, updating ownership for records with private sharing, updating user roles, and updating territory hierarchies.
- For example, if you are loading Account Team Members and two team members for the same account are inserted or updated in different batches at the same time, this causes lock contention.
- One way to avoid lock contention is to organize the data in batches. In the example above, if all team members of the same account are kept together, there is a higher chance that they end up in the same batch, which minimizes lock contention.
- Another way to avoid lock contention is to use serial mode, which ensures that only a single batch is processed at a time. However, this slows the load. Use serial mode only when the data would otherwise result in a lock timeout and cannot be reorganized to avoid it.
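Organizing data so that records sharing a parent stay together might look like the following sketch. The function name, the `AccountId` key, and the batch size are illustrative assumptions, not part of the Bulk API itself.

```python
from collections import defaultdict


def batches_by_parent(records, parent_key, batch_size):
    """Split records into batches, keeping rows with the same parent together."""
    groups = defaultdict(list)
    for record in records:
        groups[record[parent_key]].append(record)

    batches, current = [], []
    for group in groups.values():
        # Start a new batch rather than splitting one parent's rows across two.
        # A group larger than batch_size ends up alone in an oversized batch.
        if current and len(current) + len(group) > batch_size:
            batches.append(current)
            current = []
        current.extend(group)
    if current:
        batches.append(current)
    return batches
```

For example, calling `batches_by_parent(team_members, "AccountId", 5000)` would place all team members of one account in a single batch, so two parallel batches are less likely to compete for a lock on the same account.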
What are the operations supported by the Bulk API?
- Query
- Insert
- Update
- Upsert
- Delete (soft delete: deleted records stay in the Recycle Bin for 15 days)
- Hard Delete (deletes the data permanently; use hard delete when deleting more than 500K records)
What are the HTTP methods used by the Bulk API?
The Bulk API is REST-based but uses only two HTTP methods:
- GET: retrieves data from Salesforce
- POST: sends requests that submit or modify data (DML)
What are the steps involved in using the Bulk API?
- Log in to Salesforce (Bulk API does not provide a login operation, so SOAP API must be used to log in)
- Create a job to specify which object needs to be loaded.
- Create batches of records and send them to the server.
- Close the job. Once all batches are sent, the program must close the job. After the job is closed, no more batches can be sent.
- Check batch status
- Upon completion of all batches, retrieve batch results.
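The requests behind these steps can be sketched as plain data, without sending anything over the network. This assumes the original XML-based Bulk API endpoints under `/services/async/`; the API version, the helper names, and the dictionary layout are illustrative assumptions. In a real client each request would also carry the session ID from the login step in an `X-SFDC-Session` header.

```python
from xml.sax.saxutils import escape

API_VERSION = "52.0"  # assumed version; use the one your org supports
BASE = f"/services/async/{API_VERSION}/job"
JOB_NS = "http://www.force.com/2009/06/asyncapi/dataload"
XML_TYPE = "application/xml; charset=UTF-8"


def _job_xml(fields):
    """Build a jobInfo XML body from (element, value) pairs."""
    inner = "".join(f"<{n}>{escape(v)}</{n}>" for n, v in fields)
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            f'<jobInfo xmlns="{JOB_NS}">{inner}</jobInfo>')


def _request(method, path, content_type=None, body=None):
    return {"method": method, "path": path,
            "content_type": content_type, "body": body}


def create_job_request(operation, sobject, serial=False):
    """Step 2: create a job for one object and one operation."""
    fields = [("operation", operation), ("object", sobject)]
    if serial:
        # Serial mode processes one batch at a time (slower, fewer lock issues).
        fields.append(("concurrencyMode", "Serial"))
    fields.append(("contentType", "CSV"))
    return _request("POST", BASE, XML_TYPE, _job_xml(fields))


def add_batch_request(job_id, csv_data):
    """Step 3: send one batch of CSV records to the job."""
    return _request("POST", f"{BASE}/{job_id}/batch",
                    "text/csv; charset=UTF-8", csv_data)


def close_job_request(job_id):
    """Step 4: close the job so no further batches are accepted."""
    return _request("POST", f"{BASE}/{job_id}", XML_TYPE,
                    _job_xml([("state", "Closed")]))


def batch_status_request(job_id):
    """Step 5: fetch the state of every batch in the job."""
    return _request("GET", f"{BASE}/{job_id}/batch")


def batch_result_request(job_id, batch_id):
    """Step 6: fetch the per-record results of one batch."""
    return _request("GET", f"{BASE}/{job_id}/batch/{batch_id}/result")
```

Note that every request in the sequence is either a GET or a POST. Each dictionary could be handed to any HTTP client, with the instance host prepended to `path`.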
How do you monitor bulk data load jobs?
- To monitor jobs, the user must have the Manage Data Integrations permission.
- In Setup, open the Bulk Data Load Jobs page.