Thursday, January 08, 2009

Why asynchronous processing?

This article is mainly for traditional internal business data owners, product managers, and programmers.

  • Why bother to consider asynchronous processing?

# As transaction volume grows, data storage and workload become bottlenecks in a single system.
# Moving the application to larger computers (scaling up) has 2 main limitations:
- the application can outgrow the capacity of the largest system available
- it is expensive
# Distribution transparency (two-phase commit, 2PC) was tried to ensure consistency and accuracy.
- It was considered better to fail the complete system than to break this transparency.
# Of the three properties of shared-data systems (data Consistency, system Availability, and tolerance to network Partition), only two can be achieved at any given time.
- In a large distributed system, network partitions are a given.
- Therefore consistency and availability cannot both be achieved at the same time.
^ Making consistency a priority means that under certain conditions the system will not be available.
^ Relaxing consistency allows the system to remain highly available under network partition conditions.
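
To make the 2PC point concrete, here is a minimal sketch of a two-phase commit coordinator. It is a toy, not a real transaction manager: the `Participant` class, its `available` flag, and the `two_phase_commit` function are hypothetical names made up for illustration. It only shows why one unreachable node makes the whole transaction fail.

```python
class Participant:
    """Hypothetical resource manager that votes in the prepare phase."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available  # False simulates a partitioned or down node
        self.state = "idle"

    def prepare(self):
        # Vote "yes" only if this node is reachable.
        if self.available:
            self.state = "prepared"
        return self.available

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Return True if every participant commits, False if the transaction aborts."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous "yes"; otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

With two healthy participants the call returns True. If either one is created with `available=False`, the call returns False and both end up rolled back, which is the availability cost described above.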

  • Solution

By focusing on the trade-offs between high availability and data consistency, replicating data at large scale through a messaging system, and batch processing the staged data with a transactional database API, we can deliver large-scale distributed database systems and infrastructure services that are durable, available, performant, and cost-effective.
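
A minimal sketch of the "stage, then batch" idea, assuming Python's built-in `sqlite3` as the transactional database API and an in-memory `deque` as a stand-in for the messaging system. The table names (`staging`, `stock`) and function names are hypothetical; a real deployment would use an actual message broker and database.

```python
import sqlite3
from collections import deque


def make_db():
    """Create an in-memory database with a staging table and a target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, book TEXT, delta INTEGER)")
    conn.execute("CREATE TABLE stock (book TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
    conn.commit()
    return conn


def stage_messages(conn, queue):
    """Drain the queue (stand-in for a messaging system) into the staging table."""
    with conn:  # one transaction for the whole drain
        while queue:
            book, delta = queue.popleft()
            conn.execute("INSERT INTO staging (book, delta) VALUES (?, ?)", (book, delta))


def apply_batch(conn):
    """Apply all staged rows to stock in one transaction, then clear staging."""
    with conn:  # commits on success, rolls back if any statement raises
        rows = conn.execute(
            "SELECT book, SUM(delta) FROM staging GROUP BY book"
        ).fetchall()
        for book, total in rows:
            conn.execute("INSERT OR IGNORE INTO stock (book, qty) VALUES (?, 0)", (book,))
            conn.execute("UPDATE stock SET qty = qty + ? WHERE book = ?", (total, book))
        conn.execute("DELETE FROM staging")
    return len(rows)
```

Between `stage_messages` and `apply_batch`, readers of `stock` see stale quantities. That gap is the consistency window traded for availability.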

If BASE (basically available, soft state, eventually consistent) allows for availability in a partitioned database, then opportunities to relax consistency have to be identified. This is often difficult because both business stakeholders and developers tend to assert that consistency is paramount to the success of the application. Temporal inconsistency cannot be hidden from the end user, so both engineering and product owners must be involved in picking the opportunities for relaxing consistency.

Practical example:

Book quantity comes from 2 input sources, so in practice it is never fully accurate. Merging the sources has to deal with:
- update sequence
- priority
- conflict resolution
In cases like this, consistency can be relaxed, because the user simply does not care about it.
We can take advantage of the user-perceived consistency window between a write and the next read in:
# Finance batch process
# Orders and credit card batch update
# Return items batch process
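
One possible way to combine update sequence, priority, and conflict resolution for the book quantity case is sketched below. The source names (`warehouse`, `storefront`), the priority values, and the rule itself (latest sequence wins, ties go to the higher-priority source) are assumptions for illustration, not a rule taken from any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtyUpdate:
    source: str    # which input source sent the update (hypothetical names)
    sequence: int  # update sequence number; higher means newer
    qty: int       # book quantity reported by that source


# Assumed priorities: a higher number wins when sequence numbers tie.
SOURCE_PRIORITY = {"warehouse": 2, "storefront": 1}


def resolve(updates):
    """Pick the winning update: highest sequence, ties broken by source priority."""
    return max(updates, key=lambda u: (u.sequence, SOURCE_PRIORITY.get(u.source, 0)))
```

For example, if `storefront` and `warehouse` both report sequence 7, the `warehouse` quantity is kept. If `warehouse` only has sequence 6, the newer `storefront` value wins.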

ACID: transferring money from one account to another must keep the total amount the same.
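
The transfer case can be sketched with Python's built-in `sqlite3`. The `accounts` table, its `CHECK (balance >= 0)` constraint, and the function names are hypothetical and chosen only to show the atomic rollback. The credit is applied before the debit on purpose, so that a failed debit has something to undo.

```python
import sqlite3


def make_accounts():
    """Create two sample accounts; balances may never go negative."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:
        conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 50)])
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # Violates the CHECK constraint if src has insufficient funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
    except sqlite3.IntegrityError:
        return False
    return True


def total(conn):
    """Sum of all balances; a transfer must never change this."""
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
```

Whether a transfer succeeds or is rolled back, `total(conn)` returns the same value before and after, which is the invariant this case requires.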

The ratio of requirements that need ACID to those that can use BASE is about 1:100, much as roughly 100 searches generate 1 order.

Examples:
a: Flight ticket booking system
b: A 1% performance decrease caused 5% of orders to be lost
c: Publishing a new book and making it available online for search
d: User data, e.g. sale_amt, purchase_amt

References:

Good things come to those who wait.
* Scalability Best Practices: Lessons from eBay (Partition and Asynchronously)
* BASE: An ACID Alternative
* Eventually Consistent
* Latency is Everywhere and it Costs You Sales - How to Crush it

...
