Original text:

Topic: performance
  Distribution can hurt: network b/w and latency bottlenecks
    Lots of tricks, e.g. caching, concurrency, pre-fetch
  Distribution can help: parallelism, pick server near client
  Idea: scalable design
    Nx servers -> Nx total performance
  Need a way to divide the load by N
    Divide data over many servers ("sharding" or "partitioning")
      By hash of file name?
      By user?
      Move files around dynamically to even out load?
      "Stripe" each file's blocks over the servers?
  Performance scaling is rarely perfect
    Some operations are global and hit all servers (e.g. search)
      Nx servers -> 1x performance
    Load imbalance
      Everyone wants to get at a single popular file
      -> one server 100%, added servers mostly idle
      -> Nx servers -> 1x performance
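As a rough illustration of the "by hash of file name" option and the load-imbalance case in the notes, here is a minimal Python sketch. The function names `shard_for` and `load_per_server`, the file names, and the choice of SHA-256 are all assumptions made for this example; the notes do not prescribe any particular hash or layout.

```python
import hashlib
from collections import Counter


def shard_for(file_name: str, num_servers: int) -> int:
    """Pick a server index from a stable hash of the file name (hypothetical scheme)."""
    digest = hashlib.sha256(file_name.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce modulo the server count.
    return int.from_bytes(digest[:8], "big") % num_servers


def load_per_server(requests, num_servers):
    """Count how many requests land on each server index."""
    counts = Counter(shard_for(name, num_servers) for name in requests)
    return [counts.get(server, 0) for server in range(num_servers)]


# Many distinct files: requests spread across the servers.
uniform_requests = [f"file-{i}" for i in range(10_000)]
print(load_per_server(uniform_requests, 4))

# Everyone asks for the same file: every request hashes to one server,
# so the other servers stay idle no matter how many are added.
hot_requests = ["popular-file"] * 10_000
print(load_per_server(hot_requests, 4))
```

With distinct file names the counts come out roughly even, while the single popular file puts all 10,000 requests on one server. That is the "one server 100%, added servers mostly idle" case.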
The part in question:

  Some operations are global and hit all servers (e.g. search)
    Nx servers -> 1x performance

Question: why does search get no performance improvement even with N machines?
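One way to read the "Nx servers -> 1x performance" line is with a simple throughput model, sketched below in Python. It assumes every server can handle a fixed number of requests per second, and that a global operation such as search costs each server about one request's worth of work regardless of how little data that server holds. Both assumptions, and the name `max_throughput`, are illustrative and do not come from the notes. If search cost shrank in proportion to the shard size, this model would not apply.

```python
def max_throughput(num_servers: int,
                   per_server_capacity: float,
                   servers_touched_per_request: int) -> float:
    """Upper bound on client requests/sec under the simple model described above.

    Each client request consumes one unit of capacity on every server it touches,
    so total capacity is divided by the fan-out of a single request.
    """
    total_capacity = num_servers * per_server_capacity
    return total_capacity / servers_touched_per_request


CAPACITY = 1000.0  # hypothetical requests/sec one server can handle

for n in (1, 10, 100):
    keyed = max_throughput(n, CAPACITY, servers_touched_per_request=1)
    search = max_throughput(n, CAPACITY, servers_touched_per_request=n)
    print(f"N={n:3d}  single-key op: {keyed:8.0f}/s   global search: {search:8.0f}/s")
```

Under these assumptions, an operation that touches one shard scales as N times the per-server capacity. A search must be handled by all N servers, so each added server also has to process every search. The added capacity is used up by the added work, and the system-wide search rate stays at the capacity of one server.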