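At the database layer we want to build a cluster + sharding setup. There is plenty of material on clustering, and we have mostly implemented that part, but sharding is much more complicated. One option is to change the application (business) layer; the other is to add a transparent proxy layer. I looked around and could not find a proxy that works well at the moment. Changing the application layer is also a lot of work and raises the following questions: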
- Do I need to shard all of my tables or just the big tables?
- How do I ensure my data is evenly distributed between shards?
- How does sharding affect referential integrity constraints?
- How do I use auto increment values and ensure unique values across all shards?
- How do I perform joins between my sharded tables and non-sharded tables?
- How do I run aggregate queries that need data from multiple shards?
- What if I need to add more shards later on or change the sharding strategy?
- How do I perform the initial sharding of my existing data?
- What about joins between shards and transactions involving multiple shards?
- How do I ensure data is going to the correct shard?
- How do I implement HA in a sharded environment?
- How does sharding affect my backup/recovery procedures?
How does everyone here do sharding? Approaches that have been proven in production would be best. Thanks.
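
To make two of the questions above concrete (routing data to the correct shard, and unique auto-increment values across shards), here is a minimal sketch of what the application-layer option might look like. Everything in it is an assumption for illustration: the names `NUM_SHARDS`, `shard_for`, and `ShardIdGenerator` are hypothetical, the shard count is fixed at 4, and the ID counter lives in memory rather than in the database.

```python
import zlib

NUM_SHARDS = 4  # hypothetical, fixed shard count


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key (e.g. a user id) to a shard index."""
    # crc32 is stable across processes and restarts, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    return zlib.crc32(key.encode("utf-8")) % num_shards


class ShardIdGenerator:
    """Interleaved IDs: shard i hands out i+1, i+1+N, i+1+2N, ...

    Because each shard uses a different offset and the same step N,
    two shards can never produce the same ID.
    """

    def __init__(self, shard_index: int, num_shards: int = NUM_SHARDS):
        if not 0 <= shard_index < num_shards:
            raise ValueError("shard_index out of range")
        self.step = num_shards
        self.next_value = shard_index + 1

    def next_id(self) -> int:
        value = self.next_value
        self.next_value += self.step
        return value


if __name__ == "__main__":
    shard = shard_for("user:1001")
    ids = ShardIdGenerator(shard)
    print("shard", shard, "first ids", ids.next_id(), ids.next_id())
```

Note that with plain modulo routing like this, changing the number of shards later remaps most keys, so this sketch does not address the question about adding shards or changing the sharding strategy.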