MySQL sharding: does anyone have a good approach?

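We want our database layer to be cluster + sharding. There is plenty of material on the cluster side, and we have mostly got that part working, but sharding is very complicated. One option is to change the application (business) layer; the other is to add a transparent proxy layer. I looked around and there does not seem to be a good proxy available right now, and changing the application layer is a lot of work. These are the questions we are facing: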

Do I need to shard all of my tables or just the big tables?
How do I ensure my data is evenly distributed between shards?
How does sharding affect referential integrity constraints?
How do I use auto increment values and ensure unique values across all shards?
How do I perform joins between my sharded tables and non-sharded tables?
How do I run aggregate queries that need data from multiple shards?
What if I need to add more shards later on or change the sharding strategy?
How do I perform the initial sharding of my existing data?
What about joins between shards and transactions involving multiple shards?
How do I ensure data is going to the correct shard?
How do I implement HA in a sharded environment?
How does sharding affect my backup/recovery procedures?

How does everyone here do sharding? Ideally something that has been proven in production. Thanks.

siteshen

2015-05-25 06:10:41 +08:00

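1. Anything user-generated that could "overload" a single database needs sharding.
2. Just assign each shard a weight. With 3 shards at the start, use 1:1:1; when you add 3 more shards half a year later, 1:1:1:5:5:5 will do (tune the exact ratio as the business grows).
3. All foreign keys have to be dropped and then checked and maintained by hand; of course integrity is then no longer guaranteed by the DB layer.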
4. ((timestamp - CONSTANTS) << 23) + (sharding_id << 10) + (auto_increment & ((1 << 10) - 1)) (13 bits for the shard ID, so up to 2^13 shards; the low 10 bits hold the per-shard sequence)
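5. Avoid joins.
6. Aggregate within each shard, then merge the results in code.
7. Consistent hashing.
8. return sharding0 if id < 10^9
9. WTF?
10. Enforce it in code.
11. WTF?
12. Same as dropping foreign keys: there is no way to guarantee it.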
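
To make the ID layout in answer 4 concrete, here is a minimal Python sketch. It assumes millisecond timestamps, a 13-bit shard ID, and a 10-bit sequence taken from a per-shard auto-increment counter. The names EPOCH_MS, make_id, and shard_of, and the epoch value itself, are made up for illustration and do not come from the thread.

```python
import time

# Hypothetical custom epoch (2015-01-01 00:00:00 UTC, in milliseconds).
# It plays the role of CONSTANTS in the formula; any fixed past instant works.
EPOCH_MS = 1_420_070_400_000
SHARD_BITS = 13  # up to 2**13 shards
SEQ_BITS = 10    # low 10 bits of the per-shard auto-increment value


def make_id(shard_id, auto_increment, now_ms=None):
    """Pack timestamp, shard ID, and sequence into a single integer ID."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id does not fit in 13 bits")
    timestamp = now_ms - EPOCH_MS
    sequence = auto_increment & ((1 << SEQ_BITS) - 1)  # keep the low 10 bits
    return (timestamp << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | sequence


def shard_of(row_id):
    """Recover the shard ID from an ID built by make_id."""
    return (row_id >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
```

Because the shard ID is embedded in the ID, the application can route a lookup by primary key with shard_of(row_id) and no extra mapping table. The sequence wraps every 1024 values, so uniqueness relies on a single shard not issuing more than 1024 IDs in the same millisecond.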

I have not verified any of this in production. No thanks needed.
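
The weighting idea in answer 2 could be sketched roughly as follows, assuming weights only decide where newly created users are placed and that the chosen shard is then stored somewhere (for example in a lookup table), since changing the weights must not move existing data. The function names shard_for_point and pick_shard are hypothetical.

```python
import bisect
import itertools
import random


def shard_for_point(weights, point):
    """Map an integer point in [0, sum(weights)) to a shard index by weight."""
    cumulative = list(itertools.accumulate(weights))
    if not 0 <= point < cumulative[-1]:
        raise ValueError("point out of range")
    # bisect_right finds the first shard whose cumulative weight exceeds point.
    return bisect.bisect_right(cumulative, point)


def pick_shard(weights, rng=random):
    """Pick a shard for a new user at random, in proportion to the weights."""
    return shard_for_point(weights, rng.randrange(sum(weights)))
```

With weights [1, 1, 1, 5, 5, 5], each of the three new shards receives 5/18 of new users and each old shard 1/18, which lets the new shards catch up without migrating anything.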

References:
[1] http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
[2] http://media.postgresql.org/sfpug/instagram_sfpug.pdf


https://www.v2ex.com/t/193192
