I am not sure whether this is the right place to ask this, since it is more of a logic question, but there is no harm in asking. Suppose I have a huge list of data (customers), and every record has a data_id. I want to split the data in some ratio, say 10:90. Rather than stating a condition on the ID such as (for example):
the sum of the digits is even: go to bin 1
the sum of the digits is odd: go to bin 2
or: the sum of the last three digits is x: go to bin 1
the sum of the last three digits is not x: go to bin 2
Now this may lead to an uneven split of the data, and sometimes a bin may end up with hardly any data.
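To make the concern concrete, here is a minimal sketch of the digit-sum rule that measures how the bins actually fill up. The ID ranges are made up for illustration (my real data_id values may look different), and `digit_sum_bin` and `bin_shares` are just hypothetical helper names.

```python
from collections import Counter


def digit_sum_bin(data_id: int) -> int:
    """Bin 1 if the sum of the decimal digits is even, otherwise bin 2."""
    return 1 if sum(int(d) for d in str(data_id)) % 2 == 0 else 2


def bin_shares(ids) -> dict:
    """Fraction of the given IDs that lands in each bin."""
    counts = Counter(digit_sum_bin(i) for i in ids)
    total = sum(counts.values())
    return {b: counts[b] / total for b in (1, 2)}


# Hypothetical ID sets (assumptions, not real customer data).
sequential_ids = range(100000, 101000)      # consecutive IDs
even_only_ids = range(100000, 102000, 2)    # IDs issued in steps of 2

print("sequential:", bin_shares(sequential_ids))
print("even only: ", bin_shares(even_only_ids))
```

The point is that the shares are dictated by how the IDs happen to be issued, not by the ratio I want, so a rule like this cannot be tuned to 10:90.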
Is there a way (probabilistically speaking) to assign the bins so that the sample size in the smaller bin is always greater than x% of the data?
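One possible direction, sketched below under my own assumptions, is to hash each data_id to a pseudo-uniform number in [0, 1) and compare it to the target fraction. This only hits the ratio approximately. If the bin must never fall below x%, the records can instead be ranked by the same hash and the first ceil(x% of n) taken. The names `unit_hash`, `assign_bin`, `exact_split` and the salt value are hypothetical; only `hashlib` and `math` from the standard library are used.

```python
import hashlib
import math


def unit_hash(data_id, salt: str = "split-v1") -> float:
    """Map an ID to a deterministic pseudo-uniform number in [0, 1)."""
    digest = hashlib.sha256(f"{salt}:{data_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as a 64-bit integer and scale to [0, 1).
    return int.from_bytes(digest[:8], "big") / 2**64


def assign_bin(data_id, fraction: float = 0.10, salt: str = "split-v1") -> int:
    """Approximate split: each ID independently lands in bin 1 with
    probability close to `fraction`, and always in the same bin."""
    return 1 if unit_hash(data_id, salt) < fraction else 2


def exact_split(ids, fraction: float = 0.10, salt: str = "split-v1"):
    """Guaranteed split: rank IDs by hash and put the first
    ceil(fraction * n) of them in bin 1, the rest in bin 2."""
    ranked = sorted(ids, key=lambda i: unit_hash(i, salt))
    k = math.ceil(fraction * len(ranked))
    return ranked[:k], ranked[k:]


# Hypothetical IDs for illustration.
ids = list(range(1, 100001))
share = sum(assign_bin(i) == 1 for i in ids) / len(ids)
print(f"approximate split, share in bin 1: {share:.4f}")

bin1, bin2 = exact_split(ids)
print(f"exact split, sizes: {len(bin1)} and {len(bin2)}")
```

The trade-off as I understand it: `assign_bin` needs no knowledge of the other records, so new customers can be assigned on arrival, but its bin size fluctuates around the target. `exact_split` fixes the size, but it needs the whole list up front, and adding records later can change which IDs fall in bin 1.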
Addendum