-
Adam Williamson authored
As discussed in #259, the `limit` parameter - to the low-level `listing.List` and its subclasses, and to various higher-level functions which ultimately return `List` or `GeneratorList` instances - is confusing and misleading. It is passed through to the API calls, and effectively specifies how many items a single API call will return. But because our `List` yields a single item at a time and will keep doing API calls until the API says there's no more data, it does not limit how many items our `List` will yield, nor specify its chunk size. It seems natural to expect that `List(limit=10)` or `page.revisions(limit=10)` will give you a generator that yields only 10 items, or yields 10 items at a time, but it does not; it gives you a generator which queries the API in chunks of 10 items at a time, but will yield every item the API gives it, one by one. This is probably not ever how we really intended mwclient to work (there's an old `# NOTE: Fix limit` comment which implies as much) but it has worked this way for 16 years, so we should probably not change it just in case someone really has a need to specify the API chunk size for some reason, or some code somehow happens to implicitly rely on it behaving the way it does. So, this keeps the behaviour of the `limit` param wherever it exists, but triggers a deprecation warning. Everything that had a `limit` param now also has an `api_chunk_size` param that does the same thing, but is more explicitly named and does not trigger a deprecation warning. And everything that had a `limit` param now also has a `max_items` param that does what it sounds like, and what people are more likely to want - sets an absolute limit on the number of items the generator will yield. For efficiency, if `max_items` is set, neither `limit` nor `api_chunk_size` is set, and `max_items` is below `site.api_limit`, we set the API chunk size to `max_items` so we only retrieve as many items as we actually need. Signed-off-by:
Adam Williamson <awilliam@redhat.com>
Adam Williamson authoredAs discussed in #259, the `limit` parameter - to the low-level `listing.List` and its subclasses, and to various higher-level functions which ultimately return `List` or `GeneratorList` instances - is confusing and misleading. It is passed through to the API calls, and effectively specifies how many items a single API call will return. But because our `List` yields a single item at a time and will keep doing API calls until the API says there's no more data, it does not limit how many items our `List` will yield, nor specify its chunk size. It seems natural to expect that `List(limit=10)` or `page.revisions(limit=10)` will give you a generator that yields only 10 items, or yields 10 items at a time, but it does not; it gives you a generator which queries the API in chunks of 10 items at a time, but will yield every item the API gives it, one by one. This is probably not ever how we really intended mwclient to work (there's an old `# NOTE: Fix limit` comment which implies as much) but it has worked this way for 16 years, so we should probably not change it just in case someone really has a need to specify the API chunk size for some reason, or some code somehow happens to implicitly rely on it behaving the way it does. So, this keeps the behaviour of the `limit` param wherever it exists, but triggers a deprecation warning. Everything that had a `limit` param now also has an `api_chunk_size` param that does the same thing, but is more explicitly named and does not trigger a deprecation warning. And everything that had a `limit` param now also has a `max_items` param that does what it sounds like, and what people are more likely to want - sets an absolute limit on the number of items the generator will yield. For efficiency, if `max_items` is set, neither `limit` nor `api_chunk_size` is set, and `max_items` is below `site.api_limit`, we set the API chunk size to `max_items` so we only retrieve as many items as we actually need. Signed-off-by:
Adam Williamson <awilliam@redhat.com>