Conversation
…t handles transactions for us + refactor tests)
| mock==3.0.5 | ||
| moto==1.3.13 | ||
| more-itertools==10.5.0 | ||
| moto==2.0.8 |
This gets us transact_write_items. There's probably value in going a bit further, since this is still quite old and some of these older versions have older limits (e.g. a newer version supports 100 items in a transaction, which we recently removed from Tron, though I'm sure there are other differences 🤷), but I was able to plug this version in without much pain, so I stopped there.
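Aside: the limit difference above matters if writes ever get grouped into transactions. A minimal sketch of transaction-sized chunking (the helper name and limit constant are illustrative, not Tron's actual code):

```python
# Newer DynamoDB raised the TransactWriteItems cap from 25 to 100 items;
# pin this to whatever the SDK/service version in use actually supports.
TRANSACT_ITEM_LIMIT = 100

def chunk_transact_items(items, limit=TRANSACT_ITEM_LIMIT):
    """Split a list of TransactWriteItems entries into transaction-sized chunks."""
    return [items[i : i + limit] for i in range(0, len(items), limit)]
```

Each chunk can then be sent as one `transact_write_items` call.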
…g items on restore
| def small_object(): | ||
| yield { | ||
| def job_state_object(): | ||
| return {"enabled": True, "run_nums": [0, 1]} |
it's nice to always yield in fixtures since that's one less change to make if we ever need some post-fixture cleanup - but this is also fine as-is :)
| @pytest.mark.usefixtures("store", "small_object", "large_object") | ||
| @pytest.mark.usefixtures("store", "job_state_object", "small_jobrun_state_object", "large_jobrun_state_object") |
just curious (in case you know): do we need pytest.mark.usefixtures here? seems like we actually refer to the fixtures in our tests and, outside of the store fixture, the other fixtures only return data
| @pytest.mark.parametrize( | ||
| "object_fixture_name", | ||
| ["small_jobrun_state_object", "large_jobrun_state_object"], | ||
| ) | ||
| def test_save_and_restore(self, store, object_fixture_name, request): | ||
| """Verify that objects of different sizes can be saved and restored correctly.""" | ||
| state_object = request.getfixturevalue(object_fixture_name) |
| @pytest.mark.parametrize( | |
| "object_fixture_name", | |
| ["small_jobrun_state_object", "large_jobrun_state_object"], | |
| ) | |
| def test_save_and_restore(self, store, object_fixture_name, request): | |
| """Verify that objects of different sizes can be saved and restored correctly.""" | |
| state_object = request.getfixturevalue(object_fixture_name) | |
| @pytest.mark.parametrize( | |
| "state_object", | |
| [small_jobrun_state_object, large_jobrun_state_object], | |
| ) | |
| def test_save_and_restore(self, store, state_object, request): | |
| """Verify that objects of different sizes can be saved and restored correctly.""" |
disclaimer: I haven't tested this - but shouldn't we be able to use the fixtures directly here rather than going through the getfixturevalue indirection?
| if object_fixture_name == "small_jobrun_state_object": | ||
| assert num_partitions == 1 | ||
| else: | ||
| # For large objects, we expect multiple partitions, the number of partitions can vary depending on our object size | ||
| # and the DynamoDB partition size limit. We assert that there is more than one partition. | ||
| assert num_partitions > 1 |
we should try to avoid conditionals in tests - that said, not sure how to handle this specific case without having one of the parameterized values to pass through be a lambda or something :p
| "UnprocessedKeys": { | ||
| store.name: { | ||
| "Keys": [{"key": {"S": store.build_key("job_state", 0)}, "index": {"N": "0"}}], | ||
| table_name: { |
it's probably more future-proof to keep using store.name (and would remove a fixture from this test) - but this isn't the end of the world either :p
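For context, the shape of the retry loop this mocked response exercises looks roughly like the following (function and error names are illustrative; `client` is anything exposing `batch_get_item(RequestItems=...)` with the boto3 response shape):

```python
import time

def batch_get_with_retries(client, table_name, keys, max_attempts=5, delay=0):
    """Fetch items, re-requesting any UnprocessedKeys DynamoDB returns."""
    items = []
    request_items = {table_name: {"Keys": keys}}
    for _ in range(max_attempts):
        response = client.batch_get_item(RequestItems=request_items)
        items.extend(response.get("Responses", {}).get(table_name, []))
        # DynamoDB may return a partial result; retry only the leftovers
        request_items = response.get("UnprocessedKeys") or {}
        if not request_items:
            return items
        time.sleep(delay)
    raise RuntimeError(f"keys still unprocessed after {max_attempts} attempts")
```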
| return f"{type} {iden}" | ||
| def restore(self, keys, read_json: bool = False) -> dict: | ||
| # KKASP: removed read_json so I think we can remove this parameter from the config |
| # KKASP: removed read_json so I think we can remove this parameter from the config |
| try: | ||
| json_items[key] += partition_item["json_val"]["S"] | ||
| except KeyError: | ||
| log.critical( |
does log.critical print the traceback if it exists? if not, might be nice to change this to log.exception() to ensure that folks have a line number to jump to from the traceback :)
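To answer the first half: `log.critical` does not include the traceback unless `exc_info=True` is passed; `log.exception` logs at ERROR level and always appends the active traceback. A small demonstration (logger and message names are illustrative):

```python
import io
import logging

log = logging.getLogger("restore_demo")
buf = io.StringIO()
log.addHandler(logging.StreamHandler(buf))
log.setLevel(logging.DEBUG)

partition_item = {}  # missing "json_val", so the lookup raises KeyError
try:
    value = partition_item["json_val"]["S"]
except KeyError:
    # log.exception attaches the traceback, so the log points at the exact
    # failing line; log.critical(..., exc_info=True) would do the same at
    # CRITICAL level
    log.exception("Failed to read json_val for partition")

output = buf.getvalue()
```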
| time.sleep(5) | ||
| continue | ||
| with self.save_lock: | ||
| # KKASP: what is this again? |
| # KKASP: what is this again? | |
| # TODO: what is this again? |
and isn't this the delete shenanigans we do in tron?
Problem
We've been saving both pickles and JSON as part of our "safe migration" off of pickle data. Before we can remove pickles (and ultimately delete Mesos + hella old things) we need to stop saving pickles. This change removes the writing pipeline for pickles.
Solution
Generally, just stop writing pickles. This includes:
Some additional "just 'cause" changes:
Note: I'll test restoring on some more tables, but so far so good. I think it's pretty reviewable as is. Ideally we do some more read_json cleanup, but for now this is a safe approach. One annoying thing is writing to json_val and num_json_partitions, but that's just the way the cookie crumbles.
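For readers unfamiliar with the json_val/num_json_partitions scheme, the partition count is just a ceiling division of the serialized blob over a per-item size cap. A sketch (the cap value here is a hypothetical placeholder, safely below DynamoDB's 400KB item limit; it is not Tron's actual constant):

```python
MAX_VALUE_SIZE = 200 * 1024  # hypothetical per-item cap on json_val bytes

def num_json_partitions(serialized_json: str) -> int:
    """How many json_val partitions a serialized state object needs."""
    # ceiling division via negation; every object needs at least one partition
    return max(1, -(-len(serialized_json) // MAX_VALUE_SIZE))
```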
Extra note: This will not be rollback-safe (we can technically roll back with restored data, but generally I think it's best to treat this release as one that can't be pleasantly undone). Full steam ahead 🚂.