
Workaround fuel duplicated model #490

Merged: 2 commits merged into main from kp/workaround-fuel-dup on Apr 25, 2024
Conversation

@koonpeng (Contributor) commented Apr 4, 2024

Bug fix

There is a duplicated model in Fuel which breaks pit crew's cache. To avoid breaking existing caches, I just added an exception for that model.
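For context, a minimal sketch of the kind of exception described above, assuming the workaround special-cases a known duplicated (model, author) pair in the cache check; the constant and helper names below are hypothetical, not the actual PR code:

```python
# Hypothetical illustration only: treat a known duplicated Fuel model as an
# exception so it never triggers the "already cached" handling.
KNOWN_DUPLICATES = set()  # e.g. {("SomeModel", "some_author")} -- placeholder entries

def is_cache_hit(model_name, author_name, old_cache):
    """Return True for genuine cache hits, ignoring known duplicated models."""
    if (model_name, author_name) in KNOWN_DUPLICATES:
        return False  # known duplicate: never treat it as a cache hit
    return (model_name, author_name) in old_cache["model_cache"]
```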


Signed-off-by: Teo Koon Peng <[email protected]>
@luca-della-vedova (Member) commented Apr 4, 2024

I tried a different approach that actually skips the model instead of breaking. It seems to work for me, but admittedly I'm not sure why a duplicate name was made a breaking condition, or whether there is a corner case where we might end up looping endlessly (although that doesn't seem like it should happen):

diff --git a/rmf_building_map_tools/pit_crew/pit_crew.py b/rmf_building_map_tools/pit_crew/pit_crew.py
index 459e81a..56040f6 100644
--- a/rmf_building_map_tools/pit_crew/pit_crew.py
+++ b/rmf_building_map_tools/pit_crew/pit_crew.py
@@ -840,7 +840,7 @@ def build_and_update_cache(cache_file_path=None, write_to_cache=True,
     # RFE: Doing this asynchronously will significantly speed this up.
     # Any solution must guarantee:
     #   - All new models are guaranteed to be added to the cache
-    #   - Loop breaks as soon as we start pulling up already cached models
+    #   - Models that are already cached are not added
     while status == 200 and not break_flag:
         logger.info("Fetching page: %d" % page)
 
@@ -853,12 +853,11 @@ def build_and_update_cache(cache_file_path=None, write_to_cache=True,
                 model_name = model.get("name", "")
                 author_name = model.get("owner", "")
 
-                # If a cached model was found, halt
+                # If a cached model was found, skip
                 if (model_name, author_name) in old_cache['model_cache']:
                     logger.info("Cached model found! "
-                                "Halting Fuel traversal...")
-                    break_flag = True
-                    break
+                                f"Skipping model {model_name} by {author_name}")
+                    continue
                 # Otherwise, add it to the cache
                 else:
                     new_cache_count += 1

Does this make sense? CC @methylDragon

Edit: Oh I see that it would always update the model cache if it doesn't exit early
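For illustration, a self-contained sketch of the skip-instead-of-halt behaviour in the diff above (the `fetch_page` helper and its return shape are assumptions for this sketch, not pit_crew's actual fetching code):

```python
def update_cache_skipping_duplicates(old_cache, fetch_page):
    """Traverse every Fuel page, skipping models that are already cached."""
    new_models = []
    page = 1
    while True:
        status, models = fetch_page(page)  # assumed helper: (HTTP status, list of model dicts)
        if status != 200 or not models:
            break
        for model in models:
            key = (model.get("name", ""), model.get("owner", ""))
            if key in old_cache["model_cache"]:
                continue  # already cached: skip it, but keep traversing the listing
            new_models.append(key)
        page += 1
    return new_models
```

As the edit above notes, this variant always walks every page, which is the trade-off picked up in the later comments.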

@koonpeng (Contributor, Author) commented Apr 4, 2024

The best approach would be to clean up the Fuel DB; that particular duplicated model has 0 downloads so far.

@methylDragon (Collaborator) commented Apr 4, 2024

What I did locally was to make the model name key a three-member tuple of (model name, author name, creation timestamp), sketched below; this fixes the issue since the DB's current duplicated models only differ in their timestamps.

But then that needed some other changes elsewhere that I wasn't confident of pushing in because it might lead to conflicts with preexisting model caches 😬

Best is to just fix the DB though; Ian said that the model-author pair is supposed to be unique.
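A minimal sketch of that three-member keying, assuming the creation timestamp is available on each model record (the field names here are illustrative, not necessarily the Fuel API's actual fields):

```python
from typing import Dict, Tuple

# Illustrative cache key that includes the creation timestamp, so the two
# duplicated (name, owner) entries can coexist in the cache.
CacheKey = Tuple[str, str, str]  # (model_name, author_name, creation_timestamp)

def cache_key(model: Dict[str, str]) -> CacheKey:
    return (
        model.get("name", ""),
        model.get("owner", ""),
        model.get("created_at", ""),  # assumed timestamp field name
    )
```

Existing caches keyed on two-member tuples would not match keys like these, which is the migration concern mentioned above.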

@methylDragon (Collaborator) commented Apr 4, 2024

> I tried a different approach that actually skips the model instead of breaking [...]
>
> Does this make sense? CC @methylDragon
>
> Edit: Oh I see that it would always update the model cache if it doesn't exit early

And yes, exiting early is important, otherwise builds will get unnecessarily extended (and Fuel will get slammed with model list requests).

The models are always presented such that the latest models are shown first, which, when paired with the invariant that model-author pairs are unique, means we can halt accordingly.
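For contrast with the skip variant, a sketch of the early-halt behaviour that relies on the two invariants stated above (newest models listed first, unique model-author pairs); `fetch_page` is again an assumed helper rather than pit_crew's real code:

```python
def update_cache_with_early_halt(old_cache, fetch_page):
    """Stop traversing Fuel as soon as an already-cached model appears."""
    new_models = []
    page = 1
    while True:
        status, models = fetch_page(page)  # assumed helper: (HTTP status, list of model dicts)
        if status != 200 or not models:
            break
        for model in models:
            key = (model.get("name", ""), model.get("owner", ""))
            if key in old_cache["model_cache"]:
                # Everything after this point is older and therefore already cached,
                # so we can stop without scanning the remaining pages.
                return new_models
            new_models.append(key)
        page += 1
    return new_models
```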

@methylDragon (Collaborator) commented

See what I did here: #492

I haven't refined it but the basic idea is there.

@luca-della-vedova merged commit e1d7a3e into main on Apr 25, 2024 (5 checks passed)
@luca-della-vedova deleted the kp/workaround-fuel-dup branch on April 25, 2024 at 07:18
koonpeng added a commit that referenced this pull request Jun 12, 2024
Signed-off-by: Teo Koon Peng <[email protected]>