Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new rule - replace d.update({'key': 'value'}) with d['key'] = 'value' #13533

Open
spaceby opened this issue Sep 27, 2024 · 10 comments · May be fixed by #14831
Open

new rule - replace d.update({'key': 'value'}) with d['key'] = 'value' #13533

spaceby opened this issue Sep 27, 2024 · 10 comments · May be fixed by #14831
Labels
accepted Ready for implementation rule Implementing or modifying a lint rule

Comments

@spaceby
Copy link

spaceby commented Sep 27, 2024

I suggest adding a rule that detects this

d = {}
d.update({'key': 'value'})

and replacing it

d = {}
d['key'] = 'value'

As a result, the code becomes simpler and more understandable.

@zanieb
Copy link
Member

zanieb commented Sep 27, 2024

This make sense, but is just a style rule right? Is there a performance diff since you're skipping creating a dictionary?

@zanieb zanieb added the rule Implementing or modifying a lint rule label Sep 27, 2024
@ugochukwu-850
Copy link

I like this issue, I am a new her btw [I'm also new to OSC],
But I did a little research and its variable , so I am not sure its safe to change based off only on aesthetics .
Basically at smaller and mid sized dict updates its ok to just assign but on larger updates creating a secondary dict as update does , would greatly increase the bulk update speed.

While its arguable , since the performance effects is variable maybe it should be optional.

@AlexWaygood
Copy link
Member

AlexWaygood commented Sep 29, 2024

This make sense, but is just a style rule right? Is there a performance diff since you're skipping creating a dictionary?

Even for small dictionaries, direct assignment is indeed quite a bit faster according to a very naive benchmark:

~/dev % python -m timeit -s "d = {}" "d.update({'key': 'value'})"            
5000000 loops, best of 5: 61.2 nsec per loop
~/dev % python -m timeit -s "d = {}" "d['key'] = 'value'"        
20000000 loops, best of 5: 12.2 nsec per loop

And this is what you'd expect, since the bytecode created for direct assignment is a lot simpler:

~/dev/cpython % ./python.exe
Python 3.14.0a0 (heads/main:986a4e1b6fc, Sep 26 2024, 12:02:18) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def x():
...     d = {}
...     d.update({'key': 'value'})
...     
>>> def y():
...     d = {}
...     d['key'] = 'value'
...     
>>> import dis
>>> dis.dis(x, adaptive=True)
  1           RESUME                   0

  2           BUILD_MAP                0
              STORE_FAST               0 (d)

  3           LOAD_FAST                0 (d)
              LOAD_ATTR                1 (update + NULL|self)
              LOAD_CONST               1 ('key')
              LOAD_CONST               2 ('value')
              BUILD_MAP                1
              CALL                     1
              POP_TOP
              RETURN_CONST             0 (None)
>>> dis.dis(y, adaptive=True)
  1           RESUME                   0

  2           BUILD_MAP                0
              STORE_FAST               0 (d)

  3           LOAD_CONST               1 ('value')
              LOAD_FAST                0 (d)
              LOAD_CONST               2 ('key')
              STORE_SUBSCR
              RETURN_CONST             0 (None)

But as I said, this benchmark is very naive:

  • the keys and values being stored in the dictionary are all constants
  • there's only a single item in the dictionary being passed to .update() when in most real-world uses of dict.update you'd generally pass in a dictionary that has multiple items.

@spaceby would your proposed rule only apply to dictionaries with a single item being passed to .update? Or do you also think it should suggest this?

-d.update({
-    "x": 1,
-    "y": 2,
-    "z": 3,
-})
+d["x"] = 1
+d["y"] = 2
+d["z"] = 3

@AlexWaygood AlexWaygood added needs-decision Awaiting a decision from a maintainer needs-design Needs further design before implementation labels Sep 29, 2024
@spaceby
Copy link
Author

spaceby commented Sep 29, 2024

I think the rule should only apply to a dictionary with single item.
A rule similar in meaning to B009 - rewrite the code to something simpler.

@AlexWaygood
Copy link
Member

Makes sense... though I think we should explicitly state this in the docs for the rule, or we'll inevitably get a PR "improving" the rule so that it also covers dictionaries with multiple items 😄

@AlexWaygood AlexWaygood removed the needs-design Needs further design before implementation label Sep 29, 2024
@autinerd
Copy link
Contributor

autinerd commented Oct 5, 2024

For the update call with multiple entries, maybe the |= operator would be an idea? It seems to be slightly faster 🤔

❯ python -m timeit -s "d = {}" "d |= {'x': 1,'y': 2,'z': 3}"
2000000 loops, best of 5: 141 nsec per loop
❯ python -m timeit -s "d = {}" "d.update({'x': 1,'y': 2,'z': 3})"
2000000 loops, best of 5: 179 nsec per loop

And of course with single items, |= should also be replaced with directly assigning the value to the key.

@AlexWaygood
Copy link
Member

For the update call with multiple entries, maybe the |= operator would be an idea? It seems to be slightly faster 🤔

Possibly, but I think that would be a different rule

@MichaReiser
Copy link
Member

This make sense, but is just a style rule right? Is there a performance diff since you're skipping creating a dictionary?

I still think it's worthwhile to frame the motivation properly because, IMO, the performance difference seems too marginal to justify making it a rule.

@tdulcet
Copy link

tdulcet commented Oct 7, 2024

Maybe this would be a separate rule, but perhaps this rule could be generalized to avoid creating unnecessary intermediate representations. For example, these:

l += ["value"]
l += ["value 1", "value 2"]
s |= {"value"}
s |= {"value 1", "value 2"}
d |= {"key": "value"}
d.update({"key": "value"})
# ect.

could potentially be rewritten as:

l.append("value")
l.extend(("value 1", "value 2"))
s.add("value")
s.update(("value 1", "value 2"))
d["key"] = "value"
d["key"] = "value"
# ect.

This was first proposed in #8877 (comment).

@AlexWaygood
Copy link
Member

This make sense, but is just a style rule right? Is there a performance diff since you're skipping creating a dictionary?

I still think it's worthwhile to frame the motivation properly because, IMO, the performance difference seems too marginal to justify making it a rule.

I think the performance difference is significant enough that it could easily make a difference in a hot loop somewhere... but I agree that the motivation is primarily stylistic here, and that this is the kind of performance difference that could easily change as performance improvements are made to CPython itself. So I agree that this should be classified as a stylistic rule.

Maybe this would be a separate rule, but perhaps this rule could be generalized to avoid creating unnecessary intermediate representations. For example, these:

Yes, this makes sense to me; all of these are essentially different manifestations of the same antipattern.

At this point I think I've seen enough support for this that I'd be happy to accept a PR implementing this rule (in the RUF category, I suppose, unless there have been any other linters to already implement this?).

@AlexWaygood AlexWaygood added help wanted Contributions especially welcome accepted Ready for implementation and removed needs-decision Awaiting a decision from a maintainer labels Oct 7, 2024
@MichaReiser MichaReiser changed the title new rule - replace d.update({'key': 'value'}) with d['key'] = 'value' new rule - replace d.update({'key': 'value'}) with d['key'] = 'value' Nov 25, 2024
@MichaReiser MichaReiser removed the help wanted Contributions especially welcome label Dec 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation rule Implementing or modifying a lint rule
Projects
None yet
7 participants