Conversation
100cf6f to
355d560
Compare
|
I had to amend my commit because |
|
I don't think using a unkeyed hash like FNV is a reasonable choice here as it opens programs using this library up to denial of service attacks. If better hashing speed is wanted, a keyed hash like |
|
You are right. Good catch! I did not think of this as an attack surface but it could definitely be exploited. Will switch to ahash ASAP |
|
|
||
| /// The element classes. | ||
| pub classes: HashSet<LocalName>, | ||
| pub classes: HashSet<String>, |
There was a problem hiding this comment.
If we include #101, it might be preferable to keep this as LocalName as class names should be highly redundant seen over one or even multiple documents. Lazy initialization would then potentially allow avoiding some of the cost of deduplicating these strings when they are never matched for a given element.
|
Hi, sorry I haven't been responsive as of late - don't really have time to look at this, but if both of you think that the changes are worthwhile then very happy to approve so that we can merge this. |
|
See #101 |
I have tried to improve parsing performance by replacing the standard hash function with fnv and by replacing LocalNames with strings (in node.rs) as suggested in #45.
@teymour-aldridge are there any other changes we could make to improve parsing speed? Could someone that needs to parse multiple/huge HTMLs perform some benchmarks and report back?