[Fizz] Improve text separator byte efficiency #24630

Merged: 10 commits, May 28, 2022

Collaborator

@gnoff gnoff commented May 27, 2022

[Fizz] Improve text separator byte efficiency

Previously, text separators were inserted following any Text node in Fizz. This increases the bytes sent when streaming, and in some cases, such as title elements, these separators are not interpreted as comment nodes and leak into the visible page as escaped text.
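
To make the cost concrete, here is a minimal sketch of the old behavior described above. The helper names (`escapeText`, `pushTextNaive`) are hypothetical and not taken from the Fizz source; only the `<!-- -->` separator string comes from the diff in this PR.

```typescript
// Sketch only: hypothetical helpers, not the Fizz implementation.
const TEXT_SEPARATOR = '<!-- -->';

// Minimal HTML text escaping for the example.
function escapeText(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Old behavior as described: every text node is followed by a separator,
// whether or not another text node comes next.
function pushTextNaive(target: string[], text: string): void {
  target.push(escapeText(text), TEXT_SEPARATOR);
}

const chunks: string[] = [];
pushTextNaive(chunks, 'Hello, ');
pushTextNaive(chunks, 'world');

// Only the first separator is needed to keep the two text nodes apart;
// the trailing one is 8 wasted bytes.
const naiveHtml = chunks.join('');
```

Inside a `<title>` element the browser does not parse comments, so a trailing separator like the one above would show up as literal text in the tab title.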

Simply tracking the last pushed type doesn't work because Segments can be filled in asynchronously later, so you cannot know in a single pass whether the preceding content was a text node. This commit adds a concept of TextEmbedding, which provides a best-effort signal to Segments on whether they are embedded within text. This allows the later resolution of that Segment to add text separators when they might be necessary but avoid them when they are surely not.

The current implementation can only "peek" ahead if the segment is the Root Segment or a Suspense Boundary Segment. In these cases we know there is no trailing text embedding, and we can eliminate the separator at the end of the segment if the last emitted element was Text. In normal Segments we cannot peek and thus have to assume there might be a trailing text embedding, so we issue a separator defensively. This should be rare in practice, as it is assumed most components that cause segment creation will also emit some markup at the edges.
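
A rough sketch of that bookkeeping, assuming a per-segment `lastPushedText` tally and a flag for the trailing edge. All names here (`SegmentSketch`, `createSegment`, `pushText`, `pushMarkup`, `finishSegment`, `trailingTextEmbedded`) are illustrative assumptions, not the API added by this PR.

```typescript
// Sketch only: hypothetical types and helpers, not the Fizz implementation.
const SEP = '<!-- -->';

interface SegmentSketch {
  chunks: string[];
  lastPushedText: boolean;        // was the most recent content a text node?
  trailingTextEmbedded: boolean;  // might text follow this segment's trailing edge?
}

// leadingText models adopting the parent's state at the leading edge.
function createSegment(leadingText: boolean, trailingTextEmbedded: boolean): SegmentSketch {
  return { chunks: [], lastPushedText: leadingText, trailingTextEmbedded: trailingTextEmbedded };
}

function pushText(segment: SegmentSketch, text: string): void {
  if (text === '') return;
  // Separate only when the previous content was also text.
  if (segment.lastPushedText) segment.chunks.push(SEP);
  segment.chunks.push(text);
  segment.lastPushedText = true;
}

function pushMarkup(segment: SegmentSketch, html: string): void {
  segment.chunks.push(html);
  segment.lastPushedText = false;
}

function finishSegment(segment: SegmentSketch): string {
  // Defensive trailing separator only when text might follow the segment.
  if (segment.lastPushedText && segment.trailingTextEmbedded) segment.chunks.push(SEP);
  return segment.chunks.join('');
}

// Root or Suspense boundary segment: nothing can follow, so no trailing separator.
const root = createSegment(false, false);
pushText(root, 'Hello, ');
pushText(root, 'world');
const rootHtml = finishSegment(root);

// Ordinary segment: we cannot peek ahead, so a trailing separator is emitted.
const inner = createSegment(false, true);
pushMarkup(inner, '<b>hi</b>');
pushText(inner, 'there');
const innerHtml = finishSegment(inner);
```

Here `rootHtml` keeps only the separator between the two text nodes, while `innerHtml` ends with a separator that may turn out to be extraneous.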

Another strategy employed is to take advantage of the two methods by which we get Nodes into the DOM. The method by which we get segment markup into the DOM differs depending on when the Segment resolves.

If a Segment resolves before flushing begins for its parent, it will be emitted inline with the parent markup. In these cases separators may be necessary because they are how we clue the browser into breaking up text into distinct nodes that will later match up with what will be hydrated on the client.

If a Segment resolves after flushing has happened, a script will be used to patch up the DOM in the client. When this happens, any text nodes on the boundary of the patch won't be "merged" and thus will continue to have distinct representation as Nodes in the DOM. Thus we can avoid emitting any separators at the boundaries in these cases.
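
The difference between the two paths can be illustrated with a toy model. `parseTextAndComments` below is a deliberately simplified, hypothetical stand-in for an HTML parser (it only understands text and comments), and the "patch" is modeled as concatenating node lists, which is an assumption made for illustration rather than how the client script is written.

```typescript
// Sketch only: a toy parser, not a real HTML parser or the Fizz client runtime.
type NodeSketch = { kind: 'text'; data: string } | { kind: 'comment' };

// Adjacent text becomes a single text node; a comment splits it in two.
function parseTextAndComments(html: string): NodeSketch[] {
  const nodes: NodeSketch[] = [];
  for (const part of html.split(/(<!--[\s\S]*?-->)/)) {
    if (part === '') continue;
    if (part.indexOf('<!--') === 0) {
      nodes.push({ kind: 'comment' });
    } else {
      nodes.push({ kind: 'text', data: part });
    }
  }
  return nodes;
}

function countTextNodes(nodes: NodeSketch[]): number {
  return nodes.filter((n) => n.kind === 'text').length;
}

// Inline, no separator: the parser sees one run of text.
const inlineMerged = countTextNodes(parseTextAndComments('Hello, ' + 'world'));

// Inline, with separator: two distinct text nodes, as hydration expects.
const inlineSeparated = countTextNodes(parseTextAndComments('Hello, <!-- -->world'));

// Delayed: the segment's nodes are inserted next to the existing ones,
// so they stay distinct without any separator.
const patched = countTextNodes(
  parseTextAndComments('Hello, ').concat(parseTextAndComments('world'))
);
```

In this model the inline case without a separator yields one text node, while both the separated inline case and the patched case yield two.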

After applying these changes, the only times you will get text separators are as follows:

  • in between serial text nodes that emit at the same time - these are necessary and cannot be eliminated unless we stop relying on the browser to automatically parse the correct text nodes when processing this HTML
  • after a final text node in a non-boundary segment that resolves before its parent has flushed - these are sometimes extraneous, like when the next emitted thing is a non-Text node.

In all other cases text separators should be omitted, which means the general byte efficiency of this approach should be pretty good.

gnoff added 2 commits May 27, 2022 11:51
@sizebot

sizebot commented May 27, 2022

Comparing: 05c34de...277beb8

Critical size changes

Includes critical production bundles, as well as any change greater than 2%:

| Name | +/- | Base | Current | +/- gzip | Base gzip | Current gzip |
| --- | --- | --- | --- | --- | --- | --- |
| oss-stable/react-dom/cjs/react-dom.production.min.js | = | 131.28 kB | 131.28 kB | = | 42.13 kB | 42.13 kB |
| oss-experimental/react-dom/cjs/react-dom.production.min.js | = | 136.54 kB | 136.54 kB | = | 43.68 kB | 43.68 kB |
| facebook-www/ReactDOM-prod.classic.js | = | 439.35 kB | 439.35 kB | = | 80.29 kB | 80.29 kB |
| facebook-www/ReactDOM-prod.modern.js | = | 424.64 kB | 424.64 kB | = | 78.13 kB | 78.13 kB |
| facebook-www/ReactDOMForked-prod.classic.js | = | 439.35 kB | 439.35 kB | = | 80.29 kB | 80.29 kB |
| oss-experimental/react-noop-renderer/cjs/react-noop-renderer-server.development.js | +2.81% | 5.99 kB | 6.16 kB | +3.08% | 1.66 kB | 1.71 kB |
| oss-stable-semver/react-noop-renderer/cjs/react-noop-renderer-server.development.js | +2.81% | 5.99 kB | 6.16 kB | +3.08% | 1.66 kB | 1.71 kB |
| oss-stable/react-noop-renderer/cjs/react-noop-renderer-server.development.js | +2.81% | 5.99 kB | 6.16 kB | +3.08% | 1.66 kB | 1.71 kB |
| oss-stable-semver/react-server/cjs/react-server.production.min.js | +2.31% | 19.90 kB | 20.36 kB | +1.71% | 6.85 kB | 6.97 kB |
| oss-stable/react-server/cjs/react-server.production.min.js | +2.31% | 19.90 kB | 20.36 kB | +1.71% | 6.85 kB | 6.97 kB |
| oss-experimental/react-server/cjs/react-server.production.min.js | +2.27% | 20.23 kB | 20.69 kB | +1.71% | 6.95 kB | 7.07 kB |

Significant size changes

Includes any change greater than 0.2%:

| Name | +/- | Base | Current | +/- gzip | Base gzip | Current gzip |
| --- | --- | --- | --- | --- | --- | --- |
| oss-experimental/react-noop-renderer/cjs/react-noop-renderer-server.development.js | +2.81% | 5.99 kB | 6.16 kB | +3.08% | 1.66 kB | 1.71 kB |
| oss-stable-semver/react-noop-renderer/cjs/react-noop-renderer-server.development.js | +2.81% | 5.99 kB | 6.16 kB | +3.08% | 1.66 kB | 1.71 kB |
| oss-stable/react-noop-renderer/cjs/react-noop-renderer-server.development.js | +2.81% | 5.99 kB | 6.16 kB | +3.08% | 1.66 kB | 1.71 kB |
| oss-stable-semver/react-server/cjs/react-server.production.min.js | +2.31% | 19.90 kB | 20.36 kB | +1.71% | 6.85 kB | 6.97 kB |
| oss-stable/react-server/cjs/react-server.production.min.js | +2.31% | 19.90 kB | 20.36 kB | +1.71% | 6.85 kB | 6.97 kB |
| oss-experimental/react-server/cjs/react-server.production.min.js | +2.27% | 20.23 kB | 20.69 kB | +1.71% | 6.95 kB | 7.07 kB |
| oss-stable-semver/react-dom/cjs/react-dom-server-legacy.browser.production.min.js | +1.90% | 32.35 kB | 32.96 kB | +0.69% | 10.77 kB | 10.84 kB |
| oss-stable/react-dom/cjs/react-dom-server-legacy.browser.production.min.js | +1.89% | 32.37 kB | 32.98 kB | +0.70% | 10.79 kB | 10.86 kB |
| oss-stable-semver/react-dom/umd/react-dom-server-legacy.browser.production.min.js | +1.88% | 32.49 kB | 33.11 kB | +0.93% | 10.89 kB | 11.00 kB |
| oss-stable/react-dom/umd/react-dom-server-legacy.browser.production.min.js | +1.88% | 32.52 kB | 33.13 kB | +0.92% | 10.92 kB | 11.02 kB |
| oss-experimental/react-dom/cjs/react-dom-server-legacy.browser.production.min.js | +1.87% | 32.79 kB | 33.40 kB | +0.68% | 10.93 kB | 11.01 kB |
| oss-experimental/react-dom/umd/react-dom-server-legacy.browser.production.min.js | +1.86% | 32.94 kB | 33.55 kB | +0.80% | 11.07 kB | 11.16 kB |
| oss-stable-semver/react-dom/cjs/react-dom-server-legacy.node.production.min.js | +1.70% | 36.03 kB | 36.65 kB | +0.65% | 12.00 kB | 12.08 kB |
| oss-stable/react-dom/cjs/react-dom-server-legacy.node.production.min.js | +1.70% | 36.06 kB | 36.67 kB | +0.66% | 12.02 kB | 12.10 kB |
| oss-experimental/react-dom/cjs/react-dom-server-legacy.node.production.min.js | +1.68% | 36.53 kB | 37.15 kB | +0.64% | 12.18 kB | 12.26 kB |
| oss-stable-semver/react-dom/cjs/react-dom-server.browser.production.min.js | +1.43% | 33.61 kB | 34.09 kB | +0.98% | 11.48 kB | 11.59 kB |
| oss-stable/react-dom/cjs/react-dom-server.browser.production.min.js | +1.42% | 33.63 kB | 34.11 kB | +0.99% | 11.50 kB | 11.61 kB |
| oss-stable-semver/react-dom/umd/react-dom-server.browser.production.min.js | +1.41% | 33.74 kB | 34.22 kB | +0.97% | 11.60 kB | 11.72 kB |
| oss-stable/react-dom/umd/react-dom-server.browser.production.min.js | +1.41% | 33.76 kB | 34.24 kB | +0.97% | 11.63 kB | 11.74 kB |
| oss-experimental/react-dom/cjs/react-dom-server.browser.production.min.js | +1.41% | 34.05 kB | 34.53 kB | +0.98% | 11.65 kB | 11.76 kB |
| oss-experimental/react-dom/umd/react-dom-server.browser.production.min.js | +1.39% | 34.18 kB | 34.66 kB | +0.92% | 11.76 kB | 11.87 kB |
| oss-experimental/react-noop-renderer/cjs/react-noop-renderer-server.production.min.js | +1.37% | 2.92 kB | 2.96 kB | +0.89% | 1.13 kB | 1.14 kB |
| oss-stable-semver/react-noop-renderer/cjs/react-noop-renderer-server.production.min.js | +1.37% | 2.92 kB | 2.96 kB | +0.89% | 1.13 kB | 1.14 kB |
| oss-stable/react-noop-renderer/cjs/react-noop-renderer-server.production.min.js | +1.37% | 2.92 kB | 2.96 kB | +0.89% | 1.13 kB | 1.14 kB |
| oss-stable-semver/react-dom/cjs/react-dom-server.node.production.min.js | +1.31% | 36.88 kB | 37.36 kB | +1.09% | 12.52 kB | 12.66 kB |
| oss-stable/react-dom/cjs/react-dom-server.node.production.min.js | +1.31% | 36.90 kB | 37.38 kB | +1.10% | 12.55 kB | 12.68 kB |
| oss-experimental/react-dom/cjs/react-dom-server.node.production.min.js | +1.29% | 37.38 kB | 37.86 kB | +0.90% | 12.72 kB | 12.84 kB |
| facebook-www/ReactDOMServerStreaming-prod.modern.js | +1.29% | 78.23 kB | 79.24 kB | +0.77% | 16.65 kB | 16.78 kB |
| oss-stable-semver/react-server/cjs/react-server.development.js | +1.22% | 124.77 kB | 126.28 kB | +1.19% | 31.26 kB | 31.63 kB |
| oss-stable/react-server/cjs/react-server.development.js | +1.22% | 124.77 kB | 126.28 kB | +1.19% | 31.26 kB | 31.63 kB |
| oss-experimental/react-server/cjs/react-server.development.js | +1.21% | 125.64 kB | 127.15 kB | +1.16% | 31.47 kB | 31.83 kB |
| facebook-www/ReactDOMServer-dev.modern.js | +0.87% | 236.37 kB | 238.42 kB | +0.84% | 56.57 kB | 57.05 kB |
| facebook-www/ReactDOMServerStreaming-dev.modern.js | +0.85% | 232.96 kB | 234.94 kB | +0.83% | 55.79 kB | 56.26 kB |
| oss-stable-semver/react-dom/cjs/react-dom-server-legacy.browser.development.js | +0.80% | 231.38 kB | 233.23 kB | +0.79% | 56.22 kB | 56.66 kB |
| oss-stable/react-dom/cjs/react-dom-server-legacy.browser.development.js | +0.80% | 231.40 kB | 233.26 kB | +0.79% | 56.24 kB | 56.69 kB |
| oss-experimental/react-dom/cjs/react-dom-server-legacy.browser.development.js | +0.80% | 232.95 kB | 234.80 kB | +0.78% | 56.66 kB | 57.11 kB |
| oss-stable-semver/react-dom/cjs/react-dom-server-legacy.node.development.js | +0.79% | 233.09 kB | 234.94 kB | +0.78% | 56.69 kB | 57.13 kB |
| oss-stable/react-dom/cjs/react-dom-server-legacy.node.development.js | +0.79% | 233.11 kB | 234.97 kB | +0.78% | 56.71 kB | 57.15 kB |
| oss-experimental/react-dom/cjs/react-dom-server-legacy.node.development.js | +0.79% | 234.66 kB | 236.51 kB | +0.77% | 57.14 kB | 57.58 kB |
| oss-stable-semver/react-dom/cjs/react-dom-server.browser.development.js | +0.78% | 231.02 kB | 232.83 kB | +0.81% | 56.47 kB | 56.93 kB |
| oss-stable/react-dom/cjs/react-dom-server.browser.development.js | +0.78% | 231.04 kB | 232.85 kB | +0.81% | 56.49 kB | 56.95 kB |
| oss-stable-semver/react-dom/umd/react-dom-server-legacy.browser.development.js | +0.78% | 242.85 kB | 244.75 kB | +0.82% | 56.85 kB | 57.31 kB |
| oss-stable/react-dom/umd/react-dom-server-legacy.browser.development.js | +0.78% | 242.88 kB | 244.78 kB | +0.82% | 56.87 kB | 57.34 kB |
| oss-stable-semver/react-dom/cjs/react-dom-server.node.development.js | +0.78% | 232.18 kB | 233.99 kB | +0.80% | 56.41 kB | 56.86 kB |
| oss-stable/react-dom/cjs/react-dom-server.node.development.js | +0.78% | 232.20 kB | 234.02 kB | +0.80% | 56.43 kB | 56.89 kB |
| oss-experimental/react-dom/cjs/react-dom-server.browser.development.js | +0.78% | 232.59 kB | 234.40 kB | +0.79% | 56.88 kB | 57.33 kB |
| oss-experimental/react-dom/umd/react-dom-server-legacy.browser.development.js | +0.78% | 244.51 kB | 246.41 kB | +0.79% | 57.26 kB | 57.71 kB |
| oss-experimental/react-dom/cjs/react-dom-server.node.development.js | +0.78% | 233.75 kB | 235.56 kB | +0.79% | 56.82 kB | 57.27 kB |
| oss-stable-semver/react-dom/umd/react-dom-server.browser.development.js | +0.77% | 242.45 kB | 244.32 kB | +0.83% | 57.06 kB | 57.54 kB |
| oss-stable/react-dom/umd/react-dom-server.browser.development.js | +0.77% | 242.47 kB | 244.34 kB | +0.83% | 57.09 kB | 57.56 kB |
| oss-experimental/react-dom/umd/react-dom-server.browser.development.js | +0.76% | 244.11 kB | 245.98 kB | +0.83% | 57.46 kB | 57.94 kB |
| facebook-www/ReactDOMServer-prod.modern.js | = | 76.22 kB | 75.71 kB | = | 15.74 kB | 15.74 kB |

Generated by 🚫 dangerJS against 277beb8

@sebmarkbage
Collaborator

@@ -279,6 +279,7 @@ function encodeHTMLTextNode(text: string): string {
}

const textSeparator = stringToPrecomputedChunk('<!-- -->');
let lastPushedText = false;
Collaborator

You can use ResponseState for this instead of module scope. It's a little safer because you can technically call renderToString from within renderToString I think.
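
A small sketch of the hazard being described, under the assumption that a nested render can run between two text pushes of an outer render. `renderShared`, `renderScoped`, and the "part" encoding are hypothetical stand-ins for illustration, not React APIs.

```typescript
// Sketch only: hypothetical render loops, not renderToString itself.
const SEPARATOR = '<!-- -->';

// A part is either a text node or a callback that may trigger a nested render.
type Part = string | (() => void);

// Variant 1: the flag lives in module scope and is shared by every render.
let moduleLastPushedText = false;

function renderShared(parts: Part[]): string {
  moduleLastPushedText = false;
  const out: string[] = [];
  for (const part of parts) {
    if (typeof part === 'function') { part(); continue; }
    if (moduleLastPushedText) out.push(SEPARATOR);
    out.push(part);
    moduleLastPushedText = true;
  }
  return out.join('');
}

// Variant 2: the flag lives on per-render state, so nested renders cannot touch it.
function renderScoped(parts: Part[]): string {
  const state = { lastPushedText: false };
  const out: string[] = [];
  for (const part of parts) {
    if (typeof part === 'function') { part(); continue; }
    if (state.lastPushedText) out.push(SEPARATOR);
    out.push(part);
    state.lastPushedText = true;
  }
  return out.join('');
}

// The nested call resets the shared flag, so the outer render drops a separator.
const sharedHtml = renderShared(['a', () => { renderShared([]); }, 'b']);
const scopedHtml = renderScoped(['a', () => { renderScoped([]); }, 'b']);
```

With the shared flag the outer render produces `ab`, losing the separator between its two text nodes; with per-render state it produces `a<!-- -->b`.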

Collaborator Author

good call

return NO_TEXT_EMBED;
}

export function textEmbeddingForSegment(): TextEmbedding {
Collaborator

While this whole feature is DOM specific and ideally we wouldn't punish other environments - it's also really DOM that's our main (and currently only) target. I wouldn't be too hurt if this feature was just built into ReactFizzServer instead. Including the lastPushedText part.

Since otherwise we leak a lot of concepts and have to make up flags etc.

Collaborator Author

Yeah I had this thought as I kept adding more functions to the format config. glad you see it the same way and makes sense. I'll move things over so we can clean up the interface

Collaborator

Are you able to just use the segment.textEmbedding flag as the running tally for lastPushedText?

Collaborator Author

@gnoff gnoff May 27, 2022

not totally, for boundaries, we need to track lastPushedText internally but we never need to push a separator at the end b/c we always have the suspense comment node. I suppose if I could know I was in a segment that was a boundary or the root I could use that as an alternative way of discriminating those cases

and actually for delayed segments we don't want to emit a final separator but they aren't boundaries so I need a way to know if it is delayed or not when I get to the final push

but I can just track that all as another property on the Segment

@@ -1652,6 +1653,7 @@ function flushSubtree(
// We're emitting a placeholder for this segment to be filled in later.
// Therefore we'll need to assign it an ID - to refer to it by.
const segmentID = (segment.id = request.nextSegmentId++);
segment.textEmbedding = textEmbeddingForDelayedSegment();
Collaborator

Is it always the case that nothing has been written to this segment yet? I think maybe yes, but not sure.

Collaborator Author

If some extra separators come along for the ride I don't think it should matter. As long as we are using the patchup in DOM we can handle separators but don't need them. so in this path we know we are getting a placeholder and thus will also get a patchup. Changing the embedding here just means that we have an opportunity to drop them if they have not yet been emitted.

That said... I do not know how it could have had anything written without being complete and thus avoiding this path

@gnoff
Collaborator Author

gnoff commented May 27, 2022

> You can see if you can remove this hack and see what happens to renderToString tests.

works beautifully without the hack

@gnoff gnoff requested a review from sebmarkbage May 27, 2022 23:28
// Adopt the parent segment's leading text embed
segment.lastPushedText,
// Assume we are text embedded at the trailing edge
true,
Collaborator

So this is where it gets non-deterministic because if you didn't suspend here it would not do this and would eliminate the unnecessary trailing edge.

Collaborator

@sebmarkbage sebmarkbage May 28, 2022

The way I had thought of solving this was to track a flag on both sides of the segment. Like if the segment ends with text and another flag if the parent has written text right after that segment and another flag if a segment starts with text.

Then in the flush phase you can insert a separator during the flush if you're inlining two segments that end and start with text that are next to each other, or if there's a parent chunk in between you can tell if the parent wrote text after the segment.
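
One possible reading of this suggestion, sketched with hypothetical names (`FlushPiece`, `flushInline`); parent chunks are modeled as pieces in the same list. This is an interpretation of the comment for illustration, not what was merged in this PR.

```typescript
// Sketch only: an interpretation of the flag-per-edge idea, not merged code.
const SEP_CHUNK = '<!-- -->';

interface FlushPiece {
  html: string;
  startsWithText: boolean; // leading edge is a text node
  endsWithText: boolean;   // trailing edge is a text node
}

// Decide separators at flush time, when both neighbors are known.
function flushInline(pieces: FlushPiece[]): string {
  let out = '';
  let prevEndedWithText = false;
  for (const piece of pieces) {
    if (piece.html === '') continue; // empty pieces do not change adjacency
    if (prevEndedWithText && piece.startsWithText) out += SEP_CHUNK;
    out += piece.html;
    prevEndedWithText = piece.endsWithText;
  }
  return out;
}

const flushed = flushInline([
  { html: 'Hello, ', startsWithText: true, endsWithText: true },
  { html: 'world', startsWithText: true, endsWithText: true },
  { html: '<b>!</b>', startsWithText: false, endsWithText: false },
]);
```

Because the decision is deferred until neighbors are known, no defensive trailing separator is emitted before `<b>!</b>`, and the output does not depend on whether a segment happened to suspend.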

Collaborator Author

Interesting, I think I could make that work but I think I'll merge this and think about it some more

Collaborator

@sebmarkbage sebmarkbage left a comment

This is non-deterministic but it's a better tradeoff than what's in main right now.

@gnoff gnoff merged commit a276638 into facebook:main May 28, 2022
@gnoff gnoff deleted the text-separators branch May 28, 2022 15:30
@gnoff
Collaborator Author

gnoff commented May 29, 2022

> This is non-deterministic but it's a better tradeoff than what's in main right now.

you nerdsniped me: #24637

facebook-github-bot pushed a commit to facebook/react-native that referenced this pull request Jun 6, 2022
Summary:
This sync includes the following changes:
- **[dd4950c90](facebook/react@dd4950c90 )**: [Flight] Implement useId hook ([#24172](facebook/react#24172)) //<Josh Story>//
- **[26a5b3c7f](facebook/react@26a5b3c7f )**: Explicitly set `highWaterMark` to 0 for `ReadableStream` ([#24641](facebook/react#24641)) //<Josh Larson>//
- **[aec575914](facebook/react@aec575914 )**: [Fizz] Send errors down to client ([#24551](facebook/react#24551)) //<Josh Story>//
- **[a2766387e](facebook/react@a2766387e )**: [Fizz] Improve text separator byte efficiency ([#24630](facebook/react#24630)) //<Josh Story>//
- **[f7860538a](facebook/react@f7860538a )**: Fix typo in useSyncExternalStore main entry point error ([#24631](facebook/react#24631)) //<François Chalifour>//
- **[1bed20731](facebook/react@1bed20731 )**: Add a module map option to the Webpack Flight Client ([#24629](facebook/react#24629)) //<Sebastian Markbåge>//
- **[b2763d3ea](facebook/react@b2763d3ea )**: Move hydration code out of normal Suspense path ([#24532](facebook/react#24532)) //<Andrew Clark>//
- **[357a61324](facebook/react@357a61324 )**: [DevTools][Transition Tracing] Added support for Suspense Boundaries ([#23365](facebook/react#23365)) //<Luna Ruan>//
- **[2c8a1452b](facebook/react@2c8a1452b )**: Fix ignored setState in Safari when iframe is touched ([#24459](facebook/react#24459)) //<dan>//
- **[62662633d](facebook/react@62662633d )**: Remove enableFlipOffscreenUnhideOrder ([#24545](facebook/react#24545)) //<Ricky>//
- **[34da5aa69](facebook/react@34da5aa69 )**: Only treat updates to lazy as a new mount in legacy mode ([#24530](facebook/react#24530)) //<Ricky>//
- **[46a6d77e3](facebook/react@46a6d77e3 )**: Unify JSResourceReference Interfaces ([#24507](facebook/react#24507)) //<Timothy Yung>//
- **[6cbf0f7fa](facebook/react@6cbf0f7fa )**: Fork ReactSymbols ([#24484](facebook/react#24484)) //<Ricky>//
- **[a10a9a6b5](facebook/react@a10a9a6b5 )**: Add test for hiding children after layout destroy ([#24483](facebook/react#24483)) //<Ricky>//
- **[b4eb0ad71](facebook/react@b4eb0ad71 )**: Do not replay erroring beginWork with invokeGuardedCallback when suspended or previously errored ([#24480](facebook/react#24480)) //<Josh Story>//
- **[99eef9e2d](facebook/react@99eef9e2d )**: Hide children of Offscreen after destroy effects ([#24446](facebook/react#24446)) //<Ricky>//
- **[ce1386028](facebook/react@ce1386028 )**: Remove enablePersistentOffscreenHostContainer flag ([#24460](facebook/react#24460)) //<Andrew Clark>//
- **[72b7462fe](facebook/react@72b7462fe )**: Bump local package.json versions for 18.1 release ([#24447](facebook/react#24447)) //<Andrew Clark>//
- **[22edb9f77](facebook/react@22edb9f77 )**: React `version` field should match package.json ([#24445](facebook/react#24445)) //<Andrew Clark>//
- **[6bf3deef5](facebook/react@6bf3deef5 )**: Upgrade react-shallow-renderer to support react 18 ([#24442](facebook/react#24442)) //<Michael サイトー 中村 Bashurov>//

Changelog:
[General][Changed] - React Native sync for revisions bd4784c...d300ceb

jest_e2e[run_all_tests]

Reviewed By: cortinico, kacieb

Differential Revision: D36874368

fbshipit-source-id: c0ee015f4ef2fa56e57f7a1f6bc37dd05c949877
@facebook facebook deleted a comment from bobis19 Jun 16, 2022
4 participants