Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tables have wrong columns if ClickHouse JSON datapoint is not ordered #281

Closed
genzgd opened this issue Aug 25, 2020 · 32 comments · Fixed by #308 or grafana/grafana-plugin-repository#803
Assignees

Comments

@genzgd
Copy link

genzgd commented Aug 25, 2020

We use Trickster to cache time series data. Because of the way that Trickster caches queries, the properties of the JSON datapoints returned from cached queries are in alphabetical order, not the same order as the ClickHouse response meta array. This breaks tables in the plugin, because it appears the construction of the table row assumes that the properties of the JSON object are in that same order (in particular, that the timestamp column is always first).

The plugin should match the timestamp column by property name, not index, at least for tables.

@Slach
Copy link
Collaborator

Slach commented Aug 28, 2020

@genzgd thanks a lot for reporting
I will try to reproduce it ASAP

@Slach Slach self-assigned this Aug 28, 2020
@ashutoshraina
Copy link

Just reporting that we also have the same problem. We proxy clickhouse queries through a http layer and the json response produced by the proxy layer isn't necessarily in the same order as the select clause. This causes the table to have wrong data under the wrong columns.

@Slach
Copy link
Collaborator

Slach commented Aug 31, 2020

@ashutoshraina do you mean "trickster" proxy or "proxy mode" in clickhouse data source?

@ashutoshraina
Copy link

We have a separate proxy written in go. We point the datasource to that proxy. The query comes back with the expected JSON but rendering of the table is incorrect due to the reasons mentioned above.
See the response below, the indices in the meta and the order in the results doesn't match which causes the rendering to become out of order in the table. If we matched on name, this would go away. The CQL select columns match the meta keys.

select  ErrorName, currentWeek, previousWeek , (previousWeek - currentWeek) as delta from 

image

image

@Slach
Copy link
Collaborator

Slach commented Sep 2, 2020

@genzgd could you share your trickster.conf [origins] and [caches] section?

@Slach
Copy link
Collaborator

Slach commented Sep 2, 2020

@ashutoshraina could you explain what exactly wrong? as I see on your screenshot everything looks fine, table have same column order as described in response meta part

@genzgd
Copy link
Author

genzgd commented Sep 2, 2020

Trickster config -- we run trickster against an nginx proxy in Kubernetes.

  [caches.default]
    cache_type = 'memory'
      [caches.default.index]
      max_size_bytes = 34359738368

  [origins.comcast]
  origin_type = 'clickhouse'
  origin_url = 'http://maple-trickster-nginx-svc.maple-trickster-nginx.svc.cluster.local/comcast_ch2/'
  backfill_tolerance_secs = 180

I am the primary author of the ClickHouse support for Trickster. I am very familiar with how cached queries are stored, using standard GoLang JSON support, and I know that order of properties is not preserved. We see the same results as @ashutoshraina -- the columns on the screenshot are in the correct order, but the data is from different columns. So if we have column b, c, d, a in our query and metadata, the resulting data comes back as a,b,c,d in each JSON object and that's how the table columns are populated.

@ashutoshraina
Copy link

ashutoshraina commented Sep 2, 2020

@Slach Exactly what @genzgd mentioned. If the meta and data keys are not in the same order then the table columns are not populated correctly.
If we match the key names instead of the indices, this would probably work correctly.
My guess is that this is the problem https://github.com/Vertamedia/clickhouse-grafana/blob/master/src/response_parser.ts#L30. Since indices don't match between meta and data, we get the wrong data under the columns.

@ashutoshraina
Copy link

ashutoshraina commented Sep 10, 2020

@Slach Is there some more information I can provide to help move this forward ?

Just to improve my understanding of the codebase, would the following be test case that should pass ( it currently fails ).

    describe("When meta and data keys do not have the same index", () => {
        const response = {
            "meta": [
                {
                    "name": "foo",
                    "type": "String",
                },
                {
                    "name": "bar",
                    "type": "String",
                },
            ],

            "data": [
                {
                    "bar": "bar_value",
                    "foo": "foo_value",
                },
            ],
        };

        // @ts-ignore
        const responseParser = new ResponseParser(this.$q);
        const data = responseParser.parse("SELECT col1 AS foo, col2 AS bar FROM host", response);

        it('should return key-value pairs', function () {
            expect(data[0]).toStrictEqual({"bar": "bar_value", "foo": "foo_value"});
        });
    });

@Slach
Copy link
Collaborator

Slach commented Sep 10, 2020

@ashutoshraina thanks a lot for testcase i will fix it ASAP

@Slach
Copy link
Collaborator

Slach commented Sep 10, 2020

@ashutoshraina hmm, look like your test case PASSED currently, on latest master branch
I will try to reproduce via docker-compose environment with trickster

@Slach
Copy link
Collaborator

Slach commented Sep 10, 2020

@genzgd as I see trickster doesn't support case when clickhouse-grafana pass SQL query via POST request body?

@Slach
Copy link
Collaborator

Slach commented Sep 10, 2020

@genzgd and @ashutoshraina
could you help me reproduce this issue?
I add trickster to docker-compose environment

Currently after run

docker-compose pull
docker-compose down 
docker-compose up -d --no-deps grafana clickhouse trickster

and open in browser http://localhost:3000/d/2LfUU3vMz/trickster-dashboard?orgId=1
I don't see changes in field order in "data" section when grafana pass request to trickster

maybe I do something wrong?

docker-compose logs -f trickster 

I can't figure out, was request from grafana to trickster:8480 cached or not

@genzgd
Copy link
Author

genzgd commented Sep 10, 2020

Correct, Trickster does nothing with POSTs but pass them through.  With a GET you can tell if the response is cached by looking at the special header coming back. Unfortunately it isn't currently caching correctly with the newest Plugin version, because of the additional wrapping of the $timeFilter variable in another level of parens (I'm working on fixing that now). When it does work you can see the result in the X-Trickster-Result header; however right now all you will likely see is status=proxy-only. Also the data recombination only happens on a second request to the cache for the same or overlapping time series data.

@Slach
Copy link
Collaborator

Slach commented Sep 10, 2020

@genzgd i got following results for my environment
image
image
what it mean?

@genzgd
Copy link
Author

genzgd commented Sep 10, 2020

Because the X-Trickster-Result is "proxy-only", Trickster did not cache this result but just returned the unchanged result fro ClickHouse. You need to send a query in that looks like a time series query and send it twice, so that the second response comes from the Trickster cache and gets recreated instead of coming directly from ClickHouse.

This is the sample query I use in the unit tests that should get cached: SELECT ( intDiv(toUInt32(datetime), 300) * 300) * 1000 AS t, count() as cnt FROM test_db.test_table WHERE datetime between 1589904000 AND 1589997600 GROUP BY t ORDER BY t DESC FORMAT JSON

@genzgd
Copy link
Author

genzgd commented Sep 10, 2020

Also just a reminder that the Grafana plugin works for charts, it seems that only tables are broken.

@Slach
Copy link
Collaborator

Slach commented Sep 11, 2020

@genzgd I can't catch cache behavior from trickster with change column orders

I trying to send following curl request:

curl 'http://localhost:3000/api/datasources/proxy/4/?query=SELECT%0A%20%20%20%20(intDiv(toUInt32(event_time)%2C%2030)%20*%2030)%20*%201000%20as%20t%2C%0A%20%20%20%20count()%20AS%20b%2C%0A%20%20%20%20sum(t%20%25%20100000)%20AS%20a%0AFROM%20default.test_grafana%0A%0AWHERE%20event_time%20%3E%3D%20toDateTime(1599735563)%0A%0AGROUP%20BY%20t%0A%0AORDER%20BY%20t%20FORMAT%20JSON' -H 'accept: application/json' -H 'Cookie: grafana_session=0cf3e63f193d25e2828bfde0aa267b1e'   --compressed -i -vvv

first time I got

X-Trickster-Result: engine=DeltaProxyCache; status=kmiss; ffstatus=off

and data section with the same fields order as meta section

{
 "meta":[
   {"name":"t","type":"UInt64"},
   {"name":"b","type":"UInt64"},
   {"name":"a","type":"UInt64"}
 ],
 "data":[
    {"t":"1599735540000","b":"3","a":"120000"}
...
]}

second time

X-Trickster-Result: engine=DeltaProxyCache; status=phit; fetched=[1599735540:1599776190,1599806790:1599807120]; ffstatus=off

with same JSON results

{
  "meta":[{"name":"t","type":"UInt64"},{"name":"b","type":"UInt64"},{"name":"a","type":"UInt64"}],
  "data":[{"t":"1599735540000","b":"3","a":"120000"} ... ]
}

what I should do to reproduce errornomus behavior?
also &add_http_cors_header=1 in query string completely turn off proxy cache

@Slach
Copy link
Collaborator

Slach commented Sep 14, 2020

@genzgd could you suggest to me how I can trigger change columns order behavior in trickster?

@genzgd
Copy link
Author

genzgd commented Sep 14, 2020

I'll try to find time to reproduce this; it's been a busy few days. Regardless, I think @ashutoshraina has identified the code that causes the problem. Because the plugin uses the order of JSON properties instead of their names, and the order of JSON properties is not guaranteed, the plugin is broken. You could just create an artificial response with "out of order" properties to test a fix.

@Slach
Copy link
Collaborator

Slach commented Sep 14, 2020

@genzgd, hmm, as I see @ashutoshraina found https://github.com/Vertamedia/clickhouse-grafana/blob/master/src/response_parser.ts#L30 but this part of code is executed only when we have __value and __text in field names in SQL query, it part of code usual used when we run queries for getting "grafana template variables" dropdown boxes

SELECT foo AS __value, bar AS __text FROM table

and indexes textColIndex and findColIndex is calculated right via ResponseParser.findColIndex

so, for common case when data section have different fields order with meta

we can try to look at
https://github.com/Vertamedia/clickhouse-grafana/blob/master/src/response_parser.ts#L32
but tests in
https://github.com/Vertamedia/clickhouse-grafana/blob/master/spec/response_parser.jest.ts#L37
PASSED

so look like "wrong" data transformation happens on Grafana TABLE plugin, cause you said "graph" panels works fine
I need to trigger for mismatch meta and data on trickster side for check this corner case

@ashutoshraina
Copy link

@Slach I think my test may be incorrect. I need your help in understanding if the test I submitted is correct or not.

What is the expected size of the array res for the test case I submitted ? Should it be 2 ?

Here's what I am thinking and please do let me know if this does not make sense.

sqlResults (https://github.com/Vertamedia/clickhouse-grafana/blob/master/src/response_parser.ts#L21) will have two entries based on the input in the test and we are adding each element of the sqlResults array to the res array.

My test passes with array containing one element. Should there be a data[0] with key as bar and data[1] with key as foo ( or the other way around?

@ashutoshraina
Copy link

Just bumping this up.
How can we move this forward ?

@Slach
Copy link
Collaborator

Slach commented Sep 26, 2020

@ashutoshraina thanks a lot for your attention I going to try to fix it on next week

@ashutoshraina
Copy link

In case this helps I have tried multiple versions of grafana and plugins and the issue was reproducible every time in both.

7.0.6 - Grafana with Clickhouse Plugin 1.9.1
7.2.0 - Grafana with Clickhouse Plugin 2.1.0

@Slach
Copy link
Collaborator

Slach commented Nov 10, 2020

@ashutoshraina, unfortunately, I can't reproduce JSON with different order with meta and data section

@genzgd could you help us trigger this behavior with trickster?
current steps to reproduce

git clone https://github.com/Vertamedia/clickhouse-grafana.git
cd clickhouse-grafana
docker-compose up -d --no-deps clickhouse trickster grafana

open
http://localhost:3000/d/2LfUU3vMz/trickster-dashboard
and try to refresh

@genzgd
Copy link
Author

genzgd commented Nov 10, 2020

I'll try to take a look, but the fragile ClickHouse "time series" parsing in the current version of Trickster is currently broken (by the addition of an extra set of parentheses around the $timeFilter expression), and I've not had time to fix it. What query does that dashboard use?

@Slach
Copy link
Collaborator

Slach commented Nov 24, 2020

@genzgd sorry for the late response, this dashboard uses a simple query with two time-series

SELECT
    $timeSeries as t,
    count() AS b,
    sum(t % 100000) AS a
FROM $table
WHERE $timeFilter
GROUP BY t
ORDER BY t

and i can't reproduce trickster behavior when "a" and "b" switched values in JSON cached response from trickster

steps to reproduce

git clone https://github.com/Vertamedia/clickhouse-grafana.git
cd clickhouse-grafana
docker-compose up -d --no-deps clickhouse trickster grafana

@ashutoshraina
Copy link

I tried to repro this as well using the steps mentioned. I wasn't able to do that.
I always saw the X-Trickster-Result: engine=HTTPProxy; status=proxy-only . Perhaps, my query isn't correct.

SELECT
    1 as t,
    2 as b,
    count() as cnt
FROM $table

WHERE $timeFilter

GROUP BY
    t,
    b
ORDER BY t
SELECT
    1 as t,
    2 as b,
    count() as cnt
FROM default.test_grafana

WHERE event_time >= toDateTime(1606268365) AND event_time <= toDateTime(1606275565)

GROUP BY
    t,
    b
ORDER BY t

I think there might be another way of trying to repro this.
The difference between the meta and data keys is the sort order of keys in the data section.
Is there a way in the plugin to mock the response coming from clickhouse where the data is set up in that way ?

@Slach
Copy link
Collaborator

Slach commented Nov 25, 2020

@genzgd @ashutoshraina maybe I catch it for the "Table" format of the query
please look to https://github.com/Vertamedia/clickhouse-grafana/pull/308/files#diff-c1f7653433bd942fb1367cfad8e55de756b7e32462eab1048cdb760f6856cbccR35-R36

could you check the latest master version in your environment and confirm the issue doesn't reproduce anymore?

Slach added a commit to Altinity/grafana-plugin-repository that referenced this issue Nov 30, 2020
# 2.2.0 (2020-11-30)
## Enhancement:

* add region support to annotation query, try to fix wrong column orders for table format, fix Altinity/clickhouse-grafana#303
* add plugin sign process, fix Altinity/clickhouse-grafana#212
* add `DateTime64` support, fix Altinity/clickhouse-grafana#292
* add `linux\arm64` backend plugin build
* improve ARRAY JOIN parsing, fix Altinity/clickhouse-grafana#284
* improve `docker-compose.yaml` add ability to redefine `GRAFANA_VERSION` and `CLICKHOUSE_VERSION` via environment variables `latest` by default

## Fixes:
* add `*.js.map` and `*.js` from src and spec folder to .gitignore
* don't apply adhoc filters twice when used $adhoc macros, fix Altinity/clickhouse-grafana#282
* fix corner case for table format with wrong columns order between meta and data response section, fix Altinity/clickhouse-grafana#281
* add trickster to docker-compose environment
* actualize links in README.md

Signed-off-by: Eugene Klimov <[email protected]>
@ashutoshraina
Copy link

I can confirm now that the table headers are rendered in the right order. Thanks a lot @Slach and @genzgd .

@Slach
Copy link
Collaborator

Slach commented Apr 6, 2021

@genzgd sorry for the inconvenience, could you consult me about some tricksterproxy internals?
I need to figure out how trickster SQL parser works to detect time range from SQL query for ClickHouse dialect
do you have email/telegram? could you share it to me on [email protected] or https://t.me/bloodjazman?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants