There seems to be a problem with ignoring NA values in a data file with the newest version of gnuplot. I run the following script (pitimes.plt):
labelstyle = "font \"Arial,10\" center tc 'black'"
set key autotitle columnhead
set xrange[0:11]
set boxwidth 0.9
set style fill solid 0.3 border lc 'red'
set title "Time Comparisions For Implementations"
formatlab(x) = x<60?sprintf("%d",x):(x<3600)?sprintf("%d:%02d",x/60,x%60):sprintf("%d:%02d:%02d",x/3600,(x%3600)/60,x%60)
set xtics rotate by -45 1,1,9
set yrange[0:*]
set ylab "Time (seconds)"
set ytics nomirror
unset my2tics
set y2range[0:5]
set xrange[0:10]
unset key
set y2tics format "%g"
set y2lab "log(Time) (seconds)"
plot "pitime.txt" u ($0+1):3:xtic(1) with boxes lc 'red',\
"pitime.txt" u ($0+1):(log10($3)) with impulses axes x1y2 lw 2 lc 'black',\
"pitime.txt" u ($0+1):(log10($3)+0.5):(formatlab(int($3))) with labels @labelstyle axes x1y2
on the following data file (pitime.txt)
Language Time20k Time100k
Java 2 45
Kotlin 2 45
Scala 3 71
C 4 103
Javascript 17 412
Groovy 42 1030
Ruby 115 3306
Python 151 3718
Perl 189 4858
Php 372 9231
R 1322 NA
This works correctly with version 5.0 and produces the following graph

However, version 5.2 produces no output and displays the error message

"pitimes.plt", line 24: non-integer operand for %

(line 24 is the last line of the script). It seems that version 5.0 ignores the last line of the data file when it encounters the NA value there, whereas version 5.2 attempts to feed the NA value to the formatlab function, which triggers the error.
I am using 5.0 patchlevel 6 and 5.2 patchlevel 2 on Windows 7.
You are seeing a side effect of an intentional change.
In older versions of gnuplot, processing of the "using" specifier for a line of data would stop as soon as it hit an undefined value. Now the program attempts to evaluate all of the "using" clauses even if one or more of them evaluates to NaN. Unfortunately your command manages to trigger a fatal error while evaluating the label format.
I will look into how it might be mitigated, but for now I can offer a work-around.
Since the quantity that is undefined is in column 3, you can explicitly skip the format evaluation in this case by guarding it with the test ($3 == $3), which fails if and only if $3 is not-a-number or undefined.
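The exact command from the original reply is not preserved here, so the following is only a sketch of how the ($3 == $3) guard could be applied to the label clause of the script. It assumes that an empty string is an acceptable label for the skipped row; the definitions of labelstyle and formatlab are repeated from pitimes.plt.

```gnuplot
# Sketch of the work-around: a reconstruction, not the maintainer's verbatim command.
labelstyle = "font \"Arial,10\" center tc 'black'"
formatlab(x) = x<60?sprintf("%d",x):(x<3600)?sprintf("%d:%02d",x/60,x%60):sprintf("%d:%02d:%02d",x/3600,(x%3600)/60,x%60)
set key autotitle columnhead   # first row of pitime.txt is a header
unset key
set y2range [0:5]

# ($3 == $3) is false when column 3 is NaN or undefined (the "NA" row),
# so the ternary returns "" and formatlab() is never evaluated for that row.
plot "pitime.txt" u ($0+1):3:xtic(1) with boxes lc 'red',\
     "pitime.txt" u ($0+1):(log10($3)) with impulses axes x1y2 lw 2 lc 'black',\
     "pitime.txt" u ($0+1):(log10($3)+0.5):(($3 == $3) ? formatlab(int($3)) : "") with labels @labelstyle axes x1y2
```

Only the third using clause changes; the boxes and impulses clauses never call formatlab and so do not need the guard.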
Thanks for the suggested workaround, I will apply it to the script for now.
The intended mechanism is
set datafile missing "NA"
but this does not work in the example provided because the check for missing data is too shallow. This comment at datafile.c:2105 is relevant:
The problem is that by the time the program notices that the data field it is processing contains the "missing" flag it is deep inside evaluation of an expression. There is not currently a mechanism for it to bail out of the evaluation early and report the missing field cleanly. Instead it continues evaluation until it either finishes or it hits some more fatal error.
I see no path to fixing this in the general case that doesn't impose major penalties on the process of expression evaluation. The best I can think of is to document the limitation more clearly and give an example of the workaround.
... And then enlightenment hits. I eventually came up with a much more powerful way to screen for missing values, so long as the using spec accesses the column as (func($N)) rather than something complicated like (column(func(whatever))).
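As a rough illustration of the form this screening is meant to handle, the label clause below references column 3 directly as $3 inside its expressions. It is a sketch that assumes a gnuplot build containing the improved missing-value check described above; on 5.2 patchlevel 2 the same command is expected to fail as reported at the top of this thread.

```gnuplot
# Assumption: run on a gnuplot build that includes the improved screening.
labelstyle = "font \"Arial,10\" center tc 'black'"
formatlab(x) = x<60?sprintf("%d",x):(x<3600)?sprintf("%d:%02d",x/60,x%60):sprintf("%d:%02d:%02d",x/3600,(x%3600)/60,x%60)
set key autotitle columnhead   # first row of pitime.txt is a header
unset key
set y2range [0:5]

# Declare the string that marks a missing field in the data file.
set datafile missing "NA"

# Column 3 appears as $3 inside each expression, i.e. the (func($N)) form,
# rather than through an indirect column(...) call.
plot "pitime.txt" u ($0+1):(log10($3)+0.5):(formatlab(int($3))) with labels @labelstyle axes x1y2
```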