docs update
cnuernber committed Nov 18, 2022
1 parent 22aba0a commit 67d91fc
Showing 9 changed files with 96 additions and 7 deletions.
1 change: 1 addition & 0 deletions deps.edn
@@ -70,6 +70,7 @@
:language :clojurescript
:source-paths ["src"]
:output-path "docs"
:doc-paths ["topics"]
:source-uri "https://github.com/cnuernber/tmdjs/blob/master/{filepath}#L{line}"
:namespaces [tech.v3.dataset
tech.v3.dataset.node
88 changes: 88 additions & 0 deletions docs/Reductions.html
@@ -0,0 +1,88 @@
<!DOCTYPE html PUBLIC ""
"">
<html><head><meta charset="UTF-8" /><title>Some Reduction Timings</title><script async="true" src="https://www.googletagmanager.com/gtag/js?id=G-CLH3CS7E1R"></script><script>window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());

gtag('config', 'G-CLH3CS7E1R');</script><link rel="stylesheet" type="text/css" href="css/default.css" /><link rel="stylesheet" type="text/css" href="highlight/solarized-light.css" /><script type="text/javascript" src="highlight/highlight.min.js"></script><script type="text/javascript" src="js/jquery.min.js"></script><script type="text/javascript" src="js/page_effects.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body><div id="header"><h2>Generated by <a href="https://github.com/weavejester/codox">Codox</a> with <a href="https://github.com/xsc/codox-theme-rdash">RDash UI</a> theme</h2><h1><a href="index.html"><span class="project-title"><span class="project-name">tmdjs</span> <span class="project-version">1.014</span></span></a></h1></div><div class="sidebar primary"><h3 class="no-link"><span class="inner">Project</span></h3><ul class="index-link"><li class="depth-1 "><a href="index.html"><div class="inner">Index</div></a></li></ul><h3 class="no-link"><span class="inner">Topics</span></h3><ul><li class="depth-1 current"><a href="Reductions.html"><div class="inner"><span>Some Reduction Timings</span></div></a></li></ul><h3 class="no-link"><span class="inner">Namespaces</span></h3><ul><li class="depth-1"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>tech</span></div></div></li><li class="depth-2"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>v3</span></div></div></li><li class="depth-3"><a href="tech.v3.dataset.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>dataset</span></div></a></li><li class="depth-4"><a href="tech.v3.dataset.node.html"><div class="inner"><span class="tree"><span class="top"></span><span 
class="bottom"></span></span><span>node</span></div></a></li><li class="depth-3"><a href="tech.v3.datatype.html"><div class="inner"><span class="tree" style="top: -52px;"><span class="top" style="height: 61px;"></span><span class="bottom"></span></span><span>datatype</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.datatype.argops.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>argops</span></div></a></li><li class="depth-4"><a href="tech.v3.datatype.functional.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>functional</span></div></a></li><li class="depth-3"><div class="no-link"><div class="inner"><span class="tree" style="top: -83px;"><span class="top" style="height: 92px;"></span><span class="bottom"></span></span><span>libs</span></div></div></li><li class="depth-4"><a href="tech.v3.libs.cljs-ajax.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>cljs-ajax</span></div></a></li></ul></div><div class="document" id="content"><div class="doc"><div class="markdown"><h1>Some Reduction Timings</h1>
<p>The datatype library has some helpers that work with datasets and can make certain types of
reductions much faster.</p>
<h2>Filter On One Column, Reduce Another</h2>
<p>This is a very common operation so let's take a closer look. The generic dataset
pathway would be:</p>
<pre><code class="language-clojure">cljs.user&gt; (require '[tech.v3.dataset :as ds])
nil
cljs.user&gt; (def test-ds (ds/-&gt;dataset {:a (range 20000)
                                       :b (repeatedly 20000 rand)}))
#'cljs.user/test-ds
cljs.user&gt; ;;filter on a, sum b.
cljs.user&gt; (reduce + 0.0 (-&gt; (ds/filter-column test-ds :a #(&gt; % 10000))
                             (ds/column :b)))
5000.898384571656
cljs.user&gt; (time (dotimes [idx 100]
                   (reduce + 0.0 (-&gt; (ds/filter-column test-ds :a #(&gt; % 10000))
                                     (ds/column :b)))))
"Elapsed time: 282.714231 msecs"
</code></pre>
<p>Think transducers are fast? What about a generic transducer pathway?</p>
<pre><code class="language-clojure">cljs.user&gt; (let [a (test-ds :a)
                 b (test-ds :b)]
             (transduce (comp (filter #(&gt; (nth a %) 10000))
                              (map #(nth b %)))
                        (completing +)
                        (range (ds/row-count test-ds))))
5000.898384571656
cljs.user&gt; (time (dotimes [idx 100]
                   (let [a (test-ds :a)
                         b (test-ds :b)]
                     (transduce (comp (filter #(&gt; (nth a %) 10000))
                                      (map #(nth b %)))
                                (completing +)
                                (range (ds/row-count test-ds))))))
"Elapsed time: 436.235972 msecs"
nil
</code></pre>
<p>Transducers are fast, but after looking at this pathway we found that it
pays a lot for each <code>nth</code> call. The datatype library has a way
to get the fastest access available for a given container. Columns overload
this pathway such that if they have no missing values they use the fastest
access for their buffer; otherwise they have to wrap the access in a missing check.
Regardless, this gets us a solid improvement:</p>
<pre><code class="language-clojure">cljs.user&gt; (require '[tech.v3.datatype :as dtype])
nil
cljs.user&gt; (time (dotimes [idx 100]
                   (let [a (dtype/-&gt;fast-nth (test-ds :a))
                         b (dtype/-&gt;fast-nth (test-ds :b))]
                     (transduce (comp (filter #(&gt; (a %) 10000))
                                      (map #(b %)))
                                (completing +)
                                (range (ds/row-count test-ds))))))
"Elapsed time: 77.823553 msecs"
nil
</code></pre>
<p>OK, there is another, more dangerous approach. dtype has another query,
<code>as-agetable</code>, that returns either something for which <code>aget</code> works or
nil. If you know your dataset's columns have no missing data and their
backing store is itself agetable, then you can get an agetable. This
doesn't have a fallback, so you risk null pointer issues, but it is the fastest
possible pathway.</p>
<pre><code class="language-clojure">cljs.user&gt; (time (dotimes [idx 100]
                   (let [a (dtype/as-agetable (test-ds :a))
                         b (dtype/as-agetable (test-ds :b))]
                     (transduce (comp (filter #(&gt; (aget a %) 10000))
                                      (map #(aget b %)))
                                (completing +)
                                (range (ds/row-count test-ds))))))
"Elapsed time: 57.404783 msecs"
nil
</code></pre>
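<p>Since the agetable query can return nil, one way to keep the fast path without risking a nil
array is to fall back to the slower accessor. The following is only a sketch:
<code>accessor-with-fallback</code>, <code>as-array</code> and <code>-&gt;index-fn</code> are
hypothetical names, and the two queries are passed in as arguments so the sketch relies only on what
is stated above (an agetable-or-nil query, and a query returning a function of an index). It does
not check for missing values; that remains the caller's responsibility.</p>

```clojure
(defn accessor-with-fallback
  "Hypothetical helper. Returns a function of an index for `col`.
  `as-array` is assumed to return something `aget` works on, or nil.
  `->index-fn` is assumed to return a function from index to value."
  [as-array ->index-fn col]
  (if-let [arr (as-array col)]
    ;; Fast path: direct array access.
    (fn [idx] (aget arr idx))
    ;; Fallback: whatever accessor the slower query provides.
    (->index-fn col)))

;; Assumed usage with the queries discussed above (not run here):
;; (accessor-with-fallback dtype/as-agetable dtype/->fast-nth (test-ds :a))
```

<p>The returned function can be used in the transducer exactly like the <code>-&gt;fast-nth</code>
accessors, at the cost of one extra function call per element compared with a bare <code>aget</code>.</p>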
<p>In this simple example we find that a transducing pathway is indeed quite a bit faster, but only
when it is coupled with an efficient per-element access pattern.</p>
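<p>The transducer variants timed above share one shape and differ only in how each element is
read. As a minimal sketch of that shape, the hypothetical helper <code>filtered-sum</code> below
takes the two accessors as arguments; plain vectors stand in for dataset columns, so nothing here
depends on the dataset library.</p>

```clojure
(defn filtered-sum
  "Hypothetical helper: sums (b-at i) over every index i in [0, n)
  for which (pred (a-at i)) is truthy."
  [pred a-at b-at n]
  (transduce (comp (filter #(pred (a-at %)))
                   (map b-at))
             (completing +)
             (range n)))

;; Plain vectors stand in for dataset columns here.
(def a [1 5 10 15 20])
(def b [0.5 1.0 1.5 2.0 2.5])

;; Filter on a, sum b: indices 2, 3 and 4 pass the predicate.
(filtered-sum #(> % 9) #(nth a %) #(nth b %) (count a))
;; => 6.0
```

<p>Swapping the <code>a-at</code> and <code>b-at</code> arguments for faster accessors changes the
per-element cost without touching the reduction itself.</p>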
</div></div></div></body></html>
2 changes: 1 addition & 1 deletion docs/index.html
@@ -4,7 +4,7 @@
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());

gtag('config', 'G-CLH3CS7E1R');</script><link rel="stylesheet" type="text/css" href="css/default.css" /><link rel="stylesheet" type="text/css" href="highlight/solarized-light.css" /><script type="text/javascript" src="highlight/highlight.min.js"></script><script type="text/javascript" src="js/jquery.min.js"></script><script type="text/javascript" src="js/page_effects.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body><div id="header"><h2>Generated by <a href="https://github.com/weavejester/codox">Codox</a> with <a href="https://github.com/xsc/codox-theme-rdash">RDash UI</a> theme</h2><h1><a href="index.html"><span class="project-title"><span class="project-name">tmdjs</span> <span class="project-version">1.014</span></span></a></h1></div><div class="sidebar primary"><h3 class="no-link"><span class="inner">Project</span></h3><ul class="index-link"><li class="depth-1 current"><a href="index.html"><div class="inner">Index</div></a></li></ul><h3 class="no-link"><span class="inner">Namespaces</span></h3><ul><li class="depth-1"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>tech</span></div></div></li><li class="depth-2"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>v3</span></div></div></li><li class="depth-3"><a href="tech.v3.dataset.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>dataset</span></div></a></li><li class="depth-4"><a href="tech.v3.dataset.node.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>node</span></div></a></li><li class="depth-3"><a href="tech.v3.datatype.html"><div class="inner"><span class="tree" style="top: -52px;"><span class="top" 
style="height: 61px;"></span><span class="bottom"></span></span><span>datatype</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.datatype.argops.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>argops</span></div></a></li><li class="depth-4"><a href="tech.v3.datatype.functional.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>functional</span></div></a></li><li class="depth-3"><div class="no-link"><div class="inner"><span class="tree" style="top: -83px;"><span class="top" style="height: 92px;"></span><span class="bottom"></span></span><span>libs</span></div></div></li><li class="depth-4"><a href="tech.v3.libs.cljs-ajax.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>cljs-ajax</span></div></a></li></ul></div><div class="namespace-index" id="content"><h1><span class="project-title"><span class="project-name">tmdjs</span> <span class="project-version">1.014</span></span></h1><div class="doc"><p>Dataframe processing for ClojureScript.</p></div><h2>Namespaces</h2><div class="namespace"><h3><a href="tech.v3.dataset.html">tech.v3.dataset</a></h3><div class="doc"><div class="markdown"><p>Dataframe (map of columns) data processing system for clojurescript.
gtag('config', 'G-CLH3CS7E1R');</script><link rel="stylesheet" type="text/css" href="css/default.css" /><link rel="stylesheet" type="text/css" href="highlight/solarized-light.css" /><script type="text/javascript" src="highlight/highlight.min.js"></script><script type="text/javascript" src="js/jquery.min.js"></script><script type="text/javascript" src="js/page_effects.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body><div id="header"><h2>Generated by <a href="https://github.com/weavejester/codox">Codox</a> with <a href="https://github.com/xsc/codox-theme-rdash">RDash UI</a> theme</h2><h1><a href="index.html"><span class="project-title"><span class="project-name">tmdjs</span> <span class="project-version">1.014</span></span></a></h1></div><div class="sidebar primary"><h3 class="no-link"><span class="inner">Project</span></h3><ul class="index-link"><li class="depth-1 current"><a href="index.html"><div class="inner">Index</div></a></li></ul><h3 class="no-link"><span class="inner">Topics</span></h3><ul><li class="depth-1 "><a href="Reductions.html"><div class="inner"><span>Some Reduction Timings</span></div></a></li></ul><h3 class="no-link"><span class="inner">Namespaces</span></h3><ul><li class="depth-1"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>tech</span></div></div></li><li class="depth-2"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>v3</span></div></div></li><li class="depth-3"><a href="tech.v3.dataset.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>dataset</span></div></a></li><li class="depth-4"><a href="tech.v3.dataset.node.html"><div class="inner"><span class="tree"><span class="top"></span><span 
class="bottom"></span></span><span>node</span></div></a></li><li class="depth-3"><a href="tech.v3.datatype.html"><div class="inner"><span class="tree" style="top: -52px;"><span class="top" style="height: 61px;"></span><span class="bottom"></span></span><span>datatype</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.datatype.argops.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>argops</span></div></a></li><li class="depth-4"><a href="tech.v3.datatype.functional.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>functional</span></div></a></li><li class="depth-3"><div class="no-link"><div class="inner"><span class="tree" style="top: -83px;"><span class="top" style="height: 92px;"></span><span class="bottom"></span></span><span>libs</span></div></div></li><li class="depth-4"><a href="tech.v3.libs.cljs-ajax.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>cljs-ajax</span></div></a></li></ul></div><div class="namespace-index" id="content"><h1><span class="project-title"><span class="project-name">tmdjs</span> <span class="project-version">1.014</span></span></h1><div class="doc"><p>Dataframe processing for ClojureScript.</p></div><h2>Topics</h2><ul class="topics"><li><a href="Reductions.html">Some Reduction Timings</a></li></ul><h2>Namespaces</h2><div class="namespace"><h3><a href="tech.v3.dataset.html">tech.v3.dataset</a></h3><div class="doc"><div class="markdown"><p>Dataframe (map of columns) data processing system for clojurescript.
This API is a simplified version of the
<a href="https://techascent.github.io/tech.ml.dataset/">jvm-version's api</a>.</p>
</div></div><div class="index"><p>Public variables and functions:</p><ul><li> <a href="tech.v3.dataset.html#var--.3E.3Edataset">-&gt;&gt;dataset</a> </li><li> <a href="tech.v3.dataset.html#var--.3Edataset">-&gt;dataset</a> </li><li> <a href="tech.v3.dataset.html#var-column">column</a> </li><li> <a href="tech.v3.dataset.html#var-column-.3Edata">column-&gt;data</a> </li><li> <a href="tech.v3.dataset.html#var-column-count">column-count</a> </li><li> <a href="tech.v3.dataset.html#var-column-map">column-map</a> </li><li> <a href="tech.v3.dataset.html#var-column-names">column-names</a> </li><li> <a href="tech.v3.dataset.html#var-columns">columns</a> </li><li> <a href="tech.v3.dataset.html#var-concat">concat</a> </li><li> <a href="tech.v3.dataset.html#var-data-.3Ecolumn">data-&gt;column</a> </li><li> <a href="tech.v3.dataset.html#var-data-.3Edataset">data-&gt;dataset</a> </li><li> <a href="tech.v3.dataset.html#var-dataset-.3Edata">dataset-&gt;data</a> </li><li> <a href="tech.v3.dataset.html#var-dataset-.3Etransit-str">dataset-&gt;transit-str</a> </li><li> <a href="tech.v3.dataset.html#var-dataset.3F">dataset?</a> </li><li> <a href="tech.v3.dataset.html#var-filter">filter</a> </li><li> <a href="tech.v3.dataset.html#var-filter-column">filter-column</a> </li><li> <a href="tech.v3.dataset.html#var-filter-dataset">filter-dataset</a> </li><li> <a href="tech.v3.dataset.html#var-group-by">group-by</a> </li><li> <a href="tech.v3.dataset.html#var-group-by-column">group-by-column</a> </li><li> <a href="tech.v3.dataset.html#var-head">head</a> </li><li> <a href="tech.v3.dataset.html#var-intersect-missing-sets">intersect-missing-sets</a> </li><li> <a href="tech.v3.dataset.html#var-mapseq-parser">mapseq-parser</a> </li><li> <a href="tech.v3.dataset.html#var-merge-by-column">merge-by-column</a> </li><li> <a href="tech.v3.dataset.html#var-missing">missing</a> </li><li> <a href="tech.v3.dataset.html#var-remove-columns">remove-columns</a> </li><li> <a 
href="tech.v3.dataset.html#var-remove-missing">remove-missing</a> </li><li> <a href="tech.v3.dataset.html#var-remove-rows">remove-rows</a> </li><li> <a href="tech.v3.dataset.html#var-rename-columns">rename-columns</a> </li><li> <a href="tech.v3.dataset.html#var-replace-missing">replace-missing</a> </li><li> <a href="tech.v3.dataset.html#var-reverse-rows">reverse-rows</a> </li><li> <a href="tech.v3.dataset.html#var-row-at">row-at</a> </li><li> <a href="tech.v3.dataset.html#var-row-count">row-count</a> </li><li> <a href="tech.v3.dataset.html#var-row-map">row-map</a> </li><li> <a href="tech.v3.dataset.html#var-rows">rows</a> </li><li> <a href="tech.v3.dataset.html#var-rowvec-at">rowvec-at</a> </li><li> <a href="tech.v3.dataset.html#var-rowvecs">rowvecs</a> </li><li> <a href="tech.v3.dataset.html#var-select">select</a> </li><li> <a href="tech.v3.dataset.html#var-select-columns">select-columns</a> </li><li> <a href="tech.v3.dataset.html#var-select-missing">select-missing</a> </li><li> <a href="tech.v3.dataset.html#var-select-rows">select-rows</a> </li><li> <a href="tech.v3.dataset.html#var-soft-select-columns">soft-select-columns</a> </li><li> <a href="tech.v3.dataset.html#var-sort-by">sort-by</a> </li><li> <a href="tech.v3.dataset.html#var-sort-by-column">sort-by-column</a> </li><li> <a href="tech.v3.dataset.html#var-tail">tail</a> </li><li> <a href="tech.v3.dataset.html#var-transit-read-handler-map">transit-read-handler-map</a> </li><li> <a href="tech.v3.dataset.html#var-transit-str-.3Edataset">transit-str-&gt;dataset</a> </li><li> <a href="tech.v3.dataset.html#var-transit-write-handler-map">transit-write-handler-map</a> </li><li> <a href="tech.v3.dataset.html#var-union-missing-sets">union-missing-sets</a> </li><li> <a href="tech.v3.dataset.html#var-unique-by">unique-by</a> </li><li> <a href="tech.v3.dataset.html#var-unique-by-column">unique-by-column</a> </li><li> <a href="tech.v3.dataset.html#var-update">update</a> </li></ul></div></div><div class="namespace"><h3><a 
href="tech.v3.dataset.node.html">tech.v3.dataset.node</a></h3><div class="doc"><div class="markdown"><p>Functions and helpers that require the node runtime.</p>